Multi-sensory programs for physical understanding – Modeling and Inference

Date:  3/23/23

Speaker:  Krishna Murthy

Location: Zoom

Time: 2:40 p.m.-3:30 p.m.


Modern machine learning has unlocked a new level of embodied perception and reasoning abilities by leveraging internet-scale training data. However, these advances have underplayed many classical techniques developed over the past few decades, and such systems fail in unpredictable and unintuitive ways when deployed in real-world applications. I postulate that a blend of classical and learned methods is the most promising path to developing flexible, interpretable, and actionable models of the world: a necessity for intelligent embodied agents.

My research intertwines classical and learning-based techniques to bring together the best of both worlds by building multi-sensory models of the 3D world. In this talk, I will share some recent efforts (by my collaborators and me) on building world models and inference techniques geared towards spatial and physical understanding. In particular, I will talk about two themes:

  1. leveraging differentiable programs for physical understanding in a dynamic world
  2. integrating features from large learned models for open-set and multimodal perception

Bio:   Krishna Murthy is a postdoc at MIT with Josh Tenenbaum and Antonio Torralba. His research focuses on building multi-sensory world models to help embodied agents perceive, reason about, and act in the world around them. He has organized multiple workshops at ICLR, NeurIPS, and ICCV on themes spanning differentiable programming, physical reasoning, 3D vision and graphics, and ML research dissemination.

His research has been recognized with graduate fellowship awards from NVIDIA and Google (2021); a best paper award from Robotics and Automation Letters (2019); and induction into the RSS Pioneers cohort (2020).