Building Caregiving Robots

Tapomayukh Bhattacharjee, Cornell University

5/20/2021

Location: Zoom

Time: 2:40 p.m.

Abstract: How do we build robots that can assist people with mobility limitations in activities of daily living? To successfully perform these activities, a robot needs to be able to physically interact with humans and objects in unstructured human environments. In the first part of my talk, I will show how a robot can use multimodal sensing to infer properties of these physical interactions using data-driven methods and physics-based models. In the second part of the talk, I will show how a robot can leverage these properties to feed people with mobility limitations. Successful robot-assisted feeding depends on reliable bite acquisition of hard-to-model deformable food items and easy bite transfer. Using insights from human studies, I will showcase algorithms and technologies that leverage multiple sensing modalities to perceive varied food item properties and determine successful strategies for bite acquisition and transfer. Using feedback from all the stakeholders, I will show how we built an autonomous robot-assisted feeding system based on these algorithms and technologies and deployed it in the real world, where it fed real users with mobility limitations. I will conclude the talk with some ideas for future work in my new lab at Cornell.

A dirty laundry list for paper writing

Kirstin Petersen, Cornell University

5/6/2021

Location: Zoom

Time: 2:40 p.m.

Abstract: This is meant as a highly interactive discussion on how to write and review research papers: the “to dos” and, perhaps more importantly, the “not to dos”.

Learning through Interaction in Cooperative Multi-Agent Systems

Kalesha Bullard, Facebook AI Research

4/29/2021

Location: Zoom

Time: 2:40 p.m.

Abstract: Effective communication is an important skill for enabling information exchange and cooperation in multi-agent systems, in which agents coexist in shared environments with humans and/or other artificial agents. Indeed, human domain experts can be a highly informative source of instructive guidance and feedback (supervision). My prior work explores this type of interaction in depth as a mechanism for enabling learning for artificial agents. However, dependence upon human partners for acquiring or adapting skills has important limitations. Human time and cognitive load are typically constrained (particularly in realistic settings), and data collection from humans, though potentially qualitatively rich, can be slow and costly. Yet the ability to learn through interaction with other agents represents another powerful mechanism for interactive learning. Though other artificial agents may also be novices, agents can co-learn by providing each other with evaluative feedback (reinforcement), provided the learning task is sufficiently structured and allows for generalization to novel settings.

This talk presents research that investigates methods for enabling agents to learn general communication skills through interactions with other agents. In particular, the talk will focus on my ongoing work within Multi-Agent Reinforcement Learning, investigating emergent communication protocols inspired by communication in more realistic settings. We present a novel problem setting and a general approach that allows for zero-shot coordination (ZSC), i.e., discovering protocols that can generalize to independently trained agents. We also explore and analyze specific difficulties associated with finding globally optimal ZSC protocols as the complexity of the communication task increases or the modality of communication changes (e.g., from symbolic communication to implicit communication through the physical movement of an embodied artificial agent). Overall, this work opens up exciting avenues for learning general communication protocols in complex domains.
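To make the zero-shot coordination problem concrete, here is a deliberately tiny, hypothetical Python (NumPy) sketch, not taken from the talk: independently trained speaker-listener pairs each learn a working protocol for a small referential game, but naively pairing a speaker from one run with a listener from another typically fails, because each pair converges to its own arbitrary naming convention. The game, learning rule, and hyperparameters are illustrative assumptions.

import numpy as np

N_CONCEPTS, N_MESSAGES, LR, STEPS = 5, 5, 0.5, 3000

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def train_pair(seed):
    # One speaker/listener pair trained with a simple REINFORCE-style rule.
    r = np.random.default_rng(seed)
    speaker = np.zeros((N_CONCEPTS, N_MESSAGES))   # message logits per concept
    listener = np.zeros((N_MESSAGES, N_CONCEPTS))  # guess logits per message
    for _ in range(STEPS):
        c = r.integers(N_CONCEPTS)                           # target concept
        pm = softmax(speaker[c]); m = r.choice(N_MESSAGES, p=pm)
        pg = softmax(listener[m]); g = r.choice(N_CONCEPTS, p=pg)
        reward = 1.0 if g == c else 0.0                      # shared reward
        speaker[c] += LR * reward * (np.eye(N_MESSAGES)[m] - pm)
        listener[m] += LR * reward * (np.eye(N_CONCEPTS)[g] - pg)
    return speaker, listener

def accuracy(speaker, listener):
    # Greedy evaluation of a (possibly mismatched) speaker/listener pair.
    msgs = speaker.argmax(axis=1)
    return np.mean(listener[msgs].argmax(axis=1) == np.arange(N_CONCEPTS))

pairs = [train_pair(seed) for seed in range(4)]
self_play = np.mean([accuracy(s, l) for s, l in pairs])
cross_play = np.mean([accuracy(pairs[i][0], pairs[j][1])
                      for i in range(4) for j in range(4) if i != j])
print(self_play)    # typically near 1.0: each pair coordinates with itself
print(cross_play)   # typically near chance: the protocols do not transfer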

Physically Interactive Intelligence — A path towards autonomous embodied agents

Roberto Martin-Martin, Stanford University

4/22/2021

Location: Zoom

Time: 2:40 p.m.

Abstract: What is the role of physical interaction in embodied intelligence? In robotics, physical interaction is often reduced to a minimum because it is considered difficult to plan, control, and execute, has unpredictable effects, and may be dangerous for the robot and anything or anyone around it. To compensate, we impose extremely high requirements on computation: perception, planning, and control. However, when observing humans, we see that our ability to perform tasks autonomously, in a versatile and robust manner, comes from rich, continuous, and resourceful interactions with the environment, which I call Physically Interactive Intelligence.

In my research, I develop new learning algorithms that enable embodied AI agents to exploit interactions to gain autonomy, and I test them in realistic integrated robotic systems. I propose to promote physical interaction to a foundational component of novel robotic solutions. I will present new methods to learn to control and exploit physical interactions, even for tasks where they are not traditionally used, such as perception and navigation. These lines of work support my overall research hypothesis: autonomous behavior and grounded understanding in embodied AI agents are achieved through the resourceful use of physical interaction with the environment, i.e., through physically interactive intelligence.

From Semantics to Localization in LiDAR Maps for Autonomous Vehicles

Abhinav Valada, University of Freiburg

4/15/2021

Location: Zoom

Time: 2:40 p.m.

Abstract: LiDAR-based scene interpretation and localization play a critical role in enabling autonomous vehicles to navigate safely in the environment. The last decade has witnessed unprecedented progress in these tasks by exploiting learning techniques to improve performance and robustness. Despite these advances, the unordered, sparse, and irregular structure of point clouds poses several unique challenges that lead to suboptimal performance when employing standard convolutional neural networks (CNNs). In this talk, I will discuss three efforts targeted at addressing some of these challenges. First, I will present our state-of-the-art approach to LiDAR panoptic segmentation that employs a 2D CNN while explicitly leveraging the unique 3D information provided by point clouds at multiple stages in the network. I will then present our recent work that incorporates a differentiable unbalanced optimal transport algorithm to detect loop closures in LiDAR point clouds and outperforms both existing learning-based and handcrafted methods. Next, to alleviate the need for expensive LiDAR sensors on every robot, I will present the first approach for monocular camera localization in LiDAR maps that effectively generalizes to new environments without any retraining and independently of the camera parameters. Finally, I will conclude the talk with a discussion of opportunities for further scaling up the learning of these tasks.
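As background for the optimal-transport idea, the following small, hypothetical Python sketch scores a candidate loop closure by matching two sets of local descriptors with entropy-regularized (balanced) optimal transport via Sinkhorn iterations. The work discussed in the talk uses a differentiable, unbalanced formulation learned end-to-end together with the descriptors; the random descriptors, the balanced solver, and the scoring rule below are only stand-ins to show the mechanics.

import numpy as np

def sinkhorn_plan(cost, eps=0.1, iters=200):
    # Entropy-regularized optimal transport between two uniform marginals.
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-cost / eps)                    # Gibbs kernel
    u, v = np.ones(n), np.ones(m)
    for _ in range(iters):                     # Sinkhorn scaling iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]         # transport plan

def loop_closure_score(desc_a, desc_b):
    # Higher score = the two descriptor sets match well = likely revisit.
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    cost = 1.0 - a @ b.T                       # cosine distance matrix
    plan = sinkhorn_plan(cost)
    return -np.sum(plan * cost)                # negative transport cost

rng = np.random.default_rng(0)
query = rng.normal(size=(128, 32))             # stand-in local descriptors
revisit = query + 0.05 * rng.normal(size=query.shape)
elsewhere = rng.normal(size=(128, 32))
print(loop_closure_score(query, revisit))      # near 0: cheap to match
print(loop_closure_score(query, elsewhere))    # clearly more negative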

Tracking Beyond Recognition

Aljosa Osep, Technical University of Munich

4/8/2021

Location: Zoom

Time: 2:40 p.m.

Abstract: Spatio-temporal interpretation of raw sensory data is vital for intelligent agents to understand how to interact with the environment and perceive how the trajectories of moving agents evolve in the 4D continuum, i.e., 3D space and time. To this end, I will first talk about our recent efforts in the semantic and temporal understanding of raw sensory data, starting with our work on multi-object tracking and segmentation. Then, I will discuss how to generalize these ideas towards holistic temporal scene understanding, jointly tackling object instance segmentation, tracking, and semantic understanding of monocular video sequences and LiDAR streams. Finally, I will move on to the challenging problem of scaling object instance segmentation and tracking models to the open world, in which future mobile agents will need to continuously learn without explicit human supervision. In such scenarios, intelligent agents encounter, and need to react to, unknown dynamic objects that were not observed during model training.
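For readers less familiar with the tracking vocabulary, the snippet below is a minimal, generic tracking-by-detection step in Python (IoU-based association of boxes across frames with the Hungarian algorithm). It is not the speaker's method, which operates on segmentation masks, LiDAR, and open-world categories; it only grounds the basic association problem that such works build on. The boxes and threshold are illustrative.

import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    # IoU of two axis-aligned boxes given as (x1, y1, x2, y2).
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def associate(tracks, detections, min_iou=0.3):
    # Match existing tracks to new detections by maximizing total IoU.
    if not tracks or not detections:
        return [], list(range(len(detections)))
    cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracks])
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= min_iou]
    matched_dets = {c for _, c in matches}
    unmatched = [c for c in range(len(detections)) if c not in matched_dets]
    return matches, unmatched

tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]        # boxes from frame t
detections = [(21, 19, 31, 29), (1, 1, 11, 11)]    # boxes from frame t+1
print(associate(tracks, detections))               # -> ([(0, 1), (1, 0)], [])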

Model-Based Visual Imitation Learning

Franziska Meier, Facebook AI Research

3/25/2021

Location: Zoom

Time: 2:40 p.m.

Abstract: How can we teach robots new skills by simply showing them what to do? In this talk, I am going to present our recent work on learning reward functions from visual demonstrations via model-based inverse reinforcement learning (IRL). Given the reward function, a robot can then learn the demonstrated task autonomously. More concretely, I will show how we can frame model-based IRL as a bi-level optimization problem, which allows us to learn reward functions by directly minimizing the distance between a demonstrated trajectory and a predicted trajectory. In order to do so from visual demonstrations, a key ingredient is a visual dynamics model that enables the robot to predict the visual trajectory it would observe if it were to execute a policy. I will discuss the opportunities and challenges of this research direction and will end with an outlook on future work.
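To illustrate the bi-level structure (and only that), here is a toy Python/JAX sketch: the inner loop plans a trajectory under a fixed point-mass dynamics model, standing in for a learned visual dynamics model, by gradient ascent on the current reward; the outer loop differentiates through that planner to update the reward parameters so the planned trajectory matches the demonstration. The dynamics, reward parameterization, and optimizers are assumptions chosen for brevity, not the models from the talk.

import jax
import jax.numpy as jnp

T, INNER_STEPS, INNER_LR, OUTER_LR = 20, 50, 0.2, 0.2

def rollout(actions, s0):
    # Unroll the (stand-in) dynamics model s' = s + 0.1 * a.
    def step(s, a):
        s_next = s + 0.1 * a
        return s_next, s_next
    _, traj = jax.lax.scan(step, s0, actions)
    return traj                                    # (T, 2) predicted trajectory

def plan(theta, s0):
    # Inner problem: gradient ascent on the reward r_theta(s) = -||s - theta||^2.
    def total_reward(actions):
        return -jnp.sum((rollout(actions, s0) - theta) ** 2)
    actions = jnp.zeros((T, 2))
    for _ in range(INNER_STEPS):                   # differentiable inner optimizer
        actions = actions + INNER_LR * jax.grad(total_reward)(actions)
    return rollout(actions, s0)

def outer_loss(theta, s0, demo):
    # Outer problem: distance between demonstrated and planned trajectories.
    return jnp.mean((plan(theta, s0) - demo) ** 2)

s0 = jnp.zeros(2)
demo = jnp.linspace(0.05, 1.0, T)[:, None] * jnp.array([1.0, -1.0])
theta = jnp.zeros(2)                               # reward parameters (a "goal")
for _ in range(100):
    theta = theta - OUTER_LR * jax.grad(outer_loss)(theta, s0, demo)
print(theta)          # ends up pointing along the demonstrated direction [1, -1]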

Generalized Lazy Search for Efficient Robot Motion Planning

Aditya Mandalika, University of Washington

3/11/2021

Location: Zoom

Time: 2:40 p.m.

Abstract: Robotics has become part of the solution in various applications today: autonomous vehicles navigating busy streets, articulated robots tirelessly sorting packages in warehouses or feeding people in care homes, and mobile robots assisting in rescue operations. Central to any robot that needs to navigate its environment is Motion Planning: the task of computing a collision-free motion for a (robotic) system between given start and goal states in an environment cluttered with obstacles. As tasks become more complex, there is a need to develop more sophisticated motion planning algorithms that can quickly compute high-quality solutions for the robot.

In this talk, we will specifically investigate the computational bottlenecks in searching for the shortest path on a graph: search effort and collision evaluations. Lazy search algorithms can efficiently solve shortest path problems in which evaluating edges for collision is expensive, as is the case in robotics. We show that existing algorithms can provably minimize the number of collision evaluations, but at the cost of increased graph operations, which can be prohibitively expensive in cluttered environments that necessitate large graphs. We then discuss a framework of lazy search algorithms that seamlessly interleaves lazy search with edge evaluations to prevent wasted computational effort and to minimize the total planning time. I will close the talk with a brief discussion of the efficacy of the framework, potential extensions, and (exciting) future work.
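To make the "lazy" idea concrete, below is a small Python sketch in the spirit of lazy shortest-path planners such as LazySP, not the speaker's exact framework: the planner repeatedly runs a cheap graph search that optimistically assumes unevaluated edges are collision-free, and only pays for expensive collision checks on edges of the current candidate path, replanning as soon as a check fails. The toy roadmap and the collision oracle are placeholders.

import heapq

def dijkstra(graph, start, goal, blocked):
    # Cheap search that optimistically treats unevaluated edges as free.
    dist, prev, pq = {start: 0.0}, {}, [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue                               # stale queue entry
        for v, cost in graph[u]:
            if (u, v) in blocked or (v, u) in blocked:
                continue
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if goal not in dist:
        return None
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]

def lazy_shortest_path(graph, start, goal, is_collision_free):
    # Interleave cheap (re)search with expensive edge evaluation.
    evaluated, blocked = set(), set()
    while True:
        path = dijkstra(graph, start, goal, blocked)
        if path is None:
            return None                            # no collision-free path exists
        unchecked = [(u, v) for u, v in zip(path, path[1:])
                     if (u, v) not in evaluated]
        if not unchecked:
            return path                            # every edge on the path is valid
        for edge in unchecked:                     # evaluate only candidate edges
            evaluated.add(edge); evaluated.add(edge[::-1])
            if not is_collision_free(edge):
                blocked.add(edge)
                break                              # replan as soon as one fails

# Tiny illustrative roadmap: directed adjacency lists with edge costs.
graph = {"A": [("B", 1.0), ("C", 2.0)], "B": [("D", 1.0)],
         "C": [("D", 1.0)], "D": []}
print(lazy_shortest_path(graph, "A", "D", lambda e: e != ("B", "D")))
# -> ['A', 'C', 'D'], after only one failed collision check on (B, D)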

Improving Model Predictive Control in Model-based Reinforcement Learning

Nathan Lambert, University of California, Berkeley

3/4/2021

Location: Zoom

Time: 2:40 p.m.

Abstract: Model-based reinforcement learning is developing into a useful candidate for data-efficient control synthesis for complex robotic tasks. Using simple one-step dynamics models learned from limited data has proven useful in a wide variety of simulated and experimental tasks. Frequently, the one-step models are unrolled to form longer trajectory predictions for optimization in model-predictive control. In this talk, we detail how the dual optimizations of accurate one-step predictions and of a trajectory-level control mechanism can result in an objective mismatch. We then detail work that begins to address this mismatch and improves the peak performance and computational efficiency of model-based reinforcement learning.
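The pipeline under discussion can be summarized in a few lines of Python. The sketch below, a toy stand-in rather than the paper's setup, fits a one-step linear dynamics model by minimizing one-step mean-squared error and then unrolls it inside a random-shooting model-predictive controller; the comments mark where the objective mismatch enters, since the model is never trained on the long-horizon predictions the controller actually uses. The environment, model class, and reward are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
HORIZON, N_CANDIDATES = 10, 256

def true_dynamics(s, a):                 # unknown to the agent
    return s + 0.1 * a + 0.01 * rng.normal(size=s.shape)

def reward(s):                           # task: drive the state to the origin
    return -np.sum(s ** 2, axis=-1)

# 1. Fit a linear one-step model s' ~ [s, a] @ W by least squares (one-step MSE).
S = rng.normal(size=(1000, 2)); A = rng.normal(size=(1000, 2))
S_next = np.stack([true_dynamics(s, a) for s, a in zip(S, A)])
X = np.hstack([S, A])
W, *_ = np.linalg.lstsq(X, S_next, rcond=None)

def model(s, a):
    return np.hstack([s, a]) @ W

# 2. Random-shooting MPC: unroll the model, keep the best action sequence.
def mpc_action(s0):
    plans = rng.uniform(-1, 1, size=(N_CANDIDATES, HORIZON, 2))
    returns = np.zeros(N_CANDIDATES)
    for k in range(N_CANDIDATES):
        s = s0
        for t in range(HORIZON):
            s = model(s, plans[k, t])    # multi-step use of a one-step model:
            returns[k] += reward(s)      # errors compound here, and the MSE
    return plans[np.argmax(returns), 0]  # objective never saw this usage

s = np.array([1.0, -1.0])
for _ in range(30):                      # receding-horizon control loop
    s = true_dynamics(s, mpc_action(s))
print(s)                                 # typically ends up near the origin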

Open-source legged robotics, from hardware to control software

Majid Khadiv, Max Planck Institute for Intelligent Systems

2/25/2021

Location: Zoom

Time: 10:00 a.m.

Abstract: Legged robots (especially humanoids) are the most suitable robot platforms for future deployment in our daily lives. However, the complexity of the mechanical structure of these robots, as well as the need for advanced control software, has hindered progress in this field. On the hardware side, there is no standard hardware that researchers can use to benchmark and compare their algorithms. Furthermore, legged robots are expensive, and not every lab can afford to buy them for research. On the control side, the dynamics of these robots are highly complex, which makes their control extremely challenging. This complexity has several aspects: 1) these robots are under-actuated and can easily fall down if not controlled properly, 2) locomotion can only be realized through establishing and breaking contact, which results in hybrid dynamics, 3) the system is very high dimensional (up to 100 states and 50 control inputs) and the dynamic model is highly nonlinear, and 4) the system is highly constrained due to the limited contact forces between the robot and the environment, and so on. In this talk, I will first briefly present our recent efforts in the Open Dynamic Robot Initiative (ODRI) to provide the community with low-cost but high-performance legged platforms that are fully open-source and can be replicated quickly using 3D-printing technology. I will also talk extensively about my recent efforts to find tractable approaches at the intersection of optimal control and reinforcement learning to safely control legged robots in the presence of different uncertainties and disturbances.