Improving Model Predictive Control in Model-based Reinforcement Learning

Nathan Lambert, University of California, Berkeley


Location: Zoom

Time: 2:40 p.m.

Abstract: Model-based reinforcement learning is emerging as a promising approach to data-efficient control synthesis for complex robotic tasks. Simple one-step dynamics models learned from little data have proven useful in a wide variety of simulated and experimental tasks. Frequently, the one-step models are unrolled to form longer trajectory predictions for optimization in model-predictive control. In this talk, we detail how optimizing first for accurate one-step predictions and then for a trajectory control mechanism can result in an objective mismatch. We then detail work that can begin to address this mismatch and improve the peak performance and computational efficiency of model-based reinforcement learning.

Open-source legged robotics, from hardware to control software

Majid Khadiv, Max-Planck Institute for Intelligent Systems


Location: Zoom

Time: 10:00 a.m.

Abstract: Legged robots (especially humanoids) are the most suitable robot platforms for deployment in our daily lives in the future. However, the complexity of the mechanical structure of these robots, as well as the need for advanced control software, has hindered progress in this field. On the hardware side, there is no standard hardware that researchers can use to benchmark and compare their algorithms. Furthermore, legged robots are expensive and not every lab can afford to buy them for research. On the control side, the dynamics of these robots are highly complex, which makes their control extremely challenging. This complexity has several aspects: 1) these robots are under-actuated and could easily fall down if not controlled properly, 2) locomotion can only be realized through establishing and breaking contact, which imposes hybrid dynamics, 3) the system is very high dimensional (up to 100 states and 50 control inputs) and the dynamic model is highly nonlinear, 4) the system is extremely constrained due to the limited contact forces between the robot and the environment, etc. In this talk, I will first briefly present our recent efforts in the Open Dynamic Robot Initiative (ODRI) to provide the community with low-cost but high-performance legged platforms that are fully open-source and can be replicated quickly using 3D-printing technology. I will also talk extensively about my recent efforts to find tractable ways, at the intersection of optimal control and reinforcement learning, to safely control legged robots in the presence of different uncertainties and disturbances.

Trait-based Coordination of Heterogeneous Multi-Agent Teams

Harish Ravichandar, Georgia Institute of Technology


Location: Zoom

Time: 2:40 p.m.

Abstract: Heterogeneous multi-agent teams have the potential to carry out complex multi-task operations that are intractable for their homogeneous counterparts. Indeed, heterogeneous teams can impact a wide variety of domains, such as disaster relief, warehouse automation, autonomous driving, defense, and environmental monitoring. However, effective coordination of such teams requires careful consideration of the teams’ diverse, but finite, resources, as well as their ability to satisfy complex requirements associated with concurrent tasks. In this talk, I will introduce a family of application-agnostic approaches that can coordinate heterogeneous multi-agent teams by effectively leveraging the relative strengths of agents when satisfying the requirements of different tasks. A unifying theme across these approaches is that both agents and tasks are modeled in terms of capabilities (i.e., traits). As such, these approaches are readily generalizable to new teams and agents without much additional modeling or computational effort. In particular, I will discuss techniques and challenges associated with i) forming effective coalitions that can satisfy known requirements of heterogeneous tasks, and ii) learning how to coordinate heterogeneous teams from human experts when exact requirements are unavailable.