Keynote/Invited speakers

  • Byoung-Tak Zhang, Seoul National University
    He is a POSCO Chair Professor of Computer Science and Engineering at Seoul National University (SNU) and Director of the SNU Artificial Intelligence Institute. He has served as President of the Korean Society for Artificial Intelligence (2010-2013) and of the Korean Society for Cognitive Science (2016-2017). He received his PhD in computer science from the University of Bonn, Germany, in 1992 and his BS and MS in computer science and engineering from Seoul National University, Korea, in 1986 and 1988, respectively.
    Title: "Embodied AI: Machine Learning to Learning Machines"
    Abstract: Machine learning (including deep learning) has changed the paradigm of AI from rule-based “manual” programming to data-driven “automatic” programming. However, the current paradigm of machine learning requires an external system to provide the learner with data, which limits its scalability. Here we argue that the learner can feed itself data autonomously if it is embodied, i.e., equipped with sensors and actuators. Through the perception-action cycle, an embodied AI can continually learn to solve problems in a self-teaching way by taking new actions, observing their outcomes, and correcting its own predictions, as humans and animals do. In this talk, I will show some of our studies in this direction of “(embodied) learning machine” research and discuss its implications for achieving truly human-level general AI.



  • Yonatan Bisk, Carnegie Mellon University
    He is an Assistant Professor of Computer Science in Carnegie Mellon's Language Technologies Institute. His group works on grounded and embodied natural language processing, treating perception and interaction as central to how language is learned and understood. He received his PhD from the University of Illinois at Urbana-Champaign, working on unsupervised Bayesian models of syntax, before spending time at USC's ISI (working on grounding), the University of Washington (for commonsense research), and Microsoft Research (for vision+language).
    Title: "Following Instructions and Asking Questions"
    Abstract: As we move towards the creation of embodied agents that understand natural language, several new challenges and complexities arise for grounding (e.g. complex state-spaces), planning (e.g. long horizons), and social interaction (e.g. asking for help or clarifications). In this talk, I'll discuss several recent results, both on improvements to embodied instruction following within ALFRED and on initial steps towards building agents that ask questions or model theory of mind.



  • Saurabh Gupta, University of Illinois Urbana-Champaign
    He is an Assistant Professor in the ECE Department at UIUC. Before starting at UIUC, he received his PhD from UC Berkeley in 2018 and spent the following year as a Research Scientist at Facebook AI Research in Pittsburgh. His research interests span computer vision, robotics, and machine learning, with a focus on building agents that can intelligently interact with the physical world around them. He received the President's Gold Medal at IIT Delhi in 2011, the Google Fellowship in Computer Vision in 2015, an Amazon Research Award in 2020, and an NSF CAREER Award in 2022.
    Title: "Scaling Robot Learning by Understanding Videos"
    Abstract: The true gains of machine learning in AI sub-fields such as computer vision and natural language processing have come from the use of large-scale, diverse datasets for learning. In this talk, I will discuss whether and how we can leverage large-scale diverse data in the form of egocentric videos (first-person videos of humans performing different tasks) to similarly scale up policy learning for robots. I will discuss the challenges this presents and some of our initial efforts towards tackling them. In particular, I will describe techniques to acquire low-level visuomotor subroutines, high-level value functions, and an interactive understanding of objects from in-the-wild egocentric videos.