CS Ph.D. Student

Mila, University of Montreal

My name is Yuchen Lu. I am currently a Ph.D. candidate at Mila, University of Montreal, supervised by Prof. Aaron Courville. Before that, I received my undergraduate degree at UIUC, where I worked with Prof. Jian Peng. I was also an undergraduate at Shanghai Jiao Tong University.

My fundamental research interest is language learning as systematic generalization. Humans can generate novel, unseen utterances from a limited sample of data, an ability that current machine learning approaches lack. Building an intelligent agent that acquires language as efficiently as humans do is an important next step, as we are seeing diminishing returns from increasing the size of language models. I believe two main pieces of the puzzle are missing:

Humans learn language in an embodied environment and acquire it as a tool to influence the world around them. We should model situated language learning beyond learning from a static corpus.
Language evolves and adapts through an iterated transmission process, so that it becomes structured and easy to acquire for later generations. We should model this cultural-evolution aspect of language in our language learning.

I enjoy seeing the impact of my research. Recently, our research team, partnered with WebDip, successfully developed an AI player for the board game Diplomacy, and it was covered by one of the most popular podcasts in the community.

I also co-founded Tuninsight, an award-winning Montreal-based start-up.

My email is luyuchen [DOT] paul [AT] gmail [DOT] com.

Interests

Natural Language Processing
Emergent Communication
Embodied Language Learning

Education

BSc in CS, 2015-2017

University of Illinois, Urbana-Champaign

BSc in ECE, 2013-2015

Shanghai Jiao Tong University

Recent News

[05/17/2021] I will join Facebook as a research intern this summer, working on multimodal pretraining

[01/12/2020] Our paper on Iterated learning for emergent systematicity in VQA was accepted at ICLR 2021 (Oral)

[01/12/2020] Our new paper on unsupervised task decomposition was accepted at ICLR 2021

Experience

Research Intern

Facebook

May 2021 – Aug 2021 Montreal, Canada

I focus on the problem of large-scale multimodal pretraining.

Research Intern (Canceled due to COVID)

MIT-IBM Watson AI Lab

Jun 2020 – Sep 2020 Boston, US

Hosted by Chuang Gan. I studied the problem of unsupervised task decomposition from unstructured demonstrations. The work was accepted at ICLR 2021.

Research Intern

Horizon Robotics

May 2017 – Aug 2017 Beijing, China

Hosted by Heng Luo. I researched adversarial examples and the label leaking effect.

Selected Publications

Ankit Vani, Max Schwarzer, Yuchen Lu, Eeshan Dhekane, Aaron Courville

January 2021 ICLR (Oral)

Iterated learning for emergent systematicity in VQA

Although neural module networks have an architectural bias towards compositionality, they require gold standard layouts to generalize systematically in practice. When instead learning layouts and modules jointly, compositionality does not arise automatically and an explicit pressure is necessary for the emergence of layouts exhibiting the right structure. We propose to address this problem using iterated learning, a cognitive science theory of the emergence of compositional languages in nature that has primarily been applied to simple referential games in machine learning. Considering the layouts of module networks as samples from an emergent language, we use iterated learning to encourage the development of structure within this language. We show that the resulting layouts support systematic generalization in neural agents solving the more complex task of visual question-answering. Our regularized iterated learning method can outperform baselines without iterated learning on SHAPES-SyGeT (SHAPES Systematic Generalization Test), a new split of the SHAPES dataset we introduce to evaluate systematic generalization, and on CLOSURE, an extension of CLEVR also designed to test systematic generalization. We demonstrate superior performance in recovering ground-truth compositional program structure with limited supervision on both SHAPES-SyGeT and CLEVR.

Yuchen Lu, Yikang Shen, Siyuan Zhou, Aaron Courville, Josh B. Tenenbaum, Chuang Gan

January 2021 ICLR

Learning Task Decomposition with Ordered Memory Network

Many complex real-world tasks are composed of several levels of sub-tasks. Humans leverage these hierarchical structures to accelerate the learning process and achieve better generalization. In this work, we study the inductive bias and propose the Ordered Memory Policy Network (OMPN) to discover subtask hierarchy by learning from demonstration. The discovered subtask hierarchy could be used to perform task decomposition, recovering the subtask boundaries in an unstructured demonstration. Experiments on Craft and Dial demonstrate that our model can achieve higher task decomposition performance under both unsupervised and weakly supervised settings, compared with strong baselines. OMPN can also be directly applied to partially observable environments and still achieve higher task decomposition performance. Our visualization further confirms that the subtask hierarchy can emerge in our model.

Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville

May 2020 ICML

Countering Language Drift with Seeded Iterated Learning

Supervised learning methods excel at capturing statistical properties of language when trained over large text corpora. Yet, these models often produce inconsistent outputs in goal-oriented language settings as they are not trained to complete the underlying task. Moreover, as soon as the agents are finetuned to maximize task completion, they suffer from the so-called language drift phenomenon: they slowly lose syntactic and semantic properties of language as they only focus on solving the task. In this paper, we propose a generic approach to counter language drift by using iterated learning. We iterate between finetuning agents with interactive training steps, and periodically replacing them with new agents that are seeded from last iteration and trained to imitate the latest finetuned models. Iterated learning does not require external syntactic constraint nor semantic knowledge, making it a valuable task-agnostic finetuning protocol. We first explore iterated learning in the Lewis Game. We then scale-up the approach in the translation game. In both settings, our results show that iterated learning drastically counters language drift as well as it improves the task completion metric.

Philip Paquette (co-author), Yuchen Lu (co-author), Steven Bocco, Max O. Smith, Satya Ortiz-Gagne, Jonathan K. Kummerfeld, Satinder Singh, Joelle Pineau, Aaron Courville

May 2019 NeurIPS

No Press Diplomacy: Modeling Multi-Agent Gameplay

Diplomacy is a seven-player non-stochastic, non-cooperative game, where agents acquire resources through a mix of teamwork and betrayal. Reliance on trust and coordination makes Diplomacy the first non-cooperative multi-agent benchmark for complex sequential social dilemmas in a rich environment. In this work, we focus on training an agent that learns to play the No Press version of Diplomacy where there is no dedicated communication channel between players. We present DipNet, a neural-network-based policy model for No Press Diplomacy. The model was trained on a new dataset of more than 150,000 human games. Our model is trained by supervised learning (SL) from expert trajectories, which is then used to initialize a reinforcement learning (RL) agent trained through self-play. Both the SL and RL agents demonstrate state-of-the-art No Press performance by beating popular rule-based bots.

Slides

Sep 19, 2019 2:00 PM Mila, University of Montreal

Paper Presentation: Re-evaluate Evaluation

Reinforcement Learning and Control as Probabilistic Inference

Iterated Learning for Deep Learning