meditator | yogi | casual naïve artist | self-taught
amateur handbalancer & contortionist-in-progress
earthling | hooooman | occasionally plant


๐Ÿ‘๏ธโ€๐Ÿ—จ๏ธ favorite piece of contemplative wisdom ๐Ÿ‘๏ธโ€๐Ÿ—จ๏ธ

"If your mind is empty...it is open to everything.
In the beginner's mind there are many possibilities, but in the expert's mind there are few."
—Shunryu Suzuki, “Zen Mind, Beginner's Mind”

Hi 🖖🏽

Don't hesitate to reach out if you want to chat about research: Schedule a meeting

Bio

Broadly, I am interested in the study of cognition in humans 🧠 and machines 🤖, aiming both to advance scientific knowledge and understanding, and to support the development of technology that enhances human health and well-being and minimizes suffering.

My research spans Reinforcement Learning (RL) and sequential decision-making, policy optimization, deep learning, and theoretical & computational neuroscience.

I use tools and perspectives from mathematics, statistics, optimization, machine/deep learning, numerical methods, and dynamical systems, as well as knowledge and insights from psychopharmacology.

Research questions

🤖 AI: RL, policy optimization, learning theory, deep & lifelong learning, dynamical systems
🧠 neuroscience: the study and understanding of the mechanisms underlying sensory, motor, or cognitive computations; neuroplasticity; psychedelic research
Fan of research mysteries 🔮 and fundamental questions 🦄

I appreciate research that offers a new understanding, insight, or connection across branches of research or fields, pointing toward a unifying truth.

Interests
  • reinforcement learning/policy optimization
  • deep/online/lifelong learning
  • computational neuroscience
  • psychedelic research
  • altered states of consciousness
  • causality
Education
  • PhD in Computer Science (AI/RL/neuro), 2019–present

    McGill University / Mila Quebec AI Institute

  • M.Sc. in Computer Science (AI), 2013

    University Politehnica of Bucharest

  • B.Sc. in Computer Science (math/CS), 2009

    University Politehnica of Bucharest

Research overview

I'm currently focusing on 🤖 acceleration for policy optimization in RL and 🧠 computational models of psychedelic action.

Past research chapters

I previously explored the idea of anticipating the future and adapting to it, and proposed a simple template for accelerating policy gradient algorithms by integrating foresight into the policy improvement step, via optimistic and adaptive policy updates. I defined optimism as predictive modeling of the future behavior of a policy, and adaptivity as taking immediate and anticipatory corrective actions to mitigate the errors that accumulate from overshooting predictions or delayed responses to change. Currently, I am investigating acceleration within the general family of Policy Mirror Descent (PMD) algorithms, which covers a wide range of fundamental and novel methods in reinforcement learning.
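
As background, here is a minimal sketch of the generic PMD update, in its standard form from the literature (not the specific accelerated variant I study): at iteration k, the policy at each state s is improved by a regularized greedy step on its current action values,

    π_{k+1}(·|s) = argmax_{p ∈ Δ(A)} [ η_k ⟨Q^{π_k}(s,·), p⟩ − D(p, π_k(·|s)) ]

where Δ(A) is the probability simplex over actions, η_k is a step size, and D is a Bregman divergence; choosing D to be the KL divergence recovers natural-policy-gradient-style updates as a special case.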

I also studied policy optimization as a joint-maximization problem, and worked on a surrogate policy learning objective for the joint maximization of a policy and its value function. Practical implementations of policy-based algorithms rely on value functions, represented as neural networks, to compute the policy gradient; this introduces challenges related to the accuracy of the policy gradient, particularly in the low-capacity regimes characteristic of agents with bounded rationality.
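
For context, the dependence runs through the standard actor-critic form of the policy gradient (notation mine): the true action values are replaced by a learned critic Q_w,

    ∇_θ J(π_θ) ≈ E_{s∼d^{π_θ}, a∼π_θ(·|s)} [ ∇_θ log π_θ(a|s) · Q_w(s,a) ]

so any approximation error in Q_w, e.g. from a low-capacity network, directly biases the policy update; this is one motivation for treating the policy and its value function as a joint optimization problem.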

Previously, I studied credit assignment in value-based agents, focusing on how agents should model the environments they interact with, whether prospectively, using forethought, or retrospectively, using hindsight models and backward-looking mechanisms for adaptivity.

I then showed how these models can be extended to include selectivity via simple contextual attention-based mechanisms, and how such mechanisms can be learned from experience.

Traditionally, in machine learning, we often assume that the stream of future data will resemble the data seen so far, yet this assumption may not hold in the complexity of real-world settings, where the dynamics of the environment evolve in partially predictable ways over time and space. I have explored some of these challenges and proposed incorporating predictive knowledge of the agent's future behavior to mitigate them.

Industry experience

Research Scientist Intern 🌀 DeepMind Montreal
April 2022 – April 2023

Research Scientist Intern 🌀 DeepMind London
February 2021 – June 2021

Machine Learning Engineer 🌀 Apsisware (’17–’18) / Sparktech Software (’16)
January 2016 – January 2018, Bucharest, Romania

Software Engineer 🌀 Deutsche Bank (’15) / Misys (’14) / Cronian Labs (’13)
January 2013 – January 2015, Bucharest, Romania

Academic awards

IVADO's PhD Excellence Scholarship
Borealis AI Fellowship
Excellence Scholarship
Merit Scholarship

Recent Publications & Preprints

(2020). Lambda Successor Return Error. Biological and Artificial Reinforcement Learning Workshop at Neural Information Processing Systems (NeurIPS).

(2019). Option Discovery by Aiming to Predict. Proceedings of Reinforcement Learning and Decision Making (RLDM); Multi-Task and Lifelong Reinforcement Learning Workshop and Self-Supervised Learning Workshop at the International Conference on Machine Learning (ICML).
