Elaine Lau

I am a Member of Technical Staff at Handshake AI, building frontier benchmark suites and evaluation infrastructure for large language models and agentic AI systems. My work spans benchmark design, data curation, evaluation methodology, and model analysis.

Previously, I was a Machine Learning Research Engineer at Scale AI, where I built reasoning datasets, training pipelines, and safety evaluation frameworks for LLM-based browser agents, and a researcher at Valence Labs – Recursion, where I developed QGFN. I completed my M.Sc. at McGill University and Mila, advised by Doina Precup and Emmanuel Bengio.

news

Dec 2025

Joined Handshake AI as Member of Technical Staff

Building frontier benchmark suites and evaluation infrastructure for LLMs and agentic AI.

2026

ICML 2026: SciPredict

Can LLMs predict outcomes of scientific experiments in natural sciences?

arXiv

Dec 2025

NeurIPS 2025 Workshop: Adaptive Guidance for RL of Reasoning Models

Workshop on Efficient Reasoning. Co-authors: V. Nath, A. Gunjal, et al.

arXiv

Oct 2025

ICLR 2026: Rubrics as Rewards

Reinforcement learning beyond verifiable domains. Co-authors: A. Gunjal, A. Wang, et al.

arXiv OpenReview

Jan 2025

Invited Talk: AAAI Web Agents 2025

Invited talk on jailbreaks in LLM-powered browser agents.

ICLR 2025

Refusal-Trained LLMs Are Easily Jailbroken As Browser Agents

Refusal-trained LLMs can still be jailbroken when deployed as web agents; evaluated on 100 adversarial behaviors and 40 sandbox sites.

arXiv Code

NeurIPS 2024

QGFN: Controllable Greediness with Action Values

Combines GFlowNet policies with action-value estimates for diverse high-reward generation.

arXiv Code

AAAI 2023

Deep Conservative RL for Mechanical Ventilation

Equal-contribution work applying conservative offline RL to personalize ventilator settings in ICUs.

Paper

Looking for specifics? See publications or projects for links and code.