Projects

Industry-focused builds, RL work, and evaluation tooling.

BrowserART

Browser Agent Red-teaming Toolkit

Agents & Safety · 2024–25

Safety benchmark for LLM web agents.

  • 100+ adversarial behaviors across 40 sandboxed sites.
  • Harness + reports for reproducible evaluations.
  • Surfaced jailbreak classes; informed mitigations.
Reasoning RL

Reasoning Datasets & RL Training

RL for LLMs · 2024–25

Curricula + reward design for math/STEM reasoning.

  • Built custom datasets and staged reward schemata.
  • Improved pass@k on targeted benchmarks.
  • Built production training pipelines with partners.
Curriculum · RLAIF/RLHF · Eval
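
For readers unfamiliar with the pass@k metric mentioned above, here is a minimal sketch of the commonly used unbiased estimator, 1 - C(n-c, k) / C(n, k), for n samples of which c are correct. The function name and arguments are illustrative assumptions; the project's actual evaluation code is not shown here and may differ.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled solutions, c of which are correct.

    Illustrative helper (not the project's code): probability that at
    least one of k samples drawn without replacement is correct.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k incorrect samples: every draw of k contains a correct one.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Example: 10 samples per problem, 3 correct, evaluated at k = 1 and k = 5.
    print(pass_at_k(10, 3, 1))
    print(pass_at_k(10, 3, 5))
```

Per-problem estimates like this are typically averaged over the benchmark to get a single score.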
Adaptive Guidance

Adaptive Guidance for RL of Reasoning Models

RL for LLMs · 2025 (under review)

Guided training signals to accelerate reasoning RL.

  • Stability and sample-efficiency improvements.
  • Reduces reward hacking via staged curricula.
Rubrics

Rubrics as Rewards: RL Beyond Verifiable Domains

RL for LLMs · 2025

Rubric-driven rewards to train models where exact verification is hard.

  • Task-specific rubrics (clarity, safety, usefulness) as reward signals.
  • Reduces reliance on ground-truth labels; aligns with evaluator preferences.
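
One simple way to picture a rubric as a reward signal is a weighted average of per-criterion scores. The sketch below assumes each criterion has a scorer returning a value in [0, 1]; the criterion names follow the bullet above, while the toy scorers, weights, and function names are hypothetical stand-ins (in practice a scorer would likely be a model-based judge), not the method from the paper.

```python
from typing import Callable, Dict

# A scorer maps a response to a score in [0, 1] (assumption for this sketch).
Scorer = Callable[[str], float]


def rubric_reward(
    response: str,
    scorers: Dict[str, Scorer],
    weights: Dict[str, float],
) -> float:
    """Weighted average of rubric criterion scores, normalized to [0, 1]."""
    total_weight = sum(weights[name] for name in scorers)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    total = 0.0
    for name, scorer in scorers.items():
        # Clamp each score so one miscalibrated scorer cannot dominate.
        score = min(1.0, max(0.0, scorer(response)))
        total += weights[name] * score
    return total / total_weight


# Hypothetical toy scorers, for illustration only.
toy_scorers: Dict[str, Scorer] = {
    "clarity": lambda r: 1.0 if len(r.split()) <= 20 else 0.5,
    "safety": lambda r: 0.0 if "rm -rf" in r else 1.0,
    "usefulness": lambda r: 0.5,
}
toy_weights = {"clarity": 1.0, "safety": 2.0, "usefulness": 1.0}

if __name__ == "__main__":
    print(rubric_reward("Use a virtual environment.", toy_scorers, toy_weights))
```

The scalar returned here would stand in for the verifiable reward in an otherwise standard RL loop.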
ToolRL-Val

Tool-RL Data Valuation (ToolRL-Val)

RL for LLMs · 2025 (in progress)

Data valuation for tool-using LLMs to guide RL training and curation.

QGFN

QGFN — Controllable Greediness

Exploration & Generative RL · NeurIPS 2024

Action-value modulation for diverse high-reward discovery.

  • Mixture policies with action-value guidance.
  • ~4× more distinct high-reward modes on benchmarks.
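
As a rough illustration of a mixture policy with action-value guidance, the sketch below acts greedily on action values with probability p and otherwise samples from a base sampling policy. The names `policy_probs`, `q_values`, and `p` are assumptions for this example, and the published method may use different or additional mixing rules.

```python
import random
from typing import Optional, Sequence


def p_greedy_action(
    policy_probs: Sequence[float],
    q_values: Sequence[float],
    p: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick an action index from a mixture of a sampler and a greedy rule.

    With probability p, return the action with the highest action value;
    otherwise sample an action from the base policy's distribution.
    """
    if len(policy_probs) != len(q_values) or not policy_probs:
        raise ValueError("policy_probs and q_values must be non-empty and equal length")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    rng = rng or random.Random()
    if rng.random() < p:
        # Greedy branch: exploit the action-value estimate.
        return max(range(len(q_values)), key=q_values.__getitem__)
    # Sampling branch: keep the diversity of the base policy.
    return rng.choices(range(len(policy_probs)), weights=policy_probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.5, 0.3, 0.2]
    q = [0.1, 0.9, 0.4]
    print([p_greedy_action(probs, q, p=0.5, rng=rng) for _ in range(10)])
```

Here p acts as the greediness knob: p = 0 recovers the base sampler and p = 1 is fully greedy.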
Replay

Replay Buffers for Mode Discovery

Exploration & Generative RL · ICML 2023 Workshop

Ablations on buffer policies for generative exploration.

  • Improved mode coverage vs. baselines.
  • Open scripts for reproducibility.
DeepVent

DeepVent — Clinical RL

Applied RL · AAAI 2023 / RLDM 2022

Conservative RL for ventilator personalization.

  • Offline clinical data; safety-aware training.
  • Equal contribution; peer-reviewed results.
Data Review

Automated Multi-Agent Data Review

Data & Systems · 2024–25

Pipeline that filters low-quality code examples in real time.

  • Multiple reviewers (heuristic + model-based) with quorum rules.
  • Streaming moderation; audit logs & dashboards.
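
The quorum idea can be sketched as follows: each reviewer votes on an example, and the example is kept only if enough reviewers approve. The reviewer functions and threshold below are hypothetical placeholders (a model-based reviewer would be another callable with the same signature), not the pipeline's actual components.

```python
from typing import Callable, List

# A reviewer returns True to approve a code example (assumption for this sketch).
Reviewer = Callable[[str], bool]


def passes_quorum(example: str, reviewers: List[Reviewer], quorum: int) -> bool:
    """Return True if at least `quorum` reviewers approve the example."""
    if not 1 <= quorum <= len(reviewers):
        raise ValueError("quorum must be between 1 and the number of reviewers")
    approvals = 0
    for index, review in enumerate(reviewers):
        if review(example):
            approvals += 1
        if approvals >= quorum:
            return True  # Quorum reached; skip remaining reviewers.
        remaining = len(reviewers) - (index + 1)
        if approvals + remaining < quorum:
            return False  # Quorum is no longer reachable.
    return False


# Hypothetical heuristic reviewers, for illustration only.
def non_empty(code: str) -> bool:
    return bool(code.strip())


def short_enough(code: str) -> bool:
    return len(code) <= 2000


def compiles(code: str) -> bool:
    try:
        compile(code, "<example>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


if __name__ == "__main__":
    reviewers = [non_empty, short_enough, compiles]
    print(passes_quorum("x = 1", reviewers, quorum=3))
    print(passes_quorum("def (", reviewers, quorum=3))
```

Ordering cheap heuristic reviewers first lets the early exit skip more expensive checks when the outcome is already decided.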
GFN Feedback

Feedback Usefulness Detection (GFN)

Data & Systems · NVIDIA 2022

Decision model to detect useful user feedback.

  • End-to-end pipeline integrated with team workflows.
  • Telemetry + behavior features; statistical analyses.
Vision Retrieval

Multimodal Small-Object Retrieval

Data & Systems · 2023

Upgraded retrieval stack for small objects.

  • Object-level similarity search; dev-friendly endpoints.
PGiF

Policy Gradients Incorporating the Future

Exploration & RL · ICLR 2022

Improves policy-gradient methods by looking ahead to future state values.