Pursuits

My research spans

  • RouteLLM: I was LMSYS research lead and co-first author on a paper on LLM routing using preference data. We showed that our routers can reduce cost by up to 85% on academic benchmarks without compromising quality, beating several startups in the process. I also created an open-source framework to productionize this work and actively maintain it, with 2k+ stars on GitHub.
  • Efficient distributed inference: I wrote my Master’s research thesis on speeding up distributed transformer inference using a technique called “dynamic partitioning”, switching between different tensor parallel strategies at inference time depending on GPU and model characteristics. As part of this, I also explored adjacent ideas for speeding up inference related to prompt disaggregation and KV cache offloading.
  • Chatbot Arena: I added support for several models on Chatbot Arena and conducted analysis on understanding model performance through the lens of human preference.
  • Tensor Trust: I was co-author on an AI safety paper analyzing the adversarial robustness of LLMs using an online game where users could attack and defend against prompt injection attacks. In particular, I spent quite a bit of time trying to transfer the insights gathered from the game to real-world applications, jailbreaking Notion AI, Claude, Bing Chat, and ChatGPT :)
  • SkyPilot: I was part of the core dev team of ~10 on SkyPilot, a framework for seamlessly executing ML workloads across clouds (towards achieving the vision of the sky). I contributed towards several efforts to improve the robustness of SkyPilot’s multi-node provisioning and configuration setup.
  • Exoshuffle: I was co-author on a distributed systems paper introducing a new architecture for generalized, large-scale shuffle algorithms built on top of Ray and distributed futures.
  • Exoshuffle-Cloudsort: We demonstrated how Exoshuffle could match the performance of monolithic shuffle systems, creating the world's most cost-efficient sort system and breaking the previous record on the Cloudsort benchmark at $0.97/TB.
  • I’m affiliated with LMSYS, Sky Lab, RISE Lab, CHAI and BAIR.

Previously, I

  • interned at
    • Reka: I optimized multi-modal inference, investigated video understanding, and worked on long-context modeling for Reka Core, a SOTA multimodal LLM rivaling GPT-4 performance; was the 1st external intern hired at the company.
    • Citadel Securities: I led the design and development of a new architecture for sending securities over a low-latency, distributed message bus and deployed this into production systems before I left; lots of C++, networking, and system design.
    • Monad Labs: I tackled 2 main projects in distributed, low-latency systems to parallelize the EVM; 1) I implemented lazy optimizations for gossip protocols (libp2p) as part of the consensus mechanism, reducing bandwidth requirements by over 50%; 2) I created the first prototype of Monad's mempool in Rust, achieving latency improvements of up to 6x using the Tokio runtime; one of the first few interns hired in a lean team of ~15.
    • Google: I saved tens of thousands of engineering hours and improved the efficiency of global Google Cloud networking deployments by creating a new, distributed service to identify and cluster flaky workflows; worked with C++, gRPC, and clustering algorithms.
    • Motional: I built mapping software infrastructure for self-driving vehicles deployed on Uber and Lyft; specifically, I created a new service to visualize mapping algorithms so as to allow engineers to better debug these algorithms; I also built a backend service to index and search lidar, radar, and camera data collected from vehicles, processing terabytes of data each day.
    • Bot MD: I built an AI assistant used by doctors in the fight against COVID-19 across Southeast Asia; notably, I spearheaded the design and development of a new internal task orchestration platform for the entire company called Bach; I worked across Python / Django and Go, leveraging custom Docker Compose files and AWS images for scalable deployment.
    • GovTech: I was part of the team that built a new web application called OneCV used to streamline the delivery of social services to the underprivileged across Singapore; I was involved in the entire process, from user research and requirements gathering to design and development.
  • directed Cal Hacks, the world's largest collegiate hackathon
  • tackled crypto governance and research at Blockchain at Berkeley
  • researched software for 3D brain visualizations to treat neurodegenerative disorders at the Roland Henry Lab
  • mentored at Google's Summer of Code and Code-in programmes, guiding college and high school students into the world of open-source software
  • contributed to 3D graphics projects at The Terasology Foundation

Before all that, I

  • conducted cs + bio research at the Singapore University of Technology & Design, developing an Android application to analyze an athlete's running pace to determine the optimal music for running performance via auditory-motor synchronization (2016)
  • was the first Singaporean to win Google Code-in, winning a trip to Google’s Mountain View HQ in high school (2016)
  • worked on nlp research at the Defense Science & Technology Agency, building a web application that uses natural language understanding to intelligently categorize and visualize search engines (2014)
  • created CatAn Lab, an app to help students learn qualitative analysis in Chemistry, and won 2nd at a nationwide competition (2013)
  • dipped my toes in competitive programming, e.g. Project Euler (2012)
  • developed apps in middle school for jailbroken iOS devices with 350k downloads and features in various tech articles and videos: InfoPage, FakeBadges, LabelAbove (2010)
    • one of the first things I built was actually text-based widgets for the homescreen in iOS 5!
  • dabbled in 3D modeling (2009)

I also built

  • socialscan: a high-performance async Python CLI for querying usage on online platforms, which I scaled to 1k stars on GitHub
  • mosaic: an AR-focused social experience prototyped at a week-long hacker house
  • pong: the classic Pong game supporting {0, 1, 2}-players
  • babelfish: a VSCode extension that automatically translates docstrings to multiple languages
  • bluebird: a web app for real-time sentiment analysis and visualization of Twitter

And some fun facts

  • I love musicals - current favorite: Wicked!
  • I served in the Singapore Army for ~2 years as a military engineer prior to college
  • I somehow snagged i.o@berkeley.edu
  • I am the proud recipient of 2 gold medals on S/O lol - looking at these questions is never not embarrassing but also a reminder of how much I've grown :')