Yatong Bai

Research Scientist
Music Foundation Models, ByteDance

UC Berkeley, Ph.D. 2025

Email: yatong_bai (at) berkeley.edu

Ph.D. Dissertation
Defense Talk Slides

Biography

I am a Research Scientist at ByteDance Inc., working on music foundation models.

I recently graduated with a Ph.D. from UC Berkeley's engineering department, where I worked with Professor Somayeh Sojoudi.

My research aims to make deep learning and generative AI more efficient, aligned, robust, and reliable. My interests span generative models (particularly audio/music), robust deep learning, (convex) optimization, reinforcement learning, and control.

Specifically, my research areas so far include:

  • Diffusion Models (Audio/Music Generation).

    • Accelerating diffusion-based text-to-audio generation with consistency distillation (ConsistencyTTA).
    • Aligning text-to-music generation to human preferences by optimizing distributional rewards (DRAGON).
  • Robust Deep Learning.

    • Analyzing the vulnerabilities of large language models (LLMs) when used in conversational search engines (RAGDOLL).
    • Mixing classifiers to address the accuracy-robustness trade-off in image classification (MixedNUTS, Adaptive Smoothing).
  • Convex Neural Network Optimization.

    • Efficient algorithms for training two-layer ReLU neural networks with global optimality guarantees (link).
    • Convex formulations of adversarial training that produce adversarially robust two-layer networks (link).

Prior to joining Berkeley, I obtained Bachelor's degrees in computer engineering and mechanical engineering from the Georgia Institute of Technology, where I conducted research with Professors Julien Meaud and Thomas Conte.

I have interned at Adobe Research, Microsoft, Scale AI, Honda Aircraft Company, and Tesla.

News