Yihan Wang

UCLA Computer Science


Hi, I am a final-year Ph.D. candidate in Computer Science at UCLA, working with Prof. Cho-Jui Hsieh. I completed my B.Eng. degree at Tsinghua University in June 2020.

My research interests focus on the robustness and generalization of machine learning models, with a recent emphasis on large language models. I work on projects that both interest me and benefit society. I was supported by an Amazon Fellowship.

My research specifically includes:

Earlier in my Ph.D., I also worked on formal verification of neural networks and other machine learning models.

If you are interested in my research, please feel free to email me to discuss it or potential collaborations.

selected publications

* indicates equal contribution.

  1. arXiv Preprint
    On the loss of context-awareness in general instruction fine-tuning
    Yihan Wang*, Andrew Bai*, Nanyun Peng, and Cho-Jui Hsieh
    2024
  2. ACL Findings 2024
    Defending LLMs against Jailbreaking Attacks via Backtranslation
    Yihan Wang*, Zhouxing Shi*, Andrew Bai, and Cho-Jui Hsieh
    ACL Findings, 2024
  3. ICLR 2024
    Two-stage LLM Fine-tuning with Less Specialization and More Generalization
    Yihan Wang, Si Si, Daliang Li, Michal Lukasik, Felix Yu, Cho-Jui Hsieh, Inderjit S Dhillon, and Sanjiv Kumar
    In The Twelfth International Conference on Learning Representations, 2024
  4. NeurIPS 2023
    Universality and limitations of prompt tuning
    Yihan Wang, Jatin Chauhan, Wei Wang, and Cho-Jui Hsieh
    Advances in Neural Information Processing Systems, 2023
  5. TACL
    Red teaming language model detectors with language models
    Zhouxing Shi*, Yihan Wang*, Fan Yin*, Xiangning Chen, Kai-Wei Chang, and Cho-Jui Hsieh
    Transactions of the Association for Computational Linguistics, 2023
  6. NeurIPS 2021
    Fast certified robust training with short warmup
    Zhouxing Shi*, Yihan Wang*, Huan Zhang, Jinfeng Yi, and Cho-Jui Hsieh
    Advances in Neural Information Processing Systems, 2021
  7. NeurIPS 2020
    Automatic perturbation analysis for scalable certified robustness and beyond
    Kaidi Xu, Zhouxing Shi, Huan Zhang, Yihan Wang, Kai-Wei Chang, Minlie Huang, Bhavya Kailkhura, Xue Lin, and Cho-Jui Hsieh
    Advances in Neural Information Processing Systems, 2020