$$ \Huge \textbf{Qizhen Zhang (Irene)} $$


I am a machine learning PhD student at the University of Oxford, where I work on large language models. I’m also interning as a research scientist with the Llama team at Meta GenAI.

Previously, I was a member of technical staff / researcher at Cohere doing pretraining research and building LLM pretraining frameworks for models on the scale of O(100B) parameters. I wrote my Master's thesis on cooperative multi-agent reinforcement learning at the University of Toronto and the Vector Institute.


<aside> <img src="/icons/graduate_blue.svg" alt="/icons/graduate_blue.svg" width="40px" /> Google Scholar

</aside>

<aside> <img src="/icons/mail_blue.svg" alt="/icons/mail_blue.svg" width="40px" /> Email

</aside>

<aside> <img src="/icons/close_blue.svg" alt="/icons/close_blue.svg" width="40px" /> Twitter

</aside>

<aside> <img src="/icons/sharing_blue.svg" alt="/icons/sharing_blue.svg" width="40px" /> Linkedin

</aside>

Research

Representative papers are highlighted; * indicates equal contribution.


BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts. paper, talk, tweet. Qizhen Zhang, Nikolas Gritsch, Dwaraknath Gnaneshwar, Simon Guo, David Cairuz, Bharat Venkitesh, Jakob Foerster, Phil Blunsom, Sebastian Ruder, Ahmet Üstün*, Acyr Locatelli*. **NeurIPS 2024**; also at the ES-FoMo and NGSM workshops @ ICML 2024 (Spotlight Talk). Attention experts are important for upcycling MoEs; we use a soft variant of MoA with a parallel-attention architecture.


Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts. paper, tweet. Nikolas Gritsch, Qizhen Zhang, Acyr Locatelli, Sara Hooker, Ahmet Üstün. **Under submission**; also at the AFM workshop @ NeurIPS 2024. A specialized and efficient MoE framework that adapts easily to new data distributions.


PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition. paper, tweet. Ziyang Zhang, Qizhen Zhang, Jakob Foerster. **ICML 2024**. An LLM defence method that asks the LLM itself to repeat its own output.


Analysing the Sample Complexity of Opponent Shaping. [paper](https://arxiv.org/abs/2402.05782). Kitty Fung, Qizhen Zhang, Chris Lu, Jia Wan, Timon Willi, Jakob Foerster. **AAMAS 2024 (Oral)**. An opponent shaping algorithm for general-sum games; we derive sample complexity bounds and show connections with empirical scaling laws.


Centralized Model and Exploration Policy for Multi-Agent RL. paper, talk, tweet. Qizhen Zhang, Chris Lu, Animesh Garg, Jakob Foerster. **AAMAS 2022 (Oral)**. A sample-efficient approach to cooperative multi-agent settings via model-based RL; we derive sample complexity bounds and show empirical gains.


Experience