Friday 14 November - 11am

Speaker: Minghao Wu (Monash University)

Title: Squeezing Your Fine-Tuning Data to the Last Drop: From Selection to Rebalancing

Abstract: The quality and composition of training data are paramount for the effective supervised fine-tuning (SFT) of large language models (LLMs). This talk presents two independent studies that tackle the challenge of data optimization from different yet complementary angles. The first study introduces GraphFilter, a novel data selection method that formulates data selection as a set cover problem. By modeling the dataset as a bipartite graph and employing a priority function that balances quality and diversity, GraphFilter iteratively selects the most informative examples for training. The second study presents Mixture-of-Skills (MoS), a reinforcement learning framework designed to optimize data usage during fine-tuning. MoS dynamically adjusts the focus on different datasets to ensure balanced skill development in LLMs. Together, these two studies offer a comprehensive look at the data optimization landscape, providing valuable insights into both static data selection and dynamic data utilization for building more capable LLMs.
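
For readers unfamiliar with these ideas, the sketch below is a toy illustration, not the authors' actual GraphFilter or Mixture-of-Skills implementations: all function names, the n-gram coverage view, the priority weighting, and the bandit-style sampler are invented stand-ins that only convey the general flavour of (a) greedy, set-cover-style selection trading off example quality against coverage of not-yet-covered n-grams, and (b) reweighting datasets during fine-tuning based on an observed reward signal.

```python
# Illustrative sketch only. Names and details are invented for this announcement
# and do not reflect the actual GraphFilter or Mixture-of-Skills algorithms.
import math
import random


def ngrams(text, n=2):
    """Word bigrams of an example; a crude 'example covers n-grams' bipartite view."""
    toks = text.split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}


def greedy_select(examples, quality, k, alpha=0.5):
    """Greedy set-cover-style selection: repeatedly pick the example whose priority
    (a blend of its quality score and the number of still-uncovered n-grams it adds)
    is highest, until k examples are chosen."""
    covered, selected = set(), []
    remaining = list(range(len(examples)))
    while remaining and len(selected) < k:
        def priority(i):
            gain = len(ngrams(examples[i]) - covered)       # diversity: new n-grams covered
            return alpha * quality[i] + (1 - alpha) * gain  # blend quality and diversity
        best = max(remaining, key=priority)
        selected.append(best)
        covered |= ngrams(examples[best])
        remaining.remove(best)
    return selected


class BanditMixer:
    """Toy dataset reweighting: treat each dataset as an arm and shift sampling
    probability toward datasets whose recent reward (e.g. loss improvement) is high."""

    def __init__(self, n_datasets, lr=0.1):
        self.logits = [0.0] * n_datasets
        self.lr = lr

    def probs(self):
        z = [math.exp(l) for l in self.logits]
        s = sum(z)
        return [x / s for x in z]

    def sample(self):
        return random.choices(range(len(self.logits)), weights=self.probs())[0]

    def update(self, dataset_id, reward):
        # Simple policy-gradient-style update toward datasets that yield reward.
        p = self.probs()
        for i in range(len(self.logits)):
            grad = (1.0 if i == dataset_id else 0.0) - p[i]
            self.logits[i] += self.lr * reward * grad
```

As a usage sketch: one would call greedy_select once before training to pick a subset of the SFT pool, and then, during fine-tuning, call mixer.sample() to decide which dataset the next batch comes from and mixer.update() with a reward such as validation-loss improvement.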

Biography: Minghao Wu is a final-year Ph.D. candidate at Monash University and is currently visiting The University of Edinburgh. His research focuses on large language models, multilinguality, and machine translation. He has published over 20 papers in top-tier conferences and journals, including ICML, ACL, EMNLP, COLING, and TACL. His work has been recognized with the Outstanding Paper Award at ACL 2025. He has also held research visits and internships at Huawei, Tencent, Alibaba, and MBZUAI. You can read more about him and his research at https://minghao-wu.github.io/