
Beyond scale: efficient pre-training and controllable post-training for language models

EVENT DATE
27 Aug 2025
TIME
2:00 pm – 4:00 pm
LOCATION
SUTD Think Tank 13 (Building 1, Level 5, Room 1.508)

Language models are foundational to modern artificial intelligence, but their development is often constrained by challenges in efficiency, controllability, and reasoning. In this thesis, we address these limitations by introducing new approaches at both the pre-training and post-training stages.

First, we address the challenge of building efficient yet powerful foundation models. We explore an alternative to the prevailing scaling paradigm by pre-training smaller models on massive datasets. Our work on TinyLlama, a 1.1B parameter model, demonstrates that this strategy can produce models that are highly competitive with, and often superior to, other models in the same size class.

Next, we focus on enhancing foundation models through a suite of novel post-training techniques. To address parameter inefficiency in multi-task adaptation, we propose PROPETL, a method using structural pruning to reduce storage costs by an order of magnitude. To address the lack of fine-grained capability control, we introduce Unsupervised Non-Transferable Learning (UNTL), which allows a model’s knowledge in specific domains to be dynamically restricted and recovered using secret keys.

Furthermore, we aim to improve the intrinsic reasoning abilities of language models in specialized domains. For mathematical reasoning, we propose the Satori framework, which uses Chain-of-Action-Thought (COAT) and reinforcement learning to enable autoregressive search and self-correction. For code reasoning, we develop two complementary approaches: EFFICODER improves the computational efficiency of the final generated code through efficiency-aware fine-tuning, while Satori-SWE improves the sample efficiency of the problem-solving process itself via an evolutionary search algorithm.

Looking forward, the principles of efficient pre-training and modular post-training specialization offer a path toward more capable and practical models. Future work should focus on integrating these diverse techniques and extending them to more complex, multimodal, and agentic settings.

Speaker’s profile

Zeng Guangtao is a PhD candidate at the Singapore University of Technology and Design (SUTD). He received his BEng from Sun Yat-sen University, China, in 2020. His PhD research focuses on large language models, natural language processing, and reasoning.
