Pierre Colombo (CentraleSupélec – Equall.ai) gave a Seminar@SystemX talk on “Advancing Domain-Specific Large Language Models” on September 19, 2024.
Abstract
Large Language Models (LLMs) have revolutionized natural language processing, demonstrating remarkable versatility across various tasks. However, task and domain specialization is crucial to maximize their potential and achieve higher accuracy and relevance in specific fields while remaining efficient. This talk delves into innovative domain adaptation for LLMs through three models. We begin with SaulLM, the first LLM tailored for the legal field, excelling in legal research, document drafting, and contract analysis. Next, we introduce CroissantLLM, a bilingual model proficient in French and English, enabling seamless translation and bilingual text generation with cultural and linguistic precision. Lastly, we present TowerLLM, designed for translation tasks, offering high-fidelity translations while maintaining contextual integrity. In this talk, we will showcase the methodologies and breakthroughs that enable large-scale specialization of LLMs in their respective domains/applications.
Biography
Pierre Colombo is an Associate Professor at CentraleSupélec and the Chief Science Officer at Equall.ai, a LegalTech startup. Specializing in AI and Natural Language Processing (NLP), Pierre leads the development of AI-driven legal products and workflows to enhance the efficiency of legal professionals. At MICS CentraleSupélec – Université Paris-Saclay, Pierre’s research focuses on making AI practical for NLP industrial systems, with publications in top-tier conferences and journals such as ACL, NAACL, EMNLP, and NeurIPS. His work encompasses training large language models, ensuring their safety and robustness, and developing metrics for accurate performance measurement. Pierre’s notable contributions have been recognized with the Best Student Paper Award at AAAI 2022 and other leading conferences.