Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

Date: December 10, 2025, 12:30-13:30

Short bio:

Antoine Bosselut is an assistant professor at the École Polytechnique Fédérale de Lausanne (EPFL). Previously, he was a postdoctoral fellow at Stanford University and a Young Investigator at the Allen Institute for AI (AI2). He received his PhD from the University of Washington in 2020. His research focuses on developing AI reasoning methods that can be applied to important societal problems in health, education, and support for underserved communities. He was named to the Forbes 30 Under 30 list for Science and Healthcare in 2021, and is an ELLIS Scholar and an AI2050 Early Career Fellow. He also serves on the steering committee of the Swiss AI Initiative and is one of the co-leads of the Apertus project to responsibly develop LLMs for societal good.

Abstract:

As nations increasingly recognize the strategic importance of artificial intelligence, many efforts have emerged to develop Sovereign AI that is aligned with the priorities of different national environments. In this talk, I'll present Apertus, a fully open, compliant, and neural suite of large language models (LLMs) that can serve as a novel backbone for Sovereign AI efforts. I'll specifically highlight the two dimensions that set Apertus apart from previous offerings: data compliance and large-scale multilinguality. First, I'll discuss how Apertus goes beyond open-weight models that forgo reproducible data pipelines or regard for content-owner rights, and is instead pretrained exclusively on openly available data, retroactively respecting robots.txt exclusions and filtering out non-permissive, toxic, and personally identifiable content. Then, I'll highlight Apertus' multilingual coverage, with training on 15T tokens from over 1000 languages, and discuss the data curation, training, and evaluation challenges of this considerable multilingual expansion.