
Thematic School - MDD 2026
Data Management for Language Models

  • Where

    Institute for Scientific Studies of Cargèse - University of Corsica

  • When

    April 26 to May 2, 2026

  • Thematic axes
    • High-dimensional vector indexing and similarity search,
    • Data Reduction,
    • Data Provenance,
    • Data Modalities and Explainable AI,
    • AI Agents in Human-AI Collaboration,
    • Conversational NL Interfaces for Data Analysis,
    • AI Regulation,
    • Green AI
    In addition, we will organize a workshop on scientific writing, science, publications, and open data, covering the sharing and archiving of such data, dissemination, and the popularization of scientific results. We also plan to organize a dynamic poster session and a gong show during coffee breaks.
  • The Venue

    The 2026 edition will take place in Cargèse from April 26 to May 2. The host venue is the Institute for Scientific Studies, located a 20-minute walk from the village of Cargèse. The center is fully equipped to provide excellent conditions for productive working days in a beautiful environment. The site below presents the facilities, equipment, and surroundings.

    IESC

Welcome to the BDA Thematic School website

The thematic school "Distributed Data Masses" (MDD) originates from the database research community. Its goal is to complement the annual national conference “Data Management: Principles, Technologies, and Applications” (BDA), which has been held for 40 years and whose consistent quality is recognized both nationally and internationally. The BDA steering committee wished to strengthen the educational component (tutorials) within an independent event, which led to the creation of the MDD school. The first session was held in Les Houches in 2010, followed by sessions in Aussois in 2012, Oléron in 2014, Urrugne in 2016, and Aussois in 2018. The session planned for 2020 was postponed, and later sessions were held in Bastia in 2022 and Ceillac-en-Queyras in 2024.

Our thematic school aims to shed light on the societal challenges surrounding data and language models by inviting experts from multiple fields in computer science and law. The general theme, “Data Management for Language Models”, will be developed through the eight major topics listed under Thematic axes.

The Audience
We have two objectives:
  • To train master’s students, PhD students, postdoctoral researchers, and research engineers, who need to learn about the new scientific challenges and pitfalls of language models.
  • To reach an audience outside computer science, mainly AI regulation stakeholders.
We will cover research areas at the intersection of data and language models, from data production, exploitation, and reuse to limiting the impact of training and repeated model use.

The school is also open to faculty members and senior researchers who want to broaden their research topics, build their network, or prepare a career transition. It is likewise open to research engineers from the private sector (R&D, startups, etc.).

The MDD 2026 school will be organized jointly with ARMADA, a European ITN (Innovative Training Network) aimed at training researchers.

Keywords
Data for Language Models, Provenance in AI, Explainable AI, Techniques for Data Reduction and Large-Scale Data Management, Human-AI Collaboration and NL Interfaces, Agentic AI, AI Regulation, Green AI.

BDA MDD 2026