Base de Français Médiéval – Old French Corpus

Host: Ecole Normale Supérieure de Lyon, France

URL: http://txm.bfm-corpus.org

Description: The Base de Français Médiéval (BFM) database, founded in 1989, currently comprises 170 Old and Middle French texts. Thanks to its size (approximately 4,700,000 words) and the diversity of the texts included, this database is unique in France for this period of the history of French. It has been used by a research community of around three hundred scholars, teachers, and students worldwide.

The texts included in the BFM cover a considerable geographic area and a broad chronological span, from the 9th century (including the first known French text, the Serments de Strasbourg) to the end of the 15th century. Both verse and prose texts are represented, as well as different genres and domains (e.g., fiction, history, hagiography, law, and the sciences).

Since May 2012, the BFM has been accessible via a web portal powered by the TXM corpus search and analysis platform. All texts can be searched, visualized, and downloaded in PDF format. TEI P5 XML files are provided on request by the corpus administrator (see contact information below). All BFM texts are tokenized, morphologically tagged, and lemmatized with the help of TreeTagger (using the BFM's own parameter file). Direct speech is marked up with tei:q tags. As of June 2019, the morphological annotation of 39 texts has been verified and corrected by experts.
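
For readers who obtain the TEI P5 XML files, the following is a minimal sketch of how direct speech marked with tei:q could be extracted using Python's standard library. The sample fragment is invented for illustration and is not taken from the BFM; the function name direct_speech is hypothetical, and the actual BFM files may nest tokens and annotations differently. Only the TEI namespace URI is standard.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace; the "tei" prefix is just a local alias.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented fragment for illustration only (not an actual BFM text).
# The <w> element stands in for whatever token markup a real file uses.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>Li reis respunt: <q>Seignurs, bien le savez.</q> Puis se teut.</p>
    <p><q>Sire, <w>merci</w>!</q></p>
  </body></text>
</TEI>"""


def direct_speech(xml_string):
    """Return the text of every tei:q element, with whitespace normalized."""
    root = ET.fromstring(xml_string)
    passages = []
    for q in root.iterfind(".//tei:q", TEI_NS):
        # itertext() gathers text from the element and all its descendants,
        # so token-level child elements do not break the passage apart.
        text = "".join(q.itertext())
        passages.append(" ".join(text.split()))
    return passages


if __name__ == "__main__":
    for passage in direct_speech(SAMPLE):
        print(passage)
```

For a file on disk, ET.parse(path).getroot() could replace ET.fromstring; the rest of the function would stay the same.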

Contact: Alexei Lavrentiev, ENS de Lyon / IHRIM.

Email: bfm@ens-lyon.fr