Recent & Upcoming Talks

Modern NLP Methods for Document Enrichment

We present a pipeline for processing and enriching documents, based on the latest neural learning methods.

A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages

We explore the impact of the training corpus on contextualized word embeddings in five mid-resource languages.

Asynchronous Pipeline for Processing Huge Corpora on Medium to Low Resource Infrastructures

We propose a new pipeline to filter, clean, and classify Common Crawl by language, and we publish the resulting corpus under the name OSCAR.

Preparing the Dictionnaire Universel for Automatic Enrichment

A talk about automatic enrichment of dictionaries.

Reducing computation time by months by rewriting Bash scripts in Go