Cedille, the most powerful French language model!

They talk about us

Various fields of application

Generated by Cedille

Do you have a specific project?

Contact us for a customized adaptation of our model to your needs.


Cedille: the most powerful language models available

Generate high-quality content within seconds

Speed up and optimise your writing process

Rewrite content to match your brand tone

Access the best French-speaking model on the market for your NLP & NLG projects

Our mission

Access the best models on the market for your NLP projects

A model of unprecedented quality

Prediction score

(Lower is better)

We measure the model's "perplexity", i.e. how well it predicts the next word in a given document. We use the Wikitext-FR corpus, which consists of thousands of high-quality French Wikipedia articles.

Cedille: 3.93
GPT-3 (Davinci): 3.99
GPT-J: 5.79
GPT-FR: 12.92
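To make the prediction score concrete, here is a minimal sketch of how perplexity can be computed from the log-probabilities a model assigns to each item of a text. The function name `perplexity` and its input format are assumptions made for illustration; the page does not say whether the score is normalised per word, per token, or per byte, so this is not necessarily the exact procedure behind the figures above.

```python
import math


def perplexity(log_probs):
    """Return exp of the negative mean log-probability.

    `log_probs` is a hypothetical input: the natural-log probability the
    model assigned to each observed word (or token) of the evaluation text.
    """
    if not log_probs:
        raise ValueError("log_probs must not be empty")
    mean_negative_log_prob = -sum(log_probs) / len(log_probs)
    return math.exp(mean_negative_log_prob)


# A model that always spreads its probability evenly over four candidates
# assigns probability 0.25 to every observed word.
uniform_guess = [math.log(0.25)] * 3
score = perplexity(uniform_guess)
```

In this toy case the score equals the number of equally likely candidates, which is why a lower value means the model is less "surprised" by the text.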

Toxicity score

(Lower is better)

The generation of so-called "toxic" content is a common problem with language models. For Cedille, we take care to train the model on high-quality data, which greatly improves the generated text. We use a version of the RealToxicityPrompts benchmark adapted for French and report the expected maximum unprompted toxicity.

Cedille: 96%
GPT-3 (Davinci): 99%
GPT-J: 99%
GPT-FR: 96%
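As a rough illustration of an "expected maximum toxicity" figure, the sketch below takes several groups of generated texts, keeps the highest toxicity score in each group, and averages those maxima. The function name, the grouping, and the example scores are all assumptions for illustration; the page does not describe the toxicity classifier, the number of generations, or the sampling scheme used for the scores above.

```python
def expected_max_toxicity(score_groups):
    """Average, over groups, of the highest toxicity score in each group.

    `score_groups` is a hypothetical input: one list per batch of
    generations, each holding toxicity scores between 0.0 and 1.0.
    """
    if not score_groups or any(len(group) == 0 for group in score_groups):
        raise ValueError("every group needs at least one score")
    worst_per_group = [max(group) for group in score_groups]
    return sum(worst_per_group) / len(worst_per_group)


# Two illustrative batches of generations (made-up scores, not real results).
example_groups = [[0.1, 0.9], [0.2, 0.4]]
result = expected_max_toxicity(example_groups)
```

Because only the worst generation in each group counts, this kind of metric grows with the number of generations per group, which is one reason such scores can sit close to 100%.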

Translation score

(Higher is better)

With the WMT14 dataset, we measured Cedille's performance on translation tasks from English to French.

Cedille: 24.91%
GPT-3 (Davinci): 20.40%
GPT-J: 14.84%
GPT-FR: 1.47%

Summary score

(Higher is better)

We use the OrangeSum benchmark to measure the model's performance on text summarisation. OrangeSum is modelled on the XSum dataset and was built from articles on the "Orange Actu" website.

Cedille: 13.73%
GPT-3 (Davinci): 15.49%
GPT-J: 12.96%
GPT-FR: 10.20%

Read our latest research paper on the release of the model


Prediction score

(Lower is better)

We measure the model's "perplexity", i.e. how well it predicts the next word in a given document. We use a corpus of thousands of high-quality German Wikipedia articles.

Cedille: 3.84
GPT-3 (Davinci): 3.76
GerPT2: 18.68

Toxicity score

(Lower is better)

The generation of so-called "toxic" content is a common problem with language models. For Cedille, we take care to train the model on high-quality data, which greatly improves the generated text. We use a version of the RealToxicityPrompts benchmark adapted for German and report the expected maximum unprompted toxicity.

Cedille: 91%
GPT-3 (Davinci): 99.5%
GerPT2: 92.4%

Translation score

(Higher is better)

With the WMT16 dataset, we measured Cedille's performance on translation tasks from English to German.

Cedille: 20.35
GPT-3 (Davinci): 18.52
GerPT2: 2.25

Summary score

(Higher is better)

We use the MLSUM benchmark to measure the model's performance on text summarisation.

Cedille: 15.89
GPT-3 (Davinci): 23.33
GerPT2: 7.52

Open Source

Cedille's French model is open source and available on GitHub and Hugging Face.

@CedilleAI

Carlos

@MCarlosUCC

@coteri_es based at @EPFL_Park launches a specialized text generation model in French #AI. According to @coteri_es its #Cedille technology rivals the GPT-3 multilingual model developed by @OpenAI and is the best model...


Le Kapharnaüm

@Le_Kapharnaum

Do you know @CedilleAI? I've just tested it with a sentence that should speak to those who have followed our pack games: Raccoon alert in rut around Lake Aiguebelette. I think the result is absolutely wonderful...


Alexan Vorritold

@avorritold

I just discovered @CedilleAI. I gave it the catchphrase from my company. So it's clearly Cedille who will be writing my business proposals from now on.


Agar 🌱

@akaAgar

The @CedilleAI generator works so well with journalistic formats that I want to use it to create a gazette of imaginary news. Here are some excerpts (the prompt is in bold each time).
