Optimizing Hugging Face Transformer models for sub-millisecond inference latency, plus deployment on a production-ready inference server
Hi,
I just released a project showing how to optimize large NLP models and deploy them on the Nvidia Triton inference server.
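
To give a feel for the general approach, here is a minimal sketch of a typical first step in this kind of pipeline: exporting a Hugging Face model to ONNX and timing it with ONNX Runtime. This is my own illustration, not the project's actual code; the model name (`bert-base-uncased`), output path, and benchmark loop are assumptions.

```python
# Illustrative sketch only: export a Hugging Face encoder to ONNX,
# then benchmark mean latency with ONNX Runtime.
import time

import onnxruntime as ort
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # assumption: any encoder model works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
# torchscript=True makes the model return plain tuples, which traces cleanly.
model = AutoModel.from_pretrained(model_name, torchscript=True).eval()

# Trace the model with a dummy input and export it to ONNX.
dummy = tokenizer("hello world", return_tensors="pt")
torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"]),
    "model.onnx",  # assumed output path
    input_names=["input_ids", "attention_mask"],
    output_names=["last_hidden_state"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "seq"},
        "attention_mask": {0: "batch", 1: "seq"},
    },
    opset_version=13,
)

# Run the exported graph with ONNX Runtime and measure mean latency.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
feed = {
    "input_ids": dummy["input_ids"].numpy(),
    "attention_mask": dummy["attention_mask"].numpy(),
}
start = time.perf_counter()
for _ in range(100):
    session.run(None, feed)
print(f"mean latency: {(time.perf_counter() - start) / 100 * 1000:.2f} ms")
```

The exported `model.onnx` is also the kind of artifact you would then register in a Triton model repository for serving; swapping the execution provider (e.g. to a GPU or TensorRT provider) is where most of the latency gains usually come from.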