Tags
The following is a list of relevant tags, with related articles listed under each:
Anonymization
BERT
CUDA
CUDA Graph
Data Science
Deep Learning
Flair
GPT-3
GPU Quantization
Hugging Face
- Divide Hugging Face Transformers training time by 2 or more with dynamic padding and uniform length batching
- 1st ever method to perform GPU quantization on most 🤗 HF transformer models: > 2X faster inference!
- 4.5 times faster Hugging Face transformer inference by modifying some Python AST
- Hugging Face Transformer Inference Under 1 Millisecond Latency
- Optimization of Hugging Face Transformer models to get Inference < 1 Millisecond Latency + deployment on production ready inference server
- Python library to optimize Hugging Face transformer for inference: < 0.5 ms latency / 2850 infer/sec
- What we learned by accelerating by 5X Hugging Face generative language models
- What we learned by making T5-large 2X faster than Pytorch (and any autoregressive transformer)
Justice
Kernel
- Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels
- Deep Dive into Kernel Fusion: Accelerating Inference in Llama V2
- Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl
Llama
Machine Learning
- NER algo benchmark: spaCy, Flair, m-BERT and camemBERT on anonymizing French commercial legal cases
- Divide Hugging Face Transformers training time by 2 or more with dynamic padding and uniform length batching
NLP
Nvidia Triton
- 4.5 times faster Hugging Face transformer inference by modifying some Python AST
- Optimization of Hugging Face Transformer models to get Inference < 1 Millisecond Latency + deployment on production ready inference server
- What we learned by making T5-large 2X faster than Pytorch (and any autoregressive transformer)
ONNX Runtime
- 4.5 times faster Hugging Face transformer inference by modifying some Python AST
- Hugging Face Transformer Inference Under 1 Millisecond Latency
- Python library to optimize Hugging Face transformer for inference: < 0.5 ms latency / 2850 infer/sec
- What we learned by benchmarking TorchDynamo (PyTorch team), ONNX Runtime and TensorRT on transformers model (inference)
- What we learned by making T5-large 2X faster than Pytorch (and any autoregressive transformer)
OpenAI Triton
OpenAI Whisper
Programming
Python
PyTorch
spaCy
T5
- Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels
- What we learned by making T5-large 2X faster than Pytorch (and any autoregressive transformer)
Technology
TensorRT
- 4.5 times faster Hugging Face transformer inference by modifying some Python AST
- Python library to optimize Hugging Face transformer for inference: < 0.5 ms latency / 2850 infer/sec
- What we learned by benchmarking TorchDynamo (PyTorch team), ONNX Runtime and TensorRT on transformers model (inference)
- What we learned by making T5-large 2X faster than Pytorch (and any autoregressive transformer)