We are happy to announce support for the OpenAI Whisper model (ASR task) in Kernl.
We focused on high-quality transcription in a latency-sensitive scenario, meaning:
- whisper-large-v2 weights
- beam search with 5 beams (as recommended in the related paper)
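For readers who want to reproduce this setup on the Hugging Face side, here is a minimal sketch of the configuration above using the `transformers` Whisper classes. It is an illustration only, not the benchmark script: the helper names `load_baseline` and `transcribe`, the 16 kHz sampling rate, and the use of `torch.autocast` for FP16 mixed precision are assumptions made for this example.

```python
MODEL_NAME = "openai/whisper-large-v2"  # checkpoint from the list above
NUM_BEAMS = 5                           # beam width from the list above


def load_baseline():
    """Load the processor and model onto the GPU (hypothetical helper)."""
    # Imported lazily so the constants can be inspected without torch installed.
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(MODEL_NAME)
    model = WhisperForConditionalGeneration.from_pretrained(MODEL_NAME)
    return processor, model.to("cuda").eval()


def transcribe(processor, model, audio, sampling_rate=16_000):
    """Transcribe one mono waveform (1-D float array); assumes a CUDA device."""
    import torch

    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    features = inputs.input_features.to("cuda")

    # Assumed mixed-precision setup: FP32 weights, eligible ops run in FP16.
    with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=torch.float16):
        token_ids = model.generate(features, num_beams=NUM_BEAMS)

    return processor.batch_decode(token_ids, skip_special_tokens=True)[0]
```

Loading is kept separate from transcription so that model start-up cost does not count toward per-example latency.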
We measured a 2.3x speedup on an Nvidia A100 GPU (2.4x on an RTX 3090) compared to the Hugging Face implementation using FP16 mixed precision, when transcribing the LibriSpeech test set (over 2,600 examples). For now, the OpenAI implementation is not PyTorch 2.0 compliant.
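As a rough illustration of how such a speedup figure can be obtained, the sketch below times two transcription functions over the same examples and divides the totals. The exact measurement protocol is not described here, so the one-example-at-a-time loop, the helper names `total_latency` and `speedup`, and the optional `sync` hook are all assumptions.

```python
import time


def total_latency(transcribe_fn, examples, sync=None):
    """Total wall-clock seconds spent transcribing examples one at a time.

    `sync` is an optional zero-argument callable (for example
    torch.cuda.synchronize) called before and after each example so that
    asynchronous GPU work is included in the measured interval.
    """
    total = 0.0
    for example in examples:
        if sync is not None:
            sync()
        start = time.perf_counter()
        transcribe_fn(example)
        if sync is not None:
            sync()
        total += time.perf_counter() - start
    return total


def speedup(baseline_seconds, optimized_seconds):
    """Ratio of baseline time to optimized time (2.0 means twice as fast)."""
    return baseline_seconds / optimized_seconds
```

In practice, a few warm-up calls before timing help exclude one-off costs such as compilation or cache population from the comparison.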