Divide Hugging Face Transformers training time by 2 or more with dynamic padding and uniform length batching
Reducing training time lets you run more iterations within a fixed time budget and thus achieve better results.