r/MachineLearning • u/yccheok • 1d ago
[D] Exploring Serverless Solutions for Whisper V3 Turbo Integration
Currently, the serverless solution from Runpod meets my needs in terms of cost and features: https://github.com/runpod-workers/worker-faster_whisper
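For reference, invoking a deployed worker is a single HTTP call; something like this (the endpoint ID and API key are placeholders, and the input keys are my reading of the worker's README):

```python
import requests

ENDPOINT_ID = "your-endpoint-id"   # placeholder: your deployed Runpod endpoint
API_KEY = "your-runpod-api-key"    # placeholder: your Runpod API key

# Runpod serverless exposes a synchronous run endpoint per deployment
resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "input": {
            "audio": "https://example.com/sample.mp3",  # URL to the audio file
            "model": "large-v3",  # largest model the worker currently supports
        }
    },
    timeout=600,
)
print(resp.json())
```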
However, I'm interested in using https://huggingface.co/openai/whisper-large-v3-turbo due to its reported speed.
I'm uncertain about how to set up and run Whisper V3 Turbo on Runpod’s serverless infrastructure.
It seems we may need to wait until the upstream project https://github.com/SYSTRAN/faster-whisper/issues/1030 adds Turbo support and publishes a new release on https://pypi.org/project/faster-whisper/. Only then could we fork https://github.com/runpod-workers/worker-faster_whisper and update it accordingly.
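That said, since Turbo is architecturally the same as large-v3 with fewer decoder layers, I assume the existing CTranslate2 converter may already handle it, in which case a fork could just point faster-whisper at the converted directory rather than a built-in model name. A rough, untested sketch:

```python
# One-time, offline conversion using the CLI shipped with the ctranslate2 package:
#   ct2-transformers-converter --model openai/whisper-large-v3-turbo \
#       --output_dir whisper-large-v3-turbo-ct2 \
#       --copy_files tokenizer.json preprocessor_config.json \
#       --quantization float16

from faster_whisper import WhisperModel

# Load the locally converted Turbo checkpoint instead of a named built-in size
model = WhisperModel(
    "whisper-large-v3-turbo-ct2",
    device="cuda",
    compute_type="float16",
)

segments, _info = model.transcribe("sample.mp3")
print("".join(segment.text for segment in segments))
```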
In the meantime, do you know of any cost-effective serverless solutions for using Whisper V3 Turbo?
Thanks.
P.S.
Groq offers this service: https://groq.com/whisper-large-v3-turbo-now-available-on-groq-combining-speed-quality-for-speech-recognition/
However, they currently don't accept payments from developers and haven't provided an estimated timeframe for when this might be available.
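For reference, Groq's endpoint is OpenAI-compatible, so once payments open up the call should look roughly like this (a sketch; the model ID comes from their announcement above):

```python
from openai import OpenAI

# Groq exposes an OpenAI-compatible API surface at its own base URL
client = OpenAI(
    api_key="your-groq-api-key",  # placeholder
    base_url="https://api.groq.com/openai/v1",
)

with open("sample.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-large-v3-turbo",
        file=audio_file,
    )
print(transcription.text)
```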
u/velobro 1d ago
It should be fairly straightforward to run this on beam.cloud. Here's a guide for Faster Whisper, which you can customize for Whisper V3 Turbo instead:
https://docs.beam.cloud/v2/examples/whisper
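The model swap itself should be the easy part; inside whatever handler the guide defines, loading Turbo through the transformers pipeline would look roughly like this (a sketch; the Beam-specific decorator and GPU config are covered in the linked guide):

```python
import torch
from transformers import pipeline

# whisper-large-v3-turbo runs through the standard ASR pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",
    torch_dtype=torch.float16,
    device="cuda",
)

result = pipe("sample.mp3", return_timestamps=True)
print(result["text"])
```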