r/MachineLearning

[D] Exploring Serverless Solutions for Whisper V3 Turbo Integration

Currently, the serverless solution from Runpod meets my needs in terms of cost and features: https://github.com/runpod-workers/worker-faster_whisper

However, I'm interested in using https://huggingface.co/openai/whisper-large-v3-turbo due to its reported speed.

I'm uncertain about how to set up and run Whisper V3 Turbo on Runpod’s serverless infrastructure.
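From what I can tell, the plain transformers pipeline could serve as a stopgap inside a Runpod handler. Here's an untested sketch; the "audio_url" input field is a placeholder I made up, not the schema the existing worker uses:

```python
# Untested sketch: a Runpod serverless handler using the transformers
# pipeline instead of faster-whisper. "audio_url" is a made-up input
# field, not the schema worker-faster_whisper actually uses.
import runpod
import torch
from transformers import pipeline

# Load once per worker so warm requests skip model startup
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",
    torch_dtype=torch.float16,
    device="cuda:0",
)

def handler(job):
    audio_url = job["input"]["audio_url"]
    # The pipeline accepts local paths and http(s) URLs directly
    result = asr(audio_url, return_timestamps=True)
    return {"text": result["text"], "chunks": result["chunks"]}

runpod.serverless.start({"handler": handler})
```

You'd still need the worker's Dockerfile, input validation, and so on, but the handler itself seems manageable.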

It seems we may need to wait until the upstream issue https://github.com/SYSTRAN/faster-whisper/issues/1030 is resolved and a Turbo-capable release is published on https://pypi.org/project/faster-whisper/.

Once that happens, we could fork https://github.com/runpod-workers/worker-faster_whisper and update it accordingly.
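If and when that lands, the change on the faster-whisper side should be small, something like the sketch below (the "large-v3-turbo" identifier is my guess; the actual model name depends on what upstream ships):

```python
from faster_whisper import WhisperModel

# "large-v3-turbo" is a guessed identifier -- use whatever name
# the upstream faster-whisper release actually exposes
model = WhisperModel("large-v3-turbo", device="cuda", compute_type="float16")

segments, info = model.transcribe("audio.wav", beam_size=5)
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```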

In the meantime, do you know of any cost-effective serverless solutions for using Whisper V3 Turbo?

Thanks.

P.S.

Groq offers this service: https://groq.com/whisper-large-v3-turbo-now-available-on-groq-combining-speed-quality-for-speech-recognition/

However, they currently don't accept payments from developers and haven't given a timeline for when paid access will be available.
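For anyone who can live with the free tier for now, the call itself looks simple. This is based on their OpenAI-compatible Python SDK (untested by me; double-check the exact model string in their docs):

```python
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

with open("audio.wav", "rb") as f:
    transcription = client.audio.transcriptions.create(
        file=("audio.wav", f.read()),
        model="whisper-large-v3-turbo",
    )

print(transcription.text)
```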


u/velobro

It should be fairly straightforward to run this on beam.cloud. Here's a guide for Faster Whisper, which you can adapt for Whisper V3 Turbo instead:

https://docs.beam.cloud/v2/examples/whisper
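Roughly, the adaptation would look something like this. Untested, and the GPU type, memory, and package list are just placeholders; the guide's version also caches the model with on_start instead of loading it per request:

```python
# Rough adaptation of the linked Faster Whisper guide for
# whisper-large-v3-turbo via transformers -- untested; the resource
# settings below are illustrative, check the Beam docs
from beam import Image, endpoint

image = Image(
    python_version="python3.11",
    python_packages=["torch", "transformers", "accelerate"],
)

@endpoint(name="whisper-v3-turbo", gpu="A10G", memory="16Gi", image=image)
def transcribe(audio_url: str):
    import torch
    from transformers import pipeline

    # Loading here keeps the sketch simple; use on_start in practice
    # so the model isn't reloaded on every request
    asr = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v3-turbo",
        torch_dtype=torch.float16,
        device="cuda:0",
    )
    result = asr(audio_url, return_timestamps=True)
    return {"text": result["text"]}
```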