Extract speech from video and audio files

Hello everyone

This is not a Rust-specific topic, other than that I want to use the result from a Rust program.

Is there any Rust-accessible library (ideally a Rust crate) that can take an input video file (e.g. video.mp4) or an input audio file and extract the speech content from it as text output?
Or even a command-line tool or utility that can do this?

I want to be able to analyze, index and search for words/phrases spoken in video and audio files.
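
To make the goal more concrete, here is a minimal sketch of the indexing and search part, assuming some speech-to-text step has already produced timestamped text segments. The `Segment` struct, `build_index` and `search` are hypothetical names made up for illustration, not part of any existing crate, and the sketch only handles single words, not phrases.

```rust
use std::collections::HashMap;

/// Hypothetical transcript segment: where it starts in the media file
/// (in seconds) and the text recognized for it.
struct Segment {
    start_secs: f64,
    text: String,
}

/// Build an inverted index: lowercase word -> start times of the
/// segments in which that word occurs.
fn build_index(segments: &[Segment]) -> HashMap<String, Vec<f64>> {
    let mut index: HashMap<String, Vec<f64>> = HashMap::new();
    for seg in segments {
        // Split on anything that is not a letter or digit, skipping empty pieces.
        let words = seg
            .text
            .split(|c: char| !c.is_alphanumeric())
            .filter(|w| !w.is_empty());
        for word in words {
            let times = index.entry(word.to_lowercase()).or_default();
            // Record each segment only once per word, even if the word repeats in it.
            if times.last() != Some(&seg.start_secs) {
                times.push(seg.start_secs);
            }
        }
    }
    index
}

/// Look up a single word (case-insensitive) and return the matching start times.
fn search<'a>(index: &'a HashMap<String, Vec<f64>>, word: &str) -> &'a [f64] {
    match index.get(&word.to_lowercase()) {
        Some(times) => times.as_slice(),
        None => &[],
    }
}

fn main() {
    // Made-up sample data standing in for real speech-to-text output.
    let segments = vec![
        Segment { start_secs: 0.0, text: "Hello everyone, welcome".to_string() },
        Segment { start_secs: 4.5, text: "Welcome to the Rust talk".to_string() },
    ];
    let index = build_index(&segments);
    println!("'welcome' is said at: {:?}", search(&index, "welcome"));
}
```

The part I am missing is whatever produces those segments from the video or audio file in the first place.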

I just Googled for answers, and the only cloud option I found was the Microsoft Azure Cognitive Services API. I have not tried it yet (it could be great!), but perhaps there are more options to investigate, such as AWS, Google Cloud (GCP), C/C++ libraries, or Rust crates?

Also this: Picovoice (github.com)

Any more ideas? Or perhaps someone has used something else?

Many thanks

Thank you.
