Transcribe speech to text with Python or the Web Speech API:
Python
Make sure you have Python installed:
python --version
Python version 3 is recommended.
Install the SpeechRecognition module:
pip install SpeechRecognition
Create a script, speech_to_text.py, that transcribes the audio file Hello World.wav to text:
# speech_to_text.py
import speech_recognition as sr

r = sr.Recognizer()
filename = "Hello World.wav"

# Read the entire audio file into memory
with sr.AudioFile(filename) as source:
    audio = r.record(source)

# Send the audio to the Google Web Speech API and print the transcript
text = r.recognize_google(audio)
print(text)
Run the script:
python speech_to_text.py # hello world
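Before sending a file to a recognizer, a quick sanity check on the input can save a failed request. The following is a minimal sketch that uses only Python's standard wave module; the helper name wav_info is hypothetical and is not part of SpeechRecognition.

```python
import wave


def wav_info(path):
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            # Duration is the frame count divided by frames per second
            "duration_s": frames / rate,
        }
```

Calling wav_info("Hello World.wav") on the file from the script above would report its channel count, sample rate, and length in seconds. wave.open raises wave.Error if the file is not a WAV file it can read.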
Library
SpeechRecognition supports the following engines and APIs:
- recognize_sphinx (works offline)
- recognize_google
- recognize_wit
- recognize_bing
- recognize_api
- recognize_houndify
- recognize_ibm
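Because several engines are available, one option is to try them in order and keep the first transcript that comes back. The sketch below assumes each recognizer is a callable that takes the audio and returns text, such as r.recognize_sphinx or r.recognize_google from the script above. The function name first_transcript is hypothetical, and catching every Exception is a simplification for illustration; with SpeechRecognition you would more likely catch sr.UnknownValueError and sr.RequestError.

```python
def first_transcript(audio, recognizers):
    """Try each recognizer callable in order; return the first transcript.

    `recognizers` is a list of callables that accept the audio and return
    text. Hypothetical helper, not part of SpeechRecognition.
    """
    errors = []
    for recognize in recognizers:
        try:
            return recognize(audio)
        except Exception as exc:  # broad on purpose in this sketch
            name = getattr(recognize, "__name__", repr(recognize))
            errors.append(f"{name}: {exc}")
    # Every recognizer failed, so report what went wrong with each one
    raise RuntimeError("all recognizers failed: " + "; ".join(errors))
```

With the Recognizer instance r and the recorded audio from the script, a call could look like first_transcript(audio, [r.recognize_sphinx, r.recognize_google]), which tries the offline engine first and falls back to the online one.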
| Pros | Cons |
|---|---|
| Free | API limitations (e.g., network timeout, file too big, rate limiting) |
| Fairly accurate | Transcript can be off |
| | No punctuation marks |
See guide and GitHub repository for more details.
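One way to work around the "file too big" limitation listed in the table is to split a long recording into shorter pieces and transcribe each piece separately. This is a minimal sketch using only the standard wave module; the function name split_wav, the 30-second default, and the output naming scheme are assumptions for illustration.

```python
import wave


def split_wav(path, chunk_seconds=30):
    """Split a WAV file into consecutive chunks; return the chunk paths."""
    chunk_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the source file
            chunk_path = f"{path}.part{index:03d}.wav"
            with wave.open(chunk_path, "wb") as dst:
                # Reuse the source format; the frame count in the header
                # is corrected by the wave module when the chunk is closed
                dst.setparams(params)
                dst.writeframes(frames)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths
```

Each returned path can then be opened with sr.AudioFile and transcribed as in speech_to_text.py. Note that cutting at fixed intervals can split a word across two chunks, so the transcript may be off at chunk boundaries.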
Web Speech API
If audio input can be directed to your microphone, then you can use the JavaScript Web Speech API: