Transcribe speech to text with Python or the Web Speech API:
Python
Make sure you have Python installed:
python --version
Python version 3 is recommended.
Install the SpeechRecognition module:
pip install SpeechRecognition
Create a script, speech_to_text.py, that transcribes the audio file Hello World.wav to text:
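# speech_to_text.py
import speech_recognition as sr

r = sr.Recognizer()
filename = "Hello World.wav"

# Read the entire audio file into memory
with sr.AudioFile(filename) as source:
    audio = r.record(source)

# Send the audio to the Google Web Speech API and print the transcript
text = r.recognize_google(audio)
print(text)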
Run the script:
python speech_to_text.py # hello world
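The script assumes recognition always succeeds. As a minimal sketch, the same steps can be wrapped in a function that handles the two exceptions SpeechRecognition raises during recognition: `sr.UnknownValueError` for unintelligible audio and `sr.RequestError` for a failed API request. The function name `transcribe_file` and the choice to return `None` on failure are illustrative assumptions, not part of the library.

```python
def transcribe_file(path):
    """Return the transcript of the audio file at `path`, or None on failure.

    `transcribe_file` is a hypothetical helper name used for illustration.
    """
    # Imported here so the dependency is only needed when the function is called.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file

    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # The engine could not make out any speech in the audio.
        print("Speech was unintelligible")
    except sr.RequestError as error:
        # The request failed (for example, no network or a rejected request).
        print(f"API request failed: {error}")
    return None
```

Calling `transcribe_file("Hello World.wav")` would then return the transcript as a string, or `None` after printing the reason for the failure.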
Library
SpeechRecognition supports the following engines/APIs:
recognize_sphinx (works offline)
recognize_google
recognize_wit
recognize_bing
recognize_api
recognize_houndify
recognize_ibm
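Each engine in the list is a method on the `Recognizer` object, so switching engines amounts to calling a different `recognize_*` method. The sketch below dispatches by engine name; the helper `transcribe_with` and its `engine` and `options` parameters are hypothetical names introduced for illustration, not part of the library.

```python
def transcribe_with(recognizer, audio, engine="google", **options):
    """Call recognizer.recognize_<engine>(audio, **options).

    `transcribe_with` is a hypothetical helper; `recognizer` is expected to be
    an sr.Recognizer and `audio` the data returned by recognizer.record().
    """
    method = getattr(recognizer, f"recognize_{engine}", None)
    if not callable(method):
        raise ValueError(f"Unsupported engine: {engine!r}")
    # Engine-specific settings (such as credentials) are passed through as-is.
    return method(audio, **options)
```

For example, `transcribe_with(r, audio, "sphinx")` would run offline, assuming the PocketSphinx package is installed. The hosted engines generally require credentials as extra arguments; the exact parameter names differ per engine, so check the library reference before relying on them.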
Pros | Cons |
---|---|
Free | API limitations (e.g., network timeout, file too big, rate limiting) |
Fairly accurate | Transcript can be off |
 | No punctuation marks |
See guide for more details.
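One way to work around the "file too big" limitation is to send the audio in shorter pieces. The sketch below reads the file in fixed-length chunks using the `duration` argument of `Recognizer.record` and joins the partial transcripts. The helper name `transcribe_in_chunks` and the 30-second default are assumptions chosen for illustration; words that straddle a chunk boundary may be cut or lost, and `sr.RequestError` is deliberately left to propagate to the caller.

```python
def transcribe_in_chunks(path, chunk_seconds=30):
    """Transcribe a long audio file piece by piece and join the results.

    Hypothetical helper: the name and the default chunk length are assumptions.
    """
    # Imported here so the dependency is only needed when the function is called.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    pieces = []
    with sr.AudioFile(path) as source:
        while True:
            # Each call continues from where the previous one stopped.
            audio = recognizer.record(source, duration=chunk_seconds)
            if not audio.frame_data:
                break  # end of file reached
            try:
                pieces.append(recognizer.recognize_google(audio))
            except sr.UnknownValueError:
                continue  # skip silent or unintelligible chunks
    return " ".join(pieces)
```

Shorter chunks keep each request small but increase the number of requests, which can make rate limiting more likely, so the chunk length is a trade-off to tune for the audio at hand.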
Demo
See GitHub repository for more details.
Web Speech API
If audio input can be directed to your microphone, then you can use the JavaScript Web Speech API: