Transcribe Speech to Text

Transcribe speech to text with Python or the Web Speech API:

Python
Web Speech API

Python

Make sure you have Python installed:

python --version

Python 3 is recommended.

Install the SpeechRecognition module:

pip install SpeechRecognition

Create a script, speech_to_text.py, that transcribes the audio file Hello World.wav to text:

# speech_to_text.py
import speech_recognition as sr

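r = sr.Recognizer()
filename = "Hello World.wav"

# Read the entire audio file; listen() would stop at the first pause
with sr.AudioFile(filename) as source:
    audio = r.record(source)

# Send the audio to the Google Web Speech API and print the transcript
text = r.recognize_google(audio)
print(text)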

Run the script:

python speech_to_text.py # hello world
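
The script above assumes the request to the recognition service succeeds on the first try. A minimal sketch of a retry wrapper, assuming that transient failures are worth retrying with exponential backoff, could look like this. The function transcribe_with_retry and its parameters are hypothetical helpers for illustration, not part of SpeechRecognition:

```python
import time


def transcribe_with_retry(recognize, audio, retry_on=(Exception,),
                          attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call recognize(audio), retrying when one of retry_on is raised.

    Hypothetical helper: the name, defaults, and backoff policy are
    assumptions made for this sketch.
    """
    for attempt in range(1, attempts + 1):
        try:
            return recognize(audio)
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With the script above, this could be called as transcribe_with_retry(r.recognize_google, audio, retry_on=(sr.RequestError,)). SpeechRecognition raises sr.RequestError when the service cannot be reached and sr.UnknownValueError when the speech is unintelligible; retrying is only likely to help with the former.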

Library

SpeechRecognition supports the following engines/APIs:

recognize_sphinx (works offline)
recognize_google
recognize_wit
recognize_bing
recognize_api
recognize_houndify
recognize_ibm
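
The set of recognize_* methods can differ between SpeechRecognition releases, so it may be worth checking what the installed version actually provides. The sketch below simply inspects an object for callable attributes whose names start with recognize_. The helper name available_engines is hypothetical:

```python
def available_engines(recognizer):
    """Return the sorted names of callable recognize_* attributes.

    Hypothetical helper for illustration; it only inspects attribute
    names and does not check that an engine is installed or configured.
    """
    return sorted(
        name
        for name in dir(recognizer)
        if name.startswith("recognize_") and callable(getattr(recognizer, name))
    )
```

Passing an sr.Recognizer() instance to available_engines should list the engines exposed by the installed version. A listed method can still fail at call time if it needs an API key or an extra package.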

Pros	Cons
Free	API limitations (e.g., network timeout, file too big, rate limiting)
Fairly accurate	Transcript can be off
	No punctuation marks
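
One possible workaround for the "file too big" limitation is to split a long WAV file into shorter chunks and transcribe each chunk separately. The sketch below uses only the standard-library wave module. The function name split_wav, the 30-second default, and the chunk file naming are assumptions made for illustration:

```python
import wave


def split_wav(path, chunk_seconds=30):
    """Split a WAV file into chunk files of at most chunk_seconds each.

    Hypothetical helper: returns the chunk file paths in order.
    """
    chunk_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the source file
            chunk_path = f"{path}.part{index:03d}.wav"
            with wave.open(chunk_path, "wb") as dst:
                # Copy channels, sample width, and rate; the frame count
                # in the header is corrected when the file is closed.
                dst.setparams(params)
                dst.writeframes(frames)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths
```

Each returned path can then be opened with sr.AudioFile and transcribed as in speech_to_text.py. Splitting at fixed intervals can cut a word in half at a chunk boundary, so the joined transcript may contain errors near those points.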

See the guide for more details.

Demo

Repl.it:

See the GitHub repository for more details.

Web Speech API

If audio input can be directed to your microphone, then you can use the JavaScript Web Speech API:

See Web Speech API Demonstration.