Transcribe Speech to Text

Transcribe speech to text with Python or the Web Speech API:


Make sure you have Python installed:

python --version

Python version 3 is recommended.

Install the SpeechRecognition module:

pip install SpeechRecognition

Create a script that transcribes the audio file Hello World.wav to text:

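import speech_recognition as sr

r = sr.Recognizer()
filename = "Hello World.wav"

# Read the entire audio file into memory
with sr.AudioFile(filename) as source:
    audio = r.record(source)

# Send the audio to the Google Web Speech API and print the transcript
text = r.recognize_google(audio)
print(text)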

Run the script:

python <your script>.py  # hello world
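
The script above assumes that the request succeeds and that speech is found in the audio. A minimal sketch of the same steps with error handling might look like the following. It relies on the `UnknownValueError` and `RequestError` exceptions of the SpeechRecognition package; the helper name `transcribe_file` and the choice to return `None` for unintelligible audio are assumptions made for illustration.

```python
def transcribe_file(path):
    """Return the transcript of an audio file, or None if no speech was recognized.

    Hypothetical helper, not part of the SpeechRecognition package.
    """
    # Imported here so the helper can be defined before the package is installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file

    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # The service returned no usable transcript for this audio.
        return None
    except sr.RequestError as exc:
        # The request itself failed (for example, no network or a refused call).
        raise RuntimeError(f"Speech service request failed: {exc}") from exc
```

Calling `print(transcribe_file("Hello World.wav"))` would then either print the transcript, print `None`, or raise a `RuntimeError` that describes the failed request.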


SpeechRecognition supports the following engines and APIs:

  • recognize_sphinx (works offline)
  • recognize_google
  • recognize_wit
  • recognize_bing
  • recognize_api
  • recognize_houndify
  • recognize_ibm
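
Since each engine in the list above is exposed as a `recognize_...` method on the recognizer, one possible pattern is to try several engines in order and keep the first transcript that comes back. The sketch below is only an illustration: the helper name `recognize_with_fallback` and the default order (online Google engine first, offline Sphinx second) are assumptions, and engines that need credentials are not covered because they take extra arguments.

```python
def recognize_with_fallback(recognizer, audio,
                            engines=("recognize_google", "recognize_sphinx")):
    """Try each engine method in order and return (engine_name, transcript).

    Hypothetical helper, not part of the SpeechRecognition package.
    """
    # Imported here so the helper can be defined before the package is installed.
    import speech_recognition as sr

    failures = []
    for name in engines:
        recognize = getattr(recognizer, name)  # e.g. recognizer.recognize_google
        try:
            return name, recognize(audio)
        except (sr.RequestError, sr.UnknownValueError) as exc:
            # Remember why this engine failed and move on to the next one.
            failures.append(f"{name}: {exc!r}")
    raise RuntimeError("All engines failed: " + "; ".join(failures))
```

With the `r` and `audio` objects from the script above, `recognize_with_fallback(r, audio)` would return the name of the engine that answered together with its transcript.
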

Pros:

  • Free
  • Fairly accurate

Cons:

  • API limitations (e.g., network timeout, file too big, rate limiting)
  • Transcript can be off
  • No punctuation marks
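
One way to work around the "file too big" limitation is to transcribe a long recording in shorter windows. The sketch below uses only Python's standard `wave` module to compute the windows; the helper name `chunk_windows` and the 30-second default are assumptions, not limits documented by any of the engines.

```python
import wave


def chunk_windows(path, chunk_seconds=30.0):
    """Return (offset, duration) pairs, in seconds, that together cover a WAV file.

    Hypothetical helper; the default window length is an arbitrary choice.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")

    # Total length in seconds = number of frames / frames per second.
    with wave.open(path, "rb") as wav:
        total_seconds = wav.getnframes() / wav.getframerate()

    windows = []
    offset = 0.0
    while offset < total_seconds:
        # The last window is shortened so it does not run past the end.
        windows.append((offset, min(chunk_seconds, total_seconds - offset)))
        offset += chunk_seconds
    return windows
```

Each pair could then be passed as the `offset` and `duration` arguments of `Recognizer.record`, reopening the audio file for every window, and the partial transcripts joined afterwards. Words that straddle a window boundary may be cut, so the result can be less accurate than a single request.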

Web Speech API

If you can route the audio input to your microphone, you can use the JavaScript Web Speech API.

See Web Speech API Demonstration.

