Transcribe audio to text


Make sure you have Python installed:

$ python --version

Python version 3 is recommended.

Install the SpeechRecognition module:

$ pip install SpeechRecognition

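Create a script that transcribes the audio file Hello World.wav to text:

import speech_recognition as sr

r = sr.Recognizer()

# Read the whole file into an AudioData object.
with sr.AudioFile('Hello World.wav') as source:
    audio = r.record(source)

# Send the audio to the Google Web Speech API and print the transcript.
text = r.recognize_google(audio)
print(text)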

Run the script (saved here under the example name transcribe.py):

$ python transcribe.py
hello world
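
Before sending a recording to an online recognizer, it can help to check how long it is, since large files may be rejected. The sketch below is not part of SpeechRecognition; it uses only Python's standard wave module, and the helper name wav_info is made up for illustration. It assumes an uncompressed PCM WAV file.

```python
import wave


def wav_info(path):
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav.getsampwidth(),
            # Duration is the number of frames divided by frames per second.
            "duration_s": frames / float(rate),
        }
```

Calling wav_info('Hello World.wav') returns a dictionary with the channel count, sample rate, sample width and duration in seconds.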


SpeechRecognition supports the following engines/APIs:

  • recognize_sphinx (works offline)
  • recognize_google
  • recognize_wit
  • recognize_bing
  • recognize_api
  • recognize_houndify
  • recognize_ibm

Pros:

  • Free
  • Accurate

Cons:

  • API exceptions (network timeout, file too big, rate limiting)
  • Transcript does not include punctuation marks and can be off
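
One way to cope with API exceptions is to retry a failed request and then fall back to another engine. The sketch below is an illustration, not part of the library: the function name transcribe_with_fallback and its parameters are hypothetical. It looks up recognize_<engine> methods by name on whatever recognizer object it is given. With the real library you would presumably pass sr.Recognizer() and retry_on=(sr.RequestError,), the exception SpeechRecognition raises when an API request fails; recognize_sphinx additionally needs PocketSphinx installed.

```python
import time


def transcribe_with_fallback(recognizer, audio, engines=("google", "sphinx"),
                             retry_on=(Exception,), attempts=2, delay_s=1.0,
                             sleep=time.sleep):
    """Try each recognize_<engine> method in order, retrying failures.

    Returns a tuple (engine_name, transcript). Hypothetical helper.
    """
    last_error = None
    for engine in engines:
        method = getattr(recognizer, "recognize_" + engine, None)
        if method is None:
            continue  # this recognizer object has no such engine
        for attempt in range(attempts):
            try:
                return engine, method(audio)
            except retry_on as exc:
                last_error = exc
                if attempt < attempts - 1:
                    # Back off a little longer before each retry.
                    sleep(delay_s * (2 ** attempt))
    raise RuntimeError("all engines failed") from last_error
```

Unintelligible audio is a different case: the library signals it with sr.UnknownValueError, which retrying the same engine is unlikely to fix, so it is left out of retry_on in the suggested call above.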

See guide for more details.


See GitHub repository for more details.


If the audio can be played into your microphone, you can use the JavaScript Web Speech API instead.
