Audio requirements

Details on audio requirements for sending requests to the APIs

Audio format

Our API currently accepts the following audio specifications:


Audio format	WAV, MP3, M4A, OGG, WEBM
Sample rate	Minimum of 16 kHz
Bit rate	Minimum of 16-bit
Audio channels	Mono or stereo
Audio length	30s (Pronunciation), 60s (Scripted), 120s (Unscripted)

Recommended format

Our API will accept any of the formats documented above for ease of use, but for optimal performance and latency we strongly recommend that you send us your audio in the following format:


Audio format	MP3 or WAV (MP3 will be smaller and faster over the network)
Sample rate	16 kHz
Bit rate	16-bit
Audio channels	Mono
Audio length	5 - 15 seconds (Pronunciation and Scripted), 15 - 60 seconds (Unscripted)
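
As a rough illustration, the recommended values above can be checked locally before a recording is uploaded. The sketch below uses only Python's standard `wave` module, so it applies to WAV files only. The function names `check_wav` and `make_silent_wav` and the `max_seconds` parameter are hypothetical helpers written for this example; they are not part of our API.

```python
import io
import wave
from typing import List


def check_wav(data: bytes, max_seconds: float = 15.0) -> List[str]:
    """Return a list of deviations from the recommended format (empty if none).

    Hypothetical helper: the thresholds mirror the recommended-format table
    (16 kHz, 16-bit, mono); max_seconds is an assumption chosen by the caller.
    """
    problems = []
    with wave.open(io.BytesIO(data), "rb") as wav:
        rate = wav.getframerate()
        if rate != 16000:
            problems.append(f"sample rate is {rate} Hz, recommended 16000 Hz")
        # getsampwidth() returns bytes per sample, so 2 means 16-bit audio.
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {wav.getsampwidth() * 8}-bit, recommended 16-bit")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, recommended mono")
        duration = wav.getnframes() / rate
        if duration > max_seconds:
            problems.append(f"duration is {duration:.1f}s, recommended at most {max_seconds}s")
    return problems


def make_silent_wav(seconds: float, rate: int = 16000, channels: int = 1) -> bytes:
    """Build a silent 16-bit WAV in memory, only to exercise check_wav."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * int(seconds * rate) * channels * 2)
    return buf.getvalue()
```

For example, `check_wav(make_silent_wav(seconds=5.0))` returns an empty list, while an 8 kHz stereo recording would produce two entries. Compressed formats such as MP3 would need a third-party library to inspect.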

Base64 encoding

To send us your audio for assessment, you need to encode your recording as a base64 string. Base64 represents the binary data of your audio as plain text so that it can be sent in an HTTP request.

Most programming languages have a built-in base64 encoding/decoding library. For testing purposes, you can use this online converter: https://base64.guru/converter/encode/audio
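
A minimal sketch of the encoding step in Python, using the standard `base64` module. The helper names `encode_audio` and `encode_audio_file` are hypothetical, and the JSON field names in `build_payload` (`audio_base64`, `audio_format`) are assumptions made for illustration; check the reference for the endpoint you are calling for the actual request schema.

```python
import base64
import json


def encode_audio(audio_bytes: bytes) -> str:
    """Encode raw audio bytes as a base64 string (ASCII text)."""
    return base64.b64encode(audio_bytes).decode("ascii")


def encode_audio_file(path: str) -> str:
    """Read an audio file in binary mode and return its base64 encoding."""
    with open(path, "rb") as f:
        return encode_audio(f.read())


def build_payload(audio_bytes: bytes, audio_format: str) -> str:
    """Build a JSON request body.

    The field names below are placeholders for illustration only,
    not the documented request schema.
    """
    return json.dumps({
        "audio_base64": encode_audio(audio_bytes),
        "audio_format": audio_format,
    })
```

Open the file in binary mode (`"rb"`), as reading it as text would corrupt the audio data before it is encoded. Note that base64 output is about one third larger than the original binary, which is another reason to prefer shorter, compressed recordings.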
