Audio requirements

Details on audio requirements for sending requests to the APIs


Audio format

Our API currently accepts the following audio specifications:

Audio format: WAV, MP3, M4A, OGG, WEBM
Sample rate: Minimum of 16 kHz
Bit rate: Minimum of 16-bit
Audio channels: MONO or STEREO
Audio length: 30s (Pronunciation), 60s (Scripted), 120s (Unscripted)

Our API will accept any of the formats documented above for ease of use, but for optimal performance and latency we strongly recommend that you send us your audio in the following format:

Audio format: MP3 or WAV (MP3 will be smaller and faster over the network)
Sample rate: 16 kHz
Bit rate: 16-bit
Audio channels: MONO
Audio length: 5 - 15 seconds (Pronunciation & Scripted), 15 - 60 seconds (Unscripted)
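
As an illustration, a recording could be checked against the recommended format before it is sent. The sketch below is not part of the API: it uses Python's standard `wave` module, so it only handles WAV files, and the function name `check_wav` and its default 5 - 15 second range (the Pronunciation and Scripted recommendation) are assumptions made for this example.

```python
import wave


def check_wav(path, min_seconds=5.0, max_seconds=15.0):
    """Return a list of ways a WAV file differs from the recommended format.

    An empty list means the file matches the recommendation:
    16 kHz sample rate, 16-bit samples, mono, and a duration
    inside [min_seconds, max_seconds].
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()          # samples per second
        width_bits = wav.getsampwidth() * 8  # getsampwidth() returns bytes
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate   # length in seconds

    if rate != 16000:
        problems.append(f"sample rate is {rate} Hz, recommended 16000 Hz")
    if width_bits != 16:
        problems.append(f"sample width is {width_bits}-bit, recommended 16-bit")
    if channels != 1:
        problems.append(f"{channels} channels, recommended mono")
    if not (min_seconds <= duration <= max_seconds):
        problems.append(
            f"duration is {duration:.1f}s, recommended "
            f"{min_seconds:g}-{max_seconds:g}s"
        )
    return problems
```

For an Unscripted recording, the same check could be called as `check_wav(path, min_seconds=15, max_seconds=60)`. Other accepted formats such as MP3 would need a third-party decoder to inspect.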

Base64 encoding

To send us your audio for assessment, you will need to encode your recording as a base64 string. Base64 represents the binary data of your audio as plain text so that it can be sent in an HTTP request.

Most programming languages have a built-in base64 encoding/decoding library. For testing purposes, you can use this online converter: https://base64.guru/converter/encode/audio
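
A minimal sketch of the encoding step in Python, using the standard `base64` module. The helper names `encode_audio` and `build_request_body`, and the JSON field names `audio_base64` and `audio_format`, are hypothetical placeholders chosen for this example; check the API reference for the actual request schema.

```python
import base64
import json


def encode_audio(path):
    """Read an audio file and return its contents as a base64 string."""
    with open(path, "rb") as audio_file:
        raw_bytes = audio_file.read()
    # b64encode returns bytes; decode to str so it can be placed in JSON.
    return base64.b64encode(raw_bytes).decode("ascii")


def build_request_body(path, audio_format):
    """Build a JSON request body containing the encoded audio.

    NOTE: the field names below are assumptions for illustration only,
    not the API's documented schema.
    """
    return json.dumps(
        {
            "audio_base64": encode_audio(path),
            "audio_format": audio_format,
        }
    )
```

Base64 output is roughly one third larger than the original file, which is one more reason a short, compact recording transfers faster.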