Use cases
Description of the product use cases of our APIs
Our API returns a detailed speech report in the form of a JSON object. With the information in the report, you have full flexibility to create a variety of products and learning experiences on your end.
Practice vs Assessment
You might want to provide either automated assessment or automated practice of spoken English for your users. Fundamentally, the difference between the two lies in:
- The number of attempts you give to users.
- How often you give feedback to users.
- The level of detail of the feedback you give to users.
In an assessment scenario, you might give only high-level scores back to your users and indicate whether they passed a certain performance threshold, e.g. scored above a certain grade. You might also allow only one attempt for each question or exercise and provide the scores at the end of the full assessment.
In a practice and teaching scenario, you typically want to allow multiple attempts, give feedback immediately, and provide much more detailed feedback on where the user is doing well or badly, along with suggestions for improvement.
Our APIs support both practice and assessment:
- We provide high-level scores as well as detailed metrics and feedback for each skill of spoken English: Pronunciation, Fluency, Grammar, Vocabulary, Relevance.
- Our API gives you the flexibility to decide in your application if you want to build a practice or assessment experience:
  - The number of attempts you give to users.
  - How often you give feedback to users.
  - The level of detail of the feedback you give to users.
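As a rough sketch of how these three choices could live in your own application code, the snippet below keeps them in a small settings object and trims a report down to what each experience should show. The `ExperienceConfig` class, the report shape (`overall_score`, `skills`), and the pass mark of 70 are all assumptions made for illustration; none of them are API parameters or documented response fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperienceConfig:
    """Application-side settings (hypothetical; not sent to the API)."""
    max_attempts: int          # attempts allowed per question
    immediate_feedback: bool   # show feedback after every attempt?
    detailed_feedback: bool    # show per-skill detail, or only the overall result?


ASSESSMENT = ExperienceConfig(max_attempts=1, immediate_feedback=False, detailed_feedback=False)
PRACTICE = ExperienceConfig(max_attempts=3, immediate_feedback=True, detailed_feedback=True)


def feedback_for_user(report: dict, config: ExperienceConfig, pass_mark: float = 70.0) -> dict:
    """Reduce a speech report to what this experience should show the user.

    `report` uses an assumed shape: {"overall_score": float, "skills": {...}}.
    """
    overall = report["overall_score"]
    feedback = {"overall_score": overall, "passed": overall >= pass_mark}
    if config.detailed_feedback:
        # Practice mode: pass the per-skill detail through to the UI.
        feedback["skills"] = report.get("skills", {})
    return feedback
```

The same report can then drive either experience: `feedback_for_user(report, ASSESSMENT)` yields only the overall score and a pass flag, while `feedback_for_user(report, PRACTICE)` also includes the per-skill detail.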
Creating question types
Our APIs allow you to easily build the following question types in your application:
Repeat / read-aloud use case
In this scenario, you know up front what you expect the user to say. This might be:
- Repeating a word or sentence from a recording or video
- Reading aloud a word, sentence or paragraph
Our scripted speech assessment API allows you to specify the expected text and will return a JSON report containing:
- A detailed pronunciation assessment report
- A detailed fluency report
- A detailed reading report
- Relevance information, indicating how close to the expected text the user actually spoke
More details on the scripted API response: Speech Assessment Report Scripted
Here is an example of the scripted speech assessment API used to build a read aloud question type:
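A minimal sketch of how a request body for a read-aloud question might be assembled is given below. The field names (`audio_base64`, `audio_format`, `expected_text`) and the choice to send audio as base64 are assumptions for illustration only; consult the scripted API reference for the actual schema, endpoint, and authentication.

```python
import base64
import json


def build_scripted_request(audio_bytes: bytes, expected_text: str, audio_format: str = "wav") -> str:
    """Build a JSON body for a repeat / read-aloud question.

    Field names are illustrative assumptions, not the documented schema.
    """
    if not expected_text.strip():
        raise ValueError("a scripted request needs the text the user was asked to say")
    payload = {
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "audio_format": audio_format,
        "expected_text": expected_text,
    }
    return json.dumps(payload)


# Placeholder bytes stand in for a real recording of the user reading the sentence.
body = build_scripted_request(b"\x00\x01fake-audio", "The quick brown fox jumps over the lazy dog.")
```

The resulting `body` string is what your application would send to the scripted endpoint with an HTTP client of your choice.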
Open ended questions
In this scenario, you do not know up front exactly what the user will say. Instead, you ask them an open-ended question to which they might respond in a multitude of ways. You do, however, expect their answer to be relevant to your question.
These might be open-ended prompts like:
"What is your favorite hobby?"
"Tell me about your favorite book."
"What do you want to study when you are older?"
Our unscripted speech assessment API solves this use case. You do not need to provide the expected text, and the API will return a JSON report containing:
- A detailed pronunciation assessment report
- A detailed fluency report
- A detailed grammar report
- A detailed vocabulary report
More details on the unscripted API response: Speech Assessment Report Unscripted
Our API can also solve the hard problem of assessing relevance in an open-ended context. If you provide our API with the question asked to the user, it will return a relevance classification indicating whether the user's answer was relevant, along with an explanation as to why.
It can also solve the hard problem of assessing answer validity in an open-ended context. How do you validate answers if you do not know up front exactly what the user will say? If you provide our API with a description of the valid answer you expect, it will return a validity classification indicating whether the user's answer was valid, along with an explanation as to why.
You can read more details about our content relevance and answer validation features in this guide: Content relevance (Beta)
Here is an example of the unscripted speech assessment API used to build an open ended question type:
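The sketch below shows one way an open-ended question could be wired up: a request that carries the question (and, optionally, a valid answer description) but no expected text, plus a helper that reads the relevance and validity classifications back. Every field name here (`content_context`, `question`, `valid_answer_description`, `relevance`, `validity`, `is_ok`) is a hypothetical stand-in; the real request and response schemas are described in the linked references.

```python
import base64
from typing import Optional


def build_unscripted_request(
    audio_bytes: bytes,
    question: Optional[str] = None,
    valid_answer_description: Optional[str] = None,
    audio_format: str = "wav",
) -> dict:
    """Build a request body for an open-ended question (no expected text).

    All field names are illustrative assumptions, not the documented schema.
    """
    payload = {
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "audio_format": audio_format,
    }
    context = {}
    if question:
        context["question"] = question
    if valid_answer_description:
        context["valid_answer_description"] = valid_answer_description
    if context:
        # Only attach the content checks when there is something to check against.
        payload["content_context"] = context
    return payload


def answer_is_acceptable(report: dict) -> bool:
    """Accept an answer unless a returned check explicitly fails.

    Assumes hypothetical keys "relevance" and "validity", each holding
    {"is_ok": bool, "explanation": str}; checks absent from the report are skipped.
    """
    checks = (report.get("relevance"), report.get("validity"))
    return all(check["is_ok"] for check in checks if check is not None)
```

For example, `build_unscripted_request(audio, question="What is your favorite hobby?")` produces a body with the question attached, and the explanation returned with each classification can be shown to the user when `answer_is_acceptable` returns `False`.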
Open ended recall or description
In this scenario, you expect an open-ended answer, but might not be directly asking the user a question. For example:
- Asking the user to describe a picture
- Showing some content to the user (text, audio, video) and asking them to recap it so that you can test their comprehension skills.
Our unscripted speech assessment API solves this use case as well. You can provide a context description and valid answer description in your request without needing to provide the question context. Our API will then still return the content relevance and/or valid answer classification.
For a describe-the-picture question, you could show a picture of a sunset on the beach and provide the context description: "The user should describe a beautiful sunset on the beach"
For a recall question, you could play a recording of Martin Luther King's famous "I have a dream" speech and provide the context description: "The user should mention that the speech outlines the long history of racial injustice in America and encourages its audience to hold their country accountable to its own founding promises of freedom, justice, and equality"
You can read more details about our content relevance and answer validation features in this guide: Content relevance (Beta)
Here is an example of the unscripted speech assessment API used to build a describe the picture question type:
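One possible way to model a describe-the-picture question in application code is sketched below. The `DescribePictureQuestion` class and the key names it emits (`context_description`, `valid_answer_description`) are assumptions for illustration, and in this sketch the image path is used only by your own UI. No question text is included, matching the scenario described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DescribePictureQuestion:
    """A picture prompt defined in your application (hypothetical model)."""
    image_path: str                                 # displayed by your own UI
    context_description: str                        # what the answer should be about
    valid_answer_description: Optional[str] = None  # optional stricter check

    def content_fields(self) -> dict:
        """Return the content-check fields to send alongside the user's audio.

        Key names are illustrative assumptions; no question text is included.
        """
        fields = {"context_description": self.context_description}
        if self.valid_answer_description is not None:
            fields["valid_answer_description"] = self.valid_answer_description
        return fields


sunset = DescribePictureQuestion(
    image_path="images/sunset.jpg",  # placeholder path
    context_description="The user should describe a beautiful sunset on the beach",
)
```

The dictionary from `sunset.content_fields()` would be merged into the unscripted request body next to the recorded audio.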
Other use cases and capabilities
Use cases of both the scripted and unscripted APIs
Give your users a prediction of their English test scores
Our API returns predicted scores for the following recognized tests and standards:
- IELTS
- PTE
- CEFR
We also return our own internal score on a scale of 0 to 100, in case you do not want to report against the standards above.
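As a small illustration, an application could let each customer choose which scale to display and fall back to the internal score when the preferred one is unavailable. The key names (`ielts`, `pte`, `cefr`, `internal`) and the sample values are made up for this sketch and do not reflect the documented response fields.

```python
from typing import Tuple, Union

Score = Union[float, int, str]  # CEFR levels are labels such as "B2"


def pick_reported_score(scores: dict, preferred: str = "ielts") -> Tuple[str, Score]:
    """Return (scale_name, value) for the preferred scale.

    Falls back to the internal 0 to 100 score when the preferred scale is
    missing. Key names are illustrative assumptions.
    """
    value = scores.get(preferred)
    if value is not None:
        return preferred, value
    return "internal", scores["internal"]


# Invented sample values, for illustration only.
sample_scores = {"ielts": 6.5, "pte": 58, "cefr": "B2", "internal": 74}
```

Calling `pick_reported_score(sample_scores, "cefr")` selects the CEFR label, while an unknown scale name falls back to the internal score.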
Score a user's pronunciation at different levels (overall, word, phoneme)
The pronunciation report part of our API output gives back "nativeness scores", which indicate how close the user's speech is to that of a native speaker. These scores are given at the phoneme, word and overall level. This allows you to pinpoint where your users are going wrong and help them identify the specific words or phonemes they need to improve on.
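To make the three levels concrete, the sketch below walks a pronunciation report and collects the words and phonemes that fall under a chosen threshold. The report shape (`words`, `text`, `score`, `phonemes`, `symbol`), the phoneme symbols, the threshold of 60, and the sample numbers are all assumptions for illustration; the real structure is described in the speech assessment report references.

```python
def weakest_items(pronunciation: dict, threshold: float = 60.0) -> dict:
    """Collect words and phonemes scoring below `threshold`.

    Assumes a hypothetical report shape:
    {"overall_score": float,
     "words": [{"text": str, "score": float,
                "phonemes": [{"symbol": str, "score": float}, ...]}, ...]}
    """
    weak_words = []
    weak_phonemes = []
    for word in pronunciation["words"]:
        if word["score"] < threshold:
            weak_words.append(word["text"])
        for phoneme in word["phonemes"]:
            if phoneme["score"] < threshold:
                # Keep the word alongside the phoneme so the UI can show context.
                weak_phonemes.append((word["text"], phoneme["symbol"]))
    return {"words": weak_words, "phonemes": weak_phonemes}


# Invented sample data, for illustration only.
sample = {
    "overall_score": 71.0,
    "words": [
        {"text": "think", "score": 52.0, "phonemes": [
            {"symbol": "th", "score": 31.0},
            {"symbol": "ih", "score": 88.0},
            {"symbol": "ng", "score": 64.0},
            {"symbol": "k", "score": 90.0},
        ]},
        {"text": "big", "score": 90.0, "phonemes": [
            {"symbol": "b", "score": 93.0},
            {"symbol": "ih", "score": 91.0},
            {"symbol": "g", "score": 86.0},
        ]},
    ],
}
```

With this sample, `weakest_items(sample)` flags the word "think" and its "th" phoneme, which is the kind of targeted feedback a practice experience could surface.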
Target different accents
Our API supports both US and UK accents, so you can choose which accent your users should be compared against.
Cumulative scoring and tracking of users over time
Our API is stateless, meaning each API call is independent and does not hold information from previous calls. However, you can implement cumulative scoring in your own application by storing the core metrics that our API returns over time. This allows you to:
- Score multiple questions from a user and give them a summarized speaking report at the end.
- Build up a profile of English proficiency for your users and see how they progress over time. Identify their strengths and weaknesses and recommend relevant content.
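Because each call is independent, the bookkeeping lives entirely in your application. A minimal in-memory sketch of such a tracker follows; the `ProgressTracker` class and the skill names are hypothetical, and a real application would persist the history in a database rather than in memory.

```python
from collections import defaultdict
from statistics import mean
from typing import Optional


class ProgressTracker:
    """In-memory store of per-skill scores (illustrative, not part of the API)."""

    def __init__(self) -> None:
        # user_id -> skill -> scores in the order they were recorded
        self._history = defaultdict(lambda: defaultdict(list))

    def record(self, user_id: str, skill_scores: dict) -> None:
        """Store the core metrics extracted from one API response."""
        for skill, score in skill_scores.items():
            self._history[user_id][skill].append(score)

    def summary(self, user_id: str) -> dict:
        """Average score per skill across everything recorded for the user."""
        return {skill: mean(scores) for skill, scores in self._history[user_id].items()}

    def weakest_skill(self, user_id: str) -> Optional[str]:
        """Skill with the lowest average, or None if nothing is recorded."""
        averages = self.summary(user_id)
        return min(averages, key=averages.get) if averages else None
```

After recording the scores from each question, `summary(user_id)` can feed an end-of-session speaking report, and `weakest_skill(user_id)` can be used to recommend relevant content.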