The Utterance End feature detects the end of speech by waiting for a configured number of milliseconds of silence after the last detected speech. End of speech is detected using a combination of VAD (Voice Activity Detection) and the transcription model. The transcription model is also used to filter out false positives from the VAD's speech detection.
Configuration
The Utterance End feature is configured with the utteranceEnd query parameter. For example, utteranceEnd=1000 waits for 1000 milliseconds of silence after the last detected speech before ending the utterance.
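
As a minimal sketch of setting this parameter, the snippet below appends utteranceEnd to a connection URL while keeping any query parameters that are already present. The endpoint wss://example.invalid/stream, the language parameter in the usage lines, and the helper name with_utterance_end are placeholders chosen for illustration. Only the utteranceEnd parameter itself comes from this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical endpoint used only for illustration; substitute the real one.
BASE_URL = "wss://example.invalid/stream"


def with_utterance_end(url: str, silence_ms: int) -> str:
    """Return `url` with the utteranceEnd query parameter set to `silence_ms`."""
    if silence_ms <= 0:
        raise ValueError("silence_ms must be a positive number of milliseconds")
    parts = urlsplit(url)
    # Keep existing query parameters and add (or overwrite) utteranceEnd.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["utteranceEnd"] = str(silence_ms)
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_utterance_end(BASE_URL, 1000))
# wss://example.invalid/stream?utteranceEnd=1000
print(with_utterance_end(BASE_URL + "?language=en", 1500))
# wss://example.invalid/stream?language=en&utteranceEnd=1500
```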
Result
When an utterance ends, a transcription response with the "utterance": true attribute is emitted. Depending on the underlying engine implementation, this response either carries transcribed text or is an empty transcription.
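
One way a client might consume these responses is sketched below: it groups incoming text into utterances and closes the current one whenever a response carries "utterance": true. The "transcript" field name and the JSON message shape are assumptions made for this example and are not confirmed by this page. Only the "utterance" attribute is documented here. The sketch handles both cases described above, where the closing response contains text or is empty.

```python
import json


def collect_utterances(messages):
    """Group transcribed text into utterances, closing one on each utterance-end response."""
    utterances = []
    current = []
    for raw in messages:
        payload = json.loads(raw)
        # "transcript" is a hypothetical field name used for illustration.
        text = payload.get("transcript", "")
        if text:
            current.append(text)
        # The documented marker: a response with "utterance": true ends the utterance.
        if payload.get("utterance") is True:
            if current:
                utterances.append(" ".join(current))
            current = []
    # Text received after the last utterance-end response is not returned,
    # because that utterance has not ended yet.
    return utterances


sample = [
    '{"transcript": "hello"}',
    '{"transcript": "world", "utterance": true}',  # closing response with text
    '{"transcript": "how are you"}',
    '{"transcript": "", "utterance": true}',       # closing response that is empty
    '{"transcript": "still speaking"}',
]
print(collect_utterances(sample))
# ['hello world', 'how are you']
```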