SpeechToTextButton Configuration Options

To enhance the speech recognition experience of the Kendo UI for Angular SpeechToTextButton, you can configure how the component handles speech input and provides recognition results.

  • Continuous recognition—Enable the SpeechToTextButton to keep listening for additional speech input without requiring the user to click the button again.
  • Interim results—Allow the SpeechToTextButton to provide interim results while the user is still speaking.
  • Multiple recognition alternatives—Enable the SpeechToTextButton to return multiple recognition alternatives for a given speech input.
  • Language recognition—Configure the SpeechToTextButton to recognize speech in different languages by specifying a BCP 47 language tag.

Continuous Recognition

By default, the SpeechToTextButton stops listening after recognizing a single phrase. You can enable continuous recognition to keep listening for additional speech input without requiring the user to click the button again.

To enable continuous recognition, set the continuous property to true:

html
<button kendoSpeechToTextButton 
        [continuous]="true"
        (result)="handleResult($event)">
    Click to speak
</button>

When continuous recognition is enabled:

  • The button remains active and continues listening after recognizing speech.
  • Multiple phrases can be recognized in sequence.
  • The recognition session continues until the user manually stops it or an error occurs.
  • The component fires multiple result events as different phrases are recognized.
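
Because continuous mode fires one result event per recognized phrase, a handler usually needs to accumulate phrases instead of overwriting the previous one. The following is a minimal sketch of that idea in plain TypeScript. The `RecognizedPhrase` interface and the `TranscriptLog` class are hypothetical names introduced for illustration; the interface is a simplified stand-in for the event object and contains only the `isFinal` and `alternatives` fields used elsewhere in this article.

```typescript
// Hypothetical, simplified stand-in for the result event. It mirrors only
// the fields used in this article (isFinal, alternatives[].transcript).
interface RecognizedPhrase {
    isFinal: boolean;
    alternatives: Array<{ transcript: string; confidence: number }>;
}

// Collects the phrases recognized during one continuous session.
class TranscriptLog {
    private readonly phrases: string[] = [];

    // Call this from the (result) handler for every event.
    public add(event: RecognizedPhrase): void {
        const top = event.alternatives[0];
        // Skip events without alternatives and interim (non-final) results.
        if (!top || !event.isFinal) {
            return;
        }
        const text = top.transcript.trim();
        if (text.length > 0) {
            this.phrases.push(text);
        }
    }

    // Number of phrases stored so far.
    public get count(): number {
        return this.phrases.length;
    }

    // All stored phrases joined into a single string.
    public get text(): string {
        return this.phrases.join(' ');
    }
}
```

A component could keep one `TranscriptLog` instance per session and call `add($event)` from its `(result)` handler. Storing phrases in an array rather than a single string keeps each phrase available for later editing or removal.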

Interim Results

The SpeechToTextButton can provide interim (in-progress) results while the user is still speaking. This feature is useful for real-time speech-to-text experiences where users see their words appear as they speak, for example, voice input in messaging or note-taking apps.

To enable interim results, set the interimResults property to true:

html
<button kendoSpeechToTextButton 
        [interimResults]="true"
        (result)="handleResult($event)">
    Click to speak
</button>

When interim results are enabled:

  • The result event fires multiple times during speech recognition.
  • Each result event contains the isFinal property indicating whether the result is final or interim.
  • Interim results may change as the speech recognition engine processes more audio.
  • Final results are provided when the engine has completed processing a phrase.

To properly handle interim results, check the isFinal property in your event handler:

typescript
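// Assumes the event type is exported from the Kendo UI for Angular Buttons package.
import { SpeechToTextResultEvent } from '@progress/kendo-angular-buttons';

export class AppComponent {
    // Text from results that the engine has marked as final.
    public finalText = '';
    // Latest in-progress text; replaced on every interim result.
    public interimText = '';

    public handleResult(event: SpeechToTextResultEvent): void {
        if (event.alternatives && event.alternatives.length > 0) {
            const transcript = event.alternatives[0].transcript;

            if (event.isFinal) {
                // Commit the phrase and clear the in-progress preview.
                this.finalText += transcript + ' ';
                this.interimText = '';
            } else {
                this.interimText = transcript;
            }
        }
    }
}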

Multiple Recognition Alternatives

The speech recognition engine can provide multiple alternative transcripts for the same audio input. This is useful when you want to give users choices or implement custom logic to select the best transcript.

To configure the number of transcripts provided, set the maxAlternatives property:

html
<button kendoSpeechToTextButton 
        [maxAlternatives]="3"
        (result)="handleResult($event)">
    Click to speak
</button>

Each alternative in the alternatives array contains:

  • transcript—The recognized text.
  • confidence—A confidence score (0-1) indicating the engine's certainty.

Process the multiple alternatives in your event handler:

typescript
// Assumes the event type is exported from the Kendo UI for Angular Buttons package.
import { SpeechToTextResultEvent } from '@progress/kendo-angular-buttons';

export class AppComponent {
    public selectedTranscript = '';
    public allAlternatives: Array<{transcript: string, confidence: number}> = [];

    public handleResult(event: SpeechToTextResultEvent): void {
        if (event.alternatives && event.alternatives.length > 0) {
            // Copy only the fields needed for display.
            this.allAlternatives = event.alternatives.map(alt => ({
                transcript: alt.transcript,
                confidence: alt.confidence
            }));

            // Select the alternative with the highest confidence.
            // reduce() without an initial value is safe here because the
            // array is known to be non-empty.
            const bestAlternative = this.allAlternatives.reduce((best, current) =>
                current.confidence > best.confidence ? current : best
            );

            this.selectedTranscript = bestAlternative.transcript;
        }
    }
}
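
When the goal is to offer users a choice rather than pick a single winner, the alternatives can be filtered and ordered before they are displayed. The sketch below shows one possible approach. The `Alternative` interface and the `rankAlternatives` function are hypothetical, and the default threshold of `0.5` is an arbitrary illustrative value, not something defined by the component.

```typescript
// Hypothetical shape matching the alternative fields described above.
interface Alternative {
    transcript: string;
    confidence: number;
}

// Returns the alternatives at or above minConfidence, highest confidence first.
// The 0.5 default is an arbitrary illustrative threshold.
function rankAlternatives(
    alternatives: Alternative[],
    minConfidence = 0.5
): Alternative[] {
    return alternatives
        // filter() returns a new array, so the input is not mutated by sort().
        .filter(alt => alt.confidence >= minConfidence)
        .sort((a, b) => b.confidence - a.confidence);
}
```

If every alternative falls below the threshold, the function returns an empty array, so the caller may want to fall back to the first alternative in the original list.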

Language Recognition

The SpeechToTextButton supports recognizing speech in different languages by specifying a BCP 47 language tag through the lang property. This allows you to tailor the speech recognition experience to your application's audience and support multilingual scenarios.

Setting the Language

To configure the language for speech recognition, set the lang property to the desired BCP 47 language tag (for example, 'en-US' for American English, 'de-DE' for German, or 'es-ES' for Spanish).

html
<button kendoSpeechToTextButton lang="es-ES"></button>

By default, the SpeechToTextButton uses 'en-US' (American English) if no language is specified.

Supported Languages

The available languages depend on the underlying speech recognition engine. For the browser's Web Speech API, refer to the list of supported languages.
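
Since the supported set depends on the engine, an application may want to choose the recognition language from its own list of languages it has verified. The following sketch picks the first user-preferred BCP 47 tag that appears in such a list. The `pickRecognitionLanguage` function and the supported list passed to it are hypothetical and defined by the application; the `'en-US'` fallback simply mirrors the default mentioned above.

```typescript
// Hypothetical helper: returns the first preferred BCP 47 tag that the
// application supports, or the fallback when nothing matches.
function pickRecognitionLanguage(
    preferred: readonly string[],
    supported: readonly string[],
    fallback = 'en-US'
): string {
    for (const tag of preferred) {
        // Exact match first, compared case-insensitively.
        const exact = supported.find(s => s.toLowerCase() === tag.toLowerCase());
        if (exact) {
            return exact;
        }
        // Otherwise match on the primary language subtag ("es" in "es-MX").
        const primary = tag.split('-')[0].toLowerCase();
        const partial = supported.find(
            s => s.split('-')[0].toLowerCase() === primary
        );
        if (partial) {
            return partial;
        }
    }
    return fallback;
}
```

In a browser, the preferred list could come from `navigator.languages`, and the returned tag could then be bound to the `lang` property of the button.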

Browser Support and Considerations

The advanced features of the Kendo UI for Angular SpeechToTextButton rely on the browser's Web Speech API implementation:

  • Continuous Recognition (continuous): Supported in Chrome, Edge, and Safari. May have time limits in some browsers.
  • Interim Results (interimResults): Supported in Chrome and Edge. Safari support may vary.
  • Multiple Alternatives (maxAlternatives): Supported in Chrome and Edge. The number of alternatives may be limited by the browser.

For the most current browser support information, refer to the Web Speech API compatibility table on MDN.

For cross-browser compatibility, consider:

  • Providing fallback behavior when features are not supported.
  • Testing in your target browsers.
  • Using feature detection to enable/disable functionality.
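
As a rough illustration of the feature-detection point, the sketch below checks whether a global scope exposes a speech recognition constructor, either under the standard `SpeechRecognition` name or the prefixed `webkitSpeechRecognition` name. The `supportsSpeechRecognition` function is a hypothetical helper. It detects only the presence of the API and cannot tell whether individual options such as interim results behave as expected in a given browser.

```typescript
// Hypothetical helper: returns true when the given global scope exposes a
// Web Speech API recognition constructor (standard or webkit-prefixed).
function supportsSpeechRecognition(scope: Record<string, unknown>): boolean {
    return typeof scope['SpeechRecognition'] === 'function'
        || typeof scope['webkitSpeechRecognition'] === 'function';
}
```

In a browser, the function could be called as `supportsSpeechRecognition(window as unknown as Record<string, unknown>)`, and the result used to decide whether to render the button or a text-input fallback. Accepting the scope as a parameter keeps the check usable in server-side rendering and unit tests, where `window` is not defined.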

Known Limitations

  • Some browsers may impose time limits on continuous recognition sessions.
  • Interim results quality varies between browsers and languages.
  • The number of alternatives returned may be less than the requested maxAlternatives.
  • Features may not be available when using custom speech recognition providers (integrationMode="none").