I was writing a chatbot in which a user interacts with a machine-learning-powered bot, and I wanted to write a general example application that anybody could use. This application has no intelligence: the bot simply recites what it hears, so that anyone can implement their own logic. I used Angular and a reactive approach, and the main logic lives mostly in the service. Demo can be found here.

A small catch: for now, this application only works on Chrome, as it is currently the only browser that supports the Web Speech API.

Since I am using a reactive approach, I created an interface named Action with an optional payload parameter. All our actions implement this interface. Components subscribe to these actions and react to them.

    export interface Action {
      payload?: any;
    }

    export class SpeakingStarted implements Action {}

    export class SpeakingEnded implements Action {}

    export class ListeningStarted implements Action {}

    export class ListeningEnded implements Action {}

    export class RecognizedTextAction implements Action {
      constructor(public payload: string) {}
    }

    export class SpeakAction implements Action {
      constructor(public payload: string) {}
    }

In the constructor, we inject NgZone. Because the Web Speech API lives outside of Angular, we need NgZone to bring it into the Angular realm.

    constructor(private zone: NgZone) {
      this.window = (window as unknown) as IWindow;
      this.createSpeaker();
      this.createListener();
      this.subscriptions();
    }

One challenge is that the synthesizer and the recognizer are separate objects. When the synthesizer is speaking, the recognizer picks it up and the two go into a loop, so we need to pause one or the other. As you can see in the snippet above, we create the speaker and the listener separately, and then set up subscriptions to keep them from interfering with each other. For example, we react to the SpeakingStarted action (as shown below): when it is received, we stop listening, and when SpeakingEnded or ListeningEnded is received, we start listening again.

    subscriptions() {
      this.getType(SpeakingStarted)
        .pipe(
          // tap(()=> console.log('will stop recognition')),
          tap(() => this.stopListening()),
          takeUntil(this.destroy$)
        )
        .subscribe();

      merge(this.getType(SpeakingEnded), this.getType(ListeningEnded))
        .pipe(
          filter(() => !this.isSpeaking),
          // tap(()=> console.log('will start recognition')),
          tap(() => this.startListening()),
          takeUntil(this.destroy$)
        )
        .subscribe();

      this.getType(SpeakAction)
        .pipe(
          tap((text) => this._speak(text)),
          takeUntil(this.destroy$)
        )
        .subscribe();
    }

We need a couple of observables that the application can subscribe to. The speaker and the listener use the action$ observable to dispatch their actions. getType is a utility method that lets a consumer pass the type of action it wants and receive only those events, such as getType(SpeakingStarted).

    private _voices$ = new BehaviorSubject(null);
    voices$ = this._voices$.asObservable();

    private _activeVoice$ = new BehaviorSubject(null);
    activeVoice$ = this._activeVoice$.asObservable();

    private _action$ = new Subject<Action>();
    action$ = this._action$.asObservable();

    getType(action: Action | any) {
      return this.action$.pipe(
        filter((i) => i instanceof action),
        map((i) => i.payload),
        takeUntil(this.destroy$)
      );
    }

Based on the API, we create a SpeechSynthesisUtterance instance, set up its parameters, and attach the functions we are interested in. onstart and onend are the functions we need in our case.

We assign functions so that, whenever they are invoked, they trigger an action. We wrap them in zone.run so that the actions actually work in Angular. Lastly, we load the voices.

    private createSpeaker() {
      this.speaker = new SpeechSynthesisUtterance();
      this.speaker.lang = this.language;
      this.speaker.onstart = () => {
        this.zone.run(() => {
          this.isSpeaking = true;
          this._action$.next(new SpeakingStarted());
        });
      };
      this.speaker.onend = () => {
        this.zone.run(() => {
          this.isSpeaking = false;
          this._action$.next(new SpeakingEnded());
        });
      };
      this.loadVoices();
    }

To load voices, we need to add an onvoiceschanged function to the speechSynthesis object on the window (this is not the speaker object). Similarly, we emit an event after receiving the voices, and we remove this function after the first run. The reason is that onvoiceschanged may be invoked more than once during the lifetime of our service. (We could also check whether the voices have changed, but we don't need such a feature for now.)

    private loadVoices() {
      this.window.speechSynthesis.onvoiceschanged = () => {
        this.zone.run(() => {
          const voices = this.window.speechSynthesis.getVoices();
          this.voices = voices;
          this._voices$.next(voices);
          const voice_us = voices.find((i) => {
            // console.log(i.name);
            return i.name.indexOf(this.defaultVoiceName) > -1;
          });
          this.onVoiceSelected(voice_us, false);
        });
        // we are removing the function after its called,
        // as we will not need this to be called any more.
        this.window.speechSynthesis.onvoiceschanged = null;
      };
    }

Similar to the speaker, we instantiate our listener and set up its parameters and actions. Documentation for the parameters can be found here.

When the listener produces a value, it invokes the onresult function with the possible results. We call the extractText method to get the text out of it and dispatch RecognizedTextAction with the actual recognized text. Any component subscribed to this action can get the actual value without dealing with the details.

When the listener ends, we restart it so it can start over.

Full Service:

    import { Observable, merge, Subject, BehaviorSubject } from 'rxjs';
    import { Injectable, NgZone } from '@angular/core';
    import { map, filter, tap, takeUntil } from 'rxjs/operators';

    interface IWindow extends Window {
      webkitSpeechRecognition: any;
      SpeechRecognition: any;
      SpeechSynthesisUtterance: any;
    }

    export interface RecognizedText {
      term: string;
      confidence: number;
      isFinal: boolean;
    }

    export interface Action {
      payload?: any;
    }

    export class SpeakingStarted implements Action {}

    export class SpeakingEnded implements Action {}

    export class ListeningStarted implements Action {}

    export class ListeningEnded implements Action {}

    export class RecognizedTextAction implements Action {
      constructor(public payload: string) {}
    }

    export class SpeakAction implements Action {
      constructor(public payload: string) {}
    }

    @Injectable({
      providedIn: 'root',
    })
    export class SenseService {
      private defaultVoiceName = 'Google US English';
      private language = 'en-US';

      destroy$ = new Subject();

      window: IWindow;
      listener: any;
      speaker: any;

      isAllowed = true;

      voices: any[] = null;

      private _voices$ = new BehaviorSubject(null);
      voices$ = this._voices$.asObservable();

      private _activeVoice$ = new BehaviorSubject(null);
      activeVoice$ = this._activeVoice$.asObservable();

      private _action$ = new Subject<Action>();
      action$ = this._action$.asObservable();

      // The speaking/listening flags are stored on the browser objects themselves.
      set isSpeaking(val: boolean) {
        this.speaker._isSpeaking = val;
      }
      get isSpeaking(): boolean {
        return !!this.speaker._isSpeaking;
      }

      set isListening(val: boolean) {
        this.listener._isListening = val;
      }
      get isListening(): boolean {
        return !!this.listener._isListening;
      }

      constructor(private zone: NgZone) {
        this.window = (window as unknown) as IWindow;
        this.createSpeaker();
        this.createListener();
        this.subscriptions();
      }

      subscriptions() {
        this.getType(SpeakingStarted)
          .pipe(
            // tap(()=> console.log('will stop recognition')),
            tap(() => this.stopListening()),
            takeUntil(this.destroy$)
          )
          .subscribe();

        merge(this.getType(SpeakingEnded), this.getType(ListeningEnded))
          .pipe(
            filter(() => !this.isSpeaking),
            // tap(()=> console.log('will start recognition')),
            tap(() => this.startListening()),
            takeUntil(this.destroy$)
          )
          .subscribe();

        this.getType(SpeakAction)
          .pipe(
            tap((text) => this._speak(text)),
            takeUntil(this.destroy$)
          )
          .subscribe();
      }

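      getType(action: Action | any): Observable<any> {
        return this.action$.pipe(
          filter((i) => i instanceof action),
          map((i) => i.payload),
          takeUntil(this.destroy$)
        );
      }

      private createSpeaker() {
        const key = '_ms_Speaker';
        // Reuse the speaker cached on the window, if there is one.
        if (this.window[key]) {
          console.log('speaker found');
          this.speaker = this.window[key];
          return;
        }

        this.speaker = new SpeechSynthesisUtterance();
        this.window[key] = this.speaker;
        // this.speaker.voiceURI = 'native';
        // this.speaker.volume = 1; // 0 to 1
        // this.speaker.rate = 1; // 0.1 to 10
        // this.speaker.pitch = 0; //0 to 2
        // this.speaker.text = 'Hello World';
        this.speaker.lang = this.language;
        this.speaker.onstart = () => {
          this.zone.run(() => {
            this.isSpeaking = true;
            this._action$.next(new SpeakingStarted());
          });
        };
        this.speaker.onend = () => {
          this.zone.run(() => {
            this.isSpeaking = false;
            this._action$.next(new SpeakingEnded());
          });
        };
        this.loadVoices();
      }

      private loadVoices() {
        this.window.speechSynthesis.onvoiceschanged = () => {
          this.zone.run(() => {
            const voices = this.window.speechSynthesis.getVoices();
            this.voices = voices;
            this._voices$.next(voices);
            const voice_us = voices.find((i) => {
              // console.log(i.name);
              return i.name.indexOf(this.defaultVoiceName) > -1;
            });
            this.onVoiceSelected(voice_us, false);
          });
          // we are removing the function after its called,
          // as we will not need this to be called any more.
          this.window.speechSynthesis.onvoiceschanged = null;
        };
      }

      private createListener() {
        const key = '_ms_Listener';
        // Reuse the listener cached on the window, if there is one.
        if (this.window[key]) {
          console.log('recognition found');
          this.listener = this.window[key];
          this.startListening();
          return;
        }
        const webkitSpeechRecognition = this.window.webkitSpeechRecognition;
        this.listener = new webkitSpeechRecognition();
        this.window[key] = this.listener;
        this.listener.continuous = true;
        this.listener.interimResults = true;
        this.listener.lang = this.language;
        this.listener.maxAlternatives = 1;
        this.listener.maxResults = 25;

        this.listener.onstart = () => {
          this.zone.run(() => {
            this.isListening = true;
            this._action$.next(new ListeningStarted());
          });
        };

        this.listener.onresult = (speech) => {
          if (speech.results) {
            let term: RecognizedText;
            term = this.extractText(speech);
            // console.log(term)
            if (term.isFinal) {
              this.zone.run(() => {
                this._action$.next(new RecognizedTextAction(term.term));
              });
            }
          }
        };

        this.listener.onerror = (error) => {
          if (error.error === 'no-speech') {
            // Silence is expected; nothing to do.
          } else if (error.error === 'not-allowed') {
            this.isAllowed = false;
          } else {
            console.error(error.error);
          }
        };

        this.listener.onend = () => {
          this.zone.run(() => {
            // console.log('recognition onend');
            this.isListening = false;
            this._action$.next(new ListeningEnded());
          });
        };

        this.startListening();
      }

      private stopListening() {
        this.listener.stop();
      }

      private startListening() {
        // Nothing to start if the listener has not been created.
        if (!this.listener) {
          return;
        }
        try {
          console.log('recognition started');
          if (!this.isAllowed) {
            return;
          }
          this.listener.start();
        } catch {}
      }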

      onVoiceSelected(voice: any, speak = true) {
        this.speaker.voice = voice;
        this._activeVoice$.next(voice);
        if (speak) this._speak('Hello');
      }

      extractText(speech: any): RecognizedText {
        let term = '';
        let result = speech.results[speech.resultIndex];
        let transcript = result[0].transcript;
        let confidence = result[0].confidence;
        if (result.isFinal) {
          if (result[0].confidence < 0.3) {
            // console.log('Not recognized');
          } else {
            term = transcript.trim();
            // console.log(term);
          }
        } else {
          if (result[0].confidence > 0.6) {
            term = transcript.trim();
          }
        }
        // return term;
        return <RecognizedText>{
          term,
          confidence,
          isFinal: result.isFinal,
        };
      }

      speak(text: string) {
        this._action$.next(new SpeakAction(text));
      }

      private _speak(text: string): void {
        console.log('speaking...');
        this.speaker.text = text;
        this.window.speechSynthesis.speak(this.speaker);
      }
    }

Usage on any component/service:

    this.senseService.getType(RecognizedTextAction).subscribe((text) => console.log(text));
    this.senseService.speak('test speak');

    // or

    this.senseService
      .getType(RecognizedTextAction)
      .pipe(
        debounceTime(200),
        tap((msg) => {
          // process ...
          this.senseService.speak(`response....`);
        }),
        takeUntil(this.destroy$)
      )
      .subscribe();

Demo can be found here.
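
The getType helper in the service routes actions by class using instanceof. As a rough illustration of that idea without RxJS, here is a minimal sketch. ActionBus, its on and dispatch methods, and the heard array are hypothetical names invented for this example and are not part of the service; the action classes are trimmed copies of the ones above.

```typescript
// Trimmed copies of the article's action types.
interface Action {
  payload?: any;
}
class SpeakingStarted implements Action {}
class RecognizedTextAction implements Action {
  constructor(public payload: string) {}
}

// Constructor type for any action class.
type ActionClass<T extends Action> = new (...args: any[]) => T;

// Hypothetical stand-in for the RxJS Subject used by the service.
class ActionBus {
  private handlers: Array<(action: Action) => void> = [];

  // Register a handler that only receives payloads of actions built from `type`.
  on<T extends Action>(type: ActionClass<T>, handler: (payload: T['payload']) => void): void {
    this.handlers.push((action) => {
      if (action instanceof type) {
        handler(action.payload);
      }
    });
  }

  // Deliver an action to every registered handler; each one filters by class.
  dispatch(action: Action): void {
    this.handlers.forEach((h) => h(action));
  }
}

const bus = new ActionBus();
const heard: string[] = [];
bus.on(RecognizedTextAction, (text) => heard.push(text));
bus.dispatch(new SpeakingStarted()); // ignored by the handler above
bus.dispatch(new RecognizedTextAction('hello')); // delivered
```

In the real service the same filtering happens inside the filter and map operators, so subscribers also get unsubscription through takeUntil, which this sketch leaves out.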

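The confidence rule inside extractText can be easier to reason about as a pure function: a final result is kept unless its confidence is below 0.3, while an interim result is kept only when its confidence is above 0.6. The sketch below assumes the same two thresholds; pickTerm and the two constant names are hypothetical and exist only for this illustration.

```typescript
// Thresholds taken from extractText; the constant names are illustrative.
const FINAL_MIN_CONFIDENCE = 0.3;
const INTERIM_MIN_CONFIDENCE = 0.6;

// Returns the trimmed transcript when it passes the confidence rule, otherwise ''.
function pickTerm(transcript: string, confidence: number, isFinal: boolean): string {
  if (isFinal) {
    // Final results are rejected only when the recognizer is very unsure.
    return confidence < FINAL_MIN_CONFIDENCE ? '' : transcript.trim();
  }
  // Interim results must clear a higher bar before they are reported.
  return confidence > INTERIM_MIN_CONFIDENCE ? transcript.trim() : '';
}

const acceptedFinal = pickTerm(' turn on the light ', 0.9, true);
const rejectedFinal = pickTerm('mumble', 0.2, true);
const rejectedInterim = pickTerm('turn on', 0.5, false);
const acceptedInterim = pickTerm('turn on', 0.7, false);
```

Keeping the rule separate from the browser event object makes it possible to check the thresholds without a microphone or a Chrome instance.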

Speech Recognition And Speech Synthesis on Angular
