
WebNinjaDeveloper.com


Javascript Web Speech Recognition API Project to Build Speech to Text App in Browser

Posted on November 5, 2022

 

 

Welcome, folks. In this blog post we will build a speech-to-text app using the Web Speech Recognition API in JavaScript. The full source code of the application is shown below.

 

 

 

Get Started

 

 

To get started, create an index.html file and copy and paste the following code.

 

 

index.html

 

 

For this we need to include the Bootstrap 5 CDN link, as shown below.

 

 

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <link
      href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta1/dist/css/bootstrap.min.css"
      rel="stylesheet"
      integrity="sha384-giJF6kkoqNQ00vy+HMDP7azOuL0xtbfIcaT9wjKHr8RbDVddVHyTfAAsrekwKmP1"
      crossorigin="anonymous"
    />
    <title>Speech To Text</title>
  </head>
  <body>
    <!-- The page markup goes here, before the script -->
    <script src="./speechRecognition.js"></script>
  </body>
</html>

 

 

 

With the Bootstrap stylesheet in place, we now need to add the HTML markup for the page body, as shown below.

 

 

<body class="container pt-5 bg-dark">
  
    <h2 class="mt-4 text-light">Transcript</h2>
    <div class="p-3" style="border: 1px solid gray; height: 300px; border-radius: 8px;">
      <span id="final" class="text-light"></span>
      <span id="interim" class="text-secondary"></span>
    </div>
    <div class="mt-4">
      <button class="btn btn-success" id="start">Start</button>
      <button class="btn btn-danger" id="stop">Stop</button>
      <p id="status" class="lead mt-3 text-light" style="display: none">Listening ...</p>
    </div>
  </body>

 

 

As you can see, we use Bootstrap classes to style the application. The page has a heading, then a bordered box that displays the transcript, and then buttons to start and stop listening on the user's microphone. Below the buttons, a status line shows when the microphone is listening.

 

 

 

 

 

 

 

Now we need to create the speechRecognition.js file, which will contain the JavaScript code for the speech-to-text app, as shown below.

 

 

JavaScript
if ("webkitSpeechRecognition" in window) {
    // Initialize webkitSpeechRecognition
    let speechRecognition = new webkitSpeechRecognition();
  
    // String for the Final Transcript
    let final_transcript = "";
  
    // Set the properties for the Speech Recognition object
    speechRecognition.continuous = true;
    speechRecognition.interimResults = true;
} else {
    console.log("Speech Recognition Not Available");
}

 

 

First, we check whether the webkitSpeechRecognition API is available in the browser. If it is, we create a new webkitSpeechRecognition object and declare the final_transcript variable, which stores all the text spoken by the user.
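Some browsers may expose the constructor under the unprefixed name SpeechRecognition instead of webkitSpeechRecognition. Here is a minimal sketch of a check that tries both names. The helper getRecognitionConstructor is a hypothetical name introduced for illustration and is not part of the tutorial's code.

```javascript
// Hypothetical helper (not part of the tutorial's code): returns the speech
// recognition constructor found on the given global object, or null.
function getRecognitionConstructor(globalObject) {
  return (
    globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null
  );
}

// In a browser, pass `window`. The guard keeps the sketch harmless elsewhere.
if (typeof window !== "undefined") {
  const Recognition = getRecognitionConstructor(window);
  if (Recognition) {
    const speechRecognition = new Recognition();
    speechRecognition.continuous = true;
    speechRecognition.interimResults = true;
  } else {
    console.log("Speech Recognition Not Available");
  }
}
```

Passing the global object in as a parameter is only a design choice to make the check easy to try with a stand-in object.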

 

Then we set two properties on the speechRecognition object: continuous and interimResults. With continuous set to true, recognition keeps listening to the microphone instead of stopping after the first result. With interimResults set to true, we also receive in-progress results, so we can display the text on the page while the user is still speaking.
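The recognition object also has a lang property that selects the recognition language. The tutorial does not set it, so the sketch below is an optional extension. The helper name applyRecognitionSettings and the default value "en-US" are assumptions made for illustration.

```javascript
// Hypothetical helper: applies the tutorial's two settings plus a language.
// "en-US" is an assumed default; use the language tag your users speak.
function applyRecognitionSettings(recognition, language = "en-US") {
  recognition.continuous = true; // keep listening after the first result
  recognition.interimResults = true; // also deliver in-progress results
  recognition.lang = language; // language tag such as "en-US"
  return recognition;
}

// Usage in the browser, inside the feature check:
//   applyRecognitionSettings(speechRecognition, "en-US");
```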

 

 

Adding the SpeechRecognition Events

 

 

Now we will add handlers for the events that the Web Speech Recognition API provides. They define what should happen when recognition starts, stops, or hits an error, and also when it receives a result.

 

 

JavaScript
speechRecognition.onstart = () => {
  // Show the Status Element
  document.querySelector("#status").style.display = "block";
};
speechRecognition.onerror = () => {
  // Hide the Status Element
  document.querySelector("#status").style.display = "none";
};
speechRecognition.onend = () => {
  // Hide the Status Element
  document.querySelector("#status").style.display = "none";
};

 

 

As you can see, we have defined all the event handlers here. In the start event we show the status element, i.e. listening for user speech. If an error occurs, or when recognition ends or is stopped, we hide the status element again.
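The error handler above only hides the status element. The error event also carries an error property with a short code such as "not-allowed" or "no-speech", so a handler could report what went wrong. Here is a minimal sketch. The helper describeRecognitionError and the message wording are assumptions, and only a few of the possible codes are covered.

```javascript
// Messages for a few of the error codes the API can report.
// The wording is illustrative and not taken from the tutorial.
const ERROR_MESSAGES = new Map([
  ["no-speech", "No speech was detected."],
  ["audio-capture", "No microphone was found."],
  ["not-allowed", "Microphone permission was denied."],
  ["network", "A network error interrupted recognition."],
]);

// Hypothetical helper: turns an error code into a readable message.
function describeRecognitionError(errorCode) {
  return ERROR_MESSAGES.get(errorCode) || `Speech recognition error: ${errorCode}`;
}

// Usage in the browser:
//   speechRecognition.onerror = (event) => {
//     console.log(describeRecognitionError(event.error));
//     document.querySelector("#status").style.display = "none";
//   };
```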

 

 

Now, guys, we will attach click handlers to the start and stop buttons in the DOM. This is where we start and stop the Web Speech API.

 

 

JavaScript
// Set the onClick property of the start button
document.querySelector("#start").onclick = () => {
  // Start the Speech Recognition
  speechRecognition.start();
};
// Set the onClick property of the stop button
document.querySelector("#stop").onclick = () => {
  // Stop the Speech Recognition
  speechRecognition.stop();
};

 

 

As you can see, we use the start() and stop() methods to start and stop speech recognition. It’s really easy to do this.
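One thing to watch for: according to the Web Speech API specification, calling start() on a recognition object that is already running throws an InvalidStateError, which can happen if the user clicks Start twice. Here is a minimal sketch of one way to guard against that with a flag. The helper createRecognitionControls and its method names are hypothetical and are not part of the tutorial's code.

```javascript
// Hypothetical helper: wraps start() and stop() with a "listening" flag so
// repeated clicks do not call start() on a session that is already running.
function createRecognitionControls(recognition) {
  let listening = false;
  return {
    start() {
      if (listening) return false; // already running, ignore the click
      recognition.start();
      listening = true;
      return true;
    },
    stop() {
      if (!listening) return false; // nothing to stop
      recognition.stop();
      listening = false;
      return true;
    },
    // Recognition can also end by itself, so call this from onend and onerror.
    markEnded() {
      listening = false;
    },
    isListening() {
      return listening;
    },
  };
}

// Usage in the browser:
//   const controls = createRecognitionControls(speechRecognition);
//   document.querySelector("#start").onclick = () => controls.start();
//   document.querySelector("#stop").onclick = () => controls.stop();
```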

 

 

Now we need to define what happens when we get some kind of speech or data from the user.

 

 

JavaScript
speechRecognition.onresult = (event) => {
  // Create the interim transcript string locally because we don't want it to persist like final transcript
  let interim_transcript = "";

  // Loop through the results from the speech recognition object.
  for (let i = event.resultIndex; i < event.results.length; ++i) {
    // If the result item is Final, add it to Final Transcript, Else add it to Interim transcript
    if (event.results[i].isFinal) {
      final_transcript += event.results[i][0].transcript;
    } else {
      interim_transcript += event.results[i][0].transcript;
    }
  }

  // Set the Final transcript and Interim transcript.
  document.querySelector("#final").innerHTML = final_transcript;
  document.querySelector("#interim").innerHTML = interim_transcript;
};

 

 

As you can see, the Web Speech API fires the result event, and in its handler we add the user's spoken words to the page. We use a for loop to go through the results, starting at event.resultIndex, and append each one to either the final transcript or the interim transcript, which are then written to the #final and #interim elements. The recognized text is available in the transcript property of each result's first alternative.
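To see the loop's logic on its own, here is a minimal sketch that pulls it into a plain function and runs it on stand-in data. The function name collectTranscripts and the fake results array are assumptions for illustration. In the browser, event.results is a SpeechRecognitionResultList, which is indexed the same way.

```javascript
// Hypothetical helper: splits the results from resultIndex onward into
// finalized text and in-progress (interim) text.
function collectTranscripts(results, resultIndex) {
  let finalText = "";
  let interimText = "";
  for (let i = resultIndex; i < results.length; ++i) {
    // Index 0 is the first alternative for this result.
    const transcript = results[i][0].transcript;
    if (results[i].isFinal) {
      finalText += transcript;
    } else {
      interimText += transcript;
    }
  }
  return { finalText, interimText };
}

// Stand-in data shaped like event.results: array-like items with isFinal.
const sampleResults = [
  Object.assign([{ transcript: "hello " }], { isFinal: true }),
  Object.assign([{ transcript: "wor" }], { isFinal: false }),
];

console.log(collectTranscripts(sampleResults, 0));
// prints { finalText: 'hello ', interimText: 'wor' }
```

In the tutorial's handler, the final text is appended to final_transcript so that it persists across events, while the interim text is rebuilt from scratch on every event.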

 

 

Full Source Code

 

 

To wrap up the blog post, here is the full source code of the speechRecognition.js file.

 

 

JavaScript
if ("webkitSpeechRecognition" in window) {
    // Initialize webkitSpeechRecognition
    let speechRecognition = new webkitSpeechRecognition();
  
    // String for the Final Transcript
    let final_transcript = "";
  
    // Set the properties for the Speech Recognition object
    speechRecognition.continuous = true;
    speechRecognition.interimResults = true;
  
    // Callback Function for the onStart Event
    speechRecognition.onstart = () => {
      // Show the Status Element
      document.querySelector("#status").style.display = "block";
    };
    speechRecognition.onerror = () => {
      // Hide the Status Element
      document.querySelector("#status").style.display = "none";
    };
    speechRecognition.onend = () => {
      // Hide the Status Element
      document.querySelector("#status").style.display = "none";
    };
  
    speechRecognition.onresult = (event) => {
      // Create the interim transcript string locally because we don't want it to persist like final transcript
      let interim_transcript = "";
  
      // Loop through the results from the speech recognition object.
      for (let i = event.resultIndex; i < event.results.length; ++i) {
        // If the result item is Final, add it to Final Transcript, Else add it to Interim transcript
        if (event.results[i].isFinal) {
          final_transcript += event.results[i][0].transcript;
        } else {
          interim_transcript += event.results[i][0].transcript;
        }
      }
  
      // Set the Final transcript and Interim transcript.
      document.querySelector("#final").innerHTML = final_transcript;
      document.querySelector("#interim").innerHTML = interim_transcript;
    };
  
    // Set the onClick property of the start button
    document.querySelector("#start").onclick = () => {
      // Start the Speech Recognition
      speechRecognition.start();
    };
    // Set the onClick property of the stop button
    document.querySelector("#stop").onclick = () => {
      // Stop the Speech Recognition
      speechRecognition.stop();
    };
} else {
    console.log("Speech Recognition Not Available");
}
