Speech Trainer

The "Speech Trainer" web application is designed to teach the correct pronunciation of Russian sounds.


Figure 1 shows the software architecture of the application, which is based on the "client - web server - application server" concept.

Fig. 1. System architecture

The system consists of the following components:

  • The client is an application built on the Adobe Flash Platform and embedded in a web page.
  • The software framework of the client application is Adobe Flex 4.
  • The web server is nginx.
  • The application server is the Red5 media server.
  • The speech recognizer is Nuance Recognizer 9.

While the application is running, the server side also uses ffmpeg to encode and decode media data.
The Adobe Flex 4 framework, the nginx and Red5 servers, and ffmpeg are open source and, accordingly, free of charge.
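
The text does not say how ffmpeg is invoked on the server. As a rough sketch only, the conversion of a recorded clip into plain PCM audio for a recognizer might look like the Python helper below. The file names, the 8000 Hz sample rate, the mono setting and the function names are all assumptions made for illustration and are not taken from the application.

```python
import subprocess
from typing import List


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 8000) -> List[str]:
    """Build an ffmpeg argument list that extracts mono 16-bit PCM audio.

    src, dst and sample_rate are hypothetical values chosen for this sketch.
    """
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", src,                 # input recording
        "-vn",                     # ignore any video stream
        "-acodec", "pcm_s16le",    # 16-bit little-endian PCM
        "-ac", "1",                # one channel (mono)
        "-ar", str(sample_rate),   # output sample rate in Hz (assumed)
        dst,
    ]


def convert(src: str, dst: str) -> None:
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Keeping the argument list in a separate function makes it possible to inspect the command without having ffmpeg installed.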


Figure 2 shows the interface of the main window.

Fig. 2. Application interface

In the upper left corner (1, Fig. 2) there is a link to a page with additional background information on the application and the sounds of the Russian language. On the selection panel (2) the user can select a sound of interest, after which detailed information on that sound is displayed (3). The sound information includes:

  • a schematic animation of the vocal tract while the sound is being pronounced (a);
  • video showing the position of the lips (b, c);
  • an image of the PCM waveform of the reference audio (e);
  • detailed textual information about the features of articulation (e).
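
How the application draws the image of the reference audio is not described. One common way to turn PCM samples into such an image, shown here purely as an assumed sketch, is to reduce the samples to one minimum/maximum pair per pixel column and draw a vertical line between them. The function name and the choice of integer samples are hypothetical.

```python
from typing import List, Sequence, Tuple


def waveform_envelope(samples: Sequence[int], columns: int) -> List[Tuple[int, int]]:
    """Reduce PCM samples to one (min, max) pair per image column."""
    if columns <= 0:
        raise ValueError("columns must be positive")
    n = len(samples)
    envelope = []
    for col in range(columns):
        # Integer arithmetic splits the samples into near-equal chunks.
        start = col * n // columns
        end = (col + 1) * n // columns
        chunk = samples[start:end]
        if chunk:
            envelope.append((min(chunk), max(chunk)))
        else:
            # More columns than samples: draw this column as silence.
            envelope.append((0, 0))
    return envelope
```

For example, `waveform_envelope([0, 5, -3, 2, 7, -1], 3)` returns `[(0, 5), (-3, 2), (-1, 7)]`, one pair for each of the three columns.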

To start the animation and video clips, click the "Demonstration of sound" button (e). If a webcam is connected, the camera image is displayed on the screen (4), so that users can monitor the movement of their lips. To evaluate the pronunciation, click the "Check the pronunciation" button (5). A dialog box appears (Fig. 3), signaling the start of the recording.

Fig. 3. Start of recording

After the sound has been recorded and analyzed, the results are displayed (Fig. 4). They consist of:

  • a score on a 100-point scale showing how closely the spoken audio matches the reference (1);
  • one of five images corresponding to the score on a five-point scale (2);
  • images of the PCM waveforms of the spoken audio (red) and the reference audio (blue) (3).

Fig. 4. Results
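
The thresholds that link the 100-point score to the five-point scale are not given. A minimal sketch, assuming five equal bands of 20 points each (an assumption, not the application's documented rule), could be written as follows. The function name is hypothetical.

```python
def five_point_grade(score: float) -> int:
    """Map a 0-100 score to a grade from 1 to 5 using equal 20-point bands.

    The equal-width bands are an assumption made for this sketch.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # 0-19.99 -> 1, 20-39.99 -> 2, ..., 80-100 -> 5 (100 is capped at 5).
    return min(5, int(score // 20) + 1)
```

Under this assumption a score of 72 falls in the fourth band, so `five_point_grade(72)` returns 4 and the fourth of the five images would be shown.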

Clicking the "More information about the application" link opens the page help.html, located on the web server, in a browser tab (see Fig. 5, 6). The page contains a Flash application that displays the help information as a 10-page interactive booklet.

Fig. 5. Help page interface (cover)

Fig. 6. Help page interface (one of the spreads)

The "Speech Trainer" web application has been created and launched on a computer lab network. The application is ready for deployment on the Internet. This requires a cloud hosting platform on which the Red Hat Linux operating system can be installed. This operating system was chosen because the Nuance Recognizer is available only for this Linux distribution. In addition, providing public access to the application requires purchasing a commercial license for the Nuance Recognizer speech recognizer.