Researchers to Develop Accent Detection AI

The team of scientists was from Cisco, the Moscow Institute of Physics and Technology and the Higher School of Economics
13 July 2018

A team of scientists has used machine learning to develop an improved model for speech recognition, VentureBeat reports.

Previously, scientists manually identified phonological similarities between units of language in General American English and Carnegie Mellon University's pronunciation dictionary. To build the improved model, the team took a non-standard approach and let the model form the rules automatically. The resulting unique list was then compared with a set of examples from George Mason University's speech accent archive.

More non-native accented speech data is necessary to enhance the performance of … existing [speech recognition] models. However, its synthesis is still an open problem.


Using these examples, the team built a phonetic data set and used it to train a neural network of a kind often used for speech recognition. Once the number of examples passed 800,000, the accuracy of word recognition was 59%.

The researchers called the study preliminary because of the smaller number of sounds in the Carnegie Mellon University dictionary. Despite phonetic matches in 13 of 20 dictionary comparisons, the scientists managed to expand the data set from 103,000 phonetic transcriptions with one accent to 1 million samples with several accents.
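To make the idea of automatically formed rules more concrete, here is a minimal Python sketch. It is not the researchers' method: it assumes the rules are simple one-to-one phoneme substitutions, that native and accented transcriptions are already aligned and of equal length, and it uses made-up toy data in ARPAbet-style symbols. The function names `learn_rules` and `accent_variants` and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def learn_rules(pairs, min_count=2):
    """Count phoneme substitutions seen in aligned (native, accented) pairs.

    Each pair holds two equal-length phoneme lists. Real data would need
    an alignment step first, which this sketch skips (assumption).
    """
    counts = Counter()
    for native, accented in pairs:
        for src, dst in zip(native, accented):
            if src != dst:
                counts[(src, dst)] += 1
    # Keep only substitutions seen often enough to look systematic.
    return {rule for rule, n in counts.items() if n >= min_count}


def accent_variants(pronunciation, rules):
    """Return one accented variant per rule that applies to the pronunciation."""
    variants = []
    for src, dst in sorted(rules):
        if src in pronunciation:
            variants.append([dst if p == src else p for p in pronunciation])
    return variants


# Made-up toy data, for illustration only.
toy_pairs = [
    (["TH", "IH", "NG", "K"], ["S", "IH", "NG", "K"]),  # "think"
    (["TH", "AE", "NG", "K"], ["S", "AE", "NG", "K"]),  # "thank"
    (["W", "EH", "L"], ["V", "EH", "L"]),               # "well"
    (["W", "IH", "N"], ["V", "IH", "N"]),               # "win"
    (["B", "AE", "D"], ["B", "EH", "D"]),               # seen once, so dropped
]

rules = learn_rules(toy_pairs)
variants = accent_variants(["W", "IH", "TH"], rules)  # "with"
print(sorted(rules))
print(variants)
```

Applying every learned rule to every dictionary entry is one plausible way a single-accent set of transcriptions could grow into a much larger multi-accent one, although the article does not say how the expansion was actually done.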

Neural Network Can Now Animate People in Photos

The algorithm can even make people in photos "step out" beyond the picture's borders
12 December 2018

Researchers at the University of Washington, together with developers at Facebook, have created an algorithm that “revives” people in photographs. From a single snapshot, it generates a three-dimensional moving model of the figure that can sit, jump, run, and even "step" beyond the edges of the image. The algorithm also works on drawings and anime characters.

To build the technology, the researchers drew on existing work by colleagues:

  • Mask R-CNN recognizes a human figure in the image and separates it from the background.
  • Another algorithm overlays a simplified skeleton on the figure, defining how it will move.
  • A third algorithm fills in the background area previously hidden by the figure.

The researchers' own algorithm then builds a three-dimensional model from the annotated two-dimensional figure and generates a texture layer from the original image.

The developers added a user interface that lets users change the shape of the figure, either to edit the photo itself or to choose where the animation will begin. In addition, a drawing or photo can be “revived” in augmented reality, with the three-dimensional figure viewed through VR or AR glasses.