MelNet Algorithm Simulates a Person's Voice

It analyzes spectrograms of audio tracks from ordinary TED Talks, picks up the speaker's speech characteristics, and reproduces short phrases
11 June 2019

The Facebook AI Research team has developed MelNet, an algorithm that synthesizes speech with the characteristics of a particular person. For example, it has learned to imitate the voice of Bill Gates.

MelNet analyzes spectrograms of audio tracks from ordinary TED Talks, picks up the speaker's speech characteristics, and reproduces short phrases.
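For readers unfamiliar with the term, a spectrogram shows how the energy of a sound is spread across frequencies over time. The sketch below is only an illustration and not MelNet's own code: it computes a plain magnitude spectrogram with NumPy, whereas MelNet works with mel-scaled spectrograms. The function name `spectrogram`, the frame length, the hop size, and the test tone are all assumptions chosen for the example.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram, shape (n_frames, frame_len // 2 + 1).

    Illustrative helper (hypothetical name), not part of MelNet.
    """
    window = np.hanning(frame_len)                    # taper each frame
    n_frames = 1 + (len(signal) - frame_len) // hop   # full frames only
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One FFT per frame; keep magnitudes of the non-negative frequencies.
    return np.abs(np.fft.rfft(frames, axis=1))

# Assumed test input: one second of a 1000 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())   # strongest frequency bin
peak_hz = peak_bin * sample_rate / 256       # bin index -> frequency in Hz
print(spec.shape, peak_hz)
```

Each row of `spec` describes one short slice of the recording, so a model trained on such data sees how a speaker's frequencies change from moment to moment.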

The algorithm's main limitation is the length of the phrases. It reproduces short phrases very close to the original. However, a person's intonation changes with the topic, mood, and pitch. The algorithm cannot yet imitate this, so long sentences sound artificial.

MIT Technology Review notes that even in this form the algorithm could significantly affect services such as voice bots, where all communication comes down to an exchange of short remarks.

A similar approach, the analysis of speech spectrograms, was used by researchers at Google AI when working on the Translatotron algorithm. This AI can translate phrases from one language to another while preserving the characteristics of the speaker's voice.

AI Recognizes Text Typed on an Invisible Keyboard

The developers say they wanted to increase typing speed on on-screen keyboards
06 August 2019

Korean developers have created an algorithm that recognizes text typed on an imaginary keyboard on a touchscreen. Such a “keyboard” is not tied to a specific area of the screen, and the “keys” are not confined to clearly defined squares.

As a result, a person can touch-type in a QWERTY layout without thinking about where the keyboard should be or whether a finger actually hit the key.

Imaginary Buttons Press Clouds

According to the developers, their goal was to increase typing speed on on-screen keyboards. Unlike a hardware keyboard, an on-screen keyboard gives no feedback confirming a key press, so there is a risk of missing the intended button. Because of this, people keep their eyes fixed on the screen and end up typing more slowly.

The new algorithm removes this worry: you can enter text from memory, and the keyboard will guess with 96% accuracy what the person meant to type. Tests showed that the average typing speed on the imaginary keyboard is about 12% lower than on a hardware keyboard: 45 words per minute versus 51.