MelNet Algorithm to Simulate Person's Voice

It analyzes spectrograms of the audio tracks of ordinary TED Talks, picks up the speaker's speech characteristics, and reproduces short phrases
11 June 2019

The Facebook AI Research team has developed MelNet, an algorithm that synthesizes speech with the characteristics of a particular person. For example, it has learned to imitate the voice of Bill Gates.

MelNet analyzes spectrograms of the audio tracks of ordinary TED Talks, picks up the speaker's speech characteristics, and reproduces short phrases.

The algorithm's capabilities are limited only by the length of the phrases. It reproduces short phrases very close to the original. However, a person's intonation changes with the topic, mood, and pitch of their speech. The algorithm cannot yet imitate this, so long sentences sound artificial.

MIT Technology Review notes that even such an algorithm can greatly affect services like voice bots, where all communication comes down to an exchange of short remarks.

A similar approach, the analysis of speech spectrograms, was used by Google AI researchers in their work on the Translatotron algorithm. This AI can translate phrases from one language to another while preserving the characteristics of the speaker's voice.

Neural Network to Create Educational Videos

The Automontazh project consists of two subsystems and is designed to create video lectures automatically, with no human editor needed
17 September 2019

Russian engineers from SPbPU have taught a neural network to edit video lectures. It will help create educational videos quickly.

The "Automontazh" (auto editing) project has been in development since 2017. It consists of two subsystems: "Autoslide" and "Auto Operator". The first automatically combines the video with the lecturer's presentation: it receives two files and places the slides at the right points in the video. The second receives wide-shot footage from several cameras and produces the final video: it composes the shots, crops the frame, and switches camera angles.

The system has not yet entered the market; it is still being tested and refined. However, the educational project "Lectorium", which owns "Automontazh", already uses the program to process recorded lectures.