AI to Associate Objects and Spoken Words

MIT scientists believe this approach will simplify automatic translation between languages
21 September 2018

Scientists from the Massachusetts Institute of Technology’s Computer Science and Artificial Intelligence Lab (CSAIL) have published a paper on a new machine learning model that can match objects in an image with their spoken descriptions. The researchers built on their 2016 work and improved it by teaching the model to associate specific speech spectrograms with specific blocks of pixels. The engineers hope the model will eventually be useful for simultaneous translation.

The MIT algorithm is based on two convolutional neural networks. The first divides the image into a grid of cells. The second works on a spectrogram of the speech, a visual representation of its frequency spectrum, and breaks it into segments about one word long. The system then compares each cell of pixels with each segment of the spectrogram and computes a similarity score. Based on this score, the network determines which "object-word" pairs are correct and which are not.
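The matching step can be illustrated with a minimal sketch. It is not the researchers' implementation. It assumes that each grid cell and each spectrogram segment has already been turned into a small feature vector, and that similarity is a plain dot product. The function names and the toy vectors are hypothetical.

```python
# Hypothetical sketch: the names, vectors and scoring rule are assumptions,
# not taken from the MIT work.
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as the similarity score (an assumption)."""
    return sum(x * y for x, y in zip(a, b))


def similarity_matrix(cells: List[Vector], segments: List[Vector]) -> List[List[float]]:
    """Score every image cell against every spectrogram segment."""
    return [[dot(c, s) for s in segments] for c in cells]


def best_pair(cells: List[Vector], segments: List[Vector]) -> Tuple[int, int, float]:
    """Return (cell index, segment index, score) of the highest-scoring pair."""
    scores = similarity_matrix(cells, segments)
    return max(
        ((i, j, scores[i][j]) for i in range(len(cells)) for j in range(len(segments))),
        key=lambda triple: triple[2],
    )


# Toy feature vectors standing in for the outputs of the two networks.
image_cells = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
audio_segments = [[0.9, 0.1], [0.1, 0.8]]

cell, segment, score = best_pair(image_cells, audio_segments)
print(f"cell {cell} best matches segment {segment} (score {score:.2f})")
```

In a trained system the vectors would come from the two networks, and the scores would be used during training to pull correct pairs together and push incorrect ones apart. Here the sketch only shows how a grid of cells and a sequence of word-length segments yield a table of pairwise scores.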

“We wanted to do speech recognition in a way that’s more natural, leveraging additional signals and information that humans have the benefit of using, but that machine learning algorithms don’t typically have access to.”

David Harwath, researcher, CSAIL

After training on a database of 400,000 images, the system was able to match several hundred words with objects. With each iteration, it narrowed the matching parameter to pin down the specific words associated with specific objects.

The MIT researchers believe this approach will simplify automatic translation between languages, since it does not require text descriptions of objects.

Image and speech recognition systems already cope with their tasks, but they need a lot of resources to do so. In April 2018, Google announced a competition for developers in the field of deep networks and computer vision on smartphones. It is intended to find ways to optimize real-time recognition systems.

OpenAI Creates Algorithm That Generates Fake News

Given one or two phrases that set the theme, it is able to “write” a fairly plausible story
18 February 2019

The GPT-2 algorithm, created by OpenAI for working with language and text, turned out to be adept at creating fake news. Given one or two phrases that set the theme, it can “compose” a fairly plausible story. For example:

  • an article about scientists who have found a herd of unicorns in the Andes;
  • a news story about pop star Miley Cyrus being caught shoplifting;
  • a piece of fiction about Legolas and Gimli attacking the orcs;
  • an essay on how waste recycling harms the economy, nature, and human health.

The developers did not release the model in full, fearing abuse by unscrupulous users. For fellow researchers, they posted a simplified version of the algorithm on GitHub and linked to a preprint of the paper. The overall results are published on the OpenAI blog.

GPT-2 is a general-purpose algorithm. The developers taught it to answer questions, “understand” the logic of a text or sentence, and complete phrases. On these tasks, the algorithm performed worse than special-purpose models. The researchers suggest that the results can be improved by expanding the training datasets and using computing resources more effectively.