AttnGAN Neural Network to Draw Strange Pictures

The neural network is good at drawing birds from a text description, but bad at everything else
20 August 2018

Janelle Shane, author of the AI Weirdness blog, has discovered a generative adversarial network called AttnGAN, which is trained to draw images from a text description. The problem is that it needs very precisely defined picture parameters and sometimes cannot determine the boundaries of objects.

Janelle notes that when the neural network was trained on a narrow dataset consisting only of birds, it produced nice images:

AttnGAN

However, when the creators trained it on a dataset that included pictures of everything from sheep to shopping centers, it could no longer produce comparably meaningful images. The author of AI Weirdness believes the problem lies in the training data being too broad, so that AttnGAN could not pick out the appropriate examples:

AttnGAN

In addition, it somehow has trouble determining the correct number of holes in a human face. The AttnGAN developers added photos of celebrities to the dataset in order to create photorealistic portraits, but the neural network could not manage that:

AttnGAN

Additionally, the neural network is really bad at depicting animals:

AttnGAN

Janelle Shane calls the AttnGAN project "Visual Chatbot in reverse." Visual Chatbot analyzes an image that the user sends and describes it, often implausibly.

AI to be Used to Create 3D Motion Sculptures

The system developed by MIT and Berkeley scientists is called MoSculp and is based on artificial intelligence
21 September 2018

MoSculp, the joint work of scientists at MIT and the University of California at Berkeley, is built on a neural network. The system analyzes a video recording of a moving person and generates what its creators call an "interactive visualization of shape and time." According to the project's lead researcher, Xiuming Zhang, the software will be useful to athletes for detailed analysis of their movements.

In the first stage, the system scans the video frame by frame and determines the positions of key points on the subject's body, such as the elbows, knees, and ankles. For this, the scientists used the OpenPose library, developed at Carnegie Mellon University. Based on this data, the neural network builds a 3D model of the person in each frame and calculates the trajectory of the motion, producing a "motion sculpture".
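The step from per-frame key points to a motion trajectory can be illustrated with a small sketch. It is not MoSculp's code: the frame data, joint names, and both helper functions are hypothetical, and it uses 2D pixel coordinates for brevity, whereas the system described here works with a 3D body model.

```python
from math import dist

# Hypothetical per-frame key points in pixel coordinates, in the spirit of
# what a pose estimator returns. The joint names and values are made up.
frames = [
    {"elbow": (100.0, 200.0), "knee": (110.0, 300.0)},
    {"elbow": (103.0, 204.0), "knee": (110.0, 300.0)},
    {"elbow": (106.0, 208.0), "knee": (116.0, 308.0)},
]


def joint_trajectories(frames):
    """Group key points by joint, preserving frame order."""
    trajectories = {}
    for keypoints in frames:
        for joint, position in keypoints.items():
            trajectories.setdefault(joint, []).append(position)
    return trajectories


def path_length(points):
    """Total distance travelled along a sequence of points."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


trajectories = joint_trajectories(frames)
for joint, points in trajectories.items():
    print(joint, len(points), "frames, path length", round(path_length(points), 1))
```

Each joint ends up with an ordered list of positions, one per frame. Sweeping a body model along such paths is, loosely speaking, what gives a motion sculpture its shape.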

At this stage, according to the developers, the image lacks textures and details, so the application inserts the "sculpture" back into the original video. To avoid incorrect overlaps, MoSculp computes a depth map for both the original subject and the 3D model.
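How two depth maps can settle which layer is visible at each pixel is shown in the minimal sketch below. It assumes a simple per-pixel depth test in which a smaller depth value means closer to the camera. The `composite` function and the toy one-row "images" are hypothetical, and the article does not say that MoSculp does its compositing exactly this way.

```python
def composite(video_rgb, video_depth, sculpt_rgb, sculpt_depth):
    """Per pixel, keep whichever layer is closer to the camera.

    All arguments are equally sized 2D lists (rows of pixels).
    A sculpture depth of None means the sculpture does not cover that pixel.
    """
    result = []
    for v_row, vd_row, s_row, sd_row in zip(
        video_rgb, video_depth, sculpt_rgb, sculpt_depth
    ):
        out_row = []
        for v, vd, s, sd in zip(v_row, vd_row, s_row, sd_row):
            if sd is not None and sd < vd:
                out_row.append(s)  # sculpture is in front of the video pixel
            else:
                out_row.append(v)  # video pixel is in front, or no sculpture
        result.append(out_row)
    return result


# Toy 1x3 "image": letters stand in for colors.
video_rgb = [["V", "V", "V"]]
video_depth = [[2.0, 2.0, 2.0]]
sculpt_rgb = [["S", "S", None]]
sculpt_depth = [[1.0, 3.0, None]]

frame = composite(video_rgb, video_depth, sculpt_rgb, sculpt_depth)
print(frame)
```

In the first pixel the sculpture is nearer and wins. In the second it lies behind the video content and is hidden. In the third there is no sculpture at all, so the video pixel is kept.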

MoSculp 3D Model

The operator can adjust the image during processing and select the "sculpture" material, color, and lighting, as well as which parts of the body are tracked. The result can also be printed on a 3D printer.

The team of researchers announced plans to develop MoSculp further. The developers want the system to handle more than one subject in a video, which is currently impossible. The creators of the technology believe that the program will be used to study group dynamics, social disorders, and interpersonal interactions.

The idea of building a 3D model from human movements has been used before. For example, in August 2018, scientists at the University of California at Berkeley demonstrated an algorithm that transfers the movements of one person to another.