Saturday, 20 January 2018

Microsoft creates an AI program that draws detailed images from descriptions

Microsoft's research labs are developing a new artificial intelligence technology that generates images from caption-like text descriptions. The technology is simply called the Drawing Bot.

The Drawing Bot can generate all kinds of images, from landscapes to absurd scenes, in impressive detail and from nothing more than a description of the subject.
The images it produces contain details that are absent from the text descriptions, which the researchers take as a sign that this AI has a kind of artificial imagination. In practice, you can ask it to draw a "yellow bird, with black wings, perched on a branch" and the AI will produce an image that matches.

That is, the system does not search for existing images that match the description. The software creates the image from scratch, rendering all the described details "pixel by pixel", as one of the project's researchers explains. "These birds may not even exist in the real world - they are only the fruit of computer imagination," writes Xiaodong He.

This AI builds on two other systems: CaptionBot, which automatically generates descriptions for existing photographs, and Seeing AI, which provides additional information about images. To these programs, Microsoft has added other features that assess the quality of the generated image.

The core of the Microsoft drawing bot is a technology known as a Generative Adversarial Network, or GAN. The network consists of two machine learning models: a generator, which creates images from text descriptions, and a discriminator, which uses the text descriptions to judge the authenticity of the generated images. The generator tries to get fake images past the discriminator; the discriminator tries never to be fooled. By competing against each other, the two models improve together, and the discriminator pushes the generator towards perfection.
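The tug-of-war between the two models can be made concrete with the textbook GAN losses. The sketch below is only an illustration under stated assumptions: the article does not describe the loss functions Microsoft actually uses, and the function names `discriminator_loss` and `generator_loss` are hypothetical. It assumes the discriminator outputs a probability between 0 and 1 that an image is authentic for a given description.

```python
import math


def discriminator_loss(p_real: float, p_fake: float) -> float:
    """Standard GAN discriminator loss (binary cross-entropy).

    p_real: probability the discriminator assigns to a real image
            being authentic for its description (it wants this near 1).
    p_fake: probability it assigns to a generated image being
            authentic for the same description (it wants this near 0).
    """
    return -(math.log(p_real) + math.log(1.0 - p_fake))


def generator_loss(p_fake: float) -> float:
    """Non-saturating generator loss: the generator wants the
    discriminator to rate its generated image as authentic (near 1)."""
    return -math.log(p_fake)


if __name__ == "__main__":
    # As the discriminator is fooled more often (p_fake rises),
    # the generator's loss falls and the discriminator's loss rises.
    for p_fake in (0.1, 0.5, 0.9):
        d = discriminator_loss(p_real=0.9, p_fake=p_fake)
        g = generator_loss(p_fake)
        print(f"p_fake={p_fake:.1f}  D loss={d:.3f}  G loss={g:.3f}")
```

In training, each model is updated in turn to lower its own loss, so any gain by the discriminator gives the generator a sharper signal about what still looks fake.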

To make this possible, Microsoft trained the Drawing Bot on images paired with captions, which allowed it to learn how parts of an image correspond to each word. It learned to draw a bird when the caption says "bird", and it learned what a bird should look like. For this reason, the researchers believe that "a machine can learn".

The Microsoft drawing bot closes a circle of research on the relationship between computer vision and natural language processing, a field Microsoft has been developing over the last half decade. The researchers started with CaptionBot, a technology that automatically writes a caption for a photo, then moved on to a technology that answers questions people ask about an image, such as the location or attributes of objects, and have now arrived at the Drawing Bot, which draws from a description. These technologies would no doubt be especially useful for blind people.

