Scientists from the University of Twente in the Netherlands have developed a new artificial intelligence method that builds graphs of scenes from images, which can in turn support the generation of realistic and coherent images. Their results were recently published in the journal IEEE Transactions on Pattern Analysis and Machine Intelligence.
The new method, developed by Michael Ying Yang, a researcher at the University of Twente's ITC Faculty, addresses a limitation of generative models in artificial intelligence (AI): their difficulty in creating images of full scenes. While generative models can create images of single objects from text prompts, creating full scenes remains challenging.
According to Yang, improving a computer's ability to detect and understand visual relations is necessary for generating images, as well as for enhancing the perception of autonomous vehicles and robots. However, current methods for building a semantic understanding of an image are slow. They use a two-stage approach that first detects all objects in the scene, then goes through the possible connections between them and labels each with the correct relation. The cost of this approach grows steeply with the number of objects, because every possible pairing has to be considered.
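As a rough illustration of why exhaustive pairing scales poorly, the sketch below counts the candidates a second stage would have to label if it scored every ordered subject-object pair. This is an assumption made for illustration, not the researchers' code; the function name `candidate_pairs` and the object labels are hypothetical.

```python
from itertools import permutations


def candidate_pairs(objects):
    """Return every ordered (subject, object) pair among the detected objects.

    Assumption for illustration: the second stage scores each ordered pair
    once, so n detected objects yield n * (n - 1) candidates.
    """
    return list(permutations(objects, 2))


# Count the candidate pairs for a few scene sizes.
for n in (5, 10, 20, 40):
    labels = [f"obj{i}" for i in range(n)]  # hypothetical object labels
    print(n, "objects ->", len(candidate_pairs(labels)), "candidate pairs")
```

Under this assumption, doubling the number of objects roughly quadruples the number of pairs to label.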
Yang's one-stage method, on the other hand, looks at the visual characteristics of the objects in the scene and focuses on the most important details to determine the relations. It identifies the key areas where objects interact or are associated with each other and determines the most important relations between different objects, requiring only a relatively small amount of training data.
“The model discovers that in the example image a person is very likely interacting with a baseball bat. It then learns to describe the most likely relation: ‘man-swings-baseball bat’,” explains Yang.
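A description of this kind is a subject-predicate-object triplet, and a set of such triplets can be stored as a small graph. The sketch below shows one possible representation. The class `Relation`, the function `build_scene_graph`, and the example triplets are all hypothetical and are not taken from the published method.

```python
from collections import defaultdict
from typing import NamedTuple


class Relation(NamedTuple):
    """One subject-predicate-object triplet describing a visual relation."""
    subject: str
    predicate: str
    obj: str


def build_scene_graph(relations):
    """Group triplets by subject: each subject maps to its (predicate, object) edges."""
    graph = defaultdict(list)
    for rel in relations:
        graph[rel.subject].append((rel.predicate, rel.obj))
    return dict(graph)


# Hypothetical triplets for a baseball scene (illustrative only).
relations = [
    Relation("man", "holds", "baseball bat"),
    Relation("man", "wears", "helmet"),
]
scene_graph = build_scene_graph(relations)
print(scene_graph)
```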
The method opens up opportunities for generating images for use in various industries, including gaming and entertainment.
The complete details of the research can be accessed here: https://dx.doi.org/10.1109/tpami.2023.3268066