In the world of AR, VR, and video game design, designers often face the challenge of inserting an object into an image and having it blend convincingly into the scene. Thanks to new research from Seoul National University, the University of California at Merced, and Google AI, there's a new system that can make designers' jobs easier, says Venture Beat.
The new system can learn how to insert an object into an image in a “semantically coherent,” or “convincing” manner; in other words, in a scene depicting a city, pedestrians appear on the sidewalk, and cars appear on the street. The technology, Venture Beat says, can potentially serve as a powerful tool for image editing and scene parsing applications.
Here’s how it works:
The editing technology contains two modules: one determines where an object should go in a scene, and the other determines what the object should look like. Each module is built on a two-part neural network, a generative adversarial network, in which a generator produces samples and a discriminator tries to distinguish the generated images from real-world images, Venture Beat says.
“Because the system simultaneously models the distribution with respect to an inserted image, it enables both modules to communicate with and optimize each other,” Venture Beat explains. Over time, the system learns a different distribution for each object category placed in a scene.
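To make the two-module idea concrete, here is a minimal PyTorch-style sketch of the generator side, assuming a "where" network that predicts an affine placement and a "what" network that synthesizes an object patch with an alpha mask. The module names, layer sizes, and composition step are illustrative assumptions rather than the authors' exact architecture, and the discriminators are omitted for brevity.

```python
import torch
import torch.nn as nn

class WhereGenerator(nn.Module):
    """Predicts affine placement parameters for a new object, given the scene."""
    def __init__(self, feat_dim=128, z_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Six numbers parameterize a 2x3 affine transform (scale, shear, shift).
        self.head = nn.Linear(feat_dim + z_dim, 6)

    def forward(self, scene, z):
        feats = self.encoder(scene)
        return self.head(torch.cat([feats, z], dim=1)).view(-1, 2, 3)

class WhatGenerator(nn.Module):
    """Synthesizes an object patch (RGB plus an alpha mask) from a noise vector."""
    def __init__(self, z_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 64 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 4, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, z):
        return self.net(z)

def compose(scene, patch, theta):
    """Warp the generated patch to the predicted location and blend it into the scene."""
    grid = nn.functional.affine_grid(theta, scene.size(), align_corners=False)
    warped = nn.functional.grid_sample(patch, grid, align_corners=False)
    rgb, alpha = warped[:, :3], warped[:, 3:4]
    return alpha * rgb + (1 - alpha) * scene  # alpha-blend the object into the scene

# Usage sketch: insert one synthesized object into a random "scene".
scene = torch.rand(1, 3, 128, 128)
z_where, z_what = torch.randn(1, 64), torch.randn(1, 64)
theta = WhereGenerator()(scene, z_where)   # where should the object go?
patch = WhatGenerator()(z_what)            # what should it look like?
edited = compose(scene, patch, theta)      # scene with the object inserted
```

In training, each generator would be paired with a discriminator that scores how plausible the placement and appearance are, which is what lets the two modules optimize each other as described above.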
Based on Venture Beat’s report, the system appears to outperform baseline methods at inserting realistic-looking objects. “When an image recognizer…was applied to images produced by the AI, it was able to detect the synthesized objects with .79 recall. More tellingly, in a survey performed with workers from Amazon’s Mechanical Turk, 43 percent thought that the AI-generated objects were real.”
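For readers unfamiliar with the metric, recall here is simply the fraction of synthesized objects that the recognizer actually detected. The counts below are hypothetical, chosen only to illustrate what a 0.79 recall means:

```python
# Recall = objects the recognizer detected / all synthesized objects present.
# These counts are made up purely to illustrate the reported 0.79 figure.
detected = 79   # synthesized objects the recognizer found
missed = 21     # synthesized objects it failed to find
recall = detected / (detected + missed)
print(f"recall = {recall:.2f}")  # recall = 0.79
```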
As a result, decision makers in the AR, VR, and video game-creation space may want to keep an eye on this technology, as it could one day make their jobs easier and their output more polished.
“This shows that our approach is capable of performing the object synthesis and insertion task,” the researchers from Seoul National University, the University of California at Merced and Google AI wrote in their report about the system. “As our method jointly models where and what, it could be used for solving other computer vision problems. One of the interesting future works would be handling occlusions between objects.”