Predictive Machine Vision

March 10, 2020

Research will someday enable robots to navigate human environments more easily and to interact with us by taking cues from our body language.

In 2019, the DeepMind team developed a generative adversarial network (GAN) that generates short videos from still images.

For example, imagine a photo of a person holding a basketball. Based on the person's posture, facial expression, and other cues in the picture, the GAN infers what likely happened next and generates a video clip of the action.
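To make the idea concrete, here is a minimal sketch of the conditional-GAN setup described above: a generator maps an input image to a candidate "next frame," and a discriminator scores (image, frame) pairs as real or generated. The toy vector "images," single-layer networks, and stand-in dynamics are all illustrative assumptions, not DeepMind's actual architecture, which uses deep convolutional networks trained on real video.

```python
import numpy as np

rng = np.random.default_rng(0)
IMG, FRAME = 16, 16  # toy dimensionalities for "image" and "next frame"

W_g = rng.normal(scale=0.1, size=(FRAME, IMG))  # generator weights
w_d = rng.normal(scale=0.1, size=IMG + FRAME)   # discriminator weights

def generator(image):
    """Propose the next frame from an image (one linear layer + tanh)."""
    return np.tanh(W_g @ image)

def discriminator(image, frame):
    """Probability that (image, frame) is a real continuation."""
    logit = w_d @ np.concatenate([image, frame])
    return 1.0 / (1.0 + np.exp(-logit))

image = rng.normal(size=IMG)
real_next = np.tanh(np.roll(image, 1))  # stand-in for a real future frame
fake_next = generator(image)

# Standard GAN objectives: the discriminator maximizes
# log D(real) + log(1 - D(fake)); the generator maximizes log D(fake).
d_loss = (-np.log(discriminator(image, real_next))
          - np.log(1.0 - discriminator(image, fake_next)))
g_loss = -np.log(discriminator(image, fake_next))
```

During training, the two losses are minimized in alternation, so the generator's proposed futures gradually become hard to distinguish from real continuations.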

Earlier, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) trained computers on YouTube videos and TV shows such as “The Office” and “Desperate Housewives” not only to recognize what’s in a video, but also to predict what humans will do next. CSAIL’s system predicts whether two people are likely to hug, kiss, shake hands or slap a high five.
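The prediction step can be framed as a classification problem: given features summarizing the last few seconds of video, pick the most likely upcoming action. The sketch below uses invented feature vectors and plain softmax regression for illustration; CSAIL's actual system learned visual representations from hundreds of hours of video.

```python
import numpy as np

rng = np.random.default_rng(1)
ACTIONS = ["hug", "kiss", "shake hands", "high five"]
N_PER_CLASS, DIM = 50, 8

# Toy dataset: each action gets a characteristic feature "prototype",
# and examples are noisy copies of their class prototype.
prototypes = rng.normal(size=(4, DIM))
X = np.vstack([p + 0.1 * rng.normal(size=(N_PER_CLASS, DIM))
               for p in prototypes])
y = np.repeat(np.arange(4), N_PER_CLASS)

# Softmax regression trained by gradient descent on cross-entropy loss.
W = np.zeros((4, DIM))
for _ in range(500):
    logits = X @ W.T
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    onehot = np.eye(4)[y]
    W += 0.5 * (onehot - probs).T @ X / len(X)

def predict_next_action(features):
    """Return the most likely upcoming action for one feature vector."""
    return ACTIONS[int(np.argmax(W @ features))]

accuracy = np.mean([predict_next_action(x) == ACTIONS[t]
                    for x, t in zip(X, y)])
```

On these well-separated toy clusters the classifier fits the training data almost perfectly; the hard part in the real system is learning features from raw video that make future actions separable at all.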

This research will someday enable robots to navigate human environments more easily and to interact with us by taking cues from our body language. The same techniques could also be applied in retail settings, around heavy machinery, or in classrooms.