Hybrid AI-powered computer vision combines physics and big data

Computer vision allows AIs to see and make sense of their surroundings by decoding data and inferring properties of the physical world from images.

While such images are formed through the physics of light and mechanics, traditional computer vision techniques have predominantly focused on data-based machine learning to drive performance.

Physics-based research has, on a separate track, been developed to explore the various physical principles behind many computer vision challenges.

It has been a challenge to incorporate an understanding of physics -- the laws that govern mass, motion and more -- into the development of neural networks, in which AIs modeled after the human brain use billions of nodes to crunch massive image data sets until they gain an understanding of what they "see." But there are now a few promising lines of research that seek to add elements of physics-awareness into already robust data-driven networks.

The UCLA study aims to harness the power of both the deep knowledge from data and the real-world know-how of physics to create a hybrid AI with enhanced capabilities.

"Visual machines -- cars, robots, or health instruments that use images to perceive the world -- are ultimately doing tasks in our physical world," said the study's corresponding author Achuta Kadambi, an assistant professor of electrical and computer engineering at the UCLA Samueli School of Engineering.

"Physics-aware forms of inference can enable cars to drive more safely or surgical robots to be more precise."

The research team outlined three ways in which physics and data are starting to be combined into computer vision artificial intelligence:

Incorporating physics into AI data sets: Tag objects with additional information, such as how fast they can move or how much they weigh, similar to characters in video games.

Incorporating physics into network architectures: Run data through a network filter that codes physical properties into what cameras pick up.

Incorporating physics into network loss functions: Leverage knowledge built on physics to help AI interpret training data on what it observes.
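As a rough illustration of the third approach, the sketch below adds a physics penalty to an ordinary data-fitting loss. It is not taken from the study. The free-fall scenario, the function names (data_loss, physics_loss, hybrid_loss) and the weight parameter are all assumptions chosen for illustration. The idea is that a predicted trajectory is scored not only on how closely it matches observations but also on how well it obeys a known physical law, here constant gravitational acceleration.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed free-fall scenario)


def data_loss(pred, target):
    """Mean squared error between predicted and observed heights."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def physics_loss(pred, dt):
    """Mean squared violation of free-fall dynamics.

    For an object in free fall, the second finite difference of its
    height divided by dt**2 should equal -G, so each residual below
    is zero when the prediction is physically consistent.
    """
    residuals = [
        (pred[i + 1] - 2 * pred[i] + pred[i - 1]) / dt ** 2 + G
        for i in range(1, len(pred) - 1)
    ]
    return sum(r ** 2 for r in residuals) / len(residuals)


def hybrid_loss(pred, target, dt, weight=0.1):
    """Data term plus a weighted physics term (weight is hypothetical)."""
    return data_loss(pred, target) + weight * physics_loss(pred, dt)


# Two candidate predictions sampled every 0.1 seconds.
dt = 0.1
times = [i * dt for i in range(6)]
free_fall = [10.0 - 0.5 * G * t ** 2 for t in times]  # obeys gravity
straight_line = [10.0 - 2.0 * t for t in times]       # ignores gravity

# The physics term is near zero for the first and large for the second.
print(physics_loss(free_fall, dt), physics_loss(straight_line, dt))
```

In a real training loop the same penalty would be computed on a network's outputs and minimized by gradient descent. The weight controls how strongly the physical prior competes with the data term.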

These three lines of investigation have already yielded encouraging results in improved computer vision.

For example, the hybrid approach allows AI to track and predict an object's motion more precisely and can produce accurate, high-resolution images from scenes obscured by inclement weather.

With continued progress in this dual-modality approach, deep learning-based AIs may even begin to learn the laws of physics on their own, according to the researchers.

The other authors on the paper are Army Research Laboratory computer scientist Celso de Melo and UCLA faculty members Stefano Soatto, a professor of computer science; Cho-Jui Hsieh, an associate professor of computer science; and Mani Srivastava, a professor of electrical and computer engineering and of computer science.

The research was supported in part by a grant from the Army Research Laboratory. Kadambi is supported by grants from the National Science Foundation, the Army Young Investigator Program and the Defense Advanced Research Projects Agency. A co-founder of Vayu Robotics, Kadambi also receives funding from Intrinsic, an Alphabet company. Hsieh, Srivastava and Soatto receive support from Amazon.