Inspired by the behaviour of the human eye, Boston College computer scientists have developed a technique that lets computers see objects as fleeting as a butterfly or tropical fish with nearly double the accuracy and 10 times the speed of earlier methods.
The linear solution to one of the most vexing challenges to advancing computer vision has direct applications in the fields of action and object recognition, surveillance, wide-base stereo microscopy and three-dimensional shape reconstruction, according to the researchers, who will report on their advance at the upcoming annual IEEE meeting on computer vision.
BC computer scientists Hao Jiang and Stella X. Yu developed a novel set of linear algorithms to streamline the computer's work. Previously, computer vision relied on software that captured the live image and then hunted through millions of possible object configurations to find a match. Compounding the challenge, even more images needed to be searched as objects moved, altering scale and orientation.
Rather than combing through the image bank - a time- and memory-consuming computing task - Jiang and Yu turned to the mechanics of the human eye to give computers better vision.
'When the human eye searches for an object it looks globally for the rough location, size and orientation of the object. Then it zeros in on the details,' said Jiang, an assistant professor of computer science. 'Our method behaves in a similar fashion, using a linear approximation to explore the search space globally and quickly; then it works to identify the moving object by frequently updating trust search regions.'
Trust search regions act as visual touchstones the computer returns to again and again. Jiang and Yu's solution focuses on the mathematically generated template of an image, which looks like a constellation when lines are drawn to connect the stars. Using the researchers' new algorithms, computer software identifies an object by matching that template within a trust search region. The program then adjusts the trust search regions as the object moves and finds its mathematical matches, relaying that shifting image to a memory bank or a computer screen to record or display the object.
Jiang says using linear approximation in a sequence of trust regions enables the new program to maintain spatial consistency as an object moves and reduces the number of variables that need to be optimised from several million to just a few hundred. That made image matching 10 times faster than with previous methods, he said.
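The idea of searching coarsely first and then narrowing a trust region can be illustrated with a toy example. The sketch below is not the researchers' linear-programming method; it is a simplified coarse-to-fine stand-in that only estimates a translation, and every name in it (match_cost, trust_region_search, the grid and shrink parameters) is a hypothetical choice made for illustration. The template is a small "constellation" of points, and the search repeatedly samples a handful of candidate offsets inside the current region, re-centres on the best one and halves the region.

```python
import math


def match_cost(template, target, dx, dy):
    """Sum of distances from each shifted template point to its nearest target point."""
    return sum(
        min(math.hypot(tx + dx - px, ty + dy - py) for px, py in target)
        for tx, ty in template
    )


def trust_region_search(template, target, center=(0.0, 0.0), radius=64.0,
                        grid=5, shrink=0.5, min_radius=0.25):
    """Coarse-to-fine search for the translation aligning template to target.

    Illustrative assumption: only translation is estimated, and a small
    grid of candidates stands in for the linear approximation step.
    """
    cx, cy = center
    evaluations = 0
    while radius >= min_radius:
        # Sample a small grid of offsets spanning the current trust region.
        steps = [(-1 + 2 * i / (grid - 1)) * radius for i in range(grid)]
        best = None
        for sx in steps:
            for sy in steps:
                cost = match_cost(template, target, cx + sx, cy + sy)
                evaluations += 1
                if best is None or cost < best[0]:
                    best = (cost, cx + sx, cy + sy)
        # Re-centre on the best candidate, then shrink the region.
        _, cx, cy = best
        radius *= shrink
    return (cx, cy), evaluations


# Toy data: a four-point "constellation" and the same points moved by (23, -17).
template = [(0, 0), (10, 0), (0, 10), (7, 7)]
target = [(x + 23, y - 17) for x, y in template]

offset, evaluations = trust_region_search(template, target)
exhaustive = 129 * 129  # every integer offset in the same 128 x 128 window
print("estimated offset:", offset)
print("cost evaluations:", evaluations, "versus exhaustive:", exhaustive)
```

In this toy setting the number of candidates grows with the number of shrink steps rather than with the size of the search window, which is the same kind of saving the article describes, although the real method optimises far richer matches than a single translation.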
The researchers tested the software on a variety of images and videos - from a butterfly to a stuffed teddy bear - and report achieving a 95 percent detection rate at a fraction of the complexity. Previous so-called 'greedy' methods of search and match achieved a detection rate of approximately 50 percent, Jiang said.