Using the Forest to See the Trees: Context-based Object Recognition
Bill Freeman
Joint work with Antonio Torralba and Kevin Murphy
Computer Science and Artificial Intelligence Laboratory, MIT

A computer vision goal
• Recognize many different objects under many viewing conditions in unconstrained settings.
• There has been progress on restricted cases:
  – one object and one pose (frontal view faces)
  – isolated objects on uniform backgrounds
• But the general problem is difficult and unsolved.

How we hope to make progress on this hard problem
• Various technical improvements
• Exploit scene context:
  – “if this is a forest, these must be trees”.

Local (bottom-up) approach to object detection
• Classify image patches/features at each location and scale.
[Figure: local features VL are passed to a classifier p( car | VL ); the example output shown is “No car”.]
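
The local approach above can be sketched as a sliding-window loop over locations and scales. This is only an illustration, not the detector used in the talk: `toy_classifier` is a hypothetical stand-in for p( car | VL ), and the patch size, stride, and subsampling scheme are assumptions chosen to keep the example short.

```python
import numpy as np

def toy_classifier(patch):
    """Hypothetical stand-in for p(car | VL): mean brightness clipped to [0, 1]."""
    return float(np.clip(patch.mean(), 0.0, 1.0))

def sliding_window_scores(image, patch=8, stride=8, scales=(1, 2),
                          classifier=toy_classifier):
    """Score every patch at every location (x, y) and scale s."""
    detections = []
    for s in scales:
        small = image[::s, ::s]  # crude downsampling by subsampling (assumption)
        h, w = small.shape
        for y in range(0, h - patch + 1, stride):
            for x in range(0, w - patch + 1, stride):
                p = classifier(small[y:y + patch, x:x + patch])
                # Report coordinates in the original image frame.
                detections.append((x * s, y * s, s, p))
    return detections

# Synthetic 32x32 image with one bright 8x8 square standing in for an object.
image = np.zeros((32, 32))
image[8:16, 16:24] = 1.0

scores = sliding_window_scores(image)
best = max(scores, key=lambda d: d[3])
print(len(scores), best)
```

Even this tiny image yields 20 classifier calls for one object class; a realistic image, stride, and scale range multiply that count quickly.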

Problem 1: Local features can be ambiguous
Solution 1: Context can disambiguate local features

Effect of context on object detection
[Figure labels: car, pedestrian]
• Identical local image features!
• Even high-resolution images can be locally ambiguous.
Images by Antonio Torralba
Object in context (Courtesy of Fredo Durand and William Freeman. Used with permission.)

Problem 2: search space is HUGE
“Like finding needles in a haystack”
- Slow (many patches to examine)
- Error prone (classifier must have very low false positive rate)
Need to search over x,y locations and scales s:
- 10,000 patches/object/image
- 1,000,000 images/day
- Plus, we want to do this for ~ 1000 objects
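
To make the size of the search concrete, the figures on this slide can be multiplied out. The sketch below uses only the slide's numbers for the workload; the false-positive rate passed to `expected_false_alarms` is an assumed value, chosen purely to show why the rate has to be very low.

```python
# Workload figures taken from the slide.
patches_per_object_per_image = 10_000
images_per_day = 1_000_000
num_objects = 1_000

# Total classifier evaluations per day for an exhaustive search.
evaluations_per_day = patches_per_object_per_image * images_per_day * num_objects

def expected_false_alarms(false_positive_rate):
    """Expected false alarms per image for one object class."""
    return patches_per_object_per_image * false_positive_rate

print(f"{evaluations_per_day:.0e} evaluations/day")
# Assumed per-patch false-positive rate of 1e-4, for illustration only.
print(expected_false_alarms(1e-4), "expected false alarms per image per object")
```

With these numbers an exhaustive search needs on the order of 10^13 classifier evaluations per day, and even a per-patch false-positive rate of one in ten thousand still produces about one false alarm per image for every object class.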

Solution 2: context can provide a prior on what to look for, and where to look for it
[Figure: chart on a 0.0 to 1.0 scale over the classes cars, pedestrian, computer, desk. Annotations: “Computers/desks unlikely outdoors”; “People most likely here”.]
Torralba, IJCV 2003
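
One simple way to read "context as a prior" is Bayes' rule: multiply the local classifier's likelihood for each class by a context-dependent prior and renormalize. The sketch below is a generic illustration of that idea, not the model from the talk or from Torralba (IJCV 2003); the class names come from the slide, while all probabilities are made-up numbers and the "no object" class is left out for brevity.

```python
def posterior(likelihood, prior):
    """Combine local evidence with a context prior via Bayes' rule."""
    unnormalized = {c: likelihood[c] * prior[c] for c in likelihood}
    z = sum(unnormalized.values())
    return {c: v / z for c, v in unnormalized.items()}

# Ambiguous local features: car and pedestrian look equally likely (assumed values).
local_likelihood = {"car": 0.4, "pedestrian": 0.4, "computer": 0.1, "desk": 0.1}

# Hypothetical prior for an outdoor street scene: computers/desks unlikely outdoors.
street_prior = {"car": 0.5, "pedestrian": 0.4, "computer": 0.05, "desk": 0.05}

post = posterior(local_likelihood, street_prior)
for name, p in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {p:.3f}")
```

The prior both breaks the tie between locally identical classes and pushes implausible classes close to zero, which is what lets a detector skip them.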

Talk outline
• Context-based vision
• Feature-based object detection
• Graphical model to combine both sources

Context-based vision
• Measure overall scene context or “gist”
• Use that scene context for:
  – Location identification
  – Location categorization
  – Top-down info for object recognition
• Combine with bottom-up object detection
• Future focus: training set acquisition.