Guy Ben-Yosef: "Beyond object labels: full interpretation of minimal images"

Much of current computer vision research is focused on labeling visual objects, and impressive results have been achieved for this task. However, human image understanding is much richer, involving understanding both below the object level (e.g., localizing and labeling the object’s parts) and above it (e.g., categorizing the interactions between two or more ‘person’ objects). In particular, human understanding involves structure: identifying objects and their parts, together with a rich set of semantic relations between them. This mapping of sensory input (pixels) to the semantic structure perceived by humans is termed here ‘full interpretation’.

In this talk I will describe a set of studies toward a full interpretation process that operates both below and above the object level. Full interpretation is approached by dividing the image into multiple so-called ‘minimal images’, namely reduced local image regions that are minimal in the sense that any further reduction renders them unrecognizable and uninterpretable to humans. Minimal images make the interpretation task tractable, and they also provide valuable insights into the computational mechanisms underlying interpretation in the human visual system. I will show results of incorporating such mechanisms in a structured prediction algorithm for full interpretation, and discuss how the interpretation processes are combined with convolutional networks.

Date and Time: 
Thursday, January 11, 2018 - 11:30 to 12:30
Speaker: 
Guy Ben-Yosef
Location: 
IDC, TBD
Speaker Bio: 

Guy Ben-Yosef is a postdoctoral associate at MIT's Computer Science and Artificial Intelligence Laboratory, and also an affiliate of the NSF Center for Brains, Minds and Machines. He studies human and machine vision, with a focus on visual recognition and image interpretation.