Cognitive Neuroscience Lecture 4: Vision: The Computational Challenges

-         Human vision is better than any computer’s

-         55% of cortex is devoted to vision

o Only 11% for touch and 3% for audition

The problem of vision

-         To recover the structure of the world from part of the electromagnetic spectrum based on shifting retinal images

-         Multiple possibilities of structure from the same pattern of light (3D to 2D)

o Retinal projection could be from different objects ( \ | / )
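
The ambiguity above can be sketched numerically. This is a minimal illustration, assuming a simple visual-angle model of projection; the function name and the object sizes and distances are made up for the example. It shows three different objects that all produce the same retinal extent.

```python
import math

def visual_angle_deg(size_m, distance_m):
    """Visual angle (degrees) subtended by an object of a given size seen face-on."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))

# Hypothetical objects: size doubles whenever distance doubles.
objects = [(0.5, 1.0), (1.0, 2.0), (2.0, 4.0)]  # (size in m, distance in m)
angles = [visual_angle_deg(s, d) for s, d in objects]

for (s, d), a in zip(objects, angles):
    print(f"size {s} m at {d} m -> {a:.2f} deg")
```

Because only the ratio of size to distance reaches the retina, the image alone cannot say which of the three objects is out there.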

Myths about vision

-         Vision provides a faithful record

o Fraser spiral/optical illusion

o Motion illusion

o Size illusion

-         Vision is passive

o Impossible figures: visual representation is coarse

o Change blindness

o Troxler’s fading

o Color perception (b&w to color when eyes close)

-         Vision is accomplished by our eyes

o A lot of our brain is devoted to vision

Truths about vision

-         Vision is tricky to study

o Instinct blindness: you need to make the natural seem strange before you can ask “why” about an instinctive human act

o Self-referencing:

§  up, down, left, right

§  how would you describe color to a blind person?

§  What is it like to be a bat?

-         Vision itself is impossible

o Inverse problem: retina is 2D and world is 3D

§  Inverse problem: orientation and shape (circle could be a cylinder of many lengths)

§  Inverse problem: distance and size (the monsters chasing are an example of \ | / )

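§  Inverse problem: reflectance and illumination (the same luminance can come from a dark surface in bright light or a light surface in shadow, so we perceive things in shadow as lighter than they are)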

o Solution: make assumptions and use them to make inferences

o Result: vision is an inferential, constructive process

§  You don’t see what’s there; you see what’s most likely to be there given your assumptions

-         Vision makes assumptions

o Unconscious inference: infer most likely

§  We perceive the objects most likely to have produced the received sensory stimulation

§  E.g. straight instead of wiggly b/c straight more likely

·       Avoid “accidental viewpoint”

o Coincidence avoidance

§  E.g. Kanizsa figures, Idesawa’s spike sphere

·       Interpreted as sphere with spikes b/c more likely than independent spikes

o Prägnanz (succinctness, simplicity)

§  Assumes simplest thing is correct

·       E.g. square with plus, square with spike

o Lighting is uniform, shadows are smooth, and local info matters most

§  E.g. cylinder with one checker in shadow, one in light: both are same color

o Shadows move with object

§  E.g. Sphere moving above checkerboard

o Unchanging is uninformative

§  That’s why you don’t perceive blood vessels on your retina

o Faces are convex

§  Spinning mask

o Vision fills in the blanks

§  Blind spot: where the optic nerve leaves the retina on its way to the brain, so there are no photoreceptors there
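
The idea of unconscious inference (seeing what is most likely given your assumptions) can be sketched with Bayes' rule. This is only an illustration of the "straight instead of wiggly" example: the priors and likelihoods below are invented numbers, and the function name is hypothetical.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a discrete set of hypotheses."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# The retinal image contains a straight line. Both causes are assumed equally common.
priors = {"straight edge": 0.5, "wiggly edge": 0.5}

# Assumed chance that each object projects to a straight image line from a
# random viewpoint: a wiggly edge does so only from an "accidental viewpoint".
likelihoods = {"straight edge": 1.0, "wiggly edge": 0.01}

post = posterior(priors, likelihoods)
percept = max(post, key=post.get)  # the most probable interpretation
print(percept, post)
```

Even with equal priors, the wiggly interpretation loses because it would require a coincidence of viewpoint, which is the logic behind coincidence avoidance.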

Case study: depth perception

How do we see the third dimension? Inverse problem

Depth cues: cues that support depth perception in 3-D moving world

-         Monocular cues

-         Binocular cues

-         Dynamic cues

-         Pictorial cues

Accommodation: degree of strain on the lens (not much use beyond 1m). MONOCULAR

-         Far: thin lens

-         Near: thick lens

Convergence: relative angle between the eyes (not much use beyond 1m). BINOCULAR

-         Far: small angle of convergence

-         Near: large angle of convergence
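
A rough sketch of why convergence stops being useful at larger distances, assuming the fixated point lies straight ahead and an interocular distance of 6.3 cm (an illustrative value, not taken from the lecture):

```python
import math

IPD_M = 0.063  # assumed interocular distance in meters (illustrative value)

def convergence_angle_deg(distance_m, ipd_m=IPD_M):
    """Angle between the two lines of sight when fixating a point straight ahead."""
    return math.degrees(2 * math.atan(ipd_m / (2 * distance_m)))

distances = [0.25, 0.5, 1.0, 2.0, 10.0]  # viewing distances in meters
angles = [convergence_angle_deg(d) for d in distances]

for d, a in zip(distances, angles):
    print(f"{d:>5} m -> {a:.2f} deg")
```

Under these assumptions the angle changes far more between 0.25 m and 1 m than it does across the whole range from 1 m to 10 m, so the cue carries little information about distant objects.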

Motion parallax: relative motion of objects MONOCULAR

-         Distant objects move least, close objects move most
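
Motion parallax can be sketched the same way. The example assumes an object that starts straight ahead and a sideways head movement of 10 cm; both the function name and the numbers are hypothetical.

```python
import math

def image_shift_deg(distance_m, head_shift_m=0.10):
    """Change in direction (degrees) of an object that started straight ahead,
    after the head moves sideways by head_shift_m."""
    return math.degrees(math.atan(head_shift_m / distance_m))

near = image_shift_deg(0.5)   # object half a meter away
far = image_shift_deg(50.0)   # object fifty meters away

print(f"near object shifts {near:.2f} deg, far object shifts {far:.2f} deg")
```

The same head movement sweeps the near object across a much larger angle than the far one, which is the relative motion the cue relies on.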

Binocular disparity: having two eyes

-         Motion parallax requires moving head

-         Accommodation and convergence require moving eyes

-         Binocular disparity does not require moving head

-         The amount of disparity depends on the difference in depth between two objects
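
As a sketch of how disparity grows with the depth separation between two objects, the example below takes the relative disparity of two points on the midline to be the difference between their convergence angles. The interocular distance of 6.3 cm and the function names are assumptions for illustration.

```python
import math

IPD_M = 0.063  # assumed interocular distance in meters (illustrative value)

def vergence_deg(distance_m, ipd_m=IPD_M):
    """Angle between the two eyes' lines of sight to a point straight ahead."""
    return math.degrees(2 * math.atan(ipd_m / (2 * distance_m)))

def relative_disparity_deg(near_m, far_m):
    """Relative disparity between two midline points at different depths."""
    return vergence_deg(near_m) - vergence_deg(far_m)

same_depth = relative_disparity_deg(1.0, 1.0)  # no depth difference
small_gap = relative_disparity_deg(1.0, 1.1)   # 10 cm apart in depth
large_gap = relative_disparity_deg(1.0, 2.0)   # 1 m apart in depth

print(same_depth, small_gap, large_gap)
```

Two objects at the same depth give zero relative disparity, and the disparity increases as they move further apart in depth, with no head or eye movement needed.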

Pictorial Cues: cues that support depth perception in flat, static images

-         Occlusion: one object is covering the other

o T junctions imply occlusion, and thus a depth ranking (BUT NOT MEASUREMENT)

-         Shadows: depth ranking, relationship between objects and surfaces

-         Linear perspective: parallel lines appear to converge in the distance

-         Relative height (horizon): height in the visual field relative to the horizon

o We perceive objects near the horizon as more distant

-         Known size: if a known object has a ‘canonical size’, we can determine its distance from the size of its retinal image

o Often overruled by other cues

o E.g. child looks big b/c standing close to you

o Top-down knowledge

-         Texture gradient: assuming ground is uniform, changes in spatial scale reflect distance

-         Atmospheric perspective: the amount of intervening atmosphere increases with distance, so haze grows

o E.g. blue mountains, with the farther ones lighter blue and hazier
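
The relative height cue from the list above can be made concrete with a little geometry. This sketch assumes perfectly flat ground, a horizon at eye level, and an eye height of 1.6 m; none of these values come from the lecture, and the function names are hypothetical.

```python
import math

EYE_HEIGHT_M = 1.6  # assumed eye height above flat ground, in meters

def angle_below_horizon_deg(ground_distance_m, eye_height_m=EYE_HEIGHT_M):
    """Angle below the horizon at which a point on flat ground appears."""
    return math.degrees(math.atan(eye_height_m / ground_distance_m))

def distance_from_angle_m(angle_deg, eye_height_m=EYE_HEIGHT_M):
    """Invert the relation: recover ground distance from the angle below the horizon."""
    return eye_height_m / math.tan(math.radians(angle_deg))

for d in (2.0, 20.0, 200.0):
    print(f"{d:>6} m away -> {angle_below_horizon_deg(d):.2f} deg below horizon")
```

Points on the ground that are farther away sit closer to the horizon in the image, so under the flat-ground assumption height in the field can stand in for distance.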

Constraint satisfaction: putting it all together

-         Cues arise from depth relations in the 3D world

-         Each cue in a local region gives rise to a set of possible interpretations

-         Each possibility constrains assignment of edges and surfaces around cue

-         Final interpretation is what is most compatible with all cues (like Sudoku!)
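
A toy version of the constraint satisfaction idea, not a model of how the visual system actually does it: three hypothetical surfaces A, B and C, where each cue contributes a "nearer than" constraint and the interpretation is whatever depth ordering survives all of them.

```python
from itertools import permutations

surfaces = ["A", "B", "C"]  # hypothetical surfaces in a scene

def consistent_orders(constraints):
    """Return every near-to-far ordering that satisfies all (nearer, farther) pairs."""
    orders = []
    for order in permutations(surfaces):
        rank = {s: i for i, s in enumerate(order)}  # 0 = nearest
        if all(rank[near] < rank[far] for near, far in constraints):
            orders.append(order)
    return orders

# One cue alone (say, occlusion: A covers B) leaves several interpretations.
occlusion_only = consistent_orders([("A", "B")])

# Adding a second cue (say, a shadow placing B in front of C) narrows them down.
all_cues = consistent_orders([("A", "B"), ("B", "C")])

print(len(occlusion_only), "orderings with one cue;", all_cues, "with both")
```

Each cue rules out some orderings, and the remaining ordering is the one compatible with every cue, much like filling in a Sudoku cell.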

Conflict of depth cues

-         E.g. Ames room: we assume corners meet at 90 deg, but that is not necessarily true