VISION

A Computational Investigation into the Human Representation and Processing of Visual Information

David Marr
Late of the Massachusetts Institute of Technology

W. H. Freeman and Company
San Francisco

Project Editor: Judith Wilson
Copy Editor: Paul Monsour
Production Coordinator: Linda Jupiter
Illustration Coordinator: Richard Quiñones
Artist: Catherine Brandel […]
Compositor: Graphic Typesetting Service
Printer and Binder: The Maple-Vail Book Manufacturing Group

Library of Congress Cataloging in Publication Data
Marr, David, 1945-1980.
Vision.
Includes index.
1. Vision--Data processing. 2. Vision--Mathematical models. 3. Human information processing. I. Title.

Copyright © 1982 by W. H. Freeman and Company

No part of this book may be reproduced by any mechanical, photographic, or electronic process, or in the form of a phonographic recording, nor may it be stored in a retrieval system, transmitted, or otherwise copied for public or private use, without written permission from the publisher.

Printed in the United States of America

To my parents and to Lucia

Note: Readers may order stereoscopic viewers for the stereo images printed in this book. These are available from the following companies; please write to request prices:

Hubbard Scientific Company, P.O. Box 108, Northbrook, Illinois 60062
Edmund Scientific Company, 179 Edscorp Building, Barrington, New Jersey 08007

The reader may be able to see the stereo effect without a stereoscope […]

Contents

Detailed Contents
Preface

PART I  INTRODUCTION AND PHILOSOPHICAL PRELIMINARIES

General Introduction

Chapter 1  The Philosophy and the Approach
  Background
  Understanding Complex Information-Processing Systems
  A Representational Framework for Vision

PART II  VISION

Chapter 2  Representing the Image
  Physical Background of Early Vision
  Zero-Crossings and the Raw Primal Sketch
  Spatial Arrangement of an Image
  Light Sources and Transparency
  Grouping Processes and the Full Primal Sketch

Chapter 3  From Images to Surfaces
  Modular Organization of the Human Visual Processor
  Processes, Constraints, and the Available Representations of an Image
  Stereopsis
  Directional Selectivity
  Apparent Motion
  Shape Contours
  Surface Texture
  Shading and Photometric Stereo
  Brightness, Lightness, and Color
  Summary

Chapter 4  The Immediate Representation of Visible Surfaces
  Introduction
  Image Segmentation
  Reformulating the Problem
  The Information to be Represented
  General Form of the 2½-D Sketch
  Possible Forms for the Representation
  Possible Coordinate Systems
  Interpolation, Continuation, and Discontinuities
  Computational Aspects of the Interpolation Problem
  Other Internal Computations

Chapter 5  Representing Shapes for Recognition
  Introduction
  Issues Raised by the Representation of Shape
  The 3-D Model Representation
  Natural Extensions
  Deriving and Using the 3-D Model Representation
  Psychological Considerations

Chapter 6  Synopsis

PART III  EPILOGUE

Chapter 7  A Conversation
  Introduction
  A Way of Thinking

Glossary
Bibliography
Index

Detailed Contents

Preface

PART I  INTRODUCTION AND PHILOSOPHICAL PRELIMINARIES

General Introduction

Chapter 1  The Philosophy and the Approach
  Background
  Understanding Complex Information-Processing Systems
    Representation and description
    Process
    The three levels
    Importance of computational theory
    The approach of J. J. Gibson
  A Representational Framework for Vision
    The purpose of vision
    Advanced vision
    To the desirable via the possible

PART II  VISION

Chapter 2  Representing the Image
  Physical Background of Early Vision
    Representing the image
    Underlying physical assumptions
      Existence of surfaces
      Hierarchical organization
      Similarity
      Spatial continuity
      Continuity of discontinuities
  Zero-Crossings and the Raw Primal Sketch
    Zero-crossings
    Biological implications
    The psychophysics of early vision
    The physiological realization of the ∇²G filters
    The physiological detection of zero-crossings
    The first complete symbolic representation of the image
    The raw primal sketch
    Philosophical aside
  Spatial Arrangement of an Image
  Light Sources and Transparency
    Other light source effects
    Transparency
    Conclusions
  Grouping Processes and the Full Primal Sketch
    Main points in the argument
    The computational approach and the psychophysics of texture discrimination

Chapter 3  From Images to Surfaces
  Modular Organization of the Human Visual Processor
  Processes, Constraints, and the Available Representations of an Image
  Stereopsis
    Measuring stereo disparity
      Computational theory
      Algorithms for stereo matching
      Cooperative algorithm
      Cooperative algorithms and the stereo matching problem
      Biological evidence
      A second algorithm
      […] cooperativity, and the pulling effect
      Panum's fusional area
      Impressions of depth from larger disparities
      Have we solved the right problem?
      Vergence movements and the 2½-D sketch
      Neural implementation of stereo fusion
    Computing distance and surface orientation from disparity
      Computational theory
      Distance from the viewer to the surface
      Surface orientation from disparity change
  Directional Selectivity
    Introduction to visual motion
    Computational theory
    An algorithm
    Neural implementation
    Using directional selectivity to separate independently moving surfaces
      Computational theory
      Algorithm and implementation
    Looming
  Apparent Motion
    Why apparent motion?
    The two halves of the problem
    The correspondence problem
    Empirical findings
    What is the input representation?
    Two-dimensionality of the correspondence process
    Ullman's theory of the correspondence process
    A critique of Ullman's theory
    A new look at the correspondence problem
    One problem or two?
    Separate systems for structure and object constancy
    Structure from motion
    The input representation
    Mathematical results
  Shape Contours
    Some examples
    Occluding contours
    Constraining assumptions
    Implications of the assumptions
    Surface orientation discontinuities
    Surface contours
    The puzzle and difficulty of surface contours
    Determining the shape of the contour generator
    The effects of more than one contour
  Surface Texture
    The isolation of texture elements
    Surface parameters
    Possible measurements
    Estimating scaled distance directly
    Summary
  Shading and Photometric Stereo
    Gradient space
    Surface illumination, surface reflectance, and image intensity
    The reflectance map
    Recovery of shape from shading
    Photometric stereo
  Brightness, Lightness, and Color
    The Helson-Judd approach
    Retinex theory of lightness and color
    Algorithms
    Extension to color vision
    Comments on the retinex theory
    Some physical reasons for the importance of simultaneous contrast
    Hypothesis of the superficial origin of nonlinear changes in intensity
    Implications for measurements on a trichromatic image
    Summary of the approach
  Summary

Chapter 4  The Immediate Representation of Visible Surfaces
  Introduction
  Image Segmentation
  Reformulating the Problem
  The Information to be Represented
  General Form of the 2½-D Sketch
  Possible Forms for the Representation
  Possible Coordinate Systems
  Interpolation, Continuation, and Discontinuities
  Computational Aspects of the Interpolation Problem
    Discontinuities
    Interpolation methods
  Other Internal Computations

Chapter 5  Representing Shapes for Recognition
  Introduction
  Issues Raised by the Representation of Shape
    Criteria for judging the effectiveness of a shape representation
      Accessibility
      Scope and uniqueness
      Stability and sensitivity
    Choices in the design of a shape representation
      Coordinate systems
      Primitives
      Organization
  The 3-D Model Representation
    Natural coordinate systems
    Axis-based descriptions
    Modular organization of the 3-D model representation
    Coordinate system of the 3-D model
  Natural Extensions
  Deriving and Using the 3-D Model Representation
    Deriving a 3-D model description
    Relating viewer-centered to object-centered coordinates
    Indexing and the catalogue of 3-D models
    Interaction between derivation and recognition
    Finding the correspondence between image and catalogued model
    Constraint analysis
  Psychological Considerations

Chapter 6  Synopsis

PART III  EPILOGUE

Chapter 7  In Defense of the Approach
  Introduction
  A Conversation

Glossary
Bibliography
Index

Preface

This book is meant to be enjoyed. It describes the adventures I have had in the years since Marvin Minsky and Seymour Papert invited me to the Artificial Intelligence Laboratory at the Massachusetts Institute of Technology in 1973. Working conditions were ideal, thanks to Patrick Winston's skillful administration, to the generosity of the Advanced Research Projects Agency of the Department of Defense and of the National Science Foundation, and to the freedom arranged for me by Whitman Richards, under the benevolent eye of Richard Held. I was fortunate enough to meet and collaborate with a remarkable collection of people, most especially Tomaso Poggio.
Included among these people were many erstwhile students who became colleagues and from whom I learned much: Keith Nishihara, Shimon Ullman, Ken Forbus, Kent Stevens, Eric Grimson, Ellen Hildreth, Michael Riley, and John Batali. Berthold Horn kept us close to the physics of light, and Whitman Richards, to the abilities and inabilities of people.

In December 1977, certain events occurred that forced me to write this book a few years earlier than I had planned. Although the book has important gaps, which I hope will soon be filled, a new framework for studying vision is already clear and supported by enough solid results to be worth setting down as a coherent whole.

Many people have helped me to live through this somewhat difficult period. Particularly, my parents, my sister, my wife Lucia, and Jennifer, Tomaso, Shimon, Whitman, and Inge gave to me more than I often deserved; although mere thanks are inadequate, I thank them. William Prince steered me to Professor F. G. Hayhoe and Dr. John Rees at Addenbrooke's Hospital in Cambridge, and them I thank for giving me time.

Summer 1979
David Marr

We should like to express our gratitude to those who helped us bring David Marr's Vision to fulfillment. We thank Gunther Stent, whose friendship brought David Marr and W. H. Freeman and Company together and whose sound guidance helped us prepare the book for publication. We thank David Marr's colleague, Keith Nishihara, for his skill and great tact; the work could not have been finished without him. We thank David Marr's assistant, Carol Papineau, for attending so well to the needs of the manuscript and the publisher. We thank the vision group at the MIT Artificial Intelligence Laboratory, especially Ellen Hildreth and Eric Grimson, who participated in ways large and small to bring this book to life.

The Publisher

PART I
Introduction and Philosophical Preliminaries

General Introduction

What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking. In other words, vision is the process of discovering from images what is present in the world, and where it is.

Vision is therefore, first and foremost, an information-processing task, but we cannot think of it just as a process.
For if we are capable of knowing what is where in the world, our brains must somehow be capable of representing this information, in all its profusion of color and form, beauty, motion, and detail. The study of vision must therefore include not only the study of how to extract from images the various aspects of the world that are useful to us, but also an inquiry into the nature of the internal representations by which we capture this information and thus make it available as a basis for decisions about our thoughts and actions. This duality, the representation and the processing of information, lies at the heart of most information-processing tasks and will profoundly shape our investigation of the particular problems posed by vision.

The need to understand information-processing tasks and machines has arisen only quite recently. Until people began to dream of and then to build such machines, there was no very pressing need to think deeply about them. Once people did begin to speculate about such tasks and machines, however, it soon became clear that many aspects of the world around us could benefit from an information-processing point of view. Most of the phenomena that are central to us as human beings (the mysteries of life and evolution, of perception and feeling and thought) are primarily phenomena of information processing, and if we are ever to understand them fully, our thinking about them must include this perspective.

The next point, which has to be made rather quickly to those who inhabit a world in which the local utility's billing computer is still capable of sending a final demand for $0.00, is to emphasize that saying that a job is "only" an information-processing task or that an organism is "only" an information-processing machine is not a limiting or a pejorative description. Even more importantly, I shall in no way use such a description to try to limit the kinds of explanations that are necessary. Quite the contrary, in fact. One of the fascinating features of information-processing machines is that in order to understand them completely, one has to be satisfied with one's explanations at many different levels.

For example, let us look at the range of perspectives that must be satisfied before one can be said, from a human and scientific point of view, to have understood visual perception. First, and I think foremost, there is the perspective of the plain man. He knows what it is like to see, and unless the bones of one's arguments and theories roughly correspond to what this person knows to be true at first hand, one will probably be wrong (a point made with force and elegance by Austin, 1962). Second, there is the perspective of the brain scientists, the physiologists and anatomists who know a great deal about how the nervous system is built and how parts of it behave. The issues that concern them, including how the cells are connected, why they respond as they do, and the neuronal dogmas of Barlow (1972), must be resolved and addressed in any full account of perception. And the same argument applies to the perspective of the experimental psychologists.

On the other hand, someone who has bought and played with a small home computer may make quite different demands. "If," he might say, "vision really is an information-processing task, then I should be able to make my computer do it, provided that it has sufficient power, memory, and some way of being connected to a home television camera." The explanation he wants is therefore a rather abstract one, telling him what to program and, if possible, a hint about the best algorithms for doing so. He doesn't want to know about rhodopsin, or the lateral geniculate nucleus, or inhibitory interneurons.
He wants to know how to program vision.

The fundamental point is that in order to understand a device that performs an information-processing task, one needs many different kinds of explanations. Part I of this book is concerned with this point, and it plays a prominent role because one of the keystones of the book is the realization that we have had to be more careful about what constitutes an explanation than has been necessary in other recent scientific developments, like those in molecular biology. For the subject of vision, there is no single equation or view that explains everything. Each problem has to be addressed from several points of view: as a problem in representing information, as a computation capable of deriving that representation, and as a problem in the architecture of a computer capable of carrying out both things quickly and reliably.

If one keeps strongly in mind this necessarily rather broad aspect of the nature of explanation, one can avoid a number of pitfalls. One consequence of an emphasis on information processing might be, for example, to introduce a comparison between the human brain and a computer. In a sense, of course, the brain is a computer, but to say this without qualification is misleading, because the essence of the brain is not simply that it is a computer but that it is a computer which is in the habit of performing some rather particular computations. The term computer usually refers to a machine with a rather standard type of instruction set that usually runs serially but nowadays sometimes in parallel, under the control of programs that have been stored in a memory. In order to understand such a computer, one needs to understand what it is made of, how it is put together, what its instruction set is, how much memory it has and how it is accessed, and how the machine may be made to run.
But this forms only a small part of understanding a computer that is performing an information-processing task.

This point bears reflection, because it is central to why most analogies between brains and computers are too superficial to be useful. Think, for example, of the international network of airline reservation computers, which performs the task of assigning flights for millions of passengers all over the world. To understand this system it is not enough to know how a modern computer works. One also has to understand a little about what aircraft are and what they do; about geography, time zones, fares, exchange rates, and connections; and something about politics, diets, and the various other aspects of human nature that happen to be relevant to this particular task.

Thus the critical point is that understanding computers is different from understanding computations. To understand a computer, one has to study that computer. To understand an information-processing task, one has to study that information-processing task. To understand fully a particular machine carrying out a particular information-processing task, one has to do both things. Neither alone will suffice.

From a philosophical point of view, the approach that I describe is an extension of what have sometimes been called representational theories of mind. On the whole, it rejects the more recent excursions into the philosophy of perception, with their arguments about sense-data, the molecules of perception, and the validity of what the senses tell us; instead, this approach looks back to an older view, according to which the senses are for the most part concerned with telling one what is there. Modern representational theories conceive of the mind as having access to systems of internal representations; mental states are characterized by asserting what the internal representations currently specify, and mental processes by how such internal representations are obtained and how they interact.

This scheme affords a comfortable framework for our study of visual perception, and I am content to let it form the point of departure for our inquiry.

As we shall see, pursuing this approach will lead us away from traditional avenues into what is almost a new intellectual landscape. Some of the things we find will seem strange, and it will be hard to reconcile subjectively some of the ideas and theories that are forced on us with what actually goes on inside ourselves when we open our eyes and look at things. Even the basic notion of what constitutes an explanation will have to be developed and broadened a little, to ensure that we do not leave anything out and that every important perspective on the problem is satisfied or satisfiable.

The book itself is divided into three parts. In the first are contained the philosophical preliminaries, a description of the approach, the representational framework that is proposed for the overall process of visual perception, and the way that led to it. I have adopted a fairly personal style in the hope that if the reader understands why particular directions were taken at each point, the reasons for the overall approach will be clearer.

The second part of the book, Chapters 2 to 6, contains the real analysis. It describes informally, but in some detail, how the approach and framework are actually realized, and the results that have been achieved.

The third part is somewhat unorthodox and consists of a set of questions and answers that are designed to help the reader to understand the way of thinking behind the approach (to help him acquire the right prejudices, if you like) and to relate these explanations to his personal experience of seeing. I have often found that one or other of the remarks set out in Part III has helped a person to see the point of part of the theory or to circumvent some private difficulty with it, and I hope they may serve a similar purpose here. The reader may find this section means more after having read the first two parts of the book, but an early glance at it may provide the motivation to take the trouble.

The detailed exposition comes, then, in Part II.
Of course, the subject of human visual perception is not solved here by a long way. But over the last six years, my colleagues and I have been fortunate enough to see the establishment of an overall theoretical framework as well as the solution of several rather central problems in visual perception. We feel that the combination amounts to a reasonably strong case that the representational approach is a useful one, and the point of this book is to make that case. How far this approach can be pursued, of course, remains to be seen.
