Tony Wang
02/02/2025
A Critique of Pure Vision
Patricia S. Churchland, V. S. Ramachandran, and Terrence J. Sejnowski
This is an interesting paper whose title pays homage to Kant’s “Critique of Pure Reason.” Rather than criticizing human vision itself, the authors criticize the notion of “pure vision” in computer vision.
In 1994, the Cretaceous period of computer vision, academia was largely shaped by the “Theory of Pure Vision.” This theory, which the authors refer to as an “orthodoxy,” holds that the goal of vision is to create a detailed internal replica of the visual world, built through hierarchical processing that operates independently of other sensory modalities and motor functions. Much computer vision research at the time was conducted under this view, and the authors challenge it with a series of arguments.
While these foundational ideas dominated computer vision, the strict “pure vision” doctrine was no longer entirely accurate.
Drawing on biology, psychology, and philosophy, the authors argue against the traditional hierarchical model. For example, they point to backprojections in the visual cortex, where higher-order brain regions modulate early visual processing. In addition, research has shown that motor control directly affects vision, fundamentally challenging the assumption of one-way, isolated visual processing.
This paradigm shift, from a passive, reconstructive model to an active, embodied, and predictive one, was a fundamental rethinking of both biological and artificial vision systems. Interestingly, this is exactly what robotics is still wrestling with today.
Robotics has always been closely connected with computer vision. Even in 2025, modern AI and robotics research continues to face similar issues in new forms. The rise of end-to-end deep learning has replaced traditional hand-crafted feature extraction. But as discussed on Canvas (https://canvas.upenn.edu/courses/1843097/discussion_topics/9979345), we still do not know how to integrate multi-modal perception, predictive modeling, and active control.
The 1994 critique of “pure vision” was a watershed moment, and its insights remain highly relevant today. Just as the early 90s saw a shift from static, hierarchical perception to interactive, embodied vision, modern AI is undergoing a transformation from static, supervised learning toward real-time, action-driven learning. Future advances in multi-modal learning, predictive modeling, and vision-language integration may continue to shape the next era of robot perception.
In essence, the lesson remains: vision is not an isolated process but a deeply integrated, active, and goal-driven function—one that extends far beyond pixels and into the very heart of intelligent behavior.
In their 1994 paper, “A Critique of Pure Vision,” Patricia S. Churchland, V. S. Ramachandran, and Terrence J. Sejnowski challenge the traditional view that the visual system operates independently to create a detailed internal representation of the external world. They propose that vision is deeply interconnected with other sensory modalities, motor planning, and prior knowledge, suggesting an interactive model of perception. (patriciachurchland.com, https://patriciachurchland.com/wpontent/uploads/2020/05/1994-Critique-Pure-Vision.pdf?utm_source=chatgpt.com)
Influential Preceding Work:
An earlier work that significantly influenced this paper is David Marr’s “Vision: A Computational Investigation into the Human Representation and Processing of Visual Information” (1982). Marr’s framework, which emphasizes a hierarchical and modular approach to visual processing, serves as a foundational reference point that Churchland and colleagues critique and build upon in their discussion of interactive vision.
Subsequent Impact:
A notable work that cites “A Critique of Pure Vision” is the 1997 paper titled “A Critique of Pure Audition” by Malcolm Slaney. In this paper, Slaney extends the discussion to the auditory domain, examining whether auditory perception also involves significant top-down processing and interactive elements, thereby broadening the implications of Churchland et al.’s critique beyond vision to sensory processing in general. (engineering.purdue.edu)
In summary, “A Critique of Pure Vision” serves as a pivotal work that challenges traditional models of sensory processing, advocating for a more integrated and interactive understanding of perception.