Color Perception (in 3000 Words)


Austen Clark
Department of Philosophy
University of Connecticut
Storrs, CT 06269-2054

In William Bechtel and George Graham (eds), A Companion to Cognitive Science. Oxford: Blackwells, 1998.



A neighbor who strikes it rich evokes both admiration and envy, and a similar mix of emotions must be aroused in many neighborhoods of cognitive science when the residents look at the results of research in color perception. It provides what is probably the most widely acknowledged success story of any domain of scientific psychology: the success, against all expectation, of the opponent process theory of color perception. Initially proposed by Ewald Hering, a nineteenth-century physiologist, it drew its inspiration from the existence of opposing muscle groups. Hering thought that analogous opposing processes could explain some aspects of color perception, but the resulting theory was more complicated and less intuitive than that proposed by the great Hermann von Helmholtz. Helmholtz carried the day, but in the long run Hering turned out to be right.

The Opponent Process model

How opposing muscles might cast light on color perception takes some explaining. It helps first to allocate descriptive vocabulary to three distinct levels: the physics of stimuli, the physiology of receptors, and the psychology of post-receptoral processes. The visible spectrum is linearly ordered by wavelengths ranging (for humans) from approximately 400 to 700 nanometers (nm). Newton's well-known experiments with prisms yielded the spectral hues, or hues each produced by a particular wavelength of electromagnetic radiation within the spectrum. Sunlight is a mixture of light of all those different hues. Hues that when mixed yield white are called complements. But the ordering of colors is complicated immediately by the existence of extra-spectral hues--hues not found in the rainbow, such as the purple needed to connect the endpoints, or colors such as brown. Even the so-called unique red--a red which is not at all yellowish and not at all bluish--is found nowhere in the spectrum. Furthermore we find that a given spectral hue can be matched by light composed of many different combinations of wavelengths, and that there is no simple rule of physics that yields all and only the combinations that match in hue. Those matching combinations are called metamers. The existence and constitution of metamers is an entry-level puzzle that any theory of color perception must explain.

Details of the physiology of receptors can help. The optic writers after Newton confirmed that if one chose carefully, any spectral hue could be matched using just three different lights--three different primaries--in different intensities. One had to take care that none of the primaries was a complement of another, and that none could be matched by a combination of the others. Thomas Young took cognizance of this trichromatic character of human color vision, noted that the retina of the eye was limited in surface area, and proposed in 1801 that the retina contained exactly three different types of color sensitive elements. The many different hues manifest in visual experience could be produced by suitable combinations of outputs of those three. Young's deduction was basically correct; there are three classes of cones in the normal human retina, which differ in the parts of the spectrum to which each is optimally sensitive. Short wavelength (S) cones are optimally sensitive to radiation of about 430 nm, middle wavelength (M) cones to about 530 nm, and long wavelength (L) cones to about 560 nm. Each photopigment will absorb photons of other wavelengths, but with less reliability.

The properties of retinal receptors can explain many of the facts of color mixing and matching. Knowing the absorption spectrum for each of the three classes of cones, and the energy spectrum of light entering the eye, one can calculate the likely number of absorptions in each of the three systems S, M, and L. This yields a point in a three dimensional wavelength mixture space, whose axes are numbers of absorptions in the three cone systems. Since the visual system has no inputs other than the absorptions in its receptors, stimuli that yield the same point in wavelength mixture space will match. Complex combinations of wavelengths can be treated as vector sums; as long as the combination, no matter how complex, eventually arrives at the same point, you have a metamer. The various laws of color mixing--Grassmann's laws--have an algebraic flavor, with "+" standing for "mix together" and "=" for "match." For example, if A = B and C = D, then A + C = B + D. With retinal physiology better understood, all these laws can be interpreted literally, with "=" now meaning equal numbers of absorptions in S, M, and L cone systems. Sums become vector sums. Grassmann, who was a mathematician, would be pleased. Physics here fails us, but the physiology of the retina allows us to write simple rules for the constitution of metamers.
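
The bookkeeping behind wavelength mixture space can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a model of real photopigments: the peak sensitivities are the ones given above, but the Gaussian shape of the absorption curves, their 40 nm width, and the three primaries at 450, 540, and 610 nm are all hypothetical choices made for the example. The sketch maps a spectrum to a point (S, M, L), solves for a three-primary mixture that lands on the same point as a flat broad-band light, and then checks one consequence of the additivity law: adding the same light to both sides of a match preserves the match.

```python
import math

# Peaks are from the text; the Gaussian shape and 40 nm width are
# illustrative assumptions, not measured absorption spectra.
CONE_PEAKS_NM = {"S": 430.0, "M": 530.0, "L": 560.0}
ASSUMED_WIDTH_NM = 40.0


def sensitivity(cone, wavelength_nm):
    """Relative probability that a cone class absorbs a photon."""
    z = (wavelength_nm - CONE_PEAKS_NM[cone]) / ASSUMED_WIDTH_NM
    return math.exp(-0.5 * z * z)


def absorptions(spectrum):
    """Map a spectrum {wavelength_nm: energy} to a point (S, M, L)."""
    return tuple(
        sum(energy * sensitivity(cone, wl) for wl, energy in spectrum.items())
        for cone in ("S", "M", "L")
    )


def mix(a, b):
    """Superimpose two lights: energies add wavelength by wavelength."""
    out = dict(a)
    for wl, energy in b.items():
        out[wl] = out.get(wl, 0.0) + energy
    return out


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def match_with_primaries(target, primaries_nm):
    """Solve (by Cramer's rule) for primary intensities that hit `target`."""
    # Column j holds the (S, M, L) absorptions of one unit of primary j.
    cols = [absorptions({wl: 1.0}) for wl in primaries_nm]
    matrix = [[cols[j][i] for j in range(3)] for i in range(3)]
    d = det3(matrix)
    weights = []
    for j in range(3):
        replaced = [row[:] for row in matrix]
        for i in range(3):
            replaced[i][j] = target[i]
        weights.append(det3(replaced) / d)
    return weights


# A flat broad-band light sampled every 10 nm from 400 to 700 nm.
white = {float(wl): 1.0 for wl in range(400, 701, 10)}
target = absorptions(white)

primaries = (450.0, 540.0, 610.0)  # hypothetical primaries
weights = match_with_primaries(target, primaries)
mixture = dict(zip(primaries, weights))

# Two physically different spectra, one point in wavelength mixture space.
print("broad-band light:", target)
print("three primaries: ", absorptions(mixture))

# Additivity: adding the same light to both sides preserves the match.
extra = {500.0: 2.0}
print(absorptions(mix(white, extra)))
print(absorptions(mix(mixture, extra)))
```

Under these assumed curves the two printed points agree to within rounding error, even though one light contains thirty-one wavelengths and the other only three. A target outside the gamut of the chosen primaries would yield a negative weight, which corresponds to adding that primary to the target side of the match instead.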

Retinal details do not constitute a theory of perception, but they do suggest a simple and intuitive model. From the retina there proceed three channels of chromatic information (three "fibres"), one corresponding to each class of cone. These three channels are combined centrally to yield sensations of color. Such was the proposal of the Young-Helmholtz theory. References in the older literature to the S cones as "blue cones" are probably holdovers from this long dominant theory.

In contrast, Hering's opponent process theory is complex and counterintuitive. Hering thought that there are four fundamental colors, organized in two pairs: red vs. green, and blue vs. yellow. Hue information could be carried in just two channels, one for each such pair. Each channel takes inputs from at least two of the classes of cones, and has an opponent organization, being excited by inputs from some classes of cones and inhibited by inputs from the others. In this model, no cone is a "blue cone", since blue only arises in a more central process, requiring inputs from at least two classes of cones. In addition, the model proposes a third, achromatic channel, which sums inputs from all three cones.
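
The difference between the two architectures can be made concrete with a minimal sketch. The function below recodes three cone signals into two chromatic opponent channels and one achromatic channel. The particular weights, and the decision to let positive values stand for red and for blue, are placeholder assumptions chosen for readability; the theory as described here says only that each chromatic channel is excited by some cone classes and inhibited by others, and that the achromatic channel sums all three.

```python
def opponent_channels(s, m, l):
    """Recode cone signals (S, M, L) into opponent form.

    Weights are hypothetical. Returns (red_green, blue_yellow, achromatic),
    where positive red_green means reddish and positive blue_yellow
    means bluish.
    """
    red_green = l - m                  # excited by L, inhibited by M
    blue_yellow = s - (l + m) / 2.0    # excited by S, inhibited by L and M
    achromatic = s + m + l             # sums inputs from all three classes
    return red_green, blue_yellow, achromatic


# Equal cone signals leave both chromatic channels at their neutral point.
print(opponent_channels(1.0, 1.0, 1.0))

# The same S cone signal can end up on either side of the blue-yellow
# channel, depending on what the M and L cones are doing.
print(opponent_channels(0.5, 0.2, 0.2))
print(opponent_channels(0.5, 0.8, 0.8))
```

The last two calls illustrate why, on this model, no cone is a "blue cone": an identical S signal is bluish in one context and yellowish in the other.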

It is important to recognize that Helmholtz and Hering could agree that the retina contains three distinct classes of cones, and could agree on all the facts about color mixing and matching. Any similarities between colors that could be explained by similarities of retinal processes would also fail to distinguish between the theories. They agree on what matches what. Their dispute concerns only processes that commence beyond the retina; they propose differing organizations for post-receptoral processes. How might one distinguish between such theories?

The answer lies in other aspects of the qualitative similarities among colors. If for convenience one shifts to colored paint chips, and tries to arrange the chips so that their relative distances correspond to their relative similarities, one finds that hues are not linearly ordered as along the spectrum, but rather form a circle (a hue circle), with extra-spectral purples connecting the spectral reddish blues to the long wavelength reds. The center of the circle will be achromatic--some point on the gray scale, which matches the lightness of all the chips in the circle, but is neither red, nor green, nor yellow, nor blue. The distance of a chip from that center reflects the saturation of the hue--roughly, the extent to which the hue of the chip is mixed with white. Colors on the end points of a diameter are complements; their hues cancel to yield the achromatic center. Each hue circle is two dimensional, and is typically coordinatized with hue as the angular coordinate, saturation as the radius. To capture the entire gamut of colors that humans can perceive, one must construct hue circles of different lightness levels, from white to black, and then stack them one on top of the other. The entire order is hence three dimensional, with dimensions of hue, saturation, and lightness.

Hering hypothesized that hue cancellation was due to opposing physiological processes. Processes set in motion by some stimuli could be inhibited by others. This requires that each opponent process receive inputs from more than one class of cone, and that some are excitatory, others inhibitory. Instead of using angular coordinates, the organization of the hue circle could be captured by two orthogonal opponent processes, one running from red through the achromatic center to green, and the other from blue to yellow. The neutral point--baseline activation--of the red-green process yields a hue neither red nor green, found at the achromatic center point; and similarly for the yellow-blue process. If one of the opponent processes is neutral, excitation of the other yields one unique hue, and inhibition yields its complement. So if yellow-blue is quiescent, we get a color sensation of either unique red or unique green--the hues at the endpoints of that opponent process axis. Yellow and blue are the other unique hues; the remaining hues are binary, or produced by combinations of activation and inhibition of the two opponent processes.
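
The relation between the two coordinate systems for a hue circle can be sketched as follows. The code treats the activations of the two opponent processes as Cartesian coordinates, converts them to a hue angle and a saturation radius, and labels a point as achromatic, unique, or binary. The sign conventions (positive for red and for yellow), the zero-degree direction, and the tolerance are assumptions of the sketch, and the numbers are arbitrary units rather than measured responses.

```python
import math


def to_hue_circle(red_green, yellow_blue):
    """Convert opponent activations to (hue angle in degrees, saturation)."""
    angle = math.degrees(math.atan2(yellow_blue, red_green)) % 360.0
    radius = math.hypot(red_green, yellow_blue)
    return angle, radius


def describe(red_green, yellow_blue, tol=1e-9):
    """Name a hue from the state of the two opponent processes."""
    rg = "red" if red_green > tol else "green" if red_green < -tol else None
    yb = "yellow" if yellow_blue > tol else "blue" if yellow_blue < -tol else None
    if rg is None and yb is None:
        return "achromatic"      # both processes at their neutral point
    if rg is None:
        return "unique " + yb    # red-green process is neutral
    if yb is None:
        return "unique " + rg    # yellow-blue process is neutral
    # Both processes active: a binary hue such as orange or purple.
    return ("reddish " if rg == "red" else "greenish ") + yb


print(describe(1.0, 0.0))    # yellow-blue quiescent, red-green excited
print(describe(0.5, 0.5))    # orange, described as a reddish yellow
print(describe(0.0, 0.0))    # the achromatic center

# Complements sit at opposite ends of a diameter, and their
# activations sum to the neutral point of both processes.
hue = (0.3, -0.4)
complement = (-0.3, 0.4)
print(to_hue_circle(*hue), to_hue_circle(*complement))
print(describe(hue[0] + complement[0], hue[1] + complement[1]))
```

On this picture hue cancellation is simply vector addition: a hue and its complement differ by 180 degrees, and mixing them returns both processes to baseline.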

None of these facts about the qualitative similarities among the colors follow from the facts of color mixing and matching. From retinal-based explanations we get at best receptoral similarity, or proximity within wavelength mixture space. But the perceptual similarities of colors do not map in any simple way onto such receptor-based similarities. The orientations of the opponent axes in color space and the consequent identities of the unique hues are not determined by color mixing and matching, or even by the structure of perceptual similarities among colors. Many pairs of colors are complements, and so as far as mixing and matching data go, could serve equally well as endpoints of the opponent axes. Even though the model is a model of color perception, it proposes principles of organization that lie rather deep within the physiology of the organism, remote from direct empirical test.

Contemporary Successes

Largely for this reason the simpler Young-Helmholtz theory continued to dominate until the 1950s, when the team of Leo Hurvich and Dorothea Jameson began formulating quantitative versions of the opponent process model (see Hurvich 1981). They demonstrated the robustness of hue cancellation, and devised a technique to derive "chromatic response functions" from such experiments, and to use them to predict the appearances of broad-band stimuli. Quantitative links were proposed between opponent processes and the absorption spectra of the three cone systems. The model gives a simple explanation for the various patterns of color vision deficiency: why, for example, dichromats (who can match any hue they can see with mixtures of just two primaries) either confuse reds and greens, or (more rarely) confuse yellows and blues. They lose one or another opponent process. If you carve nature at the joints you have also picked the places where things are most likely to break, and so the theory does.

Besides hue cancellation, the best evidence for the identity of unique hues came from research in color naming and from cross-cultural linguistic evidence (the "Berlin-Kay" hypothesis; see WORD MEANING). Individuals can readily describe all the hues in the color circle using just the four terms for the four unique hues: red, green, yellow, and blue. If prevented from using those terms, description requires a larger and more complex vocabulary. Orange is a reddish yellow, but it is hard to see red as an orangish purple.

Opponent process theory thus had some compelling success in explaining psychological data, but it was physiology that finally gave investigators confidence that the theory described real processes in the nervous system. Russell De Valois and collaborators found cells in the lateral geniculate nucleus (LGN) of the macaque monkey whose spiking frequencies were spectrally opponent--excited by some wavelengths, inhibited by others. With this the race was on, and physiological details burgeoned. Now it is known that spectrally opponent cells are rife throughout the parvocellular pathways of the LGN and the termination points of those pathways in primary visual cortex (area V1). These spectrally opponent cells, as predicted, fall into discrete types, with differing inhibition/excitation curves and neutral points.

With all this progress we have yet to identify cells anywhere in the brain that behave in precisely the fashion proposed by opponent process theory. One difficulty lies in reconciling spectral opponency with the spatial organization of receptive fields of the cells that have been found. For example in V1 the typical spectrally opponent cell has a center/surround receptive field, with the center excited by just one type of cone, and the surround inhibited by outputs from the other two, or even from all three. They might be L+ in the center, M- and S- in the surround. In fact one finds a bewildering variety of different arrangements. Furthermore all the opponent cells so far found respond to achromatic stimuli. (Interestingly, in the early flush of enthusiasm, L+M- cells in the LGN were often labeled "R+G-", as if the locus of opponent processes had been found. This label will probably go the way of the earlier Helmholtz-inspired "blue cone, green cone" terminology.) Opponent cells of various sorts have been identified in secondary visual areas as well, through at least V4, but even those fail one or another of these tests. In the near future watch for the identification of the cortical locus of pure opponent processes. Failing such identification, watch for revisions in the model. The notion of opponency itself might be modified, so as to take into account the noted spatial organizations. Loci proposed thus far have all been too peripheral; as they are marched inwards, the complexity of the processes identified can only increase.

The sudden receipt of confirming evidence from a different discipline is one way to endow what were merely theoretical entities with reality. Another is to start using them as tools in other experiments. We find color researchers doing the latter as well. Stimuli constructed using the assumptions of opponent process theory are used to test other perceptual models. One example is provided by the use of "equiluminant" stimuli--stimuli that match in luminance, and differ only in wavelength composition. The borders between such stimuli are visually indistinct, or at least much less distinct than those across which there is also a change in luminance. Perhaps some of the modules involved in edge detection, perception of shape, or spatial perception are "color blind", or insensitive to pure differences in wavelength composition. Equiluminant stimuli can in this way be used to test for modularity--although results thus far are subject to an ongoing debate (see PERCEPTION). Another example is provided by the construction of special stimuli to test the identities of opponent process channels. Such channels are just made of neurons, and neurons generally adapt, or slow their response to constant input. Perhaps one could construct a stimulus that would selectively adapt the neurons in a post-receptoral channel without adapting the receptors. To do this, the stimulus must change in wavelength over time so as not to affect any one class of cones in a constant fashion, yet constantly affect opponent cells that take input from all those classes of cone. The strategy is similar to that of isolating a specific muscle group when working out with weights; here the muscles are remote from the periphery. Such "second site" adaptation effects have been found, and have been used to test for channels.

As a final testimony to the reality of formerly theoretical entities, recent developments in genetics have provided a totally unexpected route to the confirmation of some opponent process claims. In the past few years it has become possible to sequence the genes for the various photopigments in the cones of many different species. Color sensations leave no fossils, but the similarities of such sequences among existing species can reveal evolutionary kinships and divergences. Genes for the various photopigments diverged at different times. The photopigments in M and L cones are relatively similar to one another, and diverged relatively recently, perhaps 30 million years ago (Mya). But both the S cone photopigment, and the common ancestor of M and L, are much older, diverging some 530-670 Mya. The earliest mammals probably had two cone photopigments. Our ancestors in Paleozoic and Mesozoic times were at best dichromats. They could have had only one opponent process system, corresponding to yellow-blue; adding the red-green system required a third cone photopigment. It appears among mammals as a relatively recent acquisition, found only in some primates (see Jacobs 1993).

That gene sequencing can contribute to a model of color perception testifies to the robustness of the model. It has led to an intriguing recent revision in Young's trichromatic retina: several variants of human M and L photopigments have been found. Their peak sensitivities vary in discrete steps. DNA sequencing of an individual can help identify that individual's particular variants of M and L photopigments, which are in turn highly correlated with that individual's color matching in red/green (see Neitz, Neitz and Jacobs 1993). Gene sequencing and better techniques for measuring photoreceptor sensitivity have also led to an explosion of results in the study of comparative color vision. Many species have much better color vision than any of the rather impoverished Old World primates. Some have color "hyperspaces" of more than three dimensions, and are tetra- or even penta-chromatic (see Thompson 1995).

Future Directions

One fly in the ointment has already been mentioned: the failure so far to identify the precise neural locus for opponent processes. It should be emphasized that opponent process theory continues to change and grow, partly to explain some of the puzzling aspects of the standard model. The spatial organization of spectrally opponent receptive fields has already been mentioned. Why spectrally opponent cells should have this spatial organization is a puzzle; their responses would seem to confound luminance and chromatic information. Furthermore, the sheer variety of different spectral and spatial organizations is bewildering, and it is difficult to see how to fit them all into a single stage model.

Russell and Karen De Valois (1993) show how sophisticated recent variants of opponent process theory have become. They make a virtue out of the variety of opponent organizations, demonstrating that if there is a third stage of processing, at which outputs of various types of cone opponent cells are combined, one can completely separate chromatic and luminance information. This also quite neatly solves the puzzle about the spatial organization of receptive fields. The model is quantitative and tied directly to neuroanatomy. Testing and elaboration of models of this sort is the wave of the future.

Scanning those waves, one spies a final, philosophical puzzle. It may or may not be flotsam. Some contemporary philosophers urge that models which purport to explain color appearance in fact do no such thing. They all fail to explain the qualitative character of chromatic experience. The failure is allegedly simple to demonstrate: all those models would, it is said, be true of a functionally equivalent zombie--an entity that makes the same discriminations as a person, and has internal machinery that functionally mimics the human nervous system, but whose internal states lack qualitative character altogether. Or perhaps those internal states have qualitative character, but of a different sort than ours, so that those models would be true of someone who suffers from inverted qualia. But if the models would be true of someone whose internal states lack qualitative character or have a qualitative character that differs from ours, then such models do not explain the qualitative character that our states have. They might explain how humans discriminate this stimulus from that stimulus, but they do not explain what it is like to see red. This problem, like any philosophical problem, is easier to get into than to get out of (see CONSCIOUSNESS). Part of the difficulty lies in understanding what it would mean to explain what it is like to see red, and this conceptual issue is one that current empirical models are unlikely to touch. Various answers to it have been proposed, but they all suffer from the rhetorical disadvantage of being much more complicated than the question itself. Perhaps a simple answer to this seemingly simple question can be devised. Or perhaps intuitions can be altered, so that the question no longer seems simple. Either would be progress.

References

De Valois, R. L. and De Valois, K. K.: 'A multi-stage color model', Vision Research, 33 (1993), 1053-65.

Hurvich, L. M.: Color Vision (Sunderland, Mass.: Sinauer Associates Inc., 1981).

Jacobs, G. H.: 'The distribution and nature of colour vision among the mammals', Biological Reviews, 68 (1993), 413-71.

Neitz, J., Neitz, M., and Jacobs, G. H.: 'More than three different cone pigments among people with normal color vision', Vision Research, 33 (1993), 117-22.

Thompson, E.: Colour Vision (London: Routledge, 1995).

Recommended reading

Hardin, C. L.: Color for Philosophers (Indianapolis: Hackett Publishing Company, 1988).

Hurvich, L. M.: Color Vision (Sunderland, Mass.: Sinauer Associates Inc., 1981).

