

 Visual Thinking in Education

( Visual-Verbal Language in Learning & Teaching )

by Craig Rusbult, Ph.D.

This paper, written mostly in 1995, was originally intended
to be (after appropriate revision) one section in the
introductory chapter of my PhD dissertation:

    This section will discuss the educational value of communication that combines visual and verbal representations, with a focus on the relatively underutilized visual aspects of this partnership.  It is commonly accepted that visual representations can serve a number of valuable functions, both affective and cognitive.  Although the affective functions of illustrations (such as motivating students) are often important, the current discussion will concentrate on cognitive functions, beginning with some ways in which meaning can be expressed visually.


Visual Representations: Literal and Symbolic

When discussing the communication of visual meaning it is useful to distinguish between literal representations that are intended to resemble the object they portray, and symbolic representations.  { These categories can also be labeled using other terms, such as realistic and abstract, respectively. }  Of course, there is a continuous range between literal and symbolic, and many pedagogically useful pictures are a complex mixture with characteristics that fit into both categories.

    To the extent that a representation is literal, the intended meaning is obvious.  But a learner must still form mental images of the object, and — especially with an unfamiliar object such as a biological cell — this may require some concept-formation or concept-restructuring.  Always, drawings are intended to be accurate in some respects but not others.  For example, in a textbook a drawing of a cell will be simplified, with the amounts and types of simplification depending on the instructional objectives; different types of cell drawings will be used for elementary school students, high school biology students, graduate students, or experienced scientists.  Similarly, if a drawing of a subway system is intended for use by passengers, it may show the proper sequence of stops but not accurate distances; but a drawing that is made for use by subway designers or construction workers will be more accurate, especially in those characteristics for which accuracy is important.
    Sometimes the meaning of literal must be carefully defined.  For example, in order to construct a literal interpretation for one type of depiction for a 3p atomic orbital, a viewer could imagine (contrary to actual possibility) that a "magic camera" can take multiple-exposure photographs of an electron that continuously remains in a 3p orbital; the result, if this could occur, would be the multiple-dot photo that is being viewed.  One potential difficulty with this depiction is that if a viewer misinterprets the ways in which the picture is and is not literal, the result will be a misconception about quantum mechanics.  On the other hand, if a viewer understands the picture — including a claim that "although quantum mechanics refuses to predict, based on Photo #1, where the electron will appear in Photo #2, it can make a statistically correct prediction for a multiple-exposure photograph" — the result will be a stronger, more accurate understanding of quantum theory.
    Winn (1987) discusses three types of symbolic visual representations — charts, graphs, and diagrams — and describes their position at the middle of a continuum between realistic pictures (which resemble what they represent) and words (whose symbolism is based on arbitrary convention): "From words they inherit the attribute of abstraction; but like pictures they exploit spatial layout in a meaningful way.  Their abstract nature makes them well suited to explaining how processes work where realistic pictures would fail. ...  Presenting information graphically allows students to scan it rapidly and quickly to discover patterns of elements within the diagram that are meaningful and that lead to the completion of a variety of cognitive tasks. ...  [This is especially useful] in mathematics and science where patterns and structures are themselves important properties of the content area. (pp. 152, 191)"  In symbolic visual representations, meaning is partially communicated by the spatial organization of information — by supplementing the symbolic meaning of individual elements with meanings implied by the spatial positions and spatial relationships of these elements.  In a chart, meaning can be conveyed in a number of ways, including the classification of items in categories such as those indicated by the rows and columns of a table.  In a typical x-y graph, some quantitative characteristics of an item are indicated by its location in the x-y space of the graph; in other types of graphs, magnitudes (or other characteristics and relationships) are conveyed in other ways.  
In a diagram there is greater freedom of expression, and a wide variety of symbolic conventions can be used:  conceptual closeness can be symbolized by spatial closeness;  inclusion in a category, or exclusion from it, can be shown with an enclosing box or a symbolic color scheme;  visual sequences can be explicitly stated with arrows, or implied by following the standard convention for verbal sequences (in European languages) of left-to-right and top-to-bottom ordering;  similarly, hierarchies can be implied by linking-lines and relative placement of elements.  And because verbal information is often incorporated into visual representations, in any diagram (symbolic or literal) meaning can be conveyed by verbal or typographical cues, such as captions, element-labels, and type size.


Theories of Interactive Visual-Verbal Learning

In an effort to explain the relatively efficient recall of pictures, Paivio (1978, 1986) proposed a theory of dual coding.  According to this theory there are two types of memory coding — in a verbal system and an image system.  Verbally presented material is encoded only in the verbal system, while visually presented material is encoded in both the verbal and image systems.  In contrast with the memory's "single coding" for text, pictures have "dual coding" in two types of memory codes; if these two codes provide more cues for recall, then it generally should be easier to remember pictures.  But, as pointed out by Schnotz (1993), "Graphics offer various advantages to the process of knowledge acquisition which go far beyond a mere memory effect."  Therefore, scientists have attempted to build theories that can explain the functions of verbal and visual information in helping learners construct their own mental models of the subject matter that is being portrayed in the verbal and visual material.
    Mayer (1993) outlines a framework, derived from Paivio's dual coding theory, for interpreting the cognitive processing of information that is presented both visually and verbally.  As shown in Figure 2.2, this framework postulates the formation of three types of mental "connections":   1) visual material is used to mentally form a visual representation, thus forming a connection between the external visual material and the internal visual representation;  2) verbal material is used to form a verbal representation, thus forming a verbal representational connection;  3) the learner builds referential connections between the visual representation and verbal representation.

A dual coding theory of learning from visual and verbal materials. (Mayer, 1993)  


This theory can be elaborated (Schnotz, Picard, & Hron, 1993; Schnotz, 1993) by interpreting the qualitative differences between verbal and visual representations in terms of their differing functions as symbolic and analog representations, respectively.  Verbal information, based on the meaning of individual words and the relationships implied by grammatical structures, is used to mentally construct a propositional symbolic representation, which can then be used to construct a mental model.  Visual graphics, which convey information by implied analogy between certain spatial characteristics of the graphic and characteristics of the content that is being described, can allow a more direct construction of a mental model.  Thus, "Texts and graphics are complementary sources of information insofar as they contribute in different ways to the construction of a mental model." (Schnotz et al., 1993, p. 183)  Along these same lines, Kirby (1993) describes a mental-models approach (such as that of Johnson-Laird, 1983) that "emphasizes the importance of connections between mental codes and ... a real-world model.  These real-world models take the form of mental images ... and the spatial mode of processing, from presented images, is thought to be the most efficient means of developing the appropriate representation.  The verbal mode is but one means of accessing that central representation, and perhaps an awkward one at that. (p. 203)"

     Based on a review of experimental and theoretical work in this area, Winn (1987) describes the cognitive value of visual representations and visually-oriented processing:

Visual representations use an entirely different type of logic based on the meaningful use of space and the juxtaposition of elements in a graphic.  ...  Graphic forms, conveying information by means of visual argument, can induce the use of cognitive processes that are themselves "visual" in some way.  ...  The advantage of using visual argument lies in the application, by students, of cognitive abilities that are particularly suited to what has to be learned.  ...  Graphic forms encourage students to create mental images that, in turn, make it easier for them to learn certain types of material.  ...  Presenting information graphically allows students to scan it rapidly and quickly to discover patterns of elements within the diagram that are meaningful and that lead to the completion of a variety of cognitive tasks.  ...  Charts, diagrams and graphs are effective in instruction because they allow students to use alternative systems of logic.  ...  Certain physiological strengths of learners, such as pattern recognition and the ability to recognize geometric shapes, as well as the advantages of "right-brain" processing, can be exploited.  (pp. 156, 157, 158, 159, 160)

These visually-oriented cognitive processes include, but are not limited to, the construction of mental models.

    In a dual-coding model such as Mayer's framework in Figure 2.2, there are three types of connections: visual, verbal, and referential.  Kirby (1993) discusses the contexts in which interactions between visual and verbal information are collaborative (to support learning) or competitive (to inhibit learning).  First, with difficult tasks where visual and verbal memory-encoding is not sufficiently automated, there can be "interference" due to competition for the limited executive resources needed to control the two types of processing.  Second, some information is easier to process in a particular mode, either visual or verbal; if a teacher attempts to force some of this information into the less effective mode, it can distract a learner from in-depth processing in the more efficient mode.  Similarly, some types of tasks may be facilitated by thinking in a visual mode, while for other tasks a verbal mode will be more effective.  Finally, individual learners differ in their affinities — with respect to abilities, prior knowledge, strategies, and interests — for visual and verbal learning styles.
    Kirby recommends using verbal memory-coding for some information in some situations, visual coding in other contexts, and forming referential connections between these codes — which is possible "when there is some degree of overlap or redundancy in the two sets of information" — so that each type of coding can access the other.  He emphasizes the value of verbal-visual instruction to "teach students how to perform conjoint processing optimally," and concludes his paper with a description of "integratability" and "conjoint education":

Conjoint processing can be seen as elaborative processing, increasing the connections among codes at a given level of abstraction, without increasing the level of abstraction.  Deeper processing, which usually occurs within the same mode as the original coding (i.e., verbal or visual), changes the level of abstraction.  Increases in depth or elaboration are ways of enhancing the memorability of information.  Elaboration works by increasing the number of retrieval cues, while depth works by chunking, that is, by reducing the number of independent codes required (by packing more information into each code).  Both are effective, but in different situations.  Students may need to be taught more about same- and other-mode processing, including when and how to use each on its own, and when and how to use them collaboratively.


Interpretation of Language, and Visual-Verbal Education

With a written text, accurate communication of ideas between an author and reader depends on the existence of a set of shared assumptions about the meaning of the verbal symbols that comprise the text.  Similarly, the quality of visual communication depends on the visual symbols shared by an artist and viewer.  The set of shared assumptions concerning the meaning of symbols — including a vocabulary for individual symbols and "grammatical rules" for combining symbols with each other in various ways — can be considered a culturally constructed language, whether this language is based on symbols that are verbal or visual.  Whether communication is verbal or visual, achieving a "language match" between the information sender (author or author/artist) and receiver (reader or viewer) is essential.
    I find it useful to imagine the visually mediated communication of ideas as a two-step process of encoding-and-decoding:  mental-to-visual, followed by visual-to-mental.   First, THE AUTHOR/ARTIST'S MENTAL MODEL OF A SYSTEM is encoded, by using analogy to move from conceptual characteristics (mental) to spatial characteristics (visual), to a DIAGRAM that is a symbolic visual representation of the system.  Second, this DIAGRAM is decoded by a viewer, using analogy to move from spatial characteristics (visual) to conceptual characteristics (mental), to form THE VIEWER'S MENTAL MODEL OF THIS SYSTEM.
    A similar process of encoding-and-decoding occurs in verbal communication, where a verbal text (analogous to a visual diagram) is used as an intermediary.  With either visual or verbal communication, there is not a direct correspondence between the original system and the mental model formed by a learner.  Instead, an understanding of the system, including its conceptual characteristics and their integration into larger structures of domain knowledge, is filtered through several layers of interpretation: there must be a perception of the system and formation of a mental model by the artist or author, an encoding of this mental model to make a visual or verbal representation based on a culturally constructed language, and a decoding by the learner to form a personally customized mental model.  An improvement in any of these steps can facilitate improved learning.
    As expected, the interpretation of visual language is a skill that depends, to some extent, on experience in a particular domain of knowledge.  For example, in a study of how mental representations are constructed from a weather map diagram, Lowe (1993) found distinct differences between the performance of professional meteorologists and non-meteorologists.  While the non-meteorologists focused on superficial, domain-general, visuo-spatial features, the meteorologists were more skillful at selecting those visual features that are essential for developing an understanding of the state of the weather system being depicted.  The non-meteorologists could recognize spatial patterns in the diagram, but they were not proficient at translating this spatial knowledge into weather knowledge.  The meteorologists, due to their deeper understanding of the concepts and visual symbolism associated with weather maps, were better able to decode the semantic analogies (between the visuo-spatial characteristics of the diagram and the physical characteristics of the weather system) that were encoded into the maps by the map-makers.
    Many educators believe that skills of visual interpretation, such as those used by the experienced meteorologists, can be taught in schools.  Moore (1993) describes a program that adapts reciprocal teaching (Palincsar & Brown, 1984) for instruction in visual skills.  Reciprocal teaching, a metacognitive training approach typically aimed at enhancing verbal skills through the use of reciprocal interactions between experts and novices in explicit demonstrations of strategy use, is designed to help students learn how to plan, monitor, and evaluate their own learning strategies and outcomes.  In the version adapted for instruction in visual skills (Moore, 1993), students are urged to metacognitively employ a repertoire of "SLIC" strategies for Summarizing, Linking diagrams with text, Imaging, and Checking for understanding.  To help students develop a deeper, broader range of skills in visual-verbal learning, these strategies could be explicitly developed and practiced in a variety of domains, using a variety of diagrams.  Peeck (1993) also discusses the modification, to include visual skills, of programs originally intended to help students focus their attention on skills for verbal processing.  Specifically, Peeck recommends "adding to the learning material specific instructions and tasks that require desirable learning activities, such as intensive processing of the pictures. (p. 233)"  These instructions and tasks can be: general directions to "pay attention to the illustrations"; specific directions about what to look for, in general or in a particular picture; or an assignment that requires students to actively construct a response or product based on their interpretation of an illustration.


Research on the Pedagogical Effectiveness of Visual Representations

Research on visual representations has produced mixed results.  Commenting on this, Peeck (1993) says, "There is therefore a good deal of ambivalence and paradox in the position of text illustrations in the educational process.  On the one hand, there is a general acknowledgment of their potential value, as their continuing and probably increasing presence in educational material testifies; on the other hand, there is plenty of reason to regard their effects with realistic pessimism. (p. 228)"
    Some of the mixed results, especially in early studies, can be explained by a lack of attention to detail in designing experiments, or by inadequate interpretations of observations.  For example, Levin & Mayer (1992) describe the confusion caused by not distinguishing between the stages of "learning to read" (at this time, illustrations are often detrimental because they can act as a crutch for students who would rather not depend on obtaining meaning from the text) and "reading to learn" (at this time, when students can read skillfully, illustrations that supplement text can improve comprehension and retention).  Similarly, Winn (1987) cites research by Holliday (1976) in which a diagram accompanied by text was less effective than the diagram by itself, evidently because students tend to ignore a diagram (instead of studying it intensely because it's all they have) if they believe they can get all the information they need by only reading the text.  Kirby (1993), as discussed above, might describe this as a "competitive" effect caused by distracting the focus of attention away from the processing mode that would be most effective.
    Winn (1987) and Peeck (1993) suggest that research should be interpreted by carefully considering the effect of three types of factors: treatment, learners, and task.  As with any instructional technique, the effectiveness of a diagram will depend on the entire context of treatment, including the diagram characteristics (e.g., is it realistic or symbolic) and quality (has the artist expressed the content clearly and appropriately), the support system (such as the "reciprocal teaching of SLIC" or "specific instructions and tasks" discussed above), the classroom environment, whether the treatment is well designed to achieve the educational objectives, and other relevant considerations.  It is also essential to consider characteristics of the learners, such as affinities (abilities, strategies, experience, field dependence, locus of control, interests, preferences) for learning and thinking in visual and verbal modes, prior knowledge of the domain being studied, attitudes toward schoolwork and the subject domain, and so on.  Also, is the evaluative task appropriate for the "treatment and learners" situation, and does it really indicate the extent to which the educational objectives (conceptual understanding, acquisition or improvement of skills, retention, transfer, ...) have been achieved?
    The current consensus of scholars is that, when interpreting past research and planning future research, the objective should be to determine, with greater precision, how the effectiveness of various types of visual-verbal instruction depends on the context in which they are used, and how we can develop teaching methods that are more effective.  In making these evaluations, a wide range of relevant criteria — including the characteristics of the visual-verbal instruction, the nature of the learners and instructional environment, and the educational objectives — should be carefully considered.


{ sources for the citations in this paper will be provided here later }
