Multimodal discourse analysis

Authored by: Gunther Kress

The Routledge Handbook of Discourse Analysis

Print publication date:  November  2011
Online publication date:  June  2013

Print ISBN: 9780415551076
eBook ISBN: 9780203809068
Adobe ISBN: 9781136672927




What is multimodal discourse analysis?

The history of discourse analysis is beset by a vagueness around the homonym ‘discourse’. The term names a large territory, located somewhere between two ‘markers’, which might, generally speaking, be something like ‘providing accounts of connected stretches of language in use’ and ‘uncovering salient social, political, psychological features in text-like entities’. In sociolinguistics, by and large, the major emphasis has been on understanding the link between (environments of) language use and (features of) the language used (Hymes, 1964; Labov, 1966, 1972; Bernstein, 1984). In such work, ‘the social’ and its meanings and effects are foregrounded: who speaks, to whom, when, with what purposes, in what ways. These factors and the purposes leave traces – whether the details of pronunciation in Labov's early work or the regularity of use of a certain range of linguistic resources, leading to the development of the notion of ‘codes’ in Bernstein's theory.

In more linguistically rather than sociolinguistically or sociologically oriented approaches, the emphasis has been on seeing whether regularities of a ‘formal’ kind could be discerned in ‘stretches’ of speech and writing ‘above’ the sentence, somewhat akin to those that linguistics had been able to establish in relation to the sentence in the 1970s – whether in mid-century American structuralism or in Chomskyan conceptions of the organization of language at or below the level of the sentence. For that latter kind of work, the term text-linguistics – rather than discourse analysis – has been commonly used in the ‘mainstream’ of linguistics (van Dijk, 1977; Wodak and Meyer, 2001). In between these there are countless positions, as the distinct takes – and histories – of contributors to this volume show. There were those who, like myself, had become interested in the expression of power, ‘knowledge’ in and through language (Kress and Hodge, 1979; Fowler et al., 1979; Hodge and Kress, 1993), for whom Foucault's use of the term discourse (Foucault, 1981; Kress, 1984/1989; Fairclough, 1992; Gee, 1999, 2008) provided an important means of extending the investigation of the relation of ‘social givens’ and language. In the writings of Foucault, discourse as institutionally produced ‘knowledge’ is a social rather than a linguistic category; the social is taken as the generative ‘source’ of meaning.

Given the range of uses just described, the terms text and discourse have frequently been used more or less interchangeably, as names for ‘extended stretches of speech or writing’ as well as pointing to the social meanings ‘inherent’ in such texts. Discourse has been readily used in relation to the (political/philosophical) approach of Foucault (Foucault, 1981; Kress, 1984/1989; Fairclough, 1992; Gee, 1999, 2008); or as a characterization of social interaction as the means to establish consensual knowledge, as in the work of Habermas (1984); or to name meanings of the social much more generally (as in the work of Labov, or of Hymes); or in the relatively formal approach of Sinclair and Coulthard (1975) to the organization of linguistic interaction in classrooms. The plethora of uses has blurred the meanings of the term discourse (and of the phrase discourse analysis) and has made its use as a descriptive and analytical tool problematic.

That leaves a question about two other terms: ideology and text. I use the former as the name for the specific configuration of discourses present in any one text. Text, in my approach, is the material site of emergence of immaterial discourse(s). The etymology of the word text draws attention to the result of processes of ‘weaving’ together differing ‘threads’ – usually assumed to be either speech or writing – into a coherent whole. ‘Weaving’ implies a ‘weaver’ who has a sense of coherence. In multimodal discourse analysis (MMDA) – as in others – the question of who the ‘weaver’ is, and what forms of ‘coherence’ are shaped by her, him, or them, is a significant issue at all times.

In MMDA the textual ‘threads’ are many and they are materially diverse: gesture, speech, image (still or moving), writing, music (on a website or in a film). These, as well as three-dimensional entities, can be drawn into one textual/semiotic whole. Text, in MMDA, is a multimodal semiotic entity in two, three or four dimensions: as when students in a science classroom make a 3D model of a plant cell, or when they perform a play scripted by them in a literature classroom (Franks, 1995, 1997; Franks and Jewitt, 2001). Texts, of whatever kind, are the result of the semiotic work of design, and of processes of composition and production. They result in ensembles composed of different modes, resting on the agentive semiotic work of the maker of such texts.

Texts realize the interests of their makers. A text is (made) coherent through the use of semiotic resources that establish cohesion both internally, among the elements of the text, and externally, with elements of the environment in which texts occur (Halliday and Hasan, 1976; van Leeuwen, 2005; Bezemer and Kress, 2008; Kress and Bezemer, 2009). In the semiotic work of interpretation, the internal re-making of the text, the interpreter of a semiotic entity also produces a coherent, newly made text, the result of her or his interpretation. There is no guarantee that the kind of coherence of the new text will be as it was in the prompting text.

Coherence is a defining characteristic of text. The principles of coherence are social in their origins and, being social, they point to meanings about ‘social order’. The coherence of a text derives from the coherence of the social environment in which it is produced, or which it projects; it is realized by semiotic means. Nevertheless, the decision to select particular aspects of coherence, to shape coherence, to attribute coherence to a textual/semiotic entity or to deny it the status of coherence is always the act of a socially located maker and re-maker of a text. Power is involved in the making, recognition and attribution of coherence in a text.

Implicitly, ‘coherence’ as a textual characteristic gives rise to questions such as: How is ‘the social’ organized? What are its salient entities and how are they configured in this instance? and from there to the more semiotically oriented What links with what, in what ways? What belongs where, in the ensemble of entities in a text? As coherence is social and therefore ‘tracks’ social changes, texts exhibit the conceptions of order of the community that has elaborated these principles of order, and which uses them as a resource for establishing and maintaining cohesion and coherence in the community. In texts, these social principles appear as semiotic principles, made material, manifest, visible, tangible.

Being socially made, the principles of coherence differ from community to community and for different groups in communities. The principles held by a group defined by generation (as the social construction of age), teachers, let's say, are unlikely to be the same as those of a younger generation, their students. As structures of power now no longer necessarily work across generations, there is at the moment an ever-growing gap between the principles of (social and semiotic) order held by a younger generation and those ‘before them’.

In part, texts are constitutive of social institutions; in part they are traces of (inter-)actions in such institutions and, in this, they provide means of ‘reading’ the interests and purposes of those involved in the making of texts in an institution. That makes the category of text essential and significant in discourse analysis (DA); and it makes text clearly distinct, socially and semiotically, from discourse.

In broad terms, the aim of MMDA is to elaborate tools that can provide insight into the relation of the meanings of a community and its semiotic manifestations. In MMDA, the apt use of modes for the realization of discourses in text in a specific situation is a central question. A multimodal approach assumes that language, whether as speech or as writing, is one means among many available for representation and for making meaning. That assumes that the meanings revealed by forms of DA relying on an analysis of writing or speech are only ever ‘partial’ meanings. The meanings of the maker of a text as a whole reside in the meanings made jointly by all the modes in a text. If I am interested in understanding the meanings at large in a community, speech or writing – alone or even jointly – will provide a part of the meaning only.

The category of discourse (in the Foucauldian sense) does not deal with all meanings at issue in social (inter-)action that emerge in text. Genre, the category that realizes the organization of social participants involved in the making and re-making of a text, operates at the same level as discourse: jointly they are the social foundations of text (Kress, 1984/1989). Other meanings, beyond those of discourse and genre, need to be accounted for in a full description of social interaction – large or small, formal or informal, meanings about generation or region, for instance. Power is expressed in all these, everywhere, in a multiplicity of ways. Looking at discourse alone is not sufficient to provide a full account of meaning in social situations and practices in the texts that are produced there. A comprehensive account of power and meaning requires further semiotic categories.

So, for instance, irrespective of the discourses invoked, a speaker or writer will need to deal with a general social–semiotic category such as ‘proximity’ and ‘distance’ and to have the semiotic means to realize meanings of what Brown and Gilman called ‘power’ and ‘solidarity’ (Brown and Gilman, 1968): in English, the use of past as against present tense (Kress, 1975; Kress and Hodge, 1979), or of deictics of distance, such as ‘this’ vs ‘that’; and any number of other devices, different in different modes and different cultures. If a major issue in MMDA is a full account of power for instance, then it is entirely plausible to call that more comprehensive enterprise MMDA, even though in the scope of categories drawn into the ‘toolkit’ it goes beyond the description of the use of discourses. MMDA names the description and analysis of any text – as a complete and coherent semiotic entity – which aims at describing and analyzing what ‘goes on’ in a text, including the working of power in social interaction. In MMDA, an understanding of any text assumes understanding the selection of discourses, of their ‘arrangements’ – which one is dominant, what functions does each have. Other meanings are present, and they are framed by the discourses present in the text, in an ideological arrangement. MMDA, as do other forms of discourse analysis, sets out to develop tools that can be used in such a task.

In this chapter I try to elaborate on five questions. What are (some of) the key issues that MMDA has brought to light? Closely connected and central is: What is MMDA? Then there is the following issue: What does the theoretical frame of social semiotics entail for a view of communication and (inter-)action? Given my professional location in an educational institution, what I want to know is: What can MMDA tell us about learning and social life? And a question collecting up the responses to the preceding questions is: Why is a social semiotic MMDA important?

What are the key issues that multimodal discourse analysis has brought to light?

‘Multimodality’ names the field in which semiotic work takes place, a domain for enquiry, a description of the space and of the resources that enter into meaning in some way or another (see also Jewitt, 2009). In the perspectives of different theories and approaches – psychology, media-studies, pedagogy, museum studies, archeology, sociology of different kinds – differently constituted questions lead to distinct theoretical and methodological tools, elaborated for the needs of each case. As mentioned, the theoretical approach presented here is that of a theory of meaning and communication, social semiotics, so the tools developed are shaped by that theory.

Multimodality asserts that ‘language’ is just one among the many resources for making meaning. That implies that the modal resources available in a culture need to be seen as one coherent, integral field, of – nevertheless distinct – resources for making meaning. The point of a multimodal approach is to get beyond approaches where mode was integrally linked, often in a mutually defining way, with a theory and a discipline. In such approaches writing was dealt with by linguistics; image by art history; and so on. In a multimodal approach, all modes are framed as one field, as one domain. Jointly they are treated as one connected cultural resource for (representation as) meaning-making by members of a social group at a particular moment. All are seen as equal, potentially, in their capacity to contribute meaning to a complex semiotic entity, a text, and each is treated as distinct in its material potential and social shaping. Each therefore needs to be dealt with as requiring apt descriptive categories which arise from that difference.

This means that MMDA needs to encompass all modes used in any text or text-like entity, with each described both in terms specific to its material and historical affordances and in terms shared by all modes.

While this constitutes a profound challenge to dominant views about the place of language, by itself it does not constitute a theory. Rather it projects the domain in which a theory – in this case, social semiotics; in other cases, say psychology or anthropology – finds its application. Multimodality and social semiotics, together, make it possible to ask questions around meaning and meaning-making; about the agency of meaning-makers, the constitution of identity in sign- and meaning-making; about the (social) constraints they face in making meaning; around social semiosis and knowledge; how ‘knowledge’ is produced, shaped and constituted distinctly in different modes; and by whom. Multimodality includes questions around the potentials – the affordances – of the resources that are available in any one society for the making of meaning; and how, therefore, ‘knowledge’ appears differently in different modes.

MMDA (and social semiotic theory) deepen and expand issues which concern other forms of DA more generally. At the same time it has brought to light issues which extend beyond the scope of DA as more usually conceived. I will draw attention to four of these.

One, mentioned just above, is the partiality of language. A second issue is the central one of the logics and affordances of modes, with their effects on ontology and epistemology and in terms of rhetoric, selection and design; a third issue is a move beyond the deeply pervasive notion of implicit meanings; and fourth, there is the matter of recognition: recognition of semiotic work, both in terms of who does such work – the question of agency – and in terms of the means by which such work is done – the issue of modes.

Recognizing the partiality of language entails that all modes in a multimodal ensemble are treated as contributing to the meaning of that ensemble; language is always a partial bearer of the meaning of a textual/semiotic whole. It problematizes the notion of ‘language’ in two ways: first, in the context of MMDA, language can no longer be treated as providing a full account of meaning but is seen as only ever providing a partial account. Consequently the other means of making meaning must be given full recognition and attention in theories of meaning. Second, given the entirely distinct materiality of speech and writing and their different shaping in different social places, it becomes highly problematic to treat ‘language’ as a mode. It seems essential now to speak of the two linguistic modes of speech and writing: and to ‘retire’ the use of the term ‘language’ from the theoretical vocabulary of MMDA. So in MMDA speech and writing are treated as different modes; their meaning potentials and their discursive (and ideological) affordances are used in that way and are open for investigation. While the former view held sway, meanings expressed in other modes could be treated as marginal at best, or could remain invisible. In MMDA we are required to look seriously at all modes.

Closely allied to the partiality of language is the issue of ‘implicit’ meanings. If all modes carry meaning, even if differently, then such meanings cannot be treated as ‘implicit’. For MMDA, a notion such as ‘implicitness’ is an (ideologically exploitable) barrier to transparency, including meanings around power. In MMDA attention is drawn to the part all modes have in constituting the meaning of a text: differently because of their different materiality and because of the affordances which derive from that. In a multimodal approach, all meanings, in any mode in a culture are explicit meanings – even though there may at any one moment exist a limited vocabulary for their description – a problem of means for transcription – either in ‘common parlance’ or in theoretical accounts. Discourses, crucially, as I will show just below, are realized in all modes.

Modes are distinct on the basis of their material characteristics and of the social shaping of the social–semiotic affordances of that material over (often) long periods of time. Speech and writing differ both on the basis of their materiality and on the basis of their different social shaping – differently in different societies – as for instance writing and image do, leading in all cases to distinct cultural–semiotic resources. This has one further consequence in this train of reasoning around materiality, social–semiotic work and mode. Materially, nothing links speech and writing – sound and inscriptions are materially distinct. Over long periods of social–semiotic work, in some societies – though clearly not in all – links have been forged between speech and (what became) writing, so that forms of image representation have become means of representing (aspects only of) speech – as in alphabetic scripts.

‘Recognition’ of semiotic work, both in terms of agency and in terms of mode, becomes a crucial matter in MMDA. It leads to two constant questions: Whose semiotic work? And what modes are involved in that work? The first is a matter of recognizing agency; the second a matter of recognizing the mode in which work was done. In institutional situations where power-difference is marked, work done in a mode that is not ‘recognized’ is easily disregarded. School is a paradigm example, but so are examples of bureaucratic uses of language.

In the first two examples – two signs that give directions to drivers on how to get into the car-parks of two supermarkets – these four issues are brought into view. The signs are about four metres up on two buildings, one on each side of a major urban road, located just before a large and complicated intersection. The signs are unremarkable; they serve to illustrate points about multimodality and multimodal discourse analysis more generally (Figures 3.1 and 3.2).

While there are ‘dictionaries’ of visual signs, they are quite unlike those for language, usually as inventories of quite abstract visual entities – ‘icons’. There are no dictionaries to look up for something like ‘directions into car-parks’, from which such signs could be taken. These signs, like the vast majority of visual signs – images – are ‘newly made’ from readily available, socially shaped cultural modal resources: here of layout, colour, writing, image, font. Each of the signs makes a specialized use of these five modes to construct an ensemble of modes to shape the meaning intended. Each mode plays its specific part: writing tells, image shows, colour frames and highlights; layout and font are used in part for reasons of compositional arrangements, and, as the other modes, too, always for reasons of ‘taste’. To write what the image shows would take too much space, and it would take too much time to read for motorists, who need to concentrate on the traffic at this busy intersection.

Figure 3.1   Morrison's car park

Figure 3.2   Waitrose car park

What is the meaning, overall, of each of the two signs? How is that meaning constituted? Does the meaning of one sign differ from the meaning of the other, and, if so, in what ways? How do the two signs function as messages? Who is being addressed and how?

The two signs use the same compositional elements and use them in similar arrangements. Yet they also differ: in how font is used – as capital letters alone on one sign, and as capital and lower case letters on the other; in type of font; in drawing style; in colour. The category of style is useful here to get a plausible account of that difference: style as the effect of the sum of choices made (Kress and Aers, 1982): choice of a colour palette, of individual colours, and of colour saturation; of font-type; of drawing style and of layout. Choice points to the semiotic work of selection, to preference: this colour rather than those others; this font as better for the designer's purposes here than others. Every choice of a signifier in each of the modes (colour, font, lettering, drawing) points to a decision made about an apt match of ‘what is to be meant’ with ‘what can best express that meaning’. The thick, heavy lines of one sign carry a meaning of ‘no nonsense shopping’, of shoppers with ‘feet on the ground’, who care about ‘value for money’; the lighter lines of the other sign carry a meaning of ‘we are, and we know you are, interested in elegance, in taste, in a light touch’. And so with all of the signs that make up the two multimodal ensembles.

The makers of each sign have constructed specific knowledge about this matter in this specific site, using the affordances of the modes in each ensemble. We may ask about design: How was this text designed? and about interpretation: How does this text here work, for anyone who engages with it? All ‘readers’ of these texts, each one in turn, make their new sign for themselves in their interpretation, drawing on all the modes in the ensemble. In writing and image – ideationally – the signs seem designed to answer the question: How do I get into the car-park of this supermarket? In font, colour, image – interpersonally – the signs seem designed to answer another question: Which of these supermarkets is the one I would prefer to go to? If the driver's/reader's interpretation in response to the prompting (see Kress, 2010) turns out to have been misleading, he or she will find themselves grumpily in the supermarket which matches neither their sense of what this supermarket is nor of who they are.

In other words, the meanings of the signs are about ‘directions’ in two ways: about ‘geographical’ directions in the mode of writing and image, and about ‘social directions’ in the modes of font, colour, image: about ‘where you belong’ in terms of taste and social affiliation. Along with the practical directions – ‘This is how you can get into the car park’ – the signs carry meanings about identity: about the store's ‘brand’ and what that stands for. They project an image of its assumed customers – ‘This is who you are, this is the place for you.’ ‘Directions’ to lifestyle, identity, taste and dispositions, to the ‘social place’ where ‘I’ will feel at home, are expressed in these two signs. The affordances of modes go well beyond their ideational function alone.

To return to the notions of ‘explicit’ and ‘implicit’ just for a moment. The misconception that speech or writing provides explicit information and that other modes ‘leave things implicit’ can be used for ideological purposes. Instead of writing or saying (what would, at the moment at least, be unspeakable or impossible to write) ‘here in this store we appeal to a more discerning class of customers, the middle classes’ or ‘we appeal to a class of customers of coarser tastes or to people who do not care, the lower classes’, these messages are given explicitly, but in modes that, for the moment, are less subject to social policing. This ensures that power of certain kinds is much more difficult to challenge.

Such meanings are clearly in the domain of discourse. The banality of the two texts does not exclude discourse as a shaping influence: discourses around taste, identity, a position in life; and they have shaped the signs. The multiplicity of modal resources for the realization of the meanings in the text requires the selection of semiotic resources apt for this task: ‘choice’ of modal resources, of genre and of other forms of textual organization and arrangements.

Choice leads to selection, and both necessitate acknowledging design as part of a set of theoretical tools, as a means of answering questions such as: What mode is apt here? These are questions of design, themselves deriving from a rhetorical disposition to communication. Design assumes the prior action of the rhetor. The task of the rhetor is to assess and describe the salient aspects of the environment of communication. The rhetor's questions seek to establish the conditions for communication: who are the participants and what are their characteristics – for instance, are they 7-year-old school children or adult participants in a public debate? What are their relations of power; what are the semiotic requirements from the matter to be communicated – for instance, is it better to show the complexity of an elbow joint in a diagram, or to show it as a 3D model, or to describe it in writing, or to imitate it gesturally, supported by speech? And there are the rhetor's intent and purposes in communicating. The agency of the rhetor shapes the actions of the designer, whose agency in turn shapes the realization of the rhetor's intent. In that conception, rhetoric is the politics of communication, style is the politics of choice; aesthetics is the politics of style; and ethics is the politics of (e)valuation.

In a multimodal environment the possibilities for choice and selection multiply well beyond those in a monomodal one. My second example aims to show how the stance on recognition just outlined – of semiotic work, of agency and modes, of explicitness – makes possible a different take on ‘reading’, ‘reception’ and communication generally. The example comes from a research project on museum visitor studies, ‘The museum, the exhibition and the visitor’, funded by the Swedish National Science Foundation and conducted at the National History Museum, in an exhibition of Swedish prehistory; at the East Asia Museum in Stockholm; and at the Museum of London, in two exhibitions: ‘London before London’ and ‘Roman London’.

In the project, one aim was to understand how visitors ‘made sense of’ a specific exhibition. Visitors were invited to participate as couples (grandparent and grandchild, friends, married couples), in order for a sense of their interaction with the exhibition to be captured, at least in part. Participants were given wearable voice-recorders; they were given a camera to take whatever images they wished; and they were videoed as they made their way through the exhibition. At the conclusion of their visit they were asked to ‘draw a map’ that represented their sense of the exhibition, and they were asked to participate in a brief interview about the visit, prompted by their ‘map’. All of these – video, photos, voice-recording, interview and ‘map’ – were seen as means of documenting the visitors’ sense of the exhibition, as ‘signs of learning’.

Museums have an interest in knowing what the visitors ‘take’ from their visits. They cannot usually exercise over their visitors the kind of power that schools (attempt to) exercise over their students, whether in relation to communication or to learning. Hence an ‘assessment’ of understanding, based on the principle of interpretation (Kress, 2010), suggests itself as preferable. Here are two maps, each made by a member of one of the participating ‘couples’, both from the Museum of London and both from the exhibition ‘London before London’.

Curators (as designers) of an exhibition have specific aims and purposes – social, aesthetic or pedagogic, ideological. These are rarely stated overtly in the exhibition, though in interviews with curators or curatorial teams it was clear that much discussion around aims and purposes precedes the construction of an exhibition – discussion framed by the interests of curators, policies of the museum, of governments. Given the absence, usually, of overt accounts, and also the need to link such accounts where they were available with features of the exhibition, MMDA seemed an ideal tool for gaining an understanding – as a hypothesis – of what meanings had been made by the curators/designers and of what meanings visitors, in their turn, made from the exhibition.

Semiotically speaking, an exhibition is a complex multimodal text/message. It provides a complex set of signs for the visitors who come to engage with it, and from it they construct for themselves an infinite series of promptings for interpretation. In that context, the ‘maps’ made by the visitors at the conclusion of their visit can give some indication of which aspects of the overall design/message engaged the visitor's interest and how. None of the signs, singly or together, provides, by any means, a full account of the meanings made by any of the visitors; and that applies to these two visitors (an 18-year-old woman and an 11-year-old boy), but they certainly do give a clear sense of a difference in interest.

Most immediately, the two examples show a specific – and we might say unusual – sense of what a ‘map’ is or does, with specific conceptions of what ‘mapping’ means and what is to be mapped. In both cases the notion of ‘map’ is a ‘conceptual’ – rather than a ‘spatial’ – one. Signs make the sign-maker's interest and interpretation material and evident; in that sense, the maps-as-signs give an insight, hypothetically, into an implicit question: What was the interest? In the case of Figure 3.3, the question seemingly was: What, for me, were (the) salient elements of this exhibition, and in what arrangement shall I present them? In Figure 3.4, what seems to be mapped is the map-maker's sense: This is what their life was like. Both are interpretations of the exhibition overall for these visitors; the maps represent (an aspect) of the knowledge made and of what had been learned by them.

Figure 3.3   Map of a museum exhibition (Heathrow)

If interest guides selection, attention, framing, interpretation, we need to ask about that ‘interest’: who are the map-makers, what shaped their interests; what principles of selection, attention, seem to be evident in these maps? As a shorthand account, it may help, to understand these two signs–maps, to know that the ‘map’ in Figure 3.4 was made by one of two 18-year-old German women who were spending a week in London to get to know England. The other map was made by an 11-year-old boy from London who had come – reluctantly – with his mother for a ‘day of activities’ (which did not eventuate) at the museum. His attention had been drawn by a model airplane at a display representing a neolithic campsite uncovered at the site of the present Heathrow airport, as well as by an African mask and some tools and weapons.

Figure 3.4   Map of a museum exhibition (integrated display)

Questions of rhetoric and design in the use of modes go to initial conceptions of the exhibition, and from there to the overall ‘shaping’ of the exhibition: it is evident in the selection of its objects and in the salience given to particular themes and to the modes chosen in representing specific meanings – for instance in the layout of the exhibition, in its lighting, in the use of written text or of image or of 3D objects. Are 3D objects more salient, more ‘attractive’, more noticeable than written captions? Is movement more salient as a means of explanation than long written accounts? Are painted scenes more engaging than 3D tableaux? What effect does lighting have in creating affect and mood? Is the distance at which visitors are able to engage with objects, or whether they are able to touch an object, a significant matter? The question of affect has to be addressed in the case of the exhibition: the ‘wrong’ affect will inhibit or distract the attention of visitors. But affect is equally significant in all sites of learning, institutional or not. With all modal resources, discourse, power, forms of knowledge, are constantly at issue.

In all this there is another core issue, that of the affordances and logics of modes and their effects, communicationally, in rhetoric, selection and design and in terms of the differential shaping of knowledge in ontology and epistemology.

Here is a simple example, on the issue of knowledge and mode, from a science classroom for 13–14-year-olds. In the fourth lesson, on cells, the teacher asks the children: ‘What can you tell me about a plant cell?’ A child says: ‘Miss, a cell has a nucleus.’ The teacher asks her to come to the front and to draw on the whiteboard what she has just said. She takes a felt-tip pen and draws something, as in Figure 3.5.

In drawing the image, the young woman is faced with (implicit) questions, which she had not faced in making her spoken comment. She has to decide what shape the cell(-wall) is; what the nucleus looks like; how large it is; whether it is a circle or a dot; and she has to make a decision as to where in the circle she needs to place the nucleus. The results of the decisions she has made are realized in a drawing such as that of Figure 3.5. Having drawn the circular shape and placed the dot or circle, the maker of this sign has made an epistemological commitment: ‘this is what it is like, and this is the relation between the entities “cell(-wall)” and “nucleus”’. A student who looks at a teacher's drawing on the board or at a drawing in a text-book is entitled to take that as ‘the facts of the matter’.

Figure 3.5   Cell with nucleus

Whatever the mode, epistemological commitment cannot be avoided: a shape of some kind has to be drawn to indicate the cell-wall and the cell; a dot or a circle of some size has to be made as a representation of the nucleus; and the dot or circle has to be placed somewhere. Yet in speech there is also an epistemological commitment: that there are two object-like things, a ‘cell’ and a ‘nucleus’, which are joined in a relation of possession – ‘has’; while the so-called ‘universal present tense’ of ‘has’ guarantees its factuality: it indicates that this is always the case. The drawing carries no suggestion of possession or of a timeless truth; in the drawing, the relation is one of spatial co-locations of a specific kind: proximate or distant, central or marginal. Epistemological commitment cannot be avoided, no matter what the mode. It varies in line with the affordances of each mode: here in a contrast of speech and image – of lexis vs depiction; of possession vs proximity or distance, of centrality or marginality; as a verb-form vs spatial co-location; sequence (as temporal succession in speech or linearity in writing) vs simultaneity (of appearance and arrangement) of the entities.

Both these signs were newly made. Both drawing and spoken utterance are based on the interest of the student – manifested for instance in selecting ‘nucleus’ as the salient feature. Both the spoken utterance and the drawing represent this student's selection from a large variety of curricular material, encountered in the course of four lessons. Both signs represent a selection, transformation/interpretation and encapsulation of the student's knowledge at that moment. In making the signs, she is making knowledge for herself and for others. Both signs declare: ‘This is what I know.’

The two representations materialize (curricular) ‘knowledge’ about this topic differently: ontologically the two are different accounts of the world in focus. For learning and teaching, in the construction and presentation of a curriculum for a specific group, this matters. Until ‘knowledge’ is ‘made material’ in a specific mode, it has no ‘shape’: we cannot ‘get at it’. To me it is not at all clear what knowledge is before it is made material in a representation. In speech, knowledge is represented in a mode shaped by the underlying logic of sequence of elements in time; as image, it is shaped by the logic of simultaneity of elements in space. Each logic, with the social shaping of each in long histories of social and semiotic work, imposes its ontology and epistemology on what is represented through the organization of elements in arrangements.

To make a sign is to make knowledge. Knowledge is shaped in the use, by a social agent, of distinct representational affordances of specific modes at the point of making of the sign. Another student might have regarded ‘cytoplasms’ as most significant, or might have focused on the functions of the membrane of the cell; and in each case they could have written or drawn or represented in 3D what they had wanted to represent (Kress et al., 2001).

What can multimodal discourse analysis tell us about learning and social life?

Modes are the result of social shaping and bear the traces of that work of constant selection in many environments. Why were these materials selected and not others? And why have these aspects of the materials been emphasized and those others ignored? These are traces of work done in response to social concerns, focus, interest, need, and so on. That can tell us much about the histories of the groups of those who use the modes. It can also tell us why two cultures may share a mode and yet make profoundly different use of that mode semiotically. That means that the ‘reach’ of a mode is not the same across different societies and their cultures. As a simple yet stark example, we know that all societies use the mode of gesture; yet how that mode is used differs vastly between, say, communities of the speech impaired and communities of people who are not affected this way.

The insistence that the linguistic modes of speech and writing are – like all modes – partial means of making meaning forces attention onto the role of other modes in meaning-making. With that comes not just the potential, but the necessity for the recognition of the meaning made by those who, for whatever reasons, use writing – and maybe even speech – less than others, yet who are highly ‘articulate’ in all sorts of domains of social, personal, professional life. The same emphasis forces us to rethink from bottom up the notion of ‘implicitness’ and, with that, of ‘knowledge’ and its widely differing materializations. In short, this opens up a view on a much fuller sense of meaning and knowing.

The meanings of social and professional life – of the snooker player or the surgeon, of the child playing in a sandpit or of the amateur cook at home, re-creating a dish encountered on vacation – now appear everywhere, and all are becoming amenable to descriptions and accounts. In principle, this opens the windows to an encompassing and generous view of the meanings of all members of a social group, without the restricting perspective of linguistic lenses. The recognition of semiotic work – as agency and in all modes – has the same potentially freeing effect. The potentials of that for rethinking forms of assessment in all domains, and in schools in particular, are entirely untapped and hugely promising.

The central place accorded to materiality in MMDA – even though subject to constant social and semiotic work – remains: MMDA opens the possibility of moving against the reductiveness of twentieth-century generalization and abstraction (in much of linguistics for instance), and toward a full account – in conjunction with other theories and disciplines – of the impact of the fact that, as humans, we are physical, material bodies and that meaning cannot be understood outside the recognition of this materiality.

At one level, this is not much more than what many of us know ‘in our bodies’: for instance, that in switching from one language to another the musculature of our body, the muscles of the chest and head in particular, take on distinct configurations, which express and realize distinct, deeply embodied forms of identity, meanings of a deep kind. The ‘lazy drawl’ of the mythic Australian stockman is more than a mere manner of talking: it speaks of a far-reaching disposition to life and to the world.

What does a social semiotic multimodal discourse analysis of communication/(inter-)action and of semiotic entities/texts entail?

If ‘multimodality’ names the field of work and ‘social semiotics’ names the theory with which that field is approached, then a number of points arise in relation to each. Multimodality, first and foremost, refuses the idea of the ‘priority’ of the linguistic modes; it regards them as partial means of making meaning. In principle, any mode may be ‘prior’ in its use in a particular environment. Modes shape our encounter with the world and our means of re-making the world in semiotic entities of any kind. This is so both in terms of the ‘logics’ of modes – temporal or spatial – and in terms of the consequences which flow from that in the social development of modes in a particular community over time; and it is so in terms of affordances of other kinds, for example of (still) image compared to writing. The entities of writing – lexical, syntactic, textual – are entirely different from those of image: words work differently from depictions, and spatial means of showing ‘connection’ and ‘relation’ are quite unlike those of the syntax of writing.

Social semiotics serves to emphasize what is shared communicationally: that there need to be resources for showing connection and relation in any mode, even though they will be different in each mode; that features of meaning are shared among all modes – intensity, framing, foregrounding, highlighting, coherence and cohesion, forms of genre, etc. – even though they will differ from mode to mode. Intensity may be materialized as loudness in speech and as saturation in colour, or as thickness or bolding in writing or in image.

Communicationally, social semiotic theory brings a rhetorical approach: that is, rhetoric as the politics of communication demands an attitude that enquires about the social environment of communication and its participants, about their relations in terms of power and their social characteristics. It focuses on what is to be communicated and on the means available for materializing the meanings at issue and the means most apt in terms of the social environment and of the characteristics of the audience. The designer, usually the same person as the rhetor, then has the task of turning the rhetorical assessment of the environment, of the audience and of the means for materializing these into a design most likely to meet the political aims of the rhetor.

The availability of modes founded on the different logics of time and space – or of both, as in sign languages, or dance – is particularly useful as a resource for design, for instance in designing texts or other semiotic objects on the differing principles of modularity or linearity – or to use the insights of the theory to provide descriptions of how these principles work in different modes, as much as in different cultures. A theory that includes that distinction is essential in ‘the West’, where linear forms of semiotic organization are now challenged intensely by modular forms – to some extent as an effect of the ‘transport’ of one principle from a social cultural site where it has been dominant to a site where it has not been so hitherto, as much as of the displacement of one site of appearance and display – the page – by another – the screen.

In areas of cross-cultural communication a multimodal approach is an essential prerequisite, and it affects all forms of composition, everywhere, though differently in different sites.

Why is a social semiotic multimodal discourse analysis important?

Whatever view one takes of the social, economic, cultural, political and technological world, it is a world in rapid transition and a world where the pace of ‘transport’ in all these dimensions has accelerated – out of control nearly. The pace of transport, the instantaneity of access in many domains, have changed the social and political and economic framings of the world and, with that, the framings around – and of – the cultural resources at issue in the semiotic domain, the domain of meaning-making.

This entails that more adequate, sharper tools are needed, tools that are apt for the multiplicity of semiotic resources as much as for the intensely varying appearances and effects of power in a largely unbounded and barely framed semiotic world. Rhetoric is essential when every occasion of communication is likely to be new and often profoundly different.

Design, similarly, is at the forefront of essential semiotic dispositions in a world of vastly varying resources, many instantly accessible, needed and used. Design is needed for forms of social interaction as much as for the ‘content’ of messages. Both the need and the potentials for designing have increased and have moved centre stage. Notions of (in-)coherence are hugely more problematic and difficult: coherence and incoherence have become more visible with the ubiquity of screens and more difficult to establish with a move to horizontally organized power.

In a world of much greater variety and variability, the wide range of available modes increases the possibilities and potentials of apt representations of the world framed. This makes the ‘transcriptional possibilities’ of modes into desirable or essential characteristics: the world ‘transcribed’ in writing as narrative differs from the world ‘transcribed’ in several modes with different affordances, distinct logics and genres.

My use here of the term ‘transcription’ points to an urgent problem for MMDA: the terminology available to describe a multimodally constituted and recognized semiotic world is no longer apt, and that world urgently needs renaming. The labels we have come from a world that was founded on the pre-eminence of language, and of writing in particular. Using terms that carry a heavy freight of past theory designed for different tasks, now congealed into commonsense, is likely to skew the new enterprise in its development. There is a large agenda of work here. There is also the promise of seeing and doing better. Both will be essential in dealing with the problems that currently define the world of meaning.

Further reading

Jewitt, C. (2008) Technology, Literacy, Learning. London: RoutledgeFalmer.

The book introduces central concepts of multimodal analysis and provides analyses of texts in different media from three areas of the secondary curriculum: English, mathematics and science.

Jewitt, C. (2009) The Routledge Handbook of Multimodal Analysis. London: RoutledgeFalmer.

Definitional chapters from leading theoreticians and practitioners in different domains of multimodal work, in the frame of a broad theoretical ‘location’ of the work by the editor.

Hodge, R. I. V. and G. R. Kress (1988) Social Semiotics. Cambridge: Polity Press.

A wide range of materials – photographs, sculpture, newspapers, paintings, literary texts – are used to develop a socially grounded, encompassing account of semiosis, not derived from linguistic theories.

Kress, G. R. (2010) Multimodality: A Social Semiotic Approach to Contemporary Communication. London and New York: RoutledgeFalmer.

Multimodality approached in the encompassing frame of social semiosis, with a wide range of materials exemplifying meaning-making in contemporary sites and media.

Mavers, D. (2010) Children's Writing and Drawing: The Remarkable in the Unremarkable. New York and London: RoutledgeFalmer.

A meticulously detailed account documenting meaning-making in the visual–graphic domain, with a sharp focus on the means for recognition of semiotic work.

Norris, S. (2004) Analysing Multimodal Interaction: A Methodological Framework. London and New York: RoutledgeFalmer.

A closely detailed setting out of the intricacies of multimodal interaction, providing methodologies for dealing with the complexities of handling such materials in research.

References


Bernstein, B. (1984) Class, Codes and Control: Theoretical Studies Towards a Sociology of Language. Vol.1. London, Routledge and Kegan Paul.
Bezemer, J. and Kress, G. (2008) ‘Writing in multimodal texts: a social semiotic account of designs for learning’, Written Communication, 25 (2): 166–195 (Special Issue on Writing and New Media).
Bezemer, J. and Kress, G. (2009) ‘Visualizing English: a social semiotic history of a school subject’, Visual Communication, 8: 247–262 (Special Issue on Information Environments).
Brown, R. and Gilman, A. (1968) ‘The pronouns of power and solidarity’, in Thomas A. Sebeok (ed.) Style in Language. Cambridge, MA: MIT Press, pp. 253–276.
Chomsky, N. A. (1957) Syntactic Structures. The Hague: Mouton.
Chomsky, N. A. (1965) Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Fairclough, N. (1989) Language and Power. London: Longman.
Fairclough, N. (1992) Discourse and Social Change. Cambridge: Polity Press.
Foucault, M. (1981) ‘The Order of discourse’, in R. Young (ed.) Untying the Text: A Post-Structuralist Reader. London and New York: Routledge and Kegan Paul, pp. 48–78.
Fowler, R., Hodge, B., Kress, G., and Trew, T. (eds.) (1979) Language and Control. London: Routledge and Kegan Paul.
Franks, A. (1995) ‘The Body as a Form of Representation’, Social Semiotics, 5 (1): 1–21.
Franks, A. (1997) ‘Drama, desire and schooling’, Changing English, 4 (1): 131–148.
Franks, A. and Jewitt, C. (2001) ‘The meaning of action in learning and teaching’, British Educational Research Journal, 27 (2): 201–221.
Gee, J. P. (1999) Introduction to Discourse Analysis. New York: Routledge.
Gee, J. P. (2008) Social Linguistics and Literacies: Ideologies in Discourses. Abingdon: Routledge.
Gibson, J. J. (1986) The Ecological Approach to Visual Perception. Hillsdale, NJ: Lawrence Erlbaum.
Gumperz, J. (1982) Discourse Strategies. Cambridge: Cambridge University Press.
Habermas, J. (1984) The Theory of Communicative Action: Reason and the Rationalization of Society. Boston, MA: Beacon Press.
Halliday, M. A. K. and Hasan, R. (1976) Cohesion in English. London: Longman.
Hodge, R. and Kress, G. (1988) Social Semiotics. Cambridge: Polity Press.
Hodge, R. and Kress, G. (1993) Language as Ideology, 2nd edn. London: Routledge.
Hymes, D. (ed.) (1964) Language in Culture and Society: A Reader in Linguistics and Anthropology. New York: Harper and Row.
Insulander, E. (2008) ‘The museum as a semi-formal site for learning’. Medien Journal. Lernen. Ein zentraler Begriff für die Kommunikationswissenschaft. 32. Jahrgang. Nr. 1/2008.
Insulander, E. and Lindstrand, F. (2008) ‘Past and present – multimodal constructions of identity in two exhibitions’, Paper for Comparing National Museums: Territories, Nation-Building and Change, NaMu IV 18–20 February 2008, Linköping University, Norrköping, Sweden.
Jewitt, C. (2008) Technology, Literacy, Learning. London: RoutledgeFalmer.
Jewitt, C. (2009) The Routledge Handbook of Multimodal Analysis. London: RoutledgeFalmer.
Jewitt, C. and Kress, G. (2003) ‘Multimodal research in education’, in S. Goodman, T. Lillis, J. Maybin, and N. Mercer (eds.) Language, Literacy and Education: A Reader. Stoke on Trent: Trentham Books/Open University, pp. 277–292.
Kress, G. R. (1975) ‘Tense as Modality’, UEA Papers in Linguistics, 3.
Kress, G. R. (1982) Learning to Write. London: Routledge and Kegan Paul.
Kress, G. R. (1984/1989) Linguistic Processes in Sociocultural Practices. Geelong: Deakin University Press, Oxford: Oxford University Press.
Kress, G. R. (2001) Early Spelling. From Creativity to Convention. London: RoutledgeFalmer.
Kress, G. R. (2003) Literacy in the New Media Age. London: RoutledgeFalmer.
Kress, G. R. (2009) ‘What is mode?’, in C. Jewitt (ed.) Routledge Handbook of Multimodal Analysis. London: Routledge, pp. 54–67.
Kress, G. R. (2010) Multimodality: A Social Semiotic Approach to Contemporary Communication. London: RoutledgeFalmer.
Kress, G. and Aers, D. (1982) ‘The politics of style in Measure for Measure’, Style, 16 (1): 22–37.
Kress, G. and Bezemer, J. (2009) ‘Writing in a multimodal world of representation’, in R. Beard, D. Myhill, M. Nystrand, and J. Riley (eds.) SAGE Handbook of Writing Development. London: Sage, pp. 167–181.
Kress, G. R., Bourne, J., Franks, A., Hardcastle, J., Jewitt, C., and Jones, K. (2004) English in Urban Classrooms: A Multimodal Perspective on Teaching and Learning. London: RoutledgeFalmer.
Kress, G. R. and Hodge, R. I. V. (1979) Language as Ideology. London: Routledge and Kegan Paul.
Kress, G., Jewitt, C., Ogborn, J., and Tsatsarelis, C. (2001) Multimodal Teaching and Learning: The Rhetorics of the Science Classroom. London: Continuum.
Kress, G. R. and van Leeuwen, T. (1996/2006) Reading Images: The Grammar of Visual Design. London: RoutledgeFalmer.
Labov, W. (1966) The Social Stratification of English in New York City. Cambridge: Cambridge University Press.
Labov, W. (1972) Language in the Inner City. Philadelphia: University of Pennsylvania Press.
Lindstrand, F. (2010) ‘Transformed meanings – multimodal meaning-making at the museum’, in Selander, S. (2008b). ‘Designs for learning – A theoretical perspective’. Designs for Learning, 1 (1): 10–24.
Mavers, D. (2007) ‘Semiotic resourcefulness: a young child's email exchange as design’, Journal of Early Childhood Literacy, 7 (2): 155–176.
Mavers, D. (2009) ‘Student text-making as semiotic work’, Journal of Early Childhood Literacy, 9 (2): 141–155.
Mavers, D. (2010) Children's Writing and Drawing: The Remarkable in the Unremarkable. New York and London: RoutledgeFalmer.
Rorty, R. (1967) The Linguistic Turn. Essays in Philosophical Method. Chicago and London: The University of Chicago Press.
Sinclair, J. M. H. and Coulthard, R. M. (1975) Towards an Analysis of Discourse: The English Used by Teachers and Pupils. Oxford: Oxford University Press.
van Dijk, T. (1977) Text and context. Explorations in the semantics and pragmatics of discourse. London: Longman.
van Dijk, T. and Kintsch, W. (1983) Strategies of Discourse Comprehension. London: Academic Press.
van Leeuwen, T. (2005) Introduction to Social Semiotics. London: RoutledgeFalmer.
Wodak, R. and Meyer, M. (eds.) (2001) Methods of Critical Discourse Analysis. London: Sage Publications.