MULTIMODAL CONVERSATION ANALYSIS AND CLIL CLASSROOM PRACTICES

Evnitskaya, Natalia; Jakonen, Teppo


Introduction
This chapter explores the contribution of multimodal conversation analysis (CA) to increasing our understanding of a range of phenomena in CLIL classroom interaction.
We begin by briefly introducing key methodological principles of CA and describing some focal areas of what might be called 'basic' CA: research that investigates generic orders of interaction. We also discuss how recent CA research has directed increasing attention to embodied aspects of social interaction. After that, we describe ways in which CA methodology has been 'applied' to research in second language (L2) learning in what is nowadays known as CA-for-SLA. Our review of existing content-based classroom research focuses on recent studies which have examined how participants employ language and other semiotic resources such as the body and available artefacts in CLIL and immersion classrooms. We also demonstrate a multimodal CA methodological orientation by conducting a sequential analysis of video-recorded interaction from a CLIL biology classroom, and suggest research topics related to CLIL teaching which could be fruitfully pursued using a multimodal CA approach.

Key CA principles and topics
Conversation analysis emerged in the 1960s in the context of sociology (for in-depth accounts of CA origins, see Heritage 2008; Psathas 1995), and only later developed into a research method in applied linguistics. The sociological origins of CA and its connections to Garfinkel's (1967) ethnomethodology are reflected in the fact that social order, action, its formation, interpretation and relation to other actions represent central foci of CA inquiry. Broadly speaking, CA differs from many forms of discourse analysis in its focus on local and contextual ways in which interactants use language and other semiotic resources to jointly accomplish social activities and make sense of each other (e.g. Sacks, Schegloff & Jefferson 1974; Schegloff 2007). Besides an analytical interest in the organisation of everyday conversation, CA has a longstanding history in examining how various institutions such as courtrooms and classrooms do their work through talk-in-interaction, i.e. how these institutions are talked into being (see e.g. Drew & Heritage 1992).
Among the key assumptions guiding CA analyses are the following (see Heritage 1984b:241-245):
(1) Interaction is structurally organised: ordinary conversations and talk-in-interaction exhibit a rational design and order through systematic, organised patterns, which are oriented to by the participants.
(2) Contributions to interaction are contextually and sequentially oriented: participants assemble 'here-and-now' meaning of their contributions by accomplishing their social actions within the sequential environment (or context) of the unfolding interaction. This makes contributions context-shaped. However, every current action also transforms the sequential environment in which a next action will occur. This makes participants' contributions also context-renewing.
(3) No order of detail can be dismissed a priori as disorderly, accidental or irrelevant.
These three principles require the use of audio- and video-recorded, naturally occurring interaction as the primary source of data, as opposed to post-event participant reports or experimental research designs. The principles serve the purpose of discerning what may be called the emic (Pike 1967:37) or participants' perspective on their activities and the meanings being constructed turn-by-turn during those activities. This orientation to phenomena that are relevant to the participants is behind a general bottom-up, inductive logic and an avoidance of pre-theorisation, a principle termed 'unmotivated looking' by Psathas (1995).
A central research project in CA has been the identification of 'generic' orders of human interaction (Schegloff 2007), that is, possibly universal, culture-independent building blocks of interaction. One such generic order is the organisation of turn-taking.
When participants take turns to contribute to unfolding interaction with some specific action, they display an understanding of what the just-prior turn was about, and at the same time provide a context for the next turn (see principle 2 above). This next-turn proof procedure (Sacks et al. 1974:729) therefore offers a device for participants to display and maintain mutual understanding; simultaneously, it offers the analyst a view into emic participant orientations. Participants rarely display that they understand the prior turn by explicitly claiming so; rather, they simply demonstrate it by performing an action that involves 'some sort of analysis' (Sacks 1992:253) of the previous turn.
Besides turn-taking, social life is also organised in relation to distinct actions, for example asking and answering, or assessing something, which participants 'do' and expect to be responded to appropriately, thereby building "coherent, orderly, meaningful successions or 'sequences' of actions" (Schegloff 2007:2) (see principle 1 above). Such sequences are often organised around an adjacency pair structure, that is, two linked turns that are produced by different interactants. A wide range of adjacency pairs can be observed in any stretch of everyday or classroom interaction, for example, questions/answers or requests/acceptance. These examples illustrate basic functions of adjacency pairs. The first pair-part (FPP) initiates an exchange and makes 'conditionally relevant' (Schegloff 2007) a second pair-part (SPP) action which responds to the first. Adjacency pairs are often expanded, such as when recipients do not hear or understand a question and initiate an insert sequence to repair the 'trouble source'. The organisation of repair (see e.g. Kitzinger 2012) is not only a central topic in 'basic' CA; the ways in which L2 speakers repair understanding have also been the focus of much CA-for-SLA research (see e.g. Hellermann 2009). In L2 classrooms, repair practices can also intimately relate to pedagogical goals (Seedhouse 2010).
Many earlier CA findings were obtained through the analysis of audio-recorded materials. Through the increased availability of video recorders, researchers have broadened the analytical focus to include non-verbal aspects of interaction and how action, meaning and interaction are routinely constructed through multiple modalities. Their interplay has been of interest in many disciplines beyond CA, ranging from psycholinguistics of speech and gesture to semiotic studies and mediated discourse analysis (see for example Kress & van Leeuwen 2001; Müller et al. 2013; Scollon 2001). As Deppermann (2013) points out, from a CA perspective, analysts have become more aware of the role and relevance of different semiotic resources in shaping human experience, and have begun to systematically explore these matters in the context of sequentially-evolving interaction. As many studies in a broad range of contexts have shown, participants employ an array of semiotic resources in addition to talk to construct action, both those afforded by the human body, such as gaze, facial expressions, gestures, head movement, and body posture, and those offered by the surrounding physical space, such as material objects (Goodwin 2000; Mondada 2008; Stivers & Sidnell 2005; Streeck, Goodwin & LeBaron 2011). It is important to note that these labels for different resources can be considered as conventions based on the analyst's perspective, and they are not necessarily distinctions made by participants, who may instead experience interaction in a more holistic manner (see e.g. Streeck 2013). In any case, multimodality seems to be a pervasive feature of human interaction.

CA-for-SLA
CA has become increasingly prominent in research on second language acquisition (SLA) after Firth & Wagner's (1997) seminal call for more sensitivity towards social aspects of language learning. Since the early 2000s, the enterprise of using CA's sociointeractionist perspective to investigate language learning has become known as CA-for-SLA (or CA-SLA) (e.g. Firth & Wagner 2007; Kasper 2004; Kasper & Wagner 2011; Markee 2005; Mondada & Pekarek Doehler 2004).
Much existing CA-for-SLA research closely examines how L2 classroom interaction is organised on a moment-by-moment basis and how participants conduct the work of teaching and learning, for example, by orienting to institutional roles in the classroom and increasing their communicative repertoires (Markee 2005). Many studies attempt to identify interactional patterns, practices and situations that may afford or constrain L2 learning and explore how learning itself is socially constituted.

CA and bilingual classrooms
In contrast to much other research on CLIL, CA-based work rarely offers explicit comparisons of CLIL and FL classroom interaction, something that perhaps reflects a general paucity of comparative CA work. Studies that investigate either CLIL or other bilingual classrooms have shown that participants use a wide range of semiotic resources to manage objects of knowledge or learning, and that such management may involve complex configurations and inter-relations of language and content. For example, Evnitskaya & Morton (2011) investigated how everyday knowledge becomes reified as linguistic manifestations of subject-specific knowledge of school science. In addition to pointing out the key role of L1 for such interactional work, the authors noticed differences across task types and educational levels. Knowledge construction could either be organised as a move from everyday observations elicited from students towards a scientific theorisation of the focal phenomenon or the other way around.
Similar observations on the important role of students' everyday experiences in the classroom were also made by Jakonen (2014), whose study investigated how students make their language expertise obtained from English language popular culture relevant for task work in a CLIL classroom. Escobar Urmeneta & Evnitskaya (2013) demonstrated how in two CLIL science classrooms, teachers' particular interactional and multimodal strategies resulted in different degrees of complexity of interactional organisation and in differing quality of the generated subject-specific conversations. This happened as these strategies triggered different turn-taking patterns, which afforded students varied opportunities to participate in such conversations.
Some of the existing CA work has also problematized a binary division between language and content in CLIL. An example is the study by Pekarek Doehler & Ziegler (2007), who conducted a single-case analysis of teacher-student interaction from a biology immersion class to show how participants' orientation to language-related work, such as the pronunciation and choice of scientific terms, was not only embedded in, but also functioned as 'stepping-stones' for advancing scientific work. On the basis of their observations, the authors suggested that practices of 'doing science' and 'doing language' are inseparable, each feeding into the other. Similar observations have also been made in tertiary CLIL classrooms by Moore & Dooly (2010), who described how the orientation of a group of teacher trainees doing a learning activity shifted between language and content as they were trying to decide whether apples 'grow' or 'reproduce (themselves)'. Drawing on their L1s (Catalan and Spanish), the group's attention shifted between whether the English word 'reproduce' is a reflexive verb and which of the two verbs more suitably and scientifically describes the growth cycle of apples. These studies point towards a complex relationship between language and content in CLIL teaching and learning.
Besides investigating what kinds of learning objects are pursued in CLIL classrooms, recent CA studies have also explored social relations and identities around knowledge in the classroom. Such work, carried out in the framework of interactional epistemics (e.g. Heritage 2012), approaches 'knowing' (or 'not knowing') as matters that participants 'do' in interaction. For example, Jakonen & Morton (2015) investigated sequences of peer interaction in a CLIL history class that began when one student conveyed a knowledge gap. By exploring how such gaps were treated, the authors showed that the interactional management of a student's epistemic status as either a 'knower' or 'not-knower' in peer interaction is very different from teacher-student interaction, where teachers are treated as having primary access to knowledge.
These epistemic orientations mean that some actions, such as pointing out a teacher's error, can be socially problematic. This was illustrated by Kääntä (2014), who examined students' embodied conduct from 'noticing' a teacher's error during whole-class exercise checking to initiating interaction to correct it. Her analysis portrayed the students' subtle work, such as sudden gaze shifts between their own and their neighbour's task-related materials and facial expressions such as puckering lips and frowning, that preceded the delicate action conveyed by a correction initiation.
A multimodal sensitivity to interactional data has also shed light on ways in which teachers and students coordinate semiotic resources other than talk to accomplish learning activities. This may be the case even in task types that we often take as verbal 'individual' performances, such as student explanations. In a case study on a CLIL geography lesson, Kupetz (2011) showed how explaining was constructed through finely coordinated semiotic resources such as the L2, facial expressions, pointing and other gestures, as well as objects such as the overhead projector. Similar multimodal resources are also often used by teachers to draw students' attention to subject-specific terminology, as was demonstrated by Pitsch (2005) in the context of a bilingual history classroom. The teacher in her study systematically marked certain concepts as important using multimodal resources (e.g. hesitation right before the key word, prosody and gaze) and then initiated sequences to clarify or translate these concepts into L1. Pitsch argued that this allows concepts and academic knowledge to become linguistic 'objects' and be afforded to students as such. In their study on a CLIL science classroom, Escobar Urmeneta & Evnitskaya (2014) came to similar conclusions as they found that the teacher also used a wide range of semiotic resources when constructing an extended explanation of a lexical item 'harmful', relevant in the context of the ongoing pedagogical activity on bacteria. In this way the teacher supported student comprehension of the conceptually-loaded item and the integrated appropriation of language and content.
Besides 'traditional' classroom artefacts such as overhead projectors and blackboards, many CLIL classroom activities involve different kinds of objects, which intimately relate to complex subject-specific competencies and learning aims. A multimodal research approach can shed considerable light on the kinds of competencies activities such as lab experiments require, as such experiments are semiotically complex and routinely involve close coordination of talk, embodied actions and physical objects.
Kääntä & Piirainen-Marsh (2013) examined how a group of students in a CLIL physics class worked together to balance two weights on a seesaw, an experiment used to introduce the concept of torsional moment. The authors demonstrated how the semiotic resources that the students used for instructing were also sensitive to the spatial arrangement of the task, so that in addition to using language, those students standing further away would also rely on gestures pointing at a suitable location for the weights. In contrast, students positioned closer would occasionally manually guide the hand of the student in charge of manipulating the experimental objects.
All in all, CA-based investigations of CLIL classrooms have increasingly begun to examine embodied and material aspects of pedagogical activities. These studies have contributed to classroom research by offering detailed, qualitative explorations of practices found in bilingual classrooms, and as a result, by highlighting the situated, material and embodied nature of CLIL classroom interaction. We will now demonstrate a multimodal CA approach to CLIL classroom data by discussing a single case that involves one commonplace piece of equipment in CLIL science classrooms, the microscope.

Illustrating a multimodal CA approach to CLIL
Our interactional data come from an English-language CLIL biology class of grade 7 students (age 12) in a bilingual Catalan-Spanish community. The aim of this analytical demonstration is to show how a micro-sequential and multimodal analysis of a pedagogical activity and the semiotic resources deployed by participants in accomplishing such an activity may further our understanding of teaching and learning practices in CLIL classrooms. The selected data have been transcribed following the standard CA annotation system for talk (Jefferson 2004) and conventions for representing participants' visual and embodied conduct (Mondada 2008), including video screenshots (see Appendix).

'From so much heating, it's now dead'
The interaction we analyse takes place during a lesson-end plenary in which the students report a series of statements concerning the properties of a one-cell microorganism, Euglena, which they had been observing in small groups earlier in the lesson. To present our analysis in a reader-friendly format, we show the data in two excerpts. The excerpts demonstrate the kind of practical work the teacher does in turning students' interventions into accountable pieces of subject-specific knowledge that 'fit into' the topic of the ongoing instruction.
[...] conjunction 'and' to introduce a 'we'-statement which contains an emphatically produced verb 'know' (line 8). These three words allow her to explicitly relate the students' empirical 'seeing' to their (the students' and the teacher's) common subject-specific knowledge (co-)constructed in previous lessons. In lines 9, 10, 12 and 15 the teacher finally exposes the knowledge which she treats as familiar to the students.
It may be that through such elaborated recapping the teacher orients to ascertaining that school-science knowledge which has already been co-constructed with individual students in private interactions is accessed by everyone. At the same time, the recap also works to support students' understanding of a particular scientific phenomenon, photosynthesis, in the L2. Apart from incorporating subject-specific reifications (Wenger 1998) Andrew has previously mentioned ('chloroplasts' and 'photosynthesis'), the teacher also uses other contextually-relevant lexical items ('function', 'food', 'organism' and 'process') to build a complex, multi-level and highly [...]
Despite these attention-calling actions, the teacher does not allocate a turn to the girls. This might explain why Marta abandons bidding for a turn and rather takes it directly when speaker transition is projected (for transition relevance place, or TRP, see Sacks et al. 1974). The student seems to interpret as the TRP the point where the teacher's turn can be considered possibly complete syntactically (and pragmatically), but not prosodically (line 10). As Marta initiates a turn in line 11, she makes the microscope relevant for her action by pointing at it (Figure 1-b). However, Marta's turn is interrupted as the teacher continues her explanation. Note how Marta waits until there is an even more clearly marked TRP and a lengthy silence (lines 12-13) before she resumes her turn, informing the teacher about the two students' observations. Marta produces the informing (lines 14 and 16) together with Sara, who uses her handout to complete Marta's turn (line 17).
Marta and Sara's actions do not align (see Stivers 2008) with the teacher's current activity of guiding the class through the final plenary, of which her reconstructive and more general recap of the students' empirical 'observations' is an essential part. Yet, their attention-seeking neither seems to be treated by the teacher as a [...]
Having been given the floor in the whole-class activity, Marta uses it to produce a description of a problem in the target language (lines 19-20), structuring it as a complex utterance 'the thing white that we see:: (2.5) e:m now it doesn't move' with an embedded dependent adjectival clause 'that we see'. From an interactional perspective, Marta also shows situated competencies to hold the interactional floor given to her for a multi-unit turn. She accomplishes this by projecting continuation of her turn with the embedded adjectival clause, slightly rising intonation and stretching of the verb (line 19).
Marta marks the end of her report with falling intonation (line 20). This is followed by a 1.5 second silence (line 21) during which Marta, Sara and Arnau (a student belonging to another pair sitting next to Marta) fix their gaze on the teacher to display that they wait for her response to Marta's report.
The teacher's response in line 22 claims understanding of the implications of Marta's news through the use of the Catalan/Spanish change-of-state token 'a' (cf. 'oh' in English, Heritage 1984a). In the same turn she also solicits Marta's confirmation by repeating the final part of her utterance. The declarative utterance is prosodically marked as a request for confirmation with the emphasis falling on the verb 'move'.
Marta and Sara provide the expected confirmation: one with a slight headshake (line 22) and the other with a short 'yes', followed by an attempt to specify her confirmation, which is, however, cut off (line 23). At this moment Andrew, a student from a row behind that of the two girls, intervenes by emphatically stating that 'it's dead' (line 24). In his assessment he draws on Marta's public reporting that the observed microorganism has stopped moving.
The teacher, however, still seeks the identification of the referent discovered and reported by the students. She requests another confirmation (line 25) by suggesting a lexical item 'the transparent thing' as a candidate referent. In Excerpt 2, the teacher's colloquial reference to the microorganism, which has already been established between the two girls and the teacher earlier in the lesson, is recognised by the two students [...] and Andrew's 'it's dead' (lines 24, 27) and 'ours is dead' (line 30) into an L2 statement 'this organism is sensitive to temperature'. Such a reformulation of visual observations allows her to support students' understanding of the focal scientific phenomenon. It also helps her transform the students' observation into a more academic and accountable piece of L2 school-science knowledge.

Discussion and concluding remarks
Summarising our observations on the brief analysis of CLIL classroom interaction, we conclude by sketching some implications that a multimodal approach to interaction has for CLIL research and classroom practices.
Firstly, the analysed single case illustrated how the practical work of teaching involves constant and sequentially contingent decision-making in order to manage the ongoing pedagogical activity and therefore learning. Such decision-making implicated in teacher action involves, for example, how she deals with different kinds of student contributions, such as turns that are teacher-allocated (Andrew) and student-selected (Marta, Sara, Arnau), and 'weaves' them into coherent instruction (see also Waring 2013). At any given time, teachers are faced with the task of monitoring and managing multiple actions and events in the classroom. In the analysed interaction, the teacher-allocated turns by students were part of the ongoing activity of checking the findings obtained earlier during the experiment. However, at the same time related problems may be flagged up by students, examples of which were the girls' self-selected turns in Excerpt 1. As we have seen, these were temporarily ignored by the teacher, who structured other students' attention-seeking and news-report into one-at-a-time instruction. Once the teacher's current interactional and pedagogical action (rephrasing of Andrew's observation) was finished, she tackled the next student contributions 'due', a series of student-initiated announcements of problems, in which her task was to elaborate and thereby assist in the joint construction of content knowledge.
Secondly, we would like to draw attention to how action and participation in the classroom are constructed through a complex interplay of multiple semiotic resources, not only talk but also embodied conduct, particularly in the way classroom objects are oriented to and handled. This means that embodied conduct is an essential part of lessons, which researchers need to attend to in order to understand the range of discipline-specific interactional practices in CLIL classrooms. A case in point is Excerpt 1, in which Sara and Marta not only rely on hand-raising in soliciting the teacher's attention but also make texts and material objects (notes, microscope) relevant for the construction of their observations. In Excerpt 2, Arnau's access to the microscope affords his 'seeing' and subsequent verbalisation of the cause-effect relationship to explain the two girls' finding.
Thirdly, the analysis has also highlighted how participants orient to a clear division of responsibilities in managing classroom interaction and in interpreting action in the classroom. Notice how in their interactional contributions, Marta and Sara do not actually 'ask' anything (in grammatical terms) of the teacher but rather formulate a problem ('it doesn't move'). Yet, as the participants' subsequent conduct shows, such problem-stating is enough to shift the responsibility for the next interactional move to the teacher, e.g. in line 21 in Excerpt 2 when during the 1.5 second silence the students gaze at and wait for the teacher to take the lead and to construct a (scientific) explanation for 'not moving'.
Moving from these findings to more general implications for the CLIL research community, we would like to highlight the benefits of a multimodal CA approach for furthering understanding of the complex reality of CLIL interaction. It has often been claimed that CLIL is driven by the idea that language is best learnt in a context of 'meaningful' or 'authentic' use. Perhaps paradoxically, such contexts-of-use do not only involve language but a whole lot of other semiotic resources and ways of making [...] correspondingly, in this volume). For example, the ways in which participants handle classroom objects relevant in CLIL science classrooms such as microscopes, which may contribute to student learning, can be hard to examine if the research focus is on discourse as opposed to interaction. Yet, these ways are a significant part of scientific practices in the classroom and beyond. Investigating these practices, which are ubiquitous in CLIL classrooms, can contribute to the wider research project of (CA-for-) SLA by offering new ways to understand how embodied and material aspects of interaction relate to language learning processes, something which has perhaps been overlooked in interactional studies that have foregrounded learners' spoken language use.
As for implications for CLIL practice, our findings provide insight into the interactional and multimodal organisation of teaching and learning in the CLIL classroom. They shed light both on practices that can be expected to be fairly general (e.g., giving instructions; making individual students' interventions relevant for the whole class; students' code-switching) and on those that are more disciplinary-specific, such as practices related to laboratory experiments in CLIL science classrooms. While CLIL teachers need to support the development of their students' language skills and be aware of the role of language in their classrooms, they also need to acknowledge the multimodal nature of teaching and learning disciplinary-specific practices and the interactional competences they require.
For the purposes of looking for effective teaching practices, CLIL practitioners can learn a lot from findings in other classroom contexts, L1 teaching included. We argue that, although the language of instruction in CLIL classrooms is an L2, classroom practices in such contexts bear many similarities with those in L1 contexts. These similarities also mean that students are familiar with classroom routines and expected ways of participating in lessons, which can provide important support to content and language learning in CLIL classrooms, particularly in cases where students may still have limited L2 skills.
To conclude, we hope that our observations have illustrated both the contribution that a multimodal CA perspective can make to CLIL classroom-based research and its pedagogical implications for CLIL practitioners.