“Novice learners may benefit most from well-guided low-paced instructional procedures, while more knowledgeable learners may benefit more from minimally guided forms of instruction.” -Slava Kalyuga
The Example that Led to Reflection
I never cease to be amazed at the level of knowledge that my teachers keep bringing to the table in my class. Last week we were discussing probability trees, and one student was leading the activity with the following tree (probabilities of drawing a yellow, green or black ball without replacement):
After the student was finished answering a couple of questions we had about the tree, I posed the challenge “Create a question where the final answer is 2/5.” I asked this question because I wanted them to get more comfortable with conditional probability. For example, the probability that we will draw a black ball, given that the first ball is yellow, is 2/5, so P(B|Y) = 2/5.
Much to my surprise, the first answer given was “Determine the probability of drawing a black or green ball, given that the first ball drawn was black.” I had to sit back and try to figure out where this answer was coming from since I had not anticipated it (this is both the joy and challenge of allowing students to lead the discussion)!
Since the events of drawing a black ball and drawing a green ball are mutually exclusive, we can calculate
P(B or G | B) = P(B|B) + P(G|B) = 1/5 + 1/5 = 2/5.
Can you determine the branches used to create this question? After doing some of what Michael Jacobs calls “Maths C.S.I.” I had successfully determined how the student was thinking.
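Arithmetic like this can be checked by enumerating the second-draw branches. The sketch below assumes a bag of 3 yellow, 1 green and 2 black balls. That composition is an inference from the probabilities quoted above, not something taken from the original tree, and the function name `conditional` is hypothetical.

```python
from fractions import Fraction

# Assumed bag contents, inferred from the quoted values P(B|Y) = 2/5 and
# P(B|B) = P(G|B) = 1/5; the original tree may have differed.
BAG = {"Y": 3, "G": 1, "B": 2}

def conditional(second, first, bag=BAG):
    """P(second draw is one of `second` | first draw was `first`), no replacement."""
    remaining = dict(bag)
    remaining[first] -= 1  # the first ball is not put back
    total = sum(remaining.values())
    favourable = sum(remaining[colour] for colour in second)
    return Fraction(favourable, total)

print(conditional("B", "Y"))   # black, given the first ball was yellow
print(conditional("BG", "B"))  # black or green, given the first ball was black
```

Under that assumed bag, both the anticipated question and the student's question land on the same value, which is why the student's answer was a valid response to the challenge.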
Are All of Our Students Really Novices?
Over the weekend I began pondering the common claim that mathematics students need to be treated like novices, especially in elementary school. For example, in Anna Stokke’s C.D. Howe Report, she states:
To be effective, instructional techniques must cater to the limitations of a person’s working memory, which can hold only a limited amount of new information. This is particularly important for novice learners who have difficulty focusing on new concepts when their working memory is overwhelmed.
I don’t necessarily disagree with the statement above, which is taken from Kirschner, Sweller & Clark and heavily founded in Cognitive Load Theory (CLT). It is important for us as teachers to understand when learners may have limitations, and how to effectively combat these limitations. I do, however, think it is important for us to also reflect on how often we treat our students as novice learners, and to realize their potential as non-novice learners. Those who argue in favour of CLT often view their learners as novices, effectively by-passing the expertise reversal effect. Stated briefly, the expertise reversal effect says that methods that typically work well to elicit learning in novice learners are not necessarily the best methods to elicit learning in non-novice learners. For example, as one progresses in their knowledge of mathematics, worked examples become less conducive to learning.
In light of this thought, I pose some questions:
1) Are all of our students actually novice learners? Is it possible that our students are sometimes non-novices?
2) If we agree that at least some of our students are non-novices, what methods should we utilize to elicit learning in these individuals? Must it still be direct instruction and worked examples?
3) If we believe that our students are novice learners, will we ever see them as non-novice learners? Does this belief we hold affect their learning?
A post where we explore some concepts required to define understanding in a cognitive load theory framework.
In my last blog post, I briefly summarized element interactivity. When elements must be processed in working memory simultaneously due to them being logically connected, we say the elements have high element interactivity. By supporting schemata development in our pedagogical practices, we can combat the strain on working memory that element interactivity causes. There are also two other ideas to keep in mind when reflecting on our pedagogy: intrinsic and extraneous cognitive load. I would like the topic of this post to be dedicated to summarizing and exploring these topics.
Intrinsic Cognitive Load
Working memory load that is imposed by the intrinsic nature of the information we are trying to process is known as intrinsic cognitive load. Perhaps this can be explained nicely through the use of an example.
First, let’s think about solving for ? in the addition statement 3 + 5 = ?. We have seen that, for novice learners, there are many elements to process here, leading to high element interactivity. Novice learners may have to process all of these elements separately, perhaps first counting to three, then counting up again to eight. In this instance, the high element interactivity causes intrinsic cognitive load. It would be a significant challenge to process anything else in working memory since all of the processing power is dedicated to making sense of the symbols and using the counting-up strategy.
For those who know the fact that 3 + 5 = 8, this whole element can enter working memory, freeing up processing space. For expert learners with well-built schemata, this problem has low intrinsic cognitive load since they are able to interpret all the symbols in 3+5 = ? as one unit, and come up with a solution to their interpretation quickly.
In summary, information can have either high or low element interactivity. High element interactivity necessarily leads to high intrinsic cognitive load due to the complex nature of the information. This is especially evident in novice learners. However, as schemata develop in these areas, learners are able to process the interactions of the elements more efficiently, decreasing intrinsic cognitive load.
Extraneous Cognitive Load
Working memory load imposed by instructional design is called extraneous cognitive load. For example, open-ended problem solving is a challenge for novice learners since they may be unsure of where to focus their attention. So much working memory capacity is being used to understand the teaching approach that little to no information can be learned. Based on the Borrowing & Reorganizing Principle, as well as the Narrow Limits of Change Principle, direct instruction through studying worked examples provides one of the best practices for learning novel information. In general, studying worked examples with expert instruction has low extraneous load. In novel situations with new information, instruction with little to no structure leads to high extraneous cognitive load.
Of course, this comes with some caveats, as worked examples can be structured poorly. The way the instructor approaches examples can also lead to high extraneous load. For instance, when working on related rates problems in calculus, most instructors will read the entire question, then proceed to working through the problem. Due to the high element interactivity and intrinsic load present in these types of problems, solving the problem using the typical approach causes high extraneous load in novice learners. A better approach comes through understanding the Split Attention Effect: interweave solution steps with information from the problem to decrease extraneous load.
We have seen that when there are many interacting elements in a given problem, intrinsic load is necessarily high for novice learners. As instructors, our primary focus should be on schemata formation, as this leads to decreased intrinsic load. Well-built schemata also enter working memory as single elements, freeing up more processing space for other novel information.
When information is presented in a way that causes the learner to focus on aspects unrelated to the problem, this creates unnecessary extraneous cognitive load, leading to decreased working memory capacity. To combat this, we can present novel information through the use of direct instruction & studying worked examples. This will free up working memory by decreasing extraneous cognitive load. As our learners move from novice to expert learners, it becomes easier to vary our teaching pedagogy, as well-built schemata help to decrease intrinsic load in instances when extraneous load is high.
A post where we explore some concepts required to define understanding in a cognitive load theory framework.
Elements & Schemata
In one of my earlier posts, I discussed biologically primary and secondary knowledge. In short, primary knowledge is knowledge that we are biologically programmed to learn, such as how to communicate with others within our culture. Secondary knowledge, however, we are not biologically programmed to learn.
To keep things simple within the framework in which I want to discuss understanding, let’s assume that facts and procedures can be divided into two classes: elements and schemata. Elements are single pieces of information that can be processed within our working memory, such as knowing that the number 3 corresponds to the numerical amount three. Once known, elements can be placed together to begin forming schemata. For instance, a schema for “3” may include knowing that 3 can be mapped to the word “three” or to three objects (cardinal), is the whole number after 2 and before 4 (ordinal), or that the number 3 may be used on your football jersey (nominal).
Schemata, once well-known, can be linked. For instance, a schema about prime numbers may include knowing that 2, 3 and 5 are the first three prime numbers. In addition to this, elements can form sub-schemata. Our reference to the ordinal, cardinal and nominal interpretations for the number three might all be considered sub-schemata of the overall schema we have for three. As we know, the beauty of schemata is that, once well-formed, they can enter working memory as a single element, freeing up working memory space for other information.
Element interactivity occurs when two or more elements must be processed simultaneously in working memory because they are logically related. Think about the multiplication fact 3 × 4 = 12. There are actually five symbols that must all be interpreted at once because they are logically connected. There are three numerals: 3, 4 and 12. There is the multiplication operation, which could be interpreted in a couple of different ways (as an array, as repeated addition, as a multivariable function that returns the product). Finally, there is the equal sign, which is a symbol referring to the idea of 12 being equivalent in some way to the product of 3 and 4. For a novice learner, all five symbols must be processed individually in working memory, whereas an expert learner has a well-built schema that allows them to by-pass having to process all of the symbols every time they see a multiplication fact. In essence, an expert processes one element, whereas a novice may have to process all five elements.
As mathematics instructors, we need to be mindful of how the elements of our problems are interacting within the context of teaching our students. High element interactivity necessarily causes more working memory capacity to be used, increasing cognitive load. One potential way to manage curricular competencies involving high element interactivity is to revisit earlier topics and ensure our students have the well-formed schemata required to ease some of this cognitive load. Think about how challenging linear equations are for our students: they involve complex understanding of integers and fractions, as well as comprehension of how to manipulate all four of the main numerical operations. Before introducing equations, it would seem logical to review operations with integers and fractions so that students can consolidate their knowledge in these areas. By helping to create well-formed schemata in these topics, we allow students to apply more working memory capacity to the new procedures that are intrinsic to linear equations, without applying too much working memory capacity to previous curricular topics. If consolidation does not happen, it is no surprise that the student struggles with linear equations, as the element interactivity is high and too much working memory is being allocated to topics that are not the focus of the lesson.
In my next blog post, I will explore two more interesting topics: intrinsic and extraneous cognitive load. We will see the interplay of element interactivity with these two topics and discuss instructional implications.
An interesting set-up of right triangles allows us to prove radical identities.
I was asked by a colleague last week to prove an identity involving radicals. The two expressions arise when considering cosine of the angle pi/12. Normally, one would apply a sum or difference formula
and this would simplify to
However, when one of his students used a calculator, it returned an unusual expression:
He and the student were able to verify that these expressions evaluated to the same decimal expansion, at least to the digits displayed, which suggested they must be equivalent. But then his student asked him how to prove the equivalence of expressions like this. He tried for a bit without success, and then he tormented me with this problem all Easter weekend. Eventually, I was able to show the equivalence using an old right triangle trick I saw a few years back.
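A quick numerical check of the kind the student ran might look like the sketch below. The two closed forms are assumptions reconstructed from the description in this post (a numerator of root 6 + root 2 over a denominator of 4, and an expression containing root 3); the variable names are hypothetical.

```python
import math

# Assumed forms of the two expressions, reconstructed from the surrounding text.
sum_form = (math.sqrt(6) + math.sqrt(2)) / 4    # expression (1)
nested_form = math.sqrt(2 + math.sqrt(3)) / 2   # expression (2)

print(math.cos(math.pi / 12), sum_form, nested_form)

# Agreement in floating point is evidence of equivalence, not a proof of it.
print(math.isclose(sum_form, nested_form, rel_tol=1e-12))
```

Matching decimals only show agreement up to machine precision, which is exactly why the student's request for a proof was a fair one.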
Attach two right triangles together so that the right leg of the second and the bottom leg of the first meet at a right angle. On the hypotenuses of the smaller triangles write root 6 and root 2, respectively. This is done so that the hypotenuse of the larger right triangle is root 6 + root 2 – matching up with the numerator of expression (1).
Our goal is to apply the Pythagorean Theorem on the large right triangle, so we need to determine the legs of the larger triangle. To do this, we will determine the legs of the smaller right triangles. For the root 2 triangle, we have the obvious choice of making the legs (1, 1). For the root 6 triangle, we could make the legs (root 2, root 4), (root 3, root 3) or (root 1, root 5). Notice that in expression (2), we have a root 3. This suggests we might want to try the (root 3, root 3) combination for the root 6 triangle. This shows us that the legs of the larger right triangle are both root 3 + 1.
Now we can apply the Pythagorean Theorem on the large right triangle.
Taking the square root of both sides gives
And finally, dividing both sides of the equation by 4 yields the desired result.
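Since the displayed equations for these steps are not shown here, the following is a reconstruction from the prose rather than the original working. It assumes that expression (1) is (root 6 + root 2)/4, that expression (2) is root(2 + root 3)/2, and that the sum or difference step used pi/12 = pi/3 - pi/4 (pi/4 - pi/6 would serve equally well).

```latex
\begin{align*}
\cos\frac{\pi}{12}
  &= \cos\!\left(\frac{\pi}{3}-\frac{\pi}{4}\right)
   = \cos\frac{\pi}{3}\cos\frac{\pi}{4} + \sin\frac{\pi}{3}\sin\frac{\pi}{4}
   = \frac{1}{2}\cdot\frac{\sqrt{2}}{2} + \frac{\sqrt{3}}{2}\cdot\frac{\sqrt{2}}{2}
   = \frac{\sqrt{6}+\sqrt{2}}{4} \\[4pt]
% Pythagorean Theorem on the large triangle, both legs equal to \sqrt{3}+1
(\sqrt{6}+\sqrt{2})^{2}
  &= (\sqrt{3}+1)^{2} + (\sqrt{3}+1)^{2}
   = 2\,(4+2\sqrt{3})
   = 8+4\sqrt{3} \\[4pt]
% Square root of both sides (all quantities are positive)
\sqrt{6}+\sqrt{2}
  &= \sqrt{8+4\sqrt{3}}
   = 2\sqrt{2+\sqrt{3}} \\[4pt]
% Divide both sides by 4
\frac{\sqrt{6}+\sqrt{2}}{4}
  &= \frac{\sqrt{2+\sqrt{3}}}{2}
\end{align*}
```

Expanding the left side directly gives 6 + 2·root 12 + 2 = 8 + 4·root 3, so the algebra agrees with what the triangle picture predicts.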
I suppose that the moral of the story here, besides seeing some really interesting mathematics, is that I never would have solved this problem unless I had seen the previous problem involving something similar. In general, I believe it is safe to state that in order to be successful solving problems, one should be exposed to many different types of problems (ever wonder how those Math Olympiad contestants get so “smart”?). From a cognitive science perspective this makes sense – it allows us to create problem archetypes (schemata) that we can draw upon to help solve future problems. And the more well-connected these schemata become, the easier it becomes to solve problems.
What a crazy couple months it has been! It began mid-September when I was asked to give a talk about the new mathematics curriculum in BC through the college. From here, I attended researchED New York early in October, where I got to connect with some awesome educators and present about connecting interleaved practice with teaching with the amazing Yana Weinstein. After this, I gave a math workshop to some future BC education assistants. We talked about math anxiety, early numeracy, and cognitive science, all while working on interesting math problems together! Yesterday I was in Saskatoon at the SUM Conference to present on non-routine cognitive tasks – synthesizing some information I gathered from Dan Meyer and Steve Leinwand. Finally, I have a small break before the next two workshops to hopefully eke out a post on some reflections I have had from traversing between these different worlds.
#1) Cumulative Review — Why Isn’t Everyone Doing It?
I have recently read “Accessible Mathematics” by Steve Leinwand, in which he outlines 10 instructional shifts to help raise student achievement. One of those shifts is toward giving ongoing cumulative practice at the beginning of your math lessons. It does not have to be terribly extensive – perhaps just four or five short recall-type questions to ensure that students are not forgetting past concepts. It seems obvious that we should be doing this – but many of us are not!
Why should we be doing it? Well, this was somewhat tied to the presentation that Yana and I gave at researchED. It seems that interleaved and spaced practice are highly effective strategies to increase long-term learning in our students. For instance, I saw a 10% increase in the discrimination of problem type when I used interleaved practice in my integral calculus class last year. However, there are some things that we don’t know about interleaving that warrant future studies – like how many problem-types should we include, or how interleaving affects attention in our students.
Why are we not doing it? Efrat discussed some of the practical limitations of using interleaved and spaced practice at researchED New York. Teachers typically list time investment, lack of support, or an incompatible system as reasons for not utilizing spaced practice. What might change their minds? It seems that teachers are interested in ongoing professional development in cognitive science, and time to work with colleagues in order to help ease them into implementation of such tasks. As this is an area of interest to me, please contact me if you or your school is interested in ongoing professional development in cognitive science – I would be happy to help!
#2) Depressing — Why Aren’t We Collaborating?
Continuing the conversation, we could ask why we aren’t collaborating more as a community. Let’s take a look at an example from my life. I had a student come into my calculus class with a TI calculator stating that his teachers at high school said they would absolutely require a TI calculator for college calculus. Literally, what?! With tools like Desmos at our fingertips, why is there a need to drag around a $200 brick? In addition to this, my department doesn’t allow graphical display calculators on major tests anyway. So it looks like I will need to reach out to the local community and try to spread the Desmos love. Why? Let’s look at it from the alternate viewpoint: If I teach Desmos to my students this year, but when they move on, the next teacher doesn’t know how (or doesn’t want to know how) to use Desmos, these students are now potentially disadvantaged. In essence, a teaching tool is greater when we share it with others in the profession and we develop long-term learning goals using similar tools.
#3) Planning — Using Space, Not Time
In a presentation by Nat Banting and Ilona Vashchyshyn, we were asked to consider planning a lesson using quadrants labelled as “Teaching Actions”, “Teaching Spaces”, “Anticipation”, and “Improvisation.” In other words, when it comes to planning, we need to consider our space (the room, manipulatives, desk arrangement) and our actions (modelling, watching, telling). And Nat and Ilona see our actions and spaces situated on a continuum between anticipation and improvisation. In fact, there has to be some improvisation within our classrooms, since it would technically be impossible for us to plan all the possible divergence that may happen in any given lesson.
Of interest to me was their belief that false dichotomies arise when we believe an individual spends all their time within one of the half-planes. For instance, if we believe an educator continuously anticipates and does not improvise in the class, then they are defined as a traditional teacher. On the other hand, those who are thought to improvise all the time are branded as reform or progressive teachers.
This also works for the horizontal half-planes. If an educator is too focused on the teaching spaces, the lesson might be branded as a differentiated instruction type of lesson; and if an educator is too focused on the teaching actions, the lesson might be branded an inquiry type lesson. There is probably more to this conversation, but I am still trying to think more on these two particular diagrams.
#4) Synthesis — Finding Your Balance
In Saskatoon I tried to synthesize some reading that I have been doing as of late. The first bit of information was regarding non-routine cognitive tasks I originally heard of from Dan Meyer at OAME 2017. The main premise is that a mathematical task can either have a real-world context or not. In addition to this, a mathematical task can involve “real work” or “fake work.” There are certain verb choices that we make in a math class that lead to real work (question, predict, analyze, debate) or to fake work (evaluate, simplify). Finally, doing fake work in a real world context is overrated; that is, dressing up a routine task with the air of real worldness is overused in math education. However, pushing students to do real work not in a real world context is underrated; that is, we often fall short of allowing students to use meaningful verbs like question, predict or analyze outside of real world contexts. Think “Calculate when the phone will be charged given the model.” (routine, plug ‘n’ chug, dressed up in real world clothing) versus “Predict the y-value given the data.” (non-routine, analyzing data to predict, non-dressed up mathy question).
In addition to Dan’s thoughts on non-routine tasks, I embedded Steve Leinwand’s idea to lead lessons with data. My thoughts were that if we are interested in moving toward doing real work, data can help drive questioning, noticing and predicting. Provided things go well with the lesson, we can follow up with verbs that allow us to extend, such as generalize or debate. If you are interested in seeing a bit more, my slides from the conference can be found here.
Realistically, I think it would be quite the challenge to create every lesson as a non-routine cognitive task. To me, it feels unrealistic. Also, I firmly believe that the verbs recall, calculate and simplify have a place in mathematics classes and that they should be respected. For instance, John Mighton of JUMP Mathematics consistently reminds me that cognitive load is important – that is, our students require some skill in order to begin a rich task such as data analysis. This skill comes with practice, which can easily be acquired via spaced practice involving recalling facts. On the other side, however, Bjork reminds me of desirable difficulties. Could non-routine cognitive tasks be shaped in such a way as to support learning and long-term retention?
As I continue to navigate the large divide of what feels like a fake world of mathematics and a real world of mathematics education, I can’t help but wonder how we might all be able to help shift the collective from fake work to real work.
“The merit of painting lies in the exactness of reproduction. Painting is a science and all sciences are based on mathematics.” -Da Vinci
Take a moment to read the phrase: “The hungry caterpillar ate the juicy leaf.”
Now quickly complete the word by filling in a missing letter: SO_P.
Out of curiosity, did you complete the word using the letter U to make SOUP? According to Kahneman, author of Thinking, Fast and Slow, after processing the words HUNGRY and ATE in a sentence, we are primed to select the letter U in the word above since SOUP is associated with the words HUNGRY and ATE. Let’s explore this a little bit, and see if and how we might think about using this idea in our math classrooms.
What is the Priming Effect?
An idea in our memory is associated with many other ideas. These associations may be categorical, such as connecting the words FRUIT and APPLE, or property-based, such as connecting ADDITION or MULTIPLICATION to COMMUTATIVITY. Ideas may also be associated through cause and effect, as when we connect ALCOHOL to DRUNK, or CIGARETTE to CANCER. When primed with one of the links in an association, our mind has the ability to bring the other familiar and associated words into our working memory.
What Does Priming Look Like?
When priming occurs it is subconscious and Kahneman argues that we are likely not to believe it is occurring due to the way our brain functions (our brain allows us to believe that we are in full control). He mentions several studies in his book, but I will touch on only two to give you a sense of how priming is at work. In the first, participants were primed with images of money. The group that was primed with money images became more individualistic – less likely to help others and less likely to ask for help – on tasks that followed.
In the second study, it was shown that actions can also be primed. Participants read sentences involving words associated with the elderly such as FORGETFUL, BALD, GRAY, and WRINKLE. None of the sentences explicitly mentioned the elderly. When they were then asked to walk down a hallway, they did so at a much slower pace than normal. The reverse association held as well: participants who were asked to walk slowly for a period of time were more apt to recognize words associated with old age.
Can We Use Priming in Mathematics Class?
I wonder if mathematics teachers have been using this idea already? In most classes and assessments, we tend to be explicit with word choice when we are asking students to perform a task. For example, if I want my students to think in a linear way, I could use an associated word like SLOPE or a similar word like STRAIGHT to help them recall ideas around linear functions. Use of certain cues to aid in recall is most likely beneficial since we know that recall of facts helps with both storage and retrieval strength. I could also see the argument that priming allows students to access previous knowledge, which may be an appropriate action during the set-up of a teaching task.
On the other hand, we do have to be aware that priming may occur without our knowledge at any given time. That is, if we utilize unnecessary pictures or words to aid in a mathematical task, our students may be thinking about what we don’t want them to think about!
In closing, the priming effect is an interesting process to be aware of in our classrooms. However, Kahneman notes that the effect doesn’t work with all individuals, so we do not have to worry about students becoming zombies to priming effects. In addition to this, it seems that the priming effect has been under scrutiny for robustness, including replicability of certain findings. Perhaps we will have to wait to see what color the first coat is before delving deeper into this theory in our classrooms.