## Reflections from Teaching: When Are Learners Novices No More?

“Novice learners may benefit most from well-guided low-paced instructional procedures, while more knowledgeable learners may benefit more from minimally guided forms of instruction.” -Slava Kalyuga

**The Example that Led to Reflection**

I never cease to be amazed at the level of knowledge that my teachers keep bringing to the table in my class. Last week we were discussing probability trees, and one student was leading the activity with the following tree (probabilities of drawing a yellow, green or black ball without replacement):

After the student was finished answering a couple of questions we had about the tree, I posed the challenge “Create a question where the final answer is 2/5.” I asked this question because I wanted them to get more comfortable with conditional probability. For example, the probability that we will draw a black ball, given that the first ball is yellow is 2/5, so P(B|Y) = 2/5.

Much to my surprise, the first answer given was “Determine the probability of drawing a black or green ball, given that the first ball drawn was black.” I had to sit back and try to figure out where this answer was coming from since I had not anticipated it (this is both the joy and challenge of allowing students to lead the discussion)!

Since the events of drawing a black ball and drawing a green ball are mutually exclusive, we can calculate

P(B or G | B) = P(B|B) + P(G|B) = 1/5 + 1/5 = 2/5.

Can you determine the branches used to create this question? After doing some of what Michael Jacobs calls “Maths C.S.I.” I had successfully determined how the student was thinking.
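The two conditional probabilities above can be checked with a short sketch. It assumes an urn of 3 yellow, 1 green and 2 black balls, which is one composition consistent with the probabilities quoted here; the actual tree may have been built from different counts, and the names `URN` and `second_draw_prob` are purely illustrative.

```python
from fractions import Fraction

# Hypothetical urn: consistent with the quoted probabilities,
# but not stated explicitly in the post.
URN = {"Y": 3, "G": 1, "B": 2}


def second_draw_prob(second, first, urn=URN):
    """P(second colour | first colour) when drawing without replacement."""
    remaining = dict(urn)
    remaining[first] -= 1  # the first ball is not returned to the urn
    total = sum(remaining.values())
    return Fraction(remaining[second], total)


# The question I had in mind: P(B | Y)
p_b_given_y = second_draw_prob("B", "Y")

# The student's question: P(B or G | B), using mutual exclusivity
p_b_or_g_given_b = second_draw_prob("B", "B") + second_draw_prob("G", "B")

print(p_b_given_y, p_b_or_g_given_b)
```

Under this assumed urn, both questions land on the same final answer of 2/5, even though they follow different branches of the tree.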

**Are All of Our Students Really Novices?**

Over the weekend I began pondering how much talk there is that mathematics students need to be treated like novices, especially in elementary school. For example, in Anna Stokke’s C.D. Howe Report, she states:

To be effective, instructional techniques must cater to the limitations of a person’s working memory, which can hold only a limited amount of new information. This is particularly important for novice learners who have difficulty focusing on new concepts when their working memory is overwhelmed.

I don’t necessarily *disagree* with the statement above, which is taken from Kirschner, Sweller & Clark and heavily founded in Cognitive Load Theory (CLT). It is important for us as teachers to understand when learners may have limitations, and how to effectively combat these limitations. I do, however, think it is important for us to also reflect on how often we treat our students as novice learners, and realize their potential as non-novice learners. Those who argue in favour of CLT often view their learners as novices, effectively by-passing the expert-reversal effect. Stated briefly, the expert-reversal effect says that methods that typically work well to elicit learning in novice learners are not necessarily the best methods to elicit learning in non-novice learners. For example, as one progresses in their knowledge of mathematics, worked examples become less conducive to learning.

In light of this thought, I pose some questions:

1) Are all of our students *actually* novice learners? Is it possible that our students are *sometimes* non-novices?

2) If we agree that at least some of our students are non-novices, what methods should we utilize to elicit learning in these individuals? Must it still be direct instruction and worked examples?

3) If we believe that our students are novice learners, will we *ever* see them as non-novice learners? Does this belief we hold affect their learning?

## A Glimpse into the Mind of a Student with ADHD

“ADHD makes it sound like I have a lack of focus, but I think of it more like a mismanagement of focus.” -student with ADHD

Here are the first few pages of a recent calculus midterm of one of my students who has been diagnosed with ADHD. I’ll let you take a peek at what you see before I give my reflection.

Now, I want you to go back and take a look at the first page, where question #1 ii required the knowledge of the derivative of *log_{3}(x)*. You can see that the student set up the equation *log_{3}(x) = b* in order to help him determine the derivative using the quotient rule. But the giant “*?*” beside *b’* caught my interest. (Of course, if there is anything else of interest to you please leave a comment!)

Now, he begins playing around at the top of the page, recalling rules for how to deal with logarithms. There is a *y = 3^x* and a *y = log_{3}(x)* indicating to me that he was thinking about potentially finding the derivative using inverse functions or implicit differentiation. However, not much happens here, so we will catch up in a few pages.

The next page is nothing special, in that he tackles the next couple of derivative questions without giving any more visible thought to the log base three problem. But check out the top of the third page. Here, he correctly gets the relationship between exponentials and logarithms: *3^x = b* means *log_{3}(b) = x* (or vice versa). Then there is a little bit of play at the bottom of the page trying to re-write this relationship in various ways to potentially get a nice equation to differentiate. Aside from now having the inverse relationship solidified, not much headway is gained on the initial problem.

Finally, on the last page, we see one last attempt to think about *3^x = u*, perhaps a nod to the variables I use when doing the chain rule (*dy/dx = dy/du · du/dx*). This is the final attempt to determine the solution to the log base three problem, and the rest of the test continues in a normal fashion.
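For reference, the inverse-function route that the student seems to be reaching for can be sketched as follows. This derivation does not appear on the test; it is simply one standard way to get the derivative from the relationship *3^y = x*, using implicit differentiation.

```latex
\begin{align*}
y &= \log_{3}(x) \quad\Longleftrightarrow\quad 3^{y} = x \\
\frac{d}{dx}\left(3^{y}\right) &= \frac{d}{dx}(x) \\
3^{y}\,\ln(3)\,\frac{dy}{dx} &= 1 \\
\frac{dy}{dx} &= \frac{1}{3^{y}\ln(3)} = \frac{1}{x\ln(3)}
\end{align*}
```

Every ingredient of this argument (the inverse relationship, the chain rule) shows up somewhere in his scratch work; the pieces just never get assembled in one place.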

The most interesting thing from my perspective is embedding what I see in a cognitive load theory setting. We know that the working memory has limited capacity to hold and synthesize information. This information can come either from environmental stimuli or from schemata entering from long-term memory. I was always under the impression that trying to cut back environmental stimuli for students with ADHD was a must, as this allows the working memory to focus more on the task at hand. However, seeing this test had me thinking a bit deeper.

At the college level, we are typically good at minimizing outside distractions; doors are closed, rooms are quiet and I cross my fingers that maintenance has fixed any lights that are in strobe-mode. However, as I do not have ADHD myself, I cannot comment on what outside stimuli might still be entering the working memory. Perhaps a song that was heard earlier that morning? Whether or not he forgot his lunch at home? What plans are for after school? So let’s assume that some working memory space has been allocated to this.

Now it’s test time. Since this particular student is quite adept at mathematics, most schemata enter the working memory quite effortlessly. We can see this demonstrated on page 2, where some complex derivatives are handled. From my perspective, it is actually the snag of not knowing the derivative of *log_{3}(x)* that pushes the working memory over its capacity. Look at how often he returns to the problem – at the top of page 1, the top & bottom of page 3, and at the top of page 4.

Just how taxing is it on the working memory to be subconsciously processing this log base three problem over the course of four test questions? How debilitating would this be if there were not well-developed schemata to draw from when writing this test? How much more success would there have been if he had been able to dislodge this log base three problem from his working memory, instead of it continually returning to occupy his focus? I find these questions super interesting, and I have thoughts, but no particular answers. If there are any readers who have studied cognitive load theory from the perspective of individuals with ADHD, I’d love to read a bit more on this topic.

## Understanding “Understanding” Part III

A post where we explore how to define understanding in a cognitive load theory framework.

In my last two blog posts, I discussed the concepts of element interactivity, as well as intrinsic and extraneous cognitive load. We say information has high element interactivity if there are many elements of the information that must be processed together at the same time. High element interactivity generally implies high intrinsic cognitive load. Here, intrinsic cognitive load refers to a working memory load caused by the intrinsic nature of information that we are trying to process. Finally, extraneous cognitive load refers to a working memory load imposed by the pedagogical nature of the information being taught.

**Defining Understanding**

Now that we know about element interactivity, we can use this concept to define understanding. In a cognitive load setting, understanding is the ability to process all interacting elements in working memory at one time. Since the focus is on interacting elements, it does not make sense to apply this definition to individual elements, such as learning one French vocabulary word (cat = chat).

Let’s analyze our previous examples. Consider the math fact *3 + 5 = 8*. According to our definition, if a learner is able to answer *3 + 5 = ?* correctly, without having to process each of the interacting elements separately, we would say that she has demonstrated understanding of the question at hand. I would argue then, that using a strategy such as tallying up three and five on her fingers would display a lack of understanding. Even beginning with three fingers and counting up to eight, whilst being a more effective strategy, still displays a lack of understanding as she is processing some or all of the elements individually. Of course, I am not arguing that students shouldn’t be permitted to use these counting strategies. It is likely that these are crucial stepping stones in the learning trajectory, and the instructor needs to be mindful of when the student seems ready to move beyond these strategies.

**Understanding and Incorrectness**

One aspect of the definition that I am curious about is when the learner makes a mistake in the process. Consider solving for *x* in *3x – 10 = 5.* Is it possible for the student to understand, yet be incorrect? Are these mutually exclusive events? Let’s say the student solution is

*3x – 10 = 5*

*3x = -15*

*x = -5*

This is incorrect, but it still shows us that they understand the process of solving for *x*, and that they can process all of this information in working memory at once. Does understanding come down to a judgement call on the side of the instructor in these cases?
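To make the slip concrete, here is the student's work set beside a correct solution. The reading that the student subtracted 10 from the right-hand side instead of adding it is my assumption about where the error occurred; the structure of the two solutions is otherwise identical.

```latex
\begin{align*}
\text{Student:}\quad 3x - 10 &= 5 & \text{Correct:}\quad 3x - 10 &= 5 \\
3x &= 5 - 10 = -15 & 3x &= 5 + 10 = 15 \\
x &= -5 & x &= 5
\end{align*}
```

Both columns isolate the *3x* term and then divide by 3, which is why the incorrect solution can still be read as evidence that the overall process is being handled as a single unit.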

**Instructional Implications – A Case for Quick Math Fact Recall**

Let’s try to deconstruct our current pedagogy in light of this definition of understanding. Consider all of the multiplication facts that our students must recall. There is element interactivity within one individual fact (*3 x 4 = 12*), as well as interaction amongst all of the multiplication facts for three, and interaction amongst all facts up to *9 x 9 = 81*! Working memory might get overwhelmed, as intrinsic load is high due to the number of facts that must be remembered.

Think also about what our current curriculum states: students should be comfortable with knowing other concepts, such as knowing *3 x 4 = 4 + 4 + 4 = 12*, building array models, or knowing about the commutative property. All of this increases extraneous cognitive load, thus requiring more time and effort for the students to move the facts to long-term memory. I would argue that this is why we have seen a shift to moving recall of the multiplication facts to later grade levels. In British Columbia, students aren’t expected to recall facts for 3s or 4s until Grade 5, and there is no mention of the harder facts like 7s, 8s or 9s.

To compare, I had my multiplication facts memorized by the end of Grade 3 in the 80s in Ontario. Some might argue that we were taught without *understanding* (this alternate definition is a bit fuzzy, but typically is interpreted as knowing how to complete a question utilizing a model). This is false, as I have many documents showing that we indeed used models. But the key difference here is that *the focus of instruction was on automatization of facts*, and that models were used to introduce concepts and as help when students weren’t understanding.

**Models were used to decrease intrinsic load, not to increase extraneous load!**

For such a large task, such as learning the multiplication facts, why not have students learn the individual facts first? Using techniques such as interleaved and spaced practice, and introducing new fact families after long exposure to previous ones, would be beneficial for learning. After students are comfortable with recall of the facts, then we can focus our teaching on developing *understanding* (the fuzzier definition) of how multiplication is connected to other concepts. Of course, once students can recall the multiplication facts, they have displayed understanding in the cognitive load sense, as they can process all of the elements together at once. So why would we want to learn our facts first, before connecting to other concepts? Once the facts are remembered well, they can be retrieved quickly and efficiently, leading to lowered intrinsic load, and more working memory capacity to work on the current problem of connecting the fact to another concept.

In conclusion, I am not saying that we shouldn’t explain why certain facts are the way they are! This can certainly be done as motivation to the problem, and mixed throughout as needed; however, this should not be the focus of the learning because this increases extraneous load and not all students will successfully move the facts into long-term memory store this way.

## Understanding “Understanding” Part II

A post where we explore some concepts required to define understanding in a cognitive load theory framework.

In my last blog post, I briefly summarized element interactivity. When elements must be processed in working memory simultaneously due to them being logically connected, we say the elements have high element interactivity. By supporting schemata development in our pedagogical practices, we can combat the strain on working memory that element interactivity causes. There are also two other ideas to keep in mind when reflecting on our pedagogy: intrinsic and extraneous cognitive load. I would like the topic of this post to be dedicated to summarizing and exploring these topics.

**Intrinsic Cognitive Load**

Working memory load that is imposed by the intrinsic nature of the information we are trying to process is known as intrinsic cognitive load. Perhaps this can be explained nicely through the use of an example.

First, let’s think about solving for *?* in the addition statement *3 + 5 = ?*. We have seen that, for novice learners, there are many elements to process here, leading to high element interactivity. Novice learners may have to process all of these elements separately, perhaps first counting to three, then counting up again to eight. In this instance, the high element interactivity causes intrinsic cognitive load. It would be a significant challenge to process anything else in working memory since all of the processing power is dedicated to making sense of the symbols and using the counting-up strategy.

For those who know the fact that *3 + 5 = 8*, this whole element can enter working memory, freeing up processing space. For expert learners with well-built schemata, this problem has low intrinsic cognitive load since they are able to interpret all the symbols in *3+5 = ?* as one unit, and come up with a solution to their interpretation quickly.

In summary, information can have either high or low element interactivity. High element interactivity necessarily leads to high intrinsic cognitive load due to the complex nature of the information. This is especially evident in novice learners. However, as schemata develop in these areas, learners are able to process the interactions of the elements more efficiently, decreasing intrinsic cognitive load.

**Extraneous Cognitive Load**

Working memory load imposed by instructional design is called extraneous cognitive load. For example, open-ended problem solving is a challenge for novice learners since they may be unsure of where to focus their attention. So much working memory capacity is used to make sense of the teaching approach that little to no information can be learned. Based on the *Borrowing & Reorganizing Principle*, as well as the *Narrow Limits of Change Principle*, direct instruction through studying worked examples provides one of the best practices for learning novel information. In general, studying worked examples with expert instruction has low extraneous load. In novel situations with new information, instruction with little to no structure leads to high extraneous cognitive load.

Of course, this comes with some caveats, as worked examples can be structured poorly. The way the instructor approaches examples can also lead to high extraneous load. For instance, when working on related rates problems in calculus, most instructors will read the entire question, then proceed to working through the problem. Due to the high element interactivity and intrinsic load present in these types of problems, solving the problem using the typical approach causes high extraneous load in novice learners. A better approach comes through understanding the *Split Attention Effect*: interweave solution steps with information from the problem to decrease extraneous load.

**Instructional Implications**

We have seen that when there are many interacting elements in a given problem, intrinsic load is necessarily high for novice learners. As instructors, our primary focus should be on schemata formation, as this leads to decreased intrinsic load. Well-built schemata also enter working memory as single elements, freeing up more processing space for other novel information.

When information is presented in a way that causes the learner to focus on aspects unrelated to the problem, this creates unnecessary extraneous cognitive load, leading to decreased working memory capacity. To combat this, we can present novel information through the use of direct instruction & studying worked examples. This will free up working memory by decreasing extraneous cognitive load. As our learners move from novice to expert learners, it becomes easier to vary our teaching pedagogy, as well-built schemata help to decrease intrinsic load in instances when extraneous load is high.

## Understanding “Understanding” Part I

A post where we explore some concepts required to define understanding in a cognitive load theory framework.

**Elements & Schemata**

In one of my earlier posts, I discussed biologically primary and secondary knowledge. In short, primary knowledge is knowledge that we are biologically programmed to learn, such as how to communicate with others within our culture. Secondary knowledge, however, we are not biologically programmed to learn.

To keep things simple within the framework I want to discuss understanding in, let’s assume that facts and procedures can be divided into two classes: elements and schemata. Elements are single pieces of information that can be processed within our working memory, such as knowing that the number 3 corresponds to the numerical amount three. Once known, elements can be placed together to begin forming schemata. For instance, a schema for “3” may include knowing that 3 can be mapped to the word “three” or to three objects (cardinal), is the whole number after 2 and before 4 (ordinal), or that the number 3 may be used on your football jersey (nominal).

Schemata, once well-known, can be linked. For instance, a schema about prime numbers may include knowing that 2, 3 and 5 are the first three prime numbers. In addition to this, elements can form sub-schemata. Our reference to the ordinal, cardinal and nominal interpretations for the number three might all be considered sub-schemata of the overall schema we have for three. As we know, the beauty of schemata is that, once well-formed, they can enter working memory as a single element, freeing up working memory space for other information.

**Element Interactivity**

Element interactivity occurs when two or more elements must be processed simultaneously in working memory because they are logically related. Think about the multiplication fact *3 x 4 = 12*. There are actually five symbols that must all be interpreted at once due to them being logically connected. There are three numerals: 3, 4 and 12. There is the multiplication operation, which could be interpreted in a couple of different ways (as an array, as repeated addition, as a multivariable function that returns the product). Finally, there is the equal sign, which is a symbol referring to the idea of 12 being equivalent in some way to the product of 3 and 4. For a novice learner, all five symbols must be processed individually in the working memory, whereas an expert learner has a well-built schema that allows them to by-pass having to process all of the symbols every time they see a multiplication fact. In essence, an expert processes one element, whereas a novice may have to process all five elements.

**Instructional Implications**

As mathematics instructors, we need to be mindful of how the elements of our problems are interacting within the context of teaching our students. High element interactivity necessarily causes more working memory capacity to be used, increasing cognitive load. One potential way to combat curricular competencies involving high element interactivity is to re-visit pre-existing topics and ensure our students have the well-formed schemata required to ease some of this cognitive load. Think about how challenging linear equations are for our students: they involve complex understanding of integers and fractions, as well as comprehension of how to manipulate all four of the main numerical operations. Before introducing equations, it would seem logical to review operations with integers and fractions so that students can consolidate their knowledge in these areas. By helping to create well-formed schemata in these topics, students can apply more working memory capacity to the new procedures that are intrinsic to linear equations, without applying too much working memory capacity to previous curricular topics. If consolidation does not happen, it is no surprise that the student struggles with linear equations, as the element interactivity is high and too much working memory is being allocated to topics that are not the focus of the lesson.

In my next blog post, I will explore two more interesting topics: intrinsic and extraneous cognitive load. We will see the interplay of element interactivity with these two topics and discuss instructional implications.

## Playing with Right Triangles

An interesting set-up of right triangles allows us to prove radical identities.

I was asked by a colleague last week to prove an identity involving radicals. The two expressions arise when considering cosine of the angle pi/12. Normally, one would apply a sum or difference formula

and this would simplify to

*(√6 + √2) / 4*  (1)
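One way to carry out this computation is sketched below. It assumes the angle is split as π/3 − π/4; the post does not say which decomposition was used, and π/4 − π/6 leads to the same value.

```latex
\begin{align*}
\cos\frac{\pi}{12}
  &= \cos\left(\frac{\pi}{3} - \frac{\pi}{4}\right) \\
  &= \cos\frac{\pi}{3}\cos\frac{\pi}{4} + \sin\frac{\pi}{3}\sin\frac{\pi}{4} \\
  &= \frac{1}{2}\cdot\frac{\sqrt{2}}{2} + \frac{\sqrt{3}}{2}\cdot\frac{\sqrt{2}}{2} \\
  &= \frac{\sqrt{6} + \sqrt{2}}{4}
\end{align*}
```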

However, when one of his students used a calculator, the calculator returned an unusual expression:

(2)

He and the student were able to verify that these expressions evaluated to the same decimal expansion, so they appeared to be equivalent. But then his student asked him how to prove the equivalence of expressions like this. He tried for a bit, unsuccessfully – then he tormented me with this problem all Easter weekend. Eventually, I was able to show the equivalence using an old right triangle trick I saw a few years back.

Attach two right triangles together in such a way that the right leg of the second and the bottom leg of the first meet at a right angle. On the hypotenuses of the smaller triangles write root 6 and root 2, respectively. This is done so that the hypotenuse of the larger right triangle is root 6 + root 2 – matching up with the numerator of expression (1).

Our goal is to apply the Pythagorean Theorem on the large right triangle, so we need to determine the legs of the larger triangle. To do this, we will determine the legs of the smaller right triangles. For the root 2 triangle, we have the obvious choice of making the legs (1, 1). For the root 6 triangle, we could make the legs (root 2, root 4), (root 3, root 3) or (root 1, root 5). Notice that in expression (2), we have a root 3. This suggests we might want to try the (root 3, root 3) combination for the root 6 triangle. This shows us that the legs of the larger right triangle are both root 3 + 1.

Now we can apply the Pythagorean Theorem on the large right triangle.

Taking the square root of both sides gives

And finally, dividing both sides of the equation by 4 yields the desired result.
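Written out, the three steps above look roughly like this. The legs *√3 + 1* and the hypotenuse *√6 + √2* come straight from the construction; the exact form of expression (2) is not reproduced in the text, so the last line shows two equivalent forms a calculator might plausibly return, as an assumption.

```latex
\begin{align*}
\left(\sqrt{6} + \sqrt{2}\right)^{2}
  &= \left(\sqrt{3} + 1\right)^{2} + \left(\sqrt{3} + 1\right)^{2}
   = 2\left(\sqrt{3} + 1\right)^{2} \\
\sqrt{6} + \sqrt{2}
  &= \sqrt{2}\,\left(\sqrt{3} + 1\right) \\
\frac{\sqrt{6} + \sqrt{2}}{4}
  &= \frac{\sqrt{2}\,\left(\sqrt{3} + 1\right)}{4}
   = \frac{\sqrt{3} + 1}{2\sqrt{2}}
\end{align*}
```

As a sanity check, both sides of the first line expand to *8 + 4√3*.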

I suppose that the moral of the story here, besides seeing some really interesting mathematics, is that I never would have solved this problem unless I had seen the previous problem involving something similar. In general, I believe it is safe to state that in order to be successful solving problems, one should be exposed to many different types of problems (ever wonder how those Math Olympiad contestants get so “smart”?). From a cognitive science perspective this makes sense – it allows us to create problem archetypes (schemata) that we can draw upon to help solve future problems. And the more well-connected these schemata become, the easier it becomes to solve problems.
