When a teacher is working alone to figure out how to grade students' work in reference to a standards-based rubric, it can feel very daunting. It is hard to feel confident that you are making sound determinations when you are looking only at pieces of evidence produced by your own students. For that reason, teachers can benefit greatly from taking time together to calibrate student scores. But… what is calibration?
What is Calibration?
by Elicia Cárdenas, Director of Training
Calibrating scores means looking at samples of student work, ideally produced by a wide array of students from different contexts, and considering where each one scores on a common rubric. Ideally, calibration happens in a group of individuals, each of whom brings samples of work from their own students and has time to discuss how they would score each piece. If you are moving towards standards-based grading in your classroom, department, or school, calibrating is an important tool for understanding what mastery of the standards looks like. Calibration ensures that standards are being evaluated in the same way: “Calibration safeguards that student work is assessed consistently and in alignment with the rubric.” - Brownstein & Chapin, Center for Collaborative Education
Calibrate scores together
I first experienced this process as a student teacher, working with a PLC (Professional Learning Community) to improve how we approached RTI (Response to Intervention). It was an incredibly eye-opening experience for me: a bunch of teachers with a pile of student writings, a pile of rubrics based on the standards, and a whole lot of coffee. We sat around reading the essays and scoring them, and then one by one, we shared our reasons for our scoring. We debated, disagreed, agreed, and eventually found exemplar texts that represented each designation on the rubric.
Whew! Calibrating scores was a lot of work, but as a pre-service teacher, I cannot think of a better way to have prepared myself for actually assigning grades to student work, knowing that my peers and I were all in agreement about how to use the rubric. I felt a lot more confident when I was scoring student writing, and that alone saved me precious time that year!
Calibrate scores apart
Fast forward ten years: a new colleague dropped by my room to ask for a second opinion on her scoring of a couple of reading assessments, and later, a couple of writing assessments. We realized that it would be worth doing a calibration activity for both of us, even though we were not teaching the same levels. It took all year to squeeze out the time, but at the end of the year, she assigned a free write task to her students. She hid student names to reduce bias, scored the papers in such a way that I couldn’t see her ratings, and passed the papers on to me. I scored them on my own, and finally we sat down and saw where we agreed and disagreed. Because we both did the bulk of the work on our own time and used our shared time only to discuss, the whole process didn’t take too long, and we both came out with a much stronger, shared understanding of student performance levels.
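If you and a colleague record your blind scores in a spreadsheet, the "where did we agree and disagree" step can be sketched in a few lines of Python. This is only an illustration, not the process my colleague and I actually used: the paper IDs, the 1 to 4 numeric rubric levels, and the function name `compare_scores` are all assumptions made up for the example.

```python
def compare_scores(rater_a, rater_b):
    """Compare two teachers' blind rubric scores for the same anonymized papers.

    rater_a and rater_b map a paper ID to a numeric rubric level
    (hypothetical 1-4 scale; substitute the levels of your own rubric).
    """
    if rater_a.keys() != rater_b.keys():
        raise ValueError("Both raters must score the same set of papers")
    if not rater_a:
        raise ValueError("No papers to compare")

    papers = sorted(rater_a)
    # Gap between the two scores for each paper, in rubric levels.
    gaps = {p: abs(rater_a[p] - rater_b[p]) for p in papers}

    exact = [p for p in papers if gaps[p] == 0]
    adjacent = [p for p in papers if gaps[p] == 1]
    far_apart = [p for p in papers if gaps[p] >= 2]

    return {
        "exact_agreement": len(exact) / len(papers),
        "within_one_level": (len(exact) + len(adjacent)) / len(papers),
        # Papers scored two or more levels apart are worth discussing first.
        "discuss_first": far_apart,
    }


# Hypothetical scores for five anonymized papers.
teacher_1 = {"paper_01": 3, "paper_02": 2, "paper_03": 4, "paper_04": 1, "paper_05": 3}
teacher_2 = {"paper_01": 3, "paper_02": 3, "paper_03": 2, "paper_04": 1, "paper_05": 3}

print(compare_scores(teacher_1, teacher_2))
```

With these made-up numbers, the two teachers match exactly on three of five papers and land within one level on four of five, so `paper_03` is the one to pull out and talk through first. The numbers only point you to the conversation; the discussion of why you each scored the way you did is where the calibration happens.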
Classroom grades are performance grades
None of the teachers with whom I calibrated grades had yet taken an official ACTFL MOPI or OPI training, and I think that is OK. Those trainings, while totally valuable, are intended to rate proficiency, not in-class performance. In the classroom context, I am not able to score proficiency (real world, spontaneous communication). Rather, I am always scoring performance because the classroom is never going to be the real world. The best I can do is grade a student based on observed proficiency as demonstrated by their performance, preferably on summative assessments. ACTFL’s performance rubrics are clear and easy to use, whether or not you have attended their trainings.
Getting Started with Calibration
Step 1: Form a calibration cohort
The first step to calibrating grades is to form a calibration cohort! Ideally, your calibration cohort will be teachers in your department or district that teach the same languages.
Since even students within the same class will be at a variety of levels of linguistic proficiency, don’t worry too much about only calibrating with other teachers who teach the same level. Do keep in mind, however, that a Level 4 teacher and a Level 1 teacher who are using the same instructional approach might not have much overlap: the Level 1 teacher will provide mostly Novice-level samples and the Level 4 teacher will provide mostly upper-Novice and Intermediate-range samples.
Step 2: Select a rubric
Next, your Calibration Cohort will need to agree upon a common rubric and performance descriptors. If members of your cohort are already working from the same curriculum, this might be simple! If not, consider using the rubrics that I work from:
- These are the performance rubrics that I prefer to use.
- Here is an interpretive rubric as well.
Step 3: Administer an assessment
Presentational assessments are the least straightforward to evaluate, and so I would recommend starting there. All members of the cohort should administer the same writing assessment to their students. Using a timed freewrite with an optional prompt is a great way to generate student work for scoring, because students will have a large amount of freedom to show what they can do.
Step 4: Come together and calibrate
When it comes time to calibrate, you have several options. Many institutions have put together calibration protocols to guide the process of sitting down together and calibrating, and you may find it useful to read through some of them:
- “Calibration Protocol for Scoring Student Work” from the Rhode Island Department of Education
- “Calibrating and Collaborative Scoring of Student Work” by Sophia.org
- “Creating Inter-Rater Reliability When Scoring Student Work” by Elevate Educators
- Semi-Structured Calibration Activity Protocol from the Stanford Center for Assessment, Learning, and Equity
Calibrating grades in a Department of 1
If you are a Department of 1 and are unable to gather with colleagues in your area, there are other ways for you to use a calibration protocol to help improve reliability when scoring language learner assessments. I have you covered!
Option 1: You can calibrate with yourself.
Eric Herman, the brain behind the Acquisition Memos, suggests that just by using a holistic rubric (such as the ones linked above), you improve your inter-rater reliability. Way to go, you!
Option 2: Calibrate with online colleagues
If you want to mimic the collaborative or synergistic nature of grade calibration but have no close colleagues, you still have options.
How to find calibration colleagues
Connect with colleagues online who share a similar vision for assessing and grading. A good starting point would be to make a post in a group of language teachers, such as the SOMOS Curriculum Collaboration group or the broader iFLT/NTPRS/CI Teaching group. While it is okay to connect with teachers who teach different levels than you do, you WILL want to find teachers who teach the same language.
If you are part of a local or regional PLC, you could do this with a buddy from that group, or a few! Here is a list of regional PLCs.
Some folks like to harness the power of social media by posting student work samples in collaborative spaces and crowdsourcing an opinion. If you go this route, remember to hide any identifying student information! Post a student writing sample, with their name obscured, along with or referencing the rubric that you are using, and ask colleagues to comment and describe how they would score it and why.
Irene Evert, a middle school Spanish teacher from Illinois and a SOMOS user, had a really neat idea for gathering and analyzing opinions from many teachers. She created a Google Form with embedded images of student work and space for teachers to mark how they would score each one. Irene put this resource together so that we can *all* work together to calibrate. Take a few minutes to rate the writing samples in Spanish here and, when you are finished, you can see responses from other teachers.
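If you collect ratings from many teachers this way and export the responses, a short script can show which samples the group agrees on and which ones split opinion. The sketch below is an assumption-heavy illustration, not a description of how Irene's form works: the sample IDs, the level labels, the data layout, and the function name `summarize_ratings` are all hypothetical.

```python
from collections import Counter


def summarize_ratings(responses):
    """Summarize crowd-sourced rubric ratings for each writing sample.

    responses is a list with one dict per teacher, mapping a sample ID to
    the rubric level that teacher chose (hypothetical layout and labels).
    """
    ratings_by_sample = {}
    for response in responses:
        for sample, level in response.items():
            ratings_by_sample.setdefault(sample, []).append(level)

    summary = {}
    for sample, levels in ratings_by_sample.items():
        counts = Counter(levels)
        top_level, top_count = counts.most_common(1)[0]
        summary[sample] = {
            "most_common": top_level,
            # Share of teachers who chose the most common level.
            "share": top_count / len(levels),
            "counts": dict(counts),
        }
    return summary


# Hypothetical responses from four teachers rating two samples.
responses = [
    {"sample_1": "Novice Mid", "sample_2": "Novice High"},
    {"sample_1": "Novice Mid", "sample_2": "Intermediate Low"},
    {"sample_1": "Novice Mid", "sample_2": "Novice High"},
    {"sample_1": "Novice Mid", "sample_2": "Novice Mid"},
]

for sample, info in summarize_ratings(responses).items():
    print(sample, info["most_common"], info["share"])
```

In this made-up data, everyone agrees on `sample_1`, while only half of the teachers chose the most common level for `sample_2`. A sample like that second one is a good candidate for a follow-up discussion about what the rubric descriptors actually mean.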
Calibrate scores to grade with confidence
Taking the time to calibrate might seem daunting and time-consuming, but the confidence you gain after working through the process will save you time and energy in the long run. Calibrate THAT!
If your department is looking to take a serious look at your assessment practices, send us an inquiry to start the conversation about booking Director of Training Elicia Cárdenas to facilitate department-wide training and/or consultation.