AI Takes Over Grading of Texas STAAR Exams

In a significant shift in the education sector, Texas has implemented artificial intelligence (AI) to grade the written portion of the State of Texas Assessments of Academic Readiness (STAAR) exam. This move aims to streamline the scoring process and potentially save the state millions of dollars annually.

The Texas Education Agency (TEA) is rolling out an “automated scoring engine” for open-ended questions on the STAAR exams for reading, writing, science, and social studies. The technology, which uses natural language processing similar to AI chatbots, is expected to save the state agency about $15-20 million per year that it would otherwise have spent on hiring human scorers through a third-party contractor.

The STAAR test, which measures students’ understanding of state-mandated core curriculum, was redesigned in 2023 to include fewer multiple-choice questions and more open-ended questions, known as constructed response items. The redesigned test has six to seven times as many constructed response items as the previous version.

To develop the scoring system, the TEA gathered 3,000 responses that went through two rounds of human scoring. From this field sample, the automated scoring engine learns the characteristics of responses and is programmed to assign the same scores a human would have given.

This spring, as students complete their tests, the computer will first grade all the constructed responses. Then, a quarter of the responses will be rescored by humans. When the computer has “low confidence” in the score it assigned, those responses will be automatically reassigned to a human. The same thing will happen when the computer encounters a type of response that its programming does not recognize, such as one using lots of slang or words in a language other than English.
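The routing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TEA's actual system: the `EngineResult` fields, the `route` function, and the 0.8 confidence floor are all hypothetical, and the article does not say how "low confidence" is measured or whether the human-rescored quarter is drawn at random.

```python
import random
from dataclasses import dataclass


@dataclass
class EngineResult:
    """Hypothetical output of an automated scoring engine for one response."""
    score: int          # score the engine assigned
    confidence: float   # assumed to range from 0.0 to 1.0
    recognized: bool    # False for responses the engine cannot interpret


def route(result: EngineResult, rng: random.Random,
          confidence_floor: float = 0.8, audit_rate: float = 0.25) -> str:
    """Decide whether a machine score stands or a human rescores it.

    confidence_floor is an invented threshold; audit_rate mirrors the
    "a quarter of the responses" figure in the article.
    """
    if not result.recognized:
        return "human: unrecognized response"
    if result.confidence < confidence_floor:
        return "human: low confidence"
    if rng.random() < audit_rate:
        return "human: sample rescore"
    return "machine score stands"


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs route identically
    responses = [
        EngineResult(score=3, confidence=0.95, recognized=True),
        EngineResult(score=1, confidence=0.40, recognized=True),
        EngineResult(score=0, confidence=0.90, recognized=False),
    ]
    for r in responses:
        print(r, "->", route(r, rng))
```

In this sketch the flagged responses are checked first, so the random quarter is drawn only from responses the engine scored with confidence; the real process may combine the two differently.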

Despite the move to AI, quality control remains in place: humans rescore a portion of the responses and handle those the engine flags as low-confidence or unrecognized. The shift marks a notable change in how assessments are scored, relying on technology for efficiency while keeping human oversight in the process.


Read more at: www.theverge.com