ORCID

Hongwen Guo: https://orcid.org/0000-0002-1751-0918

Matthew Johnson: https://orcid.org/0000-0003-3157-4165

Luis Saldivia: https://orcid.org/0009-0007-3482-7654

Michelle Worthington: https://orcid.org/0009-0006-0480-3769

Abstract

Digital tools in digitally based math assessments are designed to help students solve math problems more effectively. However, inappropriate use of these tools can lead to negative outcomes. Using process data from NAEP, we generalized differential item functioning methodologies and conducted a systematic investigation of the effectiveness of tool use. Three measures were proposed to evaluate the effectiveness of a digital tool on an item: its popularity (the proportion of students who used it), its impact on item score (the score gain/loss after matching on performance score and other tool uses), and its impact on item response time (the time saved/lost after matching on performance score and other tool uses). Using data from tens of thousands of students on the NAEP digital platform, we found that the Calculator appeared to be the most useful tool, while the Highlighter was the least used among the dozen digital tools examined. Further application of machine learning algorithms revealed clusters of tool use on individual items associated with students at different levels of math proficiency and time management skill. Our systematic evaluation of digital tools contributes general methodologies, solid data evidence, and valuable insights to inform the design of digital tools not only for the NAEP platform but also for broader learning and assessment platforms.
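The three per-item measures described above could be computed along the following lines. This is a minimal sketch under stated assumptions, not the paper's generalized differential item functioning procedure: it assumes one row per student for a single item, with hypothetical column names (a 0/1 tool-use flag, item_score, response_time, and matching variables such as a performance band and other-tool indicators).

```python
import pandas as pd

def tool_effectiveness(df, tool_col, score_col="item_score",
                       time_col="response_time",
                       match_cols=("perf_band", "other_tools_pattern")):
    """Sketch of three per-item measures for one digital tool.

    Assumed (hypothetical) columns: a 0/1 indicator for the tool of
    interest, the item score, the response time, and matching variables
    such as an overall performance band and a pattern of other tool uses.
    """
    # Measure 1: popularity -- proportion of students who used the tool.
    popularity = df[tool_col].mean()

    # Measures 2 and 3: within each matched cell (same performance band
    # and same pattern of other tool uses), compare tool users with
    # non-users, then average cell-level differences weighted by cell size.
    diffs = []
    for _, cell in df.groupby(list(match_cols)):
        users = cell[cell[tool_col] == 1]
        nonusers = cell[cell[tool_col] == 0]
        if len(users) and len(nonusers):
            diffs.append((len(cell),
                          users[score_col].mean() - nonusers[score_col].mean(),
                          users[time_col].mean() - nonusers[time_col].mean()))

    if diffs:
        total = sum(n for n, _, _ in diffs)
        score_gain = sum(n * d for n, d, _ in diffs) / total
        time_change = sum(n * d for n, _, d in diffs) / total
    else:
        score_gain = time_change = float("nan")

    return popularity, score_gain, time_change
```

A positive score_gain with a negative time_change would suggest the tool helps students answer correctly while saving time; the coarse cell-matching here stands in for whatever matching the study actually used.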

DOI

https://doi.org/10.59863/YCHQ6160
