Abstract

Natural language processing (NLP) has become an increasingly popular approach for analyzing textual responses in educational assessments. An important part of NLP involves cleaning and structuring examinees' written responses to create input data that preserves the syntax, semantics, and pragmatics of the words, thereby enabling the extraction of these features. This paper provides foundational knowledge on the steps needed to use NLP in educational measurement tasks, guiding researchers and practitioners through text preprocessing, feature extraction, and the analysis of textual data from constructed-response items. Additionally, an R-based example using Latent Dirichlet Allocation is provided, illustrating each step in the pipeline.

DOI

https://doi.org/10.59863/SDYZ2049
