ORCID

Yue Huang: https://orcid.org/0000-0003-2175-9852

Corey Palermo: https://orcid.org/0000-0003-1921-5127

Abstract

Automated writing evaluation (AWE) has long supported assessment and instruction, yet existing systems struggle to capture deeper rhetorical and pedagogical aspects of student writing. Recent advances in generative language models (GLMs) such as GPT and Llama present new opportunities, but their effectiveness remains uncertain. This review synthesizes 29 studies on automated essay scoring and 14 on automated writing feedback generation, examining how GLMs are applied through prompting, fine-tuning, and adaptation. Findings show GLMs can approximate human scoring and deliver richer, rubric-aligned feedback, but fairness, validity, and ethical issues remain largely unaddressed. We conclude that GLMs hold promise to enhance AWE, provided that future work establishes robust evaluation frameworks and safeguards to ensure responsible, equitable use.

DOI

https://doi.org/10.59863/FAMJ7696
