ORCID

Yue Huang: https://orcid.org/0000-0003-2175-9852

Corey Palermo: https://orcid.org/0000-0003-1921-5127

Abstract

Automated writing evaluation (AWE) has long supported assessment and instruction, yet existing systems struggle to capture deeper rhetorical and pedagogical aspects of student writing. Recent advances in generative language models (GLMs) such as GPT and Llama present new opportunities, but their effectiveness remains uncertain. This review synthesizes 29 studies on automated essay scoring and 14 on automated writing feedback generation, examining how GLMs are applied through prompting, fine-tuning, and adaptation. Findings show GLMs can approximate human scoring and deliver richer, rubric-aligned feedback, but fairness, validity, and ethical issues remain largely unaddressed. We conclude that GLMs hold promise to enhance AWE, provided that future work establishes robust evaluation frameworks and safeguards to ensure responsible, equitable use.

DOI

https://doi.org/10.59863/FAMJ7696
