Journal paper | 2024 | 29 Apr 2026

Assessing Generative Feedback in K-12 Writing and STEM Learning

Claire Evans, Jun Wei, Nora Singh

Computers & Education: Artificial Intelligence

generative AI, feedback, assessment

500-word summary

This mock summary reviews a journal paper about evaluating generative AI feedback for K-12 learning. The authors start from a common observation: generative AI systems can produce fluent comments very quickly, but fluency does not guarantee educational value. Feedback may be too vague, too directive, factually incorrect, misaligned with the rubric, or inappropriate for a student's developmental level. The paper therefore asks how schools and product teams should assess AI-generated feedback before using it with learners.

The authors propose a four-part evaluation framework. The first dimension is correctness: whether the feedback accurately identifies strengths, errors, and misconceptions. The second is pedagogical usefulness: whether the feedback helps the learner take a productive next step rather than simply receive an answer. The third is alignment: whether the comment reflects the learning objective, rubric, curriculum, and task constraints. The fourth is relational tone: whether the language is encouraging, specific, and respectful without pretending to know the student's personal circumstances.
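The four dimensions can be recorded as a simple rating structure. The sketch below is illustrative only: the field names follow the framework, but the 1-4 scale, the `FeedbackRating` class, the `weak_dimensions` helper, and the threshold of 3 are assumptions made for this example and do not come from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FeedbackRating:
    """One reviewer's rating of a single AI-generated comment.

    Field names mirror the four dimensions; the 1-4 scale is an
    assumption for this sketch, not something the paper specifies.
    """
    correctness: int
    pedagogical_usefulness: int
    alignment: int
    relational_tone: int

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 1-4 range.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 1 <= value <= 4:
                raise ValueError(f"{f.name} must be between 1 and 4, got {value}")


def weak_dimensions(rating: FeedbackRating, threshold: int = 3) -> list[str]:
    """Return the dimensions scored below the (hypothetical) threshold."""
    return [f.name for f in fields(rating) if getattr(rating, f.name) < threshold]


rating = FeedbackRating(
    correctness=4, pedagogical_usefulness=2, alignment=3, relational_tone=4
)
print(weak_dimensions(rating))  # ['pedagogical_usefulness']
```

Scoring each dimension separately, rather than collapsing them into one number, keeps a fluent but unhelpful comment from passing on tone alone.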

The paper applies the framework to writing tasks and STEM explanation tasks. In writing, useful feedback often points to structure, evidence, clarity, and revision strategies. In STEM, useful feedback may identify conceptual gaps, prompt reasoning, or encourage the learner to move between representations. The authors find that generic prompts produce inconsistent results, while task-specific rubrics and examples improve feedback quality. They also show that the same feedback can be helpful for one learner and unhelpful for another if the system does not account for prior knowledge.
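One way to picture the difference between a generic prompt and a task-specific one is a small prompt builder that embeds the rubric and worked feedback examples. This is a hypothetical sketch: `build_feedback_prompt`, its parameters, and the prompt wording are assumptions for illustration, and the summary does not describe the prompts the authors actually used.

```python
from __future__ import annotations


def build_feedback_prompt(
    task: str,
    rubric: list[str],
    examples: list[tuple[str, str]],
    student_work: str,
) -> str:
    """Assemble a task-specific prompt from a rubric and feedback examples."""
    lines = [f"Task: {task}", "Rubric criteria:"]
    lines += [f"- {criterion}" for criterion in rubric]
    # Each example pairs a piece of student work with teacher-approved feedback.
    for work, feedback in examples:
        lines += ["Example student work:", work, "Example feedback:", feedback]
    lines += [
        "Student work:",
        student_work,
        "Give feedback that addresses each rubric criterion "
        "and suggests one next step without giving the answer.",
    ]
    return "\n".join(lines)


prompt = build_feedback_prompt(
    task="Explain why ice floats on water.",
    rubric=["Uses the concept of density", "Supports the claim with evidence"],
    examples=[("Ice is lighter.", "What do you mean by lighter? Compare equal volumes.")],
    student_work="Ice floats because it is cold.",
)
print(prompt)
```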

A significant part of the paper discusses human review. Teachers cannot review every generated sentence in a high-volume system, but they need control over feedback policies. The authors recommend feedback templates, risk categories, sampling audits, and escalation rules for sensitive cases. They also encourage student-facing transparency: learners should know that feedback was AI-assisted and should be invited to question or discuss it.
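The combination of risk categories, sampling audits, and escalation rules can be sketched as a small routing function. The category names, audit rates, `FeedbackItem` fields, and `route` function below are all hypothetical; the summary names the mechanisms but gives no concrete categories or numbers.

```python
import random
from dataclasses import dataclass

# Hypothetical audit rates per risk category (share of items sampled
# for teacher review). These values are assumptions for illustration.
AUDIT_RATES = {"low": 0.05, "medium": 0.25, "high": 1.0}


@dataclass
class FeedbackItem:
    item_id: str
    risk: str        # one of the AUDIT_RATES keys
    sensitive: bool  # assumed to be set by an upstream policy check


def route(item: FeedbackItem, rng: random.Random) -> str:
    """Decide how one generated comment is handled."""
    if item.sensitive:
        return "escalate"  # escalation rule: sensitive cases always go to a teacher
    if rng.random() < AUDIT_RATES[item.risk]:
        return "audit"     # sampled for teacher review
    return "release"       # shown to the learner, labelled as AI-assisted


rng = random.Random(0)  # seeded so an audit run can be reproduced
items = [
    FeedbackItem("a1", "low", False),
    FeedbackItem("a2", "high", False),
    FeedbackItem("a3", "low", True),
]
for item in items:
    print(item.item_id, route(item, rng))
```

Keeping the rates in one table gives teachers a single place to tighten or relax review without reading every generated sentence.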

For AIEDHK, the paper is highly practical. It provides a bridge between research evaluation and product quality assurance. Any AI education product that generates feedback needs more than a demonstration video; it needs an evaluation protocol that measures learning alignment, usefulness, safety, and teacher control. The paper's framework could become part of a Hong Kong AIED product review checklist, especially for multilingual writing and STEM learning contexts.