Robo-graders like long words, not so big on intellectual coherence
When I glanced at the title of a recent New York Times piece on automated essay grading, “Facing a Robo-Grader? Just Keep Obfuscating Mellifluously,” I assumed it was just another fluffy popular science article. Surely no serious organization would use a computer program to grade essays. Not long into the article, however, I discovered that the “robo-grader,” named the E-rater, was developed not by university scientists but by the Educational Testing Service — the organization that administers the GRE and the TOEFL, among other exams.
For now, E-rater grades only essays that are also read by a human grader. Though the grades given by humans and E-rater have been remarkably similar, Les Perelman, an MIT professor, has his reservations about the software. After a month of testing, he has determined that E-rater favors long paragraphs and sentences, connecting words like “moreover,” and words with many syllables. Most troubling, E-rater can’t judge the truth or intellectual coherence of statements in an essay, a weakness Perelman exploits to hilarious effect in an example essay.