|Marisa Ferrara Boston|Cornell University|
|John Hale|Cornell University|
|Reinhold Kliegl|University of Potsdam|
|Umesh Patil|University of Potsdam|
|Shravan Vasishth|University of Potsdam|
The surprisal of a word on a probabilistic grammar constitutes a promising complexity metric for human sentence comprehension difficulty. Using two different grammar types, surprisal is shown to have an effect on fixation durations and regression probabilities in a sample of German readers’ eye movements, the Potsdam Sentence Corpus. A linear mixed-effects model was used to quantify the effect of surprisal while taking into account unigram frequency and bigram frequency (transitional probability), word length, and empirically-derived word predictability; the so-called “early” and “late” measures of processing difficulty both showed an effect of surprisal. Surprisal is also shown to have a small but statistically non-significant effect on empirically-derived predictability itself. This work thus demonstrates the importance of including parsing costs as a predictor of comprehension difficulty in models of reading, and suggests that a simple identification of syntactic parsing costs with early measures and late measures with durations of post-syntactic events may be difficult to uphold.
Received: February 29, 2008
Published: September 08, 2008
Boston, M. F., Hale, J., Kliegl, R., Patil, U. & Vasishth, S. (2008). Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam Sentence Corpus. Journal of Eye Movement Research, 2(1):1, 1-12, http://www.jemr.org/.
Potsdam Sentence Corpus