D. Reitter, J.D. Moore / Journal of Memory and Language 76 (2014) 29–46

conditions: a repetition case (where a passive occurred shortly before), and a control case (where the passive has not occurred recently). Priming is the result of the difference between the normalized counts. Under this view, priming is not repetition, but the increase in probability caused by a preceding occurrence. Our technique is similar, but extends this method by looking at all syntactic constructions rather than just passives, and by using regression for greater sensitivity.
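As a rough illustration of the count-based view described above, the following sketch computes priming as the difference between the normalized counts of the repetition and control conditions. The function name `priming_effect` and all counts are hypothetical, chosen only to show the arithmetic; they are not taken from any corpus or from the study under discussion.

```python
def priming_effect(rep_hits, rep_total, ctrl_hits, ctrl_total):
    """Difference between normalized counts of the two conditions.

    rep_hits / rep_total:   target occurrences after a recent prime
    ctrl_hits / ctrl_total: target occurrences without a recent prime
    """
    p_primed = rep_hits / rep_total      # repetition case
    p_control = ctrl_hits / ctrl_total   # control case
    return p_primed - p_control          # increase in probability


# Hypothetical counts, for illustration only.
effect = priming_effect(rep_hits=30, rep_total=200, ctrl_hits=20, ctrl_total=400)
print(f"estimated increase in probability: {effect:.3f}")
```

A positive value indicates that the construction is more likely after a recent occurrence than without one; a value near zero corresponds to no priming under this view.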
In this and other corpus studies, adding predictors as controls replaces the strict control of semantics in the laboratory experiment. We see a high degree of variance in speakers’ choices of syntactic forms, which is natural, as the underlying semantics largely dictate how to construct the sentences. However, examining a large number of data points allows us to treat semantic variation as noise.

Corpus processing

To examine "all kinds of syntactic constructions", we analyze our datasets in terms of their syntactic phrase structure. Both corpora have been annotated with phrase structure trees through automatic and manual processes that included extensive verification (Anderson et al., 1991; Marcus et al., 1994). From the trees, we identify the syntactic rules used to construct them. We see the rules as a proxy for memory items that a speaker has to retrieve to produce or comprehend a sentence. For example, the tree for the verb phrase "keeping on the edge of the page" yields the six phrase structure rule instances shown in Table 1.¹ The conversion from syntactic trees to rule instances is unambiguous.

Decay-based model of short-term priming

The amount of rule repetition can now be quantified. Structural priming predicts that a rule (target) occurs more often closely after a potential prime of the same rule (stimulus) than further away. Therefore, we correlate the probability of repetition with the distance between prime and target. For example, if a sentence-level conjunction leads to the rule S → S CC S, and such a conjunction appears in utterances 3 and 11, we would observe a repetition, noting its distance d = 8 utterances. We sample repetitions and non-repetitions within 1-s or 1-utterance windows at different distances (ln(Dist), up to 25 utterances or 15 s). Thus, a rule occurrence in the dialogue will normally lead to up to 25 or 15 data points for the various distances, with a binary response variable indicating repetition vs. non-repetition. Memory effects generally decay non-linearly.
Analysis of the repetition probabilities over increasing d confirmed this distribution. Distance is therefore log-transformed in our models, entering as ln(Dist). Unlike in controlled experimentation where specific syntactic constructions are elicited, every rule may be biased by a prior prime in this paradigm.

The example in Fig. 1 shows a subset of the rules appearing in the text. Repetitions a and b are both at distance 2, because the occurrences (prime and target) are two utterances apart, or 4.6 and 3.2 s, respectively. To facilitate the computation, we also drop all hapax rules (frequency f = 1). We exclude cases where syntactic repetition is a mere consequence of verbatim lexical repetition (c). The reason for this is that speakers may merely repeat such phrases without analyzing them syntactically. Lexical repetition is likely to result in syntactic repetition, which could inflate the results.

The basic statistical model compares the probability of a rule occurrence in situations when it was and was not primed. The null hypothesis is that this probability should be unaffected by the prime. Our statistical model is a sensitive variant of this idea. We predict the probability of repetition as a function of the time between prime and target. Priming effects decay over time or are subject to interference in working memory, so we expect repetition probability to decline with increasing distance between prime and target. The slope of this decline is the basis for comparison of priming strength under different conditions. The logistic regression model is specified in the appendix.

The effect of distance on syntactic repetition has been shown in related studies on corpora. Gries (2005) demonstrated a correlation of distance with the repetition

Table 1. Syntactic rules and additional information extracted from the Map Task corpus. The speaker here is the direction follower, as opposed to the direction giver.
This is a simplified example compared to the actual annotation.

| Onset time (s) | Speaker  | Syntactic rule | Yield                           |
|----------------|----------|----------------|---------------------------------|
| 185.105        | Follower | VP → VBG PP    | Keeping on the edge of the page |
| 185.363        | Follower | PP → IN NP     | On the edge of the page         |
| 185.490        | Follower | NP → AT NN     | The edge                        |
| 185.490        | Follower | NP → NP PP     | The edge of the page            |
| 185.692        | Follower | PP → IN NP     | Of the page                     |
| 185.729        | Follower | NP → AT NN     | The page                        |

¹ The analysis uses the Brown Corpus part-of-speech tags (Kucera and Francis, 1967). IN: preposition, AT: determiner, VBG: verb, present participle/gerund. CC: sentence-level coordinating conjunction.
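The conversion from a tree to rule instances, and the sampling of repetitions at different distances, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the nested-tuple tree encoding, the function names `extract_rules` and `sample_repetitions`, and the choice to look back from each rule occurrence to earlier utterances are all hypothetical. The tree is reconstructed from the rows of Table 1. Onset times, hapax filtering, and the exclusion of verbatim lexical repetition are omitted.

```python
# Assumed encoding: (label, child, child, ...); a leaf child is a plain string (a word).
TREE = ("VP", ("VBG", "keeping"),
        ("PP", ("IN", "on"),
         ("NP", ("NP", ("AT", "the"), ("NN", "edge")),
          ("PP", ("IN", "of"),
           ("NP", ("AT", "the"), ("NN", "page"))))))


def words(node):
    """Return the words (yield) covered by a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]


def extract_rules(node):
    """Return (rule, yield) pairs for every phrasal node, in top-down order."""
    children = node[1:]
    if all(isinstance(c, str) for c in children):
        return []  # part-of-speech node over a word: no phrase structure rule
    rule = f"{node[0]} -> {' '.join(c[0] for c in children)}"
    pairs = [(rule, " ".join(words(node)))]
    for child in children:
        pairs.extend(extract_rules(child))
    return pairs


def sample_repetitions(utterances, max_dist=25):
    """utterances: list of sets of rules, indexed by utterance number.

    For each rule occurrence in utterance t and each distance d (1..max_dist),
    emit (rule, d, repeated), where repeated is True if the same rule also
    occurred in utterance t - d. Looking backwards is an assumption here.
    """
    rows = []
    for t, rules in enumerate(utterances):
        for rule in sorted(rules):
            for d in range(1, min(max_dist, t) + 1):
                rows.append((rule, d, rule in utterances[t - d]))
    return rows


for rule, text in extract_rules(TREE):
    print(f"{rule:14} {text}")

# The conjunction example from the text: the rule occurs in utterances 3 and 11.
dialogue = [set() for _ in range(12)]
dialogue[3].add("S -> S CC S")
dialogue[11].add("S -> S CC S")
print([(r, d) for r, d, repeated in sample_repetitions(dialogue) if repeated])
```

Each emitted row corresponds to one data point with a binary response; in a regression of the kind described above, ln(d) would then serve as the distance predictor.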