Milin et al.
- Towards cognitively plausible data science in language research (2016), Milin, Divjak, Dimitrijevic and Baayen
- Identify difficult and easy forms (from lemma to plural form)
- Check if human participants also react differently to independently identified difficult and easy forms
- Compare NDL learning model to TiMBL and human results
- MVDM (Modified Value Difference Metric) computes the distance between two values of a feature so that it reflects their patterns of co-occurrence with categories
- Using MVDM adds an unsupervised learning component to MBL (Hoste, 2005), because it essentially clusters feature values and uses that information
- Using larger values of k with MVDM is helpful
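The value-difference idea in the bullets above can be sketched in a few lines. This is a minimal illustration, not TiMBL's implementation: it assumes the common formulation in which the distance between two feature values is the summed absolute difference of their conditional category probabilities. The function name `value_difference` and the toy data (stem-final vowels paired with inflectional classes "I" and "II") are hypothetical.

```python
from collections import Counter, defaultdict


def value_difference(examples, v1, v2):
    """Distance between two values of one feature, based on how they
    co-occur with categories: sum over c of |P(c | v1) - P(c | v2)|.

    `examples` is a list of (feature_value, category) pairs.
    """
    counts = defaultdict(Counter)
    for value, category in examples:
        counts[value][category] += 1

    n1 = sum(counts[v1].values())
    n2 = sum(counts[v2].values())
    if n1 == 0 or n2 == 0:
        raise ValueError("both values must occur in the examples")

    categories = {category for _, category in examples}
    return sum(
        abs(counts[v1][c] / n1 - counts[v2][c] / n2) for c in categories
    )


# Hypothetical toy data: (stem-final vowel, inflectional class)
toy = [
    ("a", "I"), ("a", "I"), ("a", "II"),
    ("o", "I"), ("o", "I"), ("o", "II"),
    ("e", "II"), ("e", "II"),
]

d_ao = value_difference(toy, "a", "o")  # identical class distributions
d_ae = value_difference(toy, "a", "e")  # very different class distributions
```

Values with the same class distribution ("a" and "o" here) end up at distance zero, which is the sense in which the metric implicitly clusters feature values.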
- Easy words that are frequent tokens (forms) are reacted to faster
- Maybe this interaction doesn’t occur with difficult words because there is less variation in the frequency of the difficult words?
- This seems similar to results with regular past tense forms in English:
- Strikingly, TiMBL’s inflectional class probabilities turn out to be predictive in production and comprehension, i.e., for lexical decision latencies.
- Two Grapheme-to-Lexeme (G2L) Measures
- G2L-Diversity: sum of the absolute values of the activations of all possible outcomes, given a set of input cues.
- Input cues that activate many different outcomes give rise to a highly diverse activation vector, which in turn indicates a high degree of Uncertainty about the intended outcome.
- G2L-Prior: sum of the absolute values of the weights on the connections from all cues to a given outcome.
- independent of the actual cues encountered in the input
- reflects the prior availability of an outcome, its entrenchment in the learning network
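The two measures defined above can be computed directly from a cue-to-outcome weight matrix. The sketch below assumes the weights are stored as a nested dictionary (cue, then outcome) and that an outcome's activation is the sum of the weights from the active cues. The weights, trigraph cues, outcome labels and function names are all made up for illustration; they do not come from the paper or from any NDL package.

```python
def activations(weights, cues):
    """Activation of each outcome: summed weights from the active cues."""
    outcomes = {o for row in weights.values() for o in row}
    return {
        o: sum(weights.get(c, {}).get(o, 0.0) for c in cues)
        for o in outcomes
    }


def g2l_diversity(weights, cues):
    """Sum of absolute activations over all outcomes, given the input cues."""
    return sum(abs(a) for a in activations(weights, cues).values())


def g2l_prior(weights, outcome):
    """Sum of absolute weights from ALL cues to one outcome.

    Does not depend on which cues are present in the input.
    """
    return sum(abs(row.get(outcome, 0.0)) for row in weights.values())


# Hypothetical weight matrix: trigraph cue -> lexeme outcome -> weight
W = {
    "#ha": {"HAND": 0.5, "HAT": 0.4},
    "han": {"HAND": 0.6, "HAT": -0.1},
    "and": {"HAND": 0.3, "HAT": 0.0},
    "hat": {"HAND": -0.2, "HAT": 0.7},
}

hand_cues = ["#ha", "han", "and"]
diversity_hand = g2l_diversity(W, hand_cues)
prior_hand = g2l_prior(W, "HAND")
prior_hat = g2l_prior(W, "HAT")
```

Note that `g2l_prior` takes no cue list at all, which mirrors the point that the prior is independent of the input actually encountered.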
- TiMBL assigns higher probabilities to forms belonging to lemmas with letter trigraphs that yield more diverse activations
- Those trigraphs belong to a rich exemplar space in the memory
- it would be expected that higher probabilities would result in shorter response latencies
- However, NDL’s G2L-Diversity was in fact positively correlated with RTs, indicating inhibition, i.e., slower recognition.
- TiMBL probabilities are intended to capture the likelihood of a form’s occurrence in production.
- in comprehension (lexicality judgments), high trigraph diversity may hurt performance
- Spontaneous recovery from extinction
- After a conditioned stimulus (CS) has become associated with a given conditioned response (CR), this association is unlearned (extinguished)
- Theoretically, it cannot arise again without retraining
- But in real life, sometimes seemingly completely forgotten associations are reactivated
- shows extinction is not unlearning
- responses that disappear are not necessarily forgotten
- Suggests loss of activation is not simply the mirror of acquiring associations
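To see why the theory predicts that an extinguished association cannot come back on its own, here is a minimal simulation. It assumes the Rescorla-Wagner update rule that NDL builds on; these notes do not spell the rule out, so treat the formulation, the parameter values and the cue name "tone" as illustrative assumptions.

```python
def rescorla_wagner(trials, saliences, learning_rate=0.2, lam_max=1.0):
    """Run Rescorla-Wagner updates and return the strengths after each trial.

    `trials` is a list of (present_cues, outcome_present) pairs.
    Update for each present cue: dV = salience * learning_rate * (target - sum V),
    where target is lam_max if the outcome occurs and 0 otherwise.
    """
    strengths = {cue: 0.0 for cue in saliences}
    history = []
    for present_cues, outcome_present in trials:
        total = sum(strengths[c] for c in present_cues)
        target = lam_max if outcome_present else 0.0
        error = target - total
        for c in present_cues:
            strengths[c] += saliences[c] * learning_rate * error
        history.append(dict(strengths))
    return history


# 50 acquisition trials (tone followed by outcome),
# then 50 extinction trials (tone alone).
schedule = [(["tone"], True)] * 50 + [(["tone"], False)] * 50
history = rescorla_wagner(schedule, saliences={"tone": 0.5})

after_acquisition = history[49]["tone"]
after_extinction = history[-1]["tone"]
```

In this sketch extinction simply drives the strength back toward zero, and nothing in the update rule can raise it again without further reinforced trials. That is why spontaneous recovery is a problem for the model.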
- Given two conditioned stimuli (CSs), one of which is more salient, the more salient CS will develop a stronger association with the conditioned response (CR) than the less salient one
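The salience effect can be illustrated with a small compound-conditioning simulation. As before, this assumes a Rescorla-Wagner style update with a per-cue salience parameter. The cue names, salience values and function name are hypothetical.

```python
def train_compound(saliences, n_trials=100, learning_rate=0.2, lam=1.0):
    """Present all cues together with the outcome on every trial.

    Every cue shares the same prediction error, but each one is updated
    in proportion to its own salience.
    """
    strengths = {cue: 0.0 for cue in saliences}
    for _ in range(n_trials):
        error = lam - sum(strengths.values())
        for cue, salience in saliences.items():
            strengths[cue] += salience * learning_rate * error
    return strengths


# Hypothetical saliences: the light is five times as salient as the tone.
final = train_compound({"light": 0.5, "tone": 0.1})
v_light = final["light"]
v_tone = final["tone"]
```

Because the two cues compete for a fixed amount of associative strength, the more salient cue ends up with most of it, in proportion to the saliences.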
- Some linguistic phenomena can be learned with NDL, and this might show us something about the problem
- What made NDL so nice for animal learning might not scale up to linguistic phenomena
- Inductive approaches to cognition