Elman 1990
- The network learned generalizations
- Examine the hidden-unit activation pattern for each word; measure the distance between each pattern and every other pattern (Euclidean distance)
- Use these distances to build a hierarchical clustering of the words
- Network learned semantic classes
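The distance-then-cluster procedure above can be sketched as follows. This is a minimal illustration, not Elman's actual data: the activation vectors below are made up by hand (in the real simulation they would be averaged hidden-unit states from a trained SRN), and the word list is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical averaged hidden-unit activation vectors, one per word.
# In Elman (1990) these come from a trained network; here they are
# invented so the clustering step itself can run.
words = ["dog", "cat", "eat", "chase", "cookie", "sandwich"]
activations = np.array([
    [0.9, 0.1, 0.2],   # dog      (animate noun)
    [0.8, 0.2, 0.1],   # cat      (animate noun)
    [0.1, 0.9, 0.8],   # eat      (verb)
    [0.2, 0.8, 0.9],   # chase    (verb)
    [0.7, 0.1, 0.6],   # cookie   (inanimate noun)
    [0.6, 0.2, 0.7],   # sandwich (inanimate noun)
])

# Pairwise Euclidean distances between every pair of activation patterns.
distances = pdist(activations, metric="euclidean")

# Agglomerative hierarchical clustering over those distances.
tree = linkage(distances, method="average")

# Cut the tree into two top-level clusters; with these toy vectors
# the split falls along the noun/verb line.
labels = fcluster(tree, t=2, criterion="maxclust")
for word, label in zip(words, labels):
    print(word, label)
```

With real SRN activations the same cut produces the noun/verb (and animate/inanimate) groupings Elman reports, which is what licenses the claim that the network "learned semantic classes."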
- If the input to a simulation is preselected to avoid problems, one has instantiated an expert filtering system.
- In order to create word classes from surface structure alone, it appears the input must be filtered in just the right way
- Instead of genuine semantic representations, the model substitutes distributional information for semantics
- This is not what humans know about word classes.
- If the simulation’s goals are accomplished by avoiding pronouns, then we have the equivalent of a pronoun filter
- Some strings in English are both nouns and verbs, e.g. smell, break
- The simulation did not learn what children learn
- Yes, the input was oversimplified, but it’s not clear that adding these additional features would make the model perform worse
- Language is very redundant, so certain simplifications actually remove helpful features
- Categories can ‘emerge’ via statistical regularities
- Basic RNN architectures can find these regularities
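The "basic RNN architecture" here is Elman's simple recurrent network: the hidden layer is copied back as a context input at the next time step. A minimal forward-pass sketch in numpy, with all sizes, weights, and the toy input sequence invented for illustration (no training loop shown):

```python
import numpy as np

# Minimal Elman-style simple recurrent network (SRN) forward pass.
# The defining feature: the new hidden state depends on the current
# input AND the previous hidden state (the "context" layer).
rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 5, 4, 5   # one-hot words in/out, small hidden layer
W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # context weights
W_hy = rng.normal(scale=0.1, size=(n_out, n_hidden))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run(sequence):
    """Feed a sequence of one-hot vectors; return hidden states and outputs."""
    h = np.zeros(n_hidden)            # context starts at zero
    hiddens, outputs = [], []
    for x in sequence:
        h = sigmoid(W_xh @ x + W_hh @ h)  # input + copied-back context
        y = sigmoid(W_hy @ h)             # next-word prediction scores
        hiddens.append(h)
        outputs.append(y)
    return np.array(hiddens), np.array(outputs)

# Toy "sentence": words 0, 2, 1 as one-hot vectors.
seq = [np.eye(n_in)[i] for i in (0, 2, 1)]
hiddens, outputs = run(seq)
print(hiddens.shape, outputs.shape)
```

Because the context feeds back, the same word produces different hidden states in different positions; after next-word-prediction training, it is these position-sensitive hidden states whose clustering yields the emergent word classes.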