Elman 1990

  • The network learned generalizations
  • Examine the hidden unit activation pattern for each word, then measure the distance between each pattern and every other pattern (Euclidean distance)
  • Use these distances to build a hierarchical clustering of the words.
  • Network learned semantic classes
  • If the input to a simulation is preselected to avoid problems, one has effectively built an expert filtering system.
  • In order to accomplish the goal of creating word classes from surface structure alone, it appears that the input must be filtered in just the right way.
  • Instead of semantic representations, the network ends up with distributional information
    • This is not what humans know about word classes.
    • If the simulation’s goals are accomplished by avoiding pronouns, then we have the equivalent of a pronoun filter
  • Some strings in English are both nouns and verbs, e.g. smell, break
  • The simulation did not learn what children learn
  • Yes, the input was oversimplified, but it’s not clear that adding these additional features would make the model perform worse
  • Language is very redundant, so certain simplifications actually remove helpful features
  • Categories can ‘emerge’ via statistical regularities
  • Basic RNN architectures can find these
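The clustering step in the notes above (Euclidean distance between each word's hidden-unit activation pattern and every other word's, then hierarchical clustering) can be sketched as follows. The activation vectors here are made up for illustration; Elman's actual values are not reproduced.

```python
import numpy as np

# Hypothetical per-word hidden-unit activation vectors (illustrative only).
activations = {
    "dog":   np.array([0.90, 0.10, 0.20]),
    "cat":   np.array([0.85, 0.15, 0.15]),
    "eat":   np.array([0.10, 0.90, 0.80]),
    "chase": np.array([0.20, 0.80, 0.90]),
}

def euclidean(a, b):
    return float(np.linalg.norm(a - b))

# Single-linkage agglomerative clustering: repeatedly merge the two
# closest clusters; the sequence of merges is the hierarchy.
clusters = [frozenset([w]) for w in activations]
merges = []
while len(clusters) > 1:
    best = None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            d = min(euclidean(activations[a], activations[b])
                    for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    d, i, j = best
    merged = clusters[i] | clusters[j]
    merges.append((merged, d))
    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

for merged, d in merges:
    print(sorted(merged), round(d, 3))
```

With these toy vectors, the noun-like words merge first and the verb-like words second, which is the kind of structure Elman read off his cluster diagrams.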
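A minimal Elman-style simple recurrent network (SRN) trained on next-word prediction shows how such regularities can emerge. This is a sketch, not Elman's original setup: the toy noun-verb-noun corpus, the layer sizes, and the one-step truncated gradient are all simplifications chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
nouns, verbs = ["man", "woman", "dog", "cat"], ["sees", "chases", "feeds"]
vocab = nouns + verbs
idx = {w: i for i, w in enumerate(vocab)}
V, H = len(vocab), 8
# Toy corpus: every noun-verb-noun sentence.
corpus = [[n1, v, n2] for n1 in nouns for v in verbs for n2 in nouns]

Wxh = rng.normal(0, 0.1, (H, V))   # input -> hidden
Whh = rng.normal(0, 0.1, (H, H))   # context (previous hidden) -> hidden
Why = rng.normal(0, 0.1, (V, H))   # hidden -> output
bh, by = np.zeros(H), np.zeros(V)

def one_hot(w):
    v = np.zeros(V)
    v[idx[w]] = 1.0
    return v

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr, losses = 0.1, []
for epoch in range(30):
    total, count = 0.0, 0
    for sent in corpus:
        h = np.zeros(H)  # context units start empty for each sentence
        for t in range(len(sent) - 1):
            x, h_prev = one_hot(sent[t]), h
            h = np.tanh(Wxh @ x + Whh @ h_prev + bh)
            y = softmax(Why @ h + by)
            target = idx[sent[t + 1]]
            total -= np.log(y[target]); count += 1
            # One-step truncated backprop (a simplification): gradients
            # are not propagated through earlier time steps.
            dy = y - one_hot(sent[t + 1])
            dh = (Why.T @ dy) * (1 - h ** 2)
            Why -= lr * np.outer(dy, h); by -= lr * dy
            Wxh -= lr * np.outer(dh, x); Whh -= lr * np.outer(dh, h_prev)
            bh -= lr * dh
    losses.append(total / count)

print(f"mean loss per token: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The loss drops toward the corpus entropy, meaning the network has learned the positional regularity (nouns are followed by verbs and vice versa) from distributional information alone; averaging the trained hidden states per word and clustering them, as Elman did, would be the next step.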