Weight Space Learning

  • Damian Borth, University of St. Gallen
  • treat weights as data points - representation learning
  • can we look at networks and infer the latent factors from their weights?
  • is there knowledge inside models that can still be accessed when they are frozen?
  • loss surfaces and the optimization problem of NNs are non-convex
  • NN training is a very high-dimensional optimization problem
  • what is the relationship between the characteristics (behavior, performance, etc.) of NNs and their solutions in weight space?
  • GDPR: a model can be linked to the database it was trained on
  • Hypothesis
    • NNs populate a structure in weight space
    • structure contains info on properties and generating factors of the models
  • encoder-decoder architecture for weight vectors
    • then used for downstream tasks
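The encoder-decoder over weight vectors can be sketched minimally. Everything below (linear maps, dimensions, synthetic "weights") is an illustrative assumption, not the talk's actual architecture:

```python
import numpy as np

# Sketch (my assumption of the setup): a linear autoencoder over
# flattened weight vectors; embeddings Z would feed downstream tasks.
rng = np.random.default_rng(0)

n_models, w_dim, z_dim = 64, 100, 8
W = rng.normal(size=(n_models, w_dim))  # each row: one model's flattened weights

E = rng.normal(scale=0.1, size=(w_dim, z_dim))  # encoder
D = rng.normal(scale=0.1, size=(z_dim, w_dim))  # decoder

def loss(W, E, D):
    return ((W @ E @ D - W) ** 2).mean()  # reconstruction MSE

loss_before = loss(W, E, D)
lr = 0.01
for _ in range(500):
    Z = W @ E                               # latent embeddings of the models
    err = Z @ D - W                         # reconstruction error
    gD = 2 * Z.T @ err / err.size           # grad of MSE w.r.t. decoder
    gE = 2 * W.T @ (err @ D.T) / err.size   # grad of MSE w.r.t. encoder
    E -= lr * gE
    D -= lr * gD
loss_after = loss(W, E, D)
```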
  • a rather huge model zoo was generated
  • weight space is sometimes symmetric: ACG architecture
    • multiple versions of a NN that do the same thing exist; they can all be reached from one point in a shared space
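The symmetry point is easy to demonstrate: permuting the hidden units of an MLP (and permuting the next layer's columns to match) gives a different weight vector that computes exactly the same function. A small numpy sketch (sizes arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 3))   # input -> hidden
b1 = rng.normal(size=5)
W2 = rng.normal(size=(2, 5))   # hidden -> output

def mlp(x, W1, b1, W2):
    h = np.maximum(W1 @ x + b1, 0.0)  # ReLU hidden layer
    return W2 @ h

# permute hidden units; apply the matching column permutation to W2
perm = np.roll(np.arange(5), 1)
W1p, b1p, W2p = W1[perm], b1[perm], W2[:, perm]

x = rng.normal(size=3)
y_orig = mlp(x, W1, b1, W2)
y_perm = mlp(x, W1p, b1p, W2p)  # same output, different weight vector
```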
  • contrastive loss
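The contrastive loss is not specified further in my notes; an NT-Xent-style formulation (a guess at the objective, with an arbitrary temperature) over embeddings of two "views" of each model might look like:

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent sketch. z1, z2: (n, d) embeddings of two views, row i paired."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarities
    sim = z @ z.T / tau
    n = len(z1)
    np.fill_diagonal(sim, -np.inf)                    # exclude self-pairs
    # the positive for row i is its other view, at offset n
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logits = sim - sim.max(axis=1, keepdims=True)     # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

rng = np.random.default_rng(1)
views_a = rng.normal(size=(8, 4))
views_b = views_a + 0.01 * rng.normal(size=views_a.shape)  # "augmented" views
loss_aligned = nt_xent(views_a, views_b)
loss_random = nt_xent(views_a, rng.normal(size=views_a.shape))
```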
  • linear heads are fitted on the model zoo's validation split
    • encoder is frozen
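Linear probing with a frozen encoder can be sketched as follows; the random-projection "encoder", the synthetic weights, and the property being predicted are all stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)
n, w_dim, z_dim = 200, 50, 25
weights = rng.normal(size=(n, w_dim))       # flattened model weights (synthetic)
labels = (weights[:, 0] > 0).astype(float)  # stand-in model property to predict

E = rng.normal(scale=0.1, size=(w_dim, z_dim))  # frozen encoder (never updated)
Z = np.tanh(weights @ E)                        # embeddings from the frozen encoder

# closed-form least-squares linear head, fitted on the embeddings only
X = np.c_[Z, np.ones(n)]
beta, *_ = np.linalg.lstsq(X, labels, rcond=None)
acc = ((X @ beta > 0.5) == labels).mean()
```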
  • initialization
    • random normal
    • glorot
    • orthogonal
    • he normal
    • truncated normal
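The initialization schemes listed above, written out as numpy samplers (shapes and std defaults are my choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_normal(fan_out, fan_in, std=0.05):
    return rng.normal(0, std, size=(fan_out, fan_in))

def glorot_uniform(fan_out, fan_in):
    limit = np.sqrt(6.0 / (fan_in + fan_out))  # Glorot/Xavier uniform bound
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

def orthogonal(fan_out, fan_in):
    # QR of a Gaussian matrix gives orthonormal columns (assumes fan_out >= fan_in)
    q, _ = np.linalg.qr(rng.normal(size=(fan_out, fan_in)))
    return q

def he_normal(fan_out, fan_in):
    return rng.normal(0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

def truncated_normal(fan_out, fan_in, std=0.05):
    w = rng.normal(0, std, size=(fan_out, fan_in))
    # resample values falling outside two standard deviations
    while (mask := np.abs(w) > 2 * std).any():
        w[mask] = rng.normal(0, std, size=mask.sum())
    return w
```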
  • train and test on MNIST, Fashion-MNIST, CIFAR, SVHN
  • hypernetworks don't really work here, somehow?
  • train an encoder-decoder transformer
  • sample space
    • are they just sampling the train set? - an ablation was done to show it's not, lol
  • Sequential auto encoding of neural embeddings - SANE
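SANE treats a network's weights sequentially rather than as one flat vector. A guess at the kind of preprocessing that implies: flatten the layers and chunk them into fixed-size tokens for a sequence model (token size and zero-padding are illustrative assumptions):

```python
import numpy as np

def weights_to_tokens(arrays, token_dim=16):
    """Flatten a list of weight arrays into a (seq_len, token_dim) token grid."""
    flat = np.concatenate([a.ravel() for a in arrays])
    pad = (-len(flat)) % token_dim       # zero-pad to a multiple of token_dim
    flat = np.pad(flat, (0, pad))
    return flat.reshape(-1, token_dim)

rng = np.random.default_rng(0)
layers = [rng.normal(size=(5, 3)), rng.normal(size=5), rng.normal(size=(2, 5))]
tokens = weights_to_tokens(layers)       # 15 + 5 + 10 = 30 values -> 2 tokens
```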