Sequential Autoencoder for Neural Embeddings (SANE)

learns a compressed latent representation of a model, treated as a sequence of weight tokens
trained with a contrastive loss
representation capacity scales with model size
scales gracefully to larger models and to different architectures
different models share a single embedding space
one unified representation for both generative and discriminative tasks
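
A minimal sketch of two ideas from the notes above, under stated assumptions. The notes do not say how a model becomes a sequence or which contrastive loss is used, so both choices here are illustrative: `tokenize_weights` (hypothetical name) cuts each layer's flattened weights into fixed-size, zero-padded tokens, and `nt_xent_loss` (hypothetical name) is a standard NT-Xent contrastive loss, swapped in as a common stand-in. The point is only that a fixed token size gives every architecture the same token shape, while the number of tokens grows with model size.

```python
import math


def tokenize_weights(layers, token_size):
    """Split each layer's flat weight list into fixed-size tokens.

    Assumed layout (not from the notes): one (layer_index, values) pair per
    token, with the last chunk of each layer zero-padded to token_size.
    """
    tokens = []
    for layer_idx, weights in enumerate(layers):
        for start in range(0, len(weights), token_size):
            chunk = list(weights[start:start + token_size])
            chunk += [0.0] * (token_size - len(chunk))
            tokens.append((layer_idx, chunk))
    return tokens


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def nt_xent_loss(views_a, views_b, temperature=0.1):
    """NT-Xent loss: views_a[i] and views_b[i] form the positive pair."""
    z = list(views_a) + list(views_b)
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)
        # Similarities to every other embedding, excluding the anchor itself.
        logits = [cosine(z[i], z[j]) / temperature
                  for j in range(2 * n) if j != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        # Numerically stable log-sum-exp for the denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - pos_logit
    return total / (2 * n)


# Two toy "models" with different depths and layer sizes.
small_model = [[0.1] * 10, [0.2] * 4]
large_model = [[0.1] * 30, [0.2] * 9, [0.3] * 5]

small_tokens = tokenize_weights(small_model, token_size=4)
large_tokens = tokenize_weights(large_model, token_size=4)
print(len(small_tokens), len(large_tokens))  # prints: 4 13

# Matched views give a lower loss than mismatched ones.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)  # prints: True
```

Both toy models produce tokens of the same width, so a single encoder could in principle consume either one. The larger model simply yields a longer sequence.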