- ADVENT
- ALBERT
- Adagrad
- Adaptive Input Representation
- Additive Attention
- Affordance Detection Task Specific
- Alex Net
- AlphaCode
- Attention NMT
- Attention
- AudioLM
- Auto Encoders
- AutoDistill
- BART
- BERT
- Bahdanau Attention
- Basic GAN
- Basic RNN Architectures
- Basic Transformer
- Beam search
- Bi Directional RNN
- Bias nodes
- Big Bird
- BinaryBERT
- BlockNeRF
- CLIP
- Capsule Layer
- Capsule Network
- ChatGPT is Not All You Need
- ChatGPT
- Chinchilla
- Classifier Gradients
- Codex
- Collaborative Topic Regression
- Conditional GAN
- Conformer
- Content Based Attention
- Contrastive Predictive Coding
- ConvBERT
- ConvNeXt
- Convolutional RNN
- Curriculum Learning
- CycleGAN
- DALL-E 3
- DALL-E
- DALL-E 2
- DCGAN
- DLRM
- DeepFM
- DeepNet
- DeepPERF
- Denoising Autoencoder
- Dense Net
- Diffusion LM
- Dilated Sliding Window Attention
- DistilBERT
- Dot Product Attention
- DreamFusion
- Dynamic Sparsity
- ELECTRA
- ELMo
- EfficientNet
- ELU
- Encoder Decoder Attention
- Ensemble Distillation
- FLASH
- FLAVA
- FaceNet
- Factorized Embedding Parameters
- Familiar Object Grasping Object View Recognition
- FastText
- Faster RCNN
- Feature Correlation
- Fixed Factorization Attention
- Flamingo
- FlowNet
- GAN Z Space
- GAU
- GELU
- GGCNN
- GLOW
- GPT
- GPT-3
- GRConvNet
- GRU
- Galactica
- Gato
- Generative Models
- Generative RNN
- Generative Spoken Language Modeling
- Generative vs Discriminative Models
- GloVe
- Global Average Pooling
- Global and Sliding Window Attention
- Google NMT
- Grad-CAM
- Hallucination Text Generation
- HiFi-GAN Denoising
- HiFi-GAN Synthesis
- Higher Layer Capsule
- Highway Convolutions
- Hopfield networks
- Imagen
- Inception
- Instance Normalization
- Instant NeRF
- Interpreting Attention
- Isotropic Architectures
- Joint Factor Analysis
- Jukebox
- LASER
- LaMDA
- Large Kernel in Attention
- Large Kernel in Convolution
- Le Net
- Learning to Detect Grasp Affordance
- Linear Classifier Probes
- Listen Attend Spell
- Location Aware Attention
- Location Based Attention
- Long Short Term Memory (LSTM)
- Longformer
- MCnet
- MLIM
- MLM
- MVGrasp
- Magic3D
- Masked Autoencoders
- Minerva
- Mixed chunk attention
- Mobile Net
- MobileOne
- Multi Head Attention
- Multiplicative Attention
- Muse
- NASNet
- Neural Network Architecture Cheat Sheet
- Neural Probabilistic Model
- Neural Text Degeneration
- Noisy ReLU
- OPT
- PEER
- PReLU
- PaLM
- Padded Conv
- PatchGAN
- Phenaki
- Phrase Representation Learning
- Pix2Seq
- Point Cloud
- PointNet++
- Position Encoding
- Position Wise Feed Forward
- Primary Capsule
- RETRO
- Receptive field
- RegNet
- Region Proposal
- Relative Multi Head Self Attention
- ReLU
- RepLKNet
- Res Net D
- Res Net
- Restricted Boltzmann Machine
- RetinaNet
- RMSprop
- RoBERTa
- Routing by Agreement
- S2ST
- SLAK
- SRN
- Scaled Dot Product Attention
- Scene based text to image generation
- SegNet
- Self Attention GAN
- Self Attention
- Seq2Seq
- ShuffleNet
- Sigmoid
- Sliding Window Attention
- Soft Attention
- Softmax
- Softplus
- Soundify
- Sparse Evolutionary Training
- Sparse Transformer
- Spatial Transformer
- Speaker Verification
- Speech Emotion Recognition
- Speech Recognition
- Speech Resynthesis
- Spiking Networks
- Stable Diffusion
- Stack GAN
- Stacking RNN
- StarGAN v2
- StarGAN
- Strided Attention
- Strided
- Style GAN
- Swish
- TSDF
- Tacotron
- Tanh
- Teacher Forcing
- Temporal Conv
- TemporalLearning
- Textless Speech Emotion Conversion
- TinyBERT
- Token Embedding
- Transformer-XL
- Transformer
- Transposed Conv
- ULMFiT
- Un-LSTM
- Unet Grasping
- Unet
- VAE
- VGGish
- VICReg
- VL-BEIT
- VGG
- ViLT
- VisualGPT
- Volumetric Grasping Network
- WaveGlow
- WebGPT
- Whisper
- Wide Deep Recommender
- Window Based Regression
- Word2Vec
- X Vectors
- XLM-R
- XLNet
- Xception
- YOLO
- Z-Space Entanglement
- Zeiler Fergus
- cross-layer parameter sharing
- dGSLM
- data2vec
- i-Code
- pGSLM
- wav2vec