__Index_of__Book-Notes

__Index_of__Understanding-Deep-Learning

_Index_of_Understanding-Deep-Learning Chapter 1 - Introduction Chapter 4 - Deep Neural Networks Chapter 5 - Loss functions Chapter 6 - Fitting models Chapter 7 - Gradients and Initialization Chapter 8 - Measuring performance Chapter 9 - Regularization Chapter 10 - CNNs Chapter 11 - Residual Networks Chapter 12 - Transformers Chapter 13 - Graph Networks Chapter 14 - Unsupervised Learning Chapter 16 - Normalizing Flows Chapter 17 - VAE Chapter 18 - Diffusion index The Programmers Brain

_Index_of_Book-Notes bitter_lesson Focused Life index


__Index_of__Datasets

_Index_of_Datasets 1D-ALVINN AudioSet classification AudioSet Benchmark LLM Billion Word BooksCorpus BUCC CIFAR Cityscapes COCO CommonCrawl CUB-200-2011 4 CUB-200-2011 DrawBench English Wikipedia Europarl-ST Fashion MNIST FGVC Aircraft FGVCx Fine grained datasets Fisher Spanish-English Flickr30K GLUE Google Conceptual Captions Google voice search task GTA5 HMDB51 IDRiD ILSVRC imageCaptioning ImageNet IMDB iNaturalist ISIC 2018 Kinetics KITTI Kvasir Dataset Labeled Faces in the Wild LibriSpeech MILAN MILANNOTATIONS MIT300 MIT1003 MLDoc MMLU MNIST MoCO Modality Moment in Time MSCOCO MUSAN NIST 2008 Speaker Recognition Evaluation dataset NIST SRE 2016 Cantonese NLVR2 3 OSIE PASCAL VOC PASCAL-S People Art Dataset Picasso Dataset Places Places365 PlantCLEF RACE Salicon dataset SBU Captions SceneNet RGB-D Shapes Dataset Speakers in the Wild SQuAD Stanford Dogs Stationarity STL-10 SUNCG SVHN Switchboard SYNTHIA UCF101 VGGFace2 3 Visual Commonsense Reasoning Visual Genome VQAv2 3 Wall Street Journal task WMT14 XLSR YFCC100M


__Index_of__Ethics

_Index_of_Ethics A declarative modular framework for representing and applying ethical principles A low-cost ethics shaping approach for designing reinforcement learning agents A voting-based system for ethical decision making Belief-Desire-Intention blind ethical judgement Building Ethics into Artificial Intelligence Capture bias Clear Thinking Collective Ethical Decision Frameworks Confirmation Bias Consequentialist ethics Coping Theory Counterfactual Fairness Coverage Bias Coverage of ethics within the artificial intelligence and machine learning academic literature Cross-dataset generalization Demographic Parity Deontological ethics Disparate Impact Disparate Treatment Embedding ethical principles in collective decision support systems Equality of Opportunity Equalized Odds Ethical dilemmas Even angels need the rules AI, roboethics, and the law Fairness Constraint fully informed ethical judgement GenEth In-group Bias Individual Fairness Inter-rater Agreement Interpretive Labor Label bias Mediatic Behavior Micromarriage Moral decision making frameworks for artificial intelligence Moral Machine project MoralDM Negative Set Bias Non-response Bias Norms as a basis for governing sociotechnical systems Out-group Homogeneity Bias partially informed ethical judgement Participation Bias PIUQ Preferences and ethical principles in decision making Research Debt Research Distillation Research Intimacy sacred values Sampling Bias Selection Bias swap-dominance trolley scenario Unawareness Unbiased Look at Dataset Bias Utilitarian ethics Virtue ethics


__Index_of__Machine Learning

__Index_of__Contrastive Learning

_Index_of_Contrastive Learning NCE


__Index_of__Distributions

_Index_of_Distributions Bernoulli Distribution Beta Distribution Binomial Distribution Boltzmann Distribution Categorical Distribution CDF Class Conditional distribution Dirichlet Distribution Ergodic Exponential Distribution Initialization Invariant Distribution KMeans Laplace Distribution Markov Initial Distribution Multinomial Distribution N-dim Normal Normal Distribution Nucleus Sampling PDF PMF Point Distribution Poisson Distribution Poisson Process Proto Distributions Proto PDF Proto PMF Proxy Objective Rejection Sampling Sampler Stratified Random Sampling Uniform Distribution Uniform Sampling Von Mises distribution


__Index_of__Explainability

_Index_of_Explainability Accessibility AdaDelta Adaptive Whitening Saliency Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision A Survey Auditability Back Propamine Bayesian Rule List Beware of Inmates Running the Asylum Blur Baseline Broden Causability Causality Classifying a specific image region using convolutional nets with an ROI mask as input Co adaptation Comparing Data Augmentation Strategies for Deep Image Classification Comprehensibility Conductance Confidence Contributions of Shape, Texture, and Color in Visual Recognition Abstract Counterfactual Images Counterfactual Impact Evaluation DeconvNet Deep Inside Convolutional Networks Deep Neural Networks are Easily Fooled High Confidence Predictions for Unrecognizable Images Deep Visual Explanation DeepFool DeepLIFT Dynamic visual attention Elaborateness Embedding Human Knowledge into Deep Neural Network via Attention Map Explainability Defn Explainability Taxonomy Explainable Artificial Intelligence (XAI) Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI Explanation is not a Technical Term Explanator Fairness Faithfulness FGSM Filter Wise Normalization GAM Gaussian Baseline GradCAM++ Gradient Sensitivity Graph-based visual saliency Group fairness Guided BackProp Guided GradCAM Image Data Augmentation Survey Implementation Invariance Independence Informativeness Integrated Gradients Interactivity Interpretability and Explainability A Machine Learning Zoo Mini-tour Interpretability Interpretation of Neural networks is fragile Layerwise Conservation Principle Layerwise Relevance Propagation Limited features LRP Manifold Maximum Distance Baseline Mean Observed Dissimilarity Mental Model Matching Mini Batch GD Minimization and reporting of negative impacts Multimodal Explanation Nesterov Momentum Noise Tunnel Normalized Inverted Structural Similarity Index Parent Approximations Partial Dependence Plot pixelattribution Prediction Difference Analysis Privacy awareness PromptIR Proxy Attention Proxy features Random Directions Redress RETAIn RISE Saliency using natural statistics Saliency vs Attention SAM-ResNet Sanity Checks for Saliency Maps Separation SGD Momentum SGD Sharpness and Flatness Simple Gradient Descent Skewed data Smooth-Grad SmoothGrad Square Social Construction of XAI, do we need one definition to rule them all SP-LIME Structural Similarity Index Sufficiency Summit Tainted data Textbooks are all you need The Unreliability of Saliency Methods There and back again Towards A Rigorous Science of Interpretable Machine Learning Training Trajectories Trajectory Plotting with PCA Transferability Transparency TREPAN Trustworthiness Understandability Uniform baseline Use Case Utility VarGrad Variation in Dissimilarity Variation in Dissimilarity Vision Explainability Visualizing the Impact of Feature Attribution Baselines Visualizing the Loss Landscape of Neural Nets Whos Thinking, A push for human centered evaluation of LLMs XAI


__Index_of__Graph

_Index_of_Graph Adjacency matrix Arbitrary Relation Bias Area Minimization Batching for GNN Bend Minimization Challenges of Graphs Context Similarity Cross angle Maximization Cross Minimization Density Edge Graphs Edge Prediction Tasks Equivariance and Invariance for Graphs Example GCN Layer Force Directed Graph Layout GQL for SQL Users Graph convolutional network Graph Level Tasks Graph mean pooling Graph Neural Network Graphs Hierarchical Edge Bundling Inductive Bias Inductive Models Kipf Normalization Layers for GNNs Length Optimization Locality Meta AI Speech from Brain Node Distribution Node Level Tasks Node Link Diagram Parameter Sharing for Graphs Properties of Adjacency matrix Relational Inductive Bias Representing Graphs Sequential Relation Bias Small World graphs Symmetries Node Link Transductive Models Types of Graphs Weak Relation Bias


__Index_of__Handwriting Recognition

_Index_of_Handwriting Recognition Hit list


__Index_of__LLM

_Index_of_LLM Distributed training for LLMs


__Index_of__Models

_Index_of_Models 1x1 conv Activation Functions Adagrad Adaptive Gradient Clipping Adaptive Input Representation Additive Attention Additive coupling layer ADVENT ALBERT Alex Net Alphacode architecture Atrous Convolution Attention NMT Attention AudioLM Auto Encoders AutoDistill autoregressive flows Backprop Bahdanau Attention BART Basic GAN Basic RNN Architectures Basic Transformer Basics of Federated Learning Beam search BEiT BERT Best Matching Unit Bi Directional RNN Bias nodes Big Bird Big-Bench BinaryBERT BlockDrop BlockNeRF Bruckhaus - 2024 - RAG Does Not Work for Enterprises CAM Capsule Layer Capsule Network Causal 1D Conv Causal Dilated Conv Causal Language Model Channels Chat GPT is Not All You Need ChatGPT Chinchilla Classifier Gradients CLIP Codex cognitivemodel Collaborative Topic Regression Complete AI Pipeline Computational Graph Conditional GAN Conformer Content Based Attention Contrastive Predictive Coding Conv ConvBERT ConvNeXt Convolutional RNN coupling flows cross-layer parameter sharing Curriculum Learning CvT CycleGAN DALL-E 3 DALL-E DALL·E 2 data2vec DCGAN DeepFM DeepLearning DeepNet DeepPERF DeiT Denoising Autoencoder Dense Net Dense Skip Connections Dense Vector Indexes Dense Density estimation using real NVP Depth Efficiency of Neural Networks Depthwise Separable dGSLM Diffusion LM Dilated Sliding Window Attention Dirac Delta DistilBERT DLRM Dot Product Attention Dreamfusion Dynamic Eager Execution Dynamic Sparsity Effect Of Depth EfficientNet EigenCAM ELECTRA Elementwise Flows ELMO Elu Encoder Decoder Attention Ensemble Distillation Exploding Gradient FaceNet Factorized Embedding Parameters Faster RCNN FastText Feature Correlationa Fixed Factorization Attention Flamingo FLASH FLAVA FlowNet FTSwish Galactica GAN Z Space Gato GAU GELU Generalizing Adversarial Explanations with Grad-CAM Generative Models Generative RNN Generative Spoken Language Modeling Generative vs Discriminative Models Git Commands Global and Sliding Window Attention Global Average Pooling GloVE GLOW Google NMT GPT GPT3 Grad-CAM Gradient Accumulation Gradient Clipping GRU Hallucination Text Generation Heaviside HiFI-GAN Denoising HiFI-GAN Synthesis Higher Layer Capsule Highway Convolutions HNSW Hopfield networks HyTAS i-Code ill conditioning Imagen Improved variational inference with inverse autoregressive flows Inception Influence of image classification accuracy on saliency map estimation Instant NeRF Interpreting Attention inverse autoregressive flows Isotropic Architectures IVFADC Joint Factor Analysis Jukebox Klue ML Engineer LaMDA Large Kernel in Attention Large Kernel in Convolution LASER Layers Le Net Learning Rate Scheduling Linear Classifier Probes Linear Flows Lisht Listen Attend Spell LLM Guide Location Aware Attention Location Base Attention Long Short Term Memory (LSTM) Longformer Lost in the Middle How Language Models Use Long Contexts Machine Learning Tool Landscape MADE - Masked autoencoder for distribution estimation Magic3D Masked Autoencoders Masked autoregressive flow for density estimation MCnet Minerva Mixed chunk attention ML Production Flow MLIM MLM MLOps Learning Mobile Net MobileOne Model Capacity Multi Head Attention Multi Scale Flows Multiplicative Attention Muse Nasnet Network Dissection Quantifying Interpretability of Deep Visual Representations Neural Network Architecture Cheat Sheet Neural Probabilistic Model Neural Text Degeneration NICE - non linear independent components estimation Noisy Relu Non Maxima Suppression On the overlap between Grad-CAM saliency maps and explainable visual features in skin cancer images OpenML x SURF OPT Padded Conv PaLM PEER Perceptron pGLSM Phase Transition Model Zoo PhD vs Startup vs Big Company Phenaki Phrase Representation Learning Pix2Seq PixelShuffle Point Cloud PointNet++ Pooling Position Encoding Position Wise Feed Forward PQ PRelu Primary Capsule Problems facing MLOps Properties Required by Network Layers for Normalizing Flows RAGAS - automated evaluation of RAG RandAugment Real Time Image Saliency for Black Box Classifiers Receptive field Recurrent Region Proposal RegNet Relative Multi Head Self Attention Relu RepLKNet Representational Capacity RepVGG Res Net D Res Net Research Engineer in Human Modeling for Automated Driving delft residual flows ResNeXt Restricted Boltzmann Machine RetinaNet RETRO Rmsprop RoBERTa Robust RegNet Routing by Agreement S2ST Saddle Points Salemi and Zamani - 2024 - Evaluating Retrieval Quality in Retrieval-Augmente Salience Map Scaled Dot Product Attention Scaling matrix for coupling layers Scene based text to image generation ScoreCAM SegNet Self Attention GAN Self Attention Self Supervised Vision Transformers Seq2Seq Sequential auto encoding of neural embeddings - SANE Shake-Drop Shake-Shake Shallow vs deep networks ShuffleNet Sigmoid SimCLR Simulations Of language Skip Connection SLAK Sliding Window Attention Soft Attention Softmax Softplus Soundify Sparse Encoder Indexes Sparse Evolutionary Training Sparse Transformer Spatial Transformer Speaker Verification Speech Emotion Recognition Speech Recognition Speech Resynthesis Spiking Networks SRN Stable Diffusion Stack GAN Stacking RNN StarGAN v2 StarGAN Static Graph Execution Strided Attention Strided Style GAN Swin Transformer Swish Tacotron Tanh Teacher Forcing Temporal Conv TemporalLearning Textless Speech Emotion Conversion The elephant in the interpretability room Thesis Flow TinyBERT Token Embedding Tower Transfer Learning or Self-supervised Learning? A Tale of Two Pretraining Paradigms Transformer-XL Transformer Transposed Conv Tug of war between RAG and LLM prior Types of Normalizing flows ULMFit Un-LSTM Unet Upweighting usermodel VAE Vanishing Gradient Vapnik Chervonenkis dimension Vgg VGGish VICReg ViLT Vision Transformer VisualGPT VL-BEIT Voronoi Cell wav2vec WaveGlow WebGPT Weight space learning What is being Transferred in transfer learning Whisper Wide Deep Recommender Window Based Regression WOMBO Dream Word2Vec X Vectors Xception XLM-R XLNet YOLO Z-Space Entanglement Zeiler Fergus


__Index_of__NLP

_Index_of_NLP Bag of n-grams Bag of words Negative Sampling Skip Gram textless-lib Word Vectors


__Index_of__Probabilistic Networks

_Index_of_Probabilistic Networks Probabilistic Circuit Units Probabilistic circuits


__Index_of__RL

_Index_of_RL Greedy Policy Trajectory


__Index_of__Training

__Index_of__Adversarial Learning

_Index_of_Adversarial Learning Adversarial Learning Entropy minimization by adversarial learning Gradient Ascent Mode Collapse Super Resolution


__Index_of__Augmentation

_Index_of_Augmentation A survey on Image Data Augmentation for Deep Learning Adding noise Adversarial Spatial Dropout for Occlusion Alleviating Class Imbalance with Data Augmentation Attentive CutMix AttributeMix Augmentation-wise Weight Sharing strategy Augmented Random Search AugMix Auto Augment AutoAugment Co-Mixup Color Space Transformations CowMask Cropping Cut and Delete Cut and Mix Cut, Paste and Learn CutMix Cutout Data aug for spoken language Data Augmentation via Latent Space Interpolation for Image Classification Data Augmentation with Curriculum Learning Deep Generative Models Fast AutoAugment FeatMatch Feature Augmentation Feature Space Augmentation Flipping Fmix GAN‐based Data Augmentation Gaussian Distortion Geometric Transformations GridMask Hide and Seek Image Erasing Image Manipulation Image Mix Image Mixing and Deletion Intra-Class Part Swapping KeepAugment Kernel Filters Manifold MixUp ManifoldMix Meta Learning Data Augmentations Mixed Example Moment Exchange Neural Augmentation Noise Injection On the Importance of Visual Context for Data Augmentation in Scene Understanding Population Based Augmentation Puzzle Mix Random Distortion Random Erasing ReMix ResizeMix RICAP SaliencyMix Sample Pairing Shear Skew Tilt Smart Augmentation SmoothMix SMOTE SnapMix SpecAugment Test-time Augmentation Visual Context Augmentation


__Index_of__Causal Inference

_Index_of_Causal Inference Causal Systems The Unified Causal AI Pipeline


__Index_of__Federated Learning

_Index_of_Federated Learning Advantages of Federated Learning Federated Learning Federated Updates


__Index_of__Knowledge Distillation

_Index_of_Knowledge Distillation Adversarial Distillation Applications of Knowledge Distillation Attention Based Distillation Cross Modal Distillation Data Free Distillation Distillation Algorithms Distillation Schemes Distilling the Knowledge in a Neural Network Feature Based Knowledge Graph Based Distillation Knowledge Distillation Survey 2021 Knowledge Distillation Low-rank factorization Multi Teacher Distillation Offline Distillation Quantized Distillation Response Based Knowledge Self Distillation Teacher Student Architecture Transferred compact convolutional filters


__Index_of__Loss function

_Index_of_Loss function 0-1 Loss Absolute Error Adversarial Loss Akaike Information Criterion Attention Alignment AUC-Borji AUC-Judd Bayesian Information Criterion BCE with Logits Bhattacharyya Distance Binary Cross Entropy BLEU BYOL Loss BYOL Chebyshev Distance Chi Squared Distance Confusion Matrix Contrastive Loss Cosine Distance Cosine Learning Rate Decay Cosine Similarity Cross Entropy Cross Validation CTC Cycle Consistency Loss Dice Score Distance Measures Distillation Loss Earth Mover’s Distance (EMD) ELBO loss Empirical Risk Euclidean Distance Focal Loss Frobenius norm GE2E Hamming Distance Hausdorff Distance Haversine Distance Hinge Loss Huber Identity Loss inter-sentence coherence loss Intra cluster variance ITM Loss Jaccard Distance Jensen Shannon Divergence Consistency Loss KL Divergence Least squares loss Log likelihood criterion Log Likelihood Loss LogCosh Loss for binary classification Loss for multiclass classification Loss for univariate regression MAE Mallows Cp Statistic Manhattan Distance MAPE Margin Ranking Max Margin Loss Maximum likelihood criterion Maximum Likelihood Maxout Minkowski Distance MSE MSLE Negative Log Likelihood Out-of-bag Evaluation (OOB evaluation) PatchGAN Pearson Correlation Perplexity Poisson Loss Precision Recall Curve Precision Quadratic Loss Quantile loss RAHP Recall Recipe for constructing loss functions Reconstruction loss ROC Curve SDR Sensitivity Shuffled-AUC Sørensen-Dice Index Sparse Dictionary Learning Loss Specificity Squared Error Squared Hinge SSR Triplet Loss


__Index_of__Markov

_Index_of_Markov Markov Chain Markov for Continuous Distributions Markov Property Markov Random Field Markov Transition Kernel MCMC Sampling


__Index_of__Multitask learning

_Index_of_Multitask learning Attribute Selection Eavesdropping Hard Parameter Sharing Multi Task Learning Representation Bias Soft Parameter Sharing


__Index_of__Normalization

_Index_of_Normalization AdaIn Adam Batch Normalization DeepNorm Dropout Effects of Regularization Fine Tuning Based Pruning Freedom Global Gradient Magnitude Based Pruning Global Magnitude Based Pruning He Initialization Instance Normalization Label Smoothing Layer Normalization Layerwise Gradient Magnitude Based Pruning Layerwise Magnitude Based Pruning Leaky Relu Learning Rate Range Test LeCun Init Lp Regularization Modality Dropout No bias decay Normalization Optimizers Orthogonal Initialization Pruning Random Pruning Regularization Term Regularization Scheduling Scoring Pruning Approaches SELU Structure Based Pruning treecoverSegmentation Tuning Model Flexibility VariationalRecurrent Dropout Weight Decay Vs L2 Regularization Xavier Initialization


__Index_of__Optimization

_Index_of_Optimization AdamW Amsgrad Cyclic Learning Rate Double Descent MVGrasp NADAM One cycle policy Shrinkage Structural Risk Minimization Weighted Alternating Least Squares


__Index_of__Semi Supervised

_Index_of_Semi Supervised Cross Modal-based Methods Downstream Task Ego-motion Feature Map Visualization Free Semantic Label-based Method Human Action Recognition Image Classification Image Generation with Colorization Image Generation with Inpainting Image Generation with Super Resolution Image Generation Kernel Visualization Learning from RGB-Flow Correspondence Learning from Video Colorization Learning from Video Prediction Learning from Visual-Audio Correspondence Learning with Context Similarity Learning with Labels Generated by Game Engines Learning with Labels Generated by Hard-code Programs Learning with Spatial Context Structure Nearest Neighbor Retrieval Object Detection Pretext Task Pretext Tasks Pseudo Label Self Supervised Survey Self-supervised Learning Semantic Segmentation Semi Supervised Semi-Supervised Learning Formulation Smoothness Spatial Context Structure Spatiotemporal Convolutional Neural Network Supervised Learning Formulation Temporal Context Structure Temporal order recognition Temporal order verification Video Generation Weakly Supervised Learning Formulation Weakly-supervised Learning

_Index_of_Training Affine Function Autoregressive Broadcasting Calibration Layer Candidate Sampling CenterNet Chain of Thought Co-training Composing shallow neural networks to get deep networks Conditional Independence Curse Of Dimensionality Decision Boundaries Dictionary Learning Dimensionality Reduction Direct entropy minimization Discrete Continuous Discrete Cosine Transform Distillation Token Downsampling Early Stopping tricks Einsum Embedding Encodings Factors for MC estimate Feature Learning Features Feedback Loop Few Shot Order Sensitivity Fitting FP16 training Functional correlates Generalization Curve Goodhart’s Law Gradient Boosting Gradient Checkpointing Gradient Descent Gradient Direction Gram matrix Hallucination handwritingRecognition Hashing heteroscedastic nonlinear regression IID Image Data Inference Path Information Gain Kernel Support Vector Machines (KSVMs) Label Encoding Lack of information Large Batch Training LDA Learning Rate Decay tricks Learning Rate Warmup Linear Learning Rate Scaling Linear scale Logits Masked Language Modeling Matrix notation for NNs Methods for Feature Learning Mixup Monk Multi Variate AR MultiReader technique NaN Trap Neural Dynamics Non Relational Inductive Bias Nonstationarity One hot PCA Post-processing Your Model’s Output Prediction assumption Quantile Bucketing Quantile Regression Rank (Tensor) Regularization Rate Ridge Regression Robust regression Rotational Invariance Sentiment Neuron Sketched Update Sketching SOMs Sparsity Staged Training Structured Update Tensor Processing Unit Time Series TPU Node TPU Pod Tractability Training-serving Skew Transfer Learning Translational Invariance Trees Understanding deep learning still requires rethinking generalization Unsupervised Data Generation Variable Importances Vector Quantization Width Efficiency of Neural Networks Z Normalization


__Index_of__Unsupervised

_Index_of_Unsupervised Self Supervised Unsupervised Learning

_Index_of_Machine Learning Class Size Clustering croissant ICA Issues LeCake Stochastic ensemble learning


__Index_of__Training

_Index_of_Training


__Index_of__Uncertainty

_Index_of_Uncertainty Aleatoric Automation Bias Entropy Epistemic Heteroscedastic Homoscedastic LIME Predictive Parity Predictive Uncertainty SHAP Types of uncertainty Uncertainty in classification Uncertainty in regression
