Selection Bias
- Errors in conclusions drawn from sampled data, caused by a selection process that produces systematic differences between the samples observed in the data and those not observed. How it arises in image datasets, and how to reduce it:
- Datasets often favor particular kinds of images
- Collecting images from the Internet does not in itself guarantee a fair sample, since keyword-based searches return only particular types of images
- Obtaining data from multiple sources can help reduce selection bias
- It is even better to start with a large collection of unannotated images and label them by crowdsourcing
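
A minimal sketch of the keyword-search point above, using a made-up toy population. The captions, category names, and the helper functions `category_shares` and `keyword_search` are all hypothetical and exist only for illustration. It compares the category mix of the full collection with the mix returned by a keyword filter.

```python
from collections import Counter


def category_shares(items):
    """Return the fraction of items that fall in each category."""
    counts = Counter(item["category"] for item in items)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def keyword_search(items, keyword):
    """Mimic a keyword-based search: keep items whose caption contains the keyword."""
    return [item for item in items if keyword in item["caption"].lower()]


# Toy population (invented data): 100 images in three categories.
population = (
    [{"category": "studio", "caption": "red car on white background"}] * 20
    + [{"category": "street", "caption": "busy intersection downtown"}] * 50
    + [{"category": "street", "caption": "car parked on street"}] * 10
    + [{"category": "nature", "caption": "mountain road at dawn"}] * 20
)

# The selection process: only images whose caption mentions "car" are observed.
sample = keyword_search(population, "car")

population_shares = category_shares(population)
sample_shares = category_shares(sample)

for category in sorted(population_shares):
    print(
        f"{category:>7}: population {population_shares[category]:.2f}, "
        f"keyword sample {sample_shares.get(category, 0.0):.2f}"
    )
```

In this toy setup the keyword filter over-represents studio shots and drops nature scenes entirely, even though every image shows a plausible place to find a car. Starting from the unfiltered collection and labeling it afterwards avoids letting the search term decide what gets observed.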