Selection Bias

  • Selection bias is an error in conclusions drawn from sampled data, caused by a selection process that produces systematic differences between the samples observed in the data and those not observed. The points below describe how it arises in image datasets and how it can be reduced:
  • Datasets often favor particular kinds of images.
  • Getting images from the Internet does not in itself guarantee fair sampling, since keyword-based searches return only particular types of images.
  • Obtaining data from multiple sources can reduce this bias.
  • It is even better to start with a large collection of unannotated images and label them by crowd-sourcing.
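
The effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a model of any real dataset: each "image" is reduced to one hypothetical numeric attribute, and the two "sources" are invented selection processes whose biases happen to point in opposite directions. It compares the error in an estimated mean for a single biased source, for two pooled sources, and for a uniform random draw from the full collection.

```python
import random


def mean(values):
    return sum(values) / len(values)


def biased_sample(population, k, rng, weight):
    """Draw k items with probability proportional to weight(item).

    Stand-in for a selection process such as a keyword search
    (hypothetical; real search engines are far more complex).
    """
    weights = [weight(x) for x in population]
    return rng.choices(population, weights=weights, k=k)


rng = random.Random(0)

# Assumption: each image is summarized by one attribute in [0, 1],
# for example the fraction of the frame the object occupies.
population = [rng.random() for _ in range(100_000)]
true_mean = mean(population)

# Uniform draw from the full, unannotated collection.
uniform = rng.sample(population, 5_000)

# Source A over-selects large values, source B over-selects small ones.
source_a = biased_sample(population, 5_000, rng, lambda x: x)
source_b = biased_sample(population, 5_000, rng, lambda x: 1.0 - x)
pooled = source_a + source_b

err_uniform = abs(mean(uniform) - true_mean)
err_a = abs(mean(source_a) - true_mean)
err_b = abs(mean(source_b) - true_mean)
err_pooled = abs(mean(pooled) - true_mean)

print(f"single source A error: {err_a:.3f}")
print(f"single source B error: {err_b:.3f}")
print(f"pooled sources error:  {err_pooled:.3f}")
print(f"uniform sample error:  {err_uniform:.3f}")
```

Pooling helps here only because the two invented sources are biased in opposite directions. If every source shared the same preference, pooling would not remove it, which is why sampling from a large unannotated collection is the safer starting point.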