Bag of Words Robotics
- Recognizing objects by directly matching their local descriptors is computationally expensive.
- The number of local features for a given object depends mainly on the size of the object and therefore varies from object to object.
- The key idea for fast 3D object recognition is to use mechanisms for representing objects in a compact and uniform format (e.g., histogram).
- If objects are represented in a uniform format, standard machine learning (ML) algorithms can be applied to them.
- Compute local features for all the discovered objects and make a pool of features.
- A dictionary (codebook) is generated by clustering the pool of features into N clusters (N is the codebook size).
- Visual words are then defined as the centres of the extracted clusters.
- Finally, each object is described (abstracted) by a histogram of occurrences of these visual words.
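
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. It assumes that local features are plain tuples of floats, that the clustering is a basic k-means (the notes do not name a specific clustering algorithm), and that each feature is assigned to its nearest visual word by Euclidean distance. The function names (`build_codebook`, `bow_histogram`) and the toy 2D "descriptors" are hypothetical and chosen only for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(feature, codebook):
    """Index of the visual word (cluster centre) closest to a feature."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(feature, codebook[i]))


def build_codebook(pool, n_words, n_iters=20, seed=0):
    """Cluster the feature pool into n_words clusters with a basic k-means
    (an assumed choice) and return the cluster centres as visual words."""
    rng = random.Random(seed)
    # Initialise the centres with randomly chosen features from the pool.
    centres = [list(f) for f in rng.sample(pool, n_words)]
    for _ in range(n_iters):
        clusters = [[] for _ in range(n_words)]
        for f in pool:
            clusters[nearest_word(f, centres)].append(f)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centre
                dim = len(members[0])
                centres[i] = [sum(m[d] for m in members) / len(members)
                              for d in range(dim)]
    return centres


def bow_histogram(features, codebook):
    """Count how often each visual word is the nearest one to a feature."""
    hist = [0] * len(codebook)
    for f in features:
        hist[nearest_word(f, codebook)] += 1
    return hist


# Toy data: two objects with different numbers of 2D local features.
objects = {
    "mug": [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (5.0, 5.1)],
    "box": [(5.1, 5.0), (4.9, 5.0), (5.0, 4.9), (5.05, 5.05),
            (0.0, 0.0), (10.0, 0.1)],
}

# 1. Pool the local features of all objects.
pool = [f for feats in objects.values() for f in feats]

# 2.-3. Cluster the pool; the centres are the visual words (codebook size 3).
codebook = build_codebook(pool, n_words=3)

# 4. Describe each object by a histogram over the visual words.
histograms = {name: bow_histogram(feats, codebook)
              for name, feats in objects.items()}

for name, hist in histograms.items():
    print(name, hist)
```

Although the two toy objects have different numbers of features (4 and 6), both end up as histograms of the same length, which is the uniform format that ML algorithms need. In practice the raw counts are often normalized so that objects of different sizes are comparable, but that step is an addition not stated in the notes.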

