Document Triage

Characters in the file must be machine readable (character encoding).

- Character Encoding Identification (ASCII, Unicode, ...)
- Language Identification (English, French, ...)
- Text Sectioning
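
The character encoding identification step can be illustrated with a small sketch. This is only one possible heuristic, not a method the slides prescribe: it checks for a byte-order mark, then tries ASCII and UTF-8, and falls back to Latin-1. The function name `guess_encoding` and the Latin-1 fallback are assumptions made for illustration; real triage pipelines typically use statistical detectors.

```python
import codecs

# (BOM, codec name) pairs. The UTF-32 marks come first because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(raw: bytes) -> str:
    """Return a plausible codec name for raw bytes (hypothetical helper)."""
    # 1. A byte-order mark, if present, is the strongest signal.
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    # 2. Pure 7-bit data is reported as ASCII.
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # 3. Valid UTF-8 is rarely produced by accident in other encodings.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # 4. Assumed fallback: Latin-1 can decode any byte sequence,
        #    so this is a guess rather than an identification.
        return "latin-1"
```

For example, `guess_encoding(open(path, "rb").read())` gives a codec name that can be passed to `bytes.decode` before language identification and text sectioning are attempted.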