In an ideal world, clients always provide you with perfect translation files that you can open and run a word or character count on without difficulty. In reality, customers may send you photos or scanned documents instead of text files, and you need to work with those too. Scanned texts in PDF format are especially common. So how do you get a word count for files that need OCR?
To extract the text and get a word count, you need optical character recognition (OCR), sometimes also called an optical character reader. OCR is the automatic conversion of images of typed, handwritten, or printed text, such as a scanned file or a photo of your documents, into machine-encoded text.
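Once an OCR tool has turned the scan into plain text, the counting itself is simple. The sketch below is only an illustration, not a prescribed method. It assumes the OCR output is already available as a Python string. The function name `count_words_and_chars` and the rule for what counts as a word are assumptions made for this example, and real CAT tools may count differently.

```python
import re


def count_words_and_chars(ocr_text: str) -> dict:
    """Count words and characters in text produced by an OCR tool.

    Assumption for this sketch: a "word" is a run of letters or digits,
    optionally joined by apostrophes or hyphens (e.g. "it's", "OCR-ready").
    """
    words = re.findall(r"\w+(?:['’-]\w+)*", ocr_text)
    return {
        "words": len(words),
        "chars_with_spaces": len(ocr_text),
        # Strip every whitespace character (spaces, tabs, line breaks)
        "chars_without_spaces": len(re.sub(r"\s", "", ocr_text)),
    }


if __name__ == "__main__":
    # Hypothetical OCR output; in practice this comes from your OCR tool
    sample = "Dear client,\nplease find the signed contract attached."
    print(count_words_and_chars(sample))
```

OCR output often contains stray characters or broken words, so it is worth skimming the extracted text before trusting the count.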