Search engine with automatic text recognition (OCR) for images inside PDF documents, PowerPoint presentations and ZIP archives

Text stored in image formats (e.g. scans, screenshots or photos) cannot be found by a standard full-text search. The search engine Open Semantic Search therefore enriches the metadata of images, such as filename, format and size, with the results of automatic text recognition (OCR).
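
As a rough illustration of this kind of enrichment, the following Python sketch combines basic file metadata with recognized text into one searchable record. It is not the actual code or index schema of Open Semantic Search: the function name, the field names and the pluggable `ocr` callable are all assumptions made for this example.

```python
import os


def enrich_image_metadata(path, ocr):
    """Build one searchable record for an image file.

    `ocr` is any callable that takes a file path and returns the
    recognized text. The field names below are hypothetical and
    chosen only for illustration.
    """
    name = os.path.basename(path)
    return {
        "filename": name,
        # File extension without the dot, lower-cased (e.g. "png").
        "format": os.path.splitext(name)[1].lstrip(".").lower(),
        "size_bytes": os.path.getsize(path),
        # The OCR result is what makes the image findable by full-text search.
        "ocr_text": ocr(path).strip(),
    }
```

Assuming the Tesseract engine and the pytesseract and Pillow packages are installed, the `ocr` argument could be something like `lambda p: pytesseract.image_to_string(Image.open(p))`. Keeping the OCR step as a parameter means the record-building logic does not depend on any particular OCR engine.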

Because much information is not searchable when it sits in graphical formats embedded in PDF documents or PowerPoint presentations (e.g. screenshots), the OCR enhancer of Open Semantic Search also extracts images from PDF files for automatic text recognition (OCR).

The new version of Open Semantic Search extracts images not only from PDFs but also from PowerPoint presentations and ZIP archives.
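
To give an idea of what extracting images from such containers can look like, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the implementation used by Open Semantic Search: the function name and the list of image extensions are hypothetical. It relies on the fact that a .pptx file is itself a ZIP archive in which embedded pictures are usually stored under ppt/media/. The older binary .ppt format is not a ZIP archive and is not covered here.

```python
import zipfile

# Hypothetical selection of extensions treated as images in this sketch.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tif", ".tiff")


def iter_archive_images(archive):
    """Yield (member_name, image_bytes) for each image in a ZIP container.

    `archive` may be a file path or a binary file-like object. The same
    function works for plain .zip archives and for .pptx presentations.
    """
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Match on the lower-cased name so "IMAGE1.PNG" is found too.
            if info.filename.lower().endswith(IMAGE_EXTENSIONS):
                yield info.filename, zf.read(info)
```

Each pair yielded this way could then be passed to an OCR step, with the recognized text indexed together with the name of the containing presentation or archive.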