DocToText is a data extraction tool that converts a wide variety of file formats to plain text and HTML. It includes high-grade, scriptable, trainable OCR and email parsing capabilities. Supported input formats are DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), and images (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP).
Read more about Docwire SDK