Search: [OCR_TextExtraction] - Useful links to share

Apache Tika

The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more.

ApacheTika · Search_Engine · OCR_TextExtraction · Java · Text_Analysis_&_Keywords_Extraction · open-source

November 7, 2017 01:57:26 PM GMT+01:00 · permalink

https://tika.apache.org
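
To illustrate the "single interface" idea from the description above, here is a minimal Python sketch that sends any file to a Tika server and reads back plain text. It assumes a Tika server is already running locally on its default port 9998 and that its `/tika` endpoint is used with a PUT request; the helper names `build_extract_request` and `extract_text` are hypothetical and not part of Tika.

```python
import urllib.request

# Assumption: a Tika server is running locally on its default port, 9998.
TIKA_SERVER = "http://localhost:9998"


def build_extract_request(data: bytes, server: str = TIKA_SERVER) -> urllib.request.Request:
    """Build a PUT request asking the server's /tika endpoint for plain text."""
    return urllib.request.Request(
        url=f"{server}/tika",
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(path: str, server: str = TIKA_SERVER) -> str:
    """Send one file (PDF, PPT, XLS, ...) and return the text the server extracts."""
    with open(path, "rb") as handle:
        request = build_extract_request(handle.read(), server)
    # The same call is used whatever the file type; the server detects it.
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `extract_text("report.pdf")` and `extract_text("slides.ppt")` would go through the same code path, which is the point of the single interface; both file names are placeholders.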

Analyzing the Panama Papers with Neo4j: Data Models, Queries & More

These structures were uncovered from leaked financial documents and analyzed by the journalists. They extracted the document metadata using Apache Solr and Tika, then connected the information using the leaked databases, creating a graph of nodes and edges in Neo4j and making it accessible through Linkurious’ visualization application.
In this post, we look at the graph data model used by the ICIJ and show how to construct it using Cypher in Neo4j. We dissect an example from the leaked data, recreating it using Cypher, and show how the model could be extended.

Neo4j · graph · OCR_TextExtraction · Text_Analysis_&_Keywords_Extraction · journalisme · network · DataVisualization · ApacheTika

November 6, 2017 05:40:19 PM GMT+01:00 · permalink

https://neo4j.com/blog/analyzing-panama-papers-neo4j/

tesseract-ocr - An OCR Engine that was developed at HP Labs between 1985 and 1995... and now at Google. - Google Project Hosting

Text_Analysis_&_Keywords_Extraction · OCR_TextExtraction

August 3, 2014 11:47:45 PM GMT+02:00 · permalink

https://code.google.com/p/tesseract-ocr/
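
As a rough idea of how the engine can be driven from a script, the sketch below calls the `tesseract` command-line program from Python. It assumes the `tesseract` executable is installed and on the PATH, and that the requested language data (here `eng`) is available; the function names and the image path are hypothetical.

```python
import shutil
import subprocess
from typing import List


def build_tesseract_command(image_path: str, language: str = "eng") -> List[str]:
    """Return the argument list for one OCR run.

    Passing "stdout" as the output base makes tesseract print the
    recognized text instead of writing it to a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", language]


def ocr_image(image_path: str, language: str = "eng") -> str:
    """Run tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, language),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

For example, `ocr_image("scan.png", "fra")` would attempt French recognition on a placeholder file named `scan.png`.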

Free online OCR

Extract text from images with this free online OCR tool. No registration, no email.

Text_Analysis_&_Keywords_Extraction · OCR_TextExtraction

August 3, 2014 11:47:32 PM GMT+02:00 · permalink

http://www.free-ocr.com/

Free Online OCR - Convert JPEG, PNG, GIF, BMP, TIFF, PDF, DjVu to Text

Free online OCR service that converts scanned images, faxes, screenshots, PDF documents, and ebooks to text. It can process 58 languages and supports layout analysis.

Text_Analysis_&_Keywords_Extraction · OCR_TextExtraction

August 3, 2014 11:46:54 PM GMT+02:00 · permalink

http://www.newocr.com/

12.10 - How can instantaneously extract text from a screen area using OCR tools? - Ask Ubuntu

Text_Analysis_&_Keywords_Extraction · OCR_TextExtraction

August 3, 2014 11:46:21 PM GMT+02:00 · permalink

http://askubuntu.com/questions/280475/how-can-instantaneously-extract-text-from-a-screen-area-using-ocr-tools/280713#280713