DEVONthink contains an optical character recognition (OCR) module that lets you import scanned documents and make them searchable. The embedded OCR engine "reads" these documents and stores them as PDF files with an additional, invisible text layer that holds the searchable text. Use these options to control the engine, including the output format, the resolution, which languages to use, and what to do with the originals.
OCR Processing:
- Convert Incoming Scans: Choose the output file format for images or PDF files received from known scanning software, e.g., ScanSnap Home: searchable PDF, RTF document, Word document, or WebArchive. Use No Action to disable OCR for incoming scans, e.g., if the scanning software has already performed OCR.
- Original Document: Check Move to Trash to move the original documents to the trash after OCR is completed.
- Enter metadata after text recognition: Opens a metadata panel after OCR, where you can adjust any of these properties: title, author, subject, tags, and creation date.
Document Controls:
- Compress PDF: Create smaller PDFs by applying compression to the final document. Compression only applies when adding metadata post-OCR or preserving annotations from an original PDF after OCR.
- PDF Resolution: Set the desired resolution for the image layer in the PDF from 150 to 600 dpi. On M-series Macs, you can also choose As source to retain the originally scanned resolution.
- Page Orientation: Detect and correct the page orientation.
- Deskew: Correct angled pages in the final document.
Language Controls:
- Custom Dictionary: Check Use Dictionary to use a custom dictionary of acceptable words. For example, you may have an unusual spelling of someone's name in some documents. You can enter the name as an acceptable choice for the OCR engine to choose from. Click the Configure button to add custom entries for OCR detection. Note that you can have only one dictionary, which applies to the language chosen in the Language dropdown.
- Primary Language: Set the most common language of the documents you scan.
- Secondary Languages: Add other languages to match the language of specific documents you're processing.
  DEVONthink comes with more than 150 different language dictionaries. Adding extra languages can improve the accuracy of the text recognition. Select a language from the Available column on the right and add it to the Selected column using the right-to-left arrow button. Deactivate a language by selecting it in the Selected column and pressing the left-to-right arrow button. You can select a maximum of four secondary languages. Note that the primary language and the secondary languages are treated equally.
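The limits described above (a resolution of 150 to 600 dpi or As source, at most four secondary languages, and equal treatment of primary and secondary languages) can be summarized in a small sketch. This is only an illustrative model: `OcrSettings`, its field names, and the `AS_SOURCE` marker are hypothetical and are not part of DEVONthink or any scripting interface it provides.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Limits taken from the option descriptions above.
MIN_DPI = 150
MAX_DPI = 600
MAX_SECONDARY_LANGUAGES = 4
AS_SOURCE = "as source"  # hypothetical marker for the "As source" choice


@dataclass
class OcrSettings:
    """Hypothetical model of the OCR options; not a DEVONthink API."""
    primary_language: str
    pdf_resolution: Union[int, str]  # dpi value, or AS_SOURCE
    secondary_languages: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the settings fit the limits."""
        problems = []
        if len(self.secondary_languages) > MAX_SECONDARY_LANGUAGES:
            problems.append("at most four secondary languages can be selected")
        if self.pdf_resolution != AS_SOURCE:
            in_range = (
                isinstance(self.pdf_resolution, int)
                and MIN_DPI <= self.pdf_resolution <= MAX_DPI
            )
            if not in_range:
                problems.append("resolution must be 150 to 600 dpi or 'as source'")
        return problems

    def recognition_languages(self) -> List[str]:
        """Primary and secondary languages are treated equally, so return one flat list."""
        return [self.primary_language, *self.secondary_languages]
```

For example, `OcrSettings("English", 300, ["German", "French"]).validate()` returns an empty list, while a resolution of 72 or a fifth secondary language would each add one entry to the list of problems.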