PRO
OCR

DEVONthink contains an optical character recognition (OCR) module that lets you import scanned documents and make them searchable. The embedded OCR engine "reads" these documents and stores them as PDF files with an additional, invisible text layer that holds the searchable text. Use these options to control the engine, including the output format, the resolution, the language(s) to use, and what to do with the originals.

OCR Processing:

  • Convert Incoming Scans: Choose the output format for images or PDF files received from known scanning software, e.g., ScanSnap Home: searchable PDF, RTF document, Word document, or WebArchive. Choose No Action to disable OCR for incoming scans, e.g., if the scanning software has already performed OCR.
  • Original Document: Check Move to Trash to move the original documents to the trash after OCR is completed.
  • Enter metadata after text recognition: Opens a metadata panel after OCR, where you can adjust any of these properties: title, author, subject, tags, and creation date.

Document Controls:

  • Compress PDF: Create smaller PDFs by applying compression to the final document. Compression only applies when adding metadata post-OCR or preserving annotations from an original PDF after OCR.
  • PDF Resolution: Set the desired resolution for the image layer in the PDF, from 150 to 600 dpi. On M-series Macs, you can also choose As source to retain the original scan resolution.
  • Page Orientation: Detect and correct the page orientation.
  • Deskew: Correct angled pages in the final document.
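
To give a feel for what the PDF Resolution setting means in practice, the following Python sketch works out the pixel dimensions and the uncompressed size of a single page at a few resolutions. It is illustrative arithmetic only and not part of DEVONthink: the US Letter page size, the 24-bit color depth, and the function names are assumptions chosen for the example, and actual file sizes will be much smaller because PDF images are stored compressed.

```python
# Illustrative arithmetic only; nothing here calls DEVONthink.
# Assumption: a US Letter page (8.5 x 11 inches) scanned in 24-bit color.
LETTER_INCHES = (8.5, 11.0)


def pixel_dimensions(dpi, page_inches=LETTER_INCHES):
    """Return (width, height) in pixels for a page rendered at the given dpi."""
    width_in, height_in = page_inches
    return round(width_in * dpi), round(height_in * dpi)


def raw_megabytes(dpi, bytes_per_pixel=3, page_inches=LETTER_INCHES):
    """Return the uncompressed size of one page in MB (10**6 bytes)."""
    width_px, height_px = pixel_dimensions(dpi, page_inches)
    return width_px * height_px * bytes_per_pixel / 1_000_000


if __name__ == "__main__":
    for dpi in (150, 300, 600):
        width_px, height_px = pixel_dimensions(dpi)
        size_mb = raw_megabytes(dpi)
        print(f"{dpi} dpi: {width_px} x {height_px} px, about {size_mb:.1f} MB uncompressed")
```

Doubling the resolution quadruples the number of pixels, so the image layer grows quickly as the setting moves from 150 toward 600 dpi.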

Language Controls:

  • Custom Dictionary: Check Use Dictionary to use a custom dictionary of acceptable words. For example, if some documents contain an unusual spelling of someone's name, you can enter that name as an acceptable choice for the OCR engine. Click the Configure button to add custom entries for OCR detection. Note that you can only have one dictionary, specified for the language chosen in the Language dropdown.
  • Primary Language: Set the most common language of the documents you scan.
  • Secondary Languages: Add other languages to match the language of specific documents you're processing.

DEVONthink comes with more than 150 different language dictionaries. Adding extra languages can improve the accuracy of the text recognition. Select a language in the Available column on the right and add it to the Selected column using the right-to-left arrow button. Deactivate a language by selecting it in the Selected column and pressing the left-to-right arrow button. You can select a maximum of four secondary languages. Note that the primary language and the secondary languages are treated equally.