Building Your Database

Now that we've discussed the philosophy and core elements of DEVONthink, let's discuss actually making one.

Choosing the type of database

As you've read, a database is the core element you work with in DEVONthink — the place where you store, organize, find, and access your documents. But not every database has the same purpose or requires the same level of privacy and control. So DEVONthink offers several types of databases from which to choose.

Database (unencrypted): The most commonly used type, an unencrypted database is used for any purpose, from personal to professional. Easily created and able to grow as large as needed (and your internal disk space allows), these are the core type used by many people. The file extension in the Finder is .dtBase2.

Encrypted Database: If you have databases containing sensitive or private information you want to lock away when not in use, create an encrypted database. This is a specialized AES-256 encrypted disk image with a file extension of .dtSparse when closed. The mounted disk will not appear in the Finder's sidebar or on your desktop when it's open. To further enhance security, quitting DEVONthink or closing the database unmounts the disk, again locking its contents away. This means you are always required to enter the password to access it. To help visually identify it, an encrypted database displays a key property icon to the right of the database's name in the Navigate sidebar.

Audit-Proof Database: If you have mandatory requirements to store documents that can't be edited, e.g., for tax or legal reasons, an audit-proof database fulfills this goal. Audit-proof databases are archival and intended to be compliant with legal or financial standards. They are not working databases, but ones in which you store and "lock away" important documents. As such, they are inherently very limited in what you can do with them.

You can create groups or smart groups and you can import documents to the database. For long-term storage compliance, you can convert PDFs to PDF/A before you add them to your database. Once you add a document to the database, it is read-only and cannot be edited. The limitations also prohibit using actions like services or adding files via automation. OCR and imprinting documents are also prohibited. If you open a document in an external application and attempt to edit it, you will be warned it's locked. If you make a change, it will not persist. Your only option is to duplicate the document and save a new version outside the database.

Every item added to the database is recorded in an uneditable internal log with metadata such as the name, content hash, and filesystem dates. If you rename a document, the original name is still preserved. Dates cannot be changed. And even if you delete a document, the deletion is also recorded. All these interactions can be audited for selected documents, or you can export an audit report for the whole database.

An audit-proof database can be synced like any other database. However, it can only be imported to another Mac as an audit-proof database.

As preserving the security and integrity of the documents is paramount, there are also filesystem safeguards in place. For example, the Path in the Generic inspector does not display the document's file path, nor can you reveal the file in the Finder or copy its path. If you attempt to rename the database file in the Finder, it cannot be opened. And any attempt to modify the internals of an open audit-proof database will result in irreparable damage to it.

These databases are clearly not for casual use and should be utilized when your situation requires it.

Note: Audit-proof databases are completely incompatible with DEVONthink 3.x.

Technically similar to encrypted databases and mounted in a protected disk image, the file extension when closed is .dtArchive. In the Navigate sidebar, each audit-proof database has an icon to the right of its name.

When you create an encrypted or audit-proof database, you need to provide a few extra pieces of information:

  • Encryption key: Enter an encryption key that locks the database when it's closed. Bear in mind, you must remember or take note of this key. It is not stored in any accessible location. And it cannot be changed. So if you forget it, your data will be forever locked out of your hands.
  • Size: Since these databases are contained in secure disk images, you must specify the anticipated maximum size, in megabytes or gigabytes, the database will grow to. We recommend you determine a maximum size and add 20 to 30 percent to it. This allows for unanticipated future growth.
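
The sizing advice above comes down to simple arithmetic. As a rough illustration only, here is a small Python sketch of it. This is not something DEVONthink provides: the function name and the 25 percent default are assumptions made for the example.

```python
import math


def recommended_image_size_gb(expected_max_gb, headroom_percent=25):
    """Return the expected maximum size plus headroom, rounded up to whole GB.

    Hypothetical helper for illustration only. The 25 percent default is
    simply the midpoint of the 20 to 30 percent range recommended above.
    """
    if expected_max_gb <= 0:
        raise ValueError("expected_max_gb must be positive")
    if headroom_percent < 0:
        raise ValueError("headroom_percent must not be negative")
    # Work in whole percentages so that common inputs divide cleanly.
    return math.ceil(expected_max_gb * (100 + headroom_percent) / 100)


# Expecting roughly 40 GB of documents, with the default 25 percent headroom.
print(recommended_image_size_gb(40))
```

For an expected 40 GB of documents, this suggests entering 50 GB when you create the database.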

Spotlight Indexing: For all types of databases, you have the option to let Spotlight index their contents. However, the Spotlight index is stored locally and isn't encrypted, so Spotlight is typically disabled for encrypted databases. Otherwise, someone could potentially see that a document exists via a Spotlight search, though they wouldn't be able to open and access the database without the proper key. Spotlight indexing can be enabled and disabled per database in the Database Properties.

Database Location

Ideally, databases are stored in the Databases folder in your home directory, as that folder is quickly accessible, not synced via iCloud, and generally part of a standard backup. Alternatively, you can store a database on a connected external hard drive if your internal drive space is low. You can put the database on a NAS, but we only recommend this if you're on a hardwired gigabit Ethernet connection or better. That being said, you cannot create or store a database in a cloud-synced folder, e.g., iCloud Drive or Dropbox. This is not data-safe, so the behavior is explicitly disallowed. If you try to open a database in one of these locations, you will be prompted to let DEVONthink move the database, or reveal it so you can manually relocate it.
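
If you'd like to check a planned location yourself before creating a database, the idea can be sketched in a few lines of Python. This is only an illustration and says nothing about how DEVONthink performs its own check. The folder list is an assumption based on common macOS defaults (iCloud Drive under Library/Mobile Documents, third-party providers under Library/CloudStorage, and a classic Dropbox folder in the home directory), and the function name is hypothetical.

```python
from pathlib import Path

# Assumed locations of cloud-synced folders, relative to the home directory.
# These are common macOS defaults, not a list taken from DEVONthink.
CLOUD_SYNCED_DIRS = (
    "Library/Mobile Documents",  # iCloud Drive
    "Library/CloudStorage",      # third-party cloud providers
    "Dropbox",                   # classic Dropbox folder
)


def is_in_cloud_synced_folder(database_path, home=None):
    """Return True if database_path lies in one of the assumed cloud folders."""
    home_dir = Path(home) if home is not None else Path.home()
    candidate = Path(database_path).expanduser()
    for relative_dir in CLOUD_SYNCED_DIRS:
        root = home_dir / relative_dir
        # Match the folder itself or anything nested inside it.
        if candidate == root or root in candidate.parents:
            return True
    return False


# Hypothetical paths for a user named "sam".
print(is_in_cloud_synced_folder("/Users/sam/Databases/Research.dtBase2", home="/Users/sam"))
print(is_in_cloud_synced_folder("/Users/sam/Dropbox/Research.dtBase2", home="/Users/sam"))
```

The sketch compares paths as text and does not follow symbolic links. It also won't notice a Desktop or Documents folder that iCloud is syncing, so treat a negative result as a hint rather than a guarantee.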

You may think it is a clever idea to store your databases on an SD card: a portable database on hyper-portable media. However, this is not a good idea, as this type of media is not robust or made for long-term storage. (Consider how quickly pro photographers offload their SD cards to other drives.) We would caution you about thumb drives as well.

Once you've determined what type of database you need, select File > New Database and select the type. Give your database an easily recognizable name, set any type-specific options as mentioned above, then choose where you want to save it.

Adding your files

Adding items to a database is often a simple matter of dragging and dropping files into your database. And we've covered many other options in the important In and Out chapter. But the question is: What should I put in it, everything or…?

While you may be tempted to dump every file on your hard drive into DEVONthink and sort it out later, you're best off being more selective in what you add (especially in the beginning). Consider this: On your Mac are hundreds of thousands of files, including in your User Library. Many of those files are never seen or accessed by you. Putting your entire user account in a database only adds an incredible amount of useless data. And weeding these unwanted files out after the fact is both time-consuming and frustrating. DEVONthink is not a Finder nor a Spotlight replacement, and having a database filled with 90% useless documents is of no practical benefit to you. Also remember, DEVONthink has to index the metadata and contents of any compatible files. Indexing unnecessary files bloats the index of a database and leads to imprecise search results and false positives.

However, if you are working in your Documents folder, that would be more useful to add. Or if you are working on separate topics in that folder, perhaps storing your dissertation files in one folder and bookmarks and PDFs about kayaking in another, you could add each folder individually, or even to its own database.

One way to effectively create separate databases is to use a topical database approach. Create multiple databases, with each holding only related information: a bird watching database full of birding articles and newsletters; a quantum physics research database with research briefs and email. This method can improve the effectiveness of DEVONthink's internal artificial intelligence (AI), which works best within a database that contains contextual relationships among many documents. Clogging your new database with everything from A (apple pie recipes) to Z (zebra population statistics) will only hamper the AI's ability to work effectively.

Having topical databases can help down the road as well. You may be collaborating on a database, syncing between machines in a group. Imagine having just one database: You decide to share your painstakingly researched academic articles with colleagues, only to find that you've mistakenly also shared personal financial records and chats. It's not hard to imagine how that has the potential to be both dangerous and embarrassing. Having multiple, topical databases will allow you to keep your data separate and private. This approach can also be beneficial from a performance standpoint, which we'll see next.

Database Size: When it comes to database size, there are many variables that can limit the size or performance. Obviously, you need available disk space to grow the database. And you should always keep at least double the space free in case you require virtual memory or maintenance. But the file size of a database is not the critical factor; it's the number of words and amount of RAM available to DEVONthink. The reason is this: When you open a DEVONthink database, the index is loaded into memory. This makes search and classification lightning-fast! But the more words in your database, the larger the index. The larger the index, the more RAM is required to avoid using the hard drive as virtual memory. Look at the number of unique and total words for a database in the Database Properties window and use these soft-limits as a guideline:

  • Total Words: 400 million
  • Unique Words: 4 million
  • Total Items: 250,000
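
To make the comparison concrete, here is a minimal Python sketch that checks a set of figures against the soft limits above. It assumes you copy the numbers by hand from the Database Properties window. The dictionary keys, the function name, and the example figures are all made up for illustration and are not part of DEVONthink.

```python
# Soft limits quoted in the list above.
SOFT_LIMITS = {
    "total_words": 400_000_000,
    "unique_words": 4_000_000,
    "total_items": 250_000,
}


def exceeded_soft_limits(stats):
    """Return the names of the soft limits that the given figures exceed."""
    return [
        name
        for name, limit in SOFT_LIMITS.items()
        if stats.get(name, 0) > limit
    ]


# Made-up example figures, typed in by hand from Database Properties.
example = {
    "total_words": 20_000_000,
    "unique_words": 600_000,
    "total_items": 20_000,
}
print(exceeded_soft_limits(example))
```

An empty list means the database is within all three guidelines. Any names returned point to the figures worth watching, or to a database that may be a candidate for splitting.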

As your growing databases use RAM, processor time, etc., smaller, more focused databases are often a more effective approach than using singular, monolithic databases. Separate databases generally perform better, sync faster, and in the rare case of a catastrophe, can help avoid data loss since you're not keeping "all your eggs in one basket". Another benefit of this approach is the ability to conserve some machine resources. With a single, large database all the information is always using resources, even files unrelated to what you're working on at the moment. With separate databases, you can close and open specific databases as the need dictates.

You should also have as much RAM as possible. In fact, this should be a deciding factor when purchasing a Mac: the more RAM, the better. Choosing a machine with 8GB RAM may be functional but can also be less performant as your databases grow. With more powerful machines having much more RAM, the stated figures can be exceeded. However, staying within these limits helps keep things running smoothly.

Organizing

Database organization depends on the parties involved. For collaborative work, you'll want to organize it in a manner that's agreed upon by all parties using it. This is especially important since our sync technology is a mirroring sync, meaning changes to one copy of the database get synced to the other copies. If one person decides to reorganize things, it affects everyone. For personal work, just set up your database in a manner that makes sense to you. There is no right or wrong way to organize it. This is something you've likely already been doing in the Finder, making folders and filing things in them. Apply the same personal choices to DEVONthink.

You will likely see various organizational methods proclaimed as "the most effective". DEVONthink isn't built specifically for any of them; its flexibility simply allows people to adapt these methods in their databases. Feel free to explore these options if you'd like, but the best method is the one that makes sense and is efficient and effective for YOU.

Case study: Bill's Database Farm

Bill DeVille, formerly DEVONtechnologies' Evangelist, worked in a number of scientific areas. Bill's main database covered environmental science and technology topics, with related interests in science and technology exchanges with developing nations. The database even contained some projects dealing with graduate education in environmental sciences and engineering. There's a broad topical relationship among these subjects and the database covers disciplines ranging from chemistry, toxicology, statistics, risk assessment, and engineering to economics, legal, regulatory, and policy issues. These disciplines fit together and combinations of these topics are necessary in many real-world cases.

As you can imagine from the above description, Bill's main database was quite large, containing about 20,000 documents and over 20,000,000 total words. Because of the relationships knitting together all these scientific, technical, legal, and policy issues, the artificial intelligence features of DEVONthink worked very well for Bill in researching the database and contextualizing the information.

In addition to his main database, Bill had seven additional databases (so, eight total). For example, he had one database for Apple Newton literature he had accumulated over the years. It was almost as big as his main database, but the topical coverage had no practical relationship to the main database, so Bill kept the Apple Newton literature in its own domain. If he had kept this unrelated information in his main research database, the result would have been a larger, slower database, with poorer performance by the artificial intelligence.

Occasionally, Bill added topical materials to his main database that were not related to its main purpose. However, when those "unrelated" topics grew large enough in volume, he spun them off into a new database in order to preserve AI accuracy and relevance.

If you'd like to follow Bill's method, start by creating a database with some collections of files that interest you, but don't be afraid to create other databases that contain "different" material as your interests, and your main database, grow. And if you need to search across databases, simply open all of them at the same time. DEVONthink can easily search them simultaneously.

Remember that creating databases isn't an immutable commitment. Create and destroy them as you see fit. Start with one way of organizing, see how it works for you, and decide later to reorganize, if needed. You can keep multiple databases open simultaneously, easily moving documents from one database to the other at any time. As you work with your databases, new ideas may spark new approaches which can easily be tried and adopted or discarded. Remember this: The best organization method for your databases is the one that makes sense to and is effective for YOU.