PRO

AI Assisted Automation

Artificial intelligence is integrated into many parts of DEVONthink, and automation is no exception. Here we will look at a few different options for using your AI model in smart rules, batch processing, and scripts.

Smart Actions

There are two specific smart actions available to use in smart rules and batch processing: Chat - Query and Chat - Continue if. These actions are simple to use and require little more than creating a good prompt. Each action works in conjunction with a subsequent one. You make your query and receive a response, then you use another action to do something with it.

Chat - Query: This action lets you enter a pre-made prompt or ask a question. It can be used with or without a document, as needed. If you defined a Role in the AI > Chat settings, it is used to shape the response in this action.

Without Document: Enter an open prompt just as you would in the Chat popover.
With Document: Enter a prompt to be used with the selected document, as you would in the Chat inspector. You can explicitly choose Text, Vision for images, or Auto to let the AI decide.

The response you get back from your AI model is stored in a special placeholder: Query Response. Depending on the nature of the question and answer, you may be able to use this placeholder directly with another action. Or you may use it in another action further along in the chain of actions.

Chat - Continue if: This is a conditional action allowing you to enter a prompt for your AI and get a yes or no in return. If the response is yes, the following actions are run. If not, processing stops and the remaining actions are skipped. As a simple example, you could ask if the selected document mentions scripting. If it does, use the Apply Script: Tags - Add Most Important Words action to tag it.

Related Actions

A few other smart actions can make the AI responses more flexible or more useful in automation. Note that they are neither limited to the higher editions of DEVONthink nor usable only with AI actions.

User Input: Used only in batch processing, this action opens a dialog in which you enter text, such as a question or comment for the AI to respond to. The result is stored in the User Input placeholder. One way to use this with the AI actions is to build a dynamic prompt for a Chat - Query action: the action holds the reusable instructions, e.g., formatting, and the User Input placeholder supplies the part that changes. It can also be used with the next action: Set Script Input.

Set Script Input: Discussed in the previous section, this acts as a variable. It can accept input from many actions, including the User Input action. Regarding AI, it accepts input from the Query Response placeholder so you can pass it on to the next action: Script with Input/Output.

Script with Input/Output: Also discussed in the previous section, this action accepts input provided by the Set Script Input action and uses it in the embedded or external script you choose. With regard to AI, this action can pass its return value to one of the chat actions via the Script Output placeholder.
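As a rough sketch of how these pieces can fit together, the embedded script below takes an AI response handed over by Set Script Input and returns only its first line, for example to use as a document name in a later action. The handler name scriptOutput and its two parameters are an assumption here, not something confirmed in this section; check the previous section or the action's default script before relying on them.

```applescript
-- ASSUMPTION: handler name and parameters for the Script with Input/Output action.
-- Verify against the action's own template before use.
on scriptOutput(theRecord, theInput)
	-- Nothing was passed in, so hand back an empty string
	if theInput is missing value then return ""
	-- theInput is expected to hold the text provided by Set Script Input
	set theText to theInput as string
	-- An empty response has no paragraphs, so return an empty string
	if (count of paragraphs of theText) is 0 then return ""
	-- Return only the first line; this becomes the script's output for later actions
	return paragraph 1 of theText
end scriptOutput
```

Keeping the script to plain text handling means it does not depend on any particular AI engine; the chat actions before and after it do the model-specific work.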

Scripting

DEVONthink also integrates AI into its automation tools via script commands. See DEVONthink's AppleScript dictionary for the full details; here is a brief introduction to the commands.

display chat dialog: Provide a prompt and the specifier, e.g., a window or document to process, and this opens the Summarize and Transform popover with your AI's response. Optionally, add a role and a window name.

Example:

tell application id "DNtp"
	display chat dialog think window 1 name "Top five words" role "Provide the response in Markdown formatting. Include the frequency of each word in parentheses. Add a short summary. Include any misspelled words and their corrections at the end." prompt "What are the five most frequently used words in this document?"
end tell

download image for prompt: If you need to create an image programmatically instead of using the Generate Image window, you can use this command. Specify the engine, and note that options such as size are governed by the engine you're using. See the AI > Images settings for more information. The image is returned as data, which you can then assign to a record.

Example:

tell application id "DNtp"
	set imageData to download image for prompt "A 1934 canary-yellow Ford sitting at a stop light next to a red Ford Fusion. High noon on a small town main street." engine FluxPro size "1344x768"
	set newImage to create record with {name:"Fords", type:picture} in current group
	set data of newImage to imageData
end tell

get chat models for engine: Returns a list of the available models for a specific AI engine.

Example:

tell application id "DNtp" to get chat models for engine Claude
-->{"Claude 3 Haiku", "Claude 3.5 Haiku", "Claude 3.5 Sonnet", "Claude 3 Opus"}
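Because the command returns an ordinary AppleScript list, the result can be fed straight into standard commands. The sketch below, offered only as one possible use, shows the model names in a picker using the Standard Additions choose from list command; the variable names are illustrative, and the list you see depends on your engine and settings.

```applescript
tell application id "DNtp"
	-- Ask DEVONthink which models the engine offers (same command as above)
	set modelNames to get chat models for engine Claude
	-- Standard Additions picker; it returns false if the user cancels
	set thePick to choose from list modelNames with prompt "Available Claude models:"
	if thePick is not false then
		-- choose from list returns a list, so take its first item
		display dialog "You picked: " & (item 1 of thePick)
	end if
end tell
```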

get chat response for message: This command sends your prompt to the AI and returns its response. It accepts many options, e.g., engine, role, temperature, etc., which lets it be used in many different ways. You can also include a reference to a selected record or a list of them. Here is a simple interactive example of document creation:

Example:

tell application id "DNtp"
	set theInput to text returned of (display dialog "Tell me about…" default answer "sea turtles")
	set aiReply to (get chat response for message "list seven interesting facts about " & theInput & " in a Markdown table. Don't number the items. Don't use emoji. Include live hyperlinks to the sources. Add only " & theInput & " as the headline and include a picture after it.")
	set newDoc to create record with {name:(paragraph 1 of aiReply as string), type:markdown, content:aiReply} in current group
end tell

Obviously, this command is far more powerful than just making Markdown documents about sea turtles! But this hopefully provides a view of the syntax.
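In an unattended script it may be worth checking the reply before creating anything from it. The following is a minimal sketch that assumes a failed request either raises an AppleScript error or comes back as missing value or an empty string; this section does not state how the command behaves on failure, so treat those checks as a precaution, not as documented behavior.

```applescript
tell application id "DNtp"
	try
		set aiReply to get chat response for message "Summarize the life cycle of sea turtles in three sentences."
	on error errMsg
		-- ASSUMPTION: a failed request may surface as a script error
		display alert "Chat request failed" message errMsg
		return
	end try
	-- ASSUMPTION: an unanswered request may yield missing value or an empty string
	if aiReply is missing value or aiReply is "" then
		display alert "No response received"
		return
	end if
	-- Only create the document once there is something to put in it
	create record with {name:"Sea turtle summary", type:markdown, content:aiReply} in current group
end tell
```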