Connect to other AI Agents
Objective
This cookbook is helpful if your goal is to chat with a third-party (3P) agent or LLM through your Moveworks AI Assistant.
Use Cases
You can do this for:
- Other AI agents: You can send a prompt to those agents and wait for them to do work (e.g. Workday's agent, M365 Copilot). Note: This requires those vendors to expose a "Responses API", "Chat Completion API", or similar.
- RAG (Retrieval Augmented Generation) Systems: In these models, the LLM is provided additional context in the prompt.
- In-House LLMs: These are fine-tuned models trained on your data that generate content based on that training data.
- Foundation Models: These are general, off-the-shelf models, like ChatGPT, Claude, or other families of LLMs.
If you're trying to connect to a Foundation Model like GPT, try our built-in plugin: QuickGPT.
Architecture & Implementation
If you want to connect Moveworks to other AI agents & applications, you can do so through our plugins. Your plugin will look something like this:
```mermaid
sequenceDiagram
    participant User
    participant AI_Assistant as AI Assistant
    participant Claude_Plugin as Claude Plugin
    participant Anthropic_API as Anthropic API
    participant Search_Plugin as Search Plugin

    %% First Turn
    User->>AI_Assistant: "Get competitive intel on enterprise chat platforms"
    AI_Assistant->>AI_Assistant: Select Claude_Plugin
    AI_Assistant->>Claude_Plugin: Send prompt
    Claude_Plugin->>Anthropic_API: Call Anthropic API
    Anthropic_API-->>Claude_Plugin: Return competitive summary
    Claude_Plugin-->>AI_Assistant: Return Claude's output
    AI_Assistant-->>User: Deliver competitive intel summary

    %% Follow-up Turn
    User->>AI_Assistant: "Now contrast that with our native chat capabilities"
    AI_Assistant->>Search_Plugin: Retrieve product docs, internal feature specs, etc.
    Search_Plugin-->>AI_Assistant: Return relevant internal context
    AI_Assistant->>AI_Assistant: Combine Claude's competitive intel with org context
    AI_Assistant-->>User: Share analysis
```
For the easiest implementation, we recommend the following high-level approach.
- Choose an invocation phrase for your LLM. Here we are using "Hey Claude".
- Create two slots:

| Slot Name | Data Type | Slot Description |
|---|---|---|
| query | string | The query the user has input to you. |
| conversation_context | object | **Description:** Capture the immediate conversational context by recording the last user message and the last bot response. This object should NEVER be requested from the user; it should be populated automatically from the conversation history to maintain relevance and continuity for subsequent turns. **Properties:** `last_user_message` (string): NEVER ASK THE USER FOR THIS INFORMATION. This is the literal text of the last RELEVANT message the user sent. It represents the user's direct input that prompted your most recent response. Make it exact; do not summarize. Capturing the user's query or statement verbatim is crucial for understanding the immediate context of the conversation turn. `last_bot_message` (string): NEVER ASK THE USER FOR THIS INFORMATION. This is the literal text of the last RELEVANT message you sent. Focus on the content you replied with rather than on whether other plugins found any knowledge. This is good for maintaining contextual relevance. Make it exact; do not summarize it at all; length is not an issue here. Ignore things like "Here are the resolved arguments" or progress updates. Only grab the final content sent to the user. |
- Set up an HTTP action:

```bash
curl https://api.anthropic.com/v1/messages \
  -X POST \
  -H 'Content-Type: application/json' \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H 'anthropic-version: 2023-06-01' \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "{{user_query}}"
      }
    ]
  }'
```
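If your plugin needs to extract the generated text (for example, to return it to the Assistant or to pass it to a notify step later), the Messages API returns it inside a `content` array. An abridged response looks roughly like this (field values are illustrative):

```json
{
  "id": "msg_01...",
  "type": "message",
  "role": "assistant",
  "model": "claude-3-5-sonnet-20241022",
  "content": [
    { "type": "text", "text": "Here is the competitive summary..." }
  ],
  "stop_reason": "end_turn"
}
```

The generated text is therefore available at `content[0].text` in your action's output.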
- Configure your Conversation Process to use that HTTP action & pass the slots into the API call:

```yaml
user_query: |
  $CONCAT([
    "'UserInput:'", data.query,
    "'PreviousBotMessage:'", $TEXT(data.conversation_context)
  ])
```
- Add a content activity to help the AI Assistant select your plugin on subsequent turns.
Check out the demo here!
Architecture Decisions
Context Engineering
There are three ways that you can manage context, each with their own pros & cons.
| Strategy | Description | Pros | Cons |
|---|---|---|---|
| Slots | Let the Agentic Reasoning Engine decide what conversation history to provide your model. | Can intelligently combine context with your org-specific knowledge (e.g. via the Search plugin). | Context is lossy. The reasoning engine won't provide ALL the detail for your external API to use. |
| Threads API | If your LLM supports it, generate a thread_id and collect it as a slot. | All of your conversation context is preserved between turns. | Limited availability across AI vendors. More complex setup. |
| Custom Database | Store user & system messages in a custom database. | Full control over the context-engineering approach. | Increases the number of systems touching your personal data. Databases will need to be secured. |
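To make the Threads API strategy concrete, here is a hedged sketch assuming a vendor that supports server-side chaining, for example OpenAI's Responses API, where a `previous_response_id` ties a new request to an earlier one. The model name and the `{{thread_id}}` / `{{user_query}}` placeholders are illustrative, and your vendor's field names may differ:

```bash
# First turn: no prior context; store the returned response "id" in a slot
curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "input": "{{user_query}}"
  }'

# Follow-up turn: send the stored id back so the vendor replays the full thread server-side
curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "previous_response_id": "{{thread_id}}",
    "input": "{{user_query}}"
  }'
```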
Response Generation
If you want to present the results directly to the user without letting the AI Assistant mutate the outputs, use a compound action instead of an HTTP action in your action activity.
This lets you hit the LLM API as before, but send the output to the user via a notify step instead of returning it conversationally.
This bypasses the AI Assistant on the way to the user, preserving the original voice of your generation.
```mermaid
sequenceDiagram
    participant User
    participant AI_Assistant as AI Assistant
    participant Claude_Plugin as Claude Plugin
    participant Anthropic_API as Anthropic API
    participant Search_Plugin as Search Plugin

    %% First Turn
    User->>AI_Assistant: "Get competitive intel on enterprise chat platforms"
    AI_Assistant->>AI_Assistant: Select Claude_Plugin
    AI_Assistant->>Claude_Plugin: Send prompt
    Claude_Plugin->>Anthropic_API: Call Anthropic API
    Anthropic_API-->>Claude_Plugin: Return competitive summary
    Claude_Plugin-->>User: Notify user with EXACT output
```
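A minimal sketch of that compound action is below. It assumes a hypothetical `call_claude` HTTP action wrapping the Anthropic call above and Moveworks' built-in plaintext chat notification action; the action names, argument shapes, and identity field are assumptions, so check your environment's action catalog before using them:

```yaml
steps:
  - action:
      action_name: call_claude                              # hypothetical HTTP action from the setup above
      output_key: claude_result
      input_args:
        user_query: data.query
  - action:
      action_name: mw.send_plaintext_chat_notification      # assumed built-in notify action
      output_key: notify_result
      input_args:
        user_record_id: meta_info.user.employee_number      # assumption: adjust to your identity field
        message: data.claude_result.content[0].text         # deliver Claude's output verbatim
```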
Token Consumption and Cost
LLM providers charge based on the number of tokens processed (both input prompt and output generation). Long conversations or large documents can become expensive quickly.
Best Practices:
- Set Limits: Always use the `max_tokens` parameter in your API calls to cap the length of the response and prevent unexpectedly large (and expensive) outputs.
- Be Concise: Encourage users and design system prompts to be as concise as possible.
- Monitor Usage: Regularly check your API usage and cost dashboards on the LLM provider's platform.
- Choose the Right Model: For simpler tasks, consider using smaller, faster, and cheaper models instead of the most powerful (and most expensive) ones.
Data Security & Privacy
Standard public LLM APIs may use your prompt data to train their models. Sending Personally Identifiable Information (PII) or sensitive company data is a significant risk.
Best Practices:
- Consult Your Security Team: Always review the data privacy and terms of service for any LLM provider.
- Prefer Enterprise Offerings: Whenever possible, use enterprise-grade services like Azure OpenAI or an OpenAI Enterprise agreement, which typically guarantee that your data will not be used for model training.
- Anonymize Data: If you must send potentially sensitive information, build steps in your workflow to find and replace sensitive data with placeholders before sending it to the LLM. You can use our LLM Actions to do this.
- Educate users: Inform users about what data is being sent to a third-party service and advise them against submitting sensitive information. You can do this through a Content Activity & enabling the Activity Confirmation Policy on your API call.
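One way to sketch the anonymization step is a redaction pass before the real request is sent. The example below reuses the Anthropic Messages API with a redaction-focused system prompt rather than a Moveworks LLM Action; the prompt wording and two-call flow are illustrative, not a prescribed implementation:

```bash
# Pass the raw text through a redaction prompt first; only the redacted
# output is forwarded to the downstream LLM call.
curl https://api.anthropic.com/v1/messages \
  -X POST \
  -H 'Content-Type: application/json' \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H 'anthropic-version: 2023-06-01' \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "system": "Replace any names, emails, phone numbers, or IDs in the user text with placeholders like [NAME] or [EMAIL]. Return only the redacted text.",
    "messages": [
      { "role": "user", "content": "{{user_query}}" }
    ]
  }'
```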
Plugin Selection
Triggering reliability can vary depending on the use case and the breadth of subject matter your positive utterances cover. Below are some options to optimize your LLM plugins:
- Define Diverse But Specific Utterances: In your plugin's trigger configuration, provide a wide range of example phrases. For a summarization plugin, this could include:
- "summarize this document"
- "give me the tl;dr"
- "what are the key points of this?"
- "can you create an executive summary"
- Define a trigger keyword: Assign a deterministic triggering phrase to your plugin so that users can trigger the plugin on command.
- Use a System Prompt: Instead of relying on the user to frame their entire request, use the `system` message (or an equivalent field) in your API request body. This pre-prompts the LLM with its role or instructions (e.g., "You are an expert at rewriting text to be more professional"). The user then only needs to provide the core input, making the interaction much smoother. See the sketch below.
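For example, with the Anthropic Messages API used above, the system instruction goes in the top-level `system` field (the instruction text here is only an illustration):

```bash
curl https://api.anthropic.com/v1/messages \
  -X POST \
  -H 'Content-Type: application/json' \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H 'anthropic-version: 2023-06-01' \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "system": "You are an expert at rewriting text to be more professional.",
    "messages": [
      { "role": "user", "content": "{{user_query}}" }
    ]
  }'
```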