---
title: LLM Action Best Practices
position: 5
excerpt: When and how to use LLM actions effectively in Agent Studio workflows.
deprecated: false
hidden: false
metadata:
  title: LLM Action Best Practices | Agent Studio Best Practices
  description: >-
    Guide to using generate_text_action and generate_structured_value_action in
    Agent Studio: when to reach for AI, choosing the right action type, writing
    system prompts, model selection, and real workflow examples.
  robots: index
---
Agent Studio workflows are fundamentally deterministic. HTTP actions call APIs, mappers transform data, DSL handles logic. The reasoning engine handles slot collection and inference. But at certain points in a workflow, you need language understanding that deterministic code can't provide: classifying ambiguous input, summarizing a verbose API response, extracting structured data from free text, or making routing decisions from natural language. That's where LLM actions come in.
This page covers when and how to use LLM actions effectively. For the API reference (parameters, output schema, model table), see [LLM Actions](/agent-studio/actions/llm-actions).
## When to Reach for an LLM Action
LLM actions fit a specific set of problems in Agent Studio workflows. Here are the real use cases where they shine:
* **Classification** — routing tickets by category, determining intent from user input, tagging content. The input is unstructured text, and the output is one of a known set of labels.
* **Summarization** — condensing verbose API responses into human-readable summaries. A 50-field ServiceNow incident record becomes a 3-sentence status update.
* **Extraction** — pulling structured fields from unstructured text. Names, dates, dollar amounts, or reference numbers buried in a free-text description.
* **Transformation** — reformatting data that doesn't follow a predictable pattern. Parsing inconsistent date formats from different systems, normalizing address fields, or cleaning up user input.
* **Decision support** — analyzing context to recommend an action or route in a workflow. Given a set of facts, determine the best next step.
| Use Case | Example | Action Type |
| -------------------------- | --------------------------------------------- | ---------------------------------- |
| Classify a support ticket | Route to IT vs HR vs Facilities | `generate_structured_value_action` |
| Summarize an incident | 50-field API response to 3-sentence summary | `generate_text_action` |
| Extract entities from text | Pull names, dates, amounts from a description | `generate_structured_value_action` |
| Draft a response | Generate a reply email from context | `generate_text_action` |
| Route a workflow | Decide next step based on user input | `generate_structured_value_action` |
## `generate_text_action` vs `generate_structured_value_action`
Both actions call an LLM, but they serve different purposes. Picking the right one upfront saves you from wrestling with output parsing later.
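The difference shows up in the downstream code. A minimal Python sketch (the replies here are invented for illustration): consuming a free-text reply means fragile keyword matching, while a structured reply is one `json.loads` away from reliable fields.

```python
import json

# Free-text reply from a text action: downstream code has to guess the format.
text_reply = "This looks like an IT issue, probably high priority."
category = "IT" if "IT" in text_reply else "unknown"  # fragile keyword matching

# Structured reply from a structured action: fields are guaranteed by the schema.
structured_reply = '{"category": "IT_Hardware", "priority": "P2_High"}'
parsed = json.loads(structured_reply)

print(category)            # keyword guess
print(parsed["category"])  # reliable field access
```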
### Use `generate_text_action` when:
* The output IS the deliverable (summaries, drafts, explanations)
* You don't need to parse the output programmatically
* The downstream consumer is a human (via the reasoning engine)
### Use `generate_structured_value_action` when:
* You need to use the output in downstream logic (if/else, routing, API calls)
* You need consistent, parseable fields (classification labels, extracted entities, boolean decisions)
* Output reliability matters more than prose quality
### Side-by-Side: Same Ticket, Different Actions
Consider a support ticket with this description:
> *Employee Sarah Chen (Building 3, Floor 2) reports that her Dell Latitude 5540 laptop screen has been intermittently flickering since the company-wide Windows 11 23H2 update pushed last Thursday. The flickering occurs every 10-15 minutes and lasts approximately 30 seconds each time. During these episodes she is unable to read text on screen and gets disconnected from Microsoft Teams video calls, which has caused her to miss portions of three client meetings this week. She has already tried restarting the laptop, updating the Intel Iris Xe display driver through Device Manager, and connecting an external monitor (which works fine, suggesting the issue is with the built-in display panel or its driver). She notes that two other colleagues on the same floor are experiencing similar symptoms after the same update. Her ticket was originally filed under "General IT Request" but she believes it should be escalated given the impact on client-facing work.*
**Text action for a human-readable summary:**
```yaml title="generate_text_action" wordWrap
- action:
    action_name: mw.generate_text_action
    input_args:
      system_prompt: >
        Summarize this support ticket in 2-3 sentences for a help desk agent.
        Include the reported symptom, business impact, and what the user has
        already tried. Be concise.
      user_input: data.ticket_description
    output_key: ticket_summary
```
Output: `"Sarah Chen's laptop screen flickers every 10-15 minutes since the Windows 11 23H2 update, dropping her from Teams video calls and impacting client meetings. She's tried driver updates and an external monitor (which works), pointing to a display panel or driver issue. Two colleagues on the same floor have the same problem, suggesting a broader rollout issue."`
**Structured action for routing logic:**
```yaml title="generate_structured_value_action" maxLines=25 wordWrap
- action:
    action_name: mw.generate_structured_value_action
    input_args:
      system_prompt: >
        You are a support ticket classifier for an enterprise IT help desk.
        Given a ticket description, determine the category, priority, and
        affected system.
      user_input: data.ticket_description
      output_schema: >-
        {
          "type": "object",
          "properties": {
            "category": {
              "type": "string",
              "enum": ["IT_Hardware", "IT_Software", "IT_Access", "HR", "Facilities", "Finance"]
            },
            "priority": {
              "type": "string",
              "enum": ["P1_Critical", "P2_High", "P3_Medium", "P4_Low"]
            },
            "affected_system": {
              "type": "string"
            }
          },
          "required": ["category", "priority", "affected_system"],
          "additionalProperties": false
        }
      strict: true
    output_key: ticket_classification
```
Output: `{ "category": "IT_Software", "priority": "P2_High", "affected_system": "Windows Display Driver" }`
Notice `additionalProperties: false` and `strict: true` on the structured action. These are required for reliable output. Without them, the model may add extra fields or deviate from your schema. The `output_schema` must be written as a JSON string using `>-` block syntax, not as native YAML.
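Outside Agent Studio, you can sanity-check a schema and a candidate output with plain Python. This sketch hand-rolls the three constraints that matter here (required fields, enum membership, no extra fields); it illustrates what `strict: true` buys you, and is not the platform's actual validator.

```python
import json

# The classification schema from the structured action above, as a JSON string.
schema = json.loads("""
{
  "type": "object",
  "properties": {
    "category": {"type": "string",
                 "enum": ["IT_Hardware", "IT_Software", "IT_Access", "HR", "Facilities", "Finance"]},
    "priority": {"type": "string",
                 "enum": ["P1_Critical", "P2_High", "P3_Medium", "P4_Low"]},
    "affected_system": {"type": "string"}
  },
  "required": ["category", "priority", "affected_system"],
  "additionalProperties": false
}
""")

def validate(output: dict) -> list:
    """Return a list of violations of the schema's core constraints."""
    errors = []
    for field in schema["required"]:
        if field not in output:
            errors.append(f"missing required field: {field}")
    for field, value in output.items():
        prop = schema["properties"].get(field)
        if prop is None:  # additionalProperties: false
            errors.append(f"unexpected field: {field}")
        elif "enum" in prop and value not in prop["enum"]:
            errors.append(f"{field}: {value!r} not in enum")
    return errors

good = {"category": "IT_Software", "priority": "P2_High",
        "affected_system": "Windows Display Driver"}
bad = {"category": "Hardware", "priority": "P2_High"}

print(validate(good))  # no violations
print(validate(bad))   # invalid enum value plus a missing field
```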
## Writing Effective System Prompts
The system prompt is the most important input to an LLM action. A vague prompt produces vague results.
### Be specific about the task
Tell the model exactly what you want, including the full set of valid outputs.
**Weak:**
```yaml
system_prompt: '''Analyze this ticket and tell me what category it belongs to.'''
```
**Strong:**
```yaml wordWrap
system_prompt: >
  You are a support ticket classifier. Given a ticket description,
  classify it into exactly one category: IT, HR, Facilities, or Finance.
  Respond with only the category name.
```
The weak prompt leaves the model guessing about what categories exist and how to format the response. The strong prompt eliminates ambiguity.
### Constrain the output
For `generate_text_action`, tell the model what format you expect:
```yaml wordWrap
system_prompt: >
  Summarize this incident in exactly 3 bullet points. Each bullet should
  be one sentence. Do not include a header or introduction.
```
For `generate_structured_value_action`, the output schema handles format constraints, but the system prompt should still describe what each field means and how to determine its value.
### Include examples
Few-shot prompting works well for edge cases. Show the model 2-3 input/output pairs:
```yaml wordWrap
system_prompt: |
  You are a support ticket classifier. Given a ticket description,
  classify it into exactly one category: IT, HR, Facilities, or Finance.
  Respond with only the category name.

  Examples:
  - "My laptop won't connect to WiFi" -> IT
  - "I need to update my direct deposit info" -> HR
  - "The AC in building 3 is broken" -> Facilities
  - "I need to submit a purchase order for new monitors" -> Finance
  - "Can't access Salesforce after password reset" -> IT
```
Examples are especially useful when categories overlap. "Can't access Salesforce" could be IT or Finance depending on context. The examples anchor the model's interpretation.
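If your categories and example pairs live in config rather than a hand-written prompt, the few-shot structure above is easy to assemble programmatically. A sketch in plain Python, using the same examples as the prompt above:

```python
CATEGORIES = ["IT", "HR", "Facilities", "Finance"]
EXAMPLES = [
    ("My laptop won't connect to WiFi", "IT"),
    ("I need to update my direct deposit info", "HR"),
    ("The AC in building 3 is broken", "Facilities"),
    ("I need to submit a purchase order for new monitors", "Finance"),
    ("Can't access Salesforce after password reset", "IT"),
]

def build_classifier_prompt(categories, examples):
    """Assemble a few-shot classification system prompt from config."""
    lines = [
        "You are a support ticket classifier. Given a ticket description,",
        f"classify it into exactly one category: {', '.join(categories[:-1])}, or {categories[-1]}.",
        "Respond with only the category name.",
        "",
        "Examples:",
    ]
    lines += [f'- "{text}" -> {label}' for text, label in examples]
    return "\n".join(lines)

print(build_classifier_prompt(CATEGORIES, EXAMPLES))
```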
### Don't contradict yourself
If you say "be concise" in one sentence and "explain your reasoning in detail" in the next, the model will pick one or produce something awkward. Review your prompt for conflicting instructions before deploying.
## Model Selection
Pick the smallest model that handles your task well. You can always upgrade if the output quality isn't sufficient.
| Model | Best For | Notes |
| ------------- | ------------------------------------------------ | -------------------------------------------------------- |
| `gpt-4o-mini` | Classification, extraction, simple summarization | Fast, cheap, good enough for most tasks. Default choice. |
| `gpt-4o` | Nuanced summarization, complex extraction | Better quality when mini isn't cutting it |
| `gpt-5` | Multi-step reasoning, edge-case classification | Use when you need the model to think through ambiguity |
Set the model in `input_args`:
```yaml
input_args:
  model: '''gpt-4o-mini'''
```
If you don't specify a model, it defaults to `gpt-4o-mini`. For most classification and extraction tasks, the default is fine. Start there and upgrade only if you see quality issues in testing.
## Temperature
Temperature controls output randomness. Lower values produce more consistent, deterministic results. Higher values produce more varied, creative output.
* **Low (0-0.3):** Classification, extraction, routing, anything where consistency matters.
* **Medium (0.5-0.7):** Summarization, content generation, drafting responses.
The default works for most use cases. Override it when you have a specific need:
```yaml
input_args:
  temperature: 0.2
```
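The intuition behind these ranges: the model divides its raw scores (logits) by the temperature before converting them to probabilities, so low values concentrate probability on the top choice and high values flatten the distribution. A small numeric sketch with made-up logits:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate labels

for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    # Lower temperature puts more probability mass on the top label.
    print(t, [round(p, 3) for p in probs])
```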
## Workflow Example: Classify and Route a Support Ticket
Here's a complete compound action that shows an LLM action in context. The workflow fetches a ticket from ServiceNow, classifies it with a structured LLM action, then assigns it to the right team.
```yaml title="Compound Action: Classify and Route Ticket" maxLines=30 wordWrap
steps:
  # Step 1: Fetch ticket details from ServiceNow
  - action:
      action_name: fetch_ticket_details
      input_args:
        ticket_id: data.ticket_id
      output_key: ticket_data

  # Step 2: Classify the ticket using an LLM
  - action:
      action_name: mw.generate_structured_value_action
      input_args:
        system_prompt: |
          You are a support ticket classifier for an enterprise IT help desk.
          Given a ticket's short description and full description, determine
          the category, priority level, and the team that should handle it.
          Categories: IT_Hardware, IT_Software, IT_Access, HR, Facilities, Finance
          Priority: P1_Critical (system down, multiple users affected),
          P2_High (single user blocked), P3_Medium (degraded but functional),
          P4_Low (cosmetic or future request)
          Teams: Desktop_Support, Network_Ops, Identity_Access, HR_Operations,
          Facilities_Mgmt, Finance_Ops
          Examples:
          - "Laptop won't power on" -> IT_Hardware, P2_High, Desktop_Support
          - "Need VPN access for new contractor" -> IT_Access, P3_Medium, Identity_Access
          - "Office kitchen sink is leaking" -> Facilities, P3_Medium, Facilities_Mgmt
        user_input:
          short_description: data.ticket_data.short_description
          description: data.ticket_data.description
          reported_by: data.ticket_data.caller_id
          created: data.ticket_data.sys_created_on
        output_schema: >-
          {
            "type": "object",
            "properties": {
              "category": {
                "type": "string",
                "enum": ["IT_Hardware", "IT_Software", "IT_Access", "HR", "Facilities", "Finance"]
              },
              "priority": {
                "type": "string",
                "enum": ["P1_Critical", "P2_High", "P3_Medium", "P4_Low"]
              },
              "suggested_team": {
                "type": "string",
                "enum": ["Desktop_Support", "Network_Ops", "Identity_Access", "HR_Operations", "Facilities_Mgmt", "Finance_Ops"]
              },
              "reasoning": {
                "type": "string"
              }
            },
            "required": ["category", "priority", "suggested_team", "reasoning"],
            "additionalProperties": false
          }
        strict: true
        model: '''gpt-4o-mini'''
        temperature: 0.1
      output_key: classification

  # Step 3: Assign the ticket to the suggested team
  - action:
      action_name: assign_ticket
      input_args:
        ticket_id: data.ticket_id
        assignment_group: data.classification.suggested_team
        priority: data.classification.priority
        category: data.classification.category
        work_notes: |
          Auto-classified by Agent Studio.
          Category: {{data.classification.category}}
          Priority: {{data.classification.priority}}
          Team: {{data.classification.suggested_team}}
          Reasoning: {{data.classification.reasoning}}
      output_key: assignment_result
```
A few things to note in this example:
* **Low temperature (0.1)** because classification needs to be consistent. The same ticket should get the same classification every time.
* **`strict: true`** and **`additionalProperties: false`** on the output schema guarantee the model returns exactly the fields you expect.
* **The `output_schema` uses `>-` block syntax** to write JSON inline in YAML. This is required for `generate_structured_value_action`.
* **The `reasoning` field** is included in the schema so the model explains its decision. This gets written to the ticket's work notes for the human agent, but it doesn't affect routing logic.
* **Step 3 only uses the structured fields** (`suggested_team`, `priority`, `category`) from the classification. No free-text parsing needed.
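Even with `strict: true` guaranteeing the output shape, it's cheap to guard the routing step against an unexpected team value. A defensive sketch in plain Python; the `Manual_Triage` fallback group and the classification dict are hypothetical:

```python
VALID_TEAMS = {"Desktop_Support", "Network_Ops", "Identity_Access",
               "HR_Operations", "Facilities_Mgmt", "Finance_Ops"}

def route(classification: dict) -> str:
    """Pick an assignment group, falling back to manual triage on anything unexpected."""
    team = classification.get("suggested_team")
    return team if team in VALID_TEAMS else "Manual_Triage"

# Hypothetical Step 2 outputs:
print(route({"suggested_team": "Desktop_Support", "priority": "P2_High"}))
print(route({"suggested_team": "Unknown_Team"}))
```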
## Related Pages

* [LLM Actions](/agent-studio/actions/llm-actions): full API reference for `generate_text_action` and `generate_structured_value_action`, including parameters, output schema, and the model table.
* Compound action and DSL anti-patterns: when NOT to use LLM actions.
* Decision trees for when to use LLM vs. DSL, compound action vs. HTTP, and more.
* Never chain actions without a slot barrier: the architecture principle that keeps workflows fast.