As research labs develop frontier large language models, Moveworks is committed to bringing frontier intelligence to your AI assistant as fast as possible.

The Reasoning Engine Upgrade Program gives every Moveworks tenant a predictable upgrade track for new versions of the reasoning engine.

Upgrade Programs

Frontier Reasoning Engine

Receive the latest reasoning engine models as soon as they are available.

Best for:

Trailblazing customers who want users to experience frontier intelligence and are comfortable with pre-GA quality
All customer dev/sandbox tenants

Standard Reasoning Engine

Receive the latest reasoning engine models once they are ready for GA.

Best for:

Most production tenants for enterprise customers

Basic Reasoning Engine

Migrated to the latest reasoning engine models at the deprecation deadline, typically 2–4 weeks after the Standard release.

Best for:

Organizations with more extended change management programs.

Limited Support Warning

Note: If you experience reasoning quality issues with a “Basic Reasoning Engine” model version that’s behind the “Standard” release, our support team won’t be able to help.

We focus all our ML services for prompt tuning & enhancements on the latest version of the Reasoning Engine.

Selecting an Upgrade Program

If you own multiple tenants, you can mix & match programs across your tenants. For example, a customer might put their sandbox on Frontier and production on Basic.

You can change your program version at any time to access different models. Simply go to Moveworks Setup > Core Platform > AI Assistant > Advanced Settings > Conversation Settings.

Once you pick your model tier, the update will be reflected for your users in under 5 minutes.

The current models for the upgrade program are:

Data Center	Frontier	Standard	Basic
US West	GPT-5.4	GPT-5.4	GPT-5.2
Europe	GPT-5.4	GPT-5.4	GPT-5.2
Canada	GPT-5.4	GPT-5.4	GPT-5.2
Australia	GPT-5.4	GPT-5.4	GPT-5.2
GovCloud	GPT-5.4	GPT-5.4	GPT-5.2
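
As a rough illustration of how the mix-and-match example above plays out against this table, the Python sketch below resolves which model a given tenant would run. The `CURRENT_MODELS` dictionary simply mirrors the table; the tenant names, the `Tenant` class, and the `model_for` function are hypothetical and are not part of any Moveworks API.

```python
from dataclasses import dataclass

PROGRAMS = ("Frontier", "Standard", "Basic")
DATA_CENTERS = ("US West", "Europe", "Canada", "Australia", "GovCloud")

# Mirrors the table above: data center -> program -> current model.
# Every data center currently lists the same models, so one row is reused.
CURRENT_MODELS = {
    dc: {"Frontier": "GPT-5.4", "Standard": "GPT-5.4", "Basic": "GPT-5.2"}
    for dc in DATA_CENTERS
}


@dataclass(frozen=True)
class Tenant:
    """Hypothetical record of one tenant's data center and chosen program."""
    name: str
    data_center: str
    program: str


def model_for(tenant: Tenant) -> str:
    """Look up the model a tenant runs, based on the table above."""
    if tenant.program not in PROGRAMS:
        raise ValueError(f"Unknown program: {tenant.program!r}")
    if tenant.data_center not in CURRENT_MODELS:
        raise ValueError(f"Unknown data center: {tenant.data_center!r}")
    return CURRENT_MODELS[tenant.data_center][tenant.program]


# Example from the text: sandbox on Frontier, production on Basic.
tenants = [
    Tenant("acme-sandbox", "US West", "Frontier"),  # made-up tenant name
    Tenant("acme-prod", "US West", "Basic"),        # made-up tenant name
]
for t in tenants:
    print(f"{t.name}: {model_for(t)}")
```

With the table as it stands, the sandbox tenant resolves to GPT-5.4 and the production tenant to GPT-5.2.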

Model Behavior Changelog

We document the improved (and regressed) behaviors we observe as we load each of these models into an agentic harness (ours being the Agentic Reasoning Engine). Generally, newer models will exhibit greater performance, stronger instruction following, and a better ability to attain the user’s goal.

We optimize our harness to work with frontier LLMs so you continue to get great performance.

GPT-5.4

Read the launch blog

(+) Stronger Instruction Following. GPT-5.4 adheres more closely to the instructions that shape your plugin’s behavior, resulting in more consistent, predictable outputs aligned to your configuration.

(+) Improved Cited Answers. When your assistant responds with information from your knowledge base, it now includes links back to the source documents far more consistently. Users get answers they can verify, not just bare claims.

(+) Improved Tool-Call Argument Filling. When your assistant calls a tool, it more accurately extracts the right information from the conversation. Whether it’s a ticket title, a PTO date, or an employee ID, GPT-5.4 more reliably finds the particular values it needs.

Tip

GPT-5.4’s improved instruction adherence means that misconfigured or misaligned instructions in your plugin could have a more pronounced effect on behavior. If you see misaligned behavior, check the tool description, slot description, and the display instructions configurations within the plugin to more finely tune the desired response.

GPT-5.2

Read the launch blog

(+) More comprehensive & accurate tool calling. GPT-5.2 is more comprehensive in the sets of tools it uses to help answer user questions. As a result, it provides a more helpful answer and is less likely to need to hand off users to internal help channels.

(+) Better Enterprise Search. GPT-5.2 works better with Enterprise Search, providing more concise and relevant search queries. This means the assistant finds what you’re looking for more efficiently, with less noise in the search process.

(+) Stronger Tool Call Post-Processing. After GPT-5.2 calls a tool or plugin, it does a better job of writing code to analyze and restructure the results.

(+) Fewer Hallucinations. GPT-5.2 does a better job of discerning what capabilities it does and doesn’t have and hallucinates less when conversing with the user.

(-) Verbose tool responses & reasoning traces. When your Assistant executes a tool, GPT-5.2 is more likely to explain how it interpreted the results, sharing details about how your tool is structured rather than answering the question.

Tip

You can course-correct for verbose tool responses by influencing the Reasoning Engine through display_instructions_for_model (docs).
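
To make the tip concrete, here is a minimal Python sketch of the kind of instruction text that might counter verbose tool responses. Only the field name `display_instructions_for_model` comes from the tip above. Where that field lives, the surrounding payload shape, the `with_display_instructions` helper, and the sample tool output are all assumptions for illustration, so consult the linked docs for the actual configuration.

```python
# ASSUMPTION: the instruction wording below is an example, not a recommended
# or official value. Tune it to your own plugin.
CONCISE_INSTRUCTIONS = (
    "Answer the user's question directly in one or two sentences. "
    "Do not describe how the tool output is structured or how it was interpreted."
)


def with_display_instructions(tool_result: dict,
                              instructions: str = CONCISE_INSTRUCTIONS) -> dict:
    """Return a copy of tool_result with display instructions attached.

    ASSUMPTION: attaching the field as a top-level key is a guess at the
    payload shape; only the key name is taken from the documentation.
    """
    return {**tool_result, "display_instructions_for_model": instructions}


raw_result = {"pto_balance_hours": 64}  # made-up tool output
payload = with_display_instructions(raw_result)
print(payload["display_instructions_for_model"])
```

The helper returns a new dictionary rather than mutating the tool result, so the original output stays available for logging or comparison while you iterate on the instruction wording.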

GPT-5.1

Read the launch blog

(+) Improved planning & tool calling. GPT-5.1 is capable of longer-running tasks than its GPT-4 series counterparts. As a result, we’ve found it can preserve accuracy while taking on work that spans ~20% more tool calls.

(-) Tool Calling Accuracy. GPT-5.1 had a higher tendency to call the wrong tools and to call them with incorrect arguments, hurting the likelihood of getting work done.

(-) Latency. GPT-5.1 introduced significantly more latency into the reasoning engine compared to non-reasoning model alternatives like GPT-4.1.

(-) Hallucinations. GPT-5.1 had a stronger tendency to hallucinate capabilities and pretend that it can do things that it can’t.

FAQ

Can I release a Reasoning Engine model version to a subset of my users?
- No. Upgrades apply to the entire tenant. Customers who need targeted testing should use a sandbox environment.
Are there pricing or packaging restrictions on the program version?
- No. You’re eligible for any version.
How much testing time do I get before the Basic deadline?
- The Frontier and Basic milestones are generally kept ~1 month apart so you have sufficient time for testing.
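
For planning purposes, the timing described for the Basic program (migration typically 2–4 weeks after the Standard release) can be turned into a rough date window. The Python sketch below assumes that typical range holds. The `basic_migration_window` function and the release date are hypothetical, and actual deadlines are whatever Moveworks announces.

```python
from datetime import date, timedelta


def basic_migration_window(standard_release: date) -> tuple[date, date]:
    """Estimate the earliest and latest typical Basic migration dates.

    ASSUMPTION: uses the "typically 2-4 weeks after the Standard release"
    range stated for the Basic program; real deadlines may differ.
    """
    earliest = standard_release + timedelta(weeks=2)
    latest = standard_release + timedelta(weeks=4)
    return earliest, latest


# Hypothetical Standard release date, for illustration only.
earliest, latest = basic_migration_window(date(2026, 1, 5))
print(f"Plan for Basic migration between {earliest} and {latest}")
# prints: Plan for Basic migration between 2026-01-19 and 2026-02-02
```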