How to build an integration with the API

Overview

Data APIs are a set of APIs exposed by Moveworks, allowing you to export all data points related to Moveworks AI Assistant analytics. These APIs can be consumed to build integrations with data lakes like Snowflake, SQL Server, and others, enabling the ingested data to be used for custom reporting and analytics.

This document outlines a process to develop an integration pipeline using a set of tools that can be customized or changed to fit your tech stack. In this exercise, we have used Snowflake to store the data, Workato to build ingestion pipelines & PowerBI to build custom dashboards.

Prerequisites

Please ensure you have access according to the checklist below and are ready to set up the API integration:

  1. Access to data lake: Ensure you have a role that allows the creation of tables and upsert of data into tables.
  2. Automation capabilities: Ensure you have the ability to create workflows within an automation tool or enable REST API calls through a script.
  3. BI tools permissions: Ensure you have the correct permissions to connect BI tools with your data lake and build metrics on top of the ingested data.


Building tables in the data lake

Each API endpoint provides data for a base object such as conversations, interactions, plugin calls, plugin resources, and users. These objects also need to be replicated in your data lake. For this exercise we are creating our tables in Snowflake.

Before we create a new table, we need to create a database in Snowflake and a schema that allows us to store the different tables:

  1. We have created a new database called Data_API
  2. We have created a PROD schema

Navigate to Databases and click on “+ Database” to add a new database, then select the database to add a new schema.


To create the tables, open a new worksheet and run the SQL queries below. Please ensure you replace the database name and schema name accordingly before executing them.

  1. Conversations

    create or replace TABLE DATA_API.PROD.DATA_API_COPILOT_CONVERSATIONS (
    	TS_LAST_UPDATED VARCHAR(16777216) COMMENT 'Last updated timestamp of the record',
    	CONVERSATION_ID VARCHAR(16777216) COMMENT 'Id identifying the unique conversation',
    	USER_RECORD_ID VARCHAR(16777216) COMMENT 'Unique id of the user',
    	TS_CREATED VARCHAR(16777216) COMMENT 'The timestamp of the first interaction in the conversation',
    	CONVERSATION_ORIGIN VARCHAR(16777216) COMMENT 'Route/origin of the conversation - DM, Ticket, Channel, Notification',
    	PRIMARY_DOMAIN VARCHAR(16777216) COMMENT 'The main detected domain in the conversation'
    );
    
  2. Interactions

    create or replace TABLE DATA_API.PROD.DATA_API_INTERACTIONS (
    	TS_LAST_UPDATED VARCHAR(16777216) COMMENT 'Last updated timestamp of the record',
    	CONVERSATION_ID VARCHAR(16777216) COMMENT 'Id identifying the unique conversation',
    	USER_RECORD_ID VARCHAR(16777216) COMMENT 'Unique id of the user',
    	INTERACTION_ID VARCHAR(16777216) COMMENT 'Id identifying the unique interaction',
    	TS_INTERACTION VARCHAR(16777216) COMMENT 'Timestamp of the interaction',
    	INTERACTION_PLATFORM VARCHAR(16777216) COMMENT 'Platform of the interaction - Slack/msteams etc.',
    	INTERACTION_TYPE VARCHAR(16777216) COMMENT 'Type of the interaction - INTERACTION_TYPE_UTTERANCE, INTERACTION_TYPE_BOT_MESSAGE etc.',
    	INTERACTION_LABEL VARCHAR(16777216) COMMENT 'Label for the interaction type UI form submission',
    	PARENT_INTERACTION_ID VARCHAR(16777216) COMMENT 'Parent interaction id for UI Action link click and button click interactions',
    	DATA_INTERACTION_DETAILS VARIANT COMMENT 'Details of the interaction (Json col)- content, type, domain, detail, entity etc.',
    	ACTOR VARCHAR(16777216) COMMENT 'Actor - User/Bot'
    );
    
  3. Plugin calls

    create or replace TABLE DATA_API.PROD.DATA_API_PLUGIN_CALLS (
    	TS_LAST_UPDATED VARCHAR(16777216) COMMENT 'Last updated timestamp of the record',
    	CONVERSATION_ID VARCHAR(16777216) COMMENT 'Id identifying the unique conversation related to the plugin call',
    	USER_RECORD_ID VARCHAR(16777216) COMMENT 'Unique id of the user',
    	INTERACTION_ID VARCHAR(16777216) COMMENT 'Id identifying the unique interaction related to the plugin call',
    	PLUGIN_CALL_ID VARCHAR(16777216) COMMENT 'Unique Id identifying the plugin call',
    	TS_CREATED VARCHAR(16777216) COMMENT 'Timestamp of the plugin call',
    	TS_PLUGIN_UPDATED VARCHAR(16777216) COMMENT 'Timestamp of the latest plugin call update',
    	PLUGIN_NAME VARCHAR(16777216) COMMENT 'Name of the executed Plugin - Product display name for the Native plugin and Config name for the Custom plugin',
    	PLUGIN_STATUS VARCHAR(16777216) COMMENT 'Semantic Plugin status for each of the executed Plugin call',
    	IS_PLUGIN_SERVED BOOLEAN COMMENT 'Whether or not plugin is served to the user - Plugin served or plugin waiting for user input',
    	IS_PLUGIN_USED BOOLEAN COMMENT 'Whether or not plugin is successfully executed'
    );
    
  4. Plugin resources

    create or replace TABLE DATA_API.PROD.DATA_API_PLUGIN_RESOURCES_V2 (
    	PLUGIN_RESOURCE_ID VARCHAR(16777216) COMMENT 'Plugin resource item id as the hash of system_action_id, resource_id and resource_type',
    	CONVERSATION_ID VARCHAR(16777216) COMMENT 'conversation_id identifying unique conversation (series_id)',
    	INTERACTION_ID VARCHAR(16777216) COMMENT 'interaction_id identifying interaction corresponding to the system action',
    	PLUGIN_CALL_ID VARCHAR(16777216) COMMENT 'Id for the plugin call',
    	USER_RECORD_ID VARCHAR(16777216) COMMENT 'Id for the user record',
    	RESOURCE_TYPE VARCHAR(16777216) COMMENT 'Type of Resource - FILE, FORM, KNOWLEDGE, Ticket etc.',
    	RESOURCE_ID VARCHAR(16777216) COMMENT 'Resource ID - doc_id, ticket_id, etc.',
    	DATA_RESOURCE_DETAILS VARIANT COMMENT 'Details of resources searched in the plugin - e.g., json of {name, url, domain, title, snippet}',
    	IS_CITED BOOLEAN COMMENT 'If the resource is cited in the AI Assistant response',
    	TS_LAST_UPDATED VARCHAR(16777216) COMMENT 'Last updated timestamp of the record',
    	TS_CREATED VARCHAR(16777216) COMMENT 'timestamp of plugin call'
    );
    
  5. Users

    create or replace TABLE DATA_API.PROD.DATA_API_COPILOT_USERS (
    	USER_RECORD_ID VARCHAR(16777216) COMMENT 'Unique id of the user',
    	FIRST_NAME VARCHAR(16777216) COMMENT 'First name of the user',
    	LAST_NAME VARCHAR(16777216) COMMENT 'Last name of the user',
    	EMAIL_ADDR VARCHAR(16777216) COMMENT 'Email address of the user',
    	USER_PREFERRED_LANGUAGE VARCHAR(16777216) COMMENT 'User preferred language configured in AI Assistant',
    	HAS_ACCESS_TO_BOT BOOLEAN COMMENT 'Whether or not the user has access to AI Assistant',
    	DATA_EXTERNAL_SYSTEM_ID_DETAILS VARCHAR(16777216) COMMENT 'All external system identities from which user data is ingested',
    	TS_LAST_UPDATED_AT VARCHAR(16777216) COMMENT 'Timestamp at which user was last updated in Moveworks',
    	TS_FIRST_INTERACTION_AT VARCHAR(16777216) COMMENT 'Timestamp at which the users first interaction was recorded',
    	TS_LAST_INTERACTION_AT VARCHAR(16777216) COMMENT 'Timestamp at which the users latest interaction was recorded'
    );
    

Once you have successfully executed these queries, you should see all the tables created under your schema. Congratulations 🎉 on completing the first step towards setting up the integration.
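
If you prefer to verify the setup programmatically rather than in the Snowflake UI, a minimal sketch using the snowflake-connector-python package could look like the following (the account and login values are placeholders for your own credentials):

    import snowflake.connector

    # Connect to the Data_API database and list the tables in the PROD schema.
    # The account, user, and password values below are placeholders.
    conn = snowflake.connector.connect(
        account="<your_account_identifier>",
        user="<your_user>",
        password="<your_password>",
        database="DATA_API",
        schema="PROD",
    )

    try:
        cur = conn.cursor()
        cur.execute("SHOW TABLES IN SCHEMA DATA_API.PROD")
        for row in cur.fetchall():
            print(row[1])  # the second column of SHOW TABLES output is the table name
    finally:
        conn.close()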


Building integration pipelines

Once the tables are created in your data lake, we need to upsert data into them so insights can be built on top of it. How you build the pipelines depends on your tech stack. In this exercise we have used Workato to build the data pipelines so the ingestion steps can be explained visually; you can also take the traditional route of writing code and connecting to your data lake programmatically.

In Workato, each endpoint needs a separate workflow to ingest its data. This allows you to ingest data in parallel and handle ingestion per endpoint.

Two types of ingestion jobs are required:

  1. Historical ingestion: This allows you to fetch all of the historical data the API makes available to you, up to the present date.
  2. Daily ingestion: This allows you to fetch new data on a daily basis and keep your metrics updated.

We will use Workato recipes to build these ingestion jobs. Start by setting up the historical ingestion job, which can then be replicated to create the daily ingestion job.

You can start creating a new recipe by clicking the “Create” button and providing the recipe name “Data API conversations workflow”.

Historical ingestion job

Select the trigger point as “Trigger from app” while creating the historical ingestion job.


To start the ingestion, we need to make the first API call for the data provided by the API. The data available to you will span from the current date (when you are building this integration) back to Jan 1st (data back to Aug 1st will be made available at a later point in time).

To backfill your data, begin by fetching responses from the current date and continue backwards until all available data has been obtained from the API. Note that in the OData API, the @odata.nextLink attribute will be absent from the response when there is no more data to retrieve.
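
If you are writing the ingestion in code rather than Workato, the backfill loop can be sketched as below. This is a minimal sketch that assumes bearer-token authentication and the standard OData response shape (records under "value", next page under "@odata.nextLink"); confirm both against the Data API documentation.

    import requests

    BASE_URL = "https://api.moveworks.ai/export/v1beta2/records/conversations"
    HEADERS = {"Authorization": "Bearer <your_api_token>"}  # placeholder token

    def backfill_conversations():
        """Yield one page of conversation records at a time until none remain."""
        url = BASE_URL
        while url:
            response = requests.get(url, headers=HEADERS, timeout=60)
            response.raise_for_status()
            page = response.json()
            yield page.get("value", [])        # assumed OData key for the records
            # The next-page link is absent once all available data has been retrieved.
            url = page.get("@odata.nextLink")

    for records in backfill_conversations():
        # Map and upsert each page into Snowflake (see the upsert sketch below).
        pass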

In Workato, select the HTTP app and set up the connection with the Data API. Once this is done, set up the API call so all the fields can be read through the API call.

[Screenshot: Setting up the connection with the Data API]

[Screenshot: Adding the URL to call in the HTTP app]

Once the response schema is defined and identified in Workato, we also need to set up the Snowflake connection and map the data from the API to the Snowflake tables.

  • Select the Snowflake upsert action and set up the Snowflake connection.
  • Once the connection is set up, select the appropriate table. In this case we will select the conversations table ‘DATA_API_COPILOT_CONVERSATIONS’.
  • Map each field from the API response to the table columns (a code-based equivalent is sketched after the screenshots below).
[Screenshot: Setting up the connection to Snowflake]

[Screenshot: Mapping fields from the API to Snowflake columns]
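
If you are building the pipeline in code instead of Workato, the same mapping and upsert can be expressed as a Snowflake MERGE. Below is a minimal sketch for the conversations table; the API field names used as bind parameters (conversation_id, last_updated_time, and so on) are assumptions and should be mapped to the actual response schema you identified earlier.

    import snowflake.connector

    # Upsert one page of conversation records into the conversations table.
    # The %(...)s parameter names are assumed API field names -- adjust them to
    # match the actual response schema.
    MERGE_SQL = """
    MERGE INTO DATA_API.PROD.DATA_API_COPILOT_CONVERSATIONS AS t
    USING (SELECT %(conversation_id)s     AS CONVERSATION_ID,
                  %(last_updated_time)s   AS TS_LAST_UPDATED,
                  %(user_record_id)s      AS USER_RECORD_ID,
                  %(created_time)s        AS TS_CREATED,
                  %(conversation_origin)s AS CONVERSATION_ORIGIN,
                  %(primary_domain)s      AS PRIMARY_DOMAIN) AS s
    ON t.CONVERSATION_ID = s.CONVERSATION_ID
    WHEN MATCHED THEN UPDATE SET
        TS_LAST_UPDATED     = s.TS_LAST_UPDATED,
        USER_RECORD_ID      = s.USER_RECORD_ID,
        TS_CREATED          = s.TS_CREATED,
        CONVERSATION_ORIGIN = s.CONVERSATION_ORIGIN,
        PRIMARY_DOMAIN      = s.PRIMARY_DOMAIN
    WHEN NOT MATCHED THEN INSERT
        (CONVERSATION_ID, TS_LAST_UPDATED, USER_RECORD_ID,
         TS_CREATED, CONVERSATION_ORIGIN, PRIMARY_DOMAIN)
        VALUES (s.CONVERSATION_ID, s.TS_LAST_UPDATED, s.USER_RECORD_ID,
                s.TS_CREATED, s.CONVERSATION_ORIGIN, s.PRIMARY_DOMAIN)
    """

    def upsert_conversations(records, conn):
        """Merge a list of record dicts into Snowflake; re-runs stay idempotent."""
        with conn.cursor() as cur:
            for record in records:
                cur.execute(MERGE_SQL, record)
        conn.commit()

Because the upsert is a MERGE keyed on the conversation id, re-processing overlapping date ranges only updates existing rows rather than duplicating them.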

With the base components in place, you can complete the historical ingestion pipeline by using an IF statement and a REPEAT WHILE loop to iterate over all API responses.


Ensure you handle rate limiting and other error codes while building the connection to the API. We also need to add retries when an API call fails (any response other than 200 OK). The current rate limit on the API is around 60 requests per minute, which is sufficient to handle large volumes of data through an endpoint.
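
If you are making the calls from code, one simple way to respect the rate limit and recover from transient failures is a retry wrapper with exponential backoff. This is a sketch only: it assumes rate limiting is signaled with a standard 429 status, and the retry count and sleep durations are illustrative choices, not values prescribed by the API.

    import time
    import requests

    def get_with_retries(url, headers, max_retries=5):
        """Call the API, retrying non-200 responses with exponential backoff."""
        for attempt in range(max_retries):
            response = requests.get(url, headers=headers, timeout=60)
            if response.status_code == 200:
                return response
            if response.status_code == 429:
                # Rate limited (~60 requests per minute): honor Retry-After when present.
                delay = int(response.headers.get("Retry-After", 2 ** attempt))
            else:
                delay = 2 ** attempt  # back off on other error codes as well
            time.sleep(delay)
        raise RuntimeError(f"Request to {url} failed after {max_retries} retries")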

With that, your historical data pipeline is set up. Run it to backfill all historical data. Let’s now set up the daily ingestion pipeline in a similar fashion.

Daily ingestion pipeline

The daily ingestion pipeline behaves very similarly to the historical pipeline in fetching the data and storing it in Snowflake. The primary difference is that it fetches data on a daily basis. The Data API SLA allows up to 48 hours before new data becomes available; read more about this here.

To set up a daily ingestion pipeline in Workato, set the trigger to “Run on a schedule”.


To call the API on a daily basis, we need to fetch the data filtered by the last updated time field. This allows you to fetch recent data as well as update historical records with new information.

Append the $filter query parameter to the API URL and use the condition below to fetch records from a certain date.

https://api.moveworks.ai/export/v1beta2/records/conversations?$filter=last_updated_time eq '2025-02-17T05:09:09.028Z' (Replace the date with the current date)
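
For a code-based daily job, the same filter can be passed as a query parameter. The sketch below uses a gt comparison so records updated after the given timestamp are fetched; that operator is an assumption on our part (the example above shows the filter syntax), so confirm the supported $filter operators in the Data API documentation. It also looks back past the 48-hour SLA so late-arriving records are still picked up.

    from datetime import datetime, timedelta, timezone
    import requests

    BASE_URL = "https://api.moveworks.ai/export/v1beta2/records/conversations"
    HEADERS = {"Authorization": "Bearer <your_api_token>"}  # placeholder token

    def fetch_updated_conversations(lookback_hours=72):
        """Fetch conversations updated within the lookback window."""
        since = (datetime.now(timezone.utc) - timedelta(hours=lookback_hours)
                 ).strftime("%Y-%m-%dT%H:%M:%S.000Z")
        # 'gt' is assumed here; check the supported $filter operators in the API docs.
        params = {"$filter": f"last_updated_time gt '{since}'"}

        records, url = [], BASE_URL
        while url:
            response = requests.get(url, headers=HEADERS, params=params, timeout=60)
            response.raise_for_status()
            page = response.json()
            records.extend(page.get("value", []))   # assumed OData record key
            url = page.get("@odata.nextLink")       # absent once all data is fetched
            params = None                           # the next link already embeds the query
        return records  # upsert these with the same MERGE logic as the historical job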

The other components of the pipeline remain the same. Once the API calls have been modified, your daily ingestion job is ready. Run it on a daily basis to fetch new data into your data lake.

The process we followed above was for the conversations endpoint. The Data API also exposes 4 more endpoints: Interactions, Plugin calls, Plugin resources, and Users. Replicate the same process for them to ingest data into the corresponding tables.

Connecting BI tools with the data lake

Integrating your BI tool with the data lake is the final step of the integration. In this exercise we are using PowerBI along with Snowflake to build metrics on top of the ingested data. All standard BI tools support direct connectors to Snowflake or any other SQL-based data storage system.

To connect PowerBI with Snowflake, select the Snowflake connector available in PowerBI and set up a live connection with the database.


Add the appropriate credentials while setting up the connector. Please take note of your auth credentials and ensure the account has access to fetch data from the data lake and import it into PowerBI.

Once the data is imported, the process to build visualizations is straightforward: select the type of metric and visualization, and power it from an attribute of the ingested data.


To go through different ideas on what can be built using the ingested data, read more here.