WordPress

Overview

WordPress is a content management system used to publish pages and blog posts. The Classic Ingestion connector indexes publicly available pages and posts and makes them searchable in the Moveworks AI Assistant.

Access Requirements

See WordPress Access Requirements for instructions on how to connect Moveworks to your WordPress instance.

Permissions

The Classic Ingestion connector for WordPress does not mirror source permissions. All indexed content is visible to all employees in the AI Assistant search experience.

Content Types Supported within WordPress

  • Pages
  • Blog Posts

How Moveworks Integrates with WordPress

Moveworks has Indexed Search support for publicly available blog posts and pages within WordPress. The following diagram illustrates the high-level architecture of how Moveworks integrates with WordPress sites:

This is a live integration: Moveworks polls WordPress for knowledge articles every four hours so that the enterprise cache stays up to date with relevant snippets for answers.

Our enterprise cache stores the knowledge documents and generates relevant knowledge snippets by understanding their content. It also stores redirect URLs so that users can be directed to where each knowledge article is located and can be read.

How do we fetch knowledge articles from WordPress

We use the following APIs to fetch the knowledge articles that you want Moveworks to ingest.

Fetch all Pages

curl --location --globoff 'https://{{wordpress_url}}/wp-json/wp/v2/pages?per_page=100&page=1&status=publish' \
  --header 'Accept: application/json' \
  --header 'Authorization: Basic <base64({{wordpress_username}}:{{wordpress_password}})>'

Fetch all Posts

curl --location --globoff 'https://{{wordpress_url}}/wp-json/wp/v2/posts?per_page=100&page=1&status=publish' \
  --header 'Accept: application/json' \
  --header 'Authorization: Basic <base64({{wordpress_username}}:{{wordpress_password}})>'
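
For readers who want to reproduce these calls outside of curl, the following is a minimal Python sketch of the same request. The function names and the example.com host are illustrative assumptions; this is not the connector's own code. It shows how the Authorization value is formed: the word Basic, a space, then the base64 encoding of username:password.

```python
import base64
import json
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return an HTTP Basic Authorization value: 'Basic ' + base64(user:pass)."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the headers used above."""
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            "Authorization": basic_auth_header(username, password),
        },
    )


def fetch(request: urllib.request.Request) -> list:
    """Send the request and decode the JSON list of pages or posts."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical site and credentials, for illustration only.
    req = build_request(
        "https://example.com/wp-json/wp/v2/pages?per_page=100&page=1&status=publish",
        "user",
        "pass",
    )
    print(req.get_header("Authorization"))
```

Separating build_request from fetch keeps the header logic checkable without any network access.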

Integration Scope

Content

Our knowledge ingestion engine ingests the content.rendered field of the API response, which is an HTML block, and snippetizes the content based on this HTML.
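
To make the shape of that field concrete, here is a rough Python sketch that reads content.rendered from one item of the API response and reduces the HTML to plain text. How Moveworks actually snippetizes the HTML is internal and not described here; the class and function names, and the sample item, are assumptions made for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def rendered_text(item: dict) -> str:
    """Return the visible text of item['content']['rendered'], whitespace-normalized."""
    parser = _TextExtractor()
    parser.feed(item["content"]["rendered"])
    parser.close()
    # Join the text nodes, then collapse runs of whitespace into single spaces.
    return " ".join("".join(parser.chunks).split())


if __name__ == "__main__":
    # Hypothetical item shaped like one element of the pages/posts response.
    sample = {
        "content": {
            "rendered": "<p>Reset your <strong>VPN</strong> password.</p>\n<p>Then sign in.</p>"
        }
    }
    print(rendered_text(sample))
```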

Configuring WordPress Ingestion with Custom Mappers

WordPress requires custom mappers to map the API response to a format that Moveworks can ingest. Because the WordPress REST API returns pages and posts in a consistent JSON structure, the mapper configuration is fairly standard.

Prerequisites

  • A WordPress connector has been created with the necessary permissions. See the WordPress Access Requirements for details.
  • The WordPress REST API is accessible at https://<your-site>/wp-json/wp/v2/.

Set Up Knowledge Ingestion in Advanced Mode

Start by creating a new ingestion under Search > Configure Search > Classic Ingestion > Internal Knowledge.

  1. Select the Connector created for WordPress and provide a name for the ingestion under Ingestion Name.
  2. Choose a Domain — the functional area of employee service most related to the knowledge being ingested.
  3. Toggle on Advanced Mode (top-right corner of the Setup Knowledge Bases step).
  4. Select Generic Config as the ingestion system. Generic Config allows you to ingest content from any API response by mapping the response fields to Moveworks internal attributes.

Start URLs

Enter the WordPress API endpoint for the content type you want to ingest. Pages and posts each need their own ingestion configuration, so create one for each content type you want to index.

For Pages:

https://<your-wordpress-site>/wp-json/wp/v2/pages?per_page=100&page=1&status=publish

For Posts:

https://<your-wordpress-site>/wp-json/wp/v2/posts?per_page=100&page=1&status=publish

Set the Type of content to Article List, since the WordPress API returns a flat list of articles (no folder hierarchy).
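
Since each content type needs its own Start URL, a small helper can generate both from one site name. This is only a sketch: start_url is a hypothetical name, and the query parameters are the ones shown in the Start URLs above.

```python
from urllib.parse import urlencode

CONTENT_TYPES = ("pages", "posts")


def start_url(site: str, content_type: str, per_page: int = 100) -> str:
    """Build the first-page Start URL for published pages or posts."""
    if content_type not in CONTENT_TYPES:
        raise ValueError(f"content_type must be one of {CONTENT_TYPES}")
    query = urlencode({"per_page": per_page, "page": 1, "status": "publish"})
    return f"https://{site}/wp-json/wp/v2/{content_type}?{query}"


if __name__ == "__main__":
    # example.com is a placeholder for your WordPress site.
    for content_type in CONTENT_TYPES:
        print(start_url("example.com", content_type))
```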

Response Mapper

The Response Mapper tells Moveworks how to traverse the API response and handle pagination. Select KNOWLEDGE_URL_TYPE_ARTICLE_LIST as the Type in Response Mapper and use the following configuration:

{
  "knowledge_urls": {
    "CONDITIONAL()": {
      "condition": "NOT (NOT (response))",
      "on_pass": {
        "FLATTEN()": [
          {
            "CONDITIONAL()": {
              "condition": "$INTEGER(source_url.parsed_url.query_params.page) < $INTEGER(headers['x-wp-totalpages'])",
              "on_pass": {
                "parsed_url": {
                  "base_url": "source_url.parsed_url.base_url",
                  "query_params": {
                    "page": {
                      "EVAL()": {
                        "expression": "page + 1",
                        "args": "source_url.parsed_url.query_params"
                      }
                    },
                    "status": "source_url.parsed_url.query_params.status",
                    "per_page": "source_url.parsed_url.query_params.per_page"
                  }
                },
                "type": "\"ARTICLE_LIST\"",
                "method": "\"GET\"",
                "headers": {
                  "Accept": "\"application/json\""
                }
              }
            }
          }
        ]
      },
      "on_fail": "[]"
    }
  },
  "knowledge_articles": {
    "CONDITIONAL()": {
      "condition": "NOT(NOT(response))",
      "on_pass": "response",
      "on_fail": "[]"
    }
  }
}

How this works:

  • knowledge_urls handles pagination by comparing the current page query parameter against the x-wp-totalpages response header that WordPress provides. If there are more pages remaining, it constructs the next page URL by incrementing page by 1. Once all pages have been fetched, pagination stops.
  • knowledge_articles passes the current page of results through to the Article Mapper.
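
The pagination rule described above can be sketched in Python as follows. This mirrors the logic of the mapper, not the Moveworks implementation itself; next_page_url is a hypothetical name, and header names are lower-cased before lookup because HTTP header names are case-insensitive.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def next_page_url(source_url: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return the URL of the next page, or None once the last page is reached."""
    normalized = {name.lower(): value for name, value in headers.items()}
    parts = urlsplit(source_url)
    params = dict(parse_qsl(parts.query))

    page = int(params.get("page", "1"))
    total_pages = int(normalized.get("x-wp-totalpages", "0"))

    # Same test as the mapper condition: continue only while page < total pages.
    if page >= total_pages:
        return None

    # Increment page and keep the other query parameters unchanged.
    params["page"] = str(page + 1)
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # example.com is a placeholder host.
    url = "https://example.com/wp-json/wp/v2/posts?per_page=100&page=1&status=publish"
    while url is not None:
        print(url)
        url = next_page_url(url, {"X-WP-TotalPages": "3"})
```

With a total of three pages, the loop visits pages 1, 2, and 3 and then stops, which matches the behavior described for knowledge_urls.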

Article Mapper

The Article Mapper maps individual WordPress post/page fields to Moveworks internal article attributes. Use the following configuration:

For Pages:

{
  "created_at": "$TIMECONV(date_gmt)",
  "updated_at": "$TIMECONV(modified_gmt)",
  "article_id": "$TEXT($INTEGER(id))",
  "title": "title.rendered",
  "body": "content.rendered",
  "article_url": "link",
  "article_state": "\"PUBLISHED\"",
  "article_visibility": "\"VISIBLE\"",
  "article_class": "\"CONTENT_ITEM\"",
  "article_hierarchies": [
    {
      "tier_values": "[\"\"]",
      "tier_label": "\"KNOWLEDGE_BASE\""
    }
  ]
}

For Posts:

{
  "created_at": "$TIMECONV(date_gmt)",
  "updated_at": "$TIMECONV(modified_gmt)",
  "article_id": "$TEXT($INTEGER(id))",
  "title": "title.rendered",
  "body": "content.rendered",
  "article_url": "link",
  "article_state": "\"PUBLISHED\"",
  "article_visibility": "\"VISIBLE\"",
  "article_class": "\"CONTENT_ITEM\"",
  "article_hierarchies": [
    {
      "tier_values": "[\"\"]",
      "tier_label": "\"KNOWLEDGE_BASE\""
    }
  ]
}

The Article Mapper is the same for both pages and posts since the WordPress REST API uses a consistent response structure for both content types. The key difference is the Start URL — use /wp/v2/pages for pages and /wp/v2/posts for posts.

Field mapping reference:

Moveworks Attribute | WordPress API Field | Description
------------------- | ------------------- | -----------
created_at          | date_gmt            | Article creation timestamp (GMT)
updated_at          | modified_gmt        | Last modified timestamp (GMT)
article_id          | id                  | Unique WordPress post/page ID
title               | title.rendered      | Rendered HTML title
body                | content.rendered    | Rendered HTML content body
article_url         | link                | Public permalink URL
article_state       | (constant)          | Set to PUBLISHED (the Start URL filters for status=publish)
article_visibility  | (constant)          | Set to VISIBLE for all employees
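
As a way to check the mapping against a real API response, the sketch below applies the same field mapping to one WordPress item in Python. It is an approximation under stated assumptions: the exact output format of $TIMECONV is not documented here, so the sketch simply marks the GMT timestamp as UTC and emits ISO 8601. The function names and the sample item are hypothetical.

```python
from datetime import datetime, timezone


def _to_utc_iso(gmt_timestamp: str) -> str:
    """Assumed stand-in for $TIMECONV: tag the naive GMT timestamp as UTC."""
    parsed = datetime.fromisoformat(gmt_timestamp)
    return parsed.replace(tzinfo=timezone.utc).isoformat()


def map_article(item: dict) -> dict:
    """Map one WordPress page or post to the article attributes listed above."""
    return {
        "created_at": _to_utc_iso(item["date_gmt"]),
        "updated_at": _to_utc_iso(item["modified_gmt"]),
        "article_id": str(int(item["id"])),  # mirrors $TEXT($INTEGER(id))
        "title": item["title"]["rendered"],
        "body": item["content"]["rendered"],
        "article_url": item["link"],
        # The remaining values are constants in the mapper, not API fields.
        "article_state": "PUBLISHED",
        "article_visibility": "VISIBLE",
        "article_class": "CONTENT_ITEM",
        "article_hierarchies": [
            {"tier_values": [""], "tier_label": "KNOWLEDGE_BASE"}
        ],
    }


if __name__ == "__main__":
    # Hypothetical item shaped like one element of the pages/posts response.
    sample = {
        "id": 42,
        "date_gmt": "2024-01-15T10:30:00",
        "modified_gmt": "2024-02-01T08:00:00",
        "title": {"rendered": "VPN Setup"},
        "content": {"rendered": "<p>Install the client.</p>"},
        "link": "https://example.com/vpn-setup/",
    }
    print(map_article(sample))
```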

Validation

Once you have saved the ingestion configuration, the ingestion pipeline will run in the background. You can track the status on the Indexed Content View.

Search for your WordPress articles by title — if they appear in the console, they have been ingested successfully.