Starter Code

Overview

The Content Gateway Starter Code gives you a working Python server that implements the full Content Gateway protocol. Run it immediately to verify connectivity, then edit it to connect your source system.

What’s included:

| File | Purpose |
| --- | --- |
| content_gateway.py | Demo server and integration starting point |
| validate.py | Schema validator. Run against any live server to confirm responses conform to the Content Gateway API |
| requirements.txt | Python dependencies (flask, requests) |
| .env.example | Template for the environment variables you’ll need to set |
| openapi.json | Full Content Gateway API spec |

Prerequisites

  • Python 3.10+
  • pip
  • A publicly reachable HTTPS hosting environment (see Deployment)

What’s handled vs. what you implement

The starter code handles the full Content Gateway protocol layer. You never write that. What you own is the source layer: logic that is inherently specific to your system.

| Layer | What it covers | Who writes it |
| --- | --- | --- |
| Protocol | OData pagination ($skip/$top/@odata.nextLink), Bearer auth validation, error response shapes | Done. Don’t touch |
| Rate-limit headers | The add_rate_limit_headers hook is in place, but values are commented out by default | Uncomment and wire to your real rate-limit budget when you go to production. See the Operational Guide for behavior |
| Source | Calling your API, field mapping, handling your auth | You |
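
For orientation only, here is a minimal sketch of what $skip/$top paging with an @odata.nextLink looks like. The starter code already implements this layer, so you do not need to write it. The function name `paginate`, the `/files` path, and the example host are assumptions for illustration, not part of the starter code.

```python
from urllib.parse import urlencode


def paginate(items, skip, top, base_url):
    """Return one OData-style page of `items`, plus a next link if more remain."""
    skip = max(skip, 0)   # treat negative offsets as "start from the beginning"
    top = max(top, 1)     # always return at least one item per page
    body = {"value": items[skip:skip + top]}
    next_skip = skip + top
    if next_skip < len(items):
        # Keep "$" unescaped so the link reads as $skip=...&$top=...
        query = urlencode({"$skip": next_skip, "$top": top}, safe="$")
        body["@odata.nextLink"] = f"{base_url}?{query}"
    return body


# Hypothetical usage: ten sample documents, four per page.
docs = [f"kb-{n:03d}" for n in range(1, 11)]
first_page = paginate(docs, skip=0, top=4, base_url="https://gateway.example.com/files")
print(first_page["value"])
print(first_page.get("@odata.nextLink"))
```

The caller keeps following @odata.nextLink until a page arrives without one, which is why the last page omits the key entirely.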

Two patterns the starter code cannot pre-build for you:

  • Multi-source enumeration: most enterprise systems (SharePoint, Confluence, Google Drive, Zendesk) spread content across sites, spaces, shared drives, or brands. There is no single “list all documents” endpoint. You enumerate those containers and flatten into one stream inside fetch_files_from_source. The same applies to fetch_users_from_source and fetch_groups_from_source if your identity data spans multiple directories.

  • Permission inheritance: many systems store access rules on folders or spaces, not individual documents. You resolve the effective permissions inside fetch_permissions_for_file and return them as a flat list. If your source supports bulk ACL fetch, strongly prefer that over per-document live walks; per-file calls multiply your first-sync load by 2-4x. See the docstring on fetch_permissions_for_file for the recommended bulk-cache pattern.
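
The multi-source enumeration pattern above can be sketched as a generator that walks every container and yields one flat stream. This is a rough illustration under stated assumptions: `list_containers` and `list_files_in` are hypothetical callables standing in for your source API calls, and the `id` and `container_id` keys are placeholder field names.

```python
from typing import Callable, Iterable, Iterator


def iter_all_files(
    list_containers: Callable[[], Iterable[str]],
    list_files_in: Callable[[str], Iterable[dict]],
) -> Iterator[dict]:
    """Yield every file from every container as one flat stream."""
    seen_ids = set()
    for container_id in list_containers():
        for item in list_files_in(container_id):
            # Some systems expose the same item through several containers;
            # emit each item only once.
            if item["id"] in seen_ids:
                continue
            seen_ids.add(item["id"])
            # Record where the item came from so a later permission lookup
            # knows which container's rules apply.
            yield {**item, "container_id": container_id}


# Hypothetical usage with in-memory data in place of real API calls.
fake_source = {
    "site-a": [{"id": "doc-1"}, {"id": "doc-2"}],
    "site-b": [{"id": "doc-2"}, {"id": "doc-3"}],
}
for doc in iter_all_files(lambda: fake_source.keys(), lambda c: fake_source[c]):
    print(doc)
```

A function such as fetch_files_from_source could return `list(iter_all_files(...))` if it needs a list rather than a stream.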

Each source function in the file has a docstring that explains which of these applies and what to do about it. For the access-control model itself (ReBAC graph traversal, the {type: GROUP, id: "*"} wildcard requirement, VIEW-only action), see How Permissions Work.
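
As a rough sketch of the bulk-cache idea for permission inheritance, the snippet below resolves every folder's effective principals once, then answers per-file lookups from memory. It assumes a tree of folders with no cycles, assumes an explicit folder ACL replaces the inherited one (some systems merge them instead), and uses `folder_id` as a placeholder field name. The authoritative pattern is the one in the fetch_permissions_for_file docstring.

```python
def build_acl_cache(folder_acls, folder_parents):
    """Resolve each folder's effective principals, walking up the tree once."""
    cache = {}

    def effective(folder_id):
        if folder_id in cache:
            return cache[folder_id]
        own = folder_acls.get(folder_id)
        parent = folder_parents.get(folder_id)
        if own is not None:
            result = own                 # explicit ACL wins (assumption)
        elif parent is not None:
            result = effective(parent)   # otherwise inherit from the parent
        else:
            result = []                  # root folder with no ACL: nobody
        cache[folder_id] = result
        return result

    for folder_id in set(folder_acls) | set(folder_parents):
        effective(folder_id)
    return cache


def permissions_for_file(file_item, acl_cache):
    """Return a flat list of principals for one file, with no live API call."""
    return list(acl_cache.get(file_item["folder_id"], []))


# Hypothetical data: "eng/design" has no ACL of its own and inherits from "eng".
parents = {"root": None, "eng": "root", "eng/design": "eng"}
acls = {
    "root": [{"type": "GROUP", "id": "group-all-employees"}],
    "eng": [{"type": "GROUP", "id": "group-eng"}],
}
cache = build_acl_cache(acls, parents)
print(permissions_for_file({"id": "f1", "folder_id": "eng/design"}, cache))
```

Building the cache costs one pass over the folder tree, after which each file lookup is a dictionary read rather than a chain of requests to the source system.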


Step 1: Verify connectivity with the demo server

Before connecting your real source system, confirm that Moveworks can reach your gateway at all. The base content_gateway.py returns built-in sample data with no external dependencies.

# Clone the repo
$ git clone https://github.com/moveworks/gateway.git
$ cd gateway/starter-code

# Install dependencies
$ pip install -r requirements.txt

# Generate a gateway API key (save this; you'll need it in Moveworks Setup)
$ python -c "import secrets; print(secrets.token_hex(32))"

# Start the server
$ GATEWAY_API_KEY=<your-generated-key> python content_gateway.py

The server starts on port 5001. Before exposing it to Moveworks, validate that all endpoints are returning the correct shape:

$ GATEWAY_API_KEY=<your-generated-key> python validate.py --rebac

See Verifying Your Build for a full explanation of what the validator checks. Once all checks pass, expose the server over HTTPS (see Deployment) and follow the Connecting Your Gateway to Moveworks guide. If that succeeds, proceed to Step 2.


Step 2: Connect your source system

Edit content_gateway.py directly. The file has three labeled sections:

Section 1: Configuration

Set SOURCE_API_BASE_URL to your API’s base URL. Then uncomment the _source_headers() block that matches your auth method: Bearer token, API key header, OAuth2 client credentials, or no auth. That one change applies to every source call.
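
To make the auth choices concrete, here is a minimal sketch of a header builder covering the four methods listed above. It is not the starter code's _source_headers() implementation. The function name `source_headers`, the method labels, and the `X-API-Key` header name are assumptions, and the OAuth2 branch assumes you have already exchanged your client credentials for an access token.

```python
def source_headers(auth_method, credential=None):
    """Build request headers for the source API (illustrative only)."""
    headers = {"Accept": "application/json"}
    if auth_method == "none":
        return headers
    if auth_method not in ("bearer", "oauth2", "api_key"):
        raise ValueError(f"unknown auth method: {auth_method!r}")
    if not credential:
        raise ValueError(f"{auth_method} auth needs a credential")
    if auth_method == "api_key":
        # Header name is a placeholder; check your source system's docs.
        headers["X-API-Key"] = credential
    else:
        # Static bearer tokens and OAuth2 access tokens use the same header.
        headers["Authorization"] = f"Bearer {credential}"
    return headers


print(source_headers("bearer", "example-token"))
```

Keeping header construction in one function is what lets a single edit apply to every request the fetch_* functions make.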

Section 2: Source functions

Implement the fetch_* functions. These are the only functions you need to write. They call your source API and return raw data. Read the docstring on each function before implementing it. The docstrings explain what to return, what pagination patterns to handle, and whether your system requires multi-source enumeration or permission inheritance.
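
One pagination pattern a fetch_* function may need to handle is cursor-based paging. The sketch below assumes a source API that returns an `items` list and a `next_cursor` value per page; both field names are hypothetical, as is the `get_page` callable, which stands in for an HTTP request to your source system.

```python
def fetch_all_pages(get_page, page_size=100):
    """Collect every item from a cursor-paginated source API."""
    items = []
    cursor = None  # None asks the source for the first page
    while True:
        page = get_page(cursor, page_size)
        items.extend(page["items"])
        cursor = page.get("next_cursor")
        if not cursor:
            # A missing or empty cursor means the last page was reached.
            return items


# Hypothetical usage: a fake two-page source keyed by cursor.
fake_pages = {
    None: {"items": ["doc-1", "doc-2"], "next_cursor": "page-2"},
    "page-2": {"items": ["doc-3"], "next_cursor": None},
}
print(fetch_all_pages(lambda cursor, size: fake_pages[cursor]))
```

In a real implementation, `get_page` would issue the request against SOURCE_API_BASE_URL with your source headers and return the parsed JSON body.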

Section 3: Mapper functions

Update the field names in map_item_to_node (and map_item_to_user, map_item_to_group if you sync identity) to match your API’s response shape. Search the file for # TODO comments.
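
The field-mapping step can be pictured as a small lookup table from gateway field names to your source's field names. Every name below is a placeholder: the real target fields are defined by openapi.json and the # TODO comments in content_gateway.py, and `doc_id`, `title`, and `web_url` stand in for whatever your API returns.

```python
# Target field (left) -> source field (right). All names are hypothetical.
FIELD_MAP = {
    "id": "doc_id",
    "name": "title",
    "url": "web_url",
}


def map_item(raw, field_map=FIELD_MAP):
    """Rename source fields to target fields, failing loudly on gaps."""
    missing = [src for src in field_map.values() if src not in raw]
    if missing:
        raise KeyError(f"source item is missing fields: {missing}")
    return {target: raw[src] for target, src in field_map.items()}


raw_item = {
    "doc_id": "kb-001",
    "title": "VPN setup",
    "web_url": "https://wiki.example.com/kb-001",
    "internal_flag": True,  # unmapped source fields are dropped
}
print(map_item(raw_item))
```

Raising on a missing field surfaces a mismatched response shape during local testing rather than during a sync.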

Set credentials and run:

$ cp .env.example .env   # fill in your values
$ export $(cat .env | xargs)
$ python content_gateway.py
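
The shell one-liner above works for simple KEY=value files, but it can misbehave when a value contains spaces or the file contains comment lines. If you run into that, a small parser is one alternative. This is a sketch, not part of the starter code, and `parse_env_text` is a hypothetical helper name.

```python
def parse_env_text(text):
    """Parse KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        value = value.strip()
        # Remove one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


sample = """# local development settings
GATEWAY_API_KEY=abc123
SOURCE_API_BASE_URL="https://api.example.com/v1"
PORT=5001
"""
print(parse_env_text(sample))
```

To apply the result to the running process, you could read the file and pass the parsed dictionary to `os.environ.update(...)` before the server starts.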

Prefer to edit by hand with API docs open?

That is the intended workflow. Open your source system’s API documentation alongside content_gateway.py and work through each fetch_* function in Section 2. The docstrings tell you exactly what each function needs to return.


Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| GATEWAY_API_KEY | Yes | The API key Moveworks will use to call your gateway. You generate this value |
| SOURCE_API_BASE_URL | Yes | Your source system’s API base URL |
| SOURCE_API_KEY | Depends | Your source system credential. Rename to match your system |
| PORT | No | Override the default port (default: 5001) |
| DEMO_TEST_USER_EMAILS | No | Demo mode only. Comma-separated emails injected as sample users with access to kb-001 through kb-007. Set this to your real Moveworks user email when testing the demo against a real tenant; otherwise the hardcoded sample identities won’t match any real Moveworks user and search results will be empty |

Copy .env.example to .env and fill in your values for local development. In production, use your platform’s secret management (AWS Secrets Manager, Azure Key Vault, Heroku config vars, etc.). Never commit credentials to source control.
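
A startup check along these lines can catch a missing variable before the first request arrives. It is a sketch based on the table above, not the starter code's own configuration logic: `load_config` and the returned key names are hypothetical, and it treats SOURCE_API_BASE_URL as required, which you may want to relax when running the demo server.

```python
import os


def load_config(env):
    """Validate and normalize gateway settings from a mapping such as os.environ."""
    required = ("GATEWAY_API_KEY", "SOURCE_API_BASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required environment variables: {missing}")
    raw_emails = env.get("DEMO_TEST_USER_EMAILS", "")
    return {
        "gateway_api_key": env["GATEWAY_API_KEY"],
        # Drop a trailing slash so paths can be appended consistently.
        "source_api_base_url": env["SOURCE_API_BASE_URL"].rstrip("/"),
        "port": int(env.get("PORT", "5001")),
        "demo_test_user_emails": [e.strip() for e in raw_emails.split(",") if e.strip()],
    }


# Hypothetical usage with an explicit mapping; pass os.environ in a real run.
config = load_config({
    "GATEWAY_API_KEY": "abc123",
    "SOURCE_API_BASE_URL": "https://api.example.com/v1/",
})
print(config["port"], config["source_api_base_url"])
```

Taking the environment as an argument instead of reading os.environ directly keeps the function easy to test.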

Testing the demo against a real Moveworks tenant

The demo’s sample users (sarah.chen@acmecorp.internal, etc.) won’t match any real Moveworks user identity, so connecting this server to a real tenant as-is will ingest content successfully but search results will be empty.

To test end-to-end:

$ DEMO_TEST_USER_EMAILS=you@yourcompany.com \
  GATEWAY_API_KEY=<your-key> \
  python content_gateway.py

Your email is injected as a sample user with access to kb-001 through kb-007 (via group-it-staff and its nested membership in group-all-employees). You will NOT have access to kb-008 (HR), kb-009, or kb-010 (Executives), so you can verify permission enforcement end-to-end: those documents should be hidden from your search results even though they’re ingested.
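
The nested-membership behavior described above can be modeled as a graph walk over group nesting. The sketch below is a simplified model for reasoning about expected results, not Moveworks's implementation. The group names come from the demo, but which document is granted to which group, and the `group-hr` name, are assumptions made up for the example.

```python
from collections import deque


def effective_groups(direct_groups, parent_groups):
    """Return the direct groups plus every group reached through nesting."""
    seen = set(direct_groups)
    queue = deque(direct_groups)
    while queue:
        group = queue.popleft()
        for parent in parent_groups.get(group, ()):
            if parent not in seen:   # guards against cycles in the nesting
                seen.add(parent)
                queue.append(parent)
    return seen


def can_view(doc_groups, direct_groups, parent_groups):
    """True if any group granted on the document is one the user belongs to."""
    return bool(set(doc_groups) & effective_groups(direct_groups, parent_groups))


# group-it-staff is nested inside group-all-employees, as in the demo.
nesting = {"group-it-staff": ["group-all-employees"]}
# Hypothetical grants, chosen only to illustrate the traversal.
doc_grants = {
    "kb-001": ["group-all-employees"],
    "kb-005": ["group-it-staff"],
    "kb-008": ["group-hr"],
}
for doc_id, groups in doc_grants.items():
    print(doc_id, can_view(groups, ["group-it-staff"], nesting))
```

Under these assumed grants, a member of group-it-staff reaches kb-001 only through the nested membership, which is the path the end-to-end test exercises.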

For the full end-to-end test sequence (HTTPS exposure, Moveworks Setup connector creation, ingestion verification, search testing), see the starter-code README on GitHub.


Deployment

Moveworks polls your gateway on a schedule, so it must be reachable over HTTPS from the internet. Common hosting options:

| Platform | How to deploy |
| --- | --- |
| AWS Lambda | Wrap the Flask app with a Lambda WSGI adapter (e.g. apig-wsgi or aws-wsgi; mangum targets ASGI apps, so it needs a WSGI-to-ASGI wrapper in front of Flask) and deploy with handler as the entry point |
| Azure App Service | Push the repo to App Service and set environment variables in the Application Settings panel |
| Heroku | heroku create && git push heroku main. Set env vars with heroku config:set |
| Any container / VM | The server is a standard Flask app. Run it behind nginx or any reverse proxy |

For local development and testing, ngrok or Cloudflare Tunnel can expose your local server over HTTPS temporarily.


Reference

  • Content Gateway Overview: permission model, sync pattern, capacity planning, common pitfalls, and the full operational FAQ
  • Supported MIME Types: file formats the gateway can serve, plus the 25 MB size cap
  • Authentication: how Moveworks authenticates to your gateway
  • Errors: the standard error format your gateway should return