# Architecture Recommendations

There are two ways to build Content Gateway integrations:

* Using system APIs
* Using web scrapers

# Strategy 1: Use System APIs

System APIs offer a structured, programmatic way to retrieve data directly from the source system. This is the most robust approach for building Content Gateway integrations.

For this approach, you need to follow 3 steps:

1. Conduct source system API discovery (including API endpoints and authentication).
2. Create a server that can host [Content Gateway APIs](/api-reference/content-gateway/content-gateway/list-files). You can use middleware tools or host your own server.
3. Return content using source system APIs every time your Gateway APIs are invoked.
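As a rough illustration of step 3, the sketch below builds a gateway-style response by paging through a source system API each time the handler is invoked. Everything specific here is an assumption made for the example: the `fetch_page` callable, the `items`/`next_cursor` page shape, and the `files` response fields are hypothetical placeholders, not the actual Content Gateway schema or any real source system's API.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical shape of one page from a source system API:
# {"items": [{"id": ..., "title": ..., "body": ...}], "next_cursor": "..." or None}
Page = Dict[str, Any]
FetchPage = Callable[[Optional[str]], Page]


def iter_source_items(fetch_page: FetchPage) -> Iterator[dict]:
    """Follow the source API's cursor until no pages remain."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if not cursor:
            break


def handle_list_files(fetch_page: FetchPage) -> dict:
    """Build a gateway-style response from live source data.

    The field names below are placeholders, not the real Content Gateway schema.
    """
    files = [
        {"id": str(item["id"]), "name": item["title"], "content": item["body"]}
        for item in iter_source_items(fetch_page)
    ]
    return {"files": files}


# Stand-in for an authenticated HTTP call to the source system.
_FAKE_PAGES = {
    None: {
        "items": [{"id": 1, "title": "VPN setup", "body": "Install the client."}],
        "next_cursor": "p2",
    },
    "p2": {
        "items": [{"id": 2, "title": "Password reset", "body": "Use the portal."}],
        "next_cursor": None,
    },
}


def fake_fetch_page(cursor: Optional[str]) -> Page:
    return _FAKE_PAGES[cursor]


response = handle_list_files(fake_fetch_page)
print(response)
```

In a real integration, `fake_fetch_page` would be replaced by a function that calls the source system's endpoints with the credentials identified during API discovery, and `handle_list_files` would sit behind whichever server or middleware tool hosts your Gateway APIs.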

# Strategy 2: Use Web Scrapers

Web scraping can be used when source system APIs are unavailable, though it comes with significant challenges.

For this approach, you need to follow 3 steps:

1. Build a web scraper to crawl and retrieve content from source systems. You may need to use external libraries such as [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) or [Selenium](https://www.selenium.dev/) depending on your purpose.
2. Create a server that can host [Content Gateway APIs](/api-reference/content-gateway/content-gateway/list-files). You can use middleware tools or host your own server.
3. Return content by scraping content from source systems every time your Gateway APIs are invoked.
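To give a sense of what the scraping step involves, here is a minimal sketch that pulls readable text out of a page while skipping page chrome and scripts. It uses Python's standard-library `html.parser` rather than Beautiful Soup or Selenium so that it runs without extra dependencies. The list of skipped tags and the sample page are assumptions for illustration, and a real site would likely need rules tuned to its own markup.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as page chrome or code, not article content.
# This set is an assumption; adjust it to the markup of the site being scraped.
SKIPPED_TAGS = {"script", "style", "header", "footer", "nav"}


class ContentExtractor(HTMLParser):
    """Collect text nodes that are not nested inside a skipped tag."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the page's visible text, joined with single spaces."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page used only to exercise the extractor.
SAMPLE_PAGE = """
<html><body>
<header>Acme Intranet | Login</header>
<script>trackPageView();</script>
<main><h1>VPN setup</h1><p>Install the client, then sign in.</p></main>
<footer>Copyright Acme</footer>
</body></html>
"""

print(extract_text(SAMPLE_PAGE))
# prints: VPN setup Install the client, then sign in.
```

Note that the extractor only works while the site keeps its content outside the skipped tags. A redesign that moves the article into a different container, or renders it with JavaScript, would require reworking the rules.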

# Comparison of Approaches

You can build gateway integrations using either source system APIs or web scrapers, but we **highly recommend using source system APIs**, since scrapers break easily and are unreliable.

Here is a detailed comparison of the 2 approaches:

| **Aspect**             | **Web Scraping (Cons)**                                                                                           | **Why system APIs are better**                                                                  |
| ---------------------- | ----------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| **Reliability**        | Scrapers depend on the structure of the site which can change without notice, easily breaking your integration.   | APIs are designed for structured data access with stable endpoints.                             |
| **Data Precision**     | Scraped data often includes unwanted content such as HTML headers, footers, images, or JavaScript snippets.       | APIs provide precise, well-defined data that ensures higher Assistant accuracy.                 |
| **Performance**        | Scraping is much slower because it involves rendering and parsing HTML, JavaScript, and CSS.                      | APIs are optimized for performance, allowing efficient data retrieval.                          |
| **Scalability**        | Requires additional effort to handle rate limits, paginated data, and large datasets efficiently.                 | APIs are designed with scalability in mind, including features like rate limits and pagination. |
| **Access Permissions** | Scraping may not support authenticated access, or may require complex workarounds to handle web login.            | APIs have secure authentication methods like OAuth and offer robust permission management.      |