User Ingestion Processors Guide

Processors are data transformation tools that help clean, filter, and enrich user data during identity ingestion.

Overview

Processors are optional data transformation steps in your User Identity Flow. They clean, filter, and enhance user data during ingestion from source systems (Okta, Active Directory, Workday, etc.).


Table of Contents

  1. Quick Reference
  2. How to Configure
  3. Available Processors
  4. Best Practices
  5. Common Scenarios
  6. Troubleshooting
  7. Rule Syntax Reference
  8. Limitations

Quick Reference

  • User Filter Processor: Remove users matching specific field values
  • Filter Rule Post Processor: Complex filtering with multiple conditions
  • User Timezone Processor: Auto-populate timezone from location
  • User Password Meta Info Processor: Fill missing password expiration dates
  • User Geocode Processor: Add coordinates for analytics and location-based content retrieval
  • DSL First Match Dedupe Processor: Deduplicate users across sources
  • Unified Resolve Manager Processor: Link manager-employee hierarchy

How to Configure

Navigation

  1. Navigate to Import Users
  2. Select your source(s) on the Connectors page
  3. Proceed to Configure Selected Sources
  4. Click Advanced Mode

In Advanced Mode

Processors to Apply: Add transformation processors that run during ingestion. Processors execute in the order listed.

Filter and Attribute List: Control which records and fields are imported at the source level before any processors run.


Available Processors

Filter Users by Field Value

Processor Name: User Filter Processor

Excludes users from ingestion when a specified field matches any value in your exclusion list. This is a simple, single-field filter that performs exact matching.

Use Cases: Exclude terminated employees, contractors, test accounts, or specific departments based on a single field value.

Configuration:

  • Filter Key: User field to check (example: employment_status)
  • Filter List: Values to exclude, comma-separated (example: Terminated, Inactive)

Examples:

Exclude inactive employees:
  Filter Key: employment_status
  Filter List: Terminated, Inactive, On Leave

Exclude non-employees:
  Filter Key: user_type
  Filter List: Contractor, Temp, External
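
The exclusion check is a simple exact match against the Filter List. The following is a minimal Python sketch of the behavior only (the list-of-dicts record format is an assumption for illustration, not the actual ingestion data model):

  def apply_user_filter(records, filter_key, filter_list):
      # Drop any record whose filter_key value exactly matches an excluded value.
      excluded = set(filter_list)
      return [r for r in records if r.get(filter_key) not in excluded]

  users = [
      {"email_addr": "a@company.com", "employment_status": "Active"},
      {"email_addr": "b@company.com", "employment_status": "Terminated"},
  ]
  active_users = apply_user_filter(users, "employment_status", ["Terminated", "Inactive", "On Leave"])
  # active_users now contains only the first record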

Filter Users by Rule

Processor Name: Filter Rule Post Processor

Excludes users from ingestion based on complex conditional logic. Unlike the simple Field Value filter, this processor allows you to combine multiple field conditions using AND/OR logic, perform date comparisons, and apply sophisticated filtering rules.

Use Cases: Apply multi-condition filtering (e.g., "active AND hired after date"), date-based filtering, or any logic requiring multiple field comparisons.

Configuration:

  • Filter Condition (DSL): Rule determining which users to keep

Examples:

Keep only active employees:
  employment_status == "Active"

Active employees in specific departments:
  employment_status == "Active" AND department IN ["Engineering", "Sales"]

Keep only users with a company email address:
  email_addr CONTAINS "@company.com"

Keep only users with employee IDs:
  employee_id != ""

💡 Tip: See Rule Syntax Reference for complete syntax.
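
Conceptually, the filter condition is evaluated per record and only matching records are kept. As a rough illustration, the second example above behaves like the Python predicate below (a sketch of the semantics only, not the DSL engine itself):

  users = [
      {"employment_status": "Active", "department": "Engineering"},
      {"employment_status": "Active", "department": "Finance"},
      {"employment_status": "Terminated", "department": "Sales"},
  ]

  def keep(record):
      # employment_status == "Active" AND department IN ["Engineering", "Sales"]
      return (record.get("employment_status") == "Active"
              and record.get("department") in ["Engineering", "Sales"])

  kept = [u for u in users if keep(u)]   # keeps only the first record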


Remove Duplicate Users

Processor Name: DSL First Match Dedupe Processor

When the same user appears multiple times (identified by your Index Key, typically email), this processor evaluates all duplicate records and keeps only the first one that matches your filter condition. All other duplicates are discarded. This operates across all sources after they're merged together.

Use Cases: Multiple integrations provide overlapping users, you need to choose which source's data to prioritize, or you need each user to appear only once in the final roster.

⚠️ Note: Can be attached to any source - operates on merged data from all sources after ingestion.

Configuration:

  • Index Key: Field used to identify duplicates (example: email_addr)
  • Filter Condition (DSL): Rule to select which duplicate to keep (example: record.employee_id != "")
  • Lowercase: Convert the index key value to lowercase (example: true; recommended for emails)

Common Rules:

Prefer active users:
  record.employment_status == "Active"

Prefer records with employee ID:
  record.employee_id != ""

⚠️ Important: Always set Lowercase to true when using email_addr as Index Key.
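
The selection logic can be pictured as: group records by the (optionally lowercased) Index Key, then keep the first record in each group that satisfies the Filter Condition. Below is a hedged Python sketch of that behavior; what happens when no duplicate matches is not documented here, so falling back to the first record is an assumption:

  def dedupe_first_match(records, index_key, predicate, lowercase=True):
      groups = {}
      for record in records:
          key = record.get(index_key, "")
          if lowercase:
              key = key.lower()
          groups.setdefault(key, []).append(record)
      kept = []
      for duplicates in groups.values():
          # First duplicate matching the filter condition wins; fall back to the first record (assumption).
          kept.append(next((r for r in duplicates if predicate(r)), duplicates[0]))
      return kept

  users = [
      {"email_addr": "Jane@Company.com", "employee_id": ""},
      {"email_addr": "jane@company.com", "employee_id": "E123"},
  ]
  # Equivalent of Filter Condition: record.employee_id != ""
  deduped = dedupe_first_match(users, "email_addr", lambda r: r.get("employee_id", "") != "")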


Set User Timezone

Processor Name: User Timezone Processor

Automatically infers and populates the user's timezone field by analyzing their location information (city, state, country). The processor uses geographic data to determine the most likely timezone for each user's location.

Use Cases: The source system doesn't provide a timezone field, or you need consistent timezone data for time-based notifications and scheduling.

Configuration: No configuration needed - just add the processor. It automatically reads from standard location fields.
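
Behaviorally, the processor maps a user's location fields to a timezone, roughly as in the toy sketch below; the lookup table is purely hypothetical, and the real processor's geographic data and precedence rules are not documented here:

  # Hypothetical lookup table; the real processor uses richer geographic data.
  LOCATION_TO_TZ = {
      ("US", "San Francisco"): "America/Los_Angeles",
      ("DE", "Berlin"): "Europe/Berlin",
  }

  def set_timezone(record):
      key = (record.get("country_code"), record.get("city"))
      if not record.get("timezone") and key in LOCATION_TO_TZ:
          record["timezone"] = LOCATION_TO_TZ[key]
      return record

  set_timezone({"country_code": "DE", "city": "Berlin", "timezone": ""})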


Calculate Password Expiration

Processor Name: User Password Meta Info Processor

Fills in missing password date information using your organization's password policy configuration. This processor operates on two fields in the user record: password_last_changed and password_expires.

What Fields It Uses:

  • Input fields: password_last_changed (date), password_expires (date)
  • Password policy: Uses your org's configured password_expiry_in_days setting
  • Output: Populates whichever field is missing

How It Works:

  • password_last_changed exists, password_expires empty: calculates the expiry date (password_expires = password_last_changed + password_expiry_in_days)
  • password_expires exists, password_last_changed empty: calculates the last-changed date (password_last_changed = password_expires - password_expiry_in_days)
  • Both fields populated: no action taken (already complete)
  • Both fields empty: no action taken (insufficient data)
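
The calculation itself is a simple date offset. The Python snippet below is a minimal illustration of the scenarios above (password_expiry_in_days comes from your org's password policy; the exact field formats are an assumption):

  from datetime import date, timedelta

  def fill_password_dates(record, password_expiry_in_days, offset_days=0):
      window = timedelta(days=password_expiry_in_days + offset_days)
      last_changed = record.get("password_last_changed")
      expires = record.get("password_expires")
      if last_changed and not expires:
          record["password_expires"] = last_changed + window
      elif expires and not last_changed:
          record["password_last_changed"] = expires - window
      # Both present or both missing: no action taken.
      return record

  user = {"password_last_changed": date(2024, 1, 15), "password_expires": None}
  fill_password_dates(user, password_expiry_in_days=90)   # password_expires -> 2024-04-14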

Configuration:

  • Offset Days: Adjustment to the password policy duration (default: 0). Use if the source system's policy differs from the org config (e.g., +5 or -5 days).

Use Cases:

  • Source provides only one of the two password fields
  • Need complete password data for expiry notifications and password reset workflows
  • Source system password policy differs slightly from Moveworks org configuration

Add Location Coordinates

Processor Name: User Geocode Processor

Enriches user records with geographic coordinates (latitude/longitude) by geocoding their location information. The processor constructs a location query from specified fields, sends it to a geocoding service, and adds the resulting coordinates to the user's geocodes field.

What Fields It Uses:

  • Input: Any combination of location fields you specify (typically country_code, state, city)
  • Output: Populates geocodes field with latitude/longitude data

Use Cases:

  • Enable location-based analytics and reporting
  • Support features that require geographic coordinates
  • Enrich user profiles with precise location data

⚠️ Important: Attach to the source that contains the location fields you want to geocode.

Performance Note: Makes external API calls for geocoding - may slow ingestion for large user sets.

Configuration:

  • Location Fields: Fields used to build the geocoding query (example: country_code, state, city)
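
In effect, the processor joins the configured fields into a query string, calls a geocoding service, and writes the result into geocodes. The sketch below is only illustrative; geocode_lookup is a placeholder, and the actual service, query format, and geocodes schema are not documented here:

  def geocode_lookup(query):
      # Stand-in for the external geocoding API call.
      return (37.7749, -122.4194)

  def geocode_user(record, location_fields):
      # Build a query such as "US, CA, San Francisco" from the configured fields.
      parts = [record.get(f, "") for f in location_fields]
      query = ", ".join(p for p in parts if p)
      if query:
          lat, lon = geocode_lookup(query)
          record["geocodes"] = {"latitude": lat, "longitude": lon}
      return record

  geocode_user({"country_code": "US", "state": "CA", "city": "San Francisco"},
               ["country_code", "state", "city"])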

Resolve Manager Relationships

Processor Name: Unified Resolve Manager Processor

Establishes manager-employee relationships by resolving manager email addresses to internal user IDs. This processor builds an index of all users (email → ID), then replaces each user's manager_email field value with the corresponding manager's internal ID, enabling proper organizational hierarchy.

What Fields It Uses:

  • Input: manager_email (manager's email address)
  • Index built from: email_addr (all users' emails)
  • Output: Replaces manager_email value with manager's internal identifier

How It Works:

  1. Builds an index mapping every user's email address to their internal ID
  2. For each user record, looks up their manager_email in the index
  3. Replaces the email with the manager's internal ID
  4. Result: Proper manager-employee links throughout the organization
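
A compact Python sketch of that two-pass approach follows (the id field name, lowercase matching, and leaving unresolved manager emails untouched are assumptions):

  def resolve_managers(users):
      # Pass 1: index every user's email to their internal ID.
      email_to_id = {u["email_addr"].lower(): u["id"] for u in users if u.get("email_addr")}
      # Pass 2: replace manager_email with the manager's internal ID when it resolves.
      for u in users:
          manager_email = (u.get("manager_email") or "").lower()
          if manager_email in email_to_id:
              u["manager_email"] = email_to_id[manager_email]
      return users

  users = [
      {"id": "u1", "email_addr": "ceo@company.com", "manager_email": ""},
      {"id": "u2", "email_addr": "eng@company.com", "manager_email": "CEO@company.com"},
  ]
  resolve_managers(users)   # u2's manager_email becomes "u1"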

Use Cases:

  • Source provides manager email instead of manager ID
  • Need to build organizational reporting hierarchy
  • Manager data comes from different source than employee data

⚠️ Note: Can be attached to any source - operates on all users after merge. Add AFTER deduplication to ensure manager links resolve correctly.

Configuration: No configuration needed - just add the processor.


Best Practices

1. Filter Early

Add filter processors before enrichment (like geocoding) to reduce processing time.

✅ Good Order:
  1. Filter Users by Field Value (remove terminated)
  2. Set User Timezone
  3. Add Location Coordinates

❌ Bad Order:
  1. Add Location Coordinates (slow)
  2. Filter Users by Field Value (wastes processing)

2. Deduplicate Before Manager Resolution

If using both processors, always apply deduplication first.

✅ Correct Order:
  1. Remove Duplicate Users
  2. Resolve Manager Relationships

❌ Incorrect Order:
  1. Resolve Manager Relationships
  2. Remove Duplicate Users

3. Use Lowercase for Email Deduplication

When deduplicating by email, always set Lowercase to true.

✅ Correct:
  Index Key: email_addr
  Lowercase: true

4. Attach Geocode to Source with Location Data

Add the geocode processor to the source that has location fields (country_code, state, city).

5. Test with Sample Data First

  1. Configure processor on test integration
  2. Run ingestion with small sample
  3. Verify results match expectations
  4. Apply to production

Common Scenarios

Scenario 1: Basic Filtering

Goal: Exclude terminated and inactive users from Okta.

Steps:

  1. Import Users → Select Okta → Advanced Mode
  2. In Processors to Apply, add: Filter Users by Field Value
  3. Configure: Filter Key: employment_status, Filter List: Terminated, Inactive

Scenario 2: Multi-Source with Deduplication

Goal: Use both Okta and Workday, preferring records with employee IDs.

Okta Source:

  1. Import Users → Select Okta → Advanced Mode
  2. Add: Set User Timezone

Workday Source:

  1. Import Users → Select Workday → Advanced Mode
  2. Add: Filter Users by Field Value
    • Filter Key: worker_type, Filter List: Contractor, Temp

Either Source (Deduplication):

  1. Add: Remove Duplicate Users
    • Index Key: email_addr
    • Filter Condition: record.employee_id != ""
    • Lowercase: true

Scenario 3: Complex Filtering

Goal: Keep only active, full-time employees with company email addresses.

Steps:

  1. Import Users → Select source → Advanced Mode
  2. Add: Filter Users by Rule
  3. Configure Filter Condition:
    employment_status == "Active" AND employment_type == "Full-time" AND email_addr CONTAINS "@company.com"

Scenario 4: Manager Hierarchy

Goal: Establish manager relationships when source provides manager emails.

Steps:

  1. Import Users → Select any source → Advanced Mode
  2. Add: Resolve Manager Relationships (no configuration needed)

Note: Add AFTER any deduplication processors.


Troubleshooting

❌ Too many users filtered out

Solution:

  • Review filter conditions and test with small sample
  • Verify field names match source data exactly (case-sensitive)
  • Check logical operators match intent (AND vs OR)

❌ Duplicate users still appearing

Check:

  • ✓ Lowercase set to true for email-based deduplication
  • ✓ Index Key matches field name exactly (case-sensitive)
  • ✓ Filter condition correctly identifies preferred record

❌ Manager relationships not working

Check:

  • ✓ Manager processor added AFTER deduplication
  • ✓ Manager emails exist in ingested user data
  • ✓ Manager email field populated in source data

❌ Rule syntax error

Check:

  • ✓ Field names match exactly (case-sensitive)
  • ✓ Strings in quotes: "value" not value
  • ✓ Lists use brackets: ["value1", "value2"]

Rule Syntax Reference

Filter Users by Rule

Use field names directly; no prefix is needed (unlike the Remove Duplicate Users filter condition, which references fields with the record. prefix).

Basic Comparisons

field_name == "value"         # Equal to
field_name != "value"         # Not equal to
field_name > 100              # Greater than
field_name >= 100             # Greater than or equal
field_name < 100              # Less than
field_name <= 100             # Less than or equal

List Operations

field_name IN ["value1", "value2"]          # Field is in list
field_name NOT IN ["value1", "value2"]      # Field is not in list

Text Matching

field_name CONTAINS "text"         # Contains substring
field_name STARTS_WITH "text"      # Starts with text
field_name ENDS_WITH "text"        # Ends with text

Combining Conditions

condition1 AND condition2          # Both must be true
condition1 OR condition2           # Either must be true
NOT condition                      # Opposite/negation

Examples

# Keep active employees
employment_status == "Active"

# Active employees in specific departments
employment_status == "Active" AND department IN ["Engineering", "Sales"]

# Users with company email
email_addr CONTAINS "@company.com"
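
To sanity-check a rule against a sample record before running ingestion, the same logic can be mirrored as ordinary Python expressions. This is only a way to reason about a rule locally; the DSL itself is evaluated by the ingestion pipeline:

  sample = {"employment_status": "Active", "department": "Sales",
            "email_addr": "jane@company.com"}

  # employment_status == "Active" AND department IN ["Engineering", "Sales"]
  rule_1 = sample["employment_status"] == "Active" and sample["department"] in ["Engineering", "Sales"]

  # email_addr CONTAINS "@company.com"
  rule_2 = "@company.com" in sample["email_addr"]

  # NOT (department IN ["Interns", "Vendors"])
  rule_3 = sample["department"] not in ["Interns", "Vendors"]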

Limitations

Processor Limits:

  • Maximum 20 processors per integration source
  • Processors run in configured order
  • No processor loops or conditional execution

Rule Constraints:

  • Field names are case-sensitive
  • Rule changes take effect only on the next ingestion run

Performance Considerations:

  • Geocoding processors make external API calls (slower)
  • Large filter lists may impact performance
  • Test with sample data before full ingestion