Setup Guide

SPAN Identity Graph enables identity resolution directly inside your Snowflake account. All processing runs within Snowflake. No data leaves your environment.

Overview

SPAN clusters records that refer to the same real-world entity and assigns a stable profile_id to each resolved identity.

The Identity Graph:

Normalizes identity fields
Applies configurable matching rules
Clusters related records
Writes results back into Snowflake

SPAN runs entirely as a Snowflake Native App.

Prerequisites

Before installing SPAN, ensure you have:

Permission to install applications from Snowflake Marketplace
Access to a running Snowflake warehouse
A source table containing identity-related fields (e.g., name, email, phone)
Permission to create tables in your selected output schema

SPAN does not require:

External API access
Data export
Python environment setup
Local execution

Installation

Navigate to Snowflake Marketplace.
Search for SPAN Identity Graph.
Click Install.
Select:

A warehouse
An application database (auto-created)

Review and approve requested privileges.
Click Launch.

After installation, SPAN runs entirely inside your Snowflake account.

First-Time Setup

After launching SPAN:

1. Select a Source Table

Choose the Snowflake table containing records you want to resolve.

Example:

Each record must include a unique key column (e.g., ACCOUNT_ID).
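
The unique-key requirement can be checked before a run. The following is a minimal Python sketch that runs outside SPAN, on a few in-memory rows; the helper name find_duplicate_keys and the sample rows are hypothetical and only illustrate the check.

```python
from collections import Counter

def find_duplicate_keys(rows, key_column):
    """Return key values that appear on more than one row."""
    counts = Counter(row[key_column] for row in rows)
    return [key for key, n in counts.items() if n > 1]

# Hypothetical sample rows; ACCOUNT_ID plays the role of the unique key column.
rows = [
    {"ACCOUNT_ID": "A1", "FIRSTNAME": "Ana"},
    {"ACCOUNT_ID": "A2", "FIRSTNAME": "Ben"},
    {"ACCOUNT_ID": "A2", "FIRSTNAME": "Benjamin"},
]

duplicates = find_duplicate_keys(rows, "ACCOUNT_ID")
print(duplicates)  # ['A2']
```

An empty result means the column is usable as a key; any value listed should be deduplicated or a different key column chosen.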

2. Map Identity Fields

SPAN requires a field mapping so it can interpret your schema.

You will map source columns to standard identity fields.

Example mapping:

Source Column     Type           SPAN Field
FIRSTNAME         text           first_name
LASTNAME          text           last_name
USER_EMAIL
DOB               text           birthdate
PHONE             phonenumber    phone
STREET_ADDRESS    text           street_address
CITY              text           city
STATE             text           state
ZIP_CODE          text           zip_code
GENDER            text           gender

Field types determine:

Normalization rules
Matching semantics
Comparison logic

Default configurations are provided for common identity schemas.
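
To make the role of field types concrete, here is a small Python sketch of a mapping in which the type selects a normalization function. It is an illustration under assumptions, not SPAN's configuration format: the FIELD_MAPPING structure, the NORMALIZERS functions, and both helper names are hypothetical, and the normalization choices (lowercasing text, keeping only digits in phone numbers) are guesses at typical behavior. USER_EMAIL is left out because its type is not shown in the example mapping.

```python
import re

# Hypothetical mapping: source column -> (type, SPAN field).
FIELD_MAPPING = {
    "FIRSTNAME": ("text", "first_name"),
    "LASTNAME": ("text", "last_name"),
    "DOB": ("text", "birthdate"),
    "PHONE": ("phonenumber", "phone"),
    "ZIP_CODE": ("text", "zip_code"),
}

# Assumed normalization per type; SPAN's actual rules may differ.
NORMALIZERS = {
    "text": lambda v: " ".join(str(v).split()).lower(),
    "phonenumber": lambda v: re.sub(r"\D", "", str(v)),
}

def missing_columns(mapping, table_columns):
    """List mapped source columns that are absent from the table."""
    available = {c.upper() for c in table_columns}
    return [c for c in mapping if c.upper() not in available]

def apply_mapping(row, mapping=FIELD_MAPPING):
    """Rename mapped columns to SPAN fields and normalize by type."""
    out = {}
    for source_column, (field_type, span_field) in mapping.items():
        value = row.get(source_column)
        out[span_field] = None if value is None else NORMALIZERS[field_type](value)
    return out

missing = missing_columns(
    FIELD_MAPPING, ["account_id", "firstname", "lastname", "dob", "phone"]
)
record = apply_mapping(
    {"FIRSTNAME": "  Ana ", "LASTNAME": "De  La Cruz",
     "DOB": "1990-01-31", "PHONE": "(555) 010-1234"}
)
print(missing)          # ['ZIP_CODE']
print(record["phone"])  # 5550101234
```

Checking for missing columns up front mirrors the first item under Column Mapping Errors in the Troubleshooting section.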

3. Configure Matching Rules

SPAN uses deterministic blocking rules to efficiently compare records.

Example rule patterns:

first_name + last_name + birthdate + state
first_name + last_name + zip_code
email + phone
phone + gender

These rules define how records are grouped and compared during clustering.

For initial (MVP) usage, default rule sets are pre-configured. You can tune the rules later.
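
The rule patterns above can be read as blocking keys: records that share the same values for every field in a rule fall into one block and become candidates for comparison. The Python sketch below shows that general technique on in-memory records. It assumes records missing any field in a rule are skipped for that rule; the function name candidate_pairs and the sample data are hypothetical, and SPAN's internal implementation is not described in this guide.

```python
from collections import defaultdict
from itertools import combinations

# The example rule patterns from this guide, as tuples of SPAN fields.
RULES = [
    ("first_name", "last_name", "birthdate", "state"),
    ("first_name", "last_name", "zip_code"),
    ("email", "phone"),
    ("phone", "gender"),
]

def candidate_pairs(records, rules=RULES):
    """records maps a unique key to a dict of normalized SPAN fields."""
    pairs = set()
    for rule in rules:
        blocks = defaultdict(list)
        for key, rec in records.items():
            values = tuple(rec.get(field) for field in rule)
            if all(values):  # assumption: skip records missing a rule field
                blocks[values].append(key)
        for members in blocks.values():
            pairs.update(combinations(sorted(members), 2))
    return pairs

# Hypothetical normalized records.
records = {
    "A1": {"first_name": "ana", "last_name": "cruz", "zip_code": "94110",
           "email": "ana@example.com", "phone": "5550101234"},
    "A2": {"first_name": "ana", "last_name": "cruz", "zip_code": "94110"},
    "A3": {"first_name": "ann", "last_name": "cruz", "zip_code": "10001",
           "email": "ana@example.com", "phone": "5550101234"},
    "A4": {"first_name": "ben", "last_name": "ito", "zip_code": "94110"},
}

pairs = candidate_pairs(records)
print(sorted(pairs))  # [('A1', 'A2'), ('A1', 'A3')]
```

Here A1 and A2 meet through first_name + last_name + zip_code, while A1 and A3 meet through email + phone. A4 shares only a zip code with the others, so it is never compared.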

4. Run Identity Graph

Click Run Identity Graph.

SPAN will:

Load source data
Normalize identity fields
Apply matching rules
Cluster related records
Assign a profile_id to each resolved entity

Processing uses your selected Snowflake warehouse.
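
The clustering and ID assignment steps can be pictured with a union-find structure: every matched pair merges two groups, and each final group receives one identifier. This is a generic Python sketch, not SPAN's algorithm. The function name assign_profile_ids and the profile_ prefix are hypothetical, and the sketch does not address how SPAN keeps profile_id stable across runs.

```python
def assign_profile_ids(keys, matched_pairs):
    """Group keys connected by matched pairs and label each group."""
    parent = {k: k for k in keys}

    def find(k):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller key as root so the label is deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return {k: f"profile_{find(k)}" for k in keys}

# Hypothetical keys and matched pairs.
profile_ids = assign_profile_ids(
    ["A1", "A2", "A3", "A4"],
    [("A1", "A2"), ("A3", "A1")],
)
print(profile_ids)
# {'A1': 'profile_A1', 'A2': 'profile_A1', 'A3': 'profile_A1', 'A4': 'profile_A4'}
```

Note that A2 and A3 end up in the same profile even though they were never matched directly; both are linked through A1.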

Output Location

SPAN writes results to a Snowflake table you specify.

Example:

The output table contains:

All original source columns
A new column: profile_id

profile_id represents the resolved unique entity identifier.

Source tables are never modified.

Objects Created by SPAN

Upon installation, SPAN creates:

Application database (managed)
Core processing schemas
Temporary processing tables (auto-managed)

When running the Identity Graph, SPAN creates:

Output identity table (in your selected schema)

SPAN does not:

Modify source tables
Store data externally
Create background scheduled tasks (unless explicitly configured)

Cost & Compute Notes

SPAN uses your selected Snowflake warehouse.

Compute usage depends on:

Table size
Number of blocking rules
Record similarity

No compute is consumed unless the Identity Graph is actively running.
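
As rough intuition for why these factors matter, a block of n records that share a blocking key yields n(n-1)/2 candidate pairs, so a few large blocks can dominate the work. The Python sketch below counts pairs for hypothetical block sizes. It is not SPAN's cost model and says nothing about actual warehouse credits.

```python
from math import comb

def estimated_comparisons(block_sizes_per_rule):
    """Sum the pairwise comparisons implied by each block of each rule."""
    return sum(
        comb(n, 2)
        for sizes in block_sizes_per_rule.values()
        for n in sizes
    )

# Hypothetical block sizes observed for two rules.
block_sizes = {
    "first_name + last_name + zip_code": [2, 3, 1],
    "email + phone": [10],
}

total = estimated_comparisons(block_sizes)
print(total)  # 49
```

The single block of 10 records accounts for 45 of the 49 pairs, which is why highly similar records and broad rules tend to raise compute usage.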

Troubleshooting

Insufficient Privileges

Ensure:

You can read from source tables
You can create tables in the output schema

Warehouse Suspended

Resume the selected warehouse before running the graph.

Column Mapping Errors

Confirm:

Selected columns exist in the source table
The key column uniquely identifies each record

For additional troubleshooting steps, refer to the Troubleshoot article.

Security Model

All processing occurs inside your Snowflake account.
No data leaves your environment.
SPAN operates under Snowflake RBAC.
All activity is auditable via Snowflake system logs.

Next Steps

After generating your first Identity Graph:

Validate record counts
Compare cluster sizes
Join downstream models on profile_id
Iterate on matching rules if needed
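
The first two checks can be sketched in Python against a sample of the output. The example assumes the output rows carry the original key column plus profile_id, as described under Output Location; the function name summarize and the sample values are hypothetical.

```python
from collections import Counter

def summarize(source_keys, output_rows, key_column="ACCOUNT_ID"):
    """Compare record counts and describe the cluster size distribution."""
    output_keys = [row[key_column] for row in output_rows]
    cluster_sizes = Counter(row["profile_id"] for row in output_rows)
    return {
        "source_records": len(source_keys),
        "output_records": len(output_keys),
        "missing_keys": sorted(set(source_keys) - set(output_keys)),
        "profiles": len(cluster_sizes),
        # Maps cluster size -> number of profiles of that size.
        "size_histogram": dict(sorted(Counter(cluster_sizes.values()).items())),
        "largest_cluster": max(cluster_sizes.values(), default=0),
    }

# Hypothetical output rows.
output_rows = [
    {"ACCOUNT_ID": "A1", "profile_id": "p1"},
    {"ACCOUNT_ID": "A2", "profile_id": "p1"},
    {"ACCOUNT_ID": "A3", "profile_id": "p1"},
    {"ACCOUNT_ID": "A4", "profile_id": "p2"},
]

summary = summarize(["A1", "A2", "A3", "A4"], output_rows)
print(summary["size_histogram"])  # {1: 1, 3: 1}
```

Matching record counts with no missing keys suggests every source record received a profile_id. An unexpectedly large cluster is often a sign that a rule is too broad and worth revisiting.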

SPAN is designed to make identity resolution a governed, queryable data primitive inside Snowflake.


Last updated 19 hours ago
