To ingest Addepar API data into Snowflake on a scheduled basis, configure an ETL pipeline that authenticates to the Addepar API, triggers an async report job, polls the job status endpoint until the report is ready, retrieves the report data, flattens any nested fields, and loads the result into a Snowflake table on a recurring schedule. This guide is written for data engineers and analysts at wealth management firms, family offices, and RIAs who use Addepar as their portfolio management platform and need Addepar data in Snowflake for reporting, analytics, or dbt transformations. After reading, you will have a working pipeline that pulls Addepar positions, transactions, or performance data into Snowflake automatically each day.
The key to building this pipeline correctly is understanding Addepar's async report pattern before writing any configuration. Addepar's core data endpoints do not return data on a direct GET request; they require you to trigger a report job, poll for completion, and retrieve the results using the job ID. A pipeline that skips polling will consistently load empty or incomplete data into Snowflake.
The Problem
Addepar is not a simple REST API where you call an endpoint and receive data. Core data types, including positions by date, performance for a period, and AUM breakdowns, are generated by Addepar's reporting engine through an asynchronous job pattern. Generating a positions report across thousands of accounts and securities takes time; the server cannot return results synchronously.
Teams that attempt to call the data endpoint directly, without first triggering a report job, receive an empty response or an error. Without a polling mechanism, pipelines either hard-code a sleep delay (which breaks when the report takes longer than expected) or proceed to downstream steps on incomplete data.
Addepar also returns deeply nested JSON in many endpoints. A position record may include a nested entity object containing the account name, account type, and account ID rather than exposing those as flat fields. Flattening this into a Snowflake-ready table requires a parsing step most teams do not anticipate until they see the raw response.
What You'll Need
-
Addepar API credentials: the base URL for your firm's Addepar instance (typically yourfirm.addepar.com), plus your API username and password or API key
-
Your firm ID in Addepar: required as a query parameter on most API calls
-
Target endpoint knowledge: which Addepar report or data type you want to extract (positions, transactions, AUM, performance)
-
A Snowflake account: with a destination database, schema, and target table, or the ability to create one
-
An ETL tool that supports async API polling: Integrate.io handles Addepar's trigger-poll-retrieve pattern natively through its REST API source component and workflow polling support
To learn how Integrate.io can help to automate API data integrations, reach out to our team to discuss your use case with our Sales engineer.
How to Ingest Addepar API Data Into Snowflake on a Scheduled Basis: Step-by-Step
Step 1: Identify the Addepar Endpoint and Understand Its Async Pattern
Most Addepar data types use an asynchronous job pattern rather than a synchronous GET. Knowing which pattern your target endpoint uses before building the pipeline saves you from diagnosing empty-data failures after deployment.
What to do:
- Consult the Addepar API documentation to locate the endpoint for the data type you need (positions, transactions, AUM, or performance)
- Confirm whether the endpoint uses the async job pattern: you POST to start a report job, receive a job_id in the response, GET the job status endpoint with that job_id until the status reads "completed," then GET the results endpoint using the same job_id to retrieve the actual data
- Note that simpler endpoints (account lists, entity metadata) return data synchronously and require only a single GET; identify which pattern applies to your target before proceeding
- Document the three endpoint URLs for async endpoints: the trigger endpoint, the status endpoint, and the retrieval endpoint, along with required parameters for each
Output of this step: A documented API call sequence (async or synchronous) for your target data type, with endpoint URLs and required parameters for each call.
Step 2: Configure Authentication and Store Credentials Securely
Addepar's API uses HTTP basic authentication. Every request must include your credentials and your firm's base URL. Storing these values in plain text in the pipeline configuration creates a security risk and makes credential rotation harder.
What to do:
- Locate your Addepar API username, API password, and your firm's base URL (for example, yourfirm.addepar.com)
- Store the username, password, and firm ID in your ETL tool's secrets manager rather than in the pipeline configuration itself
- Test the authenticated connection by making a GET request to a simple metadata endpoint (such as the accounts list) and confirming a 200 response with data
- Verify that the firm_id parameter is included in every request, as most Addepar endpoints require it as a query parameter alongside authentication headers
Output of this step: A working authenticated connection to the Addepar API with credentials stored securely and verified against a live endpoint.
Step 3: Trigger the Async Report Job and Capture the Job ID
The report trigger is the first of three calls in Addepar's async pattern. The job_id returned from this call is required for both the status poll and data retrieval; failing to capture it stops the pipeline at step four.
What to do:
- Send a POST request to the Addepar report trigger endpoint with the required parameters in the request body: typically the firm_id, a date range (start_date and end_date), and the report type identifier
- Capture the job_id field from the POST response body immediately; this value is not returned again by any subsequent endpoint
- Store the job_id as a variable in your pipeline so it can be passed to the status poll and retrieval calls without manual entry
Output of this step: A triggered Addepar report job with the job_id captured and stored as a pipeline variable ready for the polling step.
Where Integrate.io helps: Integrate.io's REST API source component sends the POST request to the Addepar trigger endpoint, extracts the job_id from the JSON response using a path expression you specify, and stores it as a pipeline variable automatically. That variable passes to the subsequent poll and retrieval steps without manual wiring.
Step 4: Poll the Job Status Endpoint Until the Report Is Ready
Addepar reports typically complete in 30 seconds to 5 minutes depending on date range and portfolio size. Polling at regular intervals and waiting for a confirmed "completed" status before retrieving data is what separates a reliable pipeline from one that loads empty results intermittently.
What to do:
- Send a GET request to the Addepar job status endpoint, passing the job_id captured in Step 3
- Check the status field in the response; expected values include "created," "processing," "completed," and "failed"
- Repeat the GET request every 15 to 30 seconds until the status is either "completed" or "failed"
- If the status returns "failed," fire a failure alert and halt the pipeline before attempting to retrieve data; proceeding to retrieval after a failed job returns empty or corrupt data
- Do not proceed to Step 5 until a "completed" status is confirmed
Output of this step: A confirmed "completed" status from the Addepar job status endpoint, with the job_id ready for the data retrieval call.
Where Integrate.io helps: Integrate.io's workflow supports a polling component that sends a GET request to the status endpoint, inspects the status field in the response, and loops with a configurable delay until the target value ("completed") is met. This handles Addepar's async pattern without a custom polling script.
Step 5: Retrieve the Report Data and Flatten Nested JSON Fields
Once the status is "completed," the actual data is available at the retrieval endpoint. Addepar returns JSON, often with nested objects that require flattening before the data is Snowflake-ready.
What to do:
- Send a GET request to the Addepar data retrieval endpoint using the job_id from Step 3
- Review the raw JSON response to identify nested fields; a position record, for example, may include a nested entity object with the account name, account type, and account ID inside it rather than as flat fields at the top level
- Extract nested field values using JSON path expressions in your ETL tool's transformation layer (for example, extracting entity.name from the nested entity object into a flat account_name column)
- For array fields, such as a list of lots within a position, decide whether to explode the array into multiple rows or select only the summary fields your Snowflake model uses
- Rename fields to match your target Snowflake column names at this step, before loading
Output of this step: A flat, Snowflake-ready dataset with all nested Addepar fields extracted into individual columns and renamed to match the target table schema.
Where Integrate.io helps: Integrate.io's transformation canvas includes a SELECT component with JSON parsing functions that extract nested field values by path, turning a nested entity object into flat columns, without a custom parsing script.
Step 6: Load the Data Into Snowflake and Configure the Operation Type
The load operation type determines whether records are replaced, updated, or appended. Choosing the wrong type is the most common cause of duplicate or stale records in the Snowflake destination table.
What to do:
- Connect to your Snowflake destination using your account identifier, warehouse, database, schema, and target table name
- Choose the load operation that matches the data's update pattern:
- Use overwrite if you replace the full dataset on each run (appropriate for point-in-time snapshots like today's positions)
- Use upsert (merge) if you accumulate records over time and need to update existing records by primary key (appropriate for transactions or account-level balances)
- Use append if you are building a historical log and never update existing records
- For upsert operations, set the primary key; for positions, this is typically a composite of account_id and date
- Run a test load with a limited date range before scheduling the full pipeline
Output of this step: A Snowflake table loaded with Addepar report data, with the correct operation type configured for the data's update pattern.
Step 7: Schedule the Pipeline and Set Incremental Date Parameters
A pipeline that runs on a fixed date range is only useful once. Passing dynamic date variables to the Addepar trigger call makes the pipeline self-maintaining after initial configuration.
What to do:
- Set the pipeline to run on a schedule matching the frequency your team needs the data refreshed; daily end-of-day is typical for positions, while accounts data supports intraday refresh
- Pass the current run date as a dynamic variable to the start_date and end_date parameters in the Addepar trigger call (Step 3), so the pipeline automatically requests the correct date range without manual updates
- For historical backfill, run the pipeline once with a fixed start date and end date before enabling the recurring schedule
- Configure failure alerts to fire if the Addepar report job returns "failed" status (Step 4) or if the Snowflake load does not complete successfully
Output of this step: A scheduled pipeline that automatically requests the correct date range from Addepar on each run and loads the results into Snowflake without manual intervention.
Common Mistakes to Avoid
-
Calling the Addepar data retrieval endpoint without triggering a report job first: the retrieval endpoint requires a valid, completed job_id; calling it directly without one, or with a job_id from a run that has not finished, returns an error or empty data; always start with the POST trigger call
-
Skipping the poll step and retrieving data immediately after the trigger: Addepar report generation takes time; retrieving data before the status is "completed" returns an incomplete or empty dataset; always confirm "completed" before calling the retrieval endpoint
-
Hardcoding date ranges in the API call: a pipeline with hardcoded start_date and end_date fields runs correctly the first time and returns stale data on every run after; always use dynamic date variables tied to the current run date
-
Selecting all fields from the API response without reviewing the schema: Addepar responses include internal IDs, audit timestamps, and API versioning fields that do not belong in the Snowflake analytics table; review the response schema and select only the fields your downstream models use
-
Using append mode for position data without a date filter: appending position snapshots will duplicate records if the pipeline runs more than once for the same date; use overwrite for daily snapshots or upsert with a composite primary key that includes the date column
-
Not handling the "failed" status from the Addepar job: if Addepar's reporting engine fails due to a large date range, a rate limit, or a system issue, the status returns "failed"; a pipeline that does not check for this will wait indefinitely or proceed to retrieve empty data; always add a failure branch that fires an alert and halts the pipeline
Conclusion
Ingesting Addepar API data into Snowflake on a scheduled basis requires handling Addepar's async report pattern before any loading or scheduling logic can work. The process follows a fixed sequence: identify whether your endpoint is async or synchronous, authenticate and store credentials securely, trigger the report job and capture the job_id, poll for "completed" status, retrieve and flatten the nested JSON, load to Snowflake with the operation type matched to your data's update pattern, and schedule with dynamic date variables.
Integrate.io's REST API source handles the trigger, poll, and retrieve sequence for Addepar's async endpoints without custom scripts; its Snowflake destination with upsert support and built-in scheduler complete the pipeline end to end. Once this pipeline is running for one Addepar data type, the same trigger-poll-retrieve pattern applies to other endpoints: positions, transactions, performance, or AUM. Each additional dataset requires a new configuration, not a new approach.