# ETL: Creating packages with MCP tools

> Use the Integrate.io MCP server to let Claude, Cursor, or the MCP Inspector author, edit, and clean up ETL packages without opening the web UI.

The Integrate.io MCP server exposes a set of package-authoring tools that AI clients can call to build a working ETL package end to end. Supported clients include Claude Desktop, Claude Code, Cursor, and the MCP Inspector. The agent discovers what the catalog supports, composes a dataflow JSON, creates the package, fixes wiring errors in place, and runs it. No web UI required.

These tools complement the read and run tools described in the [MCP Server overview](/etl/integrateio-mcp-server). Configure your client and mint a token there first.

## When to use these tools

* You want an AI assistant to build a new package from a natural-language prompt.
* You want the agent to clone an existing pipeline as a template, then swap in different connections, tables, or paths.
* You want the agent to fix a broken edge or component without opening the package designer.
* You want the agent to archive packages that failed an authoring attempt so they don't clutter the active list.

If you only need read or run access, stick to the inspection and execution tools on the [overview page](/etl/integrateio-mcp-server).

## Tool reference

### create\_package (mutation)

Mints a new package from a full `data_flow_json` spec. Components and edges are persisted in a single transaction — partial packages are never written.

| Argument       | Type    | Required | Notes                                                                             |
| -------------- | ------- | -------- | --------------------------------------------------------------------------------- |
| `name`         | string  | Yes      | Package name shown in the dashboard.                                              |
| `flow_type`    | string  | Yes      | `dataflow` or `workflow`.                                                         |
| `components`   | array   | Yes      | Array of single-key wrapper hashes (see below).                                   |
| `edges`        | array   | Yes      | Array of edge hashes referencing component `id`s.                                 |
| `workspace_id` | integer | No       | Must belong to the calling account. Omit to create the package as **Unassigned**. |
| `description`  | string  | No       | Free-form, up to 4096 characters.                                                 |
| `flow_version` | string  | No       | Defaults to `2.0.0` (the modern package designer).                                |

`variables` is not accepted. Packages that need package-level variables must be created through the REST API.

#### Component shape

Each component is a single-key wrapper hash. The key is the wrapper type (for example `database_source_component`), and the inner hash carries the configuration. Common fields:

* `id` — stable identifier referenced by edges. Convention is `component-<hex>`. Omit to let the server generate one.
* `name` — human-readable label shown on the canvas.
* `xy` — `[x, y]` position pair.
* `alias` — data-flow alias. Sources expose this; destinations consume it via `input_alias`.
* `connection` — connection metadata block. Uses the generic key `connection`, not type-specific keys. Shape: `{ "id": <integer>, "name": "<display name>", "type": "<connection-type-slug>" }`. Without `name` and `type`, the dashboard can't render the connection chip.
* `<type>_connection_id` — string form of the connection id (for example `"cloud_storage_connection_id": "47376"`).
* `specificComponentType` — vendor sub-type (for example `sftp_source_component`, `mysql_source_component`). Required when the wrapper covers multiple vendors so the dashboard renders the correct icon and form.
* `schema` — nested object `{ "fields": [{ name, alias, data_type }, ...] }`. Each field needs an `alias`.
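The shape above can be sketched in Python. `make_component` is a hypothetical helper for illustration, not part of the MCP server; it only wraps the inner hash in a single key and checks that every schema field carries an `alias`. The field values mirror the examples on this page.

```python
from typing import Any


def make_component(wrapper_key: str, **config: Any) -> dict[str, dict[str, Any]]:
    """Wrap an inner configuration hash in a single-key wrapper hash.

    Hypothetical helper: the server only ever sees the resulting dict.
    """
    schema = config.get("schema")
    if schema is not None:
        # Each schema field needs an alias, per the field list above.
        missing = [f.get("name") for f in schema.get("fields", []) if "alias" not in f]
        if missing:
            raise ValueError(f"schema fields missing alias: {missing}")
    return {wrapper_key: dict(config)}


source = make_component(
    "cloud_storage_source_component",
    id="component-sftp01",
    name="nightly_drop",
    xy=[100, 100],
    alias="raw",
    specificComponentType="sftp_source_component",
    connection={"id": 47376, "name": "[Prod] SFTP", "type": "sftp"},
    cloud_storage_connection_id="47376",
    schema={"fields": [{"name": "id", "alias": "id", "data_type": "string"}]},
)
```

The resulting `source` dict can be placed directly in the `components` array.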

#### Edge shape

Edges reference components by their `id` field:

```json theme={null}
{
  "id": "edge-abc123",
  "label": "edge-abc123",
  "source": "component-aaa111",
  "target": "component-bbb222",
  "source_index": 1,
  "order": 1
}
```

`id` and `label` are optional — the server generates them when absent.

#### Best practice: clone from an existing package

The safest way to compose a valid `data_flow_json` is to read a similar existing package first:

1. Call `list_packages` and find a pipeline with the same source and destination connection types.
2. Call `get_package(<that_id>, include_full_graph: true)` and inspect its `data_flow_json`.
3. Use the result as a template. Swap in your own connection ids and table or file paths. Keep the structural keys.

Building from `describe_component_type` output alone is error-prone — it surfaces `attr_accessor` property names but can't expose nested sub-shapes (like `schema.fields`) or renderer conventions like `specificComponentType`. Pair it with a working template whenever possible.
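As a rough sketch of step 3, the function below copies a template and swaps one connection for another while leaving the structural keys alone. It assumes the template exposes a top-level `components` list of single-key wrapper hashes; the exact `get_package` output shape is not documented here, so adjust the lookup to what the tool actually returns. `retarget_connection` and the staging connection values are hypothetical.

```python
import copy
from typing import Any


def retarget_connection(
    data_flow: dict[str, Any], old_id: int, new_conn: dict[str, Any]
) -> dict[str, Any]:
    """Return a copy of a template with one connection swapped for another.

    ASSUMPTION: the template has a top-level "components" list.
    """
    flow = copy.deepcopy(data_flow)
    for wrapper in flow.get("components", []):
        for inner in wrapper.values():
            conn = inner.get("connection")
            if not conn or conn.get("id") != old_id:
                continue
            inner["connection"] = dict(new_conn)
            # Keep the string-form "<type>_connection_id" key in sync.
            for key in inner:
                if key.endswith("_connection_id"):
                    inner[key] = str(new_conn["id"])
    return flow


template = {
    "components": [
        {
            "cloud_storage_source_component": {
                "id": "component-sftp01",
                "connection": {"id": 47376, "name": "[Prod] SFTP", "type": "sftp"},
                "cloud_storage_connection_id": "47376",
                "path": "/incoming/orders/*.csv",
            }
        }
    ],
    "edges": [],
}

# Placeholder connection: replace with an id returned by list_connections.
staging = retarget_connection(
    template, 47376, {"id": 99001, "name": "[Staging] SFTP", "type": "sftp"}
)
```

Working on a deep copy keeps the original template intact, so the same reference package can seed several variants.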

#### Returns

On success: `{ package_id, package_version, name, flow_type, flow_version, workspace_id, created_at, warnings }`. The `warnings` array lists shape fixups the server applied (for example rewriting a vendor-named wrapper to the canonical wrapper plus `specificComponentType`).

On failure: `{ error: "..." }` — workspace not in account, name too long, save failure, and so on.
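A client might unpack the two result shapes along these lines. This is a minimal sketch: `unpack_create_result` is a hypothetical name, and it assumes the tool result has already been decoded into a Python dict.

```python
from typing import Any


def unpack_create_result(result: dict[str, Any]) -> int:
    """Return the new package_id, or raise if the tool returned an error envelope."""
    if "error" in result:
        raise RuntimeError(f"create_package failed: {result['error']}")
    for warning in result.get("warnings", []):
        # Warnings describe shape fixups the server applied; surface them to the caller.
        print(f"warning: {warning}")
    return result["package_id"]
```

Logging the warnings is worthwhile because they show how the server rewrote the submitted components, which helps when composing the next package.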

### list\_component\_types (read)

Enumerates every component registered in the platform catalog. Returns one entry per concrete subclass with:

* `name` — internal name (for example `mysql_source_component`). Pass this to `describe_component_type`.
* `category` — `source`, `transformation`, or `destination`.
* `class_name` — Ruby class name, informational.
* `description` — header docstring from the component's source file.
* `wrapper_key` — the outer key to use in `data_flow_json`. May be `null` for transformations or unmapped classes.
* `specific_component_type` — value for the inner `specificComponentType` field. May be `null` when the vendor has its own top-level wrapper.

The optional `category` argument filters the result by category.

Use this when the account has no similar package to clone from. For mature accounts, calling `get_package` on an existing pipeline is often enough.

### describe\_component\_type (read)

Per-component introspection by internal name. Returns:

* `properties` — writable attribute names extracted from `attr_accessor` declarations.
* `required` — attribute names with presence validators.
* `description` — header docstring.
* `example` — fixture JSON example, or `null` if none exists.
* `wrapper_key`, `specific_component_type`, `valid_specific_types`, `wrapper_example` — copy-paste-ready wrapper recipe.

Required argument: `type` (the `name` from `list_component_types`). Returns `{ error: "unknown component type: <type>" }` for unknown names.
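One way to use the introspection result is to check a draft component before sending it. The sketch below assumes that names in `required` correspond directly to keys of the inner hash, which this page does not guarantee. `missing_required` is a hypothetical helper, and the sample description holds illustrative values rather than real server output.

```python
from typing import Any


def missing_required(description: dict[str, Any], component: dict[str, Any]) -> list[str]:
    """List problems found when comparing a draft component to a description.

    ASSUMPTION: names in description["required"] map to inner-hash keys.
    """
    wrapper_key, inner = next(iter(component.items()))
    problems = []
    expected = description.get("wrapper_key")
    if expected and wrapper_key != expected:
        problems.append(f"wrapper key should be {expected!r}, got {wrapper_key!r}")
    for name in description.get("required", []):
        if name not in inner:
            problems.append(f"missing required attribute {name!r}")
    return problems


# Illustrative description, not real describe_component_type output.
sample_description = {
    "wrapper_key": "cloud_storage_source_component",
    "specific_component_type": "sftp_source_component",
    "required": ["path"],
}
draft = {"cloud_storage_source_component": {"id": "component-sftp01"}}
problems = missing_required(sample_description, draft)
```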

### update\_package\_edges (mutation)

Replaces a package's edges array wholesale. Pass the complete desired edges array — existing edges are overwritten.

| Argument     | Type    | Required | Notes                                                                                                                |
| ------------ | ------- | -------- | -------------------------------------------------------------------------------------------------------------------- |
| `package_id` | integer | Yes      | Package to update.                                                                                                   |
| `edges`      | array   | Yes      | Full edges array. Each edge needs at minimum `source` and `target` matching a component `id` in the current package. |

The tool pre-validates every `source` and `target` against the package's current components. If any edge references an unknown component, the entire call is rejected — no partial writes. Each write bumps the package version via PaperTrail and is reversible through the existing UI history.

Use this when `validate_package` surfaces an edge-related error after `create_package` and you need to fix wiring in place. For component-internal edits (changing a connection, adjusting a schema), use `update_package_components`.
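Because a single bad reference rejects the whole call, an agent can run the same check locally before calling the tool. The following is a sketch of that idea, not the server's implementation; `unresolved_references` is a hypothetical helper that takes the package's current components and the edges you plan to send.

```python
from typing import Any


def unresolved_references(
    components: list[dict[str, Any]], edges: list[dict[str, Any]]
) -> list[Any]:
    """Return edge endpoints that do not match any component id."""
    known = set()
    for wrapper in components:
        for inner in wrapper.values():
            if "id" in inner:
                known.add(inner["id"])
    missing: list[Any] = []
    for edge in edges:
        for end in ("source", "target"):
            ref = edge.get(end)
            # A missing "source" or "target" key shows up as None.
            if ref not in known and ref not in missing:
                missing.append(ref)
    return missing


components = [
    {"cloud_storage_source_component": {"id": "component-aaa111"}},
    {"cloud_storage_destination_component": {"id": "component-bbb222"}},
]
edges = [
    {"source": "component-aaa111", "target": "component-bbb222"},
    {"source": "component-aaa111", "target": "component-zzz999"},
]
dangling = unresolved_references(components, edges)
```

An empty list means every edge endpoint resolves and the edges array is safe to submit.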

### delete\_package (mutation)

Archives a package, with the same semantics as the dashboard's **Archive** action. The `Job` row is preserved with status `archived`: it disappears from the active package list but remains queryable via `list_packages(status: 'archived')`.

| Argument     | Type    | Required |
| ------------ | ------- | -------- |
| `package_id` | integer | Yes      |

If any active schedule still references the package, the archive transition is rejected. Disable the schedule first with `toggle_schedule(enabled: false)`.

Use this to clean up after an unsalvageable `create_package` attempt so failed iterations don't accumulate as dead rows.

Returns `{ package_id, status: 'archived', archived_at }` on success, or `{ error: "..." }` for package-not-found or rejected transitions.

## Recommended agent flow

```text theme={null}
list_connections / discover_schema / preview_data        # understand what data is available
list_packages / get_package(<reference_id>)              # find a template
list_component_types / describe_component_type           # only if no template exists
create_package                                           # mint the new package
validate_package                                         # catch structural + per-component errors
update_package_components / update_package_edges         # fix anything validate_package flagged
delete_package                                           # archive attempts that can't be fixed (skip otherwise)
run_package                                              # execute on an available cluster
get_run                                                  # poll until completed
```
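The middle of this flow can be sketched as a small driver. Several details are assumptions: `call_tool` stands in for whatever your MCP client uses to invoke a tool and decode its result, `validate_package` and `run_package` are assumed to take a `package_id` argument, and validation problems are assumed to arrive under an `errors` key. The sketch archives on the first validation failure to stay short; a real agent would attempt the update tools first.

```python
from typing import Any, Callable

# Hypothetical signature: (tool name, arguments) -> decoded tool result.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def create_validate_run(call_tool: ToolCaller, spec: dict[str, Any]) -> dict[str, Any]:
    created = call_tool("create_package", spec)
    if "error" in created:
        return {"stage": "create", "error": created["error"]}
    package_id = created["package_id"]

    # ASSUMPTION: validate_package takes package_id and reports problems under "errors".
    report = call_tool("validate_package", {"package_id": package_id})
    if report.get("error") or report.get("errors"):
        # Archive the failed attempt so it doesn't clutter the active list.
        call_tool("delete_package", {"package_id": package_id})
        return {"stage": "validate", "package_id": package_id, "archived": True}

    # ASSUMPTION: run_package takes package_id; get_run polling is omitted here.
    run = call_tool("run_package", {"package_id": package_id})
    return {"stage": "run", "package_id": package_id, "run": run}


# Stand-in tool caller that simulates a validation failure.
calls: list[str] = []


def fake_call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    calls.append(name)
    if name == "create_package":
        return {"package_id": 1, "warnings": []}
    if name == "validate_package":
        return {"errors": ["edge references unknown component"]}
    return {}


outcome = create_validate_run(fake_call_tool, {"name": "demo", "flow_type": "dataflow"})
```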

## Example: create a minimal SFTP-to-S3 package

```json theme={null}
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "create_package",
    "arguments": {
      "name": "SFTP nightly drop to S3",
      "flow_type": "dataflow",
      "components": [
        {
          "cloud_storage_source_component": {
            "id": "component-sftp01",
            "name": "nightly_drop",
            "xy": [100, 100],
            "alias": "raw",
            "specificComponentType": "sftp_source_component",
            "connection": { "id": 47376, "name": "[Prod] SFTP", "type": "sftp" },
            "cloud_storage_connection_id": "47376",
            "path": "/incoming/orders/*.csv",
            "schema": {
              "fields": [
                { "name": "id",    "alias": "id",    "data_type": "string" },
                { "name": "total", "alias": "total", "data_type": "double" }
              ]
            }
          }
        },
        {
          "cloud_storage_destination_component": {
            "id": "component-s3001",
            "name": "s3_landing",
            "xy": [400, 100],
            "input_alias": "raw",
            "specificComponentType": "s3_destination_component",
            "connection": { "id": 52001, "name": "[Prod] S3", "type": "s3" },
            "cloud_storage_connection_id": "52001",
            "path": "s3://my-bucket/orders/"
          }
        }
      ],
      "edges": [
        { "source": "component-sftp01", "target": "component-s3001" }
      ]
    }
  }
}
```

A successful response returns the new `package_id`. Pass it to `validate_package` next, fix any errors with `update_package_components` or `update_package_edges`, then run with `run_package`.
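If you are scripting calls rather than going through an AI client, the request envelope above can be generated with a few lines of Python. This sketch covers only the JSON-RPC body; the endpoint URL and authentication are described in the overview and are not assumed here. `tools_call` is a hypothetical helper, and the `package_id` value is a placeholder.

```python
import itertools
import json
from typing import Any

# Monotonic request ids, one per call.
_request_ids = itertools.count(1)


def tools_call(tool: str, arguments: dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 tools/call request body."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


# Placeholder package id for illustration.
body = tools_call("delete_package", {"package_id": 12345})
parsed = json.loads(body)
```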

## Error envelopes

The tools on this page return a plain `{ "error": "..." }` object instead of a JSON-RPC error when the failure is expected and recoverable. Common cases:

| Tool                      | Envelope                               | Cause                                                     |
| ------------------------- | -------------------------------------- | --------------------------------------------------------- |
| `create_package`          | `workspace not found in this account`  | `workspace_id` belongs to another account.                |
| `create_package`          | record-validation message              | Name too long, invalid `flow_type`, save failure.         |
| `update_package_edges`    | `unresolved component references: ...` | Edge references a component id that isn't in the package. |
| `delete_package`          | `package not found`                    | `package_id` doesn't belong to the calling account.       |
| `delete_package`          | archive transition rejection           | An active schedule still references the package.          |
| `describe_component_type` | `unknown component type: <name>`       | `type` argument doesn't match any registered component.   |

Unrecoverable failures (auth, malformed JSON-RPC) return standard JSON-RPC error responses — see the [MCP Server overview](/etl/integrateio-mcp-server) for HTTP-level errors.
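An agent can turn the recoverable envelopes from the table into a next step. The sketch below matches only the literal message prefixes listed above and treats everything else as unrecognized. Prefix matching is an assumption about how the messages are formatted, and `recovery_hint` with its hint wording is hypothetical.

```python
from typing import Any, Optional

# (message prefix from the table above, suggested next step)
RECOVERY_HINTS = [
    ("workspace not found in this account",
     "omit workspace_id or pass one owned by the calling account"),
    ("unresolved component references:",
     "re-read the package and resend edges that only use existing component ids"),
    ("package not found",
     "check that package_id belongs to the calling account"),
    ("unknown component type:",
     "pick a name returned by list_component_types"),
]


def recovery_hint(result: dict[str, Any]) -> Optional[str]:
    """Map a tool result to a suggested next step, or None on success."""
    message = result.get("error")
    if message is None:
        return None
    for prefix, hint in RECOVERY_HINTS:
        if str(message).startswith(prefix):
            return hint
    return "unrecognized envelope; surface the message to the user"
```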

## Audit trail

Every write performed through these tools is captured in your account's audit history via PaperTrail. Package creates, edge updates, and archive transitions all record the calling user as the author. Component edits made by `update_package_components` are versioned and reversible through the existing package version history UI.
