When to use these tools
- You want an AI assistant to build a new package from a natural-language prompt.
- You want the agent to clone an existing pipeline as a template, then swap in different connections, tables, or paths.
- You want the agent to add, edit, or remove components on an existing pipeline without opening the package designer.
- You want the agent to rename a package or define its public variables in place.
- You want the agent to fix a broken edge or component without opening the package designer.
- You want the agent to detect the columns of a CSV or other delimited file on an SFTP, S3, or GCS connection before wiring it into a flow.
- You want the agent to archive packages that failed an authoring attempt so they don’t clutter the active list.
Tool reference
create_package (mutation)
Mints a new package from a full `data_flow_json` spec. Components and edges are persisted in a single transaction, so partial packages are never written.
| Argument | Type | Required | Notes |
|---|---|---|---|
| `name` | string | Yes | Package name shown in the dashboard. |
| `flow_type` | string | Yes | `dataflow` or `workflow`. |
| `components` | array | Yes | Array of single-key wrapper hashes (see below). |
| `edges` | array | Yes | Array of edge hashes referencing component ids. |
| `workspace_id` | integer | No | Must belong to the calling account. Omit to create the package as Unassigned. |
| `description` | string | No | Free-form, up to 4096 characters. |
| `flow_version` | string | No | Defaults to `2.0.0` (the modern package designer). |
variables is not accepted. Packages that need package-level variables must be created through the REST API.
Component shape
Each component is a single-key wrapper hash. The key is the wrapper type (for example `database_source_component`), and the inner hash carries the configuration. Common fields:
- `id`: stable identifier referenced by edges. Convention is `component-<hex>`. Omit to let the server generate one.
- `name`: human-readable label shown on the canvas.
- `xy`: `[x, y]` position pair.
- `alias`: data-flow alias. Sources expose this; destinations consume it via `input_alias`.
- `connection`: connection metadata block. Uses the generic key `connection`, not type-specific keys. Shape: `{ "id": <integer>, "name": "<display name>", "type": "<connection-type-slug>" }`. Without `name` and `type`, the dashboard can’t render the connection chip.
- `<type>_connection_id`: string form of the connection id (for example `"cloud_storage_connection_id": "47376"`).
- `specificComponentType`: vendor sub-type (for example `sftp_source_component`, `mysql_source_component`). Required when the wrapper covers multiple vendors so the dashboard renders the correct icon and form.
- `schema`: nested object `{ "fields": [{ name, alias, data_type }, ...] }`. Each field needs an `alias`.
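The field list above can be sketched as a small Python helper that builds one wrapper hash. This is an illustrative sketch only: `make_component` is a hypothetical helper, and the label, coordinates, connection values, and `data_type` value are placeholder assumptions, not values taken from a real account. The `<type>_connection_id` field is left out because its exact key depends on the connection type.

```python
import secrets


def make_component(wrapper_key, name, xy, **config):
    """Wrap a component configuration in a single-key hash (hypothetical helper)."""
    inner = {
        "id": f"component-{secrets.token_hex(4)}",  # convention: component-<hex>
        "name": name,
        "xy": list(xy),
    }
    inner.update(config)
    return {wrapper_key: inner}


source = make_component(
    "database_source_component",
    name="orders_source",  # placeholder label
    xy=(100, 200),
    alias="orders",
    # Generic "connection" key; all three values are placeholders.
    connection={"id": 47376, "name": "Orders DB", "type": "mysql"},
    specificComponentType="mysql_source_component",
    # Every field carries an alias; the data_type value is an assumption.
    schema={"fields": [{"name": "id", "alias": "id", "data_type": "integer"}]},
)
```

Generating the `id` client-side is optional, since the server assigns one when it is omitted, but a known id makes it easier to write edges in the same call.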
Edge shape
Edges reference components by their `id` field: each edge names a `source` and a `target` component id. `id` and `label` are optional; the server generates them when absent.
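As a minimal sketch of the edge shape, the helper below wires a list of component ids into a linear chain. `chain_edges` is a hypothetical helper and the ids are placeholders; it assumes only the `source` and `target` keys, leaving `id` and `label` for the server to generate.

```python
def chain_edges(component_ids):
    """Connect components in order: first -> second -> third, and so on."""
    return [
        {"source": src, "target": dst}
        for src, dst in zip(component_ids, component_ids[1:])
    ]


edges = chain_edges(["component-a1", "component-b2", "component-c3"])
```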
Best practice: clone from an existing package
The safest way to compose a valid `data_flow_json` is to read a similar existing package first:
- Call `list_packages` and find a pipeline with the same source and destination connection types.
- Call `get_package(<that_id>, include_full_graph: true)` and inspect its `data_flow_json`.
- Use the result as a template. Swap in your own connection ids and table or file paths. Keep the structural keys.
describe_component_type output alone is error-prone — it surfaces attr_accessor property names but can’t expose nested sub-shapes (like schema.fields) or renderer conventions like specificComponentType. Pair it with a working template whenever possible.
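The template approach can be sketched as a pure function that copies a fetched `data_flow_json` and swaps one connection for another while leaving every structural key in place. This is a sketch under assumptions: `swap_connection` is a hypothetical helper, the template is assumed to be a dict with a `components` array of single-key wrappers, and the `database_connection_id` key in the example is a guess that follows the `<type>_connection_id` pattern.

```python
import copy


def swap_connection(data_flow_json, old_id, new_connection):
    """Return a copy of the template with one connection replaced."""
    flow = copy.deepcopy(data_flow_json)  # never mutate the fetched template
    for wrapper in flow["components"]:
        for inner in wrapper.values():
            conn = inner.get("connection")
            if not conn or conn.get("id") != old_id:
                continue
            inner["connection"] = dict(new_connection)
            # Keep the string-form <type>_connection_id in sync.
            for key in inner:
                if key.endswith("_connection_id"):
                    inner[key] = str(new_connection["id"])
    return flow


template = {
    "components": [
        {
            "database_source_component": {
                "id": "component-1",
                "connection": {"id": 1, "name": "Old DB", "type": "mysql"},
                "database_connection_id": "1",  # hypothetical key name
            }
        }
    ],
    "edges": [],
}
cloned = swap_connection(template, 1, {"id": 2, "name": "New DB", "type": "mysql"})
```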
Returns
On success: `{ package_id, package_version, name, flow_type, flow_version, workspace_id, created_at, warnings }`. The `warnings` array lists shape fixups the server applied (for example rewriting a vendor-named wrapper to the canonical wrapper plus `specificComponentType`).
On failure: { error: "..." } — workspace not in account, name too long, save failure, and so on.
list_component_types (read)
Enumerates every component registered in the platform catalog. Returns one entry per concrete subclass with:
- `name`: internal name (for example `mysql_source_component`). Pass this to `describe_component_type`.
- `category`: `source`, `transformation`, or `destination`.
- `class_name`: Ruby class name, informational.
- `description`: header docstring from the component’s source file.
- `wrapper_key`: the outer key to use in `data_flow_json`. May be `null` for transformations or unmapped classes.
- `specific_component_type`: value for the inner `specificComponentType` field. May be `null` when the vendor has its own top-level wrapper.

The `category` argument filters the result.
Use this when the account has no similar package to clone from. For mature accounts, get_package of an existing pipeline is often enough.
describe_component_type (read)
Per-component introspection by internal name. Returns:
- `properties`: writable attribute names extracted from `attr_accessor` declarations.
- `required`: attribute names with presence validators.
- `description`: header docstring.
- `example`: fixture JSON example, or `null` if none exists.
- `wrapper_key`, `specific_component_type`, `valid_specific_types`, `wrapper_example`: copy-paste-ready wrapper recipe.

Takes one argument, `type` (the `name` from `list_component_types`). Returns `{ error: "unknown component type: <type>" }` for unknown names.
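To show how `wrapper_key` and `specific_component_type` fit together, the sketch below turns one catalog entry into an empty wrapper skeleton. `wrapper_skeleton` is a hypothetical helper, and the sample entry pairs a wrapper with a vendor sub-type purely for illustration; real entries come from `list_component_types` or `describe_component_type`.

```python
def wrapper_skeleton(entry):
    """Build an empty single-key wrapper from a catalog entry."""
    wrapper_key = entry.get("wrapper_key")
    if wrapper_key is None:
        # Transformations or unmapped classes: no outer key is advertised.
        raise ValueError(f"{entry['name']} has no wrapper_key; clone a working package")
    inner = {}
    if entry.get("specific_component_type") is not None:
        inner["specificComponentType"] = entry["specific_component_type"]
    return {wrapper_key: inner}


entry = {  # illustrative entry, not real catalog output
    "name": "mysql_source_component",
    "category": "source",
    "wrapper_key": "database_source_component",
    "specific_component_type": "mysql_source_component",
}
skeleton = wrapper_skeleton(entry)
```

The skeleton still needs the common fields (`name`, `xy`, `connection`, `schema`, and so on), which is why a working template remains the safer starting point.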
update_package_edges (mutation)
Replaces a package’s edges array wholesale. Pass the complete desired edges array; existing edges are overwritten.
| Argument | Type | Required | Notes |
|---|---|---|---|
| `package_id` | integer | Yes | Package to update. |
| `edges` | array | Yes | Full edges array. Each edge needs at minimum `source` and `target` matching a component `id` in the current package. |

Each edge’s `source` and `target` are validated against the package’s current components. If any edge references an unknown component, the entire call is rejected, with no partial writes. Each write bumps the package version via PaperTrail and is reversible through the existing UI history.
Use this when validate_package surfaces an edge-related error after create_package and you need to fix wiring in place. For component-internal edits (changing a connection, adjusting a schema), use update_package_components.
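Because one unknown reference rejects the whole call, an agent may want to check its edges locally before sending them. The sketch below is a client-side approximation, not the server’s implementation: `unresolved_references` is a hypothetical helper that assumes components are single-key wrappers whose inner hash carries `id`.

```python
def unresolved_references(components, edges):
    """List edge endpoints that match no component id, in first-seen order."""
    known = {
        inner.get("id")
        for wrapper in components
        for inner in wrapper.values()
    }
    missing = []
    for edge in edges:
        for end in ("source", "target"):
            ref = edge.get(end)
            if ref not in known and ref not in missing:
                missing.append(ref)
    return missing


components = [
    {"database_source_component": {"id": "component-a1"}},
    {"database_destination_component": {"id": "component-b2"}},  # placeholder wrapper key
]
good = [{"source": "component-a1", "target": "component-b2"}]
bad = [{"source": "component-a1", "target": "component-zz"}]
```

An empty result means every endpoint resolves; anything else is worth fixing before calling `update_package_edges`.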
remove_package_components (mutation)
Removes one or more components from an existing package’s flow by component `name`. Any edge whose source or target was a removed component is dropped at the same time, so the package never ends up with dangling edges.
| Argument | Type | Required | Notes |
|---|---|---|---|
| `package_id` | integer | Yes | Package to update. |
| `component_names` | array | Yes | Inner `name` values from the package’s components. The same key `update_package_components` targets. |
To add components, use `add_package_components`. To edit one in place, use `update_package_components`. To archive the entire package, use `delete_package`.
Returns { package_id, components_removed, edges_removed, new_version, updated_at } on success.
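The removal semantics described above (match by inner `name`, drop touching edges, reject the whole call on an unknown name) can be modeled locally as follows. This is an illustrative model for reasoning about the result, not the server’s code; `remove_components` is a hypothetical helper.

```python
def remove_components(components, edges, component_names):
    """Return (kept_components, kept_edges) after removing components by name."""
    names = set(component_names)
    kept, removed_ids, found = [], set(), set()
    for wrapper in components:
        inner = next(iter(wrapper.values()))  # single-key wrapper
        if inner.get("name") in names:
            removed_ids.add(inner.get("id"))
            found.add(inner["name"])
        else:
            kept.append(wrapper)
    missing = names - found
    if missing:
        # All-or-nothing: one unknown name rejects the whole call.
        raise ValueError(f"component(s) not found in flow: {sorted(missing)}")
    kept_edges = [
        e for e in edges
        if e["source"] not in removed_ids and e["target"] not in removed_ids
    ]
    return kept, kept_edges


flow_components = [
    {"database_source_component": {"id": "c1", "name": "src"}},
    {"select_component": {"id": "c2", "name": "pick"}},  # placeholder wrapper key
    {"database_destination_component": {"id": "c3", "name": "dst"}},  # placeholder
]
flow_edges = [{"source": "c1", "target": "c2"}, {"source": "c2", "target": "c3"}]
kept_components, kept_edges = remove_components(flow_components, flow_edges, ["pick"])
```

Note that removing a middle component leaves its neighbors disconnected, so a follow-up `update_package_edges` call is usually needed to rewire them.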
rename_package (mutation)
Renames a package and, optionally, updates its description. Metadata-only: `data_flow_json` (components and edges) is left untouched.
| Argument | Type | Required | Notes |
|---|---|---|---|
| `package_id` | integer | Yes | Package to rename. |
| `name` | string | Yes | New package name (3–128 characters). |
| `description` | string | No | Free-form, up to 1024 characters. |

Returns `{ package_id, previous_name, name, updated_at }` on success.
manage_package_variables (mutation)
Defines, updates, or removes a package’s public variables. Public variables are the named defaults a package exposes; per-run overrides are still passed through `run_package`’s `variables` argument.
| Argument | Type | Required | Notes |
|---|---|---|---|
| `package_id` | integer | Yes | Package to update. |
| `set` | object | At least one of `set` or `remove` | Hash of `{ variable_name => default_value }` to add or update. Values are stored as given; for Pig string literals, follow the embedded single-quote rule used by `run_package`. |
| `remove` | array | At least one of `set` or `remove` | Variable names to delete. |
Avoid passing a sensitive value here: `set` would store it as a plaintext public variable.
Returns { package_id, variables } (the full resulting public-variable map) on success.
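A small sketch of how an agent might assemble the arguments while enforcing the "at least one of `set` or `remove`" rule before calling the tool. `variables_args` is a hypothetical helper, and the variable names and values are placeholders.

```python
def variables_args(package_id, set_vars=None, remove=None):
    """Build manage_package_variables arguments, rejecting an empty request."""
    args = {"package_id": package_id}
    if set_vars:
        args["set"] = dict(set_vars)
    if remove:
        args["remove"] = list(remove)
    if "set" not in args and "remove" not in args:
        # Mirrors the documented "nothing to do" rejection, checked client-side.
        raise ValueError("provide set (hash) and/or remove (array)")
    return args


args = variables_args(
    42,                            # placeholder package id
    set_vars={"batch_size": "100"},  # placeholder variable and default
    remove=["legacy_flag"],          # placeholder variable name
)
```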
discover_file_schema (read)
Detects the columns of a delimited file (CSV, TSV, and similar) on a cloud-storage or SFTP connection. This is the file-source equivalent of `discover_schema`, which only handles database connections. Use it to learn a file’s columns before wiring a Select component or destination.
| Argument | Type | Required | Notes |
|---|---|---|---|
| `connection_id` | integer | Yes | A cloud-storage or SFTP connection id from `list_connections`. |
| `path` | string | Yes | File path on the connection (for example `/data/customers.csv`). |
| `delimiter` | string | No | Field delimiter. Defaults to `,`; use `\t` for TSV. |
| `header_row` | boolean | No | `true` if the first row holds column names (default `true`). |
| `record_type` | string | No | `delimited` (default) for CSV/TSV. |
| `record_delimiter` | string | No | Defaults to `new_line`. |
| `char_encoding` | string | No | Defaults to `utf-8`. |
| `bucket` | string | No | Container or bucket. Defaults to `""`, which is correct for SFTP since the file location lives in `path`. |
| `lines` | integer | No | Sample size the importer reads (default 20, max 200). |
| `quote` | string | No | Optional CSV quote character. |
| `escape` | string | No | Optional CSV escape character. |

Returns `{ connection_id, connection_type, path, field_count, column_names, fields }`. Use `column_names` to wire a Select. Copy `fields` into the source component’s `schema.fields`. The importer’s raw field objects are passed through unchanged, so nothing is lost in translation.
Each call hits the schema importer over the network and reads a sample of the actual file, so use it sparingly.
Example: detect columns on an SFTP file
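The following sketch shows one way to shape the call arguments and then copy the discovered `fields` into a source component. It is an assumption-laden illustration: the connection id is a placeholder, `apply_discovered_schema` is a hypothetical helper, and the sample result stands in for real tool output (the exact keys inside each field object are not specified here).

```python
discover_args = {
    "connection_id": 47376,         # placeholder SFTP connection id
    "path": "/data/customers.csv",
    "delimiter": ",",
    "header_row": True,
    "lines": 50,                    # sample size, within the documented max of 200
}


def apply_discovered_schema(inner_component, result):
    """Copy discover_file_schema fields into a component's schema.fields."""
    if "error" in result:
        raise RuntimeError(result["error"])
    updated = dict(inner_component)
    updated["schema"] = {"fields": list(result["fields"])}  # passed through unchanged
    return updated


sample_result = {  # stand-in for a real response
    "connection_id": 47376,
    "connection_type": "sftp",
    "path": "/data/customers.csv",
    "field_count": 2,
    "column_names": ["id", "email"],
    "fields": [
        {"name": "id", "alias": "id", "data_type": "string"},
        {"name": "email", "alias": "email", "data_type": "string"},
    ],
}
source_inner = apply_discovered_schema({"name": "sftp_customers"}, sample_result)
```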
delete_package (mutation)
Archives a package, with the same semantics as the dashboard’s Archive action. The Job row is preserved with status `archived`, is removed from the active package list, and remains queryable via `list_packages(status: 'archived')`.
| Argument | Type | Required |
|---|---|---|
| `package_id` | integer | Yes |
Archiving is rejected while an active schedule still references the package; see `toggle_schedule(enabled: false)`.
Use this to clean up after an unsalvageable create_package attempt so failed iterations don’t accumulate as dead rows.
Returns { package_id, status: 'archived', archived_at } on success, or { error: "..." } for package-not-found or rejected transitions.
Recommended agent flow
Example: create a minimal SFTP-to-S3 package
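A minimal sketch of what the `create_package` arguments might look like for this case, written as a Python dict. Several names here are assumptions and should be replaced with values from a working template: both wrapper keys, the destination `specificComponentType`, the `path` field name, the connection ids and type slugs, and the `data_type` values are all hypothetical. Only the common fields (`id`, `name`, `xy`, `alias`, `input_alias`, `connection`, `cloud_storage_connection_id`, `schema`) and `sftp_source_component` come from the reference above.

```python
SFTP_CONNECTION_ID = 47376  # placeholder
S3_CONNECTION_ID = 47377    # placeholder

fields = [
    {"name": "id", "alias": "id", "data_type": "string"},
    {"name": "email", "alias": "email", "data_type": "string"},
]

create_package_args = {
    "name": "sftp_to_s3_customers",
    "flow_type": "dataflow",
    "components": [
        {
            "cloud_storage_source_component": {  # hypothetical wrapper key
                "id": "component-1a2b",
                "name": "sftp_customers",
                "xy": [100, 100],
                "alias": "customers",
                "connection": {"id": SFTP_CONNECTION_ID, "name": "Partner SFTP", "type": "sftp"},
                "cloud_storage_connection_id": str(SFTP_CONNECTION_ID),
                "specificComponentType": "sftp_source_component",
                "path": "/data/customers.csv",  # hypothetical field name
                "schema": {"fields": fields},
            }
        },
        {
            "cloud_storage_destination_component": {  # hypothetical wrapper key
                "id": "component-3c4d",
                "name": "s3_customers",
                "xy": [400, 100],
                "input_alias": "customers",  # consumes the source alias
                "connection": {"id": S3_CONNECTION_ID, "name": "Data Lake", "type": "s3"},
                "cloud_storage_connection_id": str(S3_CONNECTION_ID),
                "specificComponentType": "s3_destination_component",  # hypothetical
            }
        },
    ],
    "edges": [{"source": "component-1a2b", "target": "component-3c4d"}],
}
```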
The response includes the new `package_id`. Pass it to `validate_package` next, fix any errors with `update_package_components` or `update_package_edges`, then run with `run_package`.
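The create, validate, fix, run sequence can be sketched as a small driver loop. Everything about the transport is assumed: `call_tool` is a hypothetical callable that sends a tool name plus arguments and returns the decoded result, `fix_errors` is a caller-supplied callback, and the `errors` key on the `validate_package` result is a guess at its shape. Archiving after repeated failures reflects the cleanup use of `delete_package` described earlier.

```python
def author_and_run(call_tool, create_args, fix_errors, max_attempts=3):
    """Create a package, validate and repair it, then run or archive it."""
    created = call_tool("create_package", create_args)
    if "error" in created:
        return created  # expected, recoverable failure: nothing was written
    package_id = created["package_id"]

    for _ in range(max_attempts):
        report = call_tool("validate_package", {"package_id": package_id})
        errors = report.get("errors", [])  # assumed result shape
        if not errors:
            return call_tool("run_package", {"package_id": package_id})
        # Caller decides between update_package_components and update_package_edges.
        fix_errors(package_id, errors)

    # Unsalvageable: archive so failed iterations do not pile up.
    return call_tool("delete_package", {"package_id": package_id})
```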
Error envelopes
Mutating tools return a plain `{ "error": "..." }` object instead of a JSON-RPC error when the failure is expected and recoverable. Common cases:
| Tool | Envelope | Cause |
|---|---|---|
| `create_package` | workspace not found in this account | `workspace_id` belongs to another account. |
| `create_package` | record-validation message | Name too long, invalid `flow_type`, save failure. |
| `update_package_edges` | unresolved component references: ... | Edge references a component id that isn’t in the package. |
| `delete_package` | package not found | `package_id` doesn’t belong to the calling account. |
| `delete_package` | archive transition rejection | An active schedule still references the package. |
| `remove_package_components` | component(s) not found in flow: ... | A name in `component_names` doesn’t match any current component. The whole call is rejected. |
| `rename_package` | record-validation message | `name` is too short, too long, or otherwise rejected by the model. |
| `manage_package_variables` | `` provide `set` (hash) and/or `remove` (array) — nothing to do `` | Both arguments were missing or empty. |
| `discover_file_schema` | connection not found, or not a cloud-storage/SFTP connection | The id belongs to a database connection (use `discover_schema` instead) or is from another account. |
| `discover_file_schema` | schema-importer error: ... | The importer couldn’t read the file (auth failure, missing file, malformed delimiter). Fix the connection or arguments and retry. |
| `describe_component_type` | unknown component type: <name> | `type` argument doesn’t match any registered component. |
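Since these failures arrive as ordinary results rather than protocol errors, a client has to inspect each result itself. The sketch below shows one way to do that; `ToolError` and `unwrap` are hypothetical names, and the only assumption about the result is the `error` key described above.

```python
class ToolError(Exception):
    """Raised when a tool result carries an error envelope."""


def unwrap(result):
    """Return the result unchanged, or raise if it is an error envelope."""
    if isinstance(result, dict) and "error" in result:
        raise ToolError(result["error"])
    return result


ok = unwrap({"package_id": 12, "status": "archived"})  # placeholder success result
```

Raising keeps the recoverable message available to the agent, which can then adjust its arguments and retry.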
Audit trail
Every write performed through these tools is captured in your account’s audit history via PaperTrail. Package creates, edge updates, and archive transitions all record the calling user as the author. Component edits made by `update_package_components` are versioned and reversible through the existing package version history UI.