# ETL: Gmail Source

> Configure the Gmail source component in Integrate.io ETL to ingest CSV or Excel email attachments using OAuth into your data pipelines.

Use the Google Mail (Gmail) source component to read CSV or Excel file attachments from a Gmail inbox and ingest them into your [Integrate.io](http://integrate.io/) ETL pipeline.

## Connection

Select an existing Google Mail (Gmail) connection or create a new one. For setup instructions, see **Allowing Integrate.io ETL access to my Google Mail (Gmail)**.

## Source Properties

The source component is configured in Step 02 of the component editor.

<Frame>
  <img src="https://mintcdn.com/integrateio/vPmzup7uAj66abcx/images/creating-packages/using-components-google-mail-gmail-source/image-1.webp?fit=max&auto=format&n=vPmzup7uAj66abcx&q=85&s=70be5362794b8af38a4abfbf2b357383" alt="Gmail source component configuration" width="1200" height="1250" data-path="images/creating-packages/using-components-google-mail-gmail-source/image-1.webp" />
</Frame>

### Gmail Query

Enter a Gmail search query to filter which emails are fetched. This field supports all standard [Gmail search operators](https://support.google.com/mail/answer/7190).

Examples:

* `from:supplier@example.com has:attachment filename:*.csv`: CSV attachments from a specific sender
* `from:@example.com has:attachment filename:*.csv`: CSV attachments from any sender at a domain
* `subject:"monthly report" has:attachment filename:*.csv`: Emails with a specific subject
* `{from:alice@example.com from:bob@example.com} has:attachment filename:*.csv`: OR logic across multiple senders

<Tip>A space between operators means AND, so all conditions must match. Use `OR` (uppercase) or `{}` for OR logic between values of the same operator.</Tip>
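
As a minimal sketch of how the AND and OR rules above combine, the following Python helper assembles a query string from a list of senders and additional terms. The function name `build_gmail_query` and its parameters are hypothetical and are not part of Integrate.io ETL; the resulting string is what you would paste into the **Gmail Query** field.

```python
def build_gmail_query(senders, extra_terms=()):
    """Build a Gmail search query (hypothetical helper, for illustration only).

    Senders are combined with OR logic using {} grouping; the sender group
    and every extra term are joined with spaces, which Gmail treats as AND.
    """
    if not senders:
        raise ValueError("at least one sender is required")
    sender_terms = [f"from:{s}" for s in senders]
    if len(sender_terms) == 1:
        # A single sender needs no grouping
        sender_part = sender_terms[0]
    else:
        # Terms inside {} are ORed together
        sender_part = "{" + " ".join(sender_terms) + "}"
    return " ".join([sender_part, *extra_terms])


query = build_gmail_query(
    ["alice@example.com", "bob@example.com"],
    ["has:attachment", "filename:*.csv"],
)
print(query)
# {from:alice@example.com from:bob@example.com} has:attachment filename:*.csv
```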

### File Type

Select the format of the attachment files to ingest:

* **CSV.** Comma-separated values
* **Excel.** `.xlsx` / `.xls` spreadsheet files

### File Contains a Header Row

Check this box if the first row of the file contains column headers. When enabled, the connector uses the header row to name the schema fields. This is checked by default.

### Load Type

Select how records are loaded on each pipeline run:

* **Full Load.** Fetches all emails matching the Gmail query on every run.
* **Incremental Load.** Fetches only emails received after a reference date. At runtime, the connector appends an `after:YYYY/MM/DD` operator to your Gmail query so that the Gmail API returns only new emails, which keeps API usage low and execution fast.
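
To illustrate the effect of Incremental Load on the query, here is a rough Python sketch of appending an `after:` operator to an existing query. The connector's actual implementation is not documented here; the function name `with_after_operator` is an assumption used only for illustration.

```python
from datetime import date


def with_after_operator(query: str, reference: date) -> str:
    """Append an after:YYYY/MM/DD term to a Gmail query (illustrative only)."""
    # A space between terms means AND, so the date filter narrows the query
    return f"{query} after:{reference.strftime('%Y/%m/%d')}"


effective_query = with_after_operator(
    "from:supplier@example.com has:attachment filename:*.csv",
    date(2026, 3, 10),
)
print(effective_query)
# from:supplier@example.com has:attachment filename:*.csv after:2026/03/10
```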

### Incremental Load Settings

When **Incremental Load** is selected, the following options appear:

**Load records.** Select the filter condition:

* `newer than`: Fetch emails received after the reference date

**Reference date.** Choose the source of the date value:

* **Fixed Date.** Select a specific calendar date using the date picker. Use this for a one-time historical backfill.
* **Variable.** Use a system or custom variable as the reference date. The recommended value for scheduled pipelines is `$package_last_successful_job_submission_timestamp`, which automatically advances the start date after each successful run.

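<Warning>**Timezone note**: The Gmail `after:` operator interprets dates in **PST/PDT**, not UTC. If your variable is UTC-based, consider subtracting one day from the reference date as a buffer so that emails received near the date boundary are not missed. Rows re-ingested because of the overlap can be identified with the metadata columns described under Schema.</Warning>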
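
The date arithmetic behind the one-day buffer can be sketched in Python as follows. This is not Integrate.io expression syntax; the function name `buffered_reference_date` and the sample timestamp are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def buffered_reference_date(last_run_utc: datetime, buffer_days: int = 1) -> str:
    """Return a YYYY/MM/DD date shifted back by a buffer (illustrative only)."""
    if last_run_utc.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Normalize to UTC, then step back so the Pacific-time boundary is covered
    shifted = last_run_utc.astimezone(timezone.utc) - timedelta(days=buffer_days)
    return shifted.strftime("%Y/%m/%d")


# 02:30 UTC on March 10 is still the evening of March 9 in Pacific time,
# so an unbuffered "after:2026/03/10" could skip emails from that evening.
last_run = datetime(2026, 3, 10, 2, 30, tzinfo=timezone.utc)
print(buffered_reference_date(last_run))
# 2026/03/09
```

The trade-off is that each run re-reads roughly one extra day of emails, so some rows may be loaded more than once.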

## Schema

After configuring the source properties, the **Schema** section (Step 03) displays the fields available in the pipeline. These are derived from the header row of the first matching email attachment.

In addition to the columns from the file itself, the connector automatically appends the following **metadata columns** to every row:

| Column               | Description                                 | Example                                                   |
| -------------------- | ------------------------------------------- | --------------------------------------------------------- |
| email\_message\_id   | Gmail message ID of the source email        | 18d4f2e3a7b1c9d0                                          |
| email\_date          | Date the email was received (ISO 8601, UTC) | 2026-03-10T14:30:00Z                                      |
| attachment\_filename | Original filename of the attachment         | sales\_report\_march.csv                                  |
| email\_from          | Sender email address                        | supplier@example.com                                      |
| email\_to            | Recipient email address                     | reports@yourcompany.com                                   |
| email\_subject       | Subject line of the email                   | Monthly Sales Report                                      |
| email\_body          | Plain-text body of the email                | Please find the report attached.                          |

These metadata columns allow you to trace each row back to its source email and file, which is useful for deduplication and auditing downstream in your pipeline.
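
One possible way to use these columns for deduplication is to skip rows from attachments that were already loaded in an earlier run. The Python sketch below assumes rows are available as dictionaries keyed by column name and that you keep a set of previously loaded `(email_message_id, attachment_filename)` pairs; the function name `drop_already_loaded` is hypothetical, and in practice this logic would usually live in a downstream transformation or in the destination.

```python
def drop_already_loaded(rows, loaded_keys):
    """Keep only rows whose source attachment has not been loaded before."""
    fresh = []
    for row in rows:
        # One attachment is identified by its message ID plus its filename
        key = (row["email_message_id"], row["attachment_filename"])
        if key not in loaded_keys:
            fresh.append(row)
    return fresh


# Attachments loaded by a previous run (sample values, for illustration)
loaded = {("18d4f2e3a7b1c9d0", "sales_report_march.csv")}

rows = [
    {"email_message_id": "18d4f2e3a7b1c9d0",
     "attachment_filename": "sales_report_march.csv", "amount": "100"},
    {"email_message_id": "18d4f2e3a7b1c9d1",
     "attachment_filename": "sales_report_april.csv", "amount": "250"},
]

new_rows = drop_already_loaded(rows, loaded)
print([r["attachment_filename"] for r in new_rows])
# ['sales_report_april.csv']
```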
