# ETL: MongoDB Source

> Configure the MongoDB source component to read data from MongoDB collections and run queries in your Integrate.io ETL data pipeline.

## Connection Setup

<Frame>
  <iframe className="w-full aspect-video rounded-xl" src="https://fast.wistia.com/embed/iframe/3sv32gj84l" title="Allowing Integrate.io ETL access to MongoDB" allow="autoplay; fullscreen" allowFullScreen />
</Frame>

Integrate.io ETL can access [your MongoDB](https://www.integrate.io/blog/mongodb-etl/) on a variety of services and deployment models. This article explains how to give Integrate.io ETL access to your MongoDB deployment and then how to create the MongoDB connection in Integrate.io ETL.

Integrate.io ETL needs access to your MongoDB deployment. If the deployment is behind a firewall:

* Create a MongoDB user and grant it the minimum permissions Integrate.io ETL needs to read data from or write data to the database.
* Allow access from [Integrate.io ETL's IP addresses](/etl/integrateio-etls-ip-list/) to MongoDB's port. (Refer to [this article](/etl/allowing-integrateio-etl-access-to-my-server-behind-a-firewall/) if you'd prefer to create a reverse SSH tunnel.)
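
As a rough illustration of "minimum permissions", the sketch below builds a MongoDB `createUser` command document that grants a single built-in role (`read` or `readWrite`) on one database. The user name, password, and database name are hypothetical, and the helper `build_create_user_command` is not part of Integrate.io ETL; it only shows the shape of the command.

```python
def build_create_user_command(username: str, password: str, database: str,
                              read_only: bool = True) -> dict:
    """Build a MongoDB `createUser` command document with one built-in role."""
    # `read` and `readWrite` are MongoDB built-in roles. Choose the narrower
    # one when the pipeline only reads from this database.
    role = "read" if read_only else "readWrite"
    return {
        "createUser": username,  # the command name must be the first key
        "pwd": password,
        "roles": [{"role": role, "db": database}],
    }


# Hypothetical values, for illustration only.
command = build_create_user_command("etl_reader", "change-me", "sales")
print(command)
```

A database administrator would run a command of this shape against the database in question, for example through a driver or the MongoDB shell.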

## To create a MongoDB connection in Integrate.io ETL

<Steps>
  <Step>
    Click the Connections icon (lightning bolt) on the top left menu.
  </Step>

  <Step>
    To create a connection, click **New connection**.

    <Frame>
      <img src="https://mintcdn.com/integrateio/K1OxIkBgHF64pvnH/images/connectivity-and-security/image-37.webp?fit=max&auto=format&n=K1OxIkBgHF64pvnH&q=85&s=5351ab90b68775ae186feffdc6feab46" alt="Connections page with New connection button" width="1200" height="830" data-path="images/connectivity-and-security/image-37.webp" />
    </Frame>
  </Step>

  <Step>
    Select MongoDB.

    <Frame>
      <img src="https://mintcdn.com/integrateio/K1OxIkBgHF64pvnH/images/connectivity-and-security/image-38.webp?fit=max&auto=format&n=K1OxIkBgHF64pvnH&q=85&s=4333c5a664687f97a4544d96e0372b0f" alt="Selecting MongoDB from the connection type list" width="1200" height="830" data-path="images/connectivity-and-security/image-38.webp" />
    </Frame>
  </Step>

  <Step>
    In the new MongoDB connection window, name the connection and enter the connection information.

    <Frame>
      <img src="https://mintcdn.com/integrateio/K1OxIkBgHF64pvnH/images/connectivity-and-security/image-39.webp?fit=max&auto=format&n=K1OxIkBgHF64pvnH&q=85&s=b1e7369d620dba56238b22fe4306f1b1" alt="MongoDB connection form with hostname, port, and database fields" width="1200" height="830" data-path="images/connectivity-and-security/image-39.webp" />
    </Frame>

    * **Name** - name for the new connection
    * **User name** - database user name
    * **Password** - database user's password
    * **Hostname** - name of the host to connect to
    * **Port** - TCP port to connect to. Integrate.io ETL must be allowed to reach this port on the specified host.
    * **Read Preference** - determines how the connection routes read operations to the members of a replica set
    * **Connection Scheme**
      * DNS Seed List (SRV) - connects via the mongodb+srv:// syntax (see the [DNS seed list connection format](https://www.mongodb.com/docs/manual/reference/connection-string/#dns-seed-list-connection-format))
      * Replica Set Members - connects via the mongodb:// syntax. Specify the hostnames of the replica set members manually.
    * **Database** - name of the database to use
    * **Authentication Database** - name of the database to use for authentication. Leave empty to use the default database.
    * **Connect using SSL** - determines whether to connect to the database using SSL. SSL encrypts client/server communications for increased security. *Always check SSL if connecting to MongoDB Atlas.*
  </Step>

  <Step>
    To test the MongoDB connection, click **Test connection**. The test can fail even when a job is able to use the connection, because the cluster that executes the job may have access to the database while the Integrate.io ETL web application does not.
  </Step>
</Steps>
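
To make the two connection schemes more concrete, the sketch below assembles a MongoDB connection string from values like those in the form. It is an assumption about how such fields map onto the standard connection string format, not a description of what Integrate.io ETL does internally. The function `build_mongodb_uri` and all host names and credentials are hypothetical; `authSource`, `readPreference`, and `tls` are standard MongoDB connection string options.

```python
from urllib.parse import quote, urlencode


def build_mongodb_uri(user, password, hosts, database, *, srv=False, port=27017,
                      auth_database=None, read_preference=None, ssl=False):
    """Assemble a MongoDB connection string from connection-form style values."""
    # Percent-encode credentials so characters such as "@" or ":" stay valid.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    if srv:
        # DNS Seed List (SRV): exactly one hostname and no port.
        if len(hosts) != 1:
            raise ValueError("the SRV format takes exactly one hostname")
        scheme, host_part = "mongodb+srv", hosts[0]
    else:
        # Replica Set Members: every member is listed explicitly with its port.
        scheme = "mongodb"
        host_part = ",".join(f"{host}:{port}" for host in hosts)
    options = {}
    if auth_database:
        options["authSource"] = auth_database
    if read_preference:
        options["readPreference"] = read_preference
    if ssl:
        options["tls"] = "true"
    query = f"?{urlencode(options)}" if options else ""
    return f"{scheme}://{credentials}@{host_part}/{database}{query}"


# Hypothetical hosts and credentials, for illustration only.
srv_uri = build_mongodb_uri("etl", "p@ss", ["cluster0.example.net"], "sales",
                            srv=True, ssl=True)
members_uri = build_mongodb_uri("etl", "secret",
                                ["a.example.net", "b.example.net"], "sales",
                                auth_database="admin",
                                read_preference="secondaryPreferred")
print(srv_uri)
print(members_uri)
```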

## To modify MongoDB connections in Integrate.io ETL

<Steps>
  <Step>
    Click the Connections icon (lightning bolt) on the top left menu.
  </Step>

  <Step>
    Click a connection to open and modify it. Make any necessary changes, then click **Test connection** and **Save changes**. To exit the MongoDB connection window without saving changes, click **Back to connections** (the grey tab on the left side of the window).
  </Step>

  <Step>
    To delete a MongoDB connection, click on the three vertical dots on the far right of the connection listing and select the Delete connection option.

    <Frame>
      <img src="https://mintcdn.com/integrateio/K1OxIkBgHF64pvnH/images/connectivity-and-security/image-40.webp?fit=max&auto=format&n=K1OxIkBgHF64pvnH&q=85&s=124dacaa36390b519560619247c69997" alt="Delete connection option for MongoDB" width="1200" height="830" data-path="images/connectivity-and-security/image-40.webp" />
    </Frame>
  </Step>
</Steps>

<Note>
  For information on connecting to MongoDB Atlas, see the [MongoDB Atlas connection documentation](https://www.mongodb.com/docs/atlas/connect-to-database-deployment/).
</Note>

***

Use the MongoDB source component to read data stored in a MongoDB collection.

<Frame>
  <img src="https://mintcdn.com/integrateio/OwEKdS5aIKsEcmhX/images/creating-packages/using-components-mongodb-source/image-1.png?fit=max&auto=format&n=OwEKdS5aIKsEcmhX&q=85&s=f1055cb7d623b7eb9578c13c3dd6eb9d" alt="MongoDB source component in the pipeline designer" width="1200" height="828" data-path="images/creating-packages/using-components-mongodb-source/image-1.png" />
</Frame>

## Connection

Select an existing MongoDB connection or create a new one. For more information, see [Allowing Integrate.io ETL access to MongoDB](/etl/allowing-integrateio-etl-access-to-mongodb/).

## Source Properties

<Frame>
  <img src="https://mintcdn.com/integrateio/OwEKdS5aIKsEcmhX/images/creating-packages/using-components-mongodb-source/image-2.png?fit=max&auto=format&n=OwEKdS5aIKsEcmhX&q=85&s=fba8bd5f04578d6f874c82b6e5fb554d" alt="MongoDB source properties with collection and filter query fields" width="991" height="547" data-path="images/creating-packages/using-components-mongodb-source/image-2.png" />
</Frame>

* **Source collection** - the collection name from which the data will be imported.
* **Filter query** - use [MongoDB extended JSON](https://www.mongodb.com/docs/manual/reference/mongodb-extended-json/) to apply a filter on the MongoDB server side, or leave empty to query the entire collection. Because \$ is a special character that denotes a variable, escape each \$ in your extended JSON filter with a single backslash. For example:
  * `{"age":{"\$gt":24}}` - extract all documents where age is greater than 24.
  * `{"\$or":[{"price":{"\$exists":false}},{"price":{"\$eq":0}}]}` - extract all documents where price is zero or does not exist.
  * `{"timestamp":{"\$gt":{"\$date":"2014-01-01T00:00:00.000Z"}}}` - extract all documents where timestamp is later than 2014-01-01T00:00:00.000Z.
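
The escaping rule above is easy to get wrong by hand. The following sketch, which assumes you keep your filters as ordinary Python dictionaries, serializes a filter to compact JSON and adds the backslash before every `$`. The helper name `to_filter_query` is hypothetical. Note that it escapes every `$`, so it is not suitable for filters that should contain a package variable.

```python
import json


def to_filter_query(filter_doc: dict) -> str:
    """Serialize a filter and escape each "$" with a single backslash."""
    compact = json.dumps(filter_doc, separators=(",", ":"))
    # Every "$" is escaped, so operators such as $gt are not read as variables.
    return compact.replace("$", "\\$")


age_filter = to_filter_query({"age": {"$gt": 24}})
price_filter = to_filter_query(
    {"$or": [{"price": {"$exists": False}}, {"price": {"$eq": 0}}]}
)
print(age_filter)
print(price_filter)
```

The printed strings can be pasted into the **Filter query** field as-is.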

## Source Schema

<Frame>
  <img src="https://mintcdn.com/integrateio/OwEKdS5aIKsEcmhX/images/creating-packages/using-components-mongodb-source/image-3.png?fit=max&auto=format&n=OwEKdS5aIKsEcmhX&q=85&s=969e5822a33c216ea76a5e0fc89ccc40" alt="MongoDB source schema with field selection and data types" width="1200" height="1157" data-path="images/creating-packages/using-components-mongodb-source/image-3.png" />
</Frame>

After defining the source collection, select the fields to use in the source.

The fields you select are the only ones pulled from the source collection.

Define the data type for each field. Use the following table to match MongoDB data types to Integrate.io ETL data types.

| **MongoDB**    | **Integrate.io ETL** |
| :------------- | :------------------- |
| String         | String               |
| 32 Bit Integer | Integer              |
| 64 Bit Integer | Long                 |
| Double         | Double               |
| Date           | DateTime             |
| Object         | Json                 |
| Array          | Json Array           |
| Boolean        | Boolean              |
| ObjectID       | String               |

## Reading data incrementally from MongoDB

To read data incrementally (changes and additions) from a collection, we need a timestamp field that records when each document was updated (or inserted, in collections where data is only inserted). In our example, this field is called `updated_at`. When reading from the source collection, we'll use a **filter query** to read only the documents that were updated since the last time the package executed. In the following **filter query**, `$last_updated_at` is a package variable, so its `$` is left unescaped; every other `$` is escaped with a single backslash, as described above:

`{"updated_at": {"\$gt":{"\$date": $last_updated_at}}}`

<Frame>
  <img src="https://mintcdn.com/integrateio/OwEKdS5aIKsEcmhX/images/creating-packages/using-components-mongodb-source/image-4.png?fit=max&auto=format&n=OwEKdS5aIKsEcmhX&q=85&s=45df2ab25ad2a828ad87621196bacbdb" alt="Filter query with date variable for incremental loading" width="1032" height="797" data-path="images/creating-packages/using-components-mongodb-source/image-4.png" />
</Frame>

Note that schema detection and data preview fail when the **filter query** contains a variable, because variables are not evaluated at design time.

To set the variable's value, you can use the predefined variable `_PACKAGE_LAST_SUCCESSFUL_JOB_SUBMISSION_TIMESTAMP`, which returns the submission timestamp of the package's last successful execution, as in the example below. Alternatively, use the ExecuteSqlDatetime function to run a query on the target database that returns the maximum (latest) value of `updated_at` in the target. Wrap the variable in a CASE statement to handle empty values and to allow a full load when required. Then convert the datetime to a Unix timestamp and multiply it by 1000 to get a Unix timestamp in milliseconds, which is what MongoDB uses.

| **Variable name** | **Expression**                                                                                                                                                                                                           |
| :---------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| full\_load        | `0`                                                                                                                                                                                                                      |
| last\_updated\_at | `CASE WHEN (COALESCE($_PACKAGE_LAST_SUCCESSFUL_JOB_SUBMISSION_TIMESTAMP,'')=='' OR $full_load==1) THEN 0 ELSE ToUnixTime(ToDate($_PACKAGE_LAST_SUCCESSFUL_JOB_SUBMISSION_TIMESTAMP)) * 1000 END` |

<Frame>
  <img src="https://mintcdn.com/integrateio/OwEKdS5aIKsEcmhX/images/creating-packages/using-components-mongodb-source/image-5.png?fit=max&auto=format&n=OwEKdS5aIKsEcmhX&q=85&s=3376ecec5c4eda93e709b2704d667ebd" alt="Package variables for full load flag and last updated timestamp" width="1029" height="322" data-path="images/creating-packages/using-components-mongodb-source/image-5.png" />
</Frame>
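
To check the arithmetic outside the platform, the sketch below mirrors the logic of the `last_updated_at` expression in Python: an empty timestamp or a full-load flag yields 0, otherwise the timestamp is converted to Unix seconds and multiplied by 1000. It assumes the timestamp arrives as an ISO 8601 string in UTC, which the source does not confirm. The function names are hypothetical, and the second function shows the filter as it might look after variable substitution, with the backslash escapes removed.

```python
from datetime import datetime, timezone
from typing import Optional


def last_updated_at_ms(last_success: Optional[str], full_load: int = 0) -> int:
    """Mirror the CASE expression: 0 for a full load, else Unix milliseconds."""
    if not last_success or full_load == 1:
        return 0
    # Assumption: ISO 8601 input; a trailing "Z" is rewritten as a UTC offset.
    parsed = datetime.fromisoformat(last_success.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # treat naive values as UTC
    return int(parsed.timestamp()) * 1000  # whole seconds, then milliseconds


def incremental_filter(milliseconds: int) -> str:
    """Show the filter query once the package variable has been substituted."""
    return '{"updated_at": {"$gt":{"$date": %d}}}' % milliseconds


threshold = last_updated_at_ms("2014-01-01T00:00:00Z")
print(threshold)
print(incremental_filter(threshold))
```

With a threshold of 0, the filter matches every document whose `updated_at` is later than the Unix epoch, which is how the expression produces a full load.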

To store additions or changes in your database destination, mark the id column as a key and change the operation type to "merge".
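
The effect of a keyed merge can be pictured with a small in-memory model. The sketch below is a simplification for illustration, not the destination component's implementation: rows whose key already exists in the target replace the stored row, and rows with a new key are inserted. The function `merge_rows` and the sample rows are hypothetical.

```python
def merge_rows(target: dict, incoming: list, key: str = "id") -> dict:
    """Upsert incoming rows into a copy of the target, matching on the key."""
    merged = dict(target)
    for row in incoming:
        # Existing key: the stored row is replaced. New key: the row is added.
        merged[row[key]] = row
    return merged


# Hypothetical rows, for illustration only.
target = {1: {"id": 1, "name": "old"}}
incoming = [{"id": 1, "name": "new"}, {"id": 2, "name": "added"}]
result = merge_rows(target, incoming)
print(result)
```

Without a key, the same incremental run would append the updated document as a second row instead of replacing the first.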

## Related

<CardGroup cols={2}>
  <Card title="MongoDB Destination" icon="arrow-right" href="/etl/using-components-mongodb-destination" horizontal />

  <Card title="Firewall Setup" icon="arrow-right" href="/etl/allowing-integrateio-etl-access-to-my-server-behind-a-firewall" horizontal />
</CardGroup>
