Integrating data from MongoDB to a relational database is a challenge for any data engineer. You have a schemaless NoSQL database containing JSON objects and an SQL database with a fully defined schema. How can you integrate them without tearing your hair out? With integrate.io’s no-code cloud-based MongoDB SecurETL solution, of course!

In this tutorial, updated for 2021, learn how integrate.io performs sentiment analysis by integrating data from MongoDB and a relational database. When you're done reading this guide, you'll have a much better understanding of how to process and prepare data from MongoDB for all your analytics needs. 

Read more: Which Modern Database is Right for Your Use Case?


Integrate MongoDB with relational databases now! Schedule a integrate.io demo.


How Does MongoDB Work?

MongoDB is one of the leading open-source NoSQL databases for storing big data. If you have used this database for a while, you might notice that it keeps data in collections that store one or more JSON files. That makes it difficult to move MongoDB data to a different location for analysis or another purpose. Thankfully, tools like integrate.io make data transformation between MongoDB and other sources simple. 

MongoDB has become incredibly popular with developers over the years who use this platform to build all kinds of scalable applications. However, many of these developers struggle with data integration because the process is so complex. 

Why Would You Integrate MongoDB With a Relational Database?

MongoDB is a NoSQL solution so doesn't require a relational database management system or RDBMS. However, there might be times when you want to integrate MongoDB with a relational database. For example, if you want to generate data visualizations about information from two disparate sources. You can use these visualizations to represent that information graphically, making the data easier to interpret and analyze. 

Integrating data from MongoDB with data from a relational database, however, can be a challenge. That's why you need a tool like integrate.io to facilitate the process. 

How Does integrate.io Integrate MongoDB With Relational Databases?

integrate.io's data integration platform is the perfect fit for MongoDB. This solution lets you connect to MongoDB data stores and integrate data with other sources using a point-and-click interface, making it easier than ever to process and prepare data for big data analytics. With integrate.io's SecurETL solution, you can generate valuable insights for business intelligence with very little code, ideal if you lack a large data engineering team. 

integrate.io integrates data from MongoDB data stores and relational databases via its native MongoDB connector and then can move that data to a final destination for analysis via the ETL process. Use this connector to integrate data from MongoDB and relational databases like:


Tutorial: Integrating Data From MongoDB to a Relational Database

In this tutorial, Xplenty carries out data transformation via a process called sentiment analysis. The native connector will pull data from a MongoDB database on the cloud database platform Compose (formerly MongoHQ) that stores a collection of tweets about big data. Then, the connector will integrate that data with a MySQL database that runs on Amazon RDS loaded with an open-source sentiment dictionary called SentiWordNet. The result? It suddenly becomes much easier to visualize the tweet data. 

As you can see below, MongoDB stores tweet as JSON objects:

{    "created_at":"Tue Jul 02 09:33:13 +0000 2013",  "id":351996975274668000,  "id_str":"351996975274668032",  "text":"@tomassino how is the taste of #mongodb and #hadoop for breakfast?",  "source":"<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>",  "truncated":false,  "in_reply_to_status_id":null,  "in_reply_to_status_id_str":null,  "in_reply_to_user_id":14224944,  "in_reply_to_user_id_str":"14224944",  "in_reply_to_screen_name":"tomassino",  "user":{      "id":183977468,    "id_str":"183977468",    "name":"Markus Schmidberger",    "screen_name":"cloudHPC",    "location":"Munich, Bavaria, Germany",   ... }

SentiWordNet is stored as a MySQL table:

 a  00001740  0.125  0  able#1  (usually followed by 'to') having the necessary means or skill or know-how or authority to do something; "able to swim"; "she was able to program her computer"; "we were at last able to buy a car"; "able to get a grant for the project"
 a  00002098  0  0.75  unable#1  (usually followed by 'to') not having the necessary means or skill or know-how; "unable to get to town without a car"; "unable to obtain funds"
 a  00002312  0  0  dorsal#2 abaxial#1  facing away from the axis of an organ or organism; "the abaxial surface of a leaf is the underside or side facing away from the stem"

Here's the schema used in this experiment:

  1. POS
  2. ID
  3. PosScore
  4. NegScore
  5. SynsetTerms
  6. Gloss

Tutorial: Integrating MongoDB and a Relational Database in 8 Simple Steps

The Xplenty dataflow loads data from both MongoDB (via Compose) and the MySQL database, split that data into keywords, joins all the data, calculates the sentiment score per tweet, and finally, stores the results. Xplenty's native connector does this in just eight steps:

thumbnail image

  1. mongo_source - loads Twitter data from MongoDB; in this case, from MongoHQ. Please see our previous post on how to connect Xplenty with MongoHQ.

    thumbnail image

  2. mongo_select - removes unnecessary fields and separates tweets into keywords using the Flatten(TOKENIZE(text)) functions.

    thumbnail image

  3. mysql_source - loads sentiment data from the relational database. The green button at the top right-hand side automatically detects the schema.

    thumbnail image

  4. mysql_select - removes irrelevant fields and cleans sentiment keywords using the  Flatten(TOKENIZE(REPLACE(synset_terms,'#\\d+',''))) functions. The REPLACE function removes hashtags and numbers from the keywords (e.g. changes ‘unable#1’ to ‘unable’). Since one database entry may contain several conjugations of the same keyword, the Flatten and TOKENIZE functions split the text per keyword.

    thumbnail image

  5. join_keywords - joins data from both sources by keyword. Left join is used so that keywords from tweets that are not available in the sentiment dictionary are still available following the join.

    thumbnail image

  6. aggregate_sentiment - calculates positive and negative scores per tweet. The user is kept only for reference.

    thumbnail image

  7. selectsentimentscore - calculates the final sentiment score by subtracting the negative score from the positive score.

    thumbnail image

  8. output_s3 - stores results on Xplenty’s Amazon S3 account. 

    thumbnail image

Tutorial: Results of Integrating MongoDB With a Relational Database

Below is a distribution chart of sentiments for the tweets used in this experiment. As you can see, the distribution evenly centers on the neutral score of 0, while sentiments proved to be slightly more positive than negative.

How Xplenty Helps When Learning How to Integrate MongoDB with Relational Databases

The workflow above is just one example of how Xplenty integrates MongoDB and relational databases. By integrating data from these sources and moving it to a final destination for analytics, you can generate even more data insights into the topics your business cares about, whether that's social media sentiment or something else entirely.

The great thing about integrating MongoDB and relational databases with Xplenty is that the process requires little code, so you can build powerful data pipelines even if you lack a large data engineering team. With an incredible range of pre-built connectors (like the one used above), you can integrate data from disparate sources quickly and then move it all to a final destination like a data warehouse for analytics. 

There are other benefits of using Xplenty to integrate MongoDB data and relational database data. The platform cleanses data to reduce its size before the integration process starts and can schedule data integration jobs for a day and time that suits you. That provides you with much more flexibility. You can also monitor all jobs on a user-friendly dashboard. 

Integrating data from MongoDB and a relational database poses some challenges. As data governance frameworks become stricter, you need a platform that adheres to data privacy regulations such as GDPR, CCPA, and HIPAA. Xplenty does exactly that, preventing expensive penalties for data protection non-compliance. 

Other Xplenty benefits include:

  • World-class customer service
  • A streamlined pricing model that charges you for the number of data connectors you require and not for the amount of data you consume. (The latter often works out more expensive for smaller data integration projects.)
  • Other native data connectors include a powerful Salesforce-to-Salesforce connector.
  • Simple workflow creation that makes it easy to define discrepancies between tasks. 
  • A powerful REST API.

"Most of Xplenty's customer success engineers are amazing," says a verified current user of Xplenty on the review website G2.com. "We were very impressed with Teri during our trial. She was one reason we decided to partner with Xplenty. Edsel a customer success engineer is smart, and great at problem-solving, best of all, he is dependable and goes above and beyond to help us."

"Xplenty is an intuitive ETL platform that does the basics and can handle a lot of complex stuff too," says another user.

Do you want to know how to integrate MongoDB with relational databases? Xplenty’s powerful data integration platform optimizes the entire process. Schedule a demo today!