This is a guest post for Integrate.io written by Bill Inmon, an American computer scientist recognized as the "father of the data warehouse." Inmon wrote the first book and first magazine column about data warehousing, held the first conference about this topic, and was the first person to teach data warehousing classes.
What, pray tell, is a tautology? A tautology is something that is true under all conditions. It is kind of like gravity. You can throw a ball in the air and, for a few seconds, it seems to be suspended. But soon gravity takes hold, and the ball falls back to earth.
And it doesn’t matter if you throw the ball in the air in China, Australia, England, or Canada. Gravity is the same everywhere.
Several years ago, big data came onto the scene. The providers of big data declared they were sweeping away all the nasty old stuff and inventing a brave, new world. And the investors in Silicon Valley lined up to stuff their pocketbooks with this wondrous new technology.
But the providers of big data swept out some things they should not have thrown away. They tossed out a time-honored tautology of software that was becoming established in the computer industry. The providers looked at data warehousing and all the work involved in building a data warehouse and declared that the world was free of old, messy, hard stuff with this wonderful new thing called big data.
Let’s dive deeper into this topic.
Moving unintegrated data requires advanced knowledge of coding and big data pipelines. Integrate.io simplifies the data integration process for e-commerce companies with its native out-of-the-box connectors. Schedule an intro call to learn more.
Complexities of Data Warehousing
Let’s take a look at what was thrown away. Easily the hardest part of data warehousing is the transformation of data from a siloed, unintegrated state to an integrated, unified corporate state. As a simple example of the transformation that occurs in the world of data warehousing, consider three applications that specify gender:
- Application A calls gender m/f
- Application B calls gender 1/0
- Application C calls gender male/female
An e-commerce data warehouse needs a single specification for gender. Suppose the analyst chooses male/female. The first two applications then need their representation of gender transformed as data moves from the application to the data warehouse.
This transformation occurs in many different forms each day. And it is often difficult and complex to do.
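As a rough illustration of the gender example, here is a minimal Python sketch of that kind of transformation. The labels `app_a`, `app_b`, and `app_c`, the function name `normalize_gender`, and the reading of 1/0 as male/female are all assumptions made for the sketch; a real source system would have to document which code means what.

```python
# Hypothetical lookup tables, one per source application.
# ASSUMPTION: 1 means male and 0 means female, mirroring the m/f order;
# the source application would need to confirm this.
GENDER_MAPS = {
    "app_a": {"m": "male", "f": "female"},
    "app_b": {"1": "male", "0": "female"},
    "app_c": {"male": "male", "female": "female"},
}


def normalize_gender(source_app, raw_value):
    """Convert one application's gender code to the warehouse standard."""
    mapping = GENDER_MAPS[source_app]
    # Accept integers as well as strings, and ignore case and padding.
    key = str(raw_value).strip().lower()
    if key not in mapping:
        # Fail loudly instead of loading an unknown code into the warehouse.
        raise ValueError(f"unknown gender code {raw_value!r} from {source_app}")
    return mapping[key]


if __name__ == "__main__":
    rows = [("app_a", "f"), ("app_b", 1), ("app_c", "Female")]
    for app, value in rows:
        print(app, value, "->", normalize_gender(app, value))
```

Even this toy case needs a decision about unknown codes, and a real warehouse repeats that decision for every field in every source, which is where the difficulty comes from.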
Integrate.io helps e-commerce companies move unintegrated data to a warehouse via low-code/no-code ETL pipelines. The platform also performs reverse ETL and fast CDC. Schedule an intro call to learn more.
What Did Big Data Providers Get Wrong?
So the big data provider says: “Just buy my technology, and you won’t need to do all this messy and complicated stuff. Who needs a data warehouse when you can have the wonders of big data?” And a few gullible people bought into this solution.
Then they woke up one day and found they couldn’t use the data in their big data environment. Sure enough, they could store their data in the environment. But once there, the data was indecipherable. In order to use the data, they had to subject it to a complex transformation process. That was exactly what they had wanted to avoid in the first place.
What the big data vendors didn’t understand was that the transformation process was never a function of the data warehouse. If you want to do analytical processing of unintegrated e-commerce data, you have to transform it, and it doesn’t matter whether you are dealing with a data warehouse, data mart, data lake, or an ice cream sundae.
In order to do analytical processing on unintegrated data, you have to do transformation. It is a tautology that transformation is required for proper analysis of unintegrated data, wherever and whenever you do it.
Transformation is as inescapable as gravity.
It is true that transformation was first discovered and pioneered in the world of data warehousing. In truth, it would have been discovered wherever anyone first attempted to do analytical processing on unintegrated data.
But to buy big data because you don’t want to do transformation? That’s like looking for ice-cold lemonade in Death Valley on a scorching July day. Lots of luck.
Bill Inmon, the father of the data warehouse, has authored 65 books. Computerworld named him one of the ten most influential people in the history of computing. Inmon's Castle Rock, Colorado-based company Forest Rim Technology helps companies hear the voice of their customers. See more at www.forestrimtech.com.