This is a guest post for Integrate.io written by Bill Inmon, an American computer scientist recognized as the "father of the data warehouse." Inmon wrote the first book and first magazine column about data warehousing, held the first conference about this topic, and was the first person to teach data warehousing classes.
Five things you need to know about this topic:
- Computer scientist Bill Inmon argues that the data warehouse is very much alive, despite attempts to kill it or destroy its reputation.
- Initiatives designed to damage the data warehouse, such as ELT, data marts, and big data, have strengthened data warehousing architecture.
- Many vendors and IT professionals don't like the concept of data warehousing because they don't want to spend time integrating data.
- Data warehousing is the only way to remove data silos and centralize Ecommerce data.
- Integrate.io is a low-code data warehousing integration platform that makes integrating data simple.
The data warehouse is the whack-a-mole of technology. Like the carnival game where the mole sticks its head up out of a random hole and you take a whack at it, data warehousing just keeps popping up. It won’t go away. Just like the mole. Companies like yours are using it more than ever.
This whack-a-mole act that the data warehouse does is especially impressive because no vendor or organization is behind this technology. Data warehousing is supported solely by end users. There is no committee, company, or organization that sits around and makes decisions about data warehouses. Data warehousing has a life of its own.
So who has been trying to kill the data warehouse? Who has been taking swings at the ever-appearing mole that keeps randomly popping out of the hole? Learn more below!
Integrate.io realizes the power and importance of the data warehouse for data analysis. Its low-code out-of-the-box connectors let you ETL data from sources to a supported warehouse in as little as a few minutes, helping you generate valuable insights about sales, customers, inventory, and more. You can also streamline the data integration process with Integrate.io's ELT, Reverse ETL, and super-fast Change Data Capture (CDC) tools. Why not try Integrate.io yourself with a 14-day free trial?
Threats to the Data Warehouse
There have been several major attempts to exterminate or bypass the data warehouse:
- Dimensional modeling and star joins. Ralph Kimball introduced the idea of a data mart in his 1996 book, “The Data Warehouse Toolkit.” Ralph stated that you can just build a data mart directly from an application. There was no need for one of those messy and hard-to-build data warehouses when you use his approach.
- ETL changed to ELT. The big vendors of the world gave us Extract, Load, Transform (ELT), a descendant of Extract, Transform, Load (ETL). The trick with ELT was that you did the E and you did the L, and conveniently forgot to do the T. In doing so, you just copied data from one place to the next. There was no need for a data warehouse with ELT.
- Big data. Big data came along and proclaimed that you didn’t need a data warehouse. Big data vendors like Cloudera and others said that with big data, there was no need for a data warehouse. You could just conveniently store your data in big data and that was it.
- Then all you needed was a data lake. There was no need to go through all that creepy and complex stuff with a data warehouse. Just dump all of your data in a data lake and that was the end of the story.
- Data mesh/data mash came along and said all you needed to do was have some fancy connections of data, and there was no need for a data warehouse.
- Data scientists disdained the data warehouse. They learned all of these statistical algorithms in school and, when they got into the real world, they spent 95% of their time wrestling with data. But these data scientists thought that data warehouses were beneath them.
- Some people thought a data warehouse was just a bunch of data squeezed together. And squeezing data together that hard meant getting your hands dirty.
Some of these efforts to destroy the reputation of the data warehouse were very well-funded and very well-advertised. But all of them failed to kill the data warehouse.
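To make the ELT point in the list above concrete, here is a minimal Python sketch of what happens when the T is skipped. Everything in it is an assumption for illustration: the two source systems, their field names, and the shared schema are invented, and this is not how any particular vendor's tool works.

```python
from decimal import Decimal

# Hypothetical source rows; the systems, field names, and formats are invented.
SHOP_ROWS = [{"cust": "A-001", "amount": "19.99"}]
BILLING_ROWS = [{"customer_id": "a001", "total_cents": 1999}]


def transform_shop(row):
    """Map a shop row onto the shared schema: normalized key, Decimal amount."""
    return {
        "customer_id": row["cust"].replace("-", "").lower(),
        "amount": Decimal(row["amount"]),
    }


def transform_billing(row):
    """Map a billing row onto the shared schema, converting cents to units."""
    return {
        "customer_id": row["customer_id"].lower(),
        "amount": Decimal(row["total_cents"]) / 100,
    }


def copy_only():
    """E and L with the T skipped: rows land exactly as the sources shaped them."""
    return list(SHOP_ROWS) + list(BILLING_ROWS)


def extract_transform_load():
    """E, T, and L: every row is reshaped to the shared schema before loading."""
    shop = [transform_shop(r) for r in SHOP_ROWS]
    billing = [transform_billing(r) for r in BILLING_ROWS]
    return shop + billing


def total_for(rows, customer_id):
    """Sum amounts for one customer; rows that don't fit the shared schema are skipped."""
    matching = (
        r["amount"]
        for r in rows
        if r.get("customer_id") == customer_id and "amount" in r
    )
    return sum(matching, Decimal("0"))
```

With the copied rows, `total_for(copy_only(), "a001")` finds nothing it can add up, because no row has both the shared key and the shared amount field. With the transformed rows, the same question returns the combined total across both systems.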
Integrate.io's philosophy is to simplify data warehousing. No longer will you build complex big data pipelines or hire a data engineering team. Just move data to a supported warehouse with Integrate.io's native connectors and generate intelligence about your Ecommerce processes. Set up an ETL trial or an ELT trial now!
How the Data Warehouse Became Even Stronger
Some of these efforts to trash the data warehouse actually made its architecture stronger.
For example, people found that adding data marts to a data warehouse was a very good thing to do. Data marts allow you to customize data and, at the same time, improve data integrity. So Ralph Kimball’s contribution of data marts and the dimensional model was a valuable addition to data warehousing.
Moreover, big data added a dimension of scalability for data warehousing that had not existed before. The data in a data warehouse with a low probability of access fit very conveniently in big data. The people that promulgated big data never saw it that way, so that was an unintended positive consequence.
Those who championed the data lake inadvertently pushed new kinds of data into the data warehouse. With the data lake, analog and IoT data, as well as textual data, found their way into the data warehouse.
So, the very people that tried to kill the data warehouse improved it!
Why Some People Hate the Data Warehouse
So, what’s the problem with data warehouses? Why do people want to kill them? Or ruin their reputation? There are a lot of reasons. But the primary one is that people don’t want to do the dreaded task of integrating data.
Data warehouses require data to be integrated. Integration is complex, risky, hard to do, imprecise, and requires research. Integrating data requires using your brain and elbow grease. And vendors just hate doing that!
Ecommerce organizations often have huge silos of information that cannot talk to each other. These silos impede analytical processing across the enterprise. The ONLY way to break these silos apart is to integrate the data in them and place the integrated data into a data warehouse.
There simply is no other way.
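As a rough illustration of what integrating siloed data involves, the sketch below merges three invented silos into one record per customer. The silo names, field names, and the choice of a normalized email address as the shared key are all assumptions made for the example. Real integration work usually needs far more research into how each system identifies a customer.

```python
from collections import defaultdict

# Hypothetical silo extracts; system names, fields, and values are invented.
CRM_SILO = [{"email": "Ana@Example.com", "name": "Ana Ruiz"}]
ORDERS_SILO = [
    {"buyer_email": "ana@example.com ", "order_total": 40.0},
    {"buyer_email": "ANA@EXAMPLE.COM", "order_total": 10.0},
]
SUPPORT_SILO = [{"contact": "ana@example.com", "open_tickets": 2}]


def normalize_email(value):
    """Assumed integration rule: trim whitespace and lowercase the address."""
    return value.strip().lower()


def integrate(crm, orders, support):
    """Build one integrated record per customer, keyed by normalized email."""
    warehouse = defaultdict(
        lambda: {"name": None, "order_total": 0.0, "open_tickets": 0}
    )
    for row in crm:
        warehouse[normalize_email(row["email"])]["name"] = row["name"]
    for row in orders:
        key = normalize_email(row["buyer_email"])
        warehouse[key]["order_total"] += row["order_total"]
    for row in support:
        key = normalize_email(row["contact"])
        warehouse[key]["open_tickets"] += row["open_tickets"]
    return dict(warehouse)


integrated = integrate(CRM_SILO, ORDERS_SILO, SUPPORT_SILO)
```

Each silo spells the same customer differently, so none of them can answer a cross-enterprise question on its own. Once the keys are reconciled, `integrated` holds a single row that combines the name, order total, and open tickets.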
But vendors and most IT professionals just don’t have the backbone and/or intellect to integrate the siloed data. So the silos remain and corporate/enterprise data analysis remains an elusive, unreachable goal.
Vendors would rather walk barefoot across a bed of red-hot coals than have to go back and integrate data. The problem is that the major value of a data warehouse is having a foundation of integrated data.
So here lies the data warehouse in 2022. It’s still going like a game of whack-a-mole. ‘RIP’ doesn’t refer to the death of the warehouse but something else entirely. In the context of data warehousing, RIP means Resilient Information Processing!
Yes, the data warehouse lives on despite numerous efforts to kill or ignore it.
The data warehouse is very much alive! Integrate.io is a low-code data warehouse integration platform built for Ecommerce that lets you move data to a supported warehouse with its out-of-the-box connectors. No data engineering experience is required! Schedule an intro call now!
Bill Inmon, the father of the data warehouse, has authored 65 books. Computerworld named him one of the ten most influential people in the history of computing. Inmon's Castle Rock, Colorado-based company Forest Rim Technology helps companies hear the voice of their customers. See more at www.forestrimtech.com.