Guest post by Bill Inmon
Bill Inmon “is an American computer scientist, recognized by many as the father of the data warehouse. Inmon wrote the first book, held the first conference, wrote the first column in a magazine, and was the first to offer classes in data warehousing.” Source: Wikipedia.
For many years now, vendors and consultants have avoided the practice of data integration. Integrating data is complex. Systems are often undocumented, which makes searching them difficult. Assumptions must be made in order to understand business decisions from long ago. Data integration requires the proverbial four-letter word no one wants to hear: work. But anything worth doing should involve a bit of sweat and tears, right?
Why Do So Many Industries Avoid Data Integration?
Vendors and consultants across industries have avoided integrating data. These professionals have a long list of excuses for not integrating data, such as:
- We didn’t invent the technology for integrating data. So, it's not worthwhile for us.
- We do ELT, not ETL. (And we conveniently forget to do the T.)
- There’s just too much data. We can’t handle that much data without a data warehouse, and we don't have one of those.
- We are already busy. We don’t have time to do all of that work.
- We can’t get our users to agree on anything. Then again, we're not really listening either…
- We can just federate data. There is no need to unify or integrate the data.
- Integrating data is complex and hard. Who knows how to do that? Who has that kind of time?
- We threw the documentation away. Or the documentation never existed.
And the list goes on.
But there is a new excuse to add to the list: let’s use data in place. Copying data is expensive and time-consuming. Let’s just leave our data where it sits.
This sounds like a reasonable argument, but this line of thinking is anything but reasonable.
First off, when people go to unify and integrate data, the fundamental act they are performing is transforming data, not copying data. It is true that some copying is necessary in the act of transforming data. That copying is incidental, collateral damage. It is also unavoidable: you cannot integrate data without copying at least some of it.
The net result of transforming data is the ability to unify and integrate data. Once data is transformed, it can be viewed as a single entity. But without transforming data, you cannot have a unified view across the organization.
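To make the point concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the two source systems (a CRM and a billing system), the field names, the sample values and the helper functions. It shows why the raw records cannot simply be joined in place, and how a small transformation step yields one unified record.

```python
from datetime import date, datetime

# Hypothetical records for the same customer, held in two separate systems.
crm_rows = [{"cust_id": "0042", "name": " ACME Corp. ", "signup": "03/15/2021"}]
billing_rows = [{"customer": 42, "total_cents": 129900}]

# Left in place, the keys do not even match: "0042" is not 42.
raw_keys_match = crm_rows[0]["cust_id"] == billing_rows[0]["customer"]


def transform_crm(row):
    """Map a CRM row onto the shared (assumed) target layout."""
    return {
        "customer_id": int(row["cust_id"]),  # zero-padded text -> integer key
        "name": row["name"].strip(),         # trim stray whitespace
        "signup_date": datetime.strptime(row["signup"], "%m/%d/%Y").date(),
    }


def transform_billing(row):
    """Map a billing row onto the same target layout."""
    return {
        "customer_id": row["customer"],
        "total_usd": row["total_cents"] / 100,  # cents -> dollars
    }


def unify(crm, billing):
    """Transform both sources, then merge them on the common key."""
    unified = {}
    for row in map(transform_crm, crm):
        unified[row["customer_id"]] = dict(row)
    for row in map(transform_billing, billing):
        key = row["customer_id"]
        unified.setdefault(key, {"customer_id": key}).update(row)
    return unified


unified = unify(crm_rows, billing_rows)
```

Note that the copying here is a side effect: the work is in reconciling keys, formats and units. A real integration would face far messier rules than these three, which is exactly the effort the excuses above try to avoid.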
Why Data Integration Isn't a Choice — It's a Must
The choice is simple and basic – you either transform data and have a unified view of the data or you don’t transform the data, leave it in place, and have a fractured view of your organization. This choice is binary. It is one way or the other. It is as simple as being alive or dead. You can’t have it both ways. And your company can't afford to try.
I'm aware that the transformation of data — data integration — is difficult. It's messy. It's time-consuming, complex and full of guesswork. Copying some data has its pitfalls. But that's the price you pay to be able to look at data in a unified, integrated manner. The choice is binary and simple. You either:
- Transform data and look at it in a unified integrated manner.
- Don’t transform data and look at your data in a fractured manner.
You can have either the first or the second. But you can’t have both.
The good news is that you don’t have to boil the ocean. Not all data needs to be transformed. And certainly, the data that does need transformation can be divided up into phases, based on the criticality of need. In fact, most data doesn't need to be transformed. Only the data that needs to be shared and examined in a unified, integrated manner needs to be transformed. But leaving in place the data that must be transformed to give you a unified, integrated view is simply not a long-term winning proposition.
And there are no shortcuts.
While there are no shortcuts, you can save hundreds of developer hours with our integration tool. Schedule an intro call to learn more.
Bill Inmon, the father of the data warehouse, has authored 65 books and was named by Computerworld as one of the ten most influential people in the history of computing. Bill’s company, Forest Rim Technology, is a Castle Rock, Colorado company. Bill Inmon and Forest Rim Technology provide a service to companies, helping businesses hear the voice of their customers. See more at www.forestrimtech.com.
Integrate.io is a new ETL platform with blazing-fast CDC capture, reverse ETL, and deep Ecommerce capabilities. Schedule an intro call today.