This is a guest post for Integrate.io written by Bill Inmon, an American computer scientist recognized as the "father of the data warehouse." Inmon wrote the first book and first magazine column about data warehousing, held the first conference about this topic, and was the first person to teach data warehousing classes.
Here are five things to know about this topic:
- The data architecture is ever-evolving.
- In the 1990s, organizations used spider web systems and siloed systems to manage data. However, these technologies made it difficult to determine the worth of data.
- Data warehouses provided a solution to this problem, despite some theorists claiming organizations didn't need a warehouse when using big data and a data lake.
- Data vaults and textual-based data expanded the evolution of data warehouses.
- Integrate.io is a data warehousing integration solution that helps organizations generate business intelligence from data sets.
“A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.” — Bill Inmon, 1990
“A data warehouse is a subject-oriented, integrated (by business key), time-variant and non-volatile collection of data in support of management's decision-making process, and/or in support of auditability as a system-of-record.” — Bill Inmon, 2018
Architecture is an ever-living, ever-evolving entity. As time passes and technology advances, the underlying architecture of technology inevitably evolves and mutates.
Learn more about ever-evolving architecture below.
Table of Contents
- Spider Web Systems and Siloed Systems
- Then Came Data Warehouses
- The Rise of Data Vaults
- What About Big Data?
- Textual-Based Data
- New Vistas of Opportunity
- Data Vault and Text
- Final Word
Integrate.io understands the power of ever-changing data warehouses. Its low-code/no-code native connectors streamline data integration processes like ETL, Reverse ETL, and fast Change Data Capture (CDC), letting you generate intelligence about your business without hiring an expensive data engineering team. Try Integrate.io yourself with a 14-day free trial!
Spider Web Systems and Siloed Systems
In 1990, the world experienced an architectural phenomenon known as “spider web systems” or siloed systems. Spider web systems grew out of the rapid-fire production of multiple applications, built with no enterprise-wide consideration. As a result, the same data elements appeared in multiple places.
An organization would wake up one day and have a value of 38 in Application ABC element XYZ, a value of 1,000 in Application BCD element XYZ, and a value of -762 in Application CDE element XYZ! As you can see, trying to make good corporate decisions on spider web systems/siloed systems across an enterprise was an impossibility when no one knew what the value of data really meant. Furthermore, organizations could not reconcile conflicting data. Throwing more money, more consultants, and more technology at spider web systems was like throwing gasoline (not water) on a burning house. It made the fire worse, not better.
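The reconciliation problem described above can be illustrated with a minimal sketch. It assumes a hypothetical snapshot of element values collected from each application, using the figures from the example; the function name and data layout are illustrative only and do not come from any particular product.

```python
from collections import defaultdict

# Hypothetical snapshot of (application, element, value), using the figures above.
readings = [
    ("ABC", "XYZ", 38),
    ("BCD", "XYZ", 1000),
    ("CDE", "XYZ", -762),
]


def find_conflicts(rows):
    """Return each element whose value differs between applications."""
    by_element = defaultdict(dict)
    for app, element, value in rows:
        by_element[element][app] = value
    # An element is in conflict when its applications report more than one distinct value.
    return {
        element: values
        for element, values in by_element.items()
        if len(set(values.values())) > 1
    }


conflicts = find_conflicts(readings)
for element, values in conflicts.items():
    print(f"Element {element} has conflicting values: {values}")
```

A check like this only detects the disagreement. It cannot say which value is correct, which is the gap the data warehouse was meant to close.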
Then Came Data Warehouses
Into this environment came the architecture known as the “data warehouse.” The original definition of a data warehouse in 1990 was:
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.
The data warehouse represented an architectural departure from the conventional wisdom of the time. It was an architectural solution to the misery of spider web systems. The academic database theorists back then railed against the notion of data warehouses. However, the need for a different architectural approach to solve problems of spider web/siloed systems prevailed over the outcry of theorists. Soon, the world discovered and started to build data warehouses.
In the 1990s and into the new millennium, data warehousing began to grow into conventional wisdom.
The Rise of Data Vaults
The need for a more established foundation of data started to become apparent. Soon, data vaults appeared as an extension of the data warehouse concept. With data vaults, the firmly established and well-defined notion of the system of record began to emerge. The evolution continued with enhancements to data vaults, and the concept of “data vault 2.0” emerged.
This evolution took place around 2010 and continues today. In 2018, the definition of a data warehouse expanded:
A data warehouse is a subject-oriented, integrated (by business key), time-variant and non-volatile collection of data in support of management's decision-making process, and/or in support of auditability as a system-of-record.
It is predictable that this definition will, in time, continue to evolve. As with all evolutions, 2018’s definition of a data warehouse will be modified one day. Architecture is an ever-moving entity, and what was true and right in 2018 will be dated in 2025 and beyond.
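Three properties named in the 2018 definition (integrated by business key, time-variant, and non-volatile) can be made concrete with a small sketch. This is an illustration under stated assumptions, not a data vault or warehouse implementation: the class and method names are hypothetical, and the in-memory list stands in for a real table.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a loaded row is never modified (non-volatile)
class Row:
    business_key: str    # integration point shared by all source systems
    load_date: date      # every row is stamped with time (time-variant)
    source: str          # where the row came from, kept for auditability
    attributes: tuple    # (name, value) pairs describing the subject


class WarehouseTable:
    """Hypothetical append-only table for one subject area."""

    def __init__(self):
        self._rows = []

    def load(self, business_key, load_date, source, **attributes):
        # Rows are only ever appended; there is no update or delete method.
        row = Row(business_key, load_date, source, tuple(sorted(attributes.items())))
        self._rows.append(row)
        return row

    def history(self, business_key):
        """All versions of one business key, oldest first."""
        rows = [r for r in self._rows if r.business_key == business_key]
        return sorted(rows, key=lambda r: r.load_date)

    def as_of(self, business_key, when):
        """The version that was current on the given date, or None."""
        rows = [r for r in self.history(business_key) if r.load_date <= when]
        return rows[-1] if rows else None


customers = WarehouseTable()
customers.load("CUST-001", date(2018, 1, 1), "billing", city="Denver")
customers.load("CUST-001", date(2019, 6, 1), "crm", city="Castle Rock")

earlier = customers.as_of("CUST-001", date(2018, 12, 31))
later = customers.as_of("CUST-001", date(2020, 1, 1))
```

Because nothing is overwritten, the table can answer both "what is true now" and "what did we believe then," which is what the system-of-record clause of the definition asks for.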
What About Big Data?
An interesting issue in 2018 was whether organizations needed the data warehouse in a world of big data. The vendors of big data (in attempts to sell their products) tried to convince people they didn’t need a data warehouse with big data and a data lake.
That is not the case at all. The data warehouse is an architecture. Big data is a technology. Comparing architecture to technology is like comparing Picasso’s Guernica to street graffiti. Both fine art and graffiti have their place in the world, but they are fundamentally different in many, many ways.
Under NO circumstances should people confuse a data warehouse with big data. They are very different things.
Integrate.io is the low-code/no-code data warehousing integration platform that moves data to a supported destination via its native connectors. You can move data sets (including textual-based data) to a supported destination for data analysis without data engineering knowledge or excessive coding, removing the pain points of data integration. Set up an ETL trial or an ELT trial today!
Textual-Based Data
Another branch of data architecture has evolved from the data warehouse: the evolution of data architecture to include text. For years, computing has centered around structured, record-oriented data. This form of computing is very valuable as it fits well with transaction processing, and transaction processing is very important to the world of business. But as important as transactions are, they are not the only data that corporations use. Textual data of a corporation is more important than record-oriented data. Text is the fabric of communication, contracts, customer attitudes, warranties, and a thousand other subjects.
Just because text does not fit comfortably in records optimized for transaction processing does not mean text should not play an important role in the decisions of the corporation. Now there is technology – textual ETL – that allows organizations to incorporate text into decision-making.
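The general idea of turning free text into rows that can sit beside record-oriented data can be shown with a toy sketch. It is not how any textual ETL product works: the vocabulary, labels, and function name are assumptions made up for illustration, and real text processing must handle context, negation, and much more.

```python
import re

# Hypothetical vocabulary mapping terms to a label; an assumption for illustration.
VOCABULARY = {
    "late": "negative",
    "broken": "negative",
    "refund": "negative",
    "great": "positive",
    "love": "positive",
}


def text_to_records(doc_id, text):
    """Emit one structured record per vocabulary term found in the text."""
    records = []
    for match in re.finditer(r"[a-z']+", text.lower()):
        word = match.group()
        if word in VOCABULARY:
            records.append(
                {
                    "doc_id": doc_id,          # which document the term came from
                    "term": word,              # the matched word
                    "label": VOCABULARY[word], # classification from the vocabulary
                    "offset": match.start(),   # position in the text, for traceability
                }
            )
    return records


records = text_to_records(
    "email-17", "The delivery was late and the box arrived broken."
)
for record in records:
    print(record)
```

Each record keeps the document identifier and offset so that a finding in the structured data can be traced back to the original wording.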
New Vistas of Opportunity
By including text in corporate decisions, whole new vistas of opportunity open up. Consider your customers. Prior to text processing, understanding your customer meant looking at them externally. The first view of the customer included name, occupation, salary, age, education, marital status, address, and a hundred other indirect measurements. The early view of the customer did everything EXCEPT actually hear from them. You could know a tremendous amount about your customer without actually knowing what they were thinking.
With the inclusion of text, you can now start to actually hear and process what the customer is thinking. You can listen to your customer in telephone conversations, email, on the internet, and a thousand other places. So, text opens up the door to understanding the internal view of the customer. And at the end of the day, the internal view of the customer is more useful and more powerful than the external view of the customer.
But understanding the customer in a profound way is only one benefit of understanding text. Many, many doors are opened by being able to manage text.
As with all evolutions, progress is slow and gradual. Evolution does not happen overnight. Each day sends the world one step further into evolution (whether the world likes it or not).
Data Vault and Text
One interesting conjecture is whether and how the evolution into text will meet up with and join the data vault evolution. Speculatively, there are reasons why the two evolutions may join and reasons why they may not. Some of the reasons why the evolution may happen are:
- Both text and the data vault are fundamental to success in business.
- Both involve data design.
- Both center around the notion of the value of data integrity.
Some of the reasons why the evolution may take a while are:
- Both evolutions are still in their formative stages. Data vault is further evolved than text, but both are still a work in progress.
- There are some fundamental differences between the two disciplines.
In any case, an evolution will be inevitable; such is the nature of evolution.
Final Word
The data architecture is ever-evolving. From spider web systems to text processing, organizations will continue to use new technologies to manage, store, and get more value from their data. Despite architectural changes, the data warehouse has remained a constant for corporations wanting to analyze information and generate business intelligence.
Integrate.io simplifies data engineering by allowing you to move data to a supported warehouse quickly and securely without any data engineering knowledge. Schedule a demo now!
Bill Inmon, the father of the data warehouse, has authored 65 books. Computerworld named him one of the ten most influential people in the history of computing. Inmon's Castle Rock, Colorado-based company Forest Rim Technology helps companies hear the voice of their customers. See more at www.forestrimtech.com.