This is a guest post for Integrate.io written by Bill Inmon, an American computer scientist recognized as the "father of the data warehouse." Inmon wrote the first book and magazine column about data warehousing, held the first conference about this topic, and was the first person to teach data warehousing classes.
Data science is immature.
This statement is not pejorative; it is simply a statement of historical fact. As such, it is not arguable.
Data science has been around for a mere ten years or so. This is very new compared to medicine, engineering, or accounting. There are bones in the high caves of Chile that indicate humans have practiced medicine for at least 10,000 years. In Rome, an engineer designed the city walls more than 2,000 years ago. So, when we compare the age of data science to that of other professions, there is simply no contest regarding historical maturity.
Data science is the new kid on the block.
Let’s examine this topic further.
Table of Contents
- Disdain for Data Architecture and Data Warehouses
- Not All Data Scientists Are the Same
- Data Scientists + Integrate.io = Data Warehousing Success
Moving data to a supported warehouse can be a complicated process. However, data scientists can extract meaning from this data for business insights. Integrate.io streamlines the data integration process with its out-of-the-box connectors. Schedule a demo today!
Disdain for Data Architecture and Data Warehouses
With any immature profession, anomalies surface at the beginning and smooth out over time. One of the anomalies of data science today is the disdain data scientists have for data architecture and data warehousing. For whatever reason, most of these professionals look down on data architecture and warehousing (such as moving data from disparate systems to a centralized warehouse) as mundane subjects, unworthy of their time and attention. Data scientists would much rather concern themselves with statistical correlations, scatter charts, resolutions of outliers, algorithms, pattern analysis, and other interesting analytical tools and techniques.
Indeed, the data scientist takes courses in school to learn all of these sophisticated techniques. But when data scientists get out into the real world, they discover that instead of doing the work they studied for, they have turned into data garbagemen. The real world just does not have the carefully vetted data that data scientists have been told to expect. In practice, these professionals spend 98 percent of their time gathering and cleaning data and only 2 percent doing actual data science. That’s the reality.
Trying to become a data scientist without understanding data architecture and data warehousing is like:
- A doctor who doesn’t understand pills and medications
- An accountant who cannot add or subtract
- An engineer who builds bridges but doesn’t understand the strength of materials
- A tennis player who doesn’t like to run
- An ice cream maker who doesn’t touch anything cold
Integrate.io helps companies (and the data scientists who work for them) move data to a warehouse via low-code/no-code connectors. The platform also performs reverse ETL and fast CDC. Schedule a demo now.
Not All Data Scientists Are the Same
Data architecture and data warehousing are so fundamental to the discipline of data science that they need to be second nature. Instead, they get short shrift.
But not all data scientists have disdain for data architecture and warehousing. Just the other day, I was talking to a data scientist who has discovered the incredible benefits of a data warehouse. This data scientist told me:
“The data warehouse is a wonderful place. Data is in place and organized. Data is believable. Data is accessible. On occasion, I will need data from other places than the data warehouse, but the warehouse is always the starting point for our data science projects. It can save us huge amounts of time and money. Now I can be a data scientist and not a data garbageman.”
Data Scientists + Integrate.io = Data Warehousing Success
As data science matures, it will inevitably recognize the importance of data architecture and data warehouses. Today, it is fashionable for data scientists to be snobbish toward their cousins in data architecture and data warehousing and to look down on those disciplines. But that is just the immaturity of the data science profession. Like the miniskirt of yesteryear, fashions will change over time.
Integrate.io is a new ETL platform that makes data warehousing easy. Its philosophy is to make data integration less of a chore. Schedule an intro call or try Integrate.io for yourself with a 14-day free trial.
Bill Inmon, the father of the data warehouse, has authored 65 books. Computerworld named him one of the ten most influential people in the history of computing. Inmon's Castle Rock, Colorado-based company Forest Rim Technology helps companies hear the voice of their customers. See more at www.forestrimtech.com.