Technologies and definitions in the Business Intelligence trade are changing. Data Warehouses have traditionally been the way to provide data for evaluations and reports. But is this changing – and how?
Generally speaking, data sources inside a business are easier to access nowadays than back when large amounts of data were stored on a mainframe. Today, visualization tools like Tableau, QlikView or Tibco Spotfire – which are also able, to a certain extent, to source the data they need themselves – demonstrate a very different way of getting and combining data. This shift in tooling reflects the ongoing change in Business Intelligence.
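The on-the-fly combination these tools perform can be sketched in a few lines of Python. This is only an illustration, not how any of the named tools work internally: pandas stands in for the tool's data layer, and the two tables are hypothetical exports.

```python
import pandas as pd

# Two independent sources, e.g. an ERP export and a CRM export (hypothetical data)
sales = pd.DataFrame({"customer_id": [1, 2, 3], "revenue": [100, 250, 80]})
regions = pd.DataFrame({"customer_id": [1, 2, 3], "region": ["EMEA", "APAC", "EMEA"]})

# Join and aggregate on the fly -- no intermediate warehouse layer in between
report = sales.merge(regions, on="customer_id").groupby("region")["revenue"].sum()
print(report)
```

The point is that the join happens at analysis time, inside the tool, rather than in a storage layer that was modelled and loaded beforehand.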
For example, when using Google Trends to search for the term Data Warehouse, the following graph comes up:
To compare this, the frequency of the searched term Big Data looks like this:
Judging by search volume, interest in Data Warehousing is declining while Big Data has been on the rise since 2012. Very interesting.
This leads me to the following conclusions:
- Definitions change – the questions remain the same: How is my business doing? How are my processes performing? Can we use data to predict behaviour? What changes is the underlying infrastructure that provides the answers.
- Perceptions change – when creating a data warehouse, one argument always had to be fought: do we really need another storage layer? Isn’t it very expensive? With the tools mentioned above (Tableau, QlikView and the like), perceptions are changing rapidly. I wrote an article about Talend ETL and why it isn’t like Tableau, since people wonder how to combine data on-the-fly with tools from a different era.
- Requirements change – it is clearer what a Data Warehouse in the classical sense is; for many people, Big Data is still a new technology to be tamed. Big Data is far more demanding in the resources it needs (a growing scarcity of Data Scientists) and in the knowledge required to build a working Big Data infrastructure. The requirement is to access more data, more rapidly. Compared to a classical Data Warehouse, this can indeed be a good argument – gathering data is easier in a schema-on-read Big Data infrastructure, because data is stored as-is and structured only when it is queried. The classical approach is simply far more proven and more mature.
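The schema-on-read point can be made concrete with a short sketch. The records, field names and the `avg_duration` helper below are all hypothetical; what matters is that the raw data is stored untouched and the "schema" only appears in the query, so new fields never break the ingestion.

```python
import json

# Schema-on-read: store raw records untouched (here: JSON strings, one per event)
raw_records = [
    '{"user": "a", "event": "login", "ms": 120}',
    '{"user": "b", "event": "search", "ms": 340, "query": "report"}',  # extra field, no problem
    '{"user": "a", "event": "login", "ms": 95}',
]

def avg_duration(records, event):
    """Apply structure at query time: parse, filter, aggregate."""
    rows = [json.loads(r) for r in records]
    durations = [r["ms"] for r in rows if r["event"] == event]
    return sum(durations) / len(durations)

print(avg_duration(raw_records, "login"))
```

In a classical schema-on-write warehouse, the second record with its extra `query` field would have forced a model change before loading; here, each new question simply imposes its own schema on the stored raw data.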
So I suppose a paradigm shift is happening in the Business Intelligence or Business Analytics world. Data is needed much faster and more directly nowadays. For specific purposes and known questions, a Data Warehouse is still the way to go. Classically, the data in the Data Warehouse amounts to around 20% of all the data in a company.
Meaning: the other 80% will be stored in some kind of Big Data infrastructure. Whether that means accessing the source systems directly or creating Data Lakes for storage doesn’t matter – as long as the new demand for data access can be served faster than in the past.
So do definitions.
Do you think the same? Or do you disagree?