Why metadata is one of the most important topics in Business Intelligence

Have you ever worked with an internal or external customer who said they just needed this simple Excel file turned into a report? And then, when investigating the process, you found that it contained many undiscovered steps? So you told them the implementation might take a few days, which in turn didn’t make them happy. Could this be avoided?

Business dynamics

I mean, I get it. We live in very dynamic times. The pressure is on. Organizations need to adapt, test new products quickly and create new value chains to stay in the market and compete with the other players.

Still, once a new product is past the initial chaotic phase, a few adaptations have to be made: a bit of documentation, alignment with general company procedures and processes, and problem management.

The reality, however, is quite different:

A new product gets a proof of concept
The proof of concept is turned into something “live”
All done, on to the next phase: no one wants to be “unproductive” (by documenting, etc.)
Months later, an investigation starts in order to understand the completely undocumented process, which of course is business-critical

The past

My point is that eventually some key people might leave the organization. Or there might be a change in infrastructure, and no one is aware that a specific process depends on that specific infrastructure.

This also affects Business Intelligence processes greatly. How many times have you received some data without knowing

where the data is generated (all you see are new files or updates in a database)
whether any safeguards exist in case the data gets corrupted or the delivering process has a problem
who the heck integrated several new fields and what the Business Rules behind them look like
who the owner of the script / ETL process / stored procedure is
how long the processing takes and how “old” the data is at the time it is read
when the processing takes place and whether it will run again if it fails

Now, this is not primarily about the data itself but about the process of generating it, that is, metadata about data-generating jobs, especially pre-existing legacy jobs. I bet you’ve had your fair share.
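To make the questions above a bit more tangible, here is a minimal sketch of how such job metadata could be captured in a small Python record. The class `JobMetadata`, its field names, the helper `open_questions` and the example job are all assumptions made up for illustration; they are not taken from any specific tool.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class JobMetadata:
    """Hypothetical record describing one data-generating job."""
    name: str
    source: Optional[str] = None              # where the data is generated
    safeguards: Optional[str] = None          # handling of corrupt data / delivery problems
    business_rules: Optional[str] = None      # who added fields and which rules apply
    owner: Optional[str] = None               # owner of the script / ETL process / procedure
    runtime_minutes: Optional[int] = None     # how long the processing takes
    schedule: Optional[str] = None            # when the processing takes place
    retries_on_failure: Optional[bool] = None # whether it runs again if it fails


def open_questions(job: JobMetadata) -> List[str]:
    """Return the names of all fields that are still undocumented (None)."""
    return [f.name for f in fields(job) if getattr(job, f.name) is None]


# Example: a legacy job (invented) where only owner and schedule are known.
legacy_job = JobMetadata(
    name="nightly_sales_export",
    owner="finance-it",
    schedule="daily 02:00",
)

print(open_questions(legacy_job))
```

Every name the helper returns is one more thing to investigate before the job can be considered documented.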

Documentation

A big part of this, especially in smaller organizations, is choosing one central place for this documentation. I’ve used intranet pages, internal wiki pages and dedicated folders holding documents that describe the data… how the metadata is stored comes down to your own creativity.

With bigger budgets, powerful applications are available that can do a lot of infrastructure analysis and provide insights into database schemas. Depending on the use case, this knowledge can be a huge advantage when developing data lineage, reports and requirements analyses. But in my experience it is first of all about centralizing knowledge about the data flows in the organization. This also means that there might be nothing for the Business Intelligence department to do except write down a process that gathers data.

Sometimes it is more helpful simply to know which data flows where than to try to centralize all jobs and rebuild them just for the sake of it. I try to focus on delivering value and using existing resources, not on generating more work. This helps the team and the customer more quickly, so that a report can be built within a sprint, especially if many of the data processing jobs in the organization are already known.
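Knowing which data flows where does not require a big tool to start with. The following is a minimal sketch, assuming the flows are simply written down as (source, target) pairs; the system names in `FLOWS` and the function `downstream` are invented for illustration. It answers the question of what is affected if a given piece of infrastructure changes.

```python
from collections import defaultdict, deque

# Hypothetical data flows, written down as (source, target) pairs.
FLOWS = [
    ("crm_db", "nightly_export"),
    ("nightly_export", "file_share"),
    ("file_share", "sales_report"),
    ("erp_db", "sales_report"),
]


def downstream(start, flows):
    """Return every system that directly or indirectly receives data from start."""
    graph = defaultdict(list)
    for source, target in flows:
        graph[source].append(target)

    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first walk along the documented flows
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen - {start})


print(downstream("crm_db", FLOWS))
```

A plain list like this can live on a wiki page next to the job descriptions and still be useful before anyone considers rebuilding the jobs themselves.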

Business Intelligence can sometimes be a bit like being a detective. A detective with quality reports, though.

Photo by guillermoluis21

Tobias Maasland