You would be surprised how many tasks inside top enterprises’ BI are still manual. So, where’s the maturity then?
Two weeks ago, from the 9th to the 11th of June, two of my Manta Tools colleagues and I attended the Data Governance & Information Quality (DGIQ) Conference. Manta Tools was there for the first time, and I am happy to say that the whole event was great. I was also satisfied from a business perspective: we took advantage of several interesting new opportunities there, and we also met a few potential partners, who are so critical for the Manta Tools ecosystem. But this blog is not about our revenue.
The organizers put together a very impressive list of interesting speakers. I enjoyed the mix of customers (from all important verticals) and suppliers (consultants, analysts, vendors). I had an opportunity to see several superb and visionary presentations and also to hear a lot of real customer stories. And you know what? It was like night and day! We are all quite used to the differences between the visions presented by top analysts and the reality we can see everywhere around us. But this time it shocked me even more.
2015: Everything is still manual
I saw large enterprises having issues with basic data consolidation and usage. Big Data programs run without any plan or even any idea of what to do and how to do it. A huge amount of manual labor is needed for basic governance, and there are no ideas on how to digitize and automate things. Frustration everywhere. One of the strongest moments was a presentation made by a very smart data governance leader from a large finance company. She presented how they govern data and the whole BI environment. Most of the activities are manual without any automation. Metadata critical for the success of the whole program is gathered from source systems manually and developers are responsible for it!
Almost everyone in the meeting room was excited about how mature that company was. That’s weird! I immediately recalled a presentation at Informatica World 2015 where two experienced consultants from a top consulting company talked about next-generation information architecture. We asked them how they think enterprises should collect metadata from custom code, which is basically everywhere in next-generation information architecture, and they replied: “Yes, that is a real issue, and we believe the best way to solve it is to document your code appropriately.”
Why is our maturity so low?
Consolidated and complete enterprise metadata is a critical asset in all these data management activities:
- Do you want to mix internal and external data in a controlled manner?
- Do you want to be really effective when it comes to analytics or the maintenance of your BI environment?
- Do you want to offer your business staff self-service BI?
- Do you want to comply with all those strict regulations?
If your answer is “yes,” you really need to have great metadata in place. This is the reason why all large data management vendors today hunt the metadata beast so aggressively. Actually, Wayne Eckerson wrote a great article about it.
But the question is, why is our maturity so low (and why do we pretend it is not)? Of course, the right answer is “It depends …” (I hope you know the famous tweet by Kent Beck, one of the fathers of agile.) But to name a few of the things that definitely make it much worse:
1. It takes some time and money to see the results of your data governance program, and sometimes it is so tempting to give up the fight.
You can start small, but the real benefits and “wow” moments come when you have most of the metadata under control. For example, imagine you are analyzing how your BI must be changed to fulfill a new business requirement. To do the impact analysis properly, you will definitely need data lineage, that is, a map of how data flows from its sources to the reports that use it. But how do you use this lineage if half of it is missing?
2. Data governance programs are very often under the control of those in finance.
It is great because the finance people are usually very much focused on financial benefits and are very familiar with governance. But on the other hand, those guys are not engineers in most cases, so it is very hard for them to imagine what can be automated and digitized and to what level. So you can sometimes see too much friction and too much manual labor in place. In the end it means a lot of work for everyone, a lot of mistakes and errors, and slow advancement in the data governance program itself.
3. For a long time data governance was only good for compliance and nobody really took care of it.
Now business is in charge, data is the most important asset, and almost every enterprise is becoming data-driven. In this environment data governance is absolutely critical, and management is able to see (while writhing in pain) how expensive it is when data governance is missing.
4. Everyone is in this game either to play and help, or to hamper.
Unfortunately, you can’t play your data governance game in isolation. It is about everyone and everything. People are in this game, not just machines. We know how hard it is to find a win-win scenario when several people are involved. So unfortunately, data governance turns into a political game from time to time.
5. Many top research and consulting companies focus more on future visions and forget about today’s reality.
Two or three years ago when we started planning our Manta Tools U.S. presence, we decided to talk to one really big research company to get some feedback. Do you know what their advice was? They told us our product is obsolete for the U.S. market. And why? Because based on their knowledge and level of experience in data governance, the maturity of U.S. companies is pretty high. Metadata is in place, everything is automated, no custom code is in BI. We were shocked. Our dreams were torn to shreds. Fortunately, we didn’t stop working, and one year ago we slowly entered the U.S. market. And what has our experience been after several months and dozens of pilots done for the biggest enterprises? I guess you know by now.
Not all hope is gone
But even if the situation is neither perfect nor easy to change, there are a few things we can do to increase our maturity. The DGIQ conference was full of presentations and workshops focused on the social aspects of data governance: how to play this game with management, how to explain the importance of data governance practice, how to show results, and how to convince other parties to play with you and not against you.
The whole transformation towards data-driven enterprises also helps a lot when it comes to management support. And because the big old vendors fell asleep, several new and eager players arose to grasp this opportunity. Collibra, Diaku, Tamr, and Waterline are just the first examples that come to mind. All those newcomers focus primarily on business users and their experience, ease of use, and collaboration. The old vendors are running fast to catch up with them, and it will be a great game to watch.
And last but not least, we need to really automate what we do and get rid of manual labor as much as possible. Boring and repetitive tasks, when done manually, are the greatest source of ineffectiveness and errors. Fortunately, all big vendors ship their products with a lot of connectors to make your life easier. But unfortunately, their focus is more on the low-hanging fruit like basic data structures (tables, reports, etc.), and they do not care too much about the rest.
One example is business logic hidden inside custom code. Yes, it is darn hard, but it is necessary to extract and manage metadata from code as well. We can’t count on developers to document it. And it is not enough to cover 60% or even 80% of that logic. In the last few months, during our pilots and PoCs, we have heard almost the same thing from everybody: “It is great your product was able to explain more than 95% of our code to us. Without such a high level of understanding, it would make no sense for us to even try.”
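To make the coverage point a bit more concrete, here is a minimal sketch of an impact analysis over a toy lineage graph. Everything in it is an assumption made up for illustration: the object names, the graph itself, and the `impacted` helper are hypothetical, and this is not how any particular product works. It only shows what happens to the answer when a single flow hidden in custom code is missing from the metadata.

```python
from collections import deque

# Hypothetical lineage: each key feeds the objects listed in its value.
FULL_LINEAGE = {
    "crm.customers": ["stg.customers"],
    "stg.customers": ["dwh.dim_customer"],  # imagine this flow lives in custom code
    "dwh.dim_customer": ["rpt.churn_report", "rpt.revenue_by_segment"],
    "erp.orders": ["dwh.fact_orders"],
    "dwh.fact_orders": ["rpt.revenue_by_segment"],
}


def impacted(lineage, changed):
    """Return every downstream object reachable from `changed` (breadth-first)."""
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for target in lineage.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# The same graph, but the flow out of stg.customers was never documented.
PARTIAL_LINEAGE = {k: v for k, v in FULL_LINEAGE.items() if k != "stg.customers"}

full = impacted(FULL_LINEAGE, "crm.customers")
partial = impacted(PARTIAL_LINEAGE, "crm.customers")

print("With complete lineage:", sorted(full))
print("With one flow missing:", sorted(partial))
print("Silently overlooked:  ", sorted(full - partial))
```

In this toy case, losing one undocumented flow hides the warehouse table and both reports from the analysis, so the result looks complete while missing most of the real impact.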
And the most important thing? Greg Norden from Boeing Commercial Airplanes said it during his DGIQ 2015 presentation: “You need to start as soon as possible, you need to execute persistently, and you need to finish often to show results!”
Data governance is a never-ending game, folks.