
Crucial Data Sufficiency Analysis: Fast and Painless with Manta Tools
Data Sufficiency can be a problem. Let us show you one possible use case of Manta Tools helping ensure the validity of a Data Warehouse Model.
If you have ever worked in BI at a multinational corporation, you know the situation: your reports are yours, special. Your data model is unique as well – it has to feed those reports, after all. Now imagine that an order comes from headquarters to unify all national and regional data models into one. You cannot really say “no” – from the global perspective, it is actually a sensible move. But how do you make sure you can still run all the reports you will most certainly have to keep producing?
This is where data sufficiency analysis comes in. Done the traditional way, it is a beast to tame, because it means making sure that every single column in every existing data warehouse maps to some particular column in the new data model. Try to imagine doing that manually and either you will shudder, or your imagination will blow a fuse. Many organizations have done it already and burnt thousands of man-days, sometimes with results that were not pretty – because to crack the detailed semantics of many columns, you have to analyze all the code that feeds them, code that may have been gradually built up by a dozen or so developers over a decade or more.

A large retail bank we spoke to (after the fact, darn it) decided to deploy a new data warehouse architecture with a unified data model across the EMEA region, with its various BI solutions and wildly varying data models. Performing the data sufficiency analysis manually turned out to be a painful, time- and resource-consuming exercise which, in the end, caused a four-month delay in the project. To add insult to injury, many gaps in the analysis were revealed far too late, in the UAT phase.
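The core of the check – which source columns have no counterpart in the unified model – can be sketched in a few lines. This is just an illustration, not Manta's actual output format; all table and column names below are hypothetical:

```python
def find_unmapped_columns(source_columns, column_mapping):
    """Return source columns that have no target in the new data model."""
    return sorted(col for col in source_columns if col not in column_mapping)

# Hypothetical column inventory of one regional warehouse...
source_columns = {
    "customer.cust_id",
    "customer.birth_dt",
    "orders.order_id",
    "orders.local_tax_code",
}

# ...and the mapping agreed so far into the unified model.
column_mapping = {
    "customer.cust_id": "party.party_id",
    "customer.birth_dt": "party.birth_date",
    "orders.order_id": "sales_order.order_id",
}

gaps = find_unmapped_columns(source_columns, column_mapping)
print(gaps)  # -> ['orders.local_tax_code']
```

The hard part, of course, is not this comparison – it is producing a complete and correct column inventory and mapping in the first place, which is exactly where the manual approach bleeds man-days.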
The Difference of Manta Tools
But it doesn’t have to be this way. Had the analysis been done with Manta Flow, all the relevant SQL scripts, data structures, and data flows would have been documented within hours. The freed-up manpower – a fraction of the original headcount – could have concentrated on the remaining, much more manageable tasks: making the inputs available to Manta, interpreting the outputs (which Manta produces in graphical form and in a variety of structured data formats, as needed), and deducing the gaps in the new data model from the differences between old and new, now in plain sight.
Any comments or questions? Just use the form on the right. And do not forget to subscribe to our feed.