Engineer’s Notes

MANTA + Informatica

MANTA completes Informatica to form a comprehensive metadata management platform. But how precisely does the bond work? [VIDEO BELOW ↓ ]

MANTA completes Informatica to form a comprehensive metadata management platform. But how precisely does the bond work? [VIDEO BELOW ↓ ]

Trust Through Understanding

Our product allows you to trust the data in your BI environment, because it is specialized in cracking SQL code and helps to fill gaps in metadata management solutions. It does not matter whether they were deployed to fulfill compliance regulations or create the backbone of your data governance efforts. The Informatica Metadata Management solution has a rudimentary capability to parse SQL, but, in our experience with our customers over the years, there are still blind spots here and there.

MANTA connects to IMM through XConnect (native API plugin), and enriches the metadata model of IMM with missing pieces of data lineage:

  • BTEQ scripts, stored procedures, views, and macros from Teradata
  • PL/SQL scripts, stored procedures, packages, and more, including DB links, from Oracle DB & Exadata
  • T-SQL scripts, stored procedures and more, including linked servers, from Microsoft SQL Server, Sybase (now SAP ASE), and PDW
  • NZPLSQL scripts, stored procedures, and more from IBM Netezza
  • DB2 scripts, stored procedures, and more from IBM DB2
  • SQL overrides from Informatica PowerCenter, Cognos and Microsoft SQL Server Reporting Services

It fills in the gaps in data flows and allows our customers to get end-to-end data lineage (including those pesky indirect data flows).

Here is a live shot of MANTA + IMM,  fullscreen is recommended.

Beyond Lineage

MANTA’s ultimate goal is to understand the semantics of the code being analyzed. If you have ever thought about advanced performance tuning, real data protection analysis, automated business lineage extraction,or migration of your code base to a different platform, MANTA is exactly what you need.

For customers who do not have Informatica Metadata Manager, but do enjoy the advanced ETL capabilities of  Informatica PowerCenter, we are also able to provide data lineage in our own visualization or IBM IGC. MANTA’s open API allows customers to push metadata to 3rd-party tools such as Collibra, Adaptive, Alation, Axon, and others.

Faster and Cheaper

The impressive part is how fast MANTA works. It can map a BI environment at a speed that is incomparable with any  human workforce, which means you save quite a lot of money on man-hours.

If you would like a more detailed explanation of MANTA + Informatica, and how it can help with your current situation, be sure to contact one of our specialists! They can discuss your situation in detail and give you a better idea of how MANTA can help with your data governance efforts.

MANTA 3.17: Say Hello to Netezza, Business Data Lineage, and Much More

It’s a bird! It’s a plane! It’s MANTA Flow 3.17 coming your way!

It’s a bird! It’s a plane! It’s MANTA Flow 3.17 coming your way!

When we said (and we always say this) that we would expand the list of our supported technologies, we weren’t lying. MANTA 3.17 comes with support for IBM Netezza and IBM DB2, provides business data lineage, and much more!

The biggest news among big news is that MANTA now supports Netezza. As always, pioneers are wanted and needed (ask for your trial).  It can do everything we taught it to, so if there is anyone out there who has a Netezza solution and wants to have the data lineage all nice and clean, we’ve got you covered.

This release is also the debut for a DB2 connector, now (almost) in stock. It’s no secret that we still have some plans for that one, so stay tuned. Because the Parser Team rocks, we worked on all of our parsers and we can proudly say that every new version of MANTA brings more and more improvements.

Did you know that complete business data lineage is now a thing? Yes, it is! And MANTA can provide that for you!

What we do is we can put together the physical metadata from the existing MANTA connectors with business terms provided by the user. From that, MANTA Flow can give you end-to-end business data lineage. We are the first ones who can provide you with data lineage that suits the needs of your business users as well as your BI team from information that your company already owns.

And as usual, that’s not it. We have added some cool features to MANTA, e.g. searching in source code, and much more.

How to Handle Impact Analyses in Complex DWHs with Predicates

“How to get full data lineage in complex BI environments and perform reliable impact analyses?” Predicates (with the help of Manta Flow!) might be the answer. 

“How to get full data lineage in complex BI environments and perform reliable impact analyses?” Predicates (with the help of Manta Flow!) might be the answer. 

During our pilots and deployments, we often find data warehouse environments that use very general physical models including several big tables like PARTY, BALANCE, ORDER and others. These tables contain data obtained from various source systems, and there are a lot of data marts and reports built on top of them. These tables make things difficult during the impact analysis because data lineage from almost every report goes through them to all sources making the result worthless.

Impact Analyses Do Not Have to Be THIS BIG 

Let’s take a look at an example to understand exactly what happens. The table PARTY contains all individuals and companies that are somehow related to the organization. Thus, in one table, it is possible to have records for clients, employees, suppliers and its branch network. Each type of entity is identified by a unique attribute or source system from which data is obtained – for example, clients are managed in a different system than employees.

Now, let’s assume we have two reports based on data from the PARTY table – a report EMPL_REPORT that displays information about employees and another report BRANCH_REPORT that displays information about the branch network. If we use the standard data lineage analysis, we can get this picture:

predicates1

Although only data from the EMPLOYEE source table is relevant for the report EMPL_REPORT, the impact analysis from that report also includes the CLIENT, BRANCH and SUPPLIER source tables due to the PARTY table. The problem is the same for the report BRANCH_REPORT. From the other side, the impact analysis from the EMPLOYEE source table includes both the EMPL_REPORT and BRANCH_REPORT which is confusing.
In the real environment, there are dozens of source systems and hundreds of reports, which makes the standard data lineage analysis worthless.

The Advanced Data Lineage Analysis 

Fortunately, there is a solution. When data is inserted into the PARTY table from different source systems, there is often a column like PARTY.source_system_id where the identification of the source system is stored as a constant. Similarly, when a report is created that consumes data only from specific source systems, there is a condition in the statement filtering data based on the PARTY.source_system_id column. Thus, it is possible to automatically analyze both the insertion and selection to/from the PARTY table and create predicates such as PARTY.source_system_id = 20 that are then stored together with data lineage in the metadata repository. Therefore, it is possible to include them in the computation during the impact analysis.

Thanks to that, if we perform an impact analysis from the report EMPL_REPORT, the predicate PARTY.source_system_id = 20 is gathered before the table PARTY. When the analysis continues towards source tables, the predicate for each path is selected and compared to what has already been gathered. Therefore, when the path to the source table CLIENT with the predicate PARTY.source_system_id = 10 is tested, the result is that both predicates cannot hold at once, so data for this report cannot come from this source table. Conversely, when the path to source table EMPLOYEE with the predicate PARTY.souce_system_id = 20 is tested, the result is that data for this report can come from this source table, so it is included in the result of the impact analysis. We can get similar results if we perform an impact analysis for the BRANCH_REPORT and also from sources like the EMPLOYEE table.

The result of the advanced data lineage analysis can look like this (in reality, if we perform the impact analysis from the EMPL_REPORT, we will only see the EMPLOYEE and PARTY tables):

predicates2

Surely, the situation can be far more complex. For example, the data from the PARTY table can be pre-computed for more source systems first, and then several reports can be created on top of them for only a specific source system, like in this picture:

predicates3

This is also something that can be handled and, as you may have expected, even this is a part of the Manta Flow product analysis.

If you have any questions or comments, feel free to contact Lukas at manta@mantatools.com. You can try these predicate-based impact analyses in our free trial – just request it using the form on the right. 

 

How To Inspect Raw Data Lineage With Manta Flow

Risk departments have a lot of complex SQL queries in their data warehouses and data marts. But sometimes it’s really difficult to find the right level of detail. Manta Flow can help.

Risk departments have a lot of complex SQL queries in their data warehouses and data marts. But sometimes it’s really difficult to find the right level of detail. Manta Flow can help.

“When we present Manta Flow to potential customers, most of them are happy that we can reduce very complex SQL statements to a few simple rectangles connected by arrows”, explains Lukas Hermann, our Director of Engineering. “They need to be able to quickly understand what source tables their SQL queries read, what target tables they fill, what columns are involved in computing a particular column and how.”

The Usual

For example, let’s look at just two ordinary insert statements moving data from a stage to a datamart and to a report:

raw_code

It could take you quite a while to analyze which columns are involved in the computation. But with Manta Flow it is really easy to see, including all the statements involved:

Direct

(click to show the large version in a new tab)

This is perfectly sufficient for all business analytics in data warehouse environments. All the unnecessary details like exactly how data is computed, filtered, aggregated, or ordered are hidden. And if you want to go deeper, Manta Flow can easily show the SQL code of the statements where you find the full detail.

The Raw

However, some analysts (particulary from the aforementioned risk departments) say that their SQL statements are really huge, including many subselects, complex expressions, etc., so the jump between the clear picture and the SQL code is too big. Therefore, they would like to see all the computation steps in a similar simplified format, and they ask if Manta Flow can handle it, if it has all the information necessary to show it.

The answer is that Manta Flow has the most detailed information possible about each part of the statement, but so as not to disturb you with what are in most cases useless details, it filters the information to the best level of detail. If you want to see everything including expressions, conditions, aggregations, etc., it’s possible to configure or completely turn off the filtering. Manta Flow is able to show you unfiltered information, but still keep you in the loop and oriented within your own systems.

click_to_see_the_big_picture

(image will open in a new tab)

See? It’s possible to show the SQL code in the precise position of each part of the statement shown.

If you’d like to try something like that yourself, just let us know in the form on the right. Also, do not forget to follow us on Twitter.

Manta Tools 3.12: The New Versions of Oracle & Informatica, a New Demo, and More!

Surprise! Another Manta Tools release here a bit early. And what’s in the box?

Surprise! Another Manta Tools release here a bit early. And what’s in the box?

Our customer base is growing quickly, and feature requests are piling up. That’s why Lukas (our Codehead Prime) decided to squeeze one more major release into the schedule. So, what’s new?

We need to catch up on the development of technologies we support, that’s one of the keys to our success. We generally support the latest versions, but the guys did a thorough review and declared Oracle 12c and Informatica 10 (namely Informatica PowerCenter and Informatica Metadata Manager) fully supported. Although most of the market still has older versions, both Oracle and Informatica are pushing people really hard to switch to newest releases. Informatica even announced the end of support for 9-something versions next year.

Manta Flow, in particular, received a few major updates (minor updates are not worth mentioning due to the very fact that they are minor). REST API is now available in Manta Flow. Our native visualization can now easily show SQL overrides, and individual nodes (and their children) are now easy to contract with just one click. What does it mean to “contract nodes,” you ask? I thought you would ask, so I made an animated GIF about it! Check it out:

node_contraction

See? Big fat script in the middle got out of the way and you can see just tables you wanted. Notice the line from one table to another is thicker, so you can click on it and show the script again. 

And last, but not least: Manta Flow online demo was updated, and it now corresponds with the latest version of Manta Flow you can get off the shelf. There are some limits to it though – as with any cloud product, it does not have access to your system and it cannot show full data lineage (only the part you push in). But try it anyway, it’s awesome. (Or ask for a free trial right away, and test Manta Flow at home.)

Did you know we are on Twitter? Yes, it’s true. Follow us, please. Also, do not hesitate to ask any questions and submit any comments via email to manta@mantatools.com. 

 

 

 

Manta Tools 3.11: Increased Compatibility, Improved Visualization & More

Compatible with yet another data source & offering more user-friendly visualizations, that’s Manta Tools 3.11.  

Compatible with yet another data source & offering more user-friendly visualizations, that’s Manta Tools 3.11.  

So, what’s new? Manta Flow’s visualization export functions have been enhanced significantly – you can now filter resources and change the level of detail on which the lineage is extracted. Also, native Manta Flow visualization now shows SQL override queries in Informatica PowerCenter as a parameter.

For the even faster automated visualization settings, Manta Flow now supports selection via text input (such as CSVs). This is especially useful when a customer with a huge environment would like to visualize a very specific part of it. And, as always, Teradata & Oracle parsers are now better than ever.

And last, but definitely not least, is our biggest new “feature” – compatibility with Microsoft SQL Server. But this topic is waay to big for a humble release blog post like this one. We will announce it in separate article tomorrow.

Any questions, comments or concerns? Just send them to manta@mantatools.com. And do not forget to follow us on Twitter!

Manta Tools 3.10: Unlimited Time Machine, Automated Backup & More

The new version is out – what is new with Manta Tools 3.10?

The new version is out – what is new with Manta Tools 3.10?

One of the all time most requested Manta Flow features was the unlimited time machine function. It is now possible to see every visualization you’ve ever performed on your system and freely explore them like you would with current ones. Manta Tools’ metadata repository is also automatically backed-up whenever you like, so even your most paranoid teammates can sleep at night.

When it comes to technology, we’ve improved metadata extraction from Teradata databases and the predicate system for Oracle and Informatica. The Manta Flow visualization itself has also been updated. We’ve added extended highlighting (you can try this in our online demo!) and also a direct export option on the customer’s catalog page.

And what’s coming in our next version? We will finally release support for a whole new technology – Microsoft SQL Server, including a connection with Informatica Metadata Manager.

Do you have any questions or suggestions? Please let us know at manta@mantatools.com, and do not forget to follow us on Twitter.

 

 

Manta Tools’ New Customer in Germany: DIY Retailer OBI

We have our first customer in Germany – and it’s the most famous beaver in Europe!

We have our first customer in Germany – and it’s the most famous beaver in Europe!

Germany is one of the most promising markets for us in the EMEA region (along with the rest of Western Europe), and we pushed strongly there parallel to our expansion to the United States. We managed to get a few experienced partners in Germany, attended the TDWI conference in Munich, and our joint efforts have paid off.

At the end of the summer, we signed up our first customer in Germany: OBI, the biggest DIY retail store in Europe and the third largest on the entire planet. We were happy to provide them with Manta Flow for Teradata, capable of analyzing all the data flows of their whole Teradata-based environment, and our own end-to-end data lineage visualization (you can try it for yourself in our demo center).

“When I first talked with my team in the beginning of 2015 about Germany being one of the top target markets for us, I believed we would be able to close the first deal there before the end of the year,” says Tomas Kratky, CEO of Manta Tools. “We slowly built a network of hard-working sales reps and enthusiastic partners, started targeted marketing campaigns, and voilà, we closed the first deal by the end of this year’s sunny summer. And our task is simple – to get other customers and to get them fast.”

We are preparing a joint case study, so stay tuned for details!

Do not forget to follow us on Twitter, LinkedIN, or the German LinkedIn-like social network XING

 

 

 

Manta Tools 3.9: Time Machine, Role Management & More

It’s been a while, but a new release of Manta Tools is here! And what’s new in version 3.9? 

It’s been a while, but a new release of Manta Tools is here! And what’s new in version 3.9? 

As usual, a lot. Manta Checker is introducing advanced role management – it is used in bigger BI/DWH teams now and we needed to comply with all the enterprise security policies. Admin/configuration roles are now separate from user/engineer roles, and compliance officers now have broader control over what’s happening in Manta Tools. The rules used to enforce standards and policies in Checker now have three different categories – pre-validation, validation, and post-validation rules. That’s crucial for most of the complex transformation scenarios – Tomas Fechtner is gonna explain this feature more in a separate article.

And last but not least – for the very first time, we are introducing something called incremental parsing. Manta Checker’s performance is increased by parsing only parts of code when necessary, thereby making transformations faster and more accurate.

Manta Flow has also been vastly improved, mostly on the grammar side – we improve grammar in every release, but this time it was Oracle who received the biggest update. Also, Manta Flow for Oracle received a major update of its database links. But the biggest and most important function is the Manta Flow time machine function. The simple ability to work with the previous version of visualization gives users even more control over what is happening in their data warehouse.

Up Next in 3.10

In the next version, Manta Tools 3.10, we are planning on expanding the powerful time machine function (and making it possible to compare the current version with any previous version in the history of the universe) and with both products, we will focus on Teradata a little more next time.

Comments? Questions? Let us know at manta@mantatools.com or use the form on the right.

Manta Flow + SAP PowerDesigner: A New Metamodel Mapping Platform

In a previous article, Lukas Hermann explained how PowerDesigner and Manta Flow can work together. Now, our Senior Developer Jiri Tousek will explain all the technical details. 

In a previous article, Lukas Hermann explained how PowerDesigner and Manta Flow can work together. Now, our Senior Developer Jiri Tousek will explain all the technical details. 

Basically, we’ve created a general procedure for importing PowerDesigner’s metadata to Manta Flow for various purposes, e.g. advanced impact analyses. It is a simple, two-step process:

Step 1: Get Metadata from PowerDesigner

First, we need to export/access the PDM file and extract its metadata as an XML file (our tool also supports batch processing of multiple files). The metadata is easily mapped to physical database objects as long as the physical name (“code” field) is filled in correctly and the correct model object type is used. PowerDesigner’s own API is responsible for two-way compatibility between different versions of PD. We use API because PDM file formats have changed a lot between different PD versions. Each model is extracted with standard and also extended attributes in all packages, tables, views and columns.

Step 2: Mapping Platform

Our mapping platform supports metadata in any general XML file. There’s only one requirement: the XML has to contain physical data model names and the object types of the database objects being mapped. Our configuration is quite flexible though. It’s XPath-based and can support pretty much any metadata scheme. It goes without saying that you can choose what metadata to extract from the input and that we also support translation of the metadata attribute names (e.g. from technical identifiers to human-readable attribute labels).

Benefits

So, this is a short summary of how we export metadata from PowerDesigner and import it to Manta Flow. And how can you benefit from this in your daily work? Well, there are a bunch of ways, but this triplet comes to mind right now:

1) In Manta Flow, you can see data flows all together with attributes.

2) Pretty much everything written in this article.

3) You can easily create an impact analysis regrading those attributes. This is probably the most interesting, so let’s illustrate this with a use case – security impact analysis:

Imagine that you need to investigate how your sensitive data propagates through your system. And obviously, you need to compare the real security status with the rules, guidelines, and procedures for your organization. That’s not something you can just wave off, and this solution will make your work doing this a lot easier. In order to perform this analysis, you need to define sensitivity levels for (some of) your database objects. By importing those from PowerDesigner, you can avoid manually setting up administration rights for sensitivity levels. Using the policies you already have in place for PD will save you time as you avoid bureaucratic hassles.

Jiri Tousek would love to hear your feedback on this – let him know what you think at manta@mantatools.com and follow us on Twitter

Subscribe to our newsletter

We cherish your privacy.

By using this site, you agree with using our cookies.