MANTA x Record Level Lineage: Why we don’t have it

You may or may not have heard about record level lineage. This is a topic that our customers ask about quite frequently, so our vice president of development, Lukas Hermann, decided to write an article where he answers some of the FAQs. Continue reading to find out more about record level lineage and why we don’t have it.

What is record level lineage?

Record level lineage is an approach to data lineage that is similar to data tagging. The idea behind data tagging is that each piece of data that is moved or transformed is tagged (labeled) by a transformation engine, which then tracks that label all the way from start to finish. This approach seems great, but it only works well when a transformation engine controls the data’s every move. Some good examples are controlled environments like Cloudera or Dremio that focus only on the origin of one specific record.

Record level lineage vs. column level lineage

A feature that MANTA does have, and that is in a way similar to record level lineage, is column level lineage. What exactly is the difference? Let’s look at an example.

Let’s say you have the column full name in your table. In this table, the full name is created by combining the first name and the last name. Imagine that in the full name column you have names like John Snow and Jack Snow. Now, let’s say that the name John Snow came to this table from your own CRM database, but Jack Snow came from a contact database acquired from a third party.

Record level lineage can tell you exactly that John Snow came from your CRM database and Jack Snow came from the third-party contact database. Column level lineage, like in MANTA, can tell you that the column full name consists of data from these two databases: your CRM database and the contact database.
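To make the difference concrete, here is a minimal Python sketch. It is purely illustrative and not how MANTA is implemented; the database, table, and column names (`crm.customers`, `contacts.people`, `dwh.customers`) and both helper functions are hypothetical.

```python
# Illustrative only: hypothetical names, not MANTA's implementation.

# Column level lineage is derived from metadata (the transformation logic),
# so it can be computed without reading a single row of data.
TRANSFORMATIONS = [
    # (target column, source columns read by the script that fills it)
    ("dwh.customers.full_name",
     ["crm.customers.first_name", "crm.customers.last_name"]),
    ("dwh.customers.full_name",
     ["contacts.people.first_name", "contacts.people.last_name"]),
]


def column_level_lineage(target):
    """Return the sorted list of source databases feeding a target column."""
    sources = set()
    for tgt, cols in TRANSFORMATIONS:
        if tgt == target:
            sources.update(col.split(".")[0] for col in cols)
    return sorted(sources)


# Record level lineage needs a tag attached to every row while the data moves.
def load_with_tags(rows_by_source):
    """Build full names and tag each resulting record with its source database."""
    return [
        {"full_name": f"{first} {last}", "source": source}
        for source, rows in rows_by_source.items()
        for first, last in rows
    ]


records = load_with_tags({
    "crm": [("John", "Snow")],
    "contacts": [("Jack", "Snow")],
})

print(column_level_lineage("dwh.customers.full_name"))  # ['contacts', 'crm']
print({r["full_name"]: r["source"] for r in records})
```

The first function never touches the names themselves; the second one only knows the source because it was present when the rows were loaded.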

Why we don’t have it

The reason why MANTA does not have record level lineage is that MANTA doesn’t “see” your data; it doesn’t even “see” that you have a John Snow and a Jack Snow in your full name column. It only reads your metadata. That is why MANTA only sees that the table contains data from these databases, and which databases they are.

Now, you might be thinking that the overall idea of the record level lineage approach might not be so bad after all. But keep in mind that if anything happens outside the walls of the transformation engine, the lineage is broken. It is also important to realize that the lineage is only there if the transformation logic has been executed. Think about all the exceptions and rules that apply only once every couple of years. You will not see them in your lineage until they are executed, which is not exactly healthy for your data governance, especially if some of those pieces are critical to your organization.

Also, tags are formed by assigning additional metadata to the records. If you lose this metadata, you will never be able to form the lineage again. And without actually running the transformation engine, you don’t know how the given record was put together, and therefore don’t know the lineage behind it.

In conclusion

If MANTA wanted to have record level lineage, it would have to start reading your data instead of your metadata, and it would have to have much more information about your environment. This would make the entire process of getting data lineage far more complicated and time-consuming.

We can safely say that we are not planning on having record level lineage as a feature any time soon. On the other hand, we plan on putting more effort into understanding your data transformations. The fact that MANTA only reads your metadata and is only interested in your data transformations, not your actual data, is the reason why MANTA can be automated so well and get data lineage so fast.

And what about Conditional Lineage? 

You might object that MANTA also has conditional lineage as a feature, so surely it looks into the actual data when creating conditional lineage. Well, not quite. We only use the data that is specifically mentioned in the scripts. You can learn more about conditional lineage in the article: How to Handle Impact Analyses in Complex DWHs with Predicates.

So what MANTA does give you is a list of the exact databases that supply data to a given column in your table. For compliance with regulations such as GDPR and with other financial or banking regulations, this is completely sufficient. And typically, no more than a few databases supply each column. So the question is: if you really need to know the specific database for each record in your table that badly, and you can have the databases narrowed down to a few per column in a couple of hours, wouldn’t it be more efficient to just check those few databases manually for the specific record yourself?

Do you have any development-related questions for Lukas, or would you like to learn more about how MANTA can solve a specific issue in your company? Don’t hesitate to contact us at manta@getmanta.com.

MANTA Cases #2: Pure Development Efficiency

In the second part of the “MANTA Cases” series, we will take a look at a US investment management company that deployed MANTA, primarily for their development team to perform impact analyses and to boost the overall efficiency and agile development processes of their internal software. 

The Customer had a fairly small yet, for its size, very complex data environment. The environment consisted of three databases:

  1. One was used as the main data warehouse that collected data for the core business
  2. The second database was a special investment data mart database (this database had specific requirements, making it even more challenging for it to cope with new releases)
  3. The third database contained resources for the sales team

The entire environment contained about 1,500 scripts, of which 20 to 30 percent were legacy code. The other 70 to 80 percent were newer code that was evolving at a fast pace due to the customer’s agile development standards. The customer used Microsoft SQL Server, most of the ETL code was in T-SQL, and some scripts were in SSAS.

Given the user requirements for each of the separate databases, the environment was growing and its expansion was necessary – otherwise, it would have become a burden. And, as is always the case in development teams, that familiar question was floating around in everyone’s heads: What’s gonna break if we make changes? The customer had developed an automated testing process for new code involving Unitask software, but for the legacy code, there was nothing like that. This made it necessary to deploy a tool that would define the dependencies between various parts of the code.

Now that MANTA has become part of the customer’s development process, MANTA can show the development team the impact a new release will have on existing code, making it easier to catch most of the larger issues during the testing phase. The issues the customer faces after a release have been, in their words, reduced by half and are in general much smaller.
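Conceptually, this kind of impact analysis is a walk through a dependency graph. The sketch below shows the idea in Python under stated assumptions: the script and table names are hypothetical, the graph is hard-coded rather than extracted from real code, and it is not a description of MANTA’s internals.

```python
from collections import deque

# Hypothetical dependency edges: each key feeds the objects in its list.
FEEDS = {
    "load_trades.sql": ["dwh.trades"],
    "dwh.trades": ["build_positions.sql"],
    "build_positions.sql": ["mart.positions"],
    "mart.positions": ["sales_report.sql"],
    "load_clients.sql": ["dwh.clients"],
}


def impacted_by(changed):
    """Breadth-first walk returning everything downstream of the changed object."""
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for nxt in FEEDS.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


# What could break if the structure of dwh.trades changes?
print(impacted_by("dwh.trades"))
# ['build_positions.sql', 'mart.positions', 'sales_report.sql']
```

Anything not returned by the walk, such as `load_clients.sql` here, does not depend on the changed object and can be left out of that round of testing.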

Another benefit of MANTA for the development team is that the overall testing time before a new software release has been cut by 10 to 20 percent, and the number of developers needed to perform a release is now just two!

In this case, MANTA helped the customer maintain an agile development strategy for their internal software and improve the overall performance of impact analyses. However, due to the benefits that came with it, such as the reduction in overall testing time and the number of people needed to perform a release, MANTA could easily be beneficial for companies having trouble with staffing as well as for other small development and business analytics teams.

Do you have any questions about this specific case, or would you like to learn more about how to deploy MANTA in your company? Write us at manta@getmanta.com.

MANTA x Conferences: Where you can meet us in 2019

February 15, 2019 by

Conference season is knocking. Have you already purchased tickets to some tech conferences in 2019? Let Jan Ulrych invite you to conferences where you can meet MANTA this year.

Hi, readers. It’s Jan here. You might remember me from my last article in 2018, where I summed up my experience from last year’s conferences. A lot has changed with our software since then, and you can come and talk to us about it. But just to give you a little appetizer before the main conference experience:

As you might have read in our Release 3.23 article, MANTA has added or upgraded both its integrations and connectors for many new technologies, including Sqoop, Talend Data Integration, Microsoft SSAS, Apache Pig, IBM IGC, Collibra DGC, PostgreSQL, and many, many more.

Now, without further ado, here are the conferences where you can meet us in 2019:

  • IBM Think, February 12-15, booth no. 472
  • Enterprise Data World, March 17-22, booth no. 60
  • FIMA US, April 1-3, booth no. yet to be specified
  • Teradata Universe, April 7-10, booth no. yet to be specified
  • Informatica World, May 20-23, booth no. yet to be specified
  • Collibra Data Citizens, May 22-23, booth no. yet to be specified
  • MIT CDOIQ Symposium 2019, July 31-August 2, details yet to be specified

We are looking forward to seeing you there!

If you have any questions about MANTA’s presence at conferences, feel free to e-mail us at manta@getmanta.com.

MANTA Cases #1 (Pilot): Market Data Tracking

We have created a new series for you on our blog: MANTA Cases. Real-life examples of how to use MANTA are what interest our readers and prospective customers the most, so we have decided to periodically publish some of the most interesting and creative cases on our blog. You will be able to find the entire series in the category called “Use Cases & Case Studies” on the right side of the page. Enjoy!

The first part of our series is dedicated to using MANTA for market data tracking. This interesting way of using MANTA was thought up by a company that provides analysis and reporting services to their customers. Such companies handle large amounts of data. They themselves also need to buy enough data for their analyses and reports from other companies, e.g. market data from Bloomberg and others.

The problem, in this case, was not a matter of personal data, therefore they did not need to comply with GDPR. (Compliance projects are, by the way, one of the most common uses of MANTA.) This was data such as past and current prices of shares, market research data, and other data that has a different kind of “sensitivity”.

Here, there were two main problems with the sensitivity of the data:

  1. When the company buys market data from third parties, they often need to assure the seller that the data is being handled in a safe way, with minimum risk of leaking, and it usually must be used according to the terms of a license agreement for data use.
  2. The second thing to look out for in relation to this data is that such data is usually priced according to the number of users who have access to it.

Based on these two points, MANTA had to help the customer in two ways. To comply with the licensing agreements, the customer needed to be able to prove how and where the data is stored. And to prove the number of users, the customer needed to monitor the profiles of the users working with the data.

How does MANTA solve these problems?

MANTA documents which market data sets are used in which individual reports. When this is combined with access privileges to the individual reports, the company has a clear and documented understanding of which data set is used by how many end-users, and they pay for the particular data set accordingly. Moreover, with people joining and leaving and other changes happening in the actual data use, the always up-to-date lineage gives them exact numbers at any point in time.
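The counting itself can be pictured as a join between two mappings. The Python sketch below is only an illustration of that idea: the report names, data set names, users, and the `users_per_data_set` helper are all hypothetical, and in practice the first mapping would come from lineage and the second from the reporting platform’s access privileges.

```python
# Hypothetical inputs, for illustration only.
# Which market data sets feed each report (this is what lineage provides).
REPORT_SOURCES = {
    "equity_overview": {"vendor_prices", "market_research"},
    "daily_pricing": {"vendor_prices"},
}
# Who is allowed to open each report (this comes from access privileges).
REPORT_ACCESS = {
    "equity_overview": {"alice", "bob"},
    "daily_pricing": {"bob", "carol"},
}


def users_per_data_set(report_sources, report_access):
    """Count the distinct end-users who can reach each data set through any report."""
    users = {}
    for report, data_sets in report_sources.items():
        for data_set in data_sets:
            users.setdefault(data_set, set()).update(report_access.get(report, set()))
    return {data_set: len(people) for data_set, people in users.items()}


counts = users_per_data_set(REPORT_SOURCES, REPORT_ACCESS)
print(sorted(counts.items()))  # [('market_research', 2), ('vendor_prices', 3)]
```

Because users are collected in sets, someone with access to several reports built on the same data set is counted only once.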

Do you have any questions about how MANTA can solve a problem at your company? Contact us at manta@getmanta.com or use the cute little bot on the right. 

Budget Friendly MANTA: Our Pricing Is Built to Last

January 9, 2019 by

Most budgeting for the next year occurs at the end of the previous one. But we really get into it in January, when things start to fall into place. To help you calculate along with MANTA in 2019, we have written an article all about MANTA’s pricing.

Last year we published an article about how MANTA is able to last a company lifetime from a technological perspective. Now, we’ll share how MANTA’s pricing is also set to meet your needs. Besides the carefully designed tiers that you can read about here, we have also made sure our pricing meets customer needs even when special situations occur – for example, the growth of the environment or the need to upgrade to a higher tier.

When it comes to upgrading, we are always sure to take into account how much you have already paid for MANTA. You see, with us the minimum term for a subscription license is one year. Twelve months from beginning to end. For example, let’s say you buy a MANTA license now, in January. But as you cannot always plan absolutely everything in advance, by April you find you need to upgrade to a higher tier. In this case, MANTA takes into account the remaining months of the subscription and allows you to upgrade to a new tier with a contract starting from the date of the upgrade. So now, your new contract with MANTA will last from April to April of the following year.

Now, a less standard situation would be one where the customer upgrades after a longer period of time. Such a customer might have an older MANTA license, for example, a now historic one-time-purchase license. In such a case, MANTA takes this license into account as well. If such a license is still supported, meaning the customer pays for support, we pause this license at the time the customer closes an annual contract for a higher tier. Yes, you understood that right, pause!

Here you can see the transformation shown on an example of a model company that would like to upgrade to an Unlimited Fleet Tier.

So theoretically, if one of our customers ever wants to downgrade from the subscription back to the basic license, it is always possible. But that has never happened. Once our customers take a sail on one of the big boats, they never want to go back to the cheap-folks pier.

You can view our full price list here.

Do you have any more questions about how our pricing works? Then write to us at manta@getmanta.com or schedule a call at a time that is convenient for you directly in our chatbot. He’s over there on the right, and he is really not as annoying as he looks, I promise!
