What is data lineage in a data lake?
To understand data lineage in a data lake, let’s first define what a data lake is. A data lake is a centralized repository that lets you store all your structured and unstructured data at any scale. You can store your data as-is, without first structuring it, and run different types of analytics on it, from dashboards and visualizations to big data processing, real-time analytics, and machine learning, to guide better decisions.
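Within a data lake, data is moved, transformed, and consumed by other systems. Each of these activities adds to the data’s lineage: the record of where the data came from, how it was changed, and where it ended up. MANTA’s automated data lineage platform is able to visualize this lineage.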
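To make the idea concrete, lineage can be pictured as a directed graph in which each move or transformation is an edge from a source dataset to a target dataset. The Python sketch below is only an illustration under that assumption; it is not how MANTA's platform works, and the class name `LineageGraph`, its methods, and the dataset paths are all hypothetical.

```python
from collections import defaultdict


class LineageGraph:
    """Toy lineage store: maps each dataset to its direct upstream sources."""

    def __init__(self):
        # target dataset -> set of (source dataset, operation) pairs
        self._upstream = defaultdict(set)

    def record(self, source, target, operation):
        """Record that `target` was produced from `source` by `operation`."""
        self._upstream[target].add((source, operation))

    def upstream(self, dataset):
        """Return every dataset that `dataset` depends on, directly or indirectly."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for source, _operation in self._upstream.get(current, ()):
                if source not in seen:
                    seen.add(source)
                    stack.append(source)
        return seen


# Hypothetical data lake paths, used only for illustration.
graph = LineageGraph()
graph.record("raw/orders.json", "staging/orders.parquet", "convert")
graph.record("raw/customers.csv", "staging/customers.parquet", "convert")
graph.record("staging/orders.parquet", "marts/revenue_by_customer", "join")
graph.record("staging/customers.parquet", "marts/revenue_by_customer", "join")
graph.record("marts/revenue_by_customer", "dashboards/revenue", "read")

# Trace everything the dashboard ultimately depends on.
for name in sorted(graph.upstream("dashboards/revenue")):
    print(name)
```

Walking the graph backwards from a dashboard answers the question "where did this number come from?", which is the kind of question a lineage visualization is meant to answer at a glance.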