Data Lineage for Apache Kafka
Apache Kafka is an open-source distributed event streaming platform used for messaging, website activity tracking, stream processing, and many other use cases.
MANTA can either connect to the Confluent Platform Schema Registry and automatically extract the schemas used by Kafka topics, or let you describe the Kafka environment yourself so that you can visualize it and benefit from integrations with other scanners. The Kafka visualization includes objects such as clusters, topics, schemas, and columns.
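To make the automated path more concrete, the sketch below reads the latest schema of every subject through the Schema Registry REST API (`GET /subjects` and `GET /subjects/{subject}/versions/latest`). It is only an illustration of that kind of extraction, not MANTA's implementation. The function names, the registry URL, and the injectable `get_json` parameter are assumptions made for this example, and the code assumes Avro or JSON Schema subjects, whose schema text is itself JSON.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def http_get_json(url):
    """Fetch a URL with a plain HTTP GET (no authentication) and decode the JSON body."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def latest_schemas(base_url, get_json=http_get_json):
    """Return {subject: parsed schema} for the latest version of each subject.

    `get_json` is a hypothetical hook so the HTTP call can be replaced in tests.
    """
    base = base_url.rstrip("/")
    result = {}
    for subject in get_json(f"{base}/subjects"):
        # Subject names may contain characters that need URL escaping.
        url = f"{base}/subjects/{quote(subject, safe='')}/versions/latest"
        latest = get_json(url)
        # The registry returns the schema as a JSON-encoded string.
        result[subject] = json.loads(latest["schema"])
    return result
```

With a registry reachable at a placeholder address such as `http://localhost:8081`, `latest_schemas("http://localhost:8081")` would return one parsed schema per subject.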
Main scanner features
- Metadata extraction from Confluent Platform Schema Registry
- Option to define the elements in Kafka manually by providing a simple JSON file
- Schema definitions in JSON Schema and Avro formats
- Integrations with DataStage and StreamSets scanners
- Schema definitions using “raw” JSON files or payloads
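As a rough illustration of how a schema definition maps to the columns shown in a visualization, the sketch below lists the top-level fields of an Avro record schema as name and type pairs. It is a simplified assumption about that mapping, not MANTA's actual logic. The `avro_columns` helper and the sample `ORDER_SCHEMA` are hypothetical, and nested types are reported by kind only.

```python
import json

# Hypothetical sample schema used only for this example.
ORDER_SCHEMA = json.dumps({
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "note", "type": ["null", "string"]},
        {"name": "items", "type": {"type": "array", "items": "string"}},
    ],
})


def avro_columns(schema_json):
    """List (column name, type) pairs for the top-level fields of an Avro record."""
    schema = json.loads(schema_json)
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    columns = []
    for field in schema["fields"]:
        field_type = field["type"]
        if isinstance(field_type, list):
            # A union such as ["null", "string"]: keep the non-null branches.
            names = [
                t if isinstance(t, str) else t.get("type", "complex")
                for t in field_type
                if t != "null"
            ]
            field_type = "|".join(names)
        elif isinstance(field_type, dict):
            # Nested record, array, map, or enum: report the kind only.
            field_type = field_type.get("type", "complex")
        columns.append((field["name"], field_type))
    return columns


for name, kind in avro_columns(ORDER_SCHEMA):
    print(f"{name}: {kind}")
```

A fuller version would recurse into nested records and handle JSON Schema documents as well; both are left out here to keep the sketch short.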
What you can look forward to
- Exports to third-party tools
- HTTPS support for extraction from Confluent Schema Registry
- Support for the different naming strategies used in Confluent Schema Registry
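For context on the last item, Confluent Schema Registry derives the subject name under which a schema is registered from one of three strategies: TopicNameStrategy, RecordNameStrategy, and TopicRecordNameStrategy. The sketch below shows how the subject name differs under each. The `subject_name` helper and the sample topic and record names are hypothetical, and the code says nothing about how MANTA will implement this support.

```python
def subject_name(strategy, topic, record_name=None, is_key=False):
    """Derive a Schema Registry subject name under one of Confluent's strategies."""
    if strategy == "TopicNameStrategy":
        # One subject per topic and per key/value side.
        return f"{topic}-{'key' if is_key else 'value'}"
    if strategy not in ("RecordNameStrategy", "TopicRecordNameStrategy"):
        raise ValueError(f"unknown strategy: {strategy}")
    if record_name is None:
        raise ValueError(f"{strategy} needs the fully qualified record name")
    if strategy == "RecordNameStrategy":
        # The subject is the record name alone, independent of the topic.
        return record_name
    # TopicRecordNameStrategy: the topic and record name combined.
    return f"{topic}-{record_name}"


for strategy in ("TopicNameStrategy", "RecordNameStrategy", "TopicRecordNameStrategy"):
    print(strategy, "->", subject_name(strategy, "orders", "com.example.Order"))
```

Because the record-based strategies allow several record types per topic, a lineage tool cannot assume that a topic named `orders` has exactly one subject named `orders-value`.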
Frequently Asked Questions
What is a data lineage scanner?
A data lineage scanner connects to database repositories, ETL tools, reporting tools, and other types of source technology to document where data is sourced from and how it flows, transforms, and impacts assets both upstream and downstream. This makes it possible to gain full visibility and control over even the most complex data pipelines.
How flexible is MANTA when it comes to possible integrations?
Can I integrate MANTA with my CI/CD pipeline?
Yes, MANTA can be used as a component of a CI/CD pipeline to supplement teams’ development efforts.
How can MANTA integrate with my data intelligence?
You can boost your data intelligence efforts with detailed, accurate, and up-to-date data lineage provided by MANTA. MANTA has a robust API for developing integrations with data intelligence tools.
Can I integrate MANTA with my data privacy tool?
Yes, you can leverage MANTA’s comprehensive data lineage to build trust in data, ensure data security, and adjust your data privacy policies. MANTA has a robust API for developing integrations with data privacy tools.
Can I integrate MANTA with my profiling tool?
You can utilize MANTA’s detailed lineage and unique features for data profiling and achieving better data quality. MANTA has a robust API for developing integrations with data profiling tools.
How can MANTA integrate with my metadata management tool?
MANTA has out-of-the-box connectors to all the major players in the data governance/cataloging space. MANTA can also export its repository to consumable formats for unsupported third-party metadata management applications. Please visit MANTA’s Integrations page to find a full list of supported data governance tools/catalogs.
Does MANTA work with various ETL orchestrations?
There will always be technologies on the market that don’t have supported scanners provided by MANTA. In order for the lineage from unsupported technologies to be represented in MANTA visualization diagrams, MANTA provides a framework called Open MANTA. The Open MANTA framework makes it possible to define and manage lineage generated by unsupported technologies.
How can data lineage improve data quality?
When you have a complete overview of all your data flows, sources, transformations, and dependencies, you have control of your data assets. You can speak to the accuracy and quality of your data and have confidence in your data and reports. By showing how your data moves across systems, where it originated, how it transforms along the way, and how it’s interconnected, data lineage can help you ensure the quality of your data, reinforce your overall data management strategy, and increase trust in your data.
What is the purpose of data lineage?
Data lineage helps you tame data complexity and gives you a full overview of how your data moves across systems, including where it originated, how it transforms along the way, and how it’s interconnected. Such an overview will help you boost your data governance efforts, increase overall trust in data, achieve full regulatory compliance, accelerate root cause and impact analyses, roll out frequent bug-free releases, painlessly migrate to the cloud, and more.
Get to Know Your Data’s Complete Story with Data Lineage
Metadata, or data about your data, holds the information that helps you unlock valuable insights. Those insights let you fully understand your data and get rid of anecdote-driven decisions and processes once and for all.