StreamSets Data Collector is an open-source execution engine for fast data ingestion and light transformations. It is designed to run smart data pipelines for streaming, change data capture (CDC), and batch data without hand coding. The MANTA StreamSets scanner supports, among others, Hadoop, JDBC, and Google BigQuery as both origin and destination stages, as well as processor stages that work with fields, expressions, schemas, and data parsers.