The Data Tracer project aims to identify metadata using machine learning. Where the data came from, where it went, what transformations were applied, and what its types and connections are: these are the things we aspire to identify. To support these machine-learning-based methods, we build several machine learning pipelines and make them easy for end users to use. Explore our open source libraries, testbeds, and benchmarking frameworks, contribute, and become part of the community. And most of all, try them out and give us feedback.
Learn about the concepts that underpin Data Tracer, and about its evaluation and usage, through our tutorials.
Explore our open source libraries, contribute, and become part of the community.
Tracer
Data Lineage Tracing Library.
DataReactor
Augmenting relational datasets by generating derived columns with known lineage.
Metadata
Organized representation and validation of metadata.