Data Catalog

Collection and Integrated Management of Metadata from Data Assets

Integrated management of metadata from enterprise data assets lets users such as data analysts, data scientists, and developers use data more effectively. Automated collection and integration keep the metadata up to date, and strong search capabilities and data classification make data assets more efficient to use.

Overview


Service Architecture

  • User → Request/Distribute Products → Data Catalog ← Process Data ← Data Engineer
  • Data Source Crawling → Data Catalog
  • Data Source Hooking → Data Catalog
Data Catalog
  • Metadata Crawler: Atlas, Ranger, Kafka, HBase, HDFS, Solr, PostgreSQL, ZooKeeper
  • Catalog Server: Atlas, Ranger, Kafka, HBase, HDFS, Solr, PostgreSQL, ZooKeeper
Data Source
  • Oracle Database, PostgreSQL, Vertica, Microsoft SQL Server, MariaDB, MySQL
  • Hive
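
The crawling path in the architecture above can be illustrated with a minimal sketch. It is not the product's crawler: it assumes an in-memory SQLite database as a stand-in data source, and the function name `crawl_metadata` and the sample schema are hypothetical. It only shows the general idea of reading table and column metadata from a source and handing it to a catalog.

```python
import sqlite3


def crawl_metadata(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for one database."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(col[1], col[2]) for col in columns]
    return catalog


# Stand-in data source (assumption): an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

catalog = crawl_metadata(conn)
for table, columns in catalog.items():
    print(table, columns)
```

A crawler for the listed sources (Oracle Database, PostgreSQL, Hive, and so on) would query each system's own metadata views instead of `sqlite_master`, but the resulting structure could look similar.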

Key Features

  • Automated collection of metadata

    - Meta crawler: Collects metadata from data sources at the database, schema, table, and column levels
    - Lineage crawler: Collects history information from data sources
    - Sample crawler: Collects sample metadata

  • Check data lineage

    - Provide a visual representation of data flow
    - Manage table and schema change history

  • Integrated search

    - Search data using filters such as metadata, table name, and tag
    - Look up table details such as summary, columns, and lineage
    - Search by filters such as role, owner, classification, and terms

  • Data classification

    - Identify key traits of data assets
    - Group assets for data protection
    - Provide access control for metadata using tag policies
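
As a rough illustration of how filtered search, tag policies, and lineage lookup fit together, the sketch below models a tiny in-memory catalog. Everything in it is an assumption made for the example: the `TableEntry` class, the sample tables, the role names, and the rule that a tag listed in `TAG_POLICY` restricts visibility to the listed roles. It does not reflect the product's actual data model or policy engine.

```python
from dataclasses import dataclass, field


@dataclass
class TableEntry:
    name: str
    owner: str
    tags: set = field(default_factory=set)
    upstream: list = field(default_factory=list)  # tables this one is derived from


# Hypothetical catalog contents.
CATALOG = {
    "raw_orders": TableEntry("raw_orders", "etl", {"sales"}),
    "customers": TableEntry("customers", "crm", {"pii"}),
    "order_report": TableEntry(
        "order_report", "bi", {"sales"}, ["raw_orders", "customers"]
    ),
}

# Hypothetical tag policy: tag -> roles allowed to see metadata carrying that tag.
TAG_POLICY = {"pii": {"admin"}}


def visible(entry, role):
    """An entry is visible unless one of its tags is restricted to other roles."""
    return all(role in TAG_POLICY.get(tag, {role}) for tag in entry.tags)


def search(role, name_contains="", tag=None, owner=None):
    """Filter by name, tag, and owner, then apply the tag policy."""
    return sorted(
        e.name
        for e in CATALOG.values()
        if name_contains in e.name
        and (tag is None or tag in e.tags)
        and (owner is None or e.owner == owner)
        and visible(e, role)
    )


def lineage(name):
    """Return every upstream table of `name`, following derivations recursively."""
    seen = []
    for parent in CATALOG[name].upstream:
        if parent not in seen:
            seen.append(parent)
            seen.extend(p for p in lineage(parent) if p not in seen)
    return seen


print(search("analyst", tag="sales"))  # ['order_report', 'raw_orders']
print(search("analyst"))               # 'customers' is hidden by the pii policy
print(search("admin"))                 # admin sees all three tables
print(lineage("order_report"))         # ['raw_orders', 'customers']
```

Applying the policy inside `search` rather than in the caller means restricted metadata never reaches a result list, which is one plausible way to read "access control for metadata using tag policy".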

Pricing

  • Billing

    - Charged based on the VM resources used by Data Catalog
    - VM resources and storage are charged separately