How to "bulk load" an existing data dictionary into Zetaris Catalogue

Overview:

Users might want to update metadata tags and descriptions into the data catalog of Lightning. To make the process easier and avoid manual work of writing hundreds of SQL queries or manually adding them, a python library is available for our use.

Description of Library:


Information 

Details 

Name of the Python library

ndpmetadata

Version

0.5

Purpose

Automatically update tags & descriptions in data catalog on Lightning platform from a excel sheet

Requirements:

  1. Python should be installed (The latest version if possible)

  2. Python Libraries

    1. Pandas (version 1.1.5 or newer)

    2. Jaydebeapi (Latest version)

    3. Xlrd (version 1.2.0)

  3. Any IDE (Google Colab, Jupyter Notebooks, PyCharm, etc.)

  4. An excel sheet (.xlsx) containing all the tags and descriptions.

  5. Tags should be alphanumeric. Descriptions should not contain quotes. ( “ , ' )

  6. Lightning Java Archive file (.jar).

  7. Enterprise version’s server details and user credentials

  8. Lightning Driver’s class file

Steps:

  • Install the library using either of the methods.

    • Use Command Prompt / Terminal and use the command :

      pip install ndpmetadata
    • Using Jupyter / Colab or any other notebook, you can install the library using:

      !pip install ndpmetadata
  • Use the below command to import the function:

from metadata_dc.metadata import execute_metadata
  • Call the function along with the necessary arguments. The arguments are described in detail in the table below along with examples.

execute_metadata(
host,
port,
user,
password,
path_of_excel_sheet,
driver,
path_of_jar_file
)

Argument 

Description

Example 

host

Host / IP address of the enterprise version of lightning

“13.75.254.117”

port

Lightning Port

“10000”

user

Email Address of the admin User

“abc.xyz@zetaris.com”

password

Password

“password”

path_of_excel_sheet

Entire path of the excel file

“/Users/harshal/Desktop/metadata.xlsx”

driver

Lightning JDBC Driver

“com.zetaris.lightning.jdbc.LightningDriver”

path_of_jar_file

Entire path of the Java Archive file

“/Users/harshal/Desktop/lightning-jdbc-driver-2.1.0.1-driver.jar”

Please Note: Port and Driver will remain the same every time and for every user.

Demonstration:

  • Use the below table, and save as an Excel sheet containing tags and descriptions in the below-mentioned format

Database Name Table Name Column Name Description Tag1 Tag2
Test_DB table_test1 alarm_origin Test Description1 Test Tag 11 Test Tag 21
Test_DB table_test2 alarm_type Test Description2 Test Tag 12 Test Tag 22

Please Note: The columns Database Name, Table Name, Column Name, and Description are mandatory and should be written in the same way as shown in the image above.

Database name → Datasource Name

Table Name → Entity

Column Name → Attribute

Description → Descriptions for the respective attributes

 

Tag1, Tag2 are the tags for the columns and there could be as many tags as possible and with any column names.

For example, after the Description column, we have 4 tags per attribute. In this case, we can have different columns like Tag1, Tag2, Tag3, Tag4.

  • Open any IDE and execute the code as shown (if, ndpmetadata library is already installed):

from metadata_dc.metadata import execute_metadata

execute_metadata("18.219.104.62",
                 "10000",
                 "rishabh.bhagat@zetaris.com",
                 "password",
                 "/Users/harshal/Desktop/trial.xlsx",
                 "com.zetaris.lightning.jdbc.LightningDriver",
                "/Users/harshal/Desktop/lightning-jdbc-driver-2.1.0.1-driver.jar")

Output

Python Notebook Output

python_output

Updated Data Catalog in Zetaris

updated_catalog