This page lists the steps required to implement the version control mechanism for the Zetaris Lightning Metastore. This mechanism lets you restore data objects to their previous working state after a crash.
Prerequisites:
Setting up Git Repository and Gathering Credentials
1. Create an empty GitHub repository from GitHub Home Page with the default branch named "main" under your organisation's account.
2. Generate a new token: sign in to your GitHub account, visit https://github.com/settings/tokens, and select "Generate new token" → "Generate new token (classic)". Give the token an appropriate name and configure the necessary permissions.
3. Store the token along with the following details: the branch name ('main'), the owner (in this case 'Zetaris'), the repository name (in this case 'Test-Version-Control'), and the GitHub API/file URL in the form https://api.github.com/repos/<owner>/<repo-name>/contents/.
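As a minimal sketch of how these details fit together, the Python snippet below assembles the contents URL and the headers for an authenticated call. The helper names `build_contents_url` and `build_headers` are hypothetical, the token is a placeholder, and the request is only constructed, not sent.

```python
import urllib.request


def build_contents_url(owner: str, repo: str, path: str = "") -> str:
    """Return the GitHub contents API URL for a path inside a repository."""
    base = f"https://api.github.com/repos/{owner}/{repo}/contents/"
    return base + path.lstrip("/")


def build_headers(token: str) -> dict:
    """Return headers for an authenticated GitHub REST API request."""
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }


# Owner, repository and branch are the example values from this page.
url = build_contents_url("Zetaris", "Test-Version-Control")
request = urllib.request.Request(
    f"{url}?ref=main",  # ref selects the branch to read from
    headers=build_headers("<github-token>"),  # placeholder, not a real token
)
print(request.full_url)
```

Sending this request with a valid token (for example with `urllib.request.urlopen`) is a quick way to confirm that the token, owner and repository name are correct before they are entered in Airflow.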
Setting up Airflow for Full Backup & Full/Partial Restore
1. We need to create variables on the Airflow UI to provide parameters for the backup and restore process. Follow these steps:
- On the Airflow UI, navigate to the Admin section and select "Variables" from the dropdown list.
- Upload the Airflow Variables content as a JSON file by selecting "Choose File," and then import it by clicking "Import Variables" in Airflow.
Please note:
By default, the installation of Airflow includes predefined variables related to Airflow configuration and the Zetaris Environment.
The table below contains example values or parameter names, which should be replaced with the values specific to your environment.
For a backup, leave the variables starting with 'Restore_' at their default value of a single space (' ').
For a restore, edit the variable values below on the Airflow UI, in lower case, before triggering the DAG.
| Variable Name | Value |
|---|---|
| Email_Address | <email_address_for_notifications> |
| VC_GitHubToken | <github-token> |
| VC_Github_Repo | <github-repo-name> |
| VC_Github_Owner | <github-organisation-name> |
| VC_Github_Folder | backup/ |
| VC_Github_Branch | main |
| VC_GitHub_File_URL | <github-file-url> |
| Restore_Type | full or partial or individual |
| Restore_Object_Type | pipelines or data_marts or permanent_views |
| Restore_Object_Name | <pipeline_container name or data_mart name or permanent_view name> |
| Restore_Object_Individual_Name | <individual_pipeline_name> |
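The variables file for a backup could be generated with a short script such as the one below. This is a sketch that assumes the import expects a flat JSON object mapping each variable name to its value; the placeholder values must be replaced before the file is used.

```python
import json

BLANK = " "  # single-space default for the Restore_ variables during backup

# Names come from the table above; values in <...> are placeholders.
backup_variables = {
    "Email_Address": "<email_address_for_notifications>",
    "VC_GitHubToken": "<github-token>",
    "VC_Github_Repo": "<github-repo-name>",
    "VC_Github_Owner": "<github-organisation-name>",
    "VC_Github_Folder": "backup/",
    "VC_Github_Branch": "main",
    "VC_GitHub_File_URL": "<github-file-url>",
    "Restore_Type": BLANK,
    "Restore_Object_Type": BLANK,
    "Restore_Object_Name": BLANK,
    "Restore_Object_Individual_Name": BLANK,
}

# Serialise to JSON; save this text as a .json file for "Import Variables".
variables_json = json.dumps(backup_variables, indent=2)
print(variables_json)
```

Because the file contains the GitHub token, it is best kept out of any repository and deleted once the import is complete.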
- Refer to the table below for the different types of restore and the values each one requires for the variables defined on the Airflow UI in the step above.
| Type of Restore (Level of Restore) | Description | Restore_Type | Restore_Object_Type | Restore_Object_Name | Restore_Object_Individual_Name |
|---|---|---|---|---|---|
| Full Restore | Restore all the pipelines, data marts and views | full | <single_space> | <single_space> | <single_space> |
| Object Type | Restore at object type level for either pipelines, data marts or views (only one option may be entered at a time) | partial | pipelines | <single_space> | <single_space> |
| Container | Restore a pipeline container, a data mart container or an individual permanent view (only one option may be entered at a time) | individual | pipelines | <pipeline_container_name> OR <data_mart_container_name> OR <permanent_view_name> | <single_space> |
| Individual Pipeline | Restore any individual pipeline | individual | pipelines | <pipeline_container_name> | <pipeline_name> |
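The combinations in the table above can be checked before the restore DAG is triggered. The following sketch is not part of the Zetaris DAGs; `build_restore_variables` is a hypothetical helper that lower-cases the inputs, fills unused variables with a single space, and rejects combinations the table does not list.

```python
BLANK = " "  # single-space placeholder for unused variables
OBJECT_TYPES = {"pipelines", "data_marts", "permanent_views"}


def build_restore_variables(restore_type, object_type=BLANK,
                            object_name=BLANK, individual_name=BLANK):
    """Validate a restore request and return the four Restore_ variables."""
    restore_type = restore_type.strip().lower()
    # Lower-case every value; empty or whitespace-only input becomes BLANK.
    object_type, object_name, individual_name = (
        value.strip().lower() or BLANK
        for value in (object_type, object_name, individual_name)
    )

    if restore_type == "full":
        if (object_type, object_name, individual_name) != (BLANK, BLANK, BLANK):
            raise ValueError("a full restore takes no other values")
    elif restore_type == "partial":
        if object_type not in OBJECT_TYPES:
            raise ValueError("a partial restore needs a valid object type")
        if object_name != BLANK or individual_name != BLANK:
            raise ValueError("a partial restore takes no object names")
    elif restore_type == "individual":
        if object_type not in OBJECT_TYPES or object_name == BLANK:
            raise ValueError("an individual restore needs a type and a name")
        if individual_name != BLANK and object_type != "pipelines":
            raise ValueError("only pipelines have an individual name")
    else:
        raise ValueError(f"unknown restore type: {restore_type!r}")

    return {
        "Restore_Type": restore_type,
        "Restore_Object_Type": object_type,
        "Restore_Object_Name": object_name,
        "Restore_Object_Individual_Name": individual_name,
    }


# Example: restore one pipeline inside a container (names are made up).
print(build_restore_variables("individual", "pipelines",
                              "Sales_Container", "Daily_Load"))
```

The returned dictionary uses the same variable names as the Airflow UI, so its values can be copied into the corresponding variables as they are.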
2. To set up Teams notifications, follow the steps below:
- Configure a webhook in Microsoft Teams:
  - Open Teams and browse to the channel you would like messages to be sent to.
  - Click the … beside the channel name, then select Connectors.
  - Search for Incoming Webhook.
  - Click Configure, provide a webhook name and select Create. A webhook URL will be generated.
- On the Airflow UI, click Admin and select Connections from the dropdown list.
- Click Add, enter the respective parameters as shown in the image below, and name the conn_id 'msteams_webhook_url'.
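Before storing the webhook URL in Airflow, it can be tested on its own. The sketch below assumes the incoming webhook accepts a JSON body with a `text` field; `build_teams_request` is a hypothetical helper and the URL shown is a placeholder, so the send call is left commented out.

```python
import json
import urllib.request


def build_teams_request(webhook_url: str, message: str) -> urllib.request.Request:
    """Build a POST request carrying a plain-text message for a webhook."""
    payload = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; replace it with the webhook URL generated by Teams.
request = build_teams_request(
    "https://example.invalid/teams-webhook",
    "Test message from the Zetaris version control setup",
)
# urllib.request.urlopen(request)  # uncomment to send the test message
print(request.get_method(), request.full_url)
```

If the test message appears in the chosen channel, the same URL can be used for the 'msteams_webhook_url' connection.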
3. To set up Slack notifications, follow the steps below:
- Create a Slack app if you don't already have one.
- Enable Incoming Webhooks on the next page.
- Create an incoming webhook by clicking Add New Webhook to Workspace on the same page.
- Pick a channel for the app to post to, then click to authorize your app. You will be sent back to your app settings, where a new entry appears under the Webhook URLs for Your Workspace section, with a webhook URL that looks something like this:
https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
- Create an Airflow connection for Slack of type HTTP named 'slack_webhook_url'. The part of the webhook URL after https://hooks.slack.com/services goes in the Password field:
Slack Conn Id: slack_webhook_url
Host: https://hooks.slack.com/services
Conn Type: HTTP
Password: /T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
Schema: https
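The split between Host and Password can be double-checked by joining the two fields back together. The snippet below is an illustration only: `join_webhook_url` is a hypothetical helper, not the code used by the DAGs, and the values are the dummy ones shown above.

```python
def join_webhook_url(host: str, password: str) -> str:
    """Recombine the Host and Password fields into one webhook URL."""
    return host.rstrip("/") + "/" + password.lstrip("/")


host = "https://hooks.slack.com/services"
password = "/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"

full_url = join_webhook_url(host, password)
print(full_url)
```

The printed URL should match the webhook URL shown in the Slack app settings exactly; a missing or doubled slash is the most likely mistake when copying the two parts.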
4. After setting up Airflow and defining the necessary variables, you can run the 'Backup_Zetaris_Data_Objects' and 'Restore_Zetaris_Data_Objects' DAGs, which perform the backup and the restore respectively.
Example of different objects on Zetaris Lightning below:
Example of backups of the different objects on GitHub below:
Walkthrough of Version Control Process (Video)