This guide provides step-by-step instructions to deploy and run the Zetaris Lightning Catalog (Open Source) on a GPU-enabled Virtual Machine (VM) in Google Cloud Platform (GCP).
The guide covers the complete process: provisioning a VM with suitable GPU hardware, installing the required software stack (Conda, RAPIDS, CUDA, the NVIDIA driver, Git, and Docker), and launching the Lightning Catalog application via Docker Compose.
By following this guide, you will:
- Set up a GCP VM with NVIDIA V100 GPU and optimal specs for Spark + RAPIDS workloads.
- Install all necessary dependencies for GPU-accelerated Spark execution.
- Pull and run the Zetaris Lightning Catalog in a containerized environment.
- Access the web-based UI for managing and querying enterprise data assets.
This document assumes you already have a GCP account with permission to create VM instances and access to pull public Docker images.
Component | Recommended Configuration |
---|---|
GPU Type | NVIDIA V100 (16 GB): best RAPIDS compatibility and proven Spark + cuDF support |
Machine Type | Minimum: n1-standard-8 (8 vCPUs, 30 GB RAM). Recommended: higher for larger workloads |
OS | Ubuntu 22.04 LTS |
Disk | Boot Disk: 100 GB SSD |
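The configuration in the table can be expressed as a single `gcloud` command. The following is a minimal sketch, not a definitive provisioning script: the instance name `lightning-gpu-vm` and the zone `us-central1-a` are placeholder assumptions (V100 availability varies by zone, so check your project's quota and zone list first). The function only prints the command so it can be reviewed before running.

```shell
# Placeholder assumptions: override INSTANCE_NAME and ZONE for your project.
INSTANCE_NAME="${INSTANCE_NAME:-lightning-gpu-vm}"
ZONE="${ZONE:-us-central1-a}"

# Print a gcloud command matching the recommended configuration.
# --maintenance-policy=TERMINATE is required because GPU VMs cannot live-migrate.
create_vm_cmd() {
  echo gcloud compute instances create "$INSTANCE_NAME" \
    --zone="$ZONE" \
    --machine-type=n1-standard-8 \
    --accelerator=type=nvidia-tesla-v100,count=1 \
    --maintenance-policy=TERMINATE \
    --image-family=ubuntu-2204-lts \
    --image-project=ubuntu-os-cloud \
    --boot-disk-size=100GB \
    --boot-disk-type=pd-ssd
}

create_vm_cmd
```

After reviewing the printed command, it can be executed with `create_vm_cmd | sh` or by pasting it into a terminal where `gcloud` is authenticated.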
Environment Setup
Once the VM is provisioned, install Conda, RAPIDS, CUDA, and the NVIDIA Driver.
A. Install Conda and RAPIDS
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
Add Conda to PATH
echo 'export PATH="$HOME/miniconda3/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
Create RAPIDS environment
conda create -n rapids-24.02 -c rapidsai -c nvidia -c conda-forge rapids=24.02 python=3.10 cudatoolkit=11.8
conda activate rapids-24.02
Verify Conda installation
conda --version
B. Install NVIDIA Driver
sudo apt update
sudo apt install -y nvidia-driver-535
sudo reboot
# After the VM restarts, reconnect over SSH and verify the driver
nvidia-smi
- `conda --version` → Conda installed and accessible
- `nvidia-smi` → GPU recognized and driver working
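The two checks above can be combined into one small script. This is a sketch under stated assumptions: `check_tool` is a hypothetical helper, and the cuDF import test assumes the `rapids-24.02` environment created earlier. The GPU checks run only when both `conda` and `nvidia-smi` are present.

```shell
# Report whether a command is available on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "OK: $1 found"
  else
    echo "MISSING: $1 not found"
    return 1
  fi
}

check_tool conda || true
check_tool nvidia-smi || true

# If both tools exist, show the GPU and driver, then try importing cuDF
# inside the RAPIDS environment (environment name taken from this guide).
if command -v nvidia-smi >/dev/null 2>&1 && command -v conda >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
  conda run -n rapids-24.02 python -c "import cudf; print('cuDF', cudf.__version__)" \
    || echo "cuDF import failed; check the RAPIDS environment"
fi
```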
Your VM should now have:
- Ubuntu 22.04 LTS
- NVIDIA Driver, CUDA, and RAPIDS installed
We will now:
- Install Git + Docker
- Clone the Lightning Catalog repo
- Run it via Docker Compose
# Update system packages
sudo apt update && sudo apt -y upgrade
# Install Git & dependencies
sudo apt install -y git ca-certificates curl gnupg lsb-release
# Add Docker’s GPG key & repo
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker Engine and Compose plugin
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# (Optional) Run Docker without sudo
sudo usermod -aG docker $USER
newgrp docker
# Verify installation
docker --version
docker compose version
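If the optional non-sudo step was used, it can help to confirm that the group change took effect in the current shell. The sketch below assumes nothing beyond standard tools; `in_group` is a hypothetical helper, and the daemon check relies on `docker info` succeeding only when the daemon is reachable by the current user.

```shell
# Return 0 if the current shell's user belongs to the named group.
in_group() {
  id -nG | tr ' ' '\n' | grep -qx "$1"
}

if in_group docker; then
  echo "Current user is in the docker group"
else
  echo "Current user is not in the docker group (log out and back in, or use sudo)"
fi

# docker info contacts the daemon, so it also verifies socket permissions.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  echo "Docker daemon is reachable"
else
  echo "Docker daemon is not reachable from this shell"
fi
```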
git clone https://github.com/zetaris/Thunderlake.git
cd Thunderlake
# Using Compose v2
docker compose pull
# Legacy
docker-compose pull
# Specific image example:
docker pull zetaris/lightning-catalog-nvidia:latest
# Using Compose v2
docker compose up -d
# Legacy
docker-compose up -d
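Because the commands differ between Compose v2 and the legacy binary, a small wrapper can pick whichever is installed. This is an illustrative sketch; the function name `compose` is an assumption and not part of the Lightning Catalog repository.

```shell
# Dispatch to Compose v2 ("docker compose") if available,
# otherwise fall back to the legacy "docker-compose" binary.
compose() {
  if docker compose version >/dev/null 2>&1; then
    docker compose "$@"
  elif command -v docker-compose >/dev/null 2>&1; then
    docker-compose "$@"
  else
    echo "Neither 'docker compose' nor 'docker-compose' is available" >&2
    return 127
  fi
}

# Example usage from the repository directory:
#   compose pull
#   compose up -d
```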
# See running containers
docker ps
# View logs
docker compose logs -f
# Legacy
docker-compose logs -f
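Containers can take a while to become ready after `up -d`. The following sketch polls the UI port until it answers; `wait_for_ui` is a hypothetical helper, port 8081 is taken from the access URL in this guide, and the timeout is approximate because time spent inside `curl` is not counted. It treats any HTTP response below 400 as success, which may need adjusting if the UI requires authentication.

```shell
# Poll a URL every 2 seconds until it responds or the timeout (seconds) elapses.
wait_for_ui() {
  url="$1"
  timeout="${2:-120}"
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      echo "UI is responding at $url"
      return 0
    fi
    sleep 2
    waited=$((waited + 2))
  done
  echo "UI did not respond at $url within ${timeout}s"
  return 1
}

# Example usage on the VM:
#   wait_for_ui http://localhost:8081 120
```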
Access the UI:
- Direct: `http://<VM_PUBLIC_IP>:8081`
- GCP SSH Tunnel:
gcloud compute ssh <INSTANCE_NAME> --zone <ZONE> -- -L 8081:localhost:8081 -L 8080:localhost:8080
Then browse: http://localhost:8081
For direct access, ensure the VM's GCP firewall rules allow inbound traffic on the exposed ports (for example, TCP 8081).
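For direct access, an ingress rule for the UI port might look like the sketch below. The rule name, network tag, and source range are all placeholder assumptions: `203.0.113.10/32` is a documentation-only address to be replaced with your own IP, and the rule applies only to VMs carrying the tag (added with `gcloud compute instances add-tags <INSTANCE_NAME> --tags=lightning-ui`). The function prints the command for review rather than executing it.

```shell
# Placeholder assumptions: rule name, tag, and source range are illustrative.
FIREWALL_RULE="${FIREWALL_RULE:-allow-lightning-ui}"
TARGET_TAG="${TARGET_TAG:-lightning-ui}"
SOURCE_RANGE="${SOURCE_RANGE:-203.0.113.10/32}"

# Print a gcloud command that opens TCP 8081 to one source range only.
firewall_cmd() {
  echo gcloud compute firewall-rules create "$FIREWALL_RULE" \
    --direction=INGRESS \
    --allow=tcp:8081 \
    --source-ranges="$SOURCE_RANGE" \
    --target-tags="$TARGET_TAG"
}

firewall_cmd
```

Restricting `--source-ranges` to a known address is a deliberate choice here; opening the port to `0.0.0.0/0` would expose the UI to the whole internet.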