Metabase on GKE Autopilot — Lab Guide
Overview
Estimated time: 3–4 hours
Metabase is an open-source business intelligence platform that enables non-technical users to query and visualize data without SQL. This lab deploys Metabase on Google Kubernetes Engine (GKE) Autopilot backed by Cloud SQL PostgreSQL 15, Workload Identity, and session-affinity-enabled LoadBalancer. A db-init Kubernetes Job initializes the PostgreSQL schema before Metabase's first boot.
What the Module Automates
- GKE Autopilot namespace and Kubernetes Deployment
- Cloud SQL PostgreSQL 15 instance, database, and user
- Cloud SQL Auth Proxy sidecar injection
db-initKubernetes Job (creates Metabase database schema before first boot)- Secret Manager secrets (database credentials)
- Artifact Registry repository and Cloud Build image pipeline
- Workload Identity and IAM bindings
- Kubernetes Service (LoadBalancer) with
ClientIPsession affinity - HPA and PodDisruptionBudget
- Cloud Monitoring uptime checks and alert policies
- Automated database backup Kubernetes CronJob
What You Do Manually
- Note the deployment outputs (external IP, namespace) from the RAD UI deployment panel
- Configure
kubectlwith cluster credentials - Obtain the external load balancer IP and confirm Metabase is reachable
- Wait for Metabase JVM startup (60–120 seconds)
- Complete the Metabase setup wizard
- Connect data sources (BigQuery, Cloud SQL, PostgreSQL, etc.)
- Create questions, dashboards, and collections
- Configure users and permissions
- Review logs in Cloud Logging and metrics in Cloud Monitoring
CLI and REST API Overview
| Tool | Purpose |
|---|---|
gcloud | Retrieve secrets, query GCP resources |
kubectl | Inspect pods, deployments, and services |
Install: Google Cloud SDK, kubectl
Prerequisites
- A GCP project with billing enabled.
- The
Services GCPmodule deployed in the same project (provides VPC, GKE cluster, Cloud SQL instance). - Access to the RAD UI with permission to deploy modules in the target GCP project.
- The following APIs enabled (Services GCP handles this):
container.googleapis.comsqladmin.googleapis.comsecretmanager.googleapis.comartifactregistry.googleapis.comcloudbuild.googleapis.com
gcloudauthenticated:gcloud auth application-default loginkubectlinstalled and available in PATH.
Phase 1 — Deploy Infrastructure [AUTOMATED]
Step 1.1 — Configure Variables
| Variable | Required | Default | Description |
|---|---|---|---|
project_id | Yes | — | GCP project ID |
tenant_deployment_id | No | "demo" | Short identifier for this deployment (e.g. "prod") |
deployment_id | No | "" | Auto-generated suffix appended to resource names |
region | No | "us-central1" | GCP region for resources |
application_name | No | "metabase" | Base name used in Kubernetes and GCP resource naming |
application_version | No | "v0.51.3" | Metabase container image version |
deploy_application | No | true | Set false to provision infrastructure without deploying the workload |
container_resources | No | { cpu_limit="2000m", memory_limit="4Gi" } | CPU/memory limits |
min_instance_count | No | 1 | Minimum pod replicas. Keep at 1 for production (eliminates 60–120s JVM cold starts) |
max_instance_count | No | 5 | Maximum pod replicas |
application_database_name | No | "metabase" | PostgreSQL database name |
application_database_user | No | "metabase" | PostgreSQL database username |
session_affinity | No | "ClientIP" | Sticky sessions — required to prevent session loss on HPA scale-out |
termination_grace_period_seconds | No | 60 | Seconds Kubernetes waits after SIGTERM before killing pods (allows in-flight queries to complete) |
service_type | No | "LoadBalancer" | Kubernetes Service type |
backup_schedule | No | "0 2 * * *" | Cron schedule for automated backups |
support_users | No | [] | Email addresses for monitoring alerts |
Step 1.2 — Initiate Deployment
Deployment is initiated from the RAD UI. Fill in the variables form and click Deploy.
Approximate deployment durations:
| Phase | Duration |
|---|---|
| Cloud SQL PostgreSQL instance creation | 8–12 min |
db-init Kubernetes Job execution | 2–3 min |
| GKE namespace and IAM provisioning | 3–5 min |
| Artifact Registry image build (Cloud Build) | 5–10 min |
| Kubernetes Deployment rollout | 3–5 min |
| Total | 21–35 min |
Step 1.3 — Record Outputs
After deployment completes, the following outputs are available in the RAD UI deployment panel.
| Output | Description |
|---|---|
external_ip | External load balancer IP for Metabase |
namespace | Kubernetes namespace |
database_instance_name | Cloud SQL instance name |
database_password_secret | Secret Manager secret name for the DB password |
deployment_id | Unique deployment identifier |
Set shell variables:
export PROJECT="your-gcp-project-id"
export REGION="us-central1"
export CLUSTER=$(gcloud container clusters list \
--project=${PROJECT} \
--region=${REGION} \
--format="value(name)" \
--limit=1)
export NAMESPACE=$(kubectl get namespaces \
--selector="app.kubernetes.io/managed-by=terraform" \
-o jsonpath='{.items[0].metadata.name}')
Phase 2 — Configure kubectl and Access Metabase [MANUAL]
Step 2.1 — Get Cluster Credentials
gcloud container clusters get-credentials ${CLUSTER} \
--region=${REGION} \
--project=${PROJECT}
Expected result: kubeconfig is updated with credentials for the GKE cluster.
Step 2.2 — Verify the db-init Job Completed
kubectl get jobs -n ${NAMESPACE}
Expected result: A job named metabase-db-init (or similar) with COMPLETIONS: 1/1 and STATUS: Complete. If the job shows Failed, check the logs:
kubectl logs -n ${NAMESPACE} -l job-name=metabase-db-init
Step 2.3 — Verify Pods are Running
kubectl get pods -n ${NAMESPACE}
Expected result: One or more Metabase pods in Running state with READY 2/2 (main container + Cloud SQL Auth Proxy sidecar).
Step 2.4 — Get the External IP
kubectl get service -n ${NAMESPACE} --selector="app=metabase" -o wide
Expected result: A LoadBalancer service with an external IP (may take 2–3 minutes to provision).
export METABASE_IP=$(kubectl get service -n ${NAMESPACE} \
--selector="app=metabase" \
-o jsonpath='{.items[0].status.loadBalancer.ingress[0].ip}')
export SERVICE_URL="http://${METABASE_IP}:3000"
Step 2.5 — Wait for Metabase JVM Startup
Metabase takes 60–120 seconds to fully initialize. Poll the health endpoint:
until curl -s -o /dev/null -w "%{http_code}" ${SERVICE_URL}/api/health | grep -q "200"; do
echo "Waiting for Metabase..."; sleep 10
done
echo "Metabase is ready"
gcloud equivalent:
gcloud logging read \
'resource.type="k8s_container" AND resource.labels.namespace_name="'${NAMESPACE}'" AND textPayload:"Metabase Initialization COMPLETE"' \
--project=${PROJECT} \
--limit=5
Step 2.6 — Check the Health Endpoint
curl -s ${SERVICE_URL}/api/health | jq .
Expected result:
{"status": "ok"}
Phase 3 — Complete the Setup Wizard [MANUAL]
Step 3.1 — Access the Setup Wizard
Open a browser and navigate to ${SERVICE_URL}.
Expected result: The Metabase setup wizard appears. On subsequent visits after setup is complete, the login page is shown instead.
Step 3.2 — Complete Initial Setup
- Select language: Choose your preferred language and click Let's get started.
- Admin account: Enter your name, email, and a strong password.
- Add a data source (optional — skip to connect later):
- Click Add your data or I'll add my data later.
- Usage data: Choose whether to allow Metabase to collect anonymized usage data.
- Click Take me to Metabase.
Expected result: You are redirected to the Metabase home screen.
Step 3.3 — Retrieve Database Password (if needed)
export DB_SECRET=$(gcloud secrets list \
--project=${PROJECT} \
--filter="name~metabase.*password" \
--format="value(name)" \
--limit=1)
gcloud secrets versions access latest \
--secret="${DB_SECRET}" \
--project=${PROJECT}
REST API equivalent:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://secretmanager.googleapis.com/v1/projects/${PROJECT}/secrets?filter=labels.app%3Dmetabase"
Phase 4 — Connect Data Sources [MANUAL]
Step 4.1 — Connect to BigQuery
- In Metabase, navigate to Admin (gear icon) > Databases > Add a database.
- Select BigQuery as the database type.
- Configure:
- Project ID: Your GCP project ID
- Dataset filters: Optionally restrict to specific datasets
- Service Account JSON: Paste a service account key JSON, or use the Cloud Run SA via Workload Identity (recommended)
- Click Save.
gcloud — grant the Metabase GKE service account BigQuery access:
export KSA=$(kubectl get serviceaccount -n ${NAMESPACE} \
-o jsonpath='{.items[0].metadata.name}')
export GSA=$(kubectl get serviceaccount -n ${NAMESPACE} ${KSA} \
-o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}')
gcloud projects add-iam-policy-binding ${PROJECT} \
--member="serviceAccount:${GSA}" \
--role="roles/bigquery.dataViewer"
Expected result: After saving, Metabase syncs BigQuery metadata. Tables appear in the Data Browser within 2–5 minutes.
Step 4.2 — Connect to Cloud SQL PostgreSQL
- Navigate to Admin > Databases > Add a database.
- Select PostgreSQL.
- Configure:
- Host: Cloud SQL internal IP or the Auth Proxy socket path (
/cloudsql/<instance>) - Port:
5432 - Database: Your database name
- Username: Your PostgreSQL user
- Password: Retrieved from Secret Manager
- Host: Cloud SQL internal IP or the Auth Proxy socket path (
- Click Save.
gcloud — retrieve a Cloud SQL connection string:
gcloud sql instances describe \
$(gcloud sql instances list --project=${PROJECT} --format="value(name)" --limit=1) \
--project=${PROJECT} \
--format="value(connectionName)"
Step 4.3 — Verify Data Source Connection
# List connected databases via Metabase API
curl -s -u admin@example.com:yourpassword \
${SERVICE_URL}/api/database \
| jq '.[].name'
Expected result: Your connected databases are listed. Metabase begins scanning table metadata in the background.
Phase 5 — Create Questions and Dashboards [MANUAL]
Step 5.1 — Create a New Question
- Click + New > Question.
- Select a data source and table.
- Use the GUI query builder to filter, group, and summarize data without SQL.
- Click Visualize to see the results.
- Click Save to save the question to a collection.
Expected result: The question appears in your collection with a visualization type automatically selected.
Step 5.2 — Write a Native SQL Query
- Click + New > Question > Native query.
- Select your data source.
- Write a SQL query:
SELECT date_trunc('day', created_at) as day, count(*) as events
FROM my_table
WHERE created_at > now() - interval '30 days'
GROUP BY 1 ORDER BY 1 - Click Run query and then Visualize.
Step 5.3 — Build a Dashboard
- Click + New > Dashboard.
- Click Add a question to add saved questions.
- Resize and arrange cards on the dashboard canvas.
- Add text, images, or filter widgets.
- Click Save.
Expected result: A shareable dashboard with multiple visualizations from your data sources.
Phase 6 — Observe Kubernetes Behaviour [MANUAL]
Step 6.1 — Check Pod Details
kubectl describe pod -n ${NAMESPACE} -l app=metabase | head -80
Expected result: Pod spec includes the Cloud SQL Auth Proxy as a sidecar container, environment variables from Secret Manager, and MB_JETTY_PORT=3000 and JAVA_TIMEZONE=UTC.
Step 6.2 — View Pod Logs
kubectl logs -n ${NAMESPACE} -l app=metabase -c metabase --tail=50
gcloud equivalent:
gcloud logging read \
'resource.type="k8s_container" AND resource.labels.namespace_name="'${NAMESPACE}'" AND resource.labels.container_name="metabase"' \
--project=${PROJECT} \
--limit=50 \
--format="table(timestamp, textPayload)"
REST API equivalent:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://logging.googleapis.com/v2/entries:list" \
-d '{
"projectIds": ["'"${PROJECT}"'"],
"filter": "resource.type=\"k8s_container\" AND resource.labels.namespace_name=\"'"${NAMESPACE}"'\" AND resource.labels.container_name=\"metabase\"",
"pageSize": 20
}'
Expected result: Metabase JVM startup logs, including Metabase Initialization COMPLETE after the 60–120 second startup sequence.
Step 6.3 — Observe HPA and Session Affinity
kubectl get hpa -n ${NAMESPACE}
Verify session affinity is configured:
kubectl get service -n ${NAMESPACE} --selector="app=metabase" \
-o jsonpath='{.items[0].spec.sessionAffinity}'
Expected result: ClientIP — requests from the same client IP are routed to the same pod, preventing session loss during scale-out events.
Generate load to trigger scaling:
for i in $(seq 1 50); do curl -s -o /dev/null ${SERVICE_URL}/api/health; done
kubectl get pods -n ${NAMESPACE} -w
Expected result: HPA schedules additional pods. Existing sessions remain on their original pods due to ClientIP affinity.
Step 6.4 — Check the PodDisruptionBudget
kubectl get pdb -n ${NAMESPACE}
Expected result: A PDB with MIN AVAILABLE: 1, ensuring at least one Metabase pod is always available during cluster maintenance.
Phase 7 — Explore Cloud Logging [MANUAL]
Step 7.1 — Filter for JVM Warnings
gcloud logging read \
'resource.type="k8s_container" AND resource.labels.namespace_name="'${NAMESPACE}'" AND severity>=WARNING' \
--project=${PROJECT} \
--limit=20
Expected result: Under normal operation, only informational logs appear.
Step 7.2 — Review the Uptime Check
Navigate to Monitoring > Uptime checks in the Cloud Console.
Expected result: The uptime check targeting /api/health shows Passing.
Phase 8 — Configure Metabase [MANUAL]
Step 8.1 — Set Up Email Notifications
Configure via environment_variables in the RAD UI:
environment_variables = {
MB_EMAIL_SMTP_HOST = "smtp.sendgrid.net"
MB_EMAIL_SMTP_PORT = "587"
MB_EMAIL_FROM_ADDRESS = "metabase@example.com"
}
secret_environment_variables = {
MB_EMAIL_SMTP_PASSWORD = "metabase-smtp-password"
}
Or navigate to Admin > Email in the Metabase UI.
Step 8.2 — Configure User Authentication
Metabase supports SSO via Google, SAML, or LDAP (Enterprise Edition). For Google OAuth:
- Navigate to Admin > Authentication > Google.
- Enter your Google Client ID.
- Configure allowed domains.
Step 8.3 — Enable Public Sharing
To allow dashboard public sharing:
- Navigate to Admin > Public sharing.
- Enable Enable public sharing.
Phase 9 — Undeploy [AUTOMATED]
When you are finished, return to the RAD UI, navigate to your deployment, and click Undeploy (or Delete) to remove all resources.
Approximate undeploy duration: 15–25 minutes (Cloud SQL and PVC deletion take the longest).
Warning: This permanently deletes all resources including the Metabase application database (all questions, dashboards, users, collections). Export your data model before undeploying: Admin > Serialization (Metabase Pro/Enterprise) or export individual questions as CSV/JSON.
Resources provisioned by the Services GCP module (VPC, GKE cluster, Cloud SQL instance) are managed separately and must be undeployed via their own RAD UI deployment entry.
Summary
| Action | Phase | Automated |
|---|---|---|
| GKE namespace and Kubernetes Deployment | 1 | Yes |
| Cloud SQL PostgreSQL 15 database | 1 | Yes |
db-init Kubernetes Job (schema init) | 1 | Yes |
| Cloud SQL Auth Proxy sidecar | 1 | Yes |
| Secret Manager credentials | 1 | Yes |
| Workload Identity and IAM bindings | 1 | Yes |
| Container image build (Cloud Build) | 1 | Yes |
| HPA and PodDisruptionBudget | 1 | Yes |
| Configure kubectl credentials | 2 | No |
| Confirm external IP and Metabase reachability | 2 | No |
| Complete setup wizard (admin account, language) | 3 | No |
| Connect BigQuery data source | 4 | No |
| Connect Cloud SQL PostgreSQL data source | 4 | No |
| Create questions and dashboards | 5 | No |
| Observe pods, logs, HPA, session affinity | 6 | No |
| Review Cloud Logging | 7 | No |
| Configure SMTP and authentication | 8 | No |
| Undeploy infrastructure | 9 | Yes |