Superset on Google Cloud Run

This document provides a comprehensive reference for the modules/Superset_CloudRun Terraform module. It covers architecture, IAM, configuration variables, Superset-specific behaviours, and operational patterns for deploying Apache Superset on Google Cloud Run (v2).


1. Module Overview

Apache Superset is an open-source data visualisation and exploration platform. Superset CloudRun is a wrapper module built on top of App CloudRun. It uses App CloudRun for all GCP infrastructure provisioning and injects Superset-specific application configuration, a two-phase database initialisation pipeline, and a SECRET_KEY secret via Superset Common.

Key Capabilities:

  • Compute: Cloud Run v2 (Gen2), 2 vCPU / 2 Gi by default, up to 5 instances.
  • Data Persistence: Cloud SQL PostgreSQL 15 (default). GCS superset-data bucket auto-provisioned by Superset Common.
  • Security: SUPERSET_SECRET_KEY (50-char random) auto-generated by Superset Common and stored in Secret Manager.
  • Caching: Redis optional but recommended for production (enable_redis = false by default).
  • Initialisation: Two-phase init — db-init (database schema) followed by app-init (Superset bootstrap: database upgrade + admin user creation). Both run automatically on first deploy.
  • CI/CD: Cloud Build custom image pipeline; psycopg2-binary is pre-installed in the bundled Dockerfile for PostgreSQL connectivity.

Project & Application Identity

Variable | Group | Type | Default | Description
project_id | 1 | string | (none) | GCP project ID. Required.
tenant_deployment_id | 1 | string | 'demo' | Short suffix appended to all resource names.
support_users | 1 | list(string) | [] | Email recipients for monitoring alerts.
resource_labels | 1 | map(string) | {} | Labels applied to all provisioned resources.
application_name | 2 | string | 'superset' | Base resource name. Do not change after initial deployment.
display_name | 2 | string | 'Superset' | Human-readable name shown in the GCP Console.
description | 2 | string | 'Apache Superset data visualisation platform' | Cloud Run service description.
application_version | 2 | string | 'latest' | Superset image version tag.
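
As a minimal sketch of how the identity variables above might be set, the following module call uses only variables listed in the table. The relative source path, the project ID, the email address, the label, and the pinned version tag are illustrative assumptions, not values taken from the module.

```hcl
# Minimal sketch: source path and all values are illustrative assumptions.
module "superset" {
  source = "./modules/Superset_CloudRun" # assumed relative path to the module

  # Group 1: project identity
  project_id           = "my-gcp-project"       # placeholder project ID
  tenant_deployment_id = "demo"                 # suffix appended to resource names
  support_users        = ["ops@example.com"]    # placeholder alert recipient
  resource_labels      = { team = "analytics" } # placeholder label

  # Group 2: application identity
  application_name    = "superset" # do not change after the first deployment
  display_name        = "Superset"
  application_version = "4.0.2"    # illustrative pinned tag instead of 'latest'
}
```

Every other variable falls back to the defaults documented in the sections that follow.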

Two-phase init: Superset Common provisions two sequential Cloud Run Jobs by default:

  1. db-init: creates the PostgreSQL database and user using the postgres:15-alpine image.
  2. app-init: runs the Superset container image and executes superset db upgrade (schema migrations) followed by superset fab create-admin (admin user creation). app-init starts only after db-init has completed successfully.

2. IAM & Access Control

Superset_CloudRun delegates all IAM provisioning to App_CloudRun. The Cloud Run SA, Cloud Build SA, IAP service agent, and password rotation role sets are identical to those in App_CloudRun §2.

SUPERSET_SECRET_KEY: Superset Common auto-generates a 50-character random password (no special characters) and stores it in Secret Manager as {prefix}-secret-key. This key is used by Superset for Flask session signing and is injected as SUPERSET_SECRET_KEY.

App-init identity: The app-init Cloud Run Job runs the Superset application container image. It reads the database credentials and SUPERSET_SECRET_KEY from Secret Manager so that it can perform the schema migration and create the admin user.

For the complete role tables, see App_CloudRun §2.


3. Core Service Configuration

A. Compute (Cloud Run)

Superset is a Python/React application with substantial resource requirements due to query execution and visualisation rendering.

Variable | Group | Default | Description
deploy_application | 3 | true | Set false for infrastructure-only deployment.
container_image_source | 3 | 'custom' | 'custom' builds via Cloud Build. 'prebuilt' deploys an existing image URI.
container_image | 3 | "" | Override image URI. Leave empty for Cloud Build.
cpu_limit | 3 | '2000m' | CPU per instance. 2 vCPU recommended for query execution.
memory_limit | 3 | '2Gi' | Memory per instance.
container_port | 3 | 8088 | Superset's Gunicorn HTTP port.
execution_environment | 3 | 'gen2' | Gen2 execution environment.
timeout_seconds | 3 | 600 | Max request duration. Extended to 600s for long-running queries and dashboard rendering.
enable_cloudsql_volume | 3 | true | Connects via Unix socket.
min_instance_count | 3 | 1 | Minimum Cloud Run instances.
max_instance_count | 3 | 5 | Maximum Cloud Run instances.
traffic_split | 3 | [] | Canary/blue-green traffic allocation.

Differences from App CloudRun defaults:

Variable | App CloudRun | Superset CloudRun | Reason
container_port | 8080 | 8088 | Superset's Gunicorn port.
cpu_limit | '1000m' | '2000m' | Query execution and rendering require 2 vCPU.
memory_limit | '512Mi' | '2Gi' | Python-based query engine and caching require more memory.
timeout_seconds | 300 | 600 | Long-running queries and dashboard loads can exceed 5 minutes.
min_instance_count | 0 | 1 | Dashboard users benefit from a warm instance.
max_instance_count | 1 | 5 | Multiple concurrent dashboard users require more instances.
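
The compute settings above could be overridden roughly as follows. This is a sketch under stated assumptions: the source path and project ID are placeholders, and the 4Gi memory value follows the production recommendation given in the Configuration Pitfalls table rather than the module default.

```hcl
# Sketch of compute overrides; source path and project ID are assumptions.
module "superset" {
  source     = "./modules/Superset_CloudRun" # assumed relative path
  project_id = "my-gcp-project"              # placeholder project ID

  cpu_limit          = "2000m" # module default; 2 vCPU for query execution
  memory_limit       = "4Gi"   # raised from the '2Gi' default for many concurrent users
  container_port     = 8088    # Superset's Gunicorn port
  timeout_seconds    = 600     # allows long-running queries and dashboard renders
  min_instance_count = 1       # keeps one warm instance
  max_instance_count = 5       # upper bound for autoscaling
}
```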

B. Database (Cloud SQL — PostgreSQL 15)

Superset uses PostgreSQL 15 as its metadata database (storing dashboards, charts, datasets, and user settings).

Variable | Group | Default | Description
db_name | 11 | 'superset_db' | PostgreSQL database name. Do not change after deployment.
db_user | 11 | 'superset_user' | PostgreSQL application user.
database_password_length | 11 | 32 | Auto-generated password length.
enable_auto_password_rotation | 11 | false | Automated zero-downtime password rotation.

Note: application_database_name = 'superset_db' and application_database_user = 'superset_user' are the defaults in Superset CloudRun. These differ from the App CloudRun defaults (gkeappdb / gkeappuser).

C. Storage (GCS)

Variable | Group | Default | Description
create_cloud_storage | 10 | true | Set false to skip GCS bucket creation.
storage_buckets | 10 | [{ name_suffix = "data" }] | Additional GCS buckets.
gcs_volumes | 10 | [] | GCS Fuse volume mounts.

D. Networking

Variable | Group | Default | Description
ingress_settings | 4 | 'all' | 'all', 'internal', or 'internal-and-cloud-load-balancing'.
vpc_egress_setting | 4 | 'PRIVATE_RANGES_ONLY' | VPC egress routing.

E. Initialization & Bootstrap

Superset Common provides a two-phase initialisation pipeline by default:

Phase 1 — db-init:

  • Image: postgres:15-alpine
  • Creates the superset_db database and superset_user via db-init.sh
  • execute_on_apply = true

Phase 2 — app-init:

  • Image: null (uses the Superset application image)
  • Executes app-init.sh: runs superset db upgrade and superset fab create-admin
  • depends_on_jobs = ["db-init"]
  • execute_on_apply = true
  • Timeout: 1800s (30 minutes for migrations on first run)

Variable | Group | Default | Description
initialization_jobs | 12 | [] | One-shot Cloud Run Jobs. Leave empty for Superset Common to supply the default two-job pipeline. A non-empty list replaces it entirely.
cron_jobs | 12 | [] | Recurring scheduled jobs.
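
To make the shape of the default pipeline concrete, the two phases described above can be sketched as a list of job objects. The attribute names name, image, and timeout_seconds are hypothetical (only depends_on_jobs and execute_on_apply appear in this document), so check the module's variable definition before passing anything like this to initialization_jobs.

```hcl
# Illustrative only: attribute names other than depends_on_jobs and
# execute_on_apply are assumptions, not the module's confirmed schema.
locals {
  default_initialization_jobs = [
    {
      name             = "db-init"            # hypothetical attribute name
      image            = "postgres:15-alpine" # phase 1 image
      execute_on_apply = true
    },
    {
      name             = "app-init"  # hypothetical attribute name
      image            = null        # null means "use the Superset application image"
      depends_on_jobs  = ["db-init"] # phase 2 waits for phase 1
      execute_on_apply = true
      timeout_seconds  = 1800        # hypothetical name for the 30-minute timeout
    },
  ]
}
```

Because a non-empty initialization_jobs list replaces the default pipeline entirely, a custom list would need to reproduce both phases if they are still required.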

4. Advanced Security

A. Secret Key

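Superset Common auto-generates a 50-character SUPERSET_SECRET_KEY and stores it in Secret Manager. This key is critical: changing it invalidates all existing user sessions and, as described under Configuration Pitfalls, makes the data source credentials that Superset has encrypted with it unreadable.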

Variable | Group | Default | Description
secret_environment_variables | 5 | {} | Additional Secret Manager references. SUPERSET_SECRET_KEY is auto-injected.

B. Cloud Armor WAF

Variable | Group | Default | Description
enable_cloud_armor | 9 | false | Provisions Global HTTPS LB + Cloud Armor WAF.
admin_ip_ranges | 9 | [] | CIDR ranges exempted from WAF rules.

C. Identity-Aware Proxy (IAP)

Variable | Group | Default | Description
enable_iap | 4 | false | Enables IAP on the Cloud Run service. Useful for restricting Superset to internal users.
iap_authorized_users | 4 | [] | Users/service accounts granted access.
iap_authorized_groups | 4 | [] | Google Groups granted access.
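
A possible way to restrict Superset to named users and groups with IAP is sketched below. The source path, project ID, and principals are placeholders, and the "user:" / "group:" prefixes are an assumption borrowed from the usual IAM member format; the module may expect bare email addresses instead.

```hcl
# Sketch of enabling IAP; principal format and all values are assumptions.
module "superset" {
  source     = "./modules/Superset_CloudRun" # assumed relative path
  project_id = "my-gcp-project"              # placeholder project ID

  enable_iap            = true
  iap_authorized_users  = ["user:analyst@example.com"]  # placeholder principal
  iap_authorized_groups = ["group:bi-team@example.com"] # placeholder Google Group
}
```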

5. Traffic & Ingress

Variable | Group | Default | Description
enable_cloud_armor | 9 | false | Provisions Global HTTPS LB + Cloud Armor WAF.
enable_cdn | 9 | false | Enables Cloud CDN. Superset serves dynamic content; CDN is appropriate for static assets only.
application_domains | 9 | [] | Custom domain names.

6. CI/CD & Delivery

Variable | Group | Default | Description
enable_cicd_trigger | 7 | false | Provisions a Cloud Build GitHub trigger.
github_repository_url | 7 | "" | GitHub repository URL.
github_token | 7 | "" | GitHub PAT. Sensitive.
enable_cloud_deploy | 7 | false | Provisions a Cloud Deploy pipeline.
enable_binary_authorization | 7 | false | Image attestation enforcement.

7. Reliability & Scheduling

A. Health Probes

Superset exposes /health as its health endpoint.

Variable | Group | Default | Description
startup_probe | 13 | { enabled=true, type="HTTP", path="/health", initial_delay_seconds=60, timeout_seconds=5, period_seconds=10, failure_threshold=12 } | Long startup probe: allows up to 180s for Superset initialisation.
liveness_probe | 13 | { enabled=true, type="HTTP", path="/health", initial_delay_seconds=30, timeout_seconds=5, period_seconds=30, failure_threshold=3 } | Liveness probe.
uptime_check_config | 13 | { enabled=true, path="/health" } | Cloud Monitoring uptime check.

Note on startup probe: with period_seconds=10, the defaults initial_delay_seconds=60 and failure_threshold=12 give Superset up to 60 + (12 × 10) = 180 seconds of total startup tolerance. This accommodates Gunicorn worker pool initialisation and the first database connection setup.
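
The 180-second figure can be reproduced from the probe settings themselves. The sketch below copies the documented startup_probe default into a local value and derives the window as the initial delay plus one probe period per allowed failure; the local and output names are illustrative.

```hcl
# Worked arithmetic for the startup window; local and output names are illustrative.
locals {
  startup_probe = {
    enabled               = true
    type                  = "HTTP"
    path                  = "/health"
    initial_delay_seconds = 60
    timeout_seconds       = 5
    period_seconds        = 10
    failure_threshold     = 12
  }

  # 60 + 10 * 12 = 180 seconds of total startup tolerance
  startup_window_seconds = (
    local.startup_probe.initial_delay_seconds +
    local.startup_probe.period_seconds * local.startup_probe.failure_threshold
  )
}

output "startup_window_seconds" {
  value = local.startup_window_seconds # evaluates to 180
}
```

Raising failure_threshold or period_seconds extends the window by the same formula, which is a convenient check before changing either value.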


8. Integrations

A. Redis Cache

Redis is disabled by default (enable_redis = false). For production deployments with multiple users, enabling Redis as the Celery broker and result backend is recommended.

Variable | Group | Default | Description
enable_redis | 21 | false | Enables Redis. Recommended for production multi-user deployments.
redis_host | 21 | "" | Redis server hostname or IP.
redis_port | 21 | 6379 | Redis TCP port (number, not string in CloudRun variant).

Note: redis_port in Superset CloudRun is a number type (default 6379), unlike the string type in Superset GKE (default "6379").
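
Enabling Redis might look like the following sketch. The source path, project ID, and host address are placeholders; the Redis instance itself is assumed to exist already and to be reachable from the Cloud Run service over the VPC.

```hcl
# Sketch of enabling Redis; the host address is a placeholder.
module "superset" {
  source     = "./modules/Superset_CloudRun" # assumed relative path
  project_id = "my-gcp-project"              # placeholder project ID

  enable_redis = true
  redis_host   = "10.0.0.3" # placeholder private IP of an existing Redis instance
  redis_port   = 6379       # a number in this module, not the string "6379"
}
```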

B. Email (SMTP)

Configure SMTP for alert and report notifications via environment_variables.


9. Platform-Managed Behaviours

Behaviour | Implementation | Detail
50-char secret key | Auto-generated by Superset Common | SUPERSET_SECRET_KEY is a 50-char random string (special=false). Changing it invalidates all sessions.
Two-phase init | db-init + app-init jobs | db-init creates the database; app-init runs migrations and creates the admin user. Both run automatically.
psycopg2-binary in Dockerfile | Superset Common Dockerfile | The bundled Dockerfile pre-installs psycopg2-binary for PostgreSQL connectivity, so the driver is installed when the image is built rather than at container start.
PostgreSQL 15 | database_type = "POSTGRES_15" fixed in Superset Common | PostgreSQL is the supported metadata database.
Port 8088 | container_port = 8088 | Superset's Gunicorn default port.
Extended timeout | timeout_seconds = 600 | Long-running SQL queries and dashboard renders can exceed 5 minutes.
superset-data bucket | Superset Common provisions automatically | GCS bucket for chart exports and report outputs.
Unix socket default | enable_cloudsql_volume = true | Connects to Cloud SQL via the Auth Proxy Unix socket.

10. Variable Reference

Variable | Group | Default | Description
module_description | 0 | (Superset platform text) | Platform metadata.
credit_cost | 0 | 50 | Deployment credit cost.
enable_purge | 0 | true | Permits full deletion on destroy.
project_id | 1 | (none) | GCP project ID. Required.
tenant_deployment_id | 1 | 'demo' | Resource name suffix.
support_users | 1 | [] | Monitoring alert email addresses.
resource_labels | 1 | {} | Labels for all resources.
application_name | 2 | 'superset' | Base resource name.
display_name | 2 | 'Superset' | Human-readable name.
description | 2 | 'Apache Superset data visualisation platform' | Service description.
application_version | 2 | 'latest' | Superset image tag.
deploy_application | 3 | true | Infrastructure-only when false.
container_image_source | 3 | 'custom' | 'custom' or 'prebuilt'.
container_image | 3 | "" | Image URI override.
cpu_limit | 3 | '2000m' | CPU per instance.
memory_limit | 3 | '2Gi' | Memory per instance.
container_port | 3 | 8088 | Superset's Gunicorn port.
execution_environment | 3 | 'gen2' | Cloud Run execution environment.
timeout_seconds | 3 | 600 | Max request duration (600s for long queries).
enable_cloudsql_volume | 3 | true | Cloud SQL Auth Proxy sidecar.
min_instance_count | 3 | 1 | Minimum Cloud Run instances.
max_instance_count | 3 | 5 | Maximum Cloud Run instances.
traffic_split | 3 | [] | Traffic allocation.
ingress_settings | 4 | 'all' | Traffic source restrictions.
vpc_egress_setting | 4 | 'PRIVATE_RANGES_ONLY' | VPC egress.
enable_iap | 4 | false | Identity-Aware Proxy.
iap_authorized_users | 4 | [] | IAP access list.
iap_authorized_groups | 4 | [] | IAP group access.
environment_variables | 5 | {} | Additional env vars.
secret_environment_variables | 5 | {} | Additional Secret Manager references. SUPERSET_SECRET_KEY is auto-injected.
backup_schedule | 6 | '0 2 * * *' | Backup cron schedule.
backup_retention_days | 6 | 7 | Backup retention days.
enable_backup_import | 6 | false | One-time restore on apply.
enable_cicd_trigger | 7 | false | Cloud Build GitHub trigger.
github_repository_url | 7 | "" | GitHub repository URL.
github_token | 7 | "" | GitHub PAT. Sensitive.
enable_cloud_deploy | 7 | false | Cloud Deploy pipeline.
enable_binary_authorization | 7 | false | Image attestation.
enable_cloud_armor | 9 | false | Cloud Armor WAF.
admin_ip_ranges | 9 | [] | WAF-exempt CIDR ranges.
application_domains | 9 | [] | Custom domains.
enable_cdn | 9 | false | Cloud CDN.
storage_buckets | 10 | [{ name_suffix = "data" }] | Additional GCS buckets.
gcs_volumes | 10 | [] | GCS Fuse mounts.
db_name | 11 | 'superset_db' | PostgreSQL database name.
db_user | 11 | 'superset_user' | PostgreSQL user.
database_password_length | 11 | 32 | Password length.
enable_auto_password_rotation | 11 | false | Automated password rotation.
initialization_jobs | 12 | [] | Init jobs. Empty uses default two-phase pipeline.
cron_jobs | 12 | [] | Recurring scheduled jobs.
startup_probe | 13 | { path="/health", initial_delay_seconds=60, failure_threshold=12, ... } | Startup probe.
liveness_probe | 13 | { path="/health", initial_delay_seconds=30, ... } | Liveness probe.
uptime_check_config | 13 | { enabled=true, path="/health" } | Uptime check.
enable_redis | 21 | false | Redis. Recommended for production.
redis_host | 21 | "" | Redis hostname/IP.
redis_port | 21 | 6379 | Redis port (number).
enable_vpc_sc | 22 | false | VPC Service Controls.
enable_audit_logging | 22 | false | Cloud Audit Logs.

11. Outputs

Output | Description
service_name | Name of the Cloud Run service.
service_url | Public URL of the Cloud Run service.
service_location | GCP region.
project_id | GCP project ID.
deployment_id | Deployment ID suffix.
database_instance_name | Name of the Cloud SQL PostgreSQL 15 instance.
database_name | Application database name.
database_user | Application database user.
database_password_secret | Secret Manager secret name for the database password.
storage_buckets | Created GCS storage buckets.
container_image | Container image used.
cicd_enabled | Whether the CI/CD pipeline is enabled.
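
The outputs above can be re-exported from a root configuration, for example to surface the service URL after apply. In this sketch the module label "superset", the source path, the project ID, and the root output names are illustrative assumptions; the referenced attributes are the outputs listed in the table.

```hcl
# Sketch of consuming module outputs; labels and values are assumptions.
module "superset" {
  source     = "./modules/Superset_CloudRun" # assumed relative path
  project_id = "my-gcp-project"              # placeholder project ID
}

output "superset_url" {
  description = "URL of the deployed Superset service."
  value       = module.superset.service_url
}

output "superset_db_password_secret" {
  description = "Secret Manager secret name holding the database password."
  value       = module.superset.database_password_secret
}
```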

Configuration Pitfalls & Sensible Defaults

Risk levels: Critical (data loss, full outage, security breach) — High (service unavailable or significant degradation) — Medium (degraded function or increased cost) — Low (minor impact).

Variable | Sensible Default | Risk | Consequence of Incorrect Value
SUPERSET_SECRET_KEY (via Secret Manager) | Auto-generated 50-char random string | Critical | The module auto-generates a secret key and injects it as SUPERSET_SECRET_KEY. If this value is changed or the secret is rotated, all existing user sessions are immediately invalidated and all encrypted database credentials stored in Superset (connected data sources) become permanently unreadable. Treat as immutable after first deploy.
container_resources.cpu_limit | "2000m" | High | Superset's Python/Flask stack is CPU-intensive at startup. Under 1000m the superset db upgrade migration step (run by the app-init job) times out, leaving the database in a partially migrated state.
container_resources.memory_limit | "2Gi" | High | Under 1 Gi the gunicorn workers are OOM-killed during heavy query execution. "2Gi" is the minimum; "4Gi" is recommended for production with many concurrent users.
SUPERSET_PORT | "8088" (injected in superset.tf) | High | Hardcoded by the module to match container_port. Overriding to a different value without also changing container_port causes health checks and routing to fail entirely.
container_port | 8088 | High | Must match SUPERSET_PORT. Changing one without the other causes a port mismatch between the Cloud Run health check target and the gunicorn bind address.
enable_redis | false | High | Without Redis, async query execution (Celery workers), dashboard cache warming, and the result backend are all disabled. Superset falls back to synchronous execution, which blocks gunicorn workers on long-running queries and can cause a full service hang under moderate load. Always set enable_redis = true for production.
redis_host | "" | High | Required when enable_redis = true. Injected into the runtime environment as REDIS_HOST, alongside REDIS_PORT from redis_port. An empty value causes all Celery workers to fail to connect on startup.
SUPERSET_LOAD_EXAMPLES | "no" (injected in superset.tf) | Medium | If overridden to "yes", Superset loads example dashboards and datasets into the database on every startup, increasing init time significantly and polluting the workspace with demo data.
application_database_name | "superset_db" | High | Changing after the database is initialised orphans the Superset application schema including all dashboards, charts, and database connections. Immutable after first apply.
application_database_user | "superset_user" | High | The database user is created in the db-init job. Renaming requires manual Cloud SQL intervention. Immutable after first apply.
application_version | "latest" | Medium | Pinning to a specific version is strongly recommended for production. "latest" triggers uncontrolled upgrades on every container rebuild, which may introduce breaking changes to the Superset config API or require new migration steps that the app-init job must handle.
min_instance_count | 1 | High | Scale-to-zero causes Celery workers to shut down; any async queries scheduled while the container is cold will be lost. Superset itself has a 30–60 s startup time.
max_instance_count | 5 | Medium | Multiple Superset instances share PostgreSQL but need Redis as a shared result backend for Celery. Without Redis, running max_instance_count > 1 causes async query results to be unavailable to the instance that did not execute the query.
enable_cloudsql_volume | true | Critical | Required for the Cloud SQL Auth Proxy sidecar. Disabling causes all PostgreSQL connections to fail.
ingress_settings | "all" | High | Leaves the Superset UI accessible from the public internet. For internal BI tools, set to "internal-and-cloud-load-balancing".
enable_iap | false | High | Without IAP, Superset's login form is publicly reachable. Always enable IAP or restrict ingress for production deployments.
timeout_seconds | 600 | Medium | Complex Superset SQL queries can run for several minutes. Reducing below 120 s causes long-running analytical queries to be aborted mid-execution.
backup_schedule | "0 2 * * *" | Medium | Disabling automated backups leaves all dashboards, charts, database connection configs, and RLS rules unprotected.
enable_auto_password_rotation | false | Medium | Enabling without sufficient rotation_propagation_delay_sec causes brief intervals of DB connectivity failures during the rotation window.
startup_probe.failure_threshold | 12 | High | Together with initial_delay_seconds=60 and period_seconds=10, this gives Superset 180 s to start. Reducing it can cause Cloud Run to kill the container before Superset has finished its first boot.
db_tier | "db-f1-micro" (Common default) | Medium | The Common module defaults to "db-f1-micro" for Cloud SQL. This is insufficient for production Superset workloads with concurrent users. Override to at least "db-custom-2-7680" in production.
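
Several of the recommendations in this table can be combined into one production-leaning configuration. The sketch below is an assumption-laden example rather than a vetted baseline: the source path, project ID, version tag, Redis address, and retention period are placeholders, and it uses only variables documented in the Variable Reference.

```hcl
# Production-leaning sketch built from the pitfalls above; values are placeholders.
module "superset" {
  source     = "./modules/Superset_CloudRun" # assumed relative path
  project_id = "my-gcp-project"              # placeholder project ID

  application_version = "4.0.2" # illustrative pinned tag; avoids uncontrolled 'latest' upgrades
  memory_limit        = "4Gi"   # recommended for many concurrent users

  # Keep the login form off the public internet.
  ingress_settings = "internal-and-cloud-load-balancing"
  enable_iap       = true

  # Shared Celery broker and result backend, needed when max_instance_count > 1.
  enable_redis = true
  redis_host   = "10.0.0.3" # placeholder address of an existing Redis instance

  # Keep automated backups enabled.
  backup_schedule       = "0 2 * * *"
  backup_retention_days = 14 # placeholder; the module default is 7
}
```

SUPERSET_SECRET_KEY, container_port, and the database name and user are deliberately left at their defaults, since the table treats them as immutable after the first apply.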

Destroying Resources

When destroying a Cloud Run deployment, you may encounter:

Error: Error waiting for Subnetwork to be deleted: The following serverless IPv4 address(es) on subnet ... are still in use.

Resolution: Wait 20–30 minutes and re-run tofu destroy. GCP holds serverless IPv4 addresses asynchronously after Cloud Run service deletion.