Open WebUI on Google Kubernetes Engine (GKE Autopilot)

This document provides a comprehensive reference for the modules/OpenWebUI_GKE Terraform module. It covers architecture, IAM, configuration variables, Open WebUI-specific behaviours, and operational patterns for deploying Open WebUI on GKE Autopilot.


1. Module Overview

Open WebUI is a self-hosted AI interface with 90,000+ GitHub stars, providing a polished ChatGPT-style frontend for Ollama, OpenAI-compatible APIs, and dozens of LLM providers. OpenWebUI GKE is a wrapper module built on top of App GKE. It uses App GKE for all GCP infrastructure provisioning and injects Open WebUI-specific application configuration via OpenWebUI Common.

Key Capabilities:

  • Compute: GKE Autopilot, Kubernetes Deployment, 2 vCPU / 4 Gi by default. HPA scales from min_instance_count = 0 to max_instance_count = 3.
  • Data Persistence: Cloud SQL PostgreSQL 15. OpenWebUI Common provisions a db-init job.
  • IAM: Workload Identity binds the Kubernetes SA to a GCP SA for Secret Manager and GCS access.
  • AI Backend Integration: ollama_base_url and openai_api_base_url variables configure backend connections.
  • Security: Inherits Cloud Armor, Binary Authorization, and VPC Service Controls from App GKE. WEBUI_SECRET_KEY is auto-generated and stored in Secret Manager.
  • No Redis: Open WebUI persists sessions and application state in PostgreSQL, so no Redis session store is required.
  • StatefulSet option: PVC-backed StatefulSet is available via stateful_pvc_enabled = true if local file persistence is needed alongside PostgreSQL.
  • Health: Health probes target /health.

Project & Application Identity

| Variable | Group | Type | Default | Description |
|---|---|---|---|---|
| project_id | 1 | string | (none) | GCP project ID. Required. |
| tenant_deployment_id | 2 | string | 'demo' | Short suffix appended to all resource names. |
| support_users | 2 | list(string) | [] | Email recipients for monitoring alerts. |
| resource_labels | 2 | map(string) | {} | Labels applied to all provisioned resources. |
| application_name | 3 | string | 'openwebui' | Base resource name. Do not change after initial deployment. |
| display_name | 3 | string | 'Open WebUI' | Human-readable name. |
| description | 3 | string | 'Open WebUI — self-hosted AI interface...' | Application description. |
| application_version | 3 | string | 'latest' | Open WebUI image version tag. |
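
As an illustration of how the identity variables above fit together, a minimal module call might look like the following sketch. The module label `openwebui`, the relative source path, and all values are assumptions chosen for the example; only the variable names come from the table.

```hcl
# Minimal sketch of a module call; the label, source path, and values are illustrative assumptions.
module "openwebui" {
  source = "./modules/OpenWebUI_GKE" # assumed relative path to the module

  project_id           = "my-gcp-project"   # required; placeholder project ID
  tenant_deployment_id = "demo"             # short suffix appended to resource names
  support_users        = ["ops@example.com"] # placeholder alert recipient
  resource_labels = {
    team = "ai-platform" # placeholder label
  }

  # application_name keeps its default ('openwebui'); the table advises
  # against changing it after the initial deployment.
  display_name = "Open WebUI"
}
```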

Wrapper architecture: OpenWebUI GKE calls OpenWebUI Common to produce the application configuration, then forwards application_modules, module_secret_env_vars, and module_storage_buckets to App GKE. enable_cloudsql_volume is forwarded through to the OpenWebUI Common call and merged into the config.
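
The wiring described above could be pictured roughly as follows. This is a hypothetical sketch, not the module's actual source: the module paths and the output names on the OpenWebUI Common side are assumptions, and only the forwarded input names (application_modules, module_secret_env_vars, module_storage_buckets, enable_cloudsql_volume) are taken from the text.

```hcl
# Hypothetical sketch of the wrapper pattern; paths and output names are assumed.
variable "enable_cloudsql_volume" {
  type    = bool
  default = true
}

# Step 1: OpenWebUI Common produces the application configuration.
module "openwebui_common" {
  source = "../OpenWebUI_Common" # assumed path

  enable_cloudsql_volume = var.enable_cloudsql_volume
}

# Step 2: App GKE provisions the GCP infrastructure and receives the config.
module "app_gke" {
  source = "../App_GKE" # assumed path

  # Output names on the right-hand side are assumptions for illustration.
  application_modules    = module.openwebui_common.application_modules
  module_secret_env_vars = module.openwebui_common.module_secret_env_vars
  module_storage_buckets = module.openwebui_common.module_storage_buckets
}
```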


2. IAM & Access Control

Workload Identity binds the Kubernetes SA to a GCP SA, granting access to Secret Manager secrets (PostgreSQL password, WEBUI_SECRET_KEY) and GCS buckets. All IAM provisioning is delegated to App GKE.
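
For readers unfamiliar with Workload Identity, the binding that App GKE provisions is conceptually similar to the sketch below. The resource labels, variable names, and the account ID are hypothetical; the module's real implementation may differ.

```hcl
# Illustrative Workload Identity binding; all names here are hypothetical.
variable "project_id" { type = string }
variable "namespace" { type = string }           # Kubernetes namespace of the workload
variable "ksa_name" { type = string }            # Kubernetes service account name

resource "google_service_account" "app" {
  project      = var.project_id
  account_id   = "openwebui-app" # placeholder account ID
  display_name = "Open WebUI workload"
}

# Allow the Kubernetes SA to impersonate the GCP SA.
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = google_service_account.app.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.project_id}.svc.id.goog[${var.namespace}/${var.ksa_name}]"
}

# Grant the GCP SA read access to Secret Manager secrets in the project.
resource "google_project_iam_member" "secret_accessor" {
  project = var.project_id
  role    = "roles/secretmanager.secretAccessor"
  member  = "serviceAccount:${google_service_account.app.email}"
}
```

The Kubernetes service account must also carry the `iam.gke.io/gcp-service-account` annotation pointing at the GCP service account's email for the binding to take effect.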


3. Core Service Configuration

A. Compute (GKE Autopilot)

| Variable | Group | Default | Description |
|---|---|---|---|
| deploy_application | 4 | true | Set false for infrastructure-only deployment. |
| cpu_limit | 4 | '2000m' | CPU per instance. 2 vCPU for RAG workloads. |
| memory_limit | 4 | '4Gi' | Memory per instance. |
| min_instance_count | 4 | 0 | Minimum pod replicas (HPA minReplicas). |
| max_instance_count | 4 | 3 | Maximum pod replicas (HPA maxReplicas). |
| container_port | 4 | 8080 | Open WebUI's HTTP port. |
| enable_cloudsql_volume | 4 | true | Cloud SQL Auth Proxy sidecar for PostgreSQL connection. |
| enable_vertical_pod_autoscaling | 4 | false | Enables VPA for automatic resource adjustment. |
| container_image_source | 4 | 'prebuilt' | 'prebuilt' (official image) or 'custom' (build from source). |
| container_image | 4 | "" | Container image URI override. Leave empty for the official image. |
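
A `terraform.tfvars` fragment for an interactive team deployment might set the compute variables as sketched here. The values are examples rather than recommendations from the module authors, and the file name is an assumption.

```hcl
# terraform.tfvars (sketch): compute settings for interactive use.
cpu_limit    = "2000m" # 2 vCPU per pod
memory_limit = "4Gi"

# Keep one warm replica to avoid cold starts; the module default is 0.
min_instance_count = 1
max_instance_count = 3

container_port         = 8080
enable_cloudsql_volume = true # keep the Cloud SQL Auth Proxy sidecar

# Leave VPA off so HPA scaling between min and max stays active.
enable_vertical_pod_autoscaling = false

container_image_source = "prebuilt"
container_image        = "" # empty means the official image
```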

B. Open WebUI Settings

| Variable | Group | Default | Description |
|---|---|---|---|
| ollama_base_url | 5 | "" | Base URL for the Ollama backend. Leave empty to disable direct integration. |
| openai_api_base_url | 5 | "" | Base URL for an OpenAI-compatible API. |
| default_user_role | 5 | 'pending' | Default role for new user registrations. |
| enable_signup | 5 | true | Allow new users to self-register. |
| webui_auth | 5 | true | Enable the login/authentication system. |
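
The settings above could be combined as in the following sketch, which points Open WebUI at an in-cluster Ollama service and an OpenAI-compatible endpoint. The namespace `ollama` is a placeholder, and how the API key for the OpenAI-compatible backend is supplied is not covered here.

```hcl
# terraform.tfvars (sketch): backend and registration settings.

# In-cluster Ollama service; "ollama" as the namespace is a placeholder.
ollama_base_url = "http://ollama.ollama.svc.cluster.local:11434"

# OpenAI-compatible APIs normally expect the /v1 suffix.
openai_api_base_url = "https://api.openai.com/v1"

# New registrations wait for admin approval.
default_user_role = "pending"
enable_signup     = true
webui_auth        = true # keep the login system enabled
```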

C. Database (Cloud SQL — PostgreSQL 15)

| Variable | Group | Default | Description |
|---|---|---|---|
| db_name | | 'openwebui_db' | PostgreSQL database name. |
| db_user | | 'openwebui_user' | PostgreSQL application user. |

D. Storage

| Variable | Group | Default | Description |
|---|---|---|---|
| gcs_volumes | | [] | GCS buckets to mount via GCS Fuse CSI driver. |

E. Observability

| Variable | Group | Default | Description |
|---|---|---|---|
| startup_probe | | { path="/health", initial_delay_seconds=30, failure_threshold=30 } | Startup probe. |
| liveness_probe | | { path="/health", initial_delay_seconds=60, failure_threshold=3 } | Liveness probe. |
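
Written out as variable assignments, the probe defaults above would look roughly like this. The sketch assumes the probe objects accept exactly the three attributes shown in the defaults; the module's variable type may require or allow additional fields.

```hcl
# terraform.tfvars (sketch): probe settings mirroring the documented defaults.
# Assumes the object type needs only these three attributes.
startup_probe = {
  path                  = "/health"
  initial_delay_seconds = 30
  failure_threshold     = 30 # tolerate a slow first start
}

liveness_probe = {
  path                  = "/health"
  initial_delay_seconds = 60
  failure_threshold     = 3
}
```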

4. Integrations

A. CI/CD

| Variable | Group | Default | Description |
|---|---|---|---|
| enable_cicd_trigger | 8 | false | Provisions a Cloud Build GitHub trigger. |
| github_repository_url | 8 | "" | Full HTTPS URL of the GitHub repository. |
| enable_cloud_deploy | 8 | false | Provisions a Cloud Deploy pipeline. |
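
Enabling the Cloud Build trigger might look like the sketch below. The repository URL is a placeholder, and any further variables the trigger may need (for example, GitHub connection details) are not shown because they are not listed in this section.

```hcl
# terraform.tfvars (sketch): CI/CD integration.
enable_cicd_trigger   = true
github_repository_url = "https://github.com/example-org/example-repo" # placeholder URL
enable_cloud_deploy   = false # leave the Cloud Deploy pipeline off
```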

5. Outputs

| Output | Description |
|---|---|
| service_name | Name of the Kubernetes Service. |
| service_url | URL of the Open WebUI deployment. |
| project_id | GCP project ID. |
| deployment_id | Deployment ID suffix used in resource names. |
| database_instance_name | Name of the Cloud SQL PostgreSQL instance. |
| database_name | Name of the application database. |
| database_password_secret | Secret Manager secret name for the database password. |
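
A root configuration could surface these outputs as sketched below. The module label `openwebui`, the source path, and the root-level output names are assumptions for the example.

```hcl
# Sketch: consuming module outputs from a root configuration.
module "openwebui" {
  source     = "./modules/OpenWebUI_GKE" # assumed relative path
  project_id = "my-gcp-project"          # placeholder project ID
}

# Re-export the deployment URL for use after `terraform apply`.
output "openwebui_url" {
  value = module.openwebui.service_url
}

# Expose only the secret's name; the password itself stays in Secret Manager.
output "openwebui_db_password_secret" {
  value = module.openwebui.database_password_secret
}
```

After applying, `terraform output openwebui_url` prints the value of the first output.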

Configuration Pitfalls & Sensible Defaults

Risk levels: Critical (data loss, full outage, security breach) — High (service unavailable or significant degradation) — Medium (degraded function or increased cost) — Low (minor impact).

| Variable | Sensible Default | Risk | Consequence of Incorrect Value |
|---|---|---|---|
| ollama_base_url | "" (disabled) | Critical | If neither ollama_base_url nor openai_api_base_url is configured, Open WebUI has no AI backend and all model inference requests fail. For GKE-to-GKE communication, use the ClusterIP DNS name: http://ollama.<namespace>.svc.cluster.local:11434. |
| openai_api_base_url | "" (disabled) | High | Must include the /v1 suffix for standard OpenAI-compatible APIs. Omitting it causes 404 errors on all model listing and completion calls. |
| WEBUI_SECRET_KEY (auto-generated) | Random 32-char secret stored in Secret Manager | Critical | Treat as immutable after first use. Changing the secret key (e.g., by redeploying with a new random value) logs out every active user simultaneously and invalidates all remember-me tokens. |
| webui_auth | true | Critical | Disabling removes the login form. Any pod-level network access immediately grants full admin control. Only safe behind IAP or in isolated development namespaces. |
| default_user_role | "pending" | High | "user" auto-approves registrations. Combined with a publicly exposed Kubernetes ingress, this allows unrestricted self-signup. |
| enable_signup | true | High | Disabling is the strongest registration control short of IAP. Set enable_signup = false after the admin account is created in production. |
| min_instance_count | 1 | Medium | Scale-to-zero (0) causes pod cold starts (30–60 s for Cloud SQL proxy + app startup). The GKE Autopilot node may also need to provision. Set to 1 for interactive team use. |
| enable_cloudsql_volume | true | Critical | Disabling the Cloud SQL Auth Proxy sidecar breaks all database connections when the module is configured to use Cloud SQL. Only disable when connecting to an external PostgreSQL over TCP with direct network access. |
| enable_vertical_pod_autoscaling | false | Medium | Enabling VPA disables HPA (they conflict). On GKE Autopilot, VPA is the recommended approach for right-sizing. If both are configured simultaneously, HPA CPU/memory scaling is silently dropped. |
| quota_memory_requests / quota_memory_limits | "4Gi" / "8Gi" | Critical | Must use binary suffixes (Gi, Mi). A bare integer is treated as bytes by Kubernetes, preventing all pod scheduling in the namespace. |
| stateful_pvc_enabled | false | Medium | Enabling PVC storage for Open WebUI prevents pod migration and complicates rolling updates (a StatefulSet pod must be manually deleted for rescheduling). Use GCS Fuse unless local IOPS are critical. |
| timeout_seconds | 300 | Medium | Long-running document ingestion or streaming LLM responses are cut off if the load balancer timeout is exceeded. Increase to 600–3600 for document-heavy RAG workloads. |
| database_type | "POSTGRES" | Critical | Open WebUI requires PostgreSQL. Any other database type results in failed migrations and a completely non-functional application. |
| enable_nfs | false | Medium | Without NFS or a shared PVC, uploaded files are local to each pod. With multiple replicas, a file uploaded via one pod is not accessible via another. |
| backup_schedule | "" (disabled) | High | Without scheduled backups, the PostgreSQL database (all user accounts, conversations, RAG data) is unprotected against accidental deletion. |
| enable_image_mirroring | true | Medium | Disabling mirroring pulls images directly from GitHub Container Registry. GHCR has rate limits that can block deployments during high-traffic periods. |
| application_version | "latest" | Medium | Auto-applying the latest Open WebUI version on each deploy risks breaking schema migrations. Pin to a specific release (e.g., "0.4.8") in production. |
| service_type (Kubernetes) | "ClusterIP" | High | Using LoadBalancer for Open WebUI on GKE exposes the service without TLS termination or authentication unless an ingress controller handles it. Use ClusterIP behind an Ingress/Gateway resource. |
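
Several of the pitfalls above can be addressed together in a production variable file. The fragment below is one possible sketch; the values are taken from the examples in the table and should be adjusted per workload. backup_schedule is omitted because its expected format is not documented in this section.

```hcl
# terraform.tfvars (sketch): production-oriented overrides.

# Pin the image version instead of tracking "latest".
application_version = "0.4.8"

# Allow long-running RAG ingestion and streaming responses.
timeout_seconds = 600

# Binary suffixes are required; a bare integer would be read as bytes.
quota_memory_requests = "4Gi"
quota_memory_limits   = "8Gi"

# Close self-registration once the admin account exists.
enable_signup = false

# Keep the service internal and expose it through an Ingress/Gateway.
service_type = "ClusterIP"

# Mirror images to avoid GHCR rate limits.
enable_image_mirroring = true
```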