Django GKE Module — Configuration Guide

Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design. This module deploys a production-ready Django application on GKE Autopilot, backed by a managed Cloud SQL PostgreSQL instance, GCS media storage, and Secret Manager for secrets including the Django SECRET_KEY.

Django GKE is a wrapper module built on top of App GKE. It uses App GKE for all GCP infrastructure provisioning (cluster, networking, Cloud SQL, GCS, Filestore, secrets, CI/CD) and adds Django-specific application configuration on top via the Django Common sub-module.

Note: Variables marked as platform-managed are set and maintained by the platform. You do not normally need to change them.


How This Guide Is Structured

This guide documents only the variables that are unique to Django_GKE or that have Django-specific defaults that differ from the App_GKE base module. For all other variables — project identity, runtime scaling, backend configuration, CI/CD, networking, IAP, Cloud Armor, and VPC Service Controls — refer directly to the App_GKE Configuration Guide.

Variables fully covered by the App GKE guide:

| Configuration Area | App GKE.md Section | Django-Specific Notes |
|---|---|---|
| Module Metadata & Configuration | §1 Module Overview | Different defaults for module_description and module_documentation. |
| Project & Identity | §2 IAM & Access Control | Identical, plus deployment_region for the fallback region. |
| Application Identity | §3.A Compute (GKE Autopilot) | See Django Application Identity below. application_name defaults to "django". |
| Runtime & Scaling | §3.A Compute (GKE Autopilot) | min_instance_count defaults to 0. container_image_source defaults to "custom". container_port defaults to 8080. |
| Environment Variables & Secrets | §3 Core Service Configuration | Django Common injects DB_HOST, DB_ENGINE, the other database variables, and SECRET_KEY automatically; see Platform-Managed Behaviours. |
| Networking & Network Policies | §3.D Networking & Network Policies | session_affinity defaults to "ClientIP"; see Session Affinity. |
| Initialization Jobs & CronJobs | §3.E Initialization Jobs & CronJobs | See Initialization Jobs for Django-specific job patterns. |
| Additional Services | §3.F Additional Services | Identical. |
| Storage: NFS & GCS | §3.C Storage (NFS / GCS / GCS Fuse) | enable_nfs defaults to true for Django (shared file storage across pods). The media GCS bucket is provisioned automatically by Django Common. |
| Database Configuration | §3.B Database (Cloud SQL) | See Django Database Configuration. PostgreSQL required; Django-specific extensions auto-installed by Django Common. |
| Backup Schedule & Retention | §3.B Database (Cloud SQL) | Identical. |
| Custom SQL Scripts | §3.E Initialization Jobs & CronJobs | Identical. |
| Observability & Health Checks | §3.A Compute (GKE Autopilot) | See Django Health Probes; Django GKE exposes a dual probe system. |
| Cloud Armor WAF | §4.A Cloud Armor WAF | Identical. |
| Identity-Aware Proxy | §4.B Identity-Aware Proxy (IAP) | Requires additional GKE-specific variables: iap_oauth_client_id, iap_oauth_client_secret, iap_support_email (group 19). See Identity-Aware Proxy (GKE-specific) below. |
| Binary Authorization | §4.C Binary Authorization | Identical. |
| VPC Service Controls | §4.D VPC Service Controls | Identical. |
| Secrets Store CSI Driver | §4.E Secrets Store CSI Driver | Identical. |
| Traffic & Ingress | §5 Traffic & Ingress | Identical. |
| CDN | §5.B CDN | Identical. |
| Static IP | §5.C Static IP | Identical. |
| Cloud Build Triggers | §6.A Cloud Build Triggers | Identical. |
| Cloud Deploy Pipeline | §6.B Cloud Deploy Pipeline | Identical. |
| Image Mirroring | §6.C Image Mirroring | enable_image_mirroring defaults to true. |
| Pod Disruption Budgets | §7.A Pod Disruption Budgets | Identical. |
| Topology Spread Constraints | §7.B Topology Spread Constraints | Identical. |
| Resource Quotas | §7.C Resource Quotas | Identical. |
| Auto Password Rotation | §7.D Auto Password Rotation | See Password Rotation Propagation Delay below. |
| Redis Cache | §8.A Redis | enable_redis defaults to false for Django. See Redis Configuration for Django-specific usage. |
| Backup Import | §8.B Backup Import | Uses backup_uri instead of backup_file; the variable maps directly to App GKE's backup_file. |
| Service Mesh (ASM) | §8.C Service Mesh (ASM via Fleet) | Identical. |
| Multi-Cluster Services | §8.D Multi-Cluster Services (MCS) | Identical. |

Platform-Managed Behaviours

The following behaviours are applied automatically by Django GKE (via the Django Common sub-module) regardless of the variable values in your tfvars file. They cannot be overridden by user configuration.

| Behaviour | Detail |
|---|---|
| Django environment variables | Django Common injects the following environment variables automatically: DB_ENGINE (django.db.backends.postgresql), DB_HOST (Cloud SQL Auth Proxy socket path, e.g. /cloudsql/PROJECT:REGION:INSTANCE), DB_PORT (5432), DB_NAME, DB_USER. These values are derived from the Cloud SQL instance provisioned by App GKE and do not need to be set manually in environment_variables. |
| Django secret key | A random SECRET_KEY is auto-generated and stored in Secret Manager. It is injected into the container as the SECRET_KEY environment variable via module_secret_env_vars. Do not set SECRET_KEY in environment_variables; the platform-managed value in Secret Manager takes precedence. |
| PostgreSQL extensions | The following extensions are installed automatically in the application database during the initialisation job: pg_trgm, unaccent, hstore, citext. These are required for Django's full-text search, accent-insensitive lookups, and schema-flexible field types. You do not need to set enable_postgres_extensions = true for these extensions. |
| Database initialisation | A dedicated Django database user is created with the password from Secret Manager and granted the permissions required by the application. The postgres superuser is used only for the extension and user setup jobs. |
| GCS media storage | When gcs_volumes is configured (e.g. a bucket mounted at /app/media), Django Common provisions the bucket and grants the application service account roles/storage.objectAdmin and roles/storage.legacyBucketReader. The Django application can read and write user-uploaded media files directly to the GCS-mounted path. |
| NFS enabled by default | enable_nfs defaults to true so that shared persistent storage is available across all pod replicas for Django media files. If you configure GCS volumes for media instead of NFS, set enable_nfs = false to suppress Filestore provisioning. |
| Session affinity | session_affinity defaults to "ClientIP" so that a given user's requests are consistently routed to the same pod. This prevents session inconsistency in deployments that use in-process session storage or local caching rather than Redis. |
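
As an illustration of how an application might consume the injected variables, the sketch below builds Django's DATABASES and SECRET_KEY settings from the environment. The helper names database_settings and secret_key are hypothetical and not part of the module, and reading DB_PASSWORD with an empty fallback is an assumption about how your settings.py treats a missing secret.

```python
import os


def database_settings(env=os.environ):
    """Build a Django DATABASES dict from the injected DB_* variables."""
    return {
        "default": {
            "ENGINE": env.get("DB_ENGINE", "django.db.backends.postgresql"),
            "NAME": env["DB_NAME"],
            "USER": env["DB_USER"],
            # Assumption: an empty password is tolerated here; fail fast instead if you prefer.
            "PASSWORD": env.get("DB_PASSWORD", ""),
            # libpq treats a host that is an absolute path (e.g. /cloudsql/...) as a
            # Unix socket directory, which is how the Auth Proxy socket is reached.
            "HOST": env["DB_HOST"],
            "PORT": env.get("DB_PORT", "5432"),
        }
    }


def secret_key(env=os.environ):
    """Return the platform-injected SECRET_KEY, refusing to start without it."""
    key = env.get("SECRET_KEY")
    if not key:
        raise RuntimeError("SECRET_KEY is not set in the environment")
    return key
```

In settings.py this could be wired up as `DATABASES = database_settings()` and `SECRET_KEY = secret_key()`, so the application never carries its own copy of either value.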

Identity-Aware Proxy (GKE-specific)

Django GKE exposes three IAP variables not present in Django CloudRun or App GKE's default IAP configuration. These are required when enable_iap = true:

| Variable | Group | Default | Description |
|---|---|---|---|
| iap_oauth_client_id | 19 | "" | OAuth client ID. Create in Google Cloud Console > APIs & Services > Credentials. Sensitive (sensitive = true). |
| iap_oauth_client_secret | 19 | "" | OAuth client secret. Sensitive (sensitive = true). |
| iap_support_email | 19 | "" | Support email shown on the OAuth consent screen. Must be a valid email or Google Group address. Validated by regex. |

A validation.tf precondition enforces that both iap_oauth_client_id and iap_oauth_client_secret are non-empty when enable_iap = true.
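
The rule enforced by that precondition can be sketched in Python for use in a pre-apply check script. This is only an approximation of the logic described above: the function name validate_iap is hypothetical, the email pattern is a simplified stand-in because the module's actual regex is not shown in this guide, and skipping the email check when the value is empty is an assumption.

```python
import re

# Simplified stand-in pattern; the module's real regex is not reproduced here.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_iap(enable_iap, client_id="", client_secret="", support_email=""):
    """Return a list of error messages; an empty list means the inputs pass."""
    errors = []
    if not enable_iap:
        return errors  # nothing to check when IAP is disabled
    if not client_id:
        errors.append("iap_oauth_client_id must be non-empty when enable_iap = true")
    if not client_secret:
        errors.append("iap_oauth_client_secret must be non-empty when enable_iap = true")
    # Assumption: an empty support email is not checked against the pattern.
    if support_email and not EMAIL_RE.match(support_email):
        errors.append("iap_support_email is not a valid email address")
    return errors
```

Running such a check before `tofu plan` surfaces the same class of mistake earlier, but the module's own precondition remains the source of truth.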


Django Application Identity

These variables have Django-specific defaults. Their semantics are identical to the equivalents in App_GKE.md §3.A.

| Variable | Default | Description & Implications |
|---|---|---|
| application_name | "django" | Internal identifier used as the base name for GKE workloads, Cloud SQL, GCS buckets, and Artifact Registry. Functionally identical to application_name in App GKE. Do not change after initial deployment. |
| application_display_name | "Django Application" | Human-readable name shown in the platform UI and monitoring dashboards. Can be updated freely at any time. |
| application_description | "Django Application - High-level Python Web framework on GKE Autopilot" | Brief description populated into Kubernetes annotations and platform documentation. |
| application_version | "latest" | Version tag applied to the container image. When container_image_source = "custom", incrementing this value triggers a new Cloud Build run. Prefer a pinned version (e.g. "v1.2.0") over "latest" in production to ensure reproducible deployments. |

Validating Application Identity

# Confirm the Deployment exists with the expected name
kubectl get deployments -n NAMESPACE -o wide

# View workload annotations (description is stored here)
kubectl describe deployment django -n NAMESPACE | grep -A5 Annotations

Django Database Configuration

Django requires PostgreSQL. All database variables behave identically to those documented in App_GKE.md §3.B, with the following Django-specific notes.

| Variable | Default | Description & Implications |
|---|---|---|
| application_database_name | "gkeapp" | The name of the PostgreSQL database created within the Cloud SQL instance. Injected as DB_NAME. Recommended: change to "django_db" to clearly identify this as a Django database. Do not change after initial deployment: renaming the database requires manual data migration. |
| application_database_user | "gkeapp" | The PostgreSQL user created for the Django application. Injected as DB_USER. Recommended: change to "django_user" for clarity. The password is auto-generated, stored in Secret Manager, and injected as DB_PASSWORD. |
| database_type | "POSTGRES" | Cloud SQL database engine. Django requires PostgreSQL; do not change to MYSQL or SQLSERVER. The Django DB_ENGINE variable (django.db.backends.postgresql) is hard-wired by Django Common and will not work with non-PostgreSQL engines. Use a versioned value such as "POSTGRES_15" in production for consistency across environments. |
| enable_postgres_extensions | false | You do not need to set this to true for Django's required extensions (pg_trgm, unaccent, hstore, citext); these are installed automatically by Django Common. Set enable_postgres_extensions = true only if you need to install additional extensions beyond those managed by the platform. |
| postgres_extensions | [] | Additional PostgreSQL extensions to install. Used only when enable_postgres_extensions = true. The Django Common-managed extensions (pg_trgm, unaccent, hstore, citext) are always installed regardless of this list. Common additions: postgis (geospatial queries), uuid-ossp (UUID generation), pg_stat_statements (query performance analysis). |
| enable_mysql_plugins | false | MySQL plugins. Not applicable: Django does not support MySQL in the default module configuration. Leave as false. |
| mysql_plugins | [] | MySQL plugins list. Not applicable for Django. Leave as []. |

Note on database_password_length: This variable and its default, together with enable_auto_password_rotation and rotation_propagation_delay_sec, are documented in App_GKE.md §3.B. See also Password Rotation Propagation Delay below.

Validating Database Configuration

# Confirm the database and user were created
gcloud sql databases list --instance=INSTANCE_NAME --project=PROJECT_ID

gcloud sql users list --instance=INSTANCE_NAME --project=PROJECT_ID

# Confirm DB environment variables are injected into the running pod
kubectl exec -n NAMESPACE POD_NAME -- env | grep -E "^DB_"

# Confirm SECRET_KEY is injected
kubectl exec -n NAMESPACE POD_NAME -- env | grep SECRET_KEY

Django Health Probes

Django GKE exposes two separate sets of probe variables with different routing:

  • startup_probe / liveness_probe — Django-specific variables passed to Django Common, which uses them to configure the Django container's Kubernetes probe spec. Both target the / path by default; configure a dedicated health endpoint (e.g. /healthz/) for cleaner health signalling.
  • startup_probe_config / health_check_config — App_GKE-standard variables passed directly to App_GKE. These also default to / in Django_GKE and use App_GKE's standard timeout defaults. See App_GKE.md §3.A for full documentation.

In practice, use startup_probe and liveness_probe to tune Django probe behaviour. The startup_probe_config / health_check_config variables are available for compatibility but are not the primary probe path for the Django container.

startup_probe and liveness_probe (Django Common internal probes):

| Variable | Default | Description & Implications |
|---|---|---|
| startup_probe | { enabled = true, type = "HTTP", path = "/", initial_delay_seconds = 90, timeout_seconds = 5, period_seconds = 10, failure_threshold = 3 } | Used by Django Common to assess whether Django has started successfully. initial_delay_seconds = 90 accounts for Django's startup time (database connection, application loading). The path / assumes a Django view responds to the root URL; configure a dedicated /healthz/ view for cleaner health signalling. |
| liveness_probe | { enabled = true, type = "HTTP", path = "/", initial_delay_seconds = 60, timeout_seconds = 5, period_seconds = 30, failure_threshold = 3 } | Used by Django Common to assess whether a running Django instance is healthy. |
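
To judge whether these values leave Django enough time to start, it helps to work out the approximate startup window a probe allows. The sketch below uses the usual Kubernetes approximation of initial delay plus failure threshold times period; the function name startup_budget_seconds is hypothetical, and the result ignores timeout_seconds and scheduling jitter.

```python
def startup_budget_seconds(initial_delay_seconds, period_seconds, failure_threshold):
    """Approximate time a container has to pass its startup probe.

    Kubernetes waits initial_delay_seconds, then probes every period_seconds
    and gives up after failure_threshold consecutive failures.
    """
    return initial_delay_seconds + period_seconds * failure_threshold


# Default startup_probe values from the table above.
default_budget = startup_budget_seconds(90, 10, 3)
```

With the defaults this comes to 120 seconds, so a deployment that runs long migrations at startup would likely need a higher failure_threshold or a longer period.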

startup_probe_config / health_check_config (App GKE-standard probes):

These variables control the App_GKE-level probes passed directly to the App_GKE module. They are documented in App_GKE.md §3.A.

Django-specific defaults:

  • startup_probe_config: { enabled = true, type = "TCP", path = "/", initial_delay_seconds = 0, timeout_seconds = 240, period_seconds = 240, failure_threshold = 1 }
  • health_check_config: { enabled = true, type = "HTTP", path = "/", initial_delay_seconds = 0, timeout_seconds = 1, period_seconds = 10, failure_threshold = 3 }

Best practice: Implement a dedicated health endpoint (e.g. GET /healthz/) in your Django application that returns HTTP 200 when the app is ready (database connected, migrations applied). Then set path = "/healthz/" in both the startup_probe / liveness_probe and the startup_probe_config / health_check_config variables for consistent health signalling.
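
One way to provide such an endpoint without touching Django's URL configuration is a small WSGI wrapper, sketched below under stated assumptions. The class name HealthCheckMiddleware and the optional ready callable are hypothetical; by default the wrapper answers without any database access, and a readiness check (for example a database ping) can be supplied through ready if you want the stricter behaviour described above.

```python
class HealthCheckMiddleware:
    """WSGI wrapper that answers the health path before Django is invoked."""

    def __init__(self, app, path="/healthz/", ready=lambda: True):
        self.app = app      # the wrapped WSGI application (e.g. Django)
        self.path = path    # must match the probe path configured in tfvars
        self.ready = ready  # optional readiness callable returning True/False

    def __call__(self, environ, start_response):
        if environ.get("PATH_INFO") != self.path:
            # Not a health request: hand over to the wrapped application.
            return self.app(environ, start_response)
        if self.ready():
            status, body = "200 OK", b"ok"
        else:
            status, body = "503 Service Unavailable", b"not ready"
        headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
        start_response(status, headers)
        return [body]
```

In wsgi.py this might be applied as `application = HealthCheckMiddleware(get_wsgi_application())`, using Django's `django.core.wsgi.get_wsgi_application`. Because the wrapper sits in front of Django, the health path also bypasses ALLOWED_HOSTS checks, which probes addressed by pod IP would otherwise fail.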

Validating Health Probe Configuration

Google Cloud Console: Navigate to Kubernetes Engine → Workloads → django deployment, click a pod, and select the Events tab to view probe failure events.

# View startup and liveness probe config on the deployment pod spec
kubectl get deployment django -n NAMESPACE \
-o jsonpath='{.spec.template.spec.containers[0].startupProbe}' | jq .

kubectl get deployment django -n NAMESPACE \
-o jsonpath='{.spec.template.spec.containers[0].livenessProbe}' | jq .

# View pod restart counts (rising count indicates probe failures)
kubectl get pods -n NAMESPACE -o wide

# View Django startup logs
kubectl logs -n NAMESPACE -l app=django --since=10m | head -100

Redis Configuration

Django can use Redis as a session store and caching backend via django-redis. When enable_redis = true, Django Common injects the REDIS_HOST and REDIS_PORT environment variables into the Django container automatically.

| Variable | Default | Options / Format | Description & Implications |
|---|---|---|---|
| enable_redis | false | true / false | When true, REDIS_HOST and REDIS_PORT are injected into the container. Your Django settings.py must be configured to use these variables for CACHES and SESSION_ENGINE. Recommended for production deployments with multiple replicas or where Django's session framework is used. |
| redis_host | "" (falls back to NFS server IP) | IP address or hostname | The hostname or IP address of the Redis server. Leave empty to fall back to the NFS server IP (suitable for single-VM shared environments where Redis is co-located). For production, set this explicitly to a Cloud Memorystore for Redis private IP. The cluster must be able to reach this address over the VPC. |
| redis_port | "6379" | Port number as string | The TCP port of the Redis server. The default 6379 is the standard Redis port and is correct for Cloud Memorystore and most self-hosted Redis instances. |
| redis_auth | "" (no authentication) | Password string (sensitive) | Authentication password for the Redis server. Leave empty if the Redis instance does not require authentication. When set, the value is stored securely and never appears in Terraform state in plaintext. For Cloud Memorystore with AUTH enabled, set this to the instance's auth string. |

Provisioning Redis: The Django GKE module does not provision a Redis instance. Provision a Cloud Memorystore instance separately, or deploy Services GCP first — it provides a shared Memorystore instance that is auto-discovered when redis_host is left blank.
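
Since settings.py has to consume REDIS_HOST and REDIS_PORT itself, a minimal sketch of that wiring with django-redis follows. The helper name redis_settings is hypothetical, and REDIS_AUTH is an assumed variable name for the password: this guide does not state under which name, if any, redis_auth is exposed to the container.

```python
import os
from urllib.parse import quote


def redis_settings(env=os.environ, db=1):
    """Return cache and session settings for django-redis, or {} without Redis."""
    host = env.get("REDIS_HOST")
    if not host:
        return {}  # keep Django's defaults when Redis is not configured
    port = env.get("REDIS_PORT", "6379")
    password = env.get("REDIS_AUTH", "")  # assumed name, see the note above
    # Percent-encode the password so characters like "@" do not break the URL.
    auth = f":{quote(password, safe='')}@" if password else ""
    return {
        "CACHES": {
            "default": {
                "BACKEND": "django_redis.cache.RedisCache",
                "LOCATION": f"redis://{auth}{host}:{port}/{db}",
                "OPTIONS": {"CLIENT_CLASS": "django_redis.client.DefaultClient"},
            }
        },
        # Store sessions in the cache defined above.
        "SESSION_ENGINE": "django.contrib.sessions.backends.cache",
        "SESSION_CACHE_ALIAS": "default",
    }
```

Calling `globals().update(redis_settings())` near the end of settings.py applies these values only when REDIS_HOST is present, so the same settings file works with enable_redis set to either value.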

Validating Redis Configuration

# Confirm REDIS_HOST and REDIS_PORT are injected into the pod
kubectl exec -n NAMESPACE POD_NAME -- env | grep REDIS

# Test Redis connectivity from within the cluster (using a debug pod or exec)
# redis-cli -h REDIS_HOST -p 6379 ping

Session Affinity

Django applications that rely on in-process session storage or local caching benefit from routing a given user's requests consistently to the same pod.

| Variable | Default | Options / Format | Description & Implications |
|---|---|---|---|
| session_affinity | "ClientIP" | None / ClientIP | ClientIP: the Kubernetes Service routes requests from a given client IP to the same pod for the duration of the affinity timeout (default 10800 seconds / 3 hours). Prevents session inconsistency on deployments using in-process session storage or local-memory caching. None: requests are distributed across all pods without affinity. Use None when all session and cache state is externalised to Redis or the database, as is the case for fully stateless Django deployments. |

Note: session_affinity is documented in App_GKE.md §3.D (Networking & Network Policies). The "ClientIP" default in Django_GKE differs from the App_GKE default; this is intentional, to provide better out-of-the-box behaviour for Django session handling.


Initialization Jobs

Django deployments typically require database setup and schema migration jobs to run before (or immediately after) the application starts. Django_GKE supports these via the initialization_jobs variable (documented in App_GKE.md §3.E).

Django GKE does not configure a default initialization_jobs list — the variable defaults to []. You must configure initialization jobs explicitly in your tfvars file.

Recommended Django initialization jobs:

initialization_jobs = [
  {
    name              = "db-init"
    description       = "Create Django Database and User"
    image             = "postgres:15-alpine"
    script_path       = "db-init.sh"
    mount_nfs         = false
    mount_gcs_volumes = []
    execute_on_apply  = true
  },
  {
    name              = "db-migrate"
    description       = "Run Django Migrations"
    image             = null # Uses the application image
    script_path       = "migrate.sh"
    mount_nfs         = false
    mount_gcs_volumes = ["django-media"] # If GCS media volume is configured
    execute_on_apply  = false
  }
]

Job descriptions:

| Job | Image | Purpose |
|---|---|---|
| db-init | postgres:15-alpine | Creates the Django database and user in the Cloud SQL instance. Uses the postgres superuser credentials from Secret Manager. Set execute_on_apply = true so that the database is ready before the application starts. |
| db-migrate | Application image (null) | Runs python manage.py migrate and python manage.py collectstatic. Uses the application's own container image so that it has access to the current migration files. Set execute_on_apply = false to run only on explicit invocation, or true to apply on every deployment. |

Script location: Both db-init.sh and migrate.sh are provided by Django_Common at modules/Django_Common/scripts/. They are referenced via script_path and loaded automatically by the platform. You do not need to copy or manage these scripts.

Validating Initialization Jobs

# List all Kubernetes Jobs in the namespace
kubectl get jobs -n NAMESPACE

# View logs of the db-init job
kubectl logs -n NAMESPACE -l job-name=db-init --tail=50

# View logs of the db-migrate job
kubectl logs -n NAMESPACE -l job-name=db-migrate --tail=50

# Confirm the database was created
gcloud sql databases list --instance=INSTANCE_NAME --project=PROJECT_ID

StatefulSet PVC Configuration

When workload_type = "StatefulSet" is set (see App_GKE.md §3.A), the following variables configure the per-pod PersistentVolumeClaim automatically created for each StatefulSet replica.

Django use case: A StatefulSet is rarely needed for Django. The default Deployment workload type with GCS-mounted media (gcs_volumes) and shared NFS (enable_nfs = true) is recommended. Use a StatefulSet only if your Django application requires per-pod local persistent storage that cannot be externalised to GCS or NFS.

| Variable | Default | Options / Format | Description & Implications |
|---|---|---|---|
| stateful_pvc_enabled | false | true / false | When true and workload_type = "StatefulSet", a PVC is provisioned for each pod replica. Leave false for standard Deployment workloads. |
| stateful_pvc_size | "10Gi" | Kubernetes storage quantity | Capacity of each per-pod PVC. Size based on expected local storage needs for the pod. |
| stateful_pvc_mount_path | "/data" | Filesystem path | Path inside the container where the PVC is mounted. Ensure this matches the Django file path configuration. |
| stateful_pvc_storage_class | "standard-rwo" | Kubernetes storage class name | Storage class for PVCs. "standard-rwo" (ReadWriteOnce) is the GKE Autopilot default. Use "premium-rwo" for lower latency I/O. |
| stateful_headless_service | true | true / false | Creates a headless Kubernetes Service giving each pod a stable DNS name. Required if pods need to address each other directly. |
| stateful_pod_management_policy | "OrderedReady" | OrderedReady / Parallel | OrderedReady: pods start sequentially, each waiting for the previous to be ready, which is safer for coordinated startup. Parallel: all pods start simultaneously. |
| stateful_update_strategy | "RollingUpdate" | RollingUpdate / OnDelete | RollingUpdate: pods are updated automatically one at a time. OnDelete: pods are only updated when manually deleted. |

Password Rotation Propagation Delay

The rotation_propagation_delay_sec variable controls how long the module waits after writing a new database password to Secret Manager before restarting the GKE pods to pick up the new credentials. It is used together with enable_auto_password_rotation (documented in App_GKE.md §7.D).

| Variable | Default | Options / Format | Description & Implications |
|---|---|---|---|
| rotation_propagation_delay_sec | 90 | Integer (seconds) | Seconds to wait after updating the DB_PASSWORD secret before triggering a rolling restart of the Django Deployment. This delay allows Secret Manager's global replication to complete before pods attempt to reconnect with new credentials. Increase to 120 in multi-region deployments or if you observe rotation failures. Only used when enable_auto_password_rotation = true. |

Resource Creator Identity

| Variable | Default | Options / Format | Description & Implications |
|---|---|---|---|
| resource_creator_identity | "rad-module-creator@tec-rad-ui-2b65.iam.gserviceaccount.com" | Service account email | The service account used by Terraform to create and manage GCP resources. For enhanced security, replace with a project-scoped service account granted only the minimum permissions required by this module. |

Configuration Pitfalls & Sensible Defaults

The table below identifies the variables most commonly misconfigured in Django GKE deployments, explains the sensible starting value, and describes exactly what happens when the value is wrong. For full variable details see the [App GKE configuration guide](App GKE.md).

Risk levels: Critical (data loss, full outage, security breach) — High (service unavailable or significant degradation) — Medium (degraded function or increased cost) — Low (minor impact).

| Variable | Sensible Default | Risk | Consequence of Incorrect Value |
|---|---|---|---|
| application_name | "django" (default; do not change after first deploy) | Critical | Embedded in GKE namespace name, Artifact Registry repo, Secret Manager secrets, and Cloud SQL database. Changing causes all named resources to be recreated, with complete data loss. |
| tenant_deployment_id | Match environment: "prod", "staging", "dev" | Critical | Changing after first deploy recreates all named resources. The old Cloud SQL instance (with all data) is orphaned and a new empty one is created. |
| application_version | A pinned tag (e.g. "1.2.3"); avoid "latest" in production | Medium | "latest" makes rollback ambiguous: Kubernetes cannot distinguish between two "latest" image pulls. Always pin to a meaningful digest or version tag in production. |
| workload_type | null (auto-selects Deployment when stateful_pvc_enabled is null/false) | Critical | Setting "StatefulSet" without stateful_pvc_enabled = true creates a StatefulSet with no stable storage. Setting "Deployment" alongside stateful_pvc_enabled = true fails at plan time. For NFS-only deployments keep null (Deployment); StatefulSet is only needed when each pod requires its own independent disk. |
| stateful_pvc_size | "10Gi" (default); increase based on application data volume | High | PVC storage cannot be decreased after provisioning. Set too small: the disk fills up, Django raises OSError: [Errno 28] No space left on device, and writes fail. Always provision 2–3× the expected data size to allow for growth. |
| quota_memory_requests | "4Gi" (use a binary suffix: Gi or Mi) | Critical | A bare integer like "4" is treated as 4 bytes by Kubernetes. The ResourceQuota rejects every pod that requests more than 4 bytes of memory, which means all pods fail to schedule. Always use "4Gi" or "4096Mi". This is the most common GKE deployment failure for this module. |
| quota_memory_limits | "8Gi" (must be ≥ quota_memory_requests) | Critical | Same bare-integer issue as quota_memory_requests. A value of "8" means 8 bytes, blocking all pod scheduling immediately. |
| min_instance_count | 1 for production (eliminates cold starts on GKE) | Medium | 0 for production: GKE scales to zero, so Django pods are fully stopped when idle. On the next request, Kubernetes must schedule a new pod, pull the image (if not cached), and wait for the startup probe. Total cold-start time can exceed 60 seconds. |
| max_instance_count | ≤ Cloud SQL max_connections ÷ avg_DB_connections_per_pod | High | Exceeding Cloud SQL's connection limit causes FATAL: sorry, too many clients already. All Django pods simultaneously fail database queries. |
| container_resources | { cpu_limit = "1000m", memory_limit = "512Mi" } minimum; increase for media-processing workloads | High | Memory too low: Django pod is OOMKilled (exit code 137) when processing large uploads or loading large querysets. GKE Autopilot enforces a minimum of 1 CPU and 512Mi per pod; requests below the minimum are silently raised to the minimum, but limits below minimum cause plan-time errors. |
| enable_cloudsql_volume | true (default; Cloud SQL Auth Proxy sidecar, secure and recommended) | High | false: Django must reach Cloud SQL over TCP via private IP. If Private Service Access is not configured, all DB connections fail. IAM-based auth is lost; password-only auth is required. |
| cloudsql_volume_mount_path | "/cloudsql" (default; db-init.sh uses this socket path) | Critical | Wrong path: db-init.sh cannot find the Auth Proxy Unix socket. DB init fails. The pod starts but crashes on the first database call with no such file or directory. |
| enable_nfs | true (default; required for shared Django media files across replicas) | High | false with max_instance_count > 1: each pod has its own ephemeral volume. Media files written by one pod are invisible to others. Users see 404 for recently uploaded files. All files on a pod are lost on restart. |
| nfs_mount_path | "/mnt/nfs" (must match MEDIA_ROOT in settings.py) | High | Mismatch with MEDIA_ROOT: Django writes media to ephemeral local storage instead of NFS. Files are lost on pod restart. If MEDIA_ROOT points to a non-existent path, every file write raises FileNotFoundError. |
| startup_probe | { path = "/healthz", failure_threshold = 30, period_seconds = 10 } (300 s total tolerance) | Critical | failure_threshold too low with Django running migrations at startup: Kubernetes kills the pod before migrations finish. Restart loop prevents the service from ever becoming healthy. Increase failure_threshold to 40–60 for large migration sets. |
| liveness_probe | { path = "/healthz", period_seconds = 30, failure_threshold = 3 }; must be fast and non-blocking | High | Health endpoint that makes a database call: if the DB is slow, the liveness probe times out and Kubernetes restarts all healthy pods simultaneously. Use an endpoint that returns 200 OK in < 1 s without DB access. |
| enable_iap | false for public; true for internal-only Django deployments | High | true without iap_oauth_client_id and iap_oauth_client_secret: the validation guard blocks the plan. Providing credentials but omitting entries in iap_authorized_users/iap_authorized_groups: all requests return HTTP 403, even yours. |
| enable_pod_disruption_budget | false (default; safe at replica count 1) | High | true with max_instance_count = 1 and pdb_min_available = "1": GKE node drains are permanently blocked. Node upgrades stall. The GKE Autopilot maintenance window cannot complete. Only enable when min_instance_count ≥ 2. |
| pdb_min_available | "1" (default); keep below min_instance_count | High | pdb_min_available equal to max_instance_count: zero pods can be evicted during voluntary disruptions. Node upgrades, cluster maintenance, and rolling deployments all stall indefinitely. |
| secret_environment_variables | Use for DJANGO_SUPERUSER_PASSWORD and all API credentials (SECRET_KEY is already platform-managed; see Platform-Managed Behaviours) | High | Credentials in plain environment_variables: visible in the GCP Console, in Cloud Logging if Django prints env vars (e.g. manage.py diffsettings), and in Terraform state in plaintext. |
| enable_redis | false (default); set true when using Redis-backed sessions or Celery | Medium | Left false when Django is configured to use Redis for sessions or caching: redis.exceptions.ConnectionError on every cache/session access. Users cannot log in; cached views raise uncaught exceptions. |
| binauthz_evaluation_mode | "ALWAYS_ALLOW" until CI pipeline attests images; then "REQUIRE_ATTESTATION" | Critical | "REQUIRE_ATTESTATION" without a functioning Cloud Build attestation step: no new image can be deployed to GKE, and rollbacks also fail. The only recovery is reverting to "ALWAYS_ALLOW". |
| enable_vpc_sc | false until VPC-SC perimeter exists; then vpc_sc_dry_run = true first | Critical | enable_vpc_sc = true with vpc_sc_dry_run = false on first enable: if the Django GKE SA, Cloud Build SA, or your admin IP is missing from the access level, Cloud SQL, Secret Manager, and Artifact Registry access all fail simultaneously, causing a complete outage. |
| enable_backup_import | false; set back to false immediately after a successful restore | High | Leaving true after a successful import: the restore job re-runs on every tofu apply, overwriting live Django data (including new user registrations and content) with the stale backup. |
| rotation_propagation_delay_sec | 120–180 for production Django (Gunicorn worker pool needs time to reconnect) | High | Default 90 s too short for pools with long-lived DB connections: the old password is revoked before all Gunicorn workers reconnect. Workers throw authentication failed until they are recycled. Increase to at least 120 for production. |
| enable_audit_logging | false for dev; true for regulated production environments | Low | false in production: Secret Manager reads, DB password accesses, and KMS key usage are not logged. SOC 2 and HIPAA audits may flag the absence. Enabling increases Cloud Logging costs but is strongly recommended for regulated workloads. |
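
Two of the pitfalls above, bare-integer memory quotas and an oversized max_instance_count, are easy to catch with a small pre-flight check before applying. The sketch below is illustrative only: the function names are hypothetical, the quantity pattern is a simplified subset of Kubernetes quantity syntax (it ignores exponent forms), and the reserved parameter for connections held back for admin or migration use is an assumption not taken from the module.

```python
import re

# Simplified Kubernetes quantity pattern: digits followed by a binary or decimal suffix.
_SUFFIXED = re.compile(r"^\d+(\.\d+)?(Ki|Mi|Gi|Ti|Pi|Ei|k|M|G|T|P|E)$")


def has_unit_suffix(quantity):
    """Return True if a memory quantity carries a unit suffix such as Gi or Mi."""
    return bool(_SUFFIXED.match(quantity))


def max_pods_for_connections(max_connections, connections_per_pod, reserved=0):
    """Upper bound for max_instance_count given the Cloud SQL connection limit."""
    if connections_per_pod <= 0:
        raise ValueError("connections_per_pod must be positive")
    # Integer division: a pod that cannot get its full share does not count.
    return max(0, (max_connections - reserved) // connections_per_pod)
```

For example, `has_unit_suffix("4")` is false and flags the quota value that Kubernetes would read as 4 bytes, while `max_pods_for_connections(100, 10, reserved=10)` suggests keeping max_instance_count at or below 9.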

Deployment Prerequisites & Validation

After deploying Django GKE, confirm the deployment is healthy:

# Confirm the Django pod is running and ready
kubectl get pods -n NAMESPACE -l app=django -o wide

# Confirm the Cloud SQL instance is running
gcloud sql instances describe INSTANCE_NAME \
--project=PROJECT_ID \
--format="table(name,state,databaseVersion)"

# Confirm the GCS media bucket was created
gcloud storage buckets list \
--project=PROJECT_ID \
--filter="name:django-media"

# Confirm DB and SECRET_KEY environment variables are injected
kubectl exec -n NAMESPACE POD_NAME -- env | grep -E "^(DB_|SECRET_KEY)"

# Confirm the database and user exist
gcloud sql databases list --instance=INSTANCE_NAME --project=PROJECT_ID
gcloud sql users list --instance=INSTANCE_NAME --project=PROJECT_ID

# View Django application logs
kubectl logs -n NAMESPACE -l app=django --since=5m