AnythingLLM on Google Kubernetes Engine
This document provides a comprehensive reference for the modules/AnythingLLM_GKE Terraform module. It covers architecture, IAM, configuration variables, AnythingLLM-specific behaviours, and operational patterns for deploying AnythingLLM on GKE Autopilot.
1. Module Overview
AnythingLLM is a private AI workspace and Retrieval-Augmented Generation (RAG) platform. Teams can chat with documents, connect to an LLM provider (OpenAI, Anthropic, Ollama, and others), and build AI-powered knowledge assistants. Documents and workspace data stay in infrastructure you control; data is sent to a third party only when a hosted LLM or embedding provider is configured.
AnythingLLM GKE is a wrapper module built on top of App GKE. It uses App GKE for all GCP and Kubernetes infrastructure provisioning and injects AnythingLLM-specific application configuration, secrets, and storage configuration via AnythingLLM Common.
Key Capabilities:
- Compute: GKE Autopilot, 2 vCPU / 4 Gi by default. Supports Deployment (stateless) and StatefulSet (persistent PVC) workload types. min_instance_count = 1 recommended to keep AnythingLLM warm.
- Data Persistence: Cloud SQL PostgreSQL 15 (required by AnythingLLM's Prisma ORM). GCS document storage bucket auto-provisioned by AnythingLLM Common. StatefulSet PVCs for persistent vector store data. Optional NFS for shared document access.
- Security: Four application-level secrets (JWT_SECRET, AUTH_TOKEN, SIG_KEY, and SIG_SALT) are auto-generated by AnythingLLM Common, stored in Secret Manager, and accessed via Workload Identity. Inherits Cloud Armor, IAP, Binary Authorization, and VPC Service Controls from App GKE.
- Caching: Redis is disabled by default (enable_redis = false). Enable for session or cache workloads if required.
- CI/CD: Cloud Build custom image pipeline by default; Cloud Deploy progressive delivery optional.
- Reliability: Health probes target /api/ping with a 60-second initial delay and extended startup window (30 periods × 10 seconds) for AI model loading.
- Scaling: Horizontal Pod Autoscaler (HPA) with min_instance_count and max_instance_count. Optional Vertical Pod Autoscaling (VPA).
Project & Application Identity
| Variable | Group | Type | Default | Description |
|---|---|---|---|---|
project_id | 1 | string | — | GCP project ID. Required. |
tenant_deployment_id | 2 | string | 'demo' | Short suffix appended to all resource names. |
support_users | 2 | list(string) | [] | Email recipients for monitoring alerts. |
resource_labels | 2 | map(string) | {} | Labels applied to all provisioned resources. |
application_name | 3 | string | 'anythingllm' | Base resource name. Do not change after initial deployment. |
application_display_name | 3 | string | 'AnythingLLM' | Human-readable name shown in dashboards. |
application_description | 3 | string | 'AnythingLLM Private AI Workspace on GKE' | Service description. |
application_version | 3 | string | 'latest' | Container image version tag. |
Wrapper architecture: AnythingLLM GKE calls AnythingLLM Common to build an application_config object containing AnythingLLM-specific environment variables, probe configuration, and the db-init job definition. AnythingLLM Common generates and stores JWT_SECRET, AUTH_TOKEN, SIG_KEY, and SIG_SALT in Secret Manager and returns their IDs via module.anythingllm_app.secret_ids. The GOOGLE_CLOUD_STORAGE_BUCKET_NAME environment variable is set automatically from module.anythingllm_app.storage_buckets[0].name.
PostgreSQL note: AnythingLLM uses Prisma ORM and requires PostgreSQL. database_type = "POSTGRES_15" is the default. Do not set this to a MySQL or SQL Server variant.
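
A minimal invocation might look like the following sketch. The module label anythingllm, the relative source path, and the project ID are placeholders assumed for illustration; the remaining values repeat the documented defaults.

```hcl
# Hypothetical root-module call. The label "anythingllm" and the source
# path are assumptions; adjust them to where the module lives in your repo.
module "anythingllm" {
  source = "./modules/AnythingLLM_GKE"

  project_id           = "my-gcp-project" # placeholder; documented as required
  region               = "us-central1"
  tenant_deployment_id = "demo"
  application_name     = "anythingllm" # do not change after initial deployment
  database_type        = "POSTGRES_15" # AnythingLLM requires PostgreSQL
}
```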
2. IAM & Access Control
AnythingLLM GKE delegates all IAM provisioning to App GKE. Workload Identity binds the Kubernetes Service Account to a GCP Service Account, granting the pods access to Secret Manager, Cloud SQL, GCS, and Artifact Registry.
Application secrets: AnythingLLM Common auto-generates four secrets on first apply:
- JWT_SECRET: signs AnythingLLM authentication tokens.
- AUTH_TOKEN: optional API bearer token for programmatic access.
- SIG_KEY: HMAC signing key for request signatures (32+ characters).
- SIG_SALT: salt used alongside SIG_KEY for HMAC signatures (32+ characters).
These secrets are stored in Secret Manager and mounted into pods as Kubernetes Secret volumes or environment variables via Workload Identity. Plaintext is never written to Terraform state.
Database initialisation identity: The db-init Kubernetes Job runs under the pod's Workload Identity SA. It connects to Cloud SQL PostgreSQL via the Auth Proxy sidecar, using DB_HOST, DB_USER, and ROOT_PASSWORD (from Secret Manager).
3. Core Service Configuration
A. Compute (GKE Autopilot)
AnythingLLM's AI workloads require more resources than typical web applications. GKE Autopilot automatically provisions node capacity to meet pod resource requests.
min_instance_count = 1 is recommended to keep AnythingLLM warm and avoid cold starts for AI document chat and embedding operations.
StatefulSet for persistent vector store: Setting stateful_pvc_enabled = true automatically selects StatefulSet workload type. Each pod receives its own PVC mounted at /app/server/storage (the AnythingLLM storage directory) by default.
| Variable | Group | Default | Description |
|---|---|---|---|
deploy_application | 4 | true | Set false for infrastructure-only deployment (SQL, storage, secrets). |
container_image_source | 4 | 'custom' | 'custom' builds via Cloud Build. 'prebuilt' deploys an existing image URI. |
container_image | 4 | "" | Override image URI. Leave empty for Cloud Build to manage. |
container_build_config | 4 | { enabled=true } | Cloud Build configuration object. |
enable_image_mirroring | 4 | true | Mirrors the container image into Artifact Registry. |
container_resources | 4 | { cpu_limit="2000m", memory_limit="4Gi" } | CPU/Memory limits and requests. AI workloads require at least 2 vCPU and 4 Gi. |
min_instance_count | 4 | 1 | Minimum pod replicas. Set to 1 to keep AnythingLLM warm. |
max_instance_count | 4 | 1 | Maximum pod replicas. |
container_port | 4 | 3001 | AnythingLLM's native HTTP port. |
container_protocol | 4 | 'http1' | 'http1' or 'h2c'. |
enable_vertical_pod_autoscaling | 4 | false | Enables VPA to automatically adjust CPU/memory requests. |
timeout_seconds | 4 | 300 | Maximum load balancer backend response timeout. Increase for long document processing. |
enable_cloudsql_volume | 4 | true | Injects Cloud SQL Auth Proxy sidecar for database connectivity. |
cloudsql_volume_mount_path | 4 | '/cloudsql' | Container path for the Auth Proxy Unix socket. |
service_annotations | 4 | {} | Annotations applied to the Kubernetes Service resource. |
service_labels | 4 | {} | Labels applied to the Kubernetes Service resource. |
Differences from App GKE defaults:
| Variable | App GKE | AnythingLLM GKE | Reason |
|---|---|---|---|
container_port | 8080 | 3001 | AnythingLLM's native port. |
container_resources.cpu_limit | '1000m' | '2000m' | AI workloads require more CPU. |
container_resources.memory_limit | '512Mi' | '4Gi' | LLM context, document vectors, and Prisma ORM require more RAM. |
min_instance_count | 0 | 1 | Keep warm for AI operations. |
health probe path | '/healthz' | '/api/ping' | AnythingLLM's health endpoint. |
stateful_pvc_mount_path | (app-defined) | '/app/server/storage' | AnythingLLM's storage directory for documents and vectors. |
stateful_fs_group | varies | 1000 | Matches the AnythingLLM container user GID. |
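
As an illustration, a deployment that ingests large documents might raise the resource limits and the backend timeout in a tfvars file. The numbers below are assumptions rather than recommendations from the module, and the sketch assumes cpu_limit and memory_limit are the only attributes container_resources needs.

```hcl
# terraform.tfvars fragment (illustrative values)
container_resources = {
  cpu_limit    = "4000m" # documented default is "2000m"
  memory_limit = "8Gi"   # documented default is "4Gi"
}

min_instance_count = 1   # keep one pod warm
max_instance_count = 1
timeout_seconds    = 600 # documented default is 300
```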
B. Kubernetes Workload Configuration
| Variable | Group | Default | Description |
|---|---|---|---|
workload_type | 6 | null | 'Deployment' (stateless) or 'StatefulSet' (persistent PVC). Auto-selects StatefulSet when stateful_pvc_enabled = true. |
service_type | 6 | 'LoadBalancer' | Kubernetes Service type: 'ClusterIP', 'LoadBalancer', or 'NodePort'. |
session_affinity | 6 | 'ClientIP' | Session affinity for the Kubernetes Service. |
namespace_name | 6 | "" | Kubernetes namespace. Auto-generated if empty. |
gke_cluster_name | 6 | "" | Target GKE cluster name. Auto-discovered if empty. |
termination_grace_period_seconds | 6 | 60 | Seconds Kubernetes waits after SIGTERM before force-terminating. |
deployment_timeout | 6 | 1800 | Seconds Terraform waits for the Deployment rollout to complete. |
enable_network_segmentation | 6 | false | Creates Kubernetes NetworkPolicy resources to restrict pod-to-pod traffic. |
enable_multi_cluster_service | 6 | false | Creates a ServiceExport for Multi-Cluster Services (MCS). |
configure_service_mesh | 6 | false | Enables Istio service mesh injection for the application namespace. |
network_tags | 19 | ['nfsserver'] | Network tags applied to GKE nodes for VPC firewall rules. |
C. StatefulSet Configuration
| Variable | Group | Default | Description |
|---|---|---|---|
stateful_pvc_enabled | 7 | null | Enables PVC templates in the StatefulSet. Setting true auto-selects StatefulSet. Recommended for persistent AnythingLLM vector store data. |
stateful_pvc_size | 7 | '20Gi' | Storage size for each per-pod PVC. Recommended minimum 20 Gi for vector store data. |
stateful_pvc_mount_path | 7 | '/app/server/storage' | Mount path for the per-pod PVC. Set to AnythingLLM's storage directory. |
stateful_pvc_storage_class | 7 | 'standard-rwo' | Kubernetes StorageClass for StatefulSet PVCs. |
stateful_headless_service | 7 | null | Creates a headless Kubernetes Service for stable pod DNS. |
stateful_pod_management_policy | 7 | null | Pod creation order: 'OrderedReady' or 'Parallel'. |
stateful_update_strategy | 7 | null | Update strategy: 'RollingUpdate' or 'OnDelete'. |
stateful_fs_group | 7 | 1000 | Pod-level fsGroup GID. Set to 1000 to match the AnythingLLM container user. |
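
The StatefulSet variables above can be combined as in this sketch, which enables a per-pod PVC and otherwise restates the documented defaults. Only stateful_pvc_enabled changes from its default here.

```hcl
# terraform.tfvars fragment: persist documents and vectors on a per-pod PVC
stateful_pvc_enabled       = true # auto-selects the StatefulSet workload type
stateful_pvc_size          = "20Gi"
stateful_pvc_mount_path    = "/app/server/storage"
stateful_pvc_storage_class = "standard-rwo"
stateful_fs_group          = 1000 # matches the AnythingLLM container user GID
```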
D. Database (Cloud SQL — PostgreSQL 15)
AnythingLLM uses Prisma ORM and requires PostgreSQL. The entrypoint script constructs the DATABASE_URL Prisma connection string from the platform-injected DB_* variables at container start time.
| Variable | Group | Default | Description |
|---|---|---|---|
database_type | 16 | 'POSTGRES_15' | Cloud SQL engine. AnythingLLM requires PostgreSQL. |
application_database_name | 16 | 'anythingllmdb' | PostgreSQL database name. Do not change after initial deployment. |
application_database_user | 16 | 'anythingllmuser' | Database user. Password auto-generated and stored in Secret Manager. |
database_password_length | 16 | 32 | Auto-generated password length. Range: 16–64. |
enable_postgres_extensions | 16 | false | Enables PostgreSQL extension installation. |
postgres_extensions | 16 | [] | List of PostgreSQL extensions to install (e.g., ['uuid-ossp', 'vector']). |
enable_auto_password_rotation | 16 | false | Automated zero-downtime password rotation via Kubernetes CronJob. |
rotation_propagation_delay_sec | 16 | 90 | Seconds to wait after rotation before restarting GKE pods. |
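
The extension and rotation variables could be set together as in the sketch below. The extension names are the ones given as an example in the table; whether they are available depends on the Cloud SQL instance, so treat this as an example rather than a tested configuration.

```hcl
# terraform.tfvars fragment
enable_postgres_extensions = true
postgres_extensions        = ["uuid-ossp", "vector"]

enable_auto_password_rotation  = true
rotation_propagation_delay_sec = 90 # documented default
database_password_length       = 32 # allowed range is 16 to 64
```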
E. Storage (NFS, GCS, PVCs)
NFS is disabled by default (enable_nfs = false). For multi-pod deployments sharing document storage, enable NFS. For single-pod persistent storage, use stateful_pvc_enabled = true.
GCS document bucket: AnythingLLM Common automatically provisions a dedicated anythingllm-docs GCS bucket. The GOOGLE_CLOUD_STORAGE_BUCKET_NAME environment variable is set automatically.
| Variable | Group | Default | Description |
|---|---|---|---|
create_cloud_storage | 14 | true | Set false to skip creation of the buckets listed in storage_buckets. The anythingllm-docs bucket is always provisioned. |
storage_buckets | 14 | [{ name_suffix = "data" }] | Additional GCS buckets to provision. |
gcs_volumes | 14 | [] | GCS buckets to mount via CSI GCS Fuse. |
manage_storage_kms_iam | 14 | false | Creates CMEK KMS keys and enables CMEK on storage buckets. |
enable_artifact_registry_cmek | 14 | false | Enables CMEK encryption on Artifact Registry container images. |
max_images_to_retain | 14 | 7 | Maximum container images to keep in Artifact Registry. |
delete_untagged_images | 14 | true | Automatically deletes untagged images. |
image_retention_days | 14 | 30 | Days after which images are eligible for deletion. |
enable_nfs | 13 | false | Provisions a Cloud Filestore NFS instance for shared file storage. |
nfs_mount_path | 13 | '/mnt/nfs' | Container path where the NFS volume is mounted. |
nfs_volume_name | 13 | 'nfs-data-volume' | Volume name for the NFS mount. |
nfs_instance_name | 13 | "" | Existing NFS GCE VM name. Leave empty to auto-discover. |
nfs_instance_base_name | 13 | 'app-nfs' | Base name for inline NFS GCE VM. |
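
For a multi-pod deployment that shares document storage, the NFS variables might be set as follows. The replica count is an illustrative assumption; the other values are the documented defaults.

```hcl
# terraform.tfvars fragment: shared storage across several pods
enable_nfs         = true
nfs_mount_path     = "/mnt/nfs"
nfs_volume_name    = "nfs-data-volume"
max_instance_count = 3 # illustrative; documented default is 1
```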
F. Networking
| Variable | Group | Default | Description |
|---|---|---|---|
enable_custom_domain | 19 | false | Provisions a Kubernetes Ingress for application_domains. |
application_domains | 19 | [] | Custom domain names for the Ingress. |
reserve_static_ip | 19 | true | Provisions a global static external IP. Recommended for production DNS. |
static_ip_name | 19 | "" | Name for the static IP. Auto-generated if empty. |
network_tags | 19 | ['nfsserver'] | Network tags for GKE nodes. |
G. Initialization & Bootstrap
A db-init Kubernetes Job is automatically provisioned by AnythingLLM Common when initialization_jobs is left as the default empty list ([]). It uses the postgres:15-alpine image and executes create-db-and-user.sh.
| Variable | Group | Default | Description |
|---|---|---|---|
initialization_jobs | 11 | [] | Kubernetes Jobs to run before the application starts. Leave empty for AnythingLLM Common to supply the default db-init job. Each entry must have at least one of command, args, or script_path. |
cron_jobs | 11 | [] | Scheduled Kubernetes CronJobs. Each entry: name, schedule, image, command, args, env_vars, secret_env_vars, cpu_limit, memory_limit, restart_policy, concurrency_policy, suspend, mount_nfs, mount_gcs_volumes, script_path. |
additional_services | 11 | [] | Sidecar or helper GKE services deployed alongside AnythingLLM. |
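
A cron_jobs entry could be written as below. The attribute names are those listed in the table, but their types (for example, command and args as lists of strings) and which attributes may be omitted are assumptions; check the module's variable definition before relying on this shape.

```hcl
# terraform.tfvars fragment (sketch; attribute types are assumed)
cron_jobs = [
  {
    name     = "weekly-placeholder" # hypothetical job name
    schedule = "0 4 * * 0"          # 04:00 every Sunday
    image    = "postgres:15-alpine" # same image the default db-init job uses
    command  = ["sh", "-c"]
    args     = ["echo 'replace with a real maintenance task'"]

    restart_policy     = "OnFailure"
    concurrency_policy = "Forbid"
    suspend            = false
    mount_nfs          = false
  }
]
```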
4. Advanced Security
A. Cloud Armor WAF
When enable_cloud_armor = true, a Cloud Armor security policy is attached to the GKE Ingress backend.
| Variable | Group | Default | Description |
|---|---|---|---|
enable_cloud_armor | 21 | false | Attaches a Cloud Armor security policy to the GKE Ingress. |
admin_ip_ranges | 21 | [] | Admin CIDR ranges for privileged access. |
cloud_armor_policy_name | 21 | 'default-waf-policy' | Cloud Armor security policy name. |
enable_cdn | 21 | false | Enables Cloud CDN on the GKE Ingress backend. |
B. Identity-Aware Proxy (IAP)
When enable_iap = true, IAP is configured on the GKE Ingress backend. Google identity authentication is required before requests reach AnythingLLM.
| Variable | Group | Default | Description |
|---|---|---|---|
enable_iap | 20 | false | Enables IAP on the GKE Ingress. |
iap_authorized_users | 20 | [] | Users/service accounts granted IAP access. |
iap_authorized_groups | 20 | [] | Google Groups granted IAP access. |
iap_oauth_client_id | 20 | "" | OAuth 2.0 Client ID for IAP. Sensitive. |
iap_oauth_client_secret | 20 | "" | OAuth 2.0 Client Secret for IAP. Sensitive. |
iap_support_email | 20 | "" | Support email shown on the OAuth consent screen. |
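
An IAP configuration might look like this sketch. The user: and group: member prefixes are an assumption about the expected format, and the addresses are placeholders.

```hcl
# terraform.tfvars fragment
enable_iap            = true
iap_authorized_users  = ["user:alice@example.com"]    # member prefix is an assumption
iap_authorized_groups = ["group:ai-team@example.com"] # likewise assumed
iap_support_email     = "support@example.com"

# iap_oauth_client_id and iap_oauth_client_secret are sensitive. Terraform reads
# TF_VAR_iap_oauth_client_id and TF_VAR_iap_oauth_client_secret from the
# environment, which keeps them out of the tfvars file.
```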
C. Binary Authorization
| Variable | Group | Default | Description |
|---|---|---|---|
enable_binary_authorization | 12 | false | Enforces image attestation on the GKE cluster. |
D. VPC Service Controls
| Variable | Group | Default | Description |
|---|---|---|---|
enable_vpc_sc | 22 | false | Registers module API calls within the project's VPC-SC perimeter. |
vpc_cidr_ranges | 22 | [] | VPC subnet CIDR ranges for VPC-SC network access level. |
vpc_sc_dry_run | 22 | true | Logs VPC-SC violations without blocking. |
organization_id | 22 | "" | GCP Organization ID for VPC-SC. |
enable_audit_logging | 22 | false | Enables detailed Cloud Audit Logs. |
5. Reliability Policies
A. Pod Disruption Budget
| Variable | Group | Default | Description |
|---|---|---|---|
enable_pod_disruption_budget | 9 | false | Creates a Kubernetes PodDisruptionBudget. |
pdb_min_available | 9 | '1' | Minimum pods available during voluntary disruptions (integer or percentage). |
B. Topology Spread
| Variable | Group | Default | Description |
|---|---|---|---|
enable_topology_spread | 9 | false | Adds TopologySpreadConstraints to distribute pods across GKE zones. |
topology_spread_strict | 9 | false | Uses DoNotSchedule when topology spread cannot be satisfied. |
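
With a single replica, pdb_min_available = '1' would block every voluntary eviction, so the disruption budget and topology spread settings are usually paired with at least two replicas. A sketch with illustrative replica counts:

```hcl
# terraform.tfvars fragment
min_instance_count = 2 # illustrative; documented default is 1
max_instance_count = 4 # illustrative; documented default is 1

enable_pod_disruption_budget = true
pdb_min_available            = "1" # integer or percentage, passed as a string

enable_topology_spread = true
topology_spread_strict = false # true uses DoNotSchedule when spread cannot be satisfied
```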
C. Resource Quota
| Variable | Group | Default | Description |
|---|---|---|---|
enable_resource_quota | 8 | false | Creates a Kubernetes ResourceQuota in the namespace. |
quota_cpu_requests | 8 | "" | Total CPU requests allowed in the namespace. |
quota_cpu_limits | 8 | "" | Total CPU limits allowed in the namespace. |
quota_memory_requests | 8 | "" | Total memory requests. Must use binary suffix (e.g., '8Gi'). |
quota_memory_limits | 8 | "" | Total memory limits. Must use binary suffix (e.g., '16Gi'). |
Warning: quota_memory_requests and quota_memory_limits must include a binary unit suffix (e.g., '8Gi', '4096Mi'). Bare integers are treated as bytes by Kubernetes and will block all pod scheduling.
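
A quota block that follows the suffix rule might look like this. The sizes are illustrative assumptions and would need to cover every pod in the namespace, including sidecars.

```hcl
# terraform.tfvars fragment
enable_resource_quota = true
quota_cpu_requests    = "4"
quota_cpu_limits      = "8"
quota_memory_requests = "8Gi" # not "8": a bare integer is read as bytes
quota_memory_limits   = "16Gi"
```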
D. Health Probes & Uptime Monitoring
AnythingLLM requires a longer startup window for AI model loading and database migration.
| Variable | Group | Default | Description |
|---|---|---|---|
startup_probe_config | 10 | { path="/api/ping", initial_delay_seconds=60, failure_threshold=30, ... } | Startup probe configuration. |
health_check_config | 10 | { path="/api/ping", initial_delay_seconds=30, failure_threshold=3, ... } | Liveness probe configuration. |
uptime_check_config | 10 | { enabled=true, path="/" } | Cloud Monitoring uptime check. |
alert_policies | 10 | [] | Cloud Monitoring metric alert policies. |
startup_probe | 10 | { path="/api/ping", initial_delay_seconds=60, failure_threshold=30, ... } | Startup probe passed to AnythingLLM Common. |
liveness_probe | 10 | { path="/api/ping", initial_delay_seconds=30, failure_threshold=3, ... } | Liveness probe passed to AnythingLLM Common. |
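
With the documented defaults, the startup budget works out to roughly 60 s + 30 × 10 s = 360 s before Kubernetes restarts a container that has not answered on /api/ping. The sketch below restates those values; period_seconds is an assumed attribute name for the 10-second period (the defaults above elide the remaining attributes), so confirm it against the variable definition.

```hcl
# terraform.tfvars fragment (sketch)
startup_probe_config = {
  path                  = "/api/ping"
  initial_delay_seconds = 60
  period_seconds        = 10 # assumed attribute name
  failure_threshold     = 30 # 30 failures x 10 s = 300 s after the initial delay
}
```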
6. Integrations
A. LLM Provider Configuration
Use environment_variables for non-sensitive provider configuration and secret_environment_variables for API keys:

    environment_variables = {
      LLM_PROVIDER     = "openai"
      EMBEDDING_ENGINE = "native"
      VECTOR_DB        = "lancedb"
    }

    # Values are Secret Manager references, not the API keys themselves.
    secret_environment_variables = {
      OPENAI_API_KEY    = "anythingllm-openai-key"
      ANTHROPIC_API_KEY = "anythingllm-anthropic-key"
    }
B. Redis Cache
| Variable | Group | Default | Description |
|---|---|---|---|
enable_redis | 15 | false | Enables Redis. Not required for AnythingLLM core functionality. |
redis_host | 15 | null | Redis hostname or IP. Required when enable_redis = true. |
redis_port | 15 | '6379' | Redis TCP port (string). |
redis_auth | 15 | "" | Redis AUTH password. Sensitive. |
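
Enabling Redis could look like the following sketch. The host address is a placeholder assumption.

```hcl
# terraform.tfvars fragment
enable_redis = true
redis_host   = "10.0.0.5" # placeholder address; required once Redis is enabled
redis_port   = "6379"     # a string, not a number

# redis_auth is sensitive; it can be supplied as TF_VAR_redis_auth instead.
```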
C. Backup & Import
| Variable | Group | Default | Description |
|---|---|---|---|
backup_schedule | 17 | '0 2 * * *' | Backup cron schedule (UTC). |
backup_retention_days | 17 | 7 | Days to retain backup files in GCS. |
enable_backup_import | 17 | false | Triggers a one-time database restore on apply. |
backup_source | 17 | 'gcs' | 'gcs' or 'gdrive'. |
backup_file | 17 | 'backup.sql' | Backup filename to import. |
backup_format | 17 | 'sql' | Backup format: sql, tar, gz, tgz, tar.gz, zip, auto. |
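
A restore from a GCS backup might be requested as in this sketch, which uses the documented default file name and format. Where the file must sit within the bucket is not stated here, so that detail is left out.

```hcl
# terraform.tfvars fragment
backup_schedule       = "0 2 * * *" # daily at 02:00 UTC
backup_retention_days = 7

enable_backup_import = true # triggers a one-time restore on apply
backup_source        = "gcs"
backup_file          = "backup.sql"
backup_format        = "sql"
```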
D. Custom SQL Scripts
| Variable | Group | Default | Description |
|---|---|---|---|
enable_custom_sql_scripts | 18 | false | Runs custom SQL scripts from a GCS bucket against the database. |
custom_sql_scripts_bucket | 18 | "" | GCS bucket containing SQL scripts. |
custom_sql_scripts_path | 18 | "" | Path prefix within the bucket. |
custom_sql_scripts_use_root | 18 | false | Run scripts as the root DB user. |
E. CI/CD
| Variable | Group | Default | Description |
|---|---|---|---|
enable_cicd_trigger | 12 | false | Enables a Cloud Build GitHub trigger. |
github_repository_url | 12 | "" | Full HTTPS URL of the GitHub repository. |
github_token | 12 | "" | GitHub PAT. Sensitive. |
github_app_installation_id | 12 | "" | GitHub App installation ID. |
cicd_trigger_config | 12 | { branch_pattern = "^main$" } | Advanced Cloud Build trigger configuration. |
enable_cloud_deploy | 12 | false | Provisions a Cloud Deploy pipeline. |
cloud_deploy_stages | 12 | [dev, staging, prod(approval)] | Cloud Deploy promotion stages. |
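
A Cloud Build trigger on the main branch could be configured roughly as follows. The repository URL is a placeholder, and branch_pattern is the only cicd_trigger_config attribute shown because it is the only one documented above.

```hcl
# terraform.tfvars fragment
enable_cicd_trigger   = true
github_repository_url = "https://github.com/example-org/anythingllm-deploy" # placeholder

cicd_trigger_config = {
  branch_pattern = "^main$"
}

# github_token is sensitive; it can be supplied as TF_VAR_github_token.
```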
7. Platform-Managed Behaviours
| Behaviour | Implementation | Detail |
|---|---|---|
| PostgreSQL 15 required | database_type = "POSTGRES_15" fixed by AnythingLLM Common | AnythingLLM's Prisma ORM requires PostgreSQL. |
| Prisma DATABASE_URL | Constructed by anythingllm-entrypoint.sh at container start | The entrypoint script builds the PostgreSQL connection string from DB_* vars. |
| Application secrets auto-generated | JWT_SECRET, AUTH_TOKEN, SIG_KEY, SIG_SALT provisioned by AnythingLLM Common | Secret IDs forwarded to App GKE via module_secret_env_vars. |
| GCS document bucket | anythingllm-docs bucket provisioned by AnythingLLM Common | GOOGLE_CLOUD_STORAGE_BUCKET_NAME env var set automatically. |
| Fixed environment variables | SERVER_PORT=3001, STORAGE_DIR=/app/server/storage, UID=1000, GID=1000 | Set by AnythingLLM Common. Do not override. |
| StatefulSet PVC mount path | stateful_pvc_mount_path = '/app/server/storage' default | Points to AnythingLLM's document and vector storage directory. |
| fsGroup = 1000 | stateful_fs_group = 1000 default | Kubernetes chowns the PVC mount to match the AnythingLLM container user. |
| Unix socket by default | enable_cloudsql_volume = true default | Auth Proxy sidecar provides Cloud SQL connectivity. |
| Image mirroring enabled | enable_image_mirroring = true default | Mirrors image into Artifact Registry. |
| Default db-init job | Supplied by AnythingLLM Common when initialization_jobs = [] | PostgreSQL database and user are created automatically. |
| kubernetes_ready gating | kubernetes_ready output gates all Kubernetes resources | On a first apply with an inline cluster, Kubernetes resources are skipped until the cluster endpoint is available. Re-run apply to complete deployment. |
8. Variable Reference
All user-configurable variables exposed by AnythingLLM GKE, sorted by UI group then order.
| Variable | Group | Default | Description |
|---|---|---|---|
module_description | 0 | (AnythingLLM platform text) | Platform metadata: module description. |
module_documentation | 0 | (docs URL) | Platform metadata: documentation URL. |
module_dependency | 0 | ['Services GCP'] | Platform metadata: required modules. |
module_services | 0 | (GCP service list) | Platform metadata: GCP services consumed. |
credit_cost | 0 | 150 | Platform metadata: deployment credit cost. |
require_credit_purchases | 0 | false | Platform metadata: enforces credit balance check. |
enable_purge | 0 | true | Permits full deletion of module resources on destroy. |
public_access | 0 | false | Platform catalogue visibility. |
deployment_id | 0 | "" | Deployment ID suffix. Auto-generated if empty. |
resource_creator_identity | 0 | (platform SA) | Service account used by Terraform to manage resources. |
impersonation_service_account | 0 | "" | Service account to impersonate for GCP API calls. |
project_id | 1 | — | GCP project ID. Required. |
region | 1 | 'us-central1' | GCP region for all resources. |
tenant_deployment_id | 2 | 'demo' | Short suffix appended to all resource names. |
support_users | 2 | [] | Email addresses for monitoring alerts. |
resource_labels | 2 | {} | Labels applied to all provisioned resources. |
application_name | 3 | 'anythingllm' | Base resource name. Do not change after initial deployment. |
application_display_name | 3 | 'AnythingLLM' | Human-readable name. |
application_description | 3 | 'AnythingLLM Private AI Workspace on GKE' | Service description. |
application_version | 3 | 'latest' | Container image version tag. |
deploy_application | 4 | true | Set false for infrastructure-only deployment. |
container_image_source | 4 | 'custom' | 'custom' (Cloud Build) or 'prebuilt' (existing image). |
container_image | 4 | "" | Container image URI. |
container_build_config | 4 | { enabled=true } | Cloud Build configuration object. |
enable_image_mirroring | 4 | true | Mirrors the container image into Artifact Registry. |
container_resources | 4 | { cpu_limit="2000m", memory_limit="4Gi" } | CPU/Memory limits and optional requests. |
min_instance_count | 4 | 1 | Minimum pod replicas. |
max_instance_count | 4 | 1 | Maximum pod replicas. |
container_port | 4 | 3001 | AnythingLLM's native port. |
container_protocol | 4 | 'http1' | 'http1' or 'h2c'. |
enable_vertical_pod_autoscaling | 4 | false | Enables VPA. |
timeout_seconds | 4 | 300 | Load balancer backend response timeout. |
enable_cloudsql_volume | 4 | true | Injects Cloud SQL Auth Proxy sidecar. |
cloudsql_volume_mount_path | 4 | '/cloudsql' | Path for the Auth Proxy socket. |
service_annotations | 4 | {} | Kubernetes Service annotations. |
service_labels | 4 | {} | Kubernetes Service labels. |
environment_variables | 5 | {} | Plain-text env vars. |
secret_environment_variables | 5 | {} | Secret Manager references. |
secret_rotation_period | 5 | '2592000s' | Secret rotation notification frequency. |
secret_propagation_delay | 5 | 30 | Seconds to wait after secret creation. |
gke_cluster_name | 6 | "" | Target GKE cluster name. Auto-discovered if empty. |
workload_type | 6 | null | 'Deployment' or 'StatefulSet'. |
service_type | 6 | 'LoadBalancer' | Kubernetes Service type. |
session_affinity | 6 | 'ClientIP' | Session affinity mode. |
namespace_name | 6 | "" | Kubernetes namespace. Auto-generated if empty. |
termination_grace_period_seconds | 6 | 60 | Seconds to wait after SIGTERM. |
deployment_timeout | 6 | 1800 | Seconds Terraform waits for rollout. |
enable_network_segmentation | 6 | false | Creates Kubernetes NetworkPolicy resources. |
enable_multi_cluster_service | 6 | false | Creates a ServiceExport for MCS. |
configure_service_mesh | 6 | false | Enables Istio injection. |
stateful_pvc_enabled | 7 | null | Enables PVC templates. Auto-selects StatefulSet. |
stateful_pvc_size | 7 | '20Gi' | Storage per pod. |
stateful_pvc_mount_path | 7 | '/app/server/storage' | PVC mount path (AnythingLLM storage dir). |
stateful_pvc_storage_class | 7 | 'standard-rwo' | Kubernetes StorageClass. |
stateful_headless_service | 7 | null | Creates a headless service for StatefulSet DNS. |
stateful_pod_management_policy | 7 | null | 'OrderedReady' or 'Parallel'. |
stateful_update_strategy | 7 | null | 'RollingUpdate' or 'OnDelete'. |
stateful_fs_group | 7 | 1000 | Pod fsGroup GID. Matches AnythingLLM container user. |
enable_resource_quota | 8 | false | Creates a Kubernetes ResourceQuota. |
quota_cpu_requests | 8 | "" | Total CPU requests in the namespace. |
quota_cpu_limits | 8 | "" | Total CPU limits in the namespace. |
quota_memory_requests | 8 | "" | Total memory requests. Must use binary suffix (e.g., '8Gi'). |
quota_memory_limits | 8 | "" | Total memory limits. Must use binary suffix. |
enable_pod_disruption_budget | 9 | false | Creates a Kubernetes PodDisruptionBudget. |
pdb_min_available | 9 | '1' | Minimum pods available during disruptions. |
enable_topology_spread | 9 | false | TopologySpreadConstraints for multi-zone distribution. |
topology_spread_strict | 9 | false | DoNotSchedule when spread cannot be satisfied. |
startup_probe_config | 10 | { path="/api/ping", initial_delay_seconds=60, ... } | Startup probe configuration. |
health_check_config | 10 | { path="/api/ping", initial_delay_seconds=30, ... } | Liveness probe configuration. |
uptime_check_config | 10 | { enabled=true, path="/" } | Cloud Monitoring uptime check. |
alert_policies | 10 | [] | Cloud Monitoring metric alert policies. |
startup_probe | 10 | { path="/api/ping", initial_delay_seconds=60, ... } | Startup probe passed to AnythingLLM Common. |
liveness_probe | 10 | { path="/api/ping", initial_delay_seconds=30, ... } | Liveness probe passed to AnythingLLM Common. |
initialization_jobs | 11 | [] | Kubernetes initialization Jobs. |
cron_jobs | 11 | [] | Kubernetes CronJobs. |
additional_services | 11 | [] | Additional GKE services alongside AnythingLLM. |
enable_cicd_trigger | 12 | false | Provisions a Cloud Build GitHub trigger. |
github_repository_url | 12 | "" | GitHub repository URL. |
github_token | 12 | "" | GitHub PAT. Sensitive. |
github_app_installation_id | 12 | "" | GitHub App installation ID. |
cicd_trigger_config | 12 | { branch_pattern = "^main$" } | Advanced trigger config. |
enable_cloud_deploy | 12 | false | Provisions a Cloud Deploy pipeline. |
cloud_deploy_stages | 12 | [dev, staging, prod(approval)] | Cloud Deploy stages. |
enable_binary_authorization | 12 | false | Enforces image attestation. |
enable_nfs | 13 | false | Provisions NFS shared storage. |
nfs_mount_path | 13 | '/mnt/nfs' | NFS container mount path. |
nfs_volume_name | 13 | 'nfs-data-volume' | NFS volume name. |
nfs_instance_name | 13 | "" | Existing NFS GCE VM name. |
nfs_instance_base_name | 13 | 'app-nfs' | Base name for inline NFS VM. |
create_cloud_storage | 14 | true | Set false to skip GCS bucket creation. |
storage_buckets | 14 | [{ name_suffix = "data" }] | Additional GCS buckets. |
gcs_volumes | 14 | [] | GCS Fuse volumes via CSI. |
manage_storage_kms_iam | 14 | false | CMEK for storage buckets. |
enable_artifact_registry_cmek | 14 | false | CMEK for Artifact Registry. |
max_images_to_retain | 14 | 7 | Maximum images in Artifact Registry. |
delete_untagged_images | 14 | true | Delete untagged images. |
image_retention_days | 14 | 30 | Image retention age in days. |
enable_redis | 15 | false | Enables Redis. |
redis_host | 15 | null | Redis hostname or IP. |
redis_port | 15 | '6379' | Redis TCP port. |
redis_auth | 15 | "" | Redis AUTH password. Sensitive. |
database_type | 16 | 'POSTGRES_15' | Cloud SQL engine. AnythingLLM requires PostgreSQL. |
application_database_name | 16 | 'anythingllmdb' | PostgreSQL database name. |
application_database_user | 16 | 'anythingllmuser' | Database user. |
database_password_length | 16 | 32 | Password length. Range: 16–64. |
enable_postgres_extensions | 16 | false | Enables PostgreSQL extension installation. |
postgres_extensions | 16 | [] | PostgreSQL extensions to install. |
enable_mysql_plugins | 16 | false | Not applicable — AnythingLLM uses PostgreSQL only. |
mysql_plugins | 16 | [] | Not applicable. |
enable_auto_password_rotation | 16 | false | Automated password rotation. |
rotation_propagation_delay_sec | 16 | 90 | Seconds to wait after rotation before restarting pods. |
backup_schedule | 17 | '0 2 * * *' | Backup cron schedule (UTC). |
backup_retention_days | 17 | 7 | Days to retain backups. |
enable_backup_import | 17 | false | Triggers a one-time restore on apply. |
backup_source | 17 | 'gcs' | 'gcs' or 'gdrive'. |
backup_file | 17 | 'backup.sql' | Backup filename. |
backup_format | 17 | 'sql' | Backup format. |
enable_custom_sql_scripts | 18 | false | Runs custom SQL scripts from GCS. |
custom_sql_scripts_bucket | 18 | "" | GCS bucket for SQL scripts. |
custom_sql_scripts_path | 18 | "" | Path prefix in the bucket. |
custom_sql_scripts_use_root | 18 | false | Run scripts as root DB user. |
enable_custom_domain | 19 | false | Provisions a Kubernetes Ingress. |
application_domains | 19 | [] | Custom domain names for the Ingress. |
reserve_static_ip | 19 | true | Provisions a global static IP. |
static_ip_name | 19 | "" | Static IP name. Auto-generated if empty. |
network_tags | 19 | ['nfsserver'] | GKE node network tags. |
enable_iap | 20 | false | Enables IAP on the Ingress. |
iap_authorized_users | 20 | [] | Users granted IAP access. |
iap_authorized_groups | 20 | [] | Google Groups granted IAP access. |
iap_oauth_client_id | 20 | "" | OAuth Client ID. Sensitive. |
iap_oauth_client_secret | 20 | "" | OAuth Client Secret. Sensitive. |
iap_support_email | 20 | "" | OAuth consent screen support email. |
enable_cloud_armor | 21 | false | Attaches Cloud Armor to the GKE Ingress. |
admin_ip_ranges | 21 | [] | Admin CIDR ranges. |
cloud_armor_policy_name | 21 | 'default-waf-policy' | Cloud Armor policy name. |
enable_cdn | 21 | false | Enables Cloud CDN on the Ingress backend. |
enable_vpc_sc | 22 | false | VPC Service Controls perimeter enforcement. |
vpc_cidr_ranges | 22 | [] | VPC CIDR ranges for VPC-SC. |
vpc_sc_dry_run | 22 | true | Dry-run mode for VPC-SC violations. |
organization_id | 22 | "" | GCP Organization ID for VPC-SC. |
enable_audit_logging | 22 | false | Enables Cloud Audit Logs. |
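The variables in groups 17 to 22 are typically set together when exposing AnythingLLM on a custom domain. The fragment below is a minimal sketch, not a complete configuration: the `source` path, the domain, the group address, and the CIDR range are placeholders, the expected member format for `iap_authorized_groups` is an assumption, and the module's other required inputs are omitted.

```hcl
module "anythingllm" {
  # Assumed relative path; adjust to where the module lives in your repository.
  source = "../../modules/AnythingLLM_GKE"

  # Other required inputs (project, region, and so on) are omitted from this sketch.

  # Group 17: backups (values shown are the documented defaults)
  backup_schedule       = "0 2 * * *" # cron, UTC
  backup_retention_days = 7

  # Group 19: custom domain and static IP
  enable_custom_domain = true
  application_domains  = ["llm.example.com"] # placeholder domain
  reserve_static_ip    = true

  # Group 20: IAP on the Ingress
  enable_iap            = true
  iap_authorized_groups = ["ai-team@example.com"] # placeholder; member format is an assumption
  iap_support_email     = "support@example.com"   # placeholder

  # Group 21: Cloud Armor
  enable_cloud_armor = true
  admin_ip_ranges    = ["203.0.113.0/24"] # documentation-only CIDR range

  # Group 22: keep VPC-SC in dry-run mode until violations have been reviewed
  enable_vpc_sc  = false
  vpc_sc_dry_run = true
}
```

The OAuth client ID and secret are left out deliberately; because both are marked sensitive, they are better supplied through a secure variable source than hard-coded in configuration.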
9. Outputs
| Output | Description |
|---|---|
service_name | Kubernetes Service name. |
service_url | External URL of the AnythingLLM service. |
namespace | Kubernetes namespace. |
service_cluster_ip | ClusterIP of the Kubernetes Service. |
service_external_ip | External LoadBalancer IP (if static IP is reserved). |
project_id | GCP project ID. |
deployment_id | Deployment ID suffix. |
database_instance_name | Cloud SQL PostgreSQL instance name. |
database_name | Application database name. |
database_user | Application database user. |
database_password_secret | Secret Manager secret name for the database password. |
database_host | Database host IP (sensitive). |
database_port | Database port. |
storage_buckets | Created GCS storage buckets. |
nfs_server_ip | NFS server internal IP (sensitive). |
nfs_mount_path | NFS mount path inside containers. |
container_image | Container image used for the deployment. |
container_registry | Artifact Registry repository name. |
deployment_summary | Summary of the deployment. |
initialization_jobs | Created initialization job names. |
cron_jobs | Created cron job names. |
statefulset_name | Name of the StatefulSet (if workload_type = 'StatefulSet'). |
cicd_enabled | Whether the CI/CD pipeline is enabled. |
kubernetes_ready | True when the GKE cluster endpoint is available and all Kubernetes workload resources have been deployed. |
artifact_registry_repository | Artifact Registry repository for container images. |
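As an illustration of how these outputs might be consumed, the sketch below re-exports a few of them from a root configuration. It assumes the module is instantiated as `module "anythingllm"` (a hypothetical name); the root output names are also illustrative.

```hcl
# Re-export the service URL so it appears in `terraform output`.
output "anythingllm_url" {
  description = "External URL of the AnythingLLM service."
  value       = module.anythingllm.service_url
}

# database_host is documented as sensitive, so the root output
# must also be marked sensitive or Terraform rejects the plan.
output "anythingllm_database_host" {
  description = "Database host IP."
  value       = module.anythingllm.database_host
  sensitive   = true
}

# Name of the Secret Manager secret holding the database password
# (the secret name, not the password itself).
output "anythingllm_database_password_secret" {
  description = "Secret Manager secret name for the database password."
  value       = module.anythingllm.database_password_secret
}
```

The same `sensitive = true` treatment applies to `nfs_server_ip` if it is re-exported.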
Configuration Pitfalls & Sensible Defaults
Risk levels: Critical (data loss, full outage, security breach); High (service unavailable or significant degradation); Medium (degraded function or increased cost); Low (minor impact).
| Variable | Sensible Default | Risk | Consequence of Incorrect Value |
|---|---|---|---|
JWT_SECRET (auto-generated) | Random secret in Secret Manager | Critical | Signs all AnythingLLM authentication tokens. Rotating or regenerating it invalidates all active sessions simultaneously. Treat as immutable after first user login. |
AUTH_TOKEN (optional) | "" (no token) | High | Leaving empty means the AnythingLLM REST API is accessible to any pod in the same namespace without a token. Provide a strong bearer token for production cluster deployments. |
STORAGE_DIR / GCS Fuse mount | Mounted at /app/server/storage | Critical | All workspace documents, vector indices, and conversation attachments are stored under STORAGE_DIR. Without a persistent volume mount, all data is lost when the pod is evicted or rescheduled. Either stateful_pvc_enabled = true or a GCS Fuse mount is mandatory for production. |
stateful_pvc_enabled | false | High | GCS Fuse is the default persistence backend. If GCS Fuse is disabled and no PVC is configured, the pod writes to the ephemeral container filesystem and all data is lost on restart. |
nfs_mount_path | "/mnt/nfs" | High | When NFS is enabled, must match the STORAGE_DIR environment variable. A mismatch causes AnythingLLM to write to a local path while the NFS share goes unused. |
LLM_PROVIDER (via environment_variables) | "native" | Critical | Without a correctly configured LLM provider and its associated API key in secret_environment_variables, all AI inference requests fail. LLM_PROVIDER must match the keys provided (e.g., "openai" requires OPENAI_API_KEY). |
EMBEDDING_ENGINE (via environment_variables) | "native" | High | Changing the embedding engine after workspace ingestion makes all existing vector indices incompatible. The engine must remain consistent for the lifetime of the data, or all documents must be re-ingested. |
secret_environment_variables | {} | Critical | All LLM provider API keys must come from Secret Manager references. Providing sensitive keys as plain environment_variables exposes them in Kubernetes pod specs and GCP audit logs. |
container_resources.memory_limit | 4Gi | High | AnythingLLM's native embedding pipeline requires 3–4 Gi RAM during ingestion. OOM-kill during document processing causes partial ingestion and corrupted vector indices. |
quota_memory_requests / quota_memory_limits | "4Gi" / "8Gi" | Critical | Must use binary suffixes (Gi, Mi). Bare integers are treated as bytes (for example, "4" means 4 bytes), so the resulting quota prevents all pod scheduling in the namespace. |
enable_cloudsql_volume | true | Critical | Must remain true for Cloud SQL connectivity. Disabling causes all database connections to fail and AnythingLLM to crash on startup with a PostgreSQL connection error. |
database_type | "POSTGRES" | Critical | AnythingLLM requires PostgreSQL. Without a relational database, workspace metadata, users, and conversation history cannot be persisted. |
min_instance_count | 1 | High | Setting this to 0 (scale-to-zero) forces the pod and the GCS Fuse mount to re-initialise on each cold start (30–60 s). Any AI operations in flight during scale-down are lost. |
timeout_seconds | 300 | High | Long document ingestion operations or slow LLM completions return 504 errors when the timeout is exceeded. Increase to 600–3600 seconds for document-heavy or slow-LLM deployments. |
enable_nfs | false | Medium | Required for multi-replica deployments with shared file access. Without NFS or a shared PVC, each pod has an isolated storage view and cross-pod document access is impossible. |
workload_type | null (auto-select) | Medium | Setting stateful_pvc_enabled = true auto-selects StatefulSet. Setting both stateful_pvc_enabled = true and workload_type = "Deployment" fails at plan time with a validation error. |
enable_redis | false | Low | Optional for AnythingLLM. If enable_redis = true, redis_host must be resolvable from the pod or startup fails. |
backup_schedule | "" (disabled) | High | Without automated PostgreSQL backups, workspace metadata (users, workspaces, settings) is unprotected. Enable for production. |
enable_image_mirroring | true | Medium | When disabled, images are pulled directly from the upstream registry, which is rate-limited in CI/CD environments. Keep enabled in production. |
application_version | "latest" | Medium | Unpinned versions risk schema-breaking upgrades. Pin to a specific release tag for production stability. |
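Several of the pitfalls above can be avoided in one place. The fragment below is a sketch under stated assumptions, not a verified production configuration: the `source` path, the version tag, and the Secret Manager secret name are hypothetical, the value format expected by `secret_environment_variables` is assumed to be a secret name, and the module's other required inputs are omitted.

```hcl
module "anythingllm" {
  # Assumed relative path; adjust to where the module lives in your repository.
  source = "../../modules/AnythingLLM_GKE"

  # Pin the release instead of using "latest" (hypothetical tag shown).
  application_version = "1.2.3"

  # Keep one warm instance and allow long ingestion requests.
  min_instance_count = 1
  timeout_seconds    = 600

  # Persistent storage: leaving workload_type unset lets the module
  # auto-select StatefulSet when stateful_pvc_enabled is true.
  stateful_pvc_enabled = true

  # Namespace quotas must carry binary suffixes (Gi, Mi).
  quota_memory_requests = "4Gi"
  quota_memory_limits   = "8Gi"

  # Non-sensitive provider settings.
  environment_variables = {
    LLM_PROVIDER     = "openai"
    EMBEDDING_ENGINE = "native" # keep stable once documents are ingested
  }

  # API keys come from Secret Manager, never from plain environment_variables.
  secret_environment_variables = {
    OPENAI_API_KEY = "openai-api-key" # hypothetical secret name; value format is an assumption
  }

  # Keep Cloud SQL connectivity, image mirroring, and backups enabled.
  enable_cloudsql_volume = true
  enable_image_mirroring = true
  backup_schedule        = "0 2 * * *" # cron, UTC
}
```

Setting `workload_type = "Deployment"` alongside `stateful_pvc_enabled = true` in this block would fail validation at plan time, which is why `workload_type` is left unset here.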