
AnythingLLM on Google Kubernetes Engine

This document provides a comprehensive reference for the modules/AnythingLLM_GKE Terraform module. It covers architecture, IAM, configuration variables, AnythingLLM-specific behaviours, and operational patterns for deploying AnythingLLM on GKE Autopilot.


1. Module Overview

AnythingLLM is a private AI workspace and Retrieval-Augmented Generation (RAG) platform. It allows teams to chat with documents, connect to any LLM provider (OpenAI, Anthropic, Ollama, and others), and build AI-powered knowledge assistants — without sending data to third-party services. AnythingLLM GKE is a wrapper module built on top of App GKE. It uses App GKE for all GCP and Kubernetes infrastructure provisioning and injects AnythingLLM-specific application configuration, secrets, and storage configuration via AnythingLLM Common.

Key Capabilities:

  • Compute: GKE Autopilot, 2 vCPU / 4 Gi by default. Supports Deployment (stateless) and StatefulSet (persistent PVC) workload types. min_instance_count = 1 recommended to keep AnythingLLM warm.
  • Data Persistence: Cloud SQL PostgreSQL 15 (required by AnythingLLM's Prisma ORM). GCS document storage bucket auto-provisioned by AnythingLLM Common. StatefulSet PVCs for persistent vector store data. Optional NFS for shared document access.
  • Security: Four application-level secrets (JWT_SECRET, AUTH_TOKEN, SIG_KEY, and SIG_SALT) are auto-generated by AnythingLLM Common, stored in Secret Manager, and accessed via Workload Identity. Inherits Cloud Armor, IAP, Binary Authorization, and VPC Service Controls from App GKE.
  • Caching: Redis is disabled by default (enable_redis = false). Enable for session or cache workloads if required.
  • CI/CD: Cloud Build custom image pipeline by default; Cloud Deploy progressive delivery optional.
  • Reliability: Health probes target /api/ping with a 60-second initial delay and extended startup window (30 periods × 10 seconds) for AI model loading.
  • Scaling: Horizontal Pod Autoscaler (HPA) with min_instance_count and max_instance_count. Optional Vertical Pod Autoscaling (VPA).

Project & Application Identity

| Variable | Group | Type | Default | Description |
| --- | --- | --- | --- | --- |
| project_id | 1 | string | | GCP project ID. Required. |
| tenant_deployment_id | 2 | string | 'demo' | Short suffix appended to all resource names. |
| support_users | 2 | list(string) | [] | Email recipients for monitoring alerts. |
| resource_labels | 2 | map(string) | {} | Labels applied to all provisioned resources. |
| application_name | 3 | string | 'anythingllm' | Base resource name. Do not change after initial deployment. |
| application_display_name | 3 | string | 'AnythingLLM' | Human-readable name shown in dashboards. |
| application_description | 3 | string | 'AnythingLLM Private AI Workspace on GKE' | Service description. |
| application_version | 3 | string | 'latest' | Container image version tag. |

Wrapper architecture: AnythingLLM GKE calls AnythingLLM Common to build an application_config object containing AnythingLLM-specific environment variables, probe configuration, and the db-init job definition. AnythingLLM Common generates and stores JWT_SECRET, AUTH_TOKEN, SIG_KEY, and SIG_SALT in Secret Manager and returns their IDs via module.anythingllm_app.secret_ids. The GOOGLE_CLOUD_STORAGE_BUCKET_NAME environment variable is set automatically from module.anythingllm_app.storage_buckets[0].name.

PostgreSQL note: AnythingLLM uses Prisma ORM and requires PostgreSQL. database_type = "POSTGRES_15" is the default. Do not set this to a MySQL or SQL Server variant.
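
As a minimal sketch, the wrapper might be called from a root configuration as shown here. The module path is the one named at the top of this document; the module label `anythingllm`, the relative source path, and the project ID are illustrative placeholders rather than values confirmed by the module.

```hcl
# Sketch only: the label, relative source path, and project ID are placeholders.
module "anythingllm" {
  source = "./modules/AnythingLLM_GKE"

  project_id           = "my-gcp-project" # required; hypothetical project ID
  tenant_deployment_id = "demo"           # documented default
  database_type        = "POSTGRES_15"    # documented default; PostgreSQL is required
}
```

Every other variable falls back to the defaults listed in the tables in this document, so a first deployment should only need the project ID.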


2. IAM & Access Control

AnythingLLM GKE delegates all IAM provisioning to App GKE. Workload Identity binds the Kubernetes Service Account to a GCP Service Account, granting the pods access to Secret Manager, Cloud SQL, GCS, and Artifact Registry.

Application secrets: AnythingLLM Common auto-generates four secrets on first apply:

  • JWT_SECRET — signs AnythingLLM authentication tokens.
  • AUTH_TOKEN — optional API bearer token for programmatic access.
  • SIG_KEY — HMAC signing key for request signatures (32+ characters).
  • SIG_SALT — salt used alongside SIG_KEY for HMAC signatures (32+ characters).

These secrets are stored in Secret Manager and mounted into pods as Kubernetes Secret volumes or environment variables via Workload Identity. Plaintext is never written to Terraform state.

Database initialisation identity: The db-init Kubernetes Job runs under the pod's Workload Identity SA. It connects to Cloud SQL PostgreSQL via the Auth Proxy sidecar, using DB_HOST, DB_USER, and ROOT_PASSWORD (from Secret Manager).


3. Core Service Configuration

A. Compute (GKE Autopilot)

AnythingLLM's AI workloads require more resources than typical web applications. GKE Autopilot automatically provisions node capacity to meet pod resource requests.

min_instance_count = 1 is recommended to keep AnythingLLM warm and avoid cold starts for AI document chat and embedding operations.

StatefulSet for persistent vector store: Setting stateful_pvc_enabled = true automatically selects StatefulSet workload type. Each pod receives its own PVC mounted at /app/server/storage (the AnythingLLM storage directory) by default.

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| deploy_application | 4 | true | Set false for infrastructure-only deployment (SQL, storage, secrets). |
| container_image_source | 4 | 'custom' | 'custom' builds via Cloud Build. 'prebuilt' deploys an existing image URI. |
| container_image | 4 | "" | Override image URI. Leave empty for Cloud Build to manage. |
| container_build_config | 4 | { enabled=true } | Cloud Build configuration object. |
| enable_image_mirroring | 4 | true | Mirrors the container image into Artifact Registry. |
| container_resources | 4 | { cpu_limit="2000m", memory_limit="4Gi" } | CPU/Memory limits and requests. AI workloads require at least 2 vCPU and 4 Gi. |
| min_instance_count | 4 | 1 | Minimum pod replicas. Set to 1 to keep AnythingLLM warm. |
| max_instance_count | 4 | 1 | Maximum pod replicas. |
| container_port | 4 | 3001 | AnythingLLM's native HTTP port. |
| container_protocol | 4 | 'http1' | 'http1' or 'h2c'. |
| enable_vertical_pod_autoscaling | 4 | false | Enables VPA to automatically adjust CPU/memory requests. |
| timeout_seconds | 4 | 300 | Maximum load balancer backend response timeout. Increase for long document processing. |
| enable_cloudsql_volume | 4 | true | Injects Cloud SQL Auth Proxy sidecar for database connectivity. |
| cloudsql_volume_mount_path | 4 | '/cloudsql' | Container path for the Auth Proxy Unix socket. |
| service_annotations | 4 | {} | Annotations applied to the Kubernetes Service resource. |
| service_labels | 4 | {} | Labels applied to the Kubernetes Service resource. |

Differences from App GKE defaults:

| Variable | App GKE | AnythingLLM GKE | Reason |
| --- | --- | --- | --- |
| container_port | 8080 | 3001 | AnythingLLM's native port. |
| container_resources.cpu_limit | '1000m' | '2000m' | AI workloads require more CPU. |
| container_resources.memory_limit | '512Mi' | '4Gi' | LLM context, document vectors, and Prisma ORM require more RAM. |
| min_instance_count | 0 | 1 | Keep warm for AI operations. |
| health probe path | '/healthz' | '/api/ping' | AnythingLLM's health endpoint. |
| stateful_pvc_mount_path | (app-defined) | '/app/server/storage' | AnythingLLM's storage directory for documents and vectors. |
| stateful_fs_group | varies | 1000 | Matches the AnythingLLM container user GID. |

B. Kubernetes Workload Configuration

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| workload_type | 6 | null | 'Deployment' (stateless) or 'StatefulSet' (persistent PVC). Auto-selects StatefulSet when stateful_pvc_enabled = true. |
| service_type | 6 | 'LoadBalancer' | Kubernetes Service type: 'ClusterIP', 'LoadBalancer', or 'NodePort'. |
| session_affinity | 6 | 'ClientIP' | Session affinity for the Kubernetes Service. |
| namespace_name | 6 | "" | Kubernetes namespace. Auto-generated if empty. |
| gke_cluster_name | 6 | "" | Target GKE cluster name. Auto-discovered if empty. |
| termination_grace_period_seconds | 6 | 60 | Seconds Kubernetes waits after SIGTERM before force-terminating. |
| deployment_timeout | 6 | 1800 | Seconds Terraform waits for the Deployment rollout to complete. |
| enable_network_segmentation | 6 | false | Creates Kubernetes NetworkPolicy resources to restrict pod-to-pod traffic. |
| enable_multi_cluster_service | 6 | false | Creates a ServiceExport for Multi-Cluster Services (MCS). |
| configure_service_mesh | 6 | false | Enables Istio service mesh injection for the application namespace. |
| network_tags | 19 | ['nfsserver'] | Network tags applied to GKE nodes for VPC firewall rules. |

C. StatefulSet Configuration

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| stateful_pvc_enabled | 7 | null | Enables PVC templates in the StatefulSet. Setting true auto-selects StatefulSet. Recommended for persistent AnythingLLM vector store data. |
| stateful_pvc_size | 7 | '20Gi' | Storage size for each per-pod PVC. Recommended minimum 20 Gi for vector store data. |
| stateful_pvc_mount_path | 7 | '/app/server/storage' | Mount path for the per-pod PVC. Set to AnythingLLM's storage directory. |
| stateful_pvc_storage_class | 7 | 'standard-rwo' | Kubernetes StorageClass for StatefulSet PVCs. |
| stateful_headless_service | 7 | null | Creates a headless Kubernetes Service for stable pod DNS. |
| stateful_pod_management_policy | 7 | null | Pod creation order: 'OrderedReady' or 'Parallel'. |
| stateful_update_strategy | 7 | null | Update strategy: 'RollingUpdate' or 'OnDelete'. |
| stateful_fs_group | 7 | 1000 | Pod-level fsGroup GID. Set to 1000 to match the AnythingLLM container user. |
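
The StatefulSet settings above can be combined in a variables file. This is a sketch of a `terraform.tfvars` fragment; apart from `stateful_pvc_enabled`, every value shown is simply the documented default, written out for clarity.

```hcl
# terraform.tfvars fragment (sketch): persistent per-pod storage for AnythingLLM.
stateful_pvc_enabled       = true                  # auto-selects the StatefulSet workload type
stateful_pvc_size          = "20Gi"                # documented recommended minimum
stateful_pvc_mount_path    = "/app/server/storage" # AnythingLLM storage directory
stateful_pvc_storage_class = "standard-rwo"
stateful_fs_group          = 1000                  # matches the container user GID
```

Leaving `workload_type` unset lets the module choose StatefulSet on its own, which avoids a mismatch between the two variables.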

D. Database (Cloud SQL — PostgreSQL 15)

AnythingLLM uses Prisma ORM and requires PostgreSQL. The entrypoint script constructs the DATABASE_URL Prisma connection string from the platform-injected DB_* variables at container start time.

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| database_type | 16 | 'POSTGRES_15' | Cloud SQL engine. AnythingLLM requires PostgreSQL. |
| application_database_name | 16 | 'anythingllmdb' | PostgreSQL database name. Do not change after initial deployment. |
| application_database_user | 16 | 'anythingllmuser' | Database user. Password auto-generated and stored in Secret Manager. |
| database_password_length | 16 | 32 | Auto-generated password length. Range: 16–64. |
| enable_postgres_extensions | 16 | false | Enables PostgreSQL extension installation. |
| postgres_extensions | 16 | [] | List of PostgreSQL extensions to install (e.g., ['uuid-ossp', 'vector']). |
| enable_auto_password_rotation | 16 | false | Automated zero-downtime password rotation via Kubernetes CronJob. |
| rotation_propagation_delay_sec | 16 | 90 | Seconds to wait after rotation before restarting GKE pods. |
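
A hedged sketch of how the extension and rotation variables might be set together in `terraform.tfvars`. The extension names are the two examples given in the table; whether a deployment actually needs them depends on the chosen vector database and is not stated in this document.

```hcl
# terraform.tfvars fragment (sketch): optional PostgreSQL features.
enable_postgres_extensions = true
postgres_extensions        = ["uuid-ossp", "vector"] # example names from the table above

enable_auto_password_rotation  = true
rotation_propagation_delay_sec = 90 # documented default
```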

E. Storage (NFS, GCS, PVCs)

NFS is disabled by default (enable_nfs = false). For multi-pod deployments sharing document storage, enable NFS. For single-pod persistent storage, use stateful_pvc_enabled = true.

GCS document bucket: AnythingLLM Common automatically provisions a dedicated anythingllm-docs GCS bucket. The GOOGLE_CLOUD_STORAGE_BUCKET_NAME environment variable is set automatically.

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| create_cloud_storage | 14 | true | Set false to skip bucket creation. The anythingllm-docs bucket is always provisioned. |
| storage_buckets | 14 | [{ name_suffix = "data" }] | Additional GCS buckets to provision. |
| gcs_volumes | 14 | [] | GCS buckets to mount via CSI GCS Fuse. |
| manage_storage_kms_iam | 14 | false | Creates CMEK KMS keys and enables CMEK on storage buckets. |
| enable_artifact_registry_cmek | 14 | false | Enables CMEK encryption on Artifact Registry container images. |
| max_images_to_retain | 14 | 7 | Maximum container images to keep in Artifact Registry. |
| delete_untagged_images | 14 | true | Automatically deletes untagged images. |
| image_retention_days | 14 | 30 | Days after which images are eligible for deletion. |
| enable_nfs | 13 | false | Provisions a Cloud Filestore NFS instance for shared file storage. |
| nfs_mount_path | 13 | '/mnt/nfs' | Container path where the NFS volume is mounted. |
| nfs_volume_name | 13 | 'nfs-data-volume' | Volume name for the NFS mount. |
| nfs_instance_name | 13 | "" | Existing NFS GCE VM name. Leave empty to auto-discover. |
| nfs_instance_base_name | 13 | 'app-nfs' | Base name for inline NFS GCE VM. |

F. Networking

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_custom_domain | 19 | false | Provisions a Kubernetes Ingress for application_domains. |
| application_domains | 19 | [] | Custom domain names for the Ingress. |
| reserve_static_ip | 19 | true | Provisions a global static external IP. Recommended for production DNS. |
| static_ip_name | 19 | "" | Name for the static IP. Auto-generated if empty. |
| network_tags | 19 | ['nfsserver'] | Network tags for GKE nodes. |
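
For illustration, a custom domain might be wired up with a `terraform.tfvars` fragment like the following. The domain is a placeholder; the DNS record pointing at the reserved IP is assumed to be managed outside this module.

```hcl
# terraform.tfvars fragment (sketch): Ingress with a custom domain.
enable_custom_domain = true
application_domains  = ["llm.example.com"] # hypothetical domain
reserve_static_ip    = true                # documented default; keeps the IP stable for DNS
```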

G. Initialization & Bootstrap

A db-init Kubernetes Job is automatically provisioned by AnythingLLM Common when initialization_jobs is left as the default empty list ([]). It uses the postgres:15-alpine image and executes create-db-and-user.sh.

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| initialization_jobs | 11 | [] | Kubernetes Jobs to run before the application starts. Leave empty for AnythingLLM Common to supply the default db-init job. Each entry must have at least one of command, args, or script_path. |
| cron_jobs | 11 | [] | Scheduled Kubernetes CronJobs. Each entry: name, schedule, image, command, args, env_vars, secret_env_vars, cpu_limit, memory_limit, restart_policy, concurrency_policy, suspend, mount_nfs, mount_gcs_volumes, script_path. |
| additional_services | 11 | [] | Sidecar or helper GKE services deployed alongside AnythingLLM. |

4. Advanced Security

A. Cloud Armor WAF

When enable_cloud_armor = true, a Cloud Armor security policy is attached to the GKE Ingress backend.

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_cloud_armor | 21 | false | Attaches a Cloud Armor security policy to the GKE Ingress. |
| admin_ip_ranges | 21 | [] | Admin CIDR ranges for privileged access. |
| cloud_armor_policy_name | 21 | 'default-waf-policy' | Cloud Armor security policy name. |
| enable_cdn | 21 | false | Enables Cloud CDN on the GKE Ingress backend. |

B. Identity-Aware Proxy (IAP)

When enable_iap = true, IAP is configured on the GKE Ingress backend. Google identity authentication is required before requests reach AnythingLLM.

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_iap | 20 | false | Enables IAP on the GKE Ingress. |
| iap_authorized_users | 20 | [] | Users/service accounts granted IAP access. |
| iap_authorized_groups | 20 | [] | Google Groups granted IAP access. |
| iap_oauth_client_id | 20 | "" | OAuth 2.0 Client ID for IAP. Sensitive. |
| iap_oauth_client_secret | 20 | "" | OAuth 2.0 Client Secret for IAP. Sensitive. |
| iap_support_email | 20 | "" | Support email shown on the OAuth consent screen. |
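
A sketch of the non-sensitive IAP settings in `terraform.tfvars`. The `user:` and `group:` member prefixes follow the usual GCP IAM convention and are an assumption here; this document does not say which format the module expects. All addresses are placeholders.

```hcl
# terraform.tfvars fragment (sketch): IAP in front of AnythingLLM.
enable_iap            = true
iap_authorized_users  = ["user:alice@example.com"]    # member format is an assumption
iap_authorized_groups = ["group:ai-team@example.com"] # member format is an assumption
iap_support_email     = "support@example.com"         # hypothetical address
```

The two sensitive values, `iap_oauth_client_id` and `iap_oauth_client_secret`, are better kept out of version-controlled files. Terraform reads any input variable from an environment variable named `TF_VAR_<name>`, so they can be supplied that way.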

C. Binary Authorization

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_binary_authorization | 12 | false | Enforces image attestation on the GKE cluster. |

D. VPC Service Controls

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_vpc_sc | 22 | false | Registers module API calls within the project's VPC-SC perimeter. |
| vpc_cidr_ranges | 22 | [] | VPC subnet CIDR ranges for VPC-SC network access level. |
| vpc_sc_dry_run | 22 | true | Logs VPC-SC violations without blocking. |
| organization_id | 22 | "" | GCP Organization ID for VPC-SC. |
| enable_audit_logging | 22 | false | Enables detailed Cloud Audit Logs. |

5. Reliability Policies

A. Pod Disruption Budget

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_pod_disruption_budget | 9 | false | Creates a Kubernetes PodDisruptionBudget. |
| pdb_min_available | 9 | '1' | Minimum pods available during voluntary disruptions (integer or percentage). |

B. Topology Spread

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_topology_spread | 9 | false | Adds TopologySpreadConstraints to distribute pods across GKE zones. |
| topology_spread_strict | 9 | false | Uses DoNotSchedule when topology spread cannot be satisfied. |
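
As a sketch, the disruption budget and topology spread settings might be combined with a replica count above one. The replica numbers are illustrative. With a single replica, a PodDisruptionBudget requiring one available pod leaves no pod that can be evicted voluntarily, so node drains would stall.

```hcl
# terraform.tfvars fragment (sketch): availability settings for a multi-replica deployment.
min_instance_count = 2 # illustrative; the documented default is 1
max_instance_count = 4 # illustrative; the documented default is 1

enable_pod_disruption_budget = true
pdb_min_available            = "1"   # at least one pod stays up during voluntary disruptions

enable_topology_spread = true
topology_spread_strict = false       # prefer spreading across zones without blocking scheduling
```

Running more than one replica also raises the shared-storage question covered under Storage and in the pitfalls table at the end of this document.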

C. Resource Quota

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_resource_quota | 8 | false | Creates a Kubernetes ResourceQuota in the namespace. |
| quota_cpu_requests | 8 | "" | Total CPU requests allowed in the namespace. |
| quota_cpu_limits | 8 | "" | Total CPU limits allowed in the namespace. |
| quota_memory_requests | 8 | "" | Total memory requests. Must use binary suffix (e.g., '8Gi'). |
| quota_memory_limits | 8 | "" | Total memory limits. Must use binary suffix (e.g., '16Gi'). |

Warning: quota_memory_requests and quota_memory_limits must include a binary unit suffix (e.g., '8Gi', '4096Mi'). Bare integers are treated as bytes by Kubernetes and will block all pod scheduling.
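
To make the warning concrete, a quota block in `terraform.tfvars` could look like this sketch. The memory values are the examples from the table; the CPU values are illustrative and use the plain core-count form that Kubernetes accepts.

```hcl
# terraform.tfvars fragment (sketch): namespace ResourceQuota.
enable_resource_quota = true

quota_cpu_requests = "4" # illustrative
quota_cpu_limits   = "8" # illustrative

quota_memory_requests = "8Gi"  # binary suffix required; a bare "8" would mean 8 bytes
quota_memory_limits   = "16Gi"
```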

D. Health Probes & Uptime Monitoring

AnythingLLM requires a longer startup window for AI model loading and database migration.

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| startup_probe_config | 10 | { path="/api/ping", initial_delay_seconds=60, failure_threshold=30, ... } | Startup probe configuration. |
| health_check_config | 10 | { path="/api/ping", initial_delay_seconds=30, failure_threshold=3, ... } | Liveness probe configuration. |
| uptime_check_config | 10 | { enabled=true, path="/" } | Cloud Monitoring uptime check. |
| alert_policies | 10 | [] | Cloud Monitoring metric alert policies. |
| startup_probe | 10 | { path="/api/ping", initial_delay_seconds=60, failure_threshold=30, ... } | Startup probe passed to AnythingLLM Common. |
| liveness_probe | 10 | { path="/api/ping", initial_delay_seconds=30, failure_threshold=3, ... } | Liveness probe passed to AnythingLLM Common. |
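
The startup window implied by these defaults can be worked out with a few locals. This is a sketch for reasoning about the numbers only: the 10-second period comes from the "30 periods × 10 seconds" figure in the module overview, and the local names are hypothetical, not attributes of the probe objects.

```hcl
locals {
  startup_initial_delay_seconds = 60 # documented default
  startup_failure_threshold     = 30 # documented default
  startup_period_seconds        = 10 # from the module overview; local name is hypothetical

  # Approximate upper bound before Kubernetes restarts a container
  # that never passes its startup probe.
  startup_budget_seconds = (
    local.startup_initial_delay_seconds +
    local.startup_failure_threshold * local.startup_period_seconds
  )
}

output "startup_budget_seconds" {
  value = local.startup_budget_seconds # 60 + 30 * 10 = 360
}
```

In other words, the container has roughly six minutes to finish model loading and database migration before it is restarted.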

6. Integrations

A. LLM Provider Configuration

Use environment_variables for non-sensitive provider configuration and secret_environment_variables for API keys:

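# Non-sensitive provider settings, passed as plain environment variables.
environment_variables = {
  LLM_PROVIDER     = "openai"
  EMBEDDING_ENGINE = "native"
  VECTOR_DB        = "lancedb"
}

# Each value is a Secret Manager reference, not the API key itself.
secret_environment_variables = {
  OPENAI_API_KEY    = "anythingllm-openai-key"
  ANTHROPIC_API_KEY = "anythingllm-anthropic-key"
}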

B. Redis Cache

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_redis | 15 | false | Enables Redis. Not required for AnythingLLM core functionality. |
| redis_host | 15 | null | Redis hostname or IP. Required when enable_redis = true. |
| redis_port | 15 | '6379' | Redis TCP port (string). |
| redis_auth | 15 | "" | Redis AUTH password. Sensitive. |
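
If Redis is wanted, a minimal `terraform.tfvars` sketch might look like this. The host address is a placeholder for an instance reachable from the pods.

```hcl
# terraform.tfvars fragment (sketch): optional Redis connection.
enable_redis = true
redis_host   = "10.0.0.5" # hypothetical private IP; required when enable_redis = true
redis_port   = "6379"     # a string, per the table above
```

`redis_auth` is sensitive, so it is probably best supplied through the `TF_VAR_redis_auth` environment variable rather than committed to a file.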

C. Backup & Import

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| backup_schedule | 17 | '0 2 * * *' | Backup cron schedule (UTC). |
| backup_retention_days | 17 | 7 | Days to retain backup files in GCS. |
| enable_backup_import | 17 | false | Triggers a one-time database restore on apply. |
| backup_source | 17 | 'gcs' | 'gcs' or 'gdrive'. |
| backup_file | 17 | 'backup.sql' | Backup filename to import. |
| backup_format | 17 | 'sql' | Backup format: sql, tar, gz, tgz, tar.gz, zip, auto. |

D. Custom SQL Scripts

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_custom_sql_scripts | 18 | false | Runs custom SQL scripts from a GCS bucket against the database. |
| custom_sql_scripts_bucket | 18 | "" | GCS bucket containing SQL scripts. |
| custom_sql_scripts_path | 18 | "" | Path prefix within the bucket. |
| custom_sql_scripts_use_root | 18 | false | Run scripts as the root DB user. |

E. CI/CD

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| enable_cicd_trigger | 12 | false | Enables a Cloud Build GitHub trigger. |
| github_repository_url | 12 | "" | Full HTTPS URL of the GitHub repository. |
| github_token | 12 | "" | GitHub PAT. Sensitive. |
| github_app_installation_id | 12 | "" | GitHub App installation ID. |
| cicd_trigger_config | 12 | { branch_pattern = "^main$" } | Advanced Cloud Build trigger configuration. |
| enable_cloud_deploy | 12 | false | Provisions a Cloud Deploy pipeline. |
| cloud_deploy_stages | 12 | [dev, staging, prod(approval)] | Cloud Deploy promotion stages. |
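
A sketch of a Cloud Build trigger configuration in `terraform.tfvars`. The repository URL is a placeholder, and only the `branch_pattern` field is shown because it is the only field of `cicd_trigger_config` documented here.

```hcl
# terraform.tfvars fragment (sketch): build on pushes to main.
enable_cicd_trigger   = true
github_repository_url = "https://github.com/example-org/anythingllm-deploy" # hypothetical repo

cicd_trigger_config = {
  branch_pattern = "^main$" # documented default: exact match on the main branch
}
```

`github_token` is sensitive and can be passed through the `TF_VAR_github_token` environment variable instead of the variables file.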

7. Platform-Managed Behaviours

| Behaviour | Implementation | Detail |
| --- | --- | --- |
| PostgreSQL 15 required | database_type = "POSTGRES_15" fixed by AnythingLLM Common | AnythingLLM's Prisma ORM requires PostgreSQL. |
| Prisma DATABASE_URL | Constructed by anythingllm-entrypoint.sh at container start | The entrypoint script builds the PostgreSQL connection string from DB_* vars. |
| Application secrets auto-generated | JWT_SECRET, AUTH_TOKEN, SIG_KEY, SIG_SALT provisioned by AnythingLLM Common | Secret IDs forwarded to App GKE via module_secret_env_vars. |
| GCS document bucket | anythingllm-docs bucket provisioned by AnythingLLM Common | GOOGLE_CLOUD_STORAGE_BUCKET_NAME env var set automatically. |
| Fixed environment variables | SERVER_PORT=3001, STORAGE_DIR=/app/server/storage, UID=1000, GID=1000 | Set by AnythingLLM Common. Do not override. |
| StatefulSet PVC mount path | stateful_pvc_mount_path = '/app/server/storage' default | Points to AnythingLLM's document and vector storage directory. |
| fsGroup = 1000 | stateful_fs_group = 1000 default | Kubernetes chowns the PVC mount to match the AnythingLLM container user. |
| Unix socket by default | enable_cloudsql_volume = true default | Auth Proxy sidecar provides Cloud SQL connectivity. |
| Image mirroring enabled | enable_image_mirroring = true default | Mirrors image into Artifact Registry. |
| Default db-init job | Supplied by AnythingLLM Common when initialization_jobs = [] | PostgreSQL database and user are created automatically. |
| kubernetes_ready gating | kubernetes_ready output gates all Kubernetes resources | On a first apply with an inline cluster, Kubernetes resources are skipped until the cluster endpoint is available. Re-run apply to complete deployment. |

8. Variable Reference

All user-configurable variables exposed by AnythingLLM GKE, sorted by UI group then order.

| Variable | Group | Default | Description |
| --- | --- | --- | --- |
| module_description | 0 | (AnythingLLM platform text) | Platform metadata: module description. |
| module_documentation | 0 | (docs URL) | Platform metadata: documentation URL. |
| module_dependency | 0 | ['Services GCP'] | Platform metadata: required modules. |
| module_services | 0 | (GCP service list) | Platform metadata: GCP services consumed. |
| credit_cost | 0 | 150 | Platform metadata: deployment credit cost. |
| require_credit_purchases | 0 | false | Platform metadata: enforces credit balance check. |
| enable_purge | 0 | true | Permits full deletion of module resources on destroy. |
| public_access | 0 | false | Platform catalogue visibility. |
| deployment_id | 0 | "" | Deployment ID suffix. Auto-generated if empty. |
| resource_creator_identity | 0 | (platform SA) | Service account used by Terraform to manage resources. |
| impersonation_service_account | 0 | "" | Service account to impersonate for GCP API calls. |
| project_id | 1 | | GCP project ID. Required. |
| region | 1 | 'us-central1' | GCP region for all resources. |
| tenant_deployment_id | 2 | 'demo' | Short suffix appended to all resource names. |
| support_users | 2 | [] | Email addresses for monitoring alerts. |
| resource_labels | 2 | {} | Labels applied to all provisioned resources. |
| application_name | 3 | 'anythingllm' | Base resource name. Do not change after initial deployment. |
| application_display_name | 3 | 'AnythingLLM' | Human-readable name. |
| application_description | 3 | 'AnythingLLM Private AI Workspace on GKE' | Service description. |
| application_version | 3 | 'latest' | Container image version tag. |
| deploy_application | 4 | true | Set false for infrastructure-only deployment. |
| container_image_source | 4 | 'custom' | 'custom' (Cloud Build) or 'prebuilt' (existing image). |
| container_image | 4 | "" | Container image URI. |
| container_build_config | 4 | { enabled=true } | Cloud Build configuration object. |
| enable_image_mirroring | 4 | true | Mirrors the container image into Artifact Registry. |
| container_resources | 4 | { cpu_limit="2000m", memory_limit="4Gi" } | CPU/Memory limits and optional requests. |
| min_instance_count | 4 | 1 | Minimum pod replicas. |
| max_instance_count | 4 | 1 | Maximum pod replicas. |
| container_port | 4 | 3001 | AnythingLLM's native port. |
| container_protocol | 4 | 'http1' | 'http1' or 'h2c'. |
| enable_vertical_pod_autoscaling | 4 | false | Enables VPA. |
| timeout_seconds | 4 | 300 | Load balancer backend response timeout. |
| enable_cloudsql_volume | 4 | true | Injects Cloud SQL Auth Proxy sidecar. |
| cloudsql_volume_mount_path | 4 | '/cloudsql' | Path for the Auth Proxy socket. |
| service_annotations | 4 | {} | Kubernetes Service annotations. |
| service_labels | 4 | {} | Kubernetes Service labels. |
| environment_variables | 5 | {} | Plain-text env vars. |
| secret_environment_variables | 5 | {} | Secret Manager references. |
| secret_rotation_period | 5 | '2592000s' | Secret rotation notification frequency. |
| secret_propagation_delay | 5 | 30 | Seconds to wait after secret creation. |
| gke_cluster_name | 6 | "" | Target GKE cluster name. Auto-discovered if empty. |
| workload_type | 6 | null | 'Deployment' or 'StatefulSet'. |
| service_type | 6 | 'LoadBalancer' | Kubernetes Service type. |
| session_affinity | 6 | 'ClientIP' | Session affinity mode. |
| namespace_name | 6 | "" | Kubernetes namespace. Auto-generated if empty. |
| termination_grace_period_seconds | 6 | 60 | Seconds to wait after SIGTERM. |
| deployment_timeout | 6 | 1800 | Seconds Terraform waits for rollout. |
| enable_network_segmentation | 6 | false | Creates Kubernetes NetworkPolicy resources. |
| enable_multi_cluster_service | 6 | false | Creates a ServiceExport for MCS. |
| configure_service_mesh | 6 | false | Enables Istio injection. |
| stateful_pvc_enabled | 7 | null | Enables PVC templates. Auto-selects StatefulSet. |
| stateful_pvc_size | 7 | '20Gi' | Storage per pod. |
| stateful_pvc_mount_path | 7 | '/app/server/storage' | PVC mount path (AnythingLLM storage dir). |
| stateful_pvc_storage_class | 7 | 'standard-rwo' | Kubernetes StorageClass. |
| stateful_headless_service | 7 | null | Creates a headless service for StatefulSet DNS. |
| stateful_pod_management_policy | 7 | null | 'OrderedReady' or 'Parallel'. |
| stateful_update_strategy | 7 | null | 'RollingUpdate' or 'OnDelete'. |
| stateful_fs_group | 7 | 1000 | Pod fsGroup GID. Matches AnythingLLM container user. |
| enable_resource_quota | 8 | false | Creates a Kubernetes ResourceQuota. |
| quota_cpu_requests | 8 | "" | Total CPU requests in the namespace. |
| quota_cpu_limits | 8 | "" | Total CPU limits in the namespace. |
| quota_memory_requests | 8 | "" | Total memory requests. Must use binary suffix (e.g., '8Gi'). |
| quota_memory_limits | 8 | "" | Total memory limits. Must use binary suffix. |
| enable_pod_disruption_budget | 9 | false | Creates a Kubernetes PodDisruptionBudget. |
| pdb_min_available | 9 | '1' | Minimum pods available during disruptions. |
| enable_topology_spread | 9 | false | TopologySpreadConstraints for multi-zone distribution. |
| topology_spread_strict | 9 | false | DoNotSchedule when spread cannot be satisfied. |
| startup_probe_config | 10 | { path="/api/ping", initial_delay_seconds=60, ... } | Startup probe configuration. |
| health_check_config | 10 | { path="/api/ping", initial_delay_seconds=30, ... } | Liveness probe configuration. |
| uptime_check_config | 10 | { enabled=true, path="/" } | Cloud Monitoring uptime check. |
| alert_policies | 10 | [] | Cloud Monitoring metric alert policies. |
| startup_probe | 10 | { path="/api/ping", initial_delay_seconds=60, ... } | Startup probe passed to AnythingLLM Common. |
| liveness_probe | 10 | { path="/api/ping", initial_delay_seconds=30, ... } | Liveness probe passed to AnythingLLM Common. |
| initialization_jobs | 11 | [] | Kubernetes initialization Jobs. |
| cron_jobs | 11 | [] | Kubernetes CronJobs. |
| additional_services | 11 | [] | Additional GKE services alongside AnythingLLM. |
| enable_cicd_trigger | 12 | false | Provisions a Cloud Build GitHub trigger. |
| github_repository_url | 12 | "" | GitHub repository URL. |
| github_token | 12 | "" | GitHub PAT. Sensitive. |
| github_app_installation_id | 12 | "" | GitHub App installation ID. |
| cicd_trigger_config | 12 | { branch_pattern = "^main$" } | Advanced trigger config. |
| enable_cloud_deploy | 12 | false | Provisions a Cloud Deploy pipeline. |
| cloud_deploy_stages | 12 | [dev, staging, prod(approval)] | Cloud Deploy stages. |
| enable_binary_authorization | 12 | false | Enforces image attestation. |
| enable_nfs | 13 | false | Provisions NFS shared storage. |
| nfs_mount_path | 13 | '/mnt/nfs' | NFS container mount path. |
| nfs_volume_name | 13 | 'nfs-data-volume' | NFS volume name. |
| nfs_instance_name | 13 | "" | Existing NFS GCE VM name. |
| nfs_instance_base_name | 13 | 'app-nfs' | Base name for inline NFS VM. |
| create_cloud_storage | 14 | true | Set false to skip GCS bucket creation. |
| storage_buckets | 14 | [{ name_suffix = "data" }] | Additional GCS buckets. |
| gcs_volumes | 14 | [] | GCS Fuse volumes via CSI. |
| manage_storage_kms_iam | 14 | false | CMEK for storage buckets. |
| enable_artifact_registry_cmek | 14 | false | CMEK for Artifact Registry. |
| max_images_to_retain | 14 | 7 | Maximum images in Artifact Registry. |
| delete_untagged_images | 14 | true | Delete untagged images. |
| image_retention_days | 14 | 30 | Image retention age in days. |
| enable_redis | 15 | false | Enables Redis. |
| redis_host | 15 | null | Redis hostname or IP. |
| redis_port | 15 | '6379' | Redis TCP port. |
| redis_auth | 15 | "" | Redis AUTH password. Sensitive. |
| database_type | 16 | 'POSTGRES_15' | Cloud SQL engine. AnythingLLM requires PostgreSQL. |
| application_database_name | 16 | 'anythingllmdb' | PostgreSQL database name. |
| application_database_user | 16 | 'anythingllmuser' | Database user. |
| database_password_length | 16 | 32 | Password length. Range: 16–64. |
| enable_postgres_extensions | 16 | false | Enables PostgreSQL extension installation. |
| postgres_extensions | 16 | [] | PostgreSQL extensions to install. |
| enable_mysql_plugins | 16 | false | Not applicable: AnythingLLM uses PostgreSQL only. |
| mysql_plugins | 16 | [] | Not applicable. |
| enable_auto_password_rotation | 16 | false | Automated password rotation. |
| rotation_propagation_delay_sec | 16 | 90 | Seconds to wait after rotation before restarting pods. |
| backup_schedule | 17 | '0 2 * * *' | Backup cron schedule (UTC). |
| backup_retention_days | 17 | 7 | Days to retain backups. |
| enable_backup_import | 17 | false | Triggers a one-time restore on apply. |
| backup_source | 17 | 'gcs' | 'gcs' or 'gdrive'. |
| backup_file | 17 | 'backup.sql' | Backup filename. |
| backup_format | 17 | 'sql' | Backup format. |
| enable_custom_sql_scripts | 18 | false | Runs custom SQL scripts from GCS. |
| custom_sql_scripts_bucket | 18 | "" | GCS bucket for SQL scripts. |
| custom_sql_scripts_path | 18 | "" | Path prefix in the bucket. |
| custom_sql_scripts_use_root | 18 | false | Run scripts as root DB user. |
| enable_custom_domain | 19 | false | Provisions a Kubernetes Ingress. |
| application_domains | 19 | [] | Custom domain names for the Ingress. |
| reserve_static_ip | 19 | true | Provisions a global static IP. |
| static_ip_name | 19 | "" | Static IP name. Auto-generated if empty. |
| network_tags | 19 | ['nfsserver'] | GKE node network tags. |
| enable_iap | 20 | false | Enables IAP on the Ingress. |
| iap_authorized_users | 20 | [] | Users granted IAP access. |
| iap_authorized_groups | 20 | [] | Google Groups granted IAP access. |
| iap_oauth_client_id | 20 | "" | OAuth Client ID. Sensitive. |
| iap_oauth_client_secret | 20 | "" | OAuth Client Secret. Sensitive. |
| iap_support_email | 20 | "" | OAuth consent screen support email. |
| enable_cloud_armor | 21 | false | Attaches Cloud Armor to the GKE Ingress. |
| admin_ip_ranges | 21 | [] | Admin CIDR ranges. |
| cloud_armor_policy_name | 21 | 'default-waf-policy' | Cloud Armor policy name. |
| enable_cdn | 21 | false | Enables Cloud CDN on the Ingress backend. |
| enable_vpc_sc | 22 | false | VPC Service Controls perimeter enforcement. |
| vpc_cidr_ranges | 22 | [] | VPC CIDR ranges for VPC-SC. |
| vpc_sc_dry_run | 22 | true | Dry-run mode for VPC-SC violations. |
| organization_id | 22 | "" | GCP Organization ID for VPC-SC. |
| enable_audit_logging | 22 | false | Enables Cloud Audit Logs. |

9. Outputs

| Output | Description |
| --- | --- |
| service_name | Kubernetes Service name. |
| service_url | External URL of the AnythingLLM service. |
| namespace | Kubernetes namespace. |
| service_cluster_ip | ClusterIP of the Kubernetes Service. |
| service_external_ip | External LoadBalancer IP (if static IP is reserved). |
| project_id | GCP project ID. |
| deployment_id | Deployment ID suffix. |
| database_instance_name | Cloud SQL PostgreSQL instance name. |
| database_name | Application database name. |
| database_user | Application database user. |
| database_password_secret | Secret Manager secret name for the database password. |
| database_host | Database host IP (sensitive). |
| database_port | Database port. |
| storage_buckets | Created GCS storage buckets. |
| nfs_server_ip | NFS server internal IP (sensitive). |
| nfs_mount_path | NFS mount path inside containers. |
| container_image | Container image used for the deployment. |
| container_registry | Artifact Registry repository name. |
| deployment_summary | Summary of the deployment. |
| initialization_jobs | Created initialization job names. |
| cron_jobs | Created cron job names. |
| statefulset_name | Name of the StatefulSet (if workload_type = 'StatefulSet'). |
| cicd_enabled | Whether the CI/CD pipeline is enabled. |
| kubernetes_ready | True when the GKE cluster endpoint is available and all Kubernetes workload resources have been deployed. |
| artifact_registry_repository | Artifact Registry repository for container images. |
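
These outputs can be re-exported from a root configuration. The sketch below assumes a module block labelled `anythingllm`; that label, the source path, and the project ID are placeholders. Because `database_host` is listed as sensitive, Terraform requires the re-exporting output to be marked sensitive as well.

```hcl
# Sketch only: module label, source path, and project ID are placeholders.
module "anythingllm" {
  source     = "./modules/AnythingLLM_GKE"
  project_id = "my-gcp-project"
}

output "anythingllm_url" {
  description = "External URL of the AnythingLLM service"
  value       = module.anythingllm.service_url
}

output "anythingllm_database_host" {
  value     = module.anythingllm.database_host
  sensitive = true # needed when passing a sensitive module output through
}
```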

Configuration Pitfalls & Sensible Defaults

Risk levels: Critical (data loss, full outage, security breach) — High (service unavailable or significant degradation) — Medium (degraded function or increased cost) — Low (minor impact).

| Variable | Sensible Default | Risk | Consequence of Incorrect Value |
| --- | --- | --- | --- |
| JWT_SECRET (auto-generated) | Random secret in Secret Manager | Critical | Signs all AnythingLLM authentication tokens. Rotating or regenerating invalidates all active sessions simultaneously. Treat as immutable after first user login. |
| AUTH_TOKEN (optional) | "" (no token) | High | Leaving empty means the AnythingLLM REST API is accessible to any pod in the same namespace without a token. Provide a strong bearer token for production cluster deployments. |
| STORAGE_DIR / GCS Fuse mount | Mounted at /app/server/storage | Critical | All workspace documents, vector indices, and conversation attachments are stored under STORAGE_DIR. Without a persistent volume mount, all data is lost when the pod is evicted or rescheduled. stateful_pvc_enabled = true or GCS Fuse is mandatory for production. |
| stateful_pvc_enabled | false | High | GCS Fuse is the default persistence backend. If disabled accidentally and no PVC is configured, the pod writes to the ephemeral container filesystem and all data is lost on restart. |
| nfs_mount_path | "/mnt/nfs" | High | Must match the STORAGE_DIR environment variable. A mismatch causes AnythingLLM to write to a local path while the NFS share is unused. |
| LLM_PROVIDER (via environment_variables) | "native" | Critical | Without a correctly configured LLM provider and its associated API key in secret_environment_variables, all AI inference requests fail. LLM_PROVIDER must match the keys provided (e.g., "openai" requires OPENAI_API_KEY). |
| EMBEDDING_ENGINE (via environment_variables) | "native" | High | Changing the embedding engine after workspace ingestion makes all existing vector indices incompatible. The engine must remain consistent for the lifetime of the data, or all documents must be re-ingested. |
| secret_environment_variables | {} | Critical | All LLM provider API keys must come from Secret Manager references. Providing sensitive keys as plain environment_variables exposes them in Kubernetes pod specs and GCP audit logs. |
| container_resources.memory_limit | 4Gi | High | AnythingLLM's native embedding pipeline requires 3–4 Gi RAM during ingestion. OOM-kill during document processing causes partial ingestion and corrupted vector indices. |
| quota_memory_requests / quota_memory_limits | "4Gi" / "8Gi" | Critical | Must use binary suffixes (Gi, Mi). Bare integers are treated as bytes, preventing all pod scheduling in the namespace. |
| enable_cloudsql_volume | true | Critical | Must remain true for Cloud SQL connectivity. Disabling causes all database connections to fail and AnythingLLM to crash on startup with a PostgreSQL connection error. |
| database_type | "POSTGRES_15" | Critical | AnythingLLM requires PostgreSQL. Without a relational database, workspace metadata, users, and conversation history cannot be persisted. |
| min_instance_count | 1 | High | Scale-to-zero requires the pod and the GCS Fuse mount to re-initialise on each cold start (30–60 s). Any in-flight AI operations during scale-down are lost. |
| timeout_seconds | 300 | High | Long document ingestion operations or slow LLM completions cause 504 errors if the timeout is exceeded. Increase to 600–3600 for document-heavy or slow-LLM deployments. |
| enable_nfs | false | Medium | Required for multi-replica deployments with shared file access. Without NFS or a shared PVC, each pod has an isolated storage view and cross-pod document access is impossible. |
| workload_type | null (auto-select) | Medium | Setting stateful_pvc_enabled = true auto-selects StatefulSet. Setting both stateful_pvc_enabled = true and workload_type = "Deployment" fails at plan time with a validation error. |
| enable_redis | false | Low | Optional for AnythingLLM. If enable_redis = true, redis_host must be resolvable from the pod or startup fails. |
| backup_schedule | "" (disabled) | High | Without automated PostgreSQL backups, workspace metadata (users, workspaces, settings) is unprotected. Enable for production. |
| enable_image_mirroring | true | Medium | Disabling pulls from the upstream registry (rate-limited in CI/CD environments). Keep enabled in production. |
| application_version | "latest" | Medium | Unpinned versions risk schema-breaking upgrades. Pin to a specific release tag for production stability. |
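
Several of these pitfalls can be addressed together in one variables file. The following `terraform.tfvars` fragment is a sketch under stated assumptions: the version tag is a made-up placeholder, and the timeout is one illustrative value above the default rather than a recommendation from the module authors.

```hcl
# terraform.tfvars fragment (sketch): production-leaning overrides.
application_version = "1.0.0" # hypothetical release tag; pin instead of "latest"
timeout_seconds     = 900     # illustrative; raised from the default 300 for long ingestion

container_resources = {
  cpu_limit    = "2000m" # documented default
  memory_limit = "4Gi"   # documented default; lowering it risks OOM during ingestion
}

backup_schedule        = "0 2 * * *" # daily at 02:00 UTC
enable_cloudsql_volume = true        # keep the Auth Proxy sidecar
enable_image_mirroring = true        # avoid upstream registry rate limits
```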