
Qdrant on Cloud Run — Lab Guide

📖 Configuration Guide

Overview

Estimated time: 1–2 hours

Qdrant is a high-performance vector database and similarity search engine built in Rust. This lab deploys Qdrant on Google Cloud Run backed by a Cloud Storage bucket for persistent collection storage. Cloud Run provides serverless hosting with scale-to-zero capability.

What the Module Automates

  • Cloud Run v2 (Gen2) service with GCS FUSE volume mount
  • Cloud Storage bucket for Qdrant collection data (/qdrant/storage)
  • Artifact Registry repository and Cloud Build image pipeline
  • Secret Manager secret for API key (when enabled)
  • Serverless VPC Access / Direct VPC Egress
  • Cloud Run IAM and service account bindings
  • Cloud Monitoring uptime checks (/readyz)
  • Automated backup Cloud Run Job

What You Do Manually

  • Note the Cloud Run service URL from the RAD UI deployment panel
  • Retrieve the API key from Secret Manager (if enabled)
  • Connect to Qdrant using the Python client or REST API
  • Create collections and upsert vectors with payload
  • Run filtered similarity searches
  • Review logs in Cloud Logging

CLI and REST API Overview

Tool | Purpose
gcloud | Access secrets, inspect Cloud Run services, view logs
curl | Direct Qdrant REST API calls

Install: Google Cloud SDK


Prerequisites

  1. A GCP project with billing enabled.
  2. The Services GCP module deployed in the same project (provides VPC and networking).
  3. The following APIs enabled (Services GCP handles this):
    • run.googleapis.com
    • secretmanager.googleapis.com
    • artifactregistry.googleapis.com
    • cloudbuild.googleapis.com
    • storage.googleapis.com
  4. gcloud authenticated: gcloud auth login (also run gcloud auth application-default login if you use Google Cloud client libraries).
  5. Python 3.9+ and pip for the Qdrant client steps.

Phase 1 — Deploy Infrastructure [AUTOMATED]

Step 1.1 — Configure Variables

Variable | Required | Default | Description
project_id | Yes | (none) | GCP project ID
tenant_deployment_id | No | "demo" | Short deployment identifier
deployment_id | No | "" | Auto-generated suffix
region | No | "us-central1" | GCP region
application_name | No | "qdrant" | Base name for Cloud Run service and resources
application_version | No | "latest" | Qdrant Docker image tag (e.g., "v1.9.0")
deploy_application | No | true | Set false to provision infrastructure without deploying
cpu_limit | No | "1000m" | CPU per Cloud Run instance
memory_limit | No | "1Gi" | Memory per Cloud Run instance
min_instance_count | No | 1 | Min instances (1 avoids cold starts during index loading)
max_instance_count | No | 1 | Max instances (keep at 1 — single-writer)
enable_api_key | No | false | Generate and store API key in Secret Manager
ingress_settings | No | "internal" | "internal" (VPC only) or "all" (requires API key)
backup_schedule | No | "0 2 * * *" | Cron schedule for automated backups
backup_retention_days | No | 7 | Days to retain backup files
support_users | No | [] | Email addresses for monitoring alerts
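
Two constraints in the table are easy to overlook: "all" ingress is meant to be paired with an API key, and the instance counts are expected to stay at 1. The sketch below checks a settings dictionary against those notes before you fill in the form. It is a hypothetical helper, not part of the module: check_settings and the warning texts are assumptions made for illustration, and the defaults mirror the table above.

```python
def check_settings(settings):
    """Return warnings for combinations the variables table advises against.

    `settings` is a plain dict keyed by the variable names from the table.
    This helper is illustrative only; the module does not ship it.
    """
    warnings = []

    ingress = settings.get("ingress_settings", "internal")
    api_key_enabled = settings.get("enable_api_key", False)
    if ingress == "all" and not api_key_enabled:
        warnings.append('ingress_settings="all" should be paired with enable_api_key=true')

    if settings.get("max_instance_count", 1) > 1:
        warnings.append("max_instance_count above 1 breaks the single-writer assumption")

    if settings.get("min_instance_count", 1) < 1:
        warnings.append("min_instance_count below 1 allows cold starts during index loading")

    return warnings


# Example: public ingress without an API key produces one warning
for message in check_settings({"ingress_settings": "all"}):
    print(message)
```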

Step 1.2 — Initiate Deployment

Deployment is initiated from the RAD UI. Fill in the variables form and click Deploy.

Approximate deployment durations:

Phase | Duration
Artifact Registry image build (Cloud Build) | 5–8 min
Cloud Run service deployment | 2–3 min
GCS bucket creation | 1 min
Total | 8–12 min

Step 1.3 — Record Outputs

Output | Description
service_url | HTTPS URL of the Qdrant Cloud Run service
service_name | Cloud Run service name
deployment_id | Unique deployment identifier
storage_buckets | Qdrant storage bucket name

Set shell variables for use in later steps:

export PROJECT="your-gcp-project-id"
export REGION="us-central1"

# Discover the Cloud Run service
export SERVICE=$(gcloud run services list \
  --project=${PROJECT} \
  --region=${REGION} \
  --format="value(metadata.name)" \
  --filter="metadata.name~qdrant" \
  --limit=1)

# Look up the service URL
export SERVICE_URL=$(gcloud run services describe ${SERVICE} \
  --project=${PROJECT} \
  --region=${REGION} \
  --format="value(status.url)")

echo "Qdrant URL: ${SERVICE_URL}"

Phase 2 — Verify Deployment [MANUAL]

Step 2.1 — Check the Readiness Endpoint

curl -s "${SERVICE_URL}/readyz"

Expected result: HTTP 200 with a short plain-text status body, for example:

all shards are ready

If the response returns a 503, Cloud Run may still be starting. Wait 30 seconds and retry.
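
The wait-and-retry advice above can be scripted. The following is a minimal sketch, assuming the service URL is reachable from where you run it. The names http_status and wait_until_ready and their parameters are hypothetical, chosen for this example. Only the Python standard library is used.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code of a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 503 while the container is still starting
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout


def wait_until_ready(url, attempts=5, delay=30.0, fetch=http_status, sleep=time.sleep):
    """Poll `url` until it answers 200, pausing `delay` seconds between tries.

    `fetch` and `sleep` are injectable so the loop can be exercised offline.
    """
    for attempt in range(1, attempts + 1):
        if fetch(url) == 200:
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

A typical call would be wait_until_ready(os.environ["SERVICE_URL"] + "/readyz") after importing os. The 30-second default delay matches the guidance above.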

gcloud equivalent:

gcloud run services describe ${SERVICE} \
  --region=${REGION} \
  --project=${PROJECT} \
  --format="value(status.conditions)"

Step 2.2 — Check the Liveness Endpoint

curl -s "${SERVICE_URL}/livez"

Expected result: HTTP 200 with a short plain-text status body.

Note: The liveness endpoint (/livez) and readiness endpoint (/readyz) are distinct in Qdrant. The startup probe uses /readyz; the liveness probe uses /livez. This prevents spurious container restarts during large collection loading.

Step 2.3 — Check Qdrant Health Details

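curl -s -i "${SERVICE_URL}/healthz"

Expected result: HTTP 200 with a short plain-text status body (the health endpoints do not return JSON).

Cloud Run Admin REST API (inspect the service resource):

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://run.googleapis.com/v2/projects/${PROJECT}/locations/${REGION}/services/${SERVICE}"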

Step 2.4 — Retrieve API Key (if enabled)

# A repeated --filter flag overrides the earlier one, so combine both conditions with AND
export API_SECRET=$(gcloud secrets list \
  --project=${PROJECT} \
  --filter="name~qdrant AND name~api-key" \
  --format="value(name)" \
  --limit=1)

export QDRANT_API_KEY=$(gcloud secrets versions access latest \
  --secret="${API_SECRET}" \
  --project=${PROJECT})

echo "API key retrieved: ${#QDRANT_API_KEY} characters"
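
The retrieved key is sent to Qdrant in an api-key request header, as the curl examples in this guide do. The sketch below builds such a request with the Python standard library. The helper names qdrant_headers and build_request are hypothetical, and the example URL is a placeholder.

```python
import json
import urllib.request


def qdrant_headers(api_key=None):
    """Return request headers, adding the api-key header only when a key is set."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["api-key"] = api_key
    return headers


def build_request(base_url, path, api_key=None, body=None, method="GET"):
    """Build (but do not send) a urllib request for a Qdrant REST path."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=data,
        headers=qdrant_headers(api_key),
        method=method,
    )


# Placeholder URL and key; pass the real values from SERVICE_URL and QDRANT_API_KEY
request = build_request("https://example.invalid/", "/collections", api_key="example-key")
print(request.get_method(), request.full_url)
```

Sending the request is then a matter of passing it to urllib.request.urlopen. Leaving api_key as None covers deployments where enable_api_key is false.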

Phase 3 — Create Collections [MANUAL]

Step 3.1 — Install the Qdrant Python Client

pip install qdrant-client

Step 3.2 — Connect to Qdrant

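import os

from qdrant_client import QdrantClient

# Read the values exported in the earlier shell steps;
# Python does not expand ${...} placeholders inside strings
service_url = os.environ["SERVICE_URL"]
api_key = os.environ.get("QDRANT_API_KEY")  # None when no API key is configured

# With api_key=None the client connects without authentication
client = QdrantClient(url=service_url, api_key=api_key)

# Verify connection
print(client.get_collections())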

Expected result: An empty CollectionsResponse is returned.

Step 3.3 — Create a Collection

from qdrant_client.models import Distance, VectorParams

client.create_collection(
    collection_name="my_knowledge_base",
    vectors_config=VectorParams(
        size=384,  # dimensionality must match your embedding model
        distance=Distance.COSINE
    )
)
print("Collection created")

REST API equivalent:

curl -s -X PUT "${SERVICE_URL}/collections/my_knowledge_base" \
  -H "Content-Type: application/json" \
  -H "api-key: ${QDRANT_API_KEY}" \
  -d '{
    "vectors": {
      "size": 384,
      "distance": "Cosine"
    }
  }' | python3 -m json.tool

gcloud — verify the collection is persisted in GCS:

export STORAGE_BUCKET=$(gcloud storage buckets list \
  --project=${PROJECT} \
  --filter="name~qdrant" \
  --format="value(name)" \
  --limit=1)

gcloud storage ls gs://${STORAGE_BUCKET}/

Expected result: Qdrant's WAL and collection directory structure appear in the bucket.

Step 3.4 — Upsert Points with Payload

import numpy as np
from qdrant_client.models import PointStruct

# Generate sample vectors
vectors = np.random.rand(5, 384).tolist()

points = [
    PointStruct(
        id=i,
        vector=vectors[i],
        payload={
            "source": f"document_{i}",
            "category": ["database", "cloud", "ai", "kubernetes", "search"][i],
            "score": round(float(np.random.random()), 3)
        }
    )
    for i in range(5)
]

client.upsert(
    collection_name="my_knowledge_base",
    points=points
)
print(f"Upserted {len(points)} points")

Expected result: 5 points added to the collection.
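
Five points fit comfortably in one request. For larger datasets, one option is to split the points into fixed-size batches and upsert each batch in turn. The sketch below is an assumption about how you might do that, not something the module or the Qdrant client requires. The names chunked and upsert_in_batches are hypothetical, and the only client call used is the client.upsert call shown above.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def upsert_in_batches(client, collection_name, points, batch_size=100):
    """Upsert `points` batch by batch and return how many points were sent."""
    total = 0
    for batch in chunked(points, batch_size):
        client.upsert(collection_name=collection_name, points=batch)
        total += len(batch)
    return total


# chunked works on any list, so it can be checked without a Qdrant connection
print(list(chunked([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```

With the objects from this step, the call would be upsert_in_batches(client, "my_knowledge_base", points, batch_size=2).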


Phase 4 — Run Searches [MANUAL]

Step 4.1 — Run a Similarity Search

import numpy as np

# Random query vector with the same dimensionality as the collection
query_vector = np.random.rand(384).tolist()

results = client.search(
    collection_name="my_knowledge_base",
    query_vector=query_vector,
    limit=3,
    with_payload=True
)

for result in results:
    print(f" ID: {result.id}, Score: {result.score:.4f}, Category: {result.payload['category']}")

Expected result: The 3 nearest vectors are returned with their similarity scores (higher = more similar for cosine distance).
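
To make the "higher = more similar" reading concrete, here is a small worked example of cosine similarity in plain Python. It illustrates the metric only and is not how Qdrant computes scores internally. The function name and the two-dimensional vectors are chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


query = [1.0, 0.0]
print(cosine_similarity(query, [2.0, 0.0]))   # 1.0  (same direction)
print(cosine_similarity(query, [0.0, 3.0]))   # 0.0  (orthogonal)
print(cosine_similarity(query, [-1.0, 0.0]))  # -1.0 (opposite direction)
```

Vector length does not affect the result: [2.0, 0.0] scores the same as [1.0, 0.0] would, because only direction matters.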

Step 4.2 — Run a Filtered Search

from qdrant_client.models import FieldCondition, Filter, Range

# Keep only points whose payload "score" is at least 0.5
results = client.search(
    collection_name="my_knowledge_base",
    query_vector=query_vector,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="score",
                range=Range(gte=0.5)
            )
        ]
    ),
    limit=3,
    with_payload=True
)
print(f"Filtered results: {len(results)}")

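REST API equivalent:

# The query vector must have 384 values to match the collection's vector size
QUERY_VECTOR=$(python3 -c 'import json, random; print(json.dumps([random.random() for _ in range(384)]))')

curl -s -X POST "${SERVICE_URL}/collections/my_knowledge_base/points/search" \
  -H "Content-Type: application/json" \
  -H "api-key: ${QDRANT_API_KEY}" \
  -d '{
    "vector": '"${QUERY_VECTOR}"',
    "filter": {
      "must": [{"key": "score", "range": {"gte": 0.5}}]
    },
    "limit": 3,
    "with_payload": true
  }' | python3 -m json.tool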
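
To see which points a range condition on a payload field selects, the sketch below applies the same bounds to plain dictionaries. It is a local approximation of the filter's meaning, not Qdrant's implementation. The helper names and the sample payloads are hypothetical. The bound names gte, lte, gt and lt follow Qdrant's range condition.

```python
def in_range(value, gte=None, lte=None, gt=None, lt=None):
    """Return True when a numeric value satisfies every bound that is set."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False  # missing or non-numeric values never match here
    if gte is not None and value < gte:
        return False
    if lte is not None and value > lte:
        return False
    if gt is not None and value <= gt:
        return False
    if lt is not None and value >= lt:
        return False
    return True


def filter_payloads(payloads, key, **bounds):
    """Keep the payload dicts whose `key` value falls inside the bounds."""
    return [p for p in payloads if in_range(p.get(key), **bounds)]


# Hypothetical payloads shaped like the ones upserted in Phase 3
sample = [
    {"source": "document_0", "score": 0.12},
    {"source": "document_1", "score": 0.5},
    {"source": "document_2", "score": 0.87},
    {"source": "document_3"},  # no score field
]
kept = filter_payloads(sample, "score", gte=0.5)
print([p["source"] for p in kept])  # ['document_1', 'document_2']
```

Because the sample scores in this lab are random, the number of filtered results from the real search will vary between runs and can be lower than the limit.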

Step 4.3 — Scroll Through All Points

results, next_offset = client.scroll(
    collection_name="my_knowledge_base",
    limit=10,
    with_payload=True,
    with_vectors=False
)
print(f"Retrieved {len(results)} points")
for point in results:
    print(f" {point.id}: {point.payload}")
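
The call above returns one page plus next_offset. For collections larger than one page, the offset can be fed back into the next call until it comes back as None. The generator below is a minimal sketch of that loop. scroll_all is a hypothetical name, and the sketch assumes the client's scroll method accepts an offset argument and returns a (points, next_offset) pair as shown above.

```python
def scroll_all(client, collection_name, page_size=10):
    """Yield every point in a collection by following next_offset page by page."""
    offset = None  # None asks for the first page
    while True:
        points, offset = client.scroll(
            collection_name=collection_name,
            limit=page_size,
            offset=offset,
            with_payload=True,
            with_vectors=False,
        )
        yield from points
        if offset is None:  # no further pages
            break
```

Usage with the client from Phase 3: for point in scroll_all(client, "my_knowledge_base"): print(point.id, point.payload).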

Step 4.4 — Collection Info

info = client.get_collection("my_knowledge_base")
print(f"Points count: {info.points_count}")
print(f"Vectors count: {info.vectors_count}")
print(f"Indexed vectors: {info.indexed_vectors_count}")

REST API equivalent:

curl -s "${SERVICE_URL}/collections/my_knowledge_base" \
  -H "api-key: ${QDRANT_API_KEY}" | python3 -m json.tool

Phase 5 — Explore Cloud Logging [MANUAL]

Step 5.1 — View Qdrant Application Logs

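Navigate to Logging > Logs Explorer in the Cloud Console and run the following query, replacing ${SERVICE} and ${REGION} with your service name and region (the Logs Explorer does not expand shell variables):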

resource.type="cloud_run_revision"
resource.labels.service_name="${SERVICE}"
resource.labels.location="${REGION}"

gcloud equivalent:

gcloud logging read \
  'resource.type="cloud_run_revision" AND resource.labels.service_name="'${SERVICE}'"' \
  --project=${PROJECT} \
  --limit=50 \
  --format="table(timestamp, textPayload)"

Expected result: Qdrant startup logs appear, showing GCS FUSE mount initialization and the HTTP server startup on port 6333.
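
The same filter can be assembled programmatically, which avoids the nested shell quoting. The sketch below only builds the filter string. build_log_filter is a hypothetical helper, and the optional min_severity clause is an addition for illustration that uses the Logging query language's severity comparison.

```python
def build_log_filter(service, region=None, min_severity=None):
    """Build a Cloud Logging filter for one Cloud Run service's revision logs."""
    clauses = [
        'resource.type="cloud_run_revision"',
        f'resource.labels.service_name="{service}"',
    ]
    if region:
        clauses.append(f'resource.labels.location="{region}"')
    if min_severity:
        clauses.append(f"severity>={min_severity}")
    return " AND ".join(clauses)


# "qdrant-demo" is a placeholder service name
print(build_log_filter("qdrant-demo", region="us-central1"))
```

The resulting string can be pasted into the Logs Explorer or passed as the first argument to gcloud logging read.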

Step 5.2 — Review Uptime Check

Navigate to Monitoring > Uptime checks in the Cloud Console.

Expected result: The uptime check targeting /readyz shows Passing from multiple global locations.


Phase 6 — Cloud Run Features [MANUAL]

Step 6.1 — Inspect the Cloud Run Service

gcloud run services describe ${SERVICE} \
  --region=${REGION} \
  --project=${PROJECT}

Expected result: Service shows Ready. Container spec shows the GCS FUSE volume mount at /qdrant/storage, environment variables (QDRANT__STORAGE__STORAGE_PATH, QDRANT__SERVICE__HTTP_PORT), and probe configuration.

Step 6.2 — View Revisions

gcloud run revisions list \
  --service=${SERVICE} \
  --region=${REGION} \
  --project=${PROJECT}

Expected result: The active revision shows 100% traffic.

Step 6.3 — Create a Collection Snapshot

Qdrant supports collection snapshots for backup and migration:

snapshot = client.create_snapshot(collection_name="my_knowledge_base")
print(f"Snapshot created: {snapshot.name}")

REST API equivalent:

curl -s -X POST "${SERVICE_URL}/collections/my_knowledge_base/snapshots" \
  -H "api-key: ${QDRANT_API_KEY}" | python3 -m json.tool

Phase 7 — Undeploy [AUTOMATED]

When you are finished, return to the RAD UI, navigate to your deployment, and click Undeploy to remove all resources.

Approximate undeploy duration: 5–10 minutes.

Warning: This permanently deletes the Qdrant storage bucket and all stored collections. Use Qdrant's snapshot API to export collections before undeploying.
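
Exporting a snapshot means copying the snapshot file off the service before the bucket is deleted. The sketch below downloads one snapshot over HTTP, assuming the snapshot download path /collections/{collection_name}/snapshots/{snapshot_name} of the Qdrant REST API and a snapshot name obtained in Step 6.3. The helper names and the destination path are hypothetical.

```python
import shutil
import urllib.parse
import urllib.request


def snapshot_url(base_url, collection_name, snapshot_name):
    """Build the download URL for one collection snapshot."""
    collection = urllib.parse.quote(collection_name, safe="")
    snapshot = urllib.parse.quote(snapshot_name, safe="")
    return f"{base_url.rstrip('/')}/collections/{collection}/snapshots/{snapshot}"


def download_snapshot(base_url, collection_name, snapshot_name, dest_path, api_key=None):
    """Stream a snapshot to a local file and return the file path."""
    headers = {"api-key": api_key} if api_key else {}
    request = urllib.request.Request(
        snapshot_url(base_url, collection_name, snapshot_name),
        headers=headers,
    )
    # Copy in chunks so large snapshots are not held in memory
    with urllib.request.urlopen(request) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path
```

For example, download_snapshot(service_url, "my_knowledge_base", snapshot.name, "my_knowledge_base.snapshot", api_key=api_key) would save the snapshot created in Step 6.3 to the current directory.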


Summary

Action | Phase | Automated
Cloud Run service provisioning | 1 | Yes
GCS storage bucket creation | 1 | Yes
Secret Manager API key | 1 | Yes (if enabled)
Container image build (Cloud Build) | 1 | Yes
IAM and service account bindings | 1 | Yes
Note service URL from RAD UI | 2 | No
Verify Qdrant readiness endpoint | 2 | No
Retrieve API key | 2 | No
Install Python client | 3 | No
Create collections | 3 | No
Upsert vectors with payload | 3 | No
Run similarity searches | 4 | No
Run filtered searches | 4 | No
Review Cloud Logging | 5 | No
Create collection snapshots | 6 | No
Undeploy infrastructure | 7 | Yes