Professional Cloud DevOps Engineer (PDE) Certification Lab Map

The Professional Cloud DevOps Engineer certification validates your ability to build and manage CI/CD pipelines, apply SRE practices (SLOs, error budgets, incident response), implement observability, and optimize service performance and cost on Google Cloud. The RAD platform's four foundation modules — Services_GCP (shared platform infrastructure), App_CloudRun (Cloud Run v2 deployment engine), App_GKE (GKE Autopilot deployment engine), and App_Common (shared building blocks for Cloud Deploy, monitoring, and dashboards) — give you a live, inspectable lab: every Cloud Build trigger, Cloud Deploy stage, alert policy, and traffic split discussed in this guide is real infrastructure you can deploy, break, and fix.

How to use this guide

  • Pick a deployment profile below and deploy it through your deployment portal.
  • Work through the matching section guide (PDE_Section_N_Exploration_Guide.md) — each subsection has hands-on steps with real gcloud, kubectl, and tofu commands.
  • Use the coverage legend honestly: 📘 topics (most pure SRE theory and incident management process) must be studied outside the platform; the section guides give pointers.
  • The platform itself is part of the lab — Section 1 treats the deployment modules as the IaC artifact the exam expects you to reason about.

Coverage legend

| Symbol | Meaning |
| --- | --- |
| (none) | Fully demonstrated: deploy it, see it, modify it in the RAD platform |
| 🟡 | Partially demonstrated: the modules touch the concept; supplement with docs |
| 📘 | Concept-only: not implemented by the modules; study pointers provided |

Deployment profiles

Profile: Pipeline engineer

Purpose: end-to-end CI/CD — GitHub push → Kaniko build → Artifact Registry → Binary Authorization attestation → Cloud Deploy promotion with a prod approval gate. Modules: App_CloudRun (optionally on top of Services_GCP).

| Variable | Value |
| --- | --- |
| enable_cicd_trigger | true |
| github_repository_url | https://github.com/<you>/<repo> |
| github_token | a PAT with repo + admin:repo_hook (first apply only) |
| enable_cloud_deploy | true |
| cicd_enable_cloud_deploy | true |
| enable_binary_authorization | true |
| binauthz_evaluation_mode | REQUIRE_ATTESTATION |
| support_users | ["you@example.com"] |

Estimated incremental cost: low–moderate. Cloud Build minutes and Artifact Registry storage dominate; Cloud Deploy itself adds no direct charge for Cloud Run targets, but you still pay for the per-stage Cloud Run services.
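
If you drive the module directly with tofu rather than through the portal, one way to apply this profile is to write the table as a JSON variables file; OpenTofu automatically loads files whose names end in .auto.tfvars.json. The sketch below is an assumption about workflow, not part of the RAD modules: the function name render_tfvars_json is hypothetical, the variable types (booleans, a string, a list of strings) are inferred from the table, and github_token is deliberately left out so it can be supplied through the TF_VAR_github_token environment variable instead of a file on disk.

```python
import json

# Values copied from the "Pipeline engineer" table. The repository URL keeps
# the guide's placeholders and must be edited before use.
# github_token is intentionally absent: pass it as TF_VAR_github_token.
PIPELINE_ENGINEER = {
    "enable_cicd_trigger": True,
    "github_repository_url": "https://github.com/<you>/<repo>",
    "enable_cloud_deploy": True,
    "cicd_enable_cloud_deploy": True,
    "enable_binary_authorization": True,
    "binauthz_evaluation_mode": "REQUIRE_ATTESTATION",
    "support_users": ["you@example.com"],
}


def render_tfvars_json(profile: dict) -> str:
    """Serialize a profile as the body of a *.auto.tfvars.json file."""
    return json.dumps(profile, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    # Print instead of writing a file so the sketch has no side effects.
    print(render_tfvars_json(PIPELINE_ENGINEER), end="")
```

Redirecting the output to a file such as pipeline_engineer.auto.tfvars.json (a hypothetical name) next to the module configuration lets tofu plan pick the values up without extra -var-file flags.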

Profile: GKE release engineer

Purpose: rolling updates, HPA/VPA, PodDisruptionBudgets, and Cloud Deploy to GKE namespaces. Modules: Services_GCP + App_GKE.

| Variable | Value |
| --- | --- |
| create_google_kubernetes_engine (Services_GCP) | true |
| gke_cluster_mode (Services_GCP) | AUTOPILOT (default) |
| min_instance_count (App_GKE) | 2 |
| max_instance_count (App_GKE) | 4 |
| enable_pod_disruption_budget (App_GKE) | true (default) |
| enable_topology_spread (App_GKE) | true |
| enable_cicd_trigger + enable_cloud_deploy (App_GKE) | true (optional, for the GKE CD path) |

Estimated incremental cost: moderate–high — GKE Autopilot bills per pod resource request plus a cluster management fee; multiple replicas multiply the cost.

Profile: Observability baseline

Purpose: notification channels, threshold alert policies, auto-generated dashboards, and full audit logging to explore in Logs Explorer. Modules: Services_GCP + either application engine.

| Variable | Value |
| --- | --- |
| support_users (app module) | ["you@example.com"] |
| alert_policies (app module) | one entry, e.g. on run.googleapis.com/request_count |
| configure_email_notification (Services_GCP) | true |
| notification_alert_emails (Services_GCP) | ["ops@example.com"] |
| alert_cpu_threshold / alert_memory_threshold / alert_disk_threshold (Services_GCP) | 80 (defaults) |
| enable_audit_logging | true |

Estimated incremental cost: low — audit logging (DATA_READ/DATA_WRITE on allServices) is the dominant driver via Cloud Logging ingestion volume.

Profile: Cost-lean serverless

Purpose: scale-to-zero economics, CPU throttling, revision pruning, and Artifact Registry cleanup policies for Section 5. Modules: App_CloudRun only.

| Variable | Value |
| --- | --- |
| min_instance_count | 0 (default) |
| max_instance_count | 3 |
| cpu_always_allocated | false |
| max_revisions_to_retain | 7 (default) |
| delete_untagged_images | true (default) |
| image_retention_days | 30 (default) |

Estimated incremental cost: minimal — the service scales to zero between requests; only storage and per-request compute accrue.
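
To see why cpu_always_allocated = false matters for a low-traffic service, a rough comparison of billable vCPU-seconds can help. This is a simplified sketch, not Cloud Run's billing algorithm: it assumes requests never overlap (so each request's full duration is billed), that the always-allocated case keeps one instance running for the whole period, and it ignores memory, per-request fees, free tier, and startup time. The traffic numbers are made up and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def request_based_vcpu_seconds(requests: int, avg_latency_ms: float,
                               vcpu: float = 1.0) -> float:
    """CPU billed only while a request is being handled."""
    return requests * (avg_latency_ms / 1000.0) * vcpu


def always_allocated_vcpu_seconds(instances: int, days: int,
                                  vcpu: float = 1.0) -> float:
    """CPU billed for the whole lifetime of every running instance."""
    return instances * days * SECONDS_PER_DAY * vcpu


if __name__ == "__main__":
    # Illustrative traffic: 100,000 requests in 30 days, 200 ms each.
    on_demand = request_based_vcpu_seconds(requests=100_000, avg_latency_ms=200)
    always_on = always_allocated_vcpu_seconds(instances=1, days=30)
    print(f"request-based:    {on_demand:,.0f} vCPU-seconds")
    print(f"always allocated: {always_on:,.0f} vCPU-seconds")
    print(f"ratio:            {always_on / on_demand:.1f}x")
```

Multiplying either figure by the current per-vCPU-second rate from the Cloud Run pricing page turns it into a cost estimate; the rates are left out here because they change and differ by region.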

Section 1: Bootstrapping and maintaining a Google Cloud organization (~20% of the exam)

The exam opens with organization-level design: resource hierarchy, IaC discipline, CI/CD architecture choices, and multi-environment management. The RAD modules are themselves the IaC artifact, and the Cloud Deploy stage model is the multi-environment lab.

| Exam topic | Coverage | Where in RAD | Guide |
| --- | --- | --- | --- |
| 1.1 Designing the overall resource hierarchy | 📘 | project-scoped only; resource_labels for governance labels | Section 1 guide |
| 1.2 Managing infrastructure | | the deployment modules themselves; tofu plan drift detection; Cloud Deploy owns the container image while IaC owns the rest; IaC CI checks | Section 1 guide |
| 1.3 Designing a CI/CD architecture stack | | inline Cloud Build trigger, Cloud Deploy delivery pipeline, Binary Authorization | Section 1 guide |
| 1.4 Managing multiple environments | | cloud_deploy_stages (dev/staging/prod), per-stage services and namespaces | Section 1 guide |

Section 2: Building and implementing CI/CD pipelines (~25% of the exam)

The heaviest exam section and the strongest area of the RAD lab: an inline Cloud Build pipeline (Kaniko → attestation → deploy), Artifact Registry with cleanup policies, Binary Authorization, and a real Cloud Deploy pipeline with approvals, automation rules, and rollback.

| Exam topic | Coverage | Where in RAD | Guide |
| --- | --- | --- | --- |
| 2.1 Designing pipelines | | enable_cicd_trigger, Kaniko v1.23.2 build step, Artifact Registry cleanup policies | Section 2 guide |
| 2.2 Implementing and managing pipelines | | cloud_deploy_stages, traffic_split, kubectl set image direct path, revision pruning | Section 2 guide |
| 2.3 Managing pipeline configuration and secrets | | github_token (never in state), secret_environment_variables, enable_auto_password_rotation | Section 2 guide |
| 2.4 Auditing and logging of code and configurations | | Data Access audit logging, Binary Authorization attestations, Cloud Deploy release history | Section 2 guide |
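
The traffic_split lever listed under topic 2.2 can be reasoned about as a map from revision name to traffic percentage. Cloud Run requires the percentages across revisions to total 100; the sketch below checks that invariant and models a rollback as moving all traffic to a known-good revision. The dictionary shape, the function names, and the revision names are assumptions for illustration; the actual traffic_split variable in App_CloudRun may be structured differently.

```python
from typing import Dict


def validate_split(split: Dict[str, int]) -> None:
    """Raise ValueError unless every share is 0-100 and the shares total 100."""
    for revision, percent in split.items():
        if not 0 <= percent <= 100:
            raise ValueError(f"{revision}: {percent} is outside 0-100")
    total = sum(split.values())
    if total != 100:
        raise ValueError(f"traffic percentages total {total}, expected 100")


def roll_back(split: Dict[str, int], stable_revision: str) -> Dict[str, int]:
    """Return a new split that sends all traffic to stable_revision."""
    if stable_revision not in split:
        raise KeyError(stable_revision)
    return {rev: (100 if rev == stable_revision else 0) for rev in split}


if __name__ == "__main__":
    # Hypothetical revision names: 10% canary on the newer revision.
    canary = {"app-00007-abc": 90, "app-00008-def": 10}
    validate_split(canary)
    print(roll_back(canary, "app-00007-abc"))
```

Keeping the zero-percent entry in the result mirrors the idea that the newer revision still exists after a rollback and can receive traffic again once it is fixed.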

Section 3: Applying site reliability engineering practices (~18% of the exam)

SLO/error-budget theory is mostly 📘 — the modules emit the metrics SLIs are built from but do not create SLO objects. Capacity management and incident mitigation, however, are fully hands-on: autoscaling, PDBs, traffic splitting, and instant rollback.
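
Because the modules stop at metrics and do not create SLO objects, the error-budget arithmetic has to be practiced separately. The following is a minimal worked sketch using the standard definition (error budget = 1 - SLO target); the numbers are illustrative and do not come from the RAD platform, and the function names are hypothetical.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of total unavailability the SLO tolerates over the window."""
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining_fraction(slo_target: float, total_requests: int,
                              failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent.

    A negative result means the budget is exhausted.
    """
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures <= 0:
        raise ValueError("no error budget: SLO is 100% or there was no traffic")
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    print(f"{error_budget_minutes(0.999):.1f} minutes per 30 days")
    remaining = budget_remaining_fraction(0.999, 1_000_000, 250)
    print(f"{remaining:.0%} of the budget remains")
```

A 99.9% target over 30 days works out to roughly 43.2 minutes of allowed downtime, which is the kind of figure the exam expects you to derive quickly.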

| Exam topic | Coverage | Where in RAD | Guide |
| --- | --- | --- | --- |
| 3.1 Balancing change, velocity, and reliability of the service | 📘 | threshold alerts as proto-SLIs; no SLO/error-budget objects | Section 3 guide |
| 3.2 Managing service lifecycle | | min_instance_count/max_instance_count, GKE HPA (CPU 70% / memory 80%), enable_vertical_pod_autoscaling | Section 3 guide |
| 3.3 Mitigating incident impact on users | | traffic_split rollback, Cloud Deploy rollback, enable_pod_disruption_budget, probes, Cloud Armor rate limiting | Section 3 guide |
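
The HPA targets listed under topic 3.2 (CPU 70%, memory 80%) feed the standard Kubernetes scaling rule: desired replicas = ceil(current replicas × current utilization / target utilization), evaluated per metric with the largest result winning. The sketch below applies that rule with the replica bounds from the GKE release engineer profile (2 to 4). It is a simplification: it ignores the controller's tolerance band, stabilization windows, and pods that are not yet ready, and the function name is hypothetical.

```python
import math
from typing import Dict

# Target utilization percentages taken from the Section 3 table.
TARGETS = {"cpu": 70, "memory": 80}


def desired_replicas(current_replicas: int,
                     utilization: Dict[str, float],
                     targets: Dict[str, float] = TARGETS,
                     min_replicas: int = 2,
                     max_replicas: int = 4) -> int:
    """Apply the HPA ratio rule per metric, take the max, clamp to bounds."""
    proposals = [
        math.ceil(current_replicas * utilization[metric] / target)
        for metric, target in targets.items()
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


if __name__ == "__main__":
    # CPU is over target, memory is under: CPU drives the scale-up.
    print(desired_replicas(2, {"cpu": 105, "memory": 60}))
    # Heavy load asks for more than the ceiling, so the result is clamped.
    print(desired_replicas(3, {"cpu": 140, "memory": 60}))
```

Working a few cases by hand against this rule is a quick way to predict what kubectl get hpa should show before you generate load against the deployment.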

Section 4: Implementing observability practices and troubleshooting issues (~25% of the exam)

The second-heaviest section. The modules provision notification channels, fixed and custom alert policies, per-platform dashboards, GKE workload logging, managed Prometheus, synthetic uptime checks (uptime_check_config — created for publicly reachable endpoints, with a check_passed alert policy), and (optionally) full data-access audit logs.

| Exam topic | Coverage | Where in RAD | Guide |
| --- | --- | --- | --- |
| 4.1 Instrumenting and collecting telemetry | 🟡 | GKE workload logging/monitoring + managed Prometheus on the Services_GCP cluster; enable_audit_logging; uptime_check_config synthetic checks | Section 4 guide |
| 4.2 Troubleshooting and analyzing issues | 🟡 | Logs Explorer over module-deployed workloads; revision/Pod diagnostics; Cloud Logging build logs | Section 4 guide |
| 4.3 Managing metrics, dashboards, and alerts | | the monitoring layer (90% CPU/memory alerts, renotify 1800s), alert_policies, auto-generated dashboards, Services_GCP threshold alerts | Section 4 guide |

Section 5: Optimizing performance and cost (~12% of the exam)

Performance tuning (execution environment, CPU allocation, resource requests) is fully demonstrated; FinOps tooling (billing export, Recommender, CUDs) is 📘, with the modules providing the levers those tools would recommend pulling.

| Exam topic | Coverage | Where in RAD | Guide |
| --- | --- | --- | --- |
| 5.1 Collecting performance information in Google Cloud | 🟡 | execution_environment, cpu_always_allocated, container_resources, managed Prometheus; Trace/Profiler 📘 | Section 5 guide |
| 5.2 Implementing FinOps practices for optimizing resource utilization and costs | 🟡 | scale-to-zero, request-only CPU, VPA, AR cleanup policies, GKE cost allocation; billing export/Recommender 📘 | Section 5 guide |