PSE Certification Preparation Guide: Section 3 — Ensuring data protection (~23% of the exam)


This guide helps candidates preparing for the Google Cloud Professional Cloud Security Engineer (PSE) certification explore Section 3 of the exam through the lens of the Tech Equity RAD platform at https://radmodules.dev. Three modules are relevant to this section: GCP Services, which establishes the foundational shared infrastructure; App CloudRun, which deploys serverless containerised applications on Cloud Run; and App GKE, which deploys containerised workloads on GKE Autopilot.

You interact with each module by configuring its variables in the RAD UI deployment portal, then exploring the resulting infrastructure in the GCP Console. This guide maps each exam topic to the relevant variables you can configure and the console locations where you can observe the outcomes. It also highlights PSE objectives that are not currently implemented by these modules, providing guidelines for self-guided research and exploration.


3.1 Protecting sensitive data and preventing data loss

Secret Manager and Private Data Service Access

Concept: Securing application secrets with automated rotation and restricting access to data services to the private network only.

In the RAD UI:

  • Secret Manager with Automated Rotation: The enable_auto_password_rotation variable (Group 11 for App CloudRun, Group 17 for App GKE) configures Cloud Scheduler to trigger a Cloud Run Job or Kubernetes CronJob on a defined schedule. The job generates a new credential, updates the database user, creates a new secret version in Secret Manager, and marks it as primary — all without application code changes.
  • Private Cloud SQL: Cloud SQL instances are provisioned with private IP only, meaning only workloads within the VPC can reach the database port. Public internet access is disabled, eliminating direct database exposure.

Console Exploration: Navigate to Security > Secret Manager. Select a secret generated by the module and review its Versions tab — observe how each rotation creates a new version while prior versions remain available for graceful rollover. Click the Permissions tab to confirm the fine-grained resource-level IAM binding (only the workload service account holds roles/secretmanager.secretAccessor on this specific secret). Navigate to SQL > [instance] > Connections > Networking to confirm that only a Private IP is configured and no Public IP is shown.

Real-world example: A SaaS company rotates its Cloud SQL database password every 7 days automatically. When the rotation job runs at 03:00 UTC, it generates a cryptographically random 32-character password, updates the app_user database login, writes the new password as a new version in Secret Manager, and retires the previous version. The application fetches the current primary version on the next connection pool refresh — achieving zero-downtime credential rotation with a complete audit trail in Cloud Logging showing every secret access, every version creation, and every rotation job execution.
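
The rotation flow described above can be sketched in a few lines of Python. This is a conceptual sketch only: `update_db_user` and `add_secret_version` are hypothetical stand-ins for the real Cloud SQL and Secret Manager calls, which require live GCP resources.

```python
import secrets
import string

# Character set for the generated credential; the module's actual
# password policy may differ.
ALPHABET = string.ascii_letters + string.digits

def generate_password(length: int = 32) -> str:
    """Generate a cryptographically random password, as the rotation job might."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def rotate(update_db_user, add_secret_version) -> str:
    """One rotation cycle: new credential -> DB update -> new secret version.

    The two callables stand in for the real operations (ALTER USER on
    Cloud SQL; secretmanager addSecretVersion) so the flow is testable
    without live infrastructure.
    """
    new_password = generate_password()
    update_db_user(new_password)      # 1. update the database login
    add_secret_version(new_password)  # 2. write the new primary secret version
    return new_password
```

The ordering matters: the database login is updated before the new secret version is published, so a client that fetches the new version is guaranteed the credential already works.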

💡 Additional Data Protection Objectives & Learning Guidelines

  • Sensitive Data Protection (SDP) — Discovering and Redacting PII: Research Cloud DLP (rebranded as Sensitive Data Protection). SDP can scan Cloud Storage buckets, BigQuery tables, Cloud SQL databases, and Datastore for over 150 built-in infoType detectors (names, credit card numbers, national IDs, passport numbers, medical record numbers, etc.) and report findings to Security Command Center. For real-time streaming pipelines (e.g., Pub/Sub message payloads), use SDP's de-identification transformations inline. Navigate to Security > Sensitive Data Protection to explore inspection templates and job creation.
  • Pseudonymization and Format-Preserving Encryption: Study two key SDP de-identification techniques: CryptoDeterministicConfig (deterministic tokenization — the same input always produces the same token, enabling correlation across datasets while protecting the original value) and CryptoReplaceFfxFpeConfig (format-preserving encryption — the tokenized output maintains the same format as the original, e.g., a 16-digit credit card number is replaced with another 16-digit number). Format-preserving encryption is critical for systems where downstream processes expect a specific data format and cannot be modified.
  • Restricting Access to GCP Data Services: Beyond network-level restrictions, study how to apply fine-grained IAM controls on BigQuery datasets (roles/bigquery.dataViewer on a specific dataset rather than roles/bigquery.admin at the project level), Cloud Storage buckets (resource-level IAM with Uniform Bucket-Level Access), and Cloud SQL instances (grant roles/cloudsql.client to specific service accounts). Combine IAM restrictions with VPC Service Controls perimeters to create defence-in-depth: IAM controls identity-based access, VPC SC controls location-based access.
  • Protecting Compute Instance Metadata: The GCE metadata server at 169.254.169.254 is accessible from inside any VM or container and can expose the instance's service account OAuth token, startup scripts, and custom metadata values. For GKE, Workload Identity blocks pods from directly querying the node's metadata server, preventing a compromised pod from stealing the node's service account token. For Compute Engine VMs, use the --metadata-from-file flag with caution and never store sensitive values (passwords, private keys) in instance metadata.
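
The deterministic-tokenization property described above can be illustrated with a toy pseudonymizer. Note the hedge: SDP's CryptoDeterministicConfig uses AES-SIV internally; the HMAC-SHA256 scheme below, with its hypothetical `TOKEN_KEY`, only demonstrates the behaviour (same input and key always yield the same token), not the real transformation.

```python
import hashlib
import hmac

# Hypothetical per-dataset key material; in SDP this would be a
# Cloud KMS-wrapped crypto key referenced by the de-identify template.
TOKEN_KEY = b"per-dataset-secret-key"

def pseudonymize(value: str, key: bytes = TOKEN_KEY) -> str:
    """Deterministic tokenization: same input + same key -> same token,
    enabling correlation across datasets while hiding the original value."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]
```

Because the mapping is deterministic, two datasets tokenized with the same key can still be joined on the token column, which is exactly the analytical property that distinguishes pseudonymization from plain redaction.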

3.2 Managing encryption at rest, in transit, and in use

Google Default Encryption, CMEK, and TLS

Concept: Ensuring data is encrypted at every stage of its lifecycle and that key control aligns with regulatory obligations.

In the RAD UI:

  • Encryption at Rest (Google Default): All GCP storage services used by the modules — Cloud SQL, Cloud Storage, Secret Manager — use Google-managed AES-256 encryption at rest by default, with no configuration required.
  • Customer-Managed Encryption Keys (CMEK): The enable_cmek variable (Group 12 in GCP Services) provisions a Cloud KMS Key Ring and Crypto Key, then configures Cloud SQL and Cloud Storage to encrypt data with the customer-managed key. The relevant service agent (e.g., the Cloud SQL service account) is granted roles/cloudkms.cryptoKeyEncrypterDecrypter on the specific key.
  • Encryption in Transit: The Global External Application Load Balancer terminates HTTPS using a Google-managed SSL/TLS certificate, with an SSL policy enforcing TLS 1.2 as the minimum version. Traffic between the load balancer and the Cloud Run or GKE backends traverses Google's internal encrypted network fabric.
  • Object Lifecycle Policies: The backup_retention_days variable (Group 6) configures an Object Lifecycle rule on the backup Cloud Storage bucket, automatically deleting backup objects beyond the retention window to enforce data minimisation requirements.
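
As a sketch, the lifecycle rule that backup_retention_days plausibly renders has the standard GCS JSON shape, expressed here as a Python dict; the function name is hypothetical and the module's actual rule may include additional conditions.

```python
def backup_lifecycle_policy(backup_retention_days: int) -> dict:
    """Build a GCS lifecycle policy that deletes objects once they are
    older than the retention window (the "age" condition is in days)."""
    return {
        "rule": [
            {
                "action": {"type": "Delete"},
                "condition": {"age": backup_retention_days},
            }
        ]
    }
```

This is the same structure you can inspect on the bucket's Lifecycle tab in the console, or retrieve with the Storage API.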

Console Exploration: Navigate to Security > Cloud Key Management. View the Key Ring and Crypto Keys created by the module. Click on a key and check the IAM tab — observe the service agent binding with roles/cloudkms.cryptoKeyEncrypterDecrypter. Navigate to Network Services > Load Balancing, select the frontend configuration, and verify the HTTPS protocol and the attached managed SSL certificate. Navigate to a Cloud Storage bucket's Lifecycle tab to review active retention rules.

Real-world example: A healthcare organization stores patient records in Cloud SQL with CMEK enabled via a key managed in Cloud KMS within their Assured Workloads environment. When a data subject invokes their GDPR right to erasure, the organization disables the CMEK key in Cloud KMS, immediately rendering all data encrypted under that key inaccessible across Cloud SQL, Cloud Storage, and any associated backups simultaneously, without having to individually locate and delete every record in every table. Disabling is reversible; scheduling the key version for destruction is what makes the erasure permanent and irrecoverable.
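
The crypto-shredding idea behind that example can be modelled in a few lines. A loud caveat: the HMAC-based stream cipher below is an illustrative stand-in for AES-256 and must never be used as real encryption; it exists only to show that destroying the key destroys access to everything encrypted under it.

```python
import hashlib
import hmac
import secrets

def _keystream(key: bytes, n: int) -> bytes:
    """HMAC-SHA256 counter-mode keystream. Illustrative stand-in for
    AES-256; NOT a substitute for real KMS-backed encryption."""
    blocks, counter = [], 0
    while sum(len(b) for b in blocks) < n:
        blocks.append(hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest())
        counter += 1
    return b"".join(blocks)[:n]

def encrypt(key: bytes, data: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))

decrypt = encrypt  # an XOR stream cipher is its own inverse

# Crypto-shredding: delete the key, and every ciphertext under it is gone.
key_ring = {"patient-records": secrets.token_bytes(32)}
ciphertext = encrypt(key_ring["patient-records"], b"name=Jane Doe;dob=1980-01-01")
del key_ring["patient-records"]  # analogous to the DESTROYED state in Cloud KMS
# `ciphertext` is now irrecoverable: no key material, no plaintext.
```

The operational win is the same one the example describes: one key operation erases every copy of the data, including backups, instead of a row-by-row hunt.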

💡 Additional Encryption Objectives & Learning Guidelines

  • Cloud External Key Manager (EKM): Research Cloud EKM, which allows you to encrypt GCP data using keys stored in a supported external key management system (outside Google Cloud entirely). Unlike CMEK where Google holds the key material in Cloud KMS, with EKM Google makes a call to the external KMS to unwrap the data encryption key for each operation — Google never holds the plaintext key. This is required for the highest-assurance compliance scenarios where the customer needs proof that the cloud provider cannot access plaintext data even under legal compulsion ("HYOK" — Hold Your Own Key).
  • Software vs. Hardware Keys: Understand the distinction between software-protected keys (generated and stored in software within Cloud KMS, lower cost) and hardware-protected keys (generated and stored in Cloud HSM — FIPS 140-2 Level 3 validated hardware security modules embedded in Google data centres, higher assurance). HSM-backed keys are required by compliance frameworks such as PCI-DSS and FedRAMP High where key material must never leave a hardware boundary. Select HSM as the protection level when creating a key in Cloud KMS.
  • Key Rotation, Revocation, and Import: Understand automatic key rotation in Cloud KMS, where you configure a rotation period (for example, 90 days) and Google generates new key material on schedule; all new encryption operations use the new primary version, while older versions can still decrypt existing data. Manual key rotation creates a new version without automatically disabling old versions. Key states to know: ENABLED (active), DISABLED (no new operations; existing data becomes readable again if re-enabled), DESTROYED (key material permanently gone, data irrecoverable). Study key import to bring externally generated key material into Cloud KMS for hybrid environments.
  • Confidential Computing: Research Confidential VMs and Confidential GKE Nodes, which use AMD SEV (Secure Encrypted Virtualization) to encrypt the VM's entire memory footprint with a key generated in hardware — inaccessible to the hypervisor, Google infrastructure operators, or other VMs on the same host. This is "encryption in use" — protecting data while it is actively being processed. Use cases include cryptographic key processing, training AI models on sensitive proprietary datasets, and financial computation requiring assurance that even the cloud provider cannot observe intermediate state.
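
The version semantics worth memorising for the exam (new versions encrypt, old versions still decrypt, destroyed versions lose their material) can be modelled with a toy class. Real Cloud KMS performs actual cryptography; this sketch only tracks states and version tagging.

```python
class KeyRing:
    """Toy model of Cloud KMS key-version semantics."""

    def __init__(self):
        self.versions = {}  # version number -> state
        self.primary = 0
        self.rotate()

    def rotate(self):
        """New primary version; earlier versions stay ENABLED for decryption."""
        self.primary += 1
        self.versions[self.primary] = "ENABLED"

    def encrypt(self, plaintext: str) -> tuple:
        # Real KMS would encrypt with the primary version's key material;
        # here we just tag the data with the version that "encrypted" it.
        return (self.primary, plaintext)

    def decrypt(self, ciphertext: tuple) -> str:
        version, data = ciphertext
        if self.versions.get(version) != "ENABLED":
            raise PermissionError(f"key version {version} is not ENABLED")
        return data

    def destroy(self, version: int):
        """Key material permanently gone; data under it is irrecoverable."""
        self.versions[version] = "DESTROYED"
```

The behaviour to internalise: after a rotation, ciphertext produced under version 1 still decrypts, because rotation adds a version rather than replacing one; only destruction breaks that guarantee.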

3.3 Securing AI workloads

💡 Additional AI Workload Security Objectives & Learning Guidelines

The RAD modules deploy standard web applications rather than AI/ML pipelines. Securing AI workloads is a distinct and growing exam domain. Candidates should self-study these topics using Vertex AI, Model Armor, and Sensitive Data Protection documentation.

  • Security and Privacy Controls for AI/ML Systems: AI workloads introduce attack vectors beyond those of traditional applications. Key threats to understand: training data poisoning (adversarial data inserted into training sets to cause model misbehaviour), model inversion attacks (inferring sensitive training data from model outputs), prompt injection (malicious user inputs that override LLM system instructions or extract confidential system prompt content), and model extraction attacks (reconstructing a proprietary model through repeated API queries). Research mitigations for each using Google Cloud-native controls.
  • Model Armor: Research Model Armor, Google Cloud's managed guardrails service for Gemini and other Vertex AI models. Model Armor evaluates both inputs (user prompts) and outputs (model responses) against configurable safety filters — detecting and blocking prompt injection attempts, jailbreak patterns, harmful content categories (hate speech, dangerous instructions, CSAM), and PII leakage in responses. Navigate to Vertex AI > Model Armor in the console to explore template creation and sensitivity threshold configuration.
  • Sensitive Data Protection for AI Training Data: Use Sensitive Data Protection (SDP) to scan and de-identify training datasets before they are ingested into Vertex AI Pipelines, preventing PII from being memorized by fine-tuned models and potentially surfaced in responses. Apply de-identification transformations (pseudonymization, redaction, format-preserving encryption) to BigQuery training tables or Cloud Storage training files as a pipeline pre-processing step. This is particularly important for LLMs fine-tuned on internal company data containing employee records, customer information, or financial data.
  • Security Requirements for IaaS-Hosted vs. PaaS-Hosted Training Models: For IaaS-hosted training (Compute Engine GPU/TPU instances): the customer is responsible for OS hardening (Shielded VM, patch management), preventing lateral movement between training nodes (VPC with no external IPs, firewall rules restricting inter-node traffic), and securing model checkpoints and training data in Cloud Storage with resource-level IAM. For PaaS-hosted training (Vertex AI Training): Google manages the node OS and runtime; the customer's security responsibilities are scoped to Vertex AI IAM role assignments, CMEK configuration for training data and model artefact storage, and VPC Service Controls to restrict which identities can access the Vertex AI API endpoint.
  • Implementing Security Controls for Vertex AI: Key controls to understand and configure: (1) Apply VPC Service Controls to restrict Vertex AI API access to within a defined perimeter, preventing training data exfiltration via the Vertex AI endpoint even if credentials are compromised outside the perimeter. (2) Use granular IAM roles: roles/aiplatform.user for inference-only access, roles/aiplatform.modelUser for model serving, roles/aiplatform.admin for full control — avoid over-provisioning developers with admin-level roles. (3) Enable CMEK for Vertex AI so that model artefacts, training checkpoints, and pipeline outputs stored in Cloud Storage are encrypted with your key. (4) Configure private endpoints for Vertex AI prediction services (privateServiceConnect option) so that inference traffic from your applications stays entirely within the VPC and never traverses the public internet.
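
The training-data pre-processing step described above can be sketched minimally. The regexes here are deliberately naive, hypothetical placeholders: a real pipeline should call Sensitive Data Protection's built-in infoType detectors rather than hand-rolled patterns.

```python
import re

# Naive illustrative detectors, named after two real SDP infoTypes.
# Production pipelines should use the SDP inspect/de-identify APIs instead.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "CREDIT_CARD_NUMBER": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text: str) -> str:
    """Replace detected PII with its infoType label before the text is
    ingested as fine-tuning data, so the model never sees the raw value."""
    for name, pattern in PATTERNS.items():
        text = pattern.sub(f"[{name}]", text)
    return text
```

Run as a pipeline pre-processing step over BigQuery training tables or Cloud Storage training files, this kind of transformation prevents a fine-tuned model from memorising, and later surfacing, raw identifiers.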