Data Masking in Kubernetes

Often needed when you have to comply with GDPR, HIPAA, PCI DSS, or SOC 2

1. What is Data Masking?

Data masking is the process of obscuring or modifying data to protect sensitive information while preserving its format and usability.

1.1 Why Use Data Masking?

  • 🔐 Protects personally identifiable information (PII)
  • 🚀 Enables secure testing and analytics
  • 🏦 Supports compliance with data privacy regulations
  • 🛡 Prevents unauthorized data exposure

1.2 Types of Data Masking

| Type | Description | Example |
|------|-------------|---------|
| Static Masking | Data is masked before being stored | John Doe → J*** D** |
| Dynamic Masking | Data is masked in real time for users based on roles | Only authorized users see full data |
| Tokenization | Data is replaced with a reference token | 123-45-6789 → TOKEN-001 |
| Encryption | Data is encoded and requires a key to decrypt | password123 → U2FsdGVkX1... |
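
To make two rows of the table concrete, here is a minimal Node.js sketch of name masking and tokenization. The helper names (`maskName`, `createTokenizer`) and the in-memory token vault are assumptions for illustration only; a real token vault would be a secured, persistent service.

```javascript
// Illustrative helpers only; names and the in-memory vault are hypothetical.

// Masking: keep the first letter of each word and star out the rest.
function maskName(name) {
  return name
    .split(" ")
    .map(word => (word.length > 0 ? word[0] + "*".repeat(word.length - 1) : word))
    .join(" ");
}

// Tokenization: swap a value for an opaque token and remember the mapping.
function createTokenizer() {
  const valueToToken = new Map();
  const tokenToValue = new Map();
  return {
    tokenize(value) {
      if (!valueToToken.has(value)) {
        const token = "TOKEN-" + String(valueToToken.size + 1).padStart(3, "0");
        valueToToken.set(value, token);
        tokenToValue.set(token, value);
      }
      return valueToToken.get(value);
    },
    detokenize(token) {
      return tokenToValue.get(token);
    }
  };
}

const tokenizer = createTokenizer();
console.log(maskName("John Doe"));              // J*** D**
console.log(tokenizer.tokenize("123-45-6789")); // TOKEN-001
```

The difference matters in practice: the masked name cannot be turned back into the original, while a token can be resolved again by anyone with access to the vault.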

Next, a quick look at why compliance standards call for masking, and then an implementation in Kubernetes. 🚀


2. Why Compliance Requires Data Masking

Several regulations and standards either require masking outright or name it as an accepted safeguard for sensitive data.

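| Compliance Standard | Data Masking Requirement |
|---------------------|--------------------------|
| GDPR (EU) | Requires protection of personal data and names pseudonymization as a safeguard |
| HIPAA (US Healthcare) | Protects patient health data (PHI) |
| PCI DSS (Credit Card Security) | Requires credit card numbers to be masked when displayed |
| SOC 2 (Enterprise Security) | Audits the controls that protect confidential data |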
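
As an illustration of the PCI DSS row, here is a small sketch that masks a card number for display. It assumes the commonly cited display rule of showing at most the first six and last four digits; check the version of the standard you are audited against. The function name `maskPan` is hypothetical.

```javascript
// Hypothetical helper: mask a card number (PAN) for display,
// keeping the first 6 and last 4 digits and starring out the rest.
function maskPan(pan) {
  const digits = pan.replace(/\D/g, ""); // drop spaces and dashes
  if (digits.length < 13 || digits.length > 19) {
    throw new Error("unexpected card number length");
  }
  const hidden = "*".repeat(digits.length - 10);
  return digits.slice(0, 6) + hidden + digits.slice(-4);
}

console.log(maskPan("4111 1111 1111 1111")); // 411111******1111
```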

Now, let's implement data masking in Kubernetes pods.


3. Implementing Data Masking in Kubernetes Pods

3.1 Static Data Masking in a Kubernetes Database

Modify sensitive data before storing it in a database.

Step 1: Create a Kubernetes Secret for Database Credentials

apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  username: YWRtaW4=   # base64("admin")
  password: cGFzc3dvcmQ=  # base64("password")

Apply:

kubectl apply -f db-secret.yaml
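
One thing worth knowing about the Secret above: values under `data:` are base64-encoded, not encrypted, so anyone who can read the Secret can decode them. A quick Node.js sketch (the `encode` and `decode` helper names are just for illustration) reproduces the two values from the manifest:

```javascript
// Secret values under `data:` are plain base64; decoding needs no key.
const encode = text => Buffer.from(text, "utf8").toString("base64");
const decode = b64 => Buffer.from(b64, "base64").toString("utf8");

console.log(encode("admin"));        // YWRtaW4=
console.log(decode("cGFzc3dvcmQ=")); // password
```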

Step 2: Deploy a PostgreSQL Database with Masking

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-db
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:16  # pinned version; generated columns need PostgreSQL 12 or later
        env:
        - name: POSTGRES_USER
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: username
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
        ports:
        - containerPort: 5432

Apply:

kubectl apply -f postgres-deployment.yaml

Step 3: Apply Static Data Masking to PostgreSQL

Connect to the database. The Deployment generates pod names, so target the Deployment rather than a fixed pod name:

kubectl exec -it deploy/postgres-db -- psql -U admin

Create a masked table:

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  name TEXT,
  email TEXT,
  masked_email TEXT GENERATED ALWAYS AS (LEFT(email, 3) || '****@****.com') STORED
);

INSERT INTO users (name, email) VALUES ('John Doe', 'john.doe@example.com');
SELECT name, masked_email FROM users;

Output:

   name   |   masked_email
----------+------------------
 John Doe | joh****@****.com
(1 row)

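Now every row carries a masked copy of the email! 🔥 Note that the raw email column is still stored next to it, so give readers access only to masked_email (for example through a view or column-level grants), or drop the raw column in copies used for testing and analytics.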
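
The generated column above keeps the raw email in the same table as the masked copy. If the goal is that the raw value never reaches the database at all, the same rule can be applied in application code before the insert. A minimal sketch, assuming hypothetical helper names `maskEmailForStorage` and `toMaskedRow`:

```javascript
// Same rule as the generated column: first 3 characters, then a fixed mask.
function maskEmailForStorage(email) {
  return email.slice(0, 3) + "****@****.com";
}

// Hypothetical pre-insert step: build the row to store and leave the raw email out.
function toMaskedRow(user) {
  return { name: user.name, masked_email: maskEmailForStorage(user.email) };
}

const row = toMaskedRow({ name: "John Doe", email: "john.doe@example.com" });
console.log(row); // { name: 'John Doe', masked_email: 'joh****@****.com' }
```

The trade-off is that the original value is gone for good, which suits test and analytics copies but not a production table that still needs to send email.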


4. Dynamic Data Masking in Kubernetes Pods

Dynamic masking hides data based on user roles.

4.1 Deploy an API with Dynamic Masking

Create an Express.js API inside a Kubernetes pod.

Step 1: Create a Node.js API with Data Masking

const express = require("express");
const app = express();

// Demo data; a real service would read from the database.
const users = [
  { id: 1, name: "John Doe", email: "john.doe@example.com", role: "admin" },
  { id: 2, name: "Jane Smith", email: "jane.smith@example.com", role: "user" }
];

// Masking function: keep the first three characters and replace the rest,
// including the domain, with a fixed mask.
function maskEmail(email) {
  return email.slice(0, 3) + "****@****.com";
}

// API route with dynamic masking.
// Demo only: the role comes from a query parameter that any caller can set.
// In production, derive the role from an authenticated identity instead.
app.get("/users", (req, res) => {
  const role = req.query.role || "user";
  const maskedUsers = users.map(user => ({
    id: user.id,
    name: user.name,
    email: role === "admin" ? user.email : maskEmail(user.email)
  }));
  res.json(maskedUsers);
});

app.listen(3000, () => console.log("Server running on port 3000"));
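
The route above masks a single field. When more fields need protection, one option is to move the rules into a small policy object. The following is only a sketch under assumptions: the `phone` field, the `maskingPolicy` object, and the `applyMasking` helper are hypothetical and not part of the API above.

```javascript
// Hypothetical policy: which fields to mask, and how, for non-exempt roles.
const maskingPolicy = {
  email: value => value.slice(0, 3) + "****@****.com",
  phone: value => "*".repeat(Math.max(value.length - 4, 0)) + value.slice(-4)
};

// Return a copy of the record with policy fields masked unless the role is exempt.
function applyMasking(record, role, policy = maskingPolicy, exemptRoles = ["admin"]) {
  if (exemptRoles.includes(role)) return { ...record };
  const masked = { ...record };
  for (const [field, maskFn] of Object.entries(policy)) {
    if (typeof masked[field] === "string") masked[field] = maskFn(masked[field]);
  }
  return masked;
}

const record = { id: 2, name: "Jane Smith", email: "jane.smith@example.com", phone: "555-0100" };
console.log(applyMasking(record, "user"));  // email and phone masked
console.log(applyMasking(record, "admin")); // record returned unchanged
```

Keeping the rules in one place means a new sensitive field only needs a new policy entry, not a change to every route.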

Step 2: Create a Kubernetes Deployment

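apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: user-api
  template:
    metadata:
      labels:
        app: user-api
    spec:
      containers:
      - name: api
        # Placeholder image name (hypothetical). The stock node:alpine image has
        # neither server.js nor express, so build your own image on top of it
        # that includes both, and reference that image here.
        image: your-registry/user-api:1.0
        command: ["node", "server.js"]
        ports:
        - containerPort: 3000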

Apply:

kubectl apply -f user-api-deployment.yaml

Step 3: Test Dynamic Masking

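Regular user request (this assumes a Service named user-api-service exposes the Deployment on port 3000 and that curl runs inside the cluster):

curl "http://user-api-service:3000/users?role=user"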

Response:

[
  { "id": 1, "name": "John Doe", "email": "joh****@****.com" },
  { "id": 2, "name": "Jane Smith", "email": "jan****@****.com" }
]

Admin request:

curl "http://user-api-service:3000/users?role=admin"

Response:

[
  { "id": 1, "name": "John Doe", "email": "john.doe@example.com" },
  { "id": 2, "name": "Jane Smith", "email": "jane.smith@example.com" }
]

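Now, only requests with the admin role see unmasked data! 🔥 Keep in mind that this demo trusts the role query parameter; a real API should take the role from an authenticated identity.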
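
Checking responses by eye does not scale, so a test can scan a payload for email-like strings that are not in the masked format. This is a rough sketch, not a complete validator: the helper name `findUnmaskedEmails` and both regular expressions are assumptions tied to the `abc****@****.com` mask used in this post.

```javascript
// Hypothetical check: collect email-like strings that do not match the mask format.
const MASKED = /^.{0,3}\*{4}@\*{4}\.com$/;
const EMAIL_LIKE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

function findUnmaskedEmails(payload) {
  const found = [];
  const visit = value => {
    if (typeof value === "string") {
      if (EMAIL_LIKE.test(value) && !MASKED.test(value)) found.push(value);
    } else if (value && typeof value === "object") {
      Object.values(value).forEach(visit); // walks arrays and nested objects
    }
  };
  visit(payload);
  return found;
}

const userResponse = [
  { id: 1, name: "John Doe", email: "joh****@****.com" },
  { id: 2, name: "Jane Smith", email: "jan****@****.com" }
];
console.log(findUnmaskedEmails(userResponse)); // []
```

Run against the admin response instead, the same function returns both full addresses, which makes it easy to assert that the regular-user path never leaks one.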


5. Final Thoughts

Data masking is essential for security and compliance in Kubernetes.

Key Takeaways

✅ Static masking protects stored data
✅ Dynamic masking controls visibility based on roles
✅ Data masking supports GDPR, HIPAA, and PCI DSS compliance
✅ Use masking techniques inside Kubernetes pods for better security

By implementing data masking inside Kubernetes, you strengthen data privacy and support regulatory compliance. 🚀