🐳

Docker Challenges

Easy

Challenge: Optimize This Dockerfile

The following Dockerfile builds a Node.js application but results in a 1.2GB image. Your mission: reduce it to under 100MB.

Dockerfile (Before)
FROM ubuntu:latest
RUN apt-get update && apt-get install -y nodejs npm
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["npm", "start"]
💡 Hint 1

Consider using a smaller base image like node:alpine

💡 Hint 2

Multi-stage builds can help separate build and runtime dependencies

💡 Hint 3

Copy only package.json (and the lockfile) before running npm install, so Docker can reuse the cached dependency layer when only source files change

Docker Optimization Best Practices
Medium

Challenge: Debug the Container Network

Your frontend container can't connect to the backend container. Both are running, but requests time out. Find and fix the networking issue.

docker-compose.yml
version: '3.8'
services:
  frontend:
    image: my-frontend
    ports:
      - "3000:3000"
    environment:
      - API_URL=http://localhost:8080
  
  backend:
    image: my-backend
    ports:
      - "8080:8080"
💡 Hint 1

In Docker Compose, containers don't use localhost to talk to each other: inside a container, localhost refers to that container itself

💡 Hint 2

Service names in Compose act as DNS hostnames
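
One possible direction, as a minimal sketch rather than the official answer: point the frontend at the backend's service name instead of localhost. This assumes the frontend calls the API from inside its container (server-side); if API_URL is consumed by code running in the browser, the picture is different.

```yaml
services:
  frontend:
    image: my-frontend
    ports:
      - "3000:3000"
    environment:
      # "backend" is the service name below; Compose's built-in DNS
      # resolves it to the backend container on the shared network.
      - API_URL=http://backend:8080

  backend:
    image: my-backend
    ports:
      - "8080:8080"
```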

Docker Networking Docker Compose
☸️

Kubernetes Challenges

Medium

Challenge: Fix the OOMKilled Pod

Your Java application pod keeps getting OOMKilled. The app needs 512MB heap, but the pod keeps crashing. Fix the resource configuration.

deployment.yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: java-app
        image: my-java-app
        resources:
          limits:
            memory: "512Mi"
          requests:
            memory: "256Mi"
        env:
        - name: JAVA_OPTS
          value: "-Xmx512m"
💡 Hint 1

JVM needs memory for more than just heap (metaspace, native memory, etc.)

💡 Hint 2

A good rule of thumb: container memory limit = heap + 256MB overhead
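
Applying that rule of thumb to this pod gives 512 + 256 = 768, so a sketch of the container section might look like the following. The 256MB overhead is the hint's estimate, not a measured value, and the fragment assumes the image's entrypoint actually passes JAVA_OPTS to the JVM.

```yaml
containers:
- name: java-app
  image: my-java-app
  resources:
    limits:
      # 512 MiB heap + 256 MiB assumed non-heap overhead
      memory: "768Mi"
    requests:
      # Unchanged from the original manifest
      memory: "256Mi"
  env:
  - name: JAVA_OPTS
    # Heap stays at 512 MB; only the container limit grows
    value: "-Xmx512m"
```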

Kubernetes Resource Management Java
Hard

Challenge: Implement Zero-Downtime Deployment

Users report brief 502 errors during deployments. Configure the deployment for true zero-downtime updates with proper health checks.

deployment.yaml (Current)
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
  template:
    spec:
      containers:
      - name: web-app
        image: my-web-app:v2
        ports:
        - containerPort: 8080
💡 Hint 1

Add readiness probes to ensure traffic only goes to ready pods

💡 Hint 2

Configure maxSurge and maxUnavailable in the rolling update strategy

💡 Hint 3

Add a preStop lifecycle hook to handle graceful shutdown
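
Putting the three hints together, a hedged sketch of the spec could look like this. The /healthz path, the probe timings, the 10-second sleep, and the 30-second grace period are all illustrative assumptions; the preStop command also assumes the image ships a sleep binary.

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # start one extra pod before removing an old one
      maxUnavailable: 0    # never drop below the desired replica count
  template:
    spec:
      terminationGracePeriodSeconds: 30
      containers:
      - name: web-app
        image: my-web-app:v2
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz   # hypothetical health endpoint
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
        lifecycle:
          preStop:
            exec:
              # Delay shutdown so endpoints can be removed before the app exits
              command: ["sleep", "10"]
```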

Kubernetes Deployment High Availability
🔄

CI/CD Challenges

Easy

Challenge: Secure the GitHub Actions Workflow

This workflow has a security vulnerability that could leak secrets. Find and fix the issue before it causes problems.

.github/workflows/deploy.yml
name: Deploy
on: push
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Deploy
        run: |
          echo "Deploying with key: ${{ secrets.API_KEY }}"
          curl -X POST https://api.example.com/deploy \
            -H "Authorization: ${{ secrets.API_KEY }}"
💡 Hint 1

Never echo secrets - log masking is best-effort, so they can still end up in logs!

💡 Hint 2

Use ::add-mask:: or environment variables properly
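
As a sketch of the environment-variable approach (one reasonable fix, not necessarily the only one): drop the echo of the key and hand the secret to the step through env, so the shell script itself never contains the interpolated value.

```yaml
name: Deploy
on: push
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Deploy
        env:
          # The secret is exposed only to this step, as an environment variable
          API_KEY: ${{ secrets.API_KEY }}
        run: |
          echo "Deploying..."
          curl -X POST https://api.example.com/deploy \
            -H "Authorization: $API_KEY"
```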

GitHub Actions Security Secrets Management
Medium

Challenge: Speed Up the Pipeline

This pipeline takes 15 minutes to run. Optimize it to run in under 5 minutes while maintaining all the same checks.

.github/workflows/ci.yml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: npm install
      - run: npm run lint
      - run: npm run test
      - run: npm run build
      - run: docker build -t myapp .
💡 Hint 1

Cache node_modules between runs

💡 Hint 2

Run lint and tests in parallel, for example as separate jobs or with a matrix strategy

💡 Hint 3

Use Docker layer caching with docker/build-push-action
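
A rough sketch of how the three hints might combine; actual timings will depend on the project. The job name checks, the Node version, and the action versions are assumptions. Note that setup-node's cache option stores npm's download cache rather than node_modules itself, and it expects a lockfile in the repository.

```yaml
jobs:
  checks:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        task: [lint, test]   # lint and test run as parallel jobs
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: 18   # assumed version
          cache: npm
      - run: npm ci
      - run: npm run ${{ matrix.task }}

  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: 18   # assumed version
          cache: npm
      - run: npm ci
      - run: npm run build
      - uses: docker/setup-buildx-action@v2
      - uses: docker/build-push-action@v4
        with:
          context: .
          tags: myapp
          # Reuse image layers across runs via the GitHub Actions cache
          cache-from: type=gha
          cache-to: type=gha,mode=max
```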

GitHub Actions Performance Caching
πŸ—οΈ

Infrastructure Challenges

Hard

Challenge: Debug the Terraform State Lock

Your team is getting "state lock" errors in Terraform, but no one is running applies. Diagnose and resolve the orphaned lock without losing state.

Error Message
Error: Error acquiring the state lock

Error message: ConditionalCheckFailedException: 
The conditional request failed

Lock Info:
  ID:        a1b2c3d4-5678-90ab-cdef
  Path:      s3://my-bucket/terraform.tfstate
  Created:   2024-12-10 14:30:00.123 +0000 UTC
  Info:
💡 Hint 1

Check the DynamoDB table for the orphaned lock entry

💡 Hint 2

terraform force-unlock can help, but use with caution!

💡 Hint 3

Always ensure the pipeline that created the lock has truly stopped

Terraform State Management Troubleshooting
🚧

More Challenges Coming Soon!

I'm constantly adding new challenges based on real-world scenarios. Have a challenge idea? Connect with me on LinkedIn!

💼 Share Your Ideas