Docker Challenges
Challenge: Optimize This Dockerfile
The following Dockerfile builds a Node.js application but results in a 1.2GB image. Your mission: reduce it to under 100MB.
FROM ubuntu:latest
RUN apt-get update && apt-get install -y nodejs npm
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["npm", "start"]
💡 Hint 1
Consider using a smaller base image like node:alpine
💡 Hint 2
Multi-stage builds can help separate build and runtime dependencies
💡 Hint 3
Copy only package.json (and the lockfile, if there is one) before running npm install, so Docker can reuse the cached dependency layer when only source files change
Challenge: Debug the Container Network
Your frontend container can't connect to the backend container. Both are running, but requests time out. Find and fix the networking issue.
version: '3.8'
services:
  frontend:
    image: my-frontend
    ports:
      - "3000:3000"
    environment:
      - API_URL=http://localhost:8080
  backend:
    image: my-backend
    ports:
      - "8080:8080"
💡 Hint 1
Inside a container, localhost points at the container itself, so containers in Docker Compose can't use it to reach each other
💡 Hint 2
Service names in Compose act as DNS hostnames
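One way the two hints could come together is sketched below. It assumes the frontend calls the API from code running inside its own container (for example server-side rendering or a proxy). If API_URL is instead used by JavaScript in the visitor's browser, the browser would still need a host-reachable address, so treat this as one scenario rather than the definitive fix.

```yaml
# Sketch only: the same two services, with API_URL using the Compose service name.
version: '3.8'
services:
  frontend:
    image: my-frontend
    ports:
      - "3000:3000"
    environment:
      # "backend" is the service name defined below; Compose resolves it via
      # DNS on the project's default network.
      - API_URL=http://backend:8080
  backend:
    image: my-backend
    ports:
      # Publishing the port is only needed for access from the host.
      # Container-to-container traffic goes straight to container port 8080.
      - "8080:8080"
```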
Kubernetes Challenges
Challenge: Fix the OOMKilled Pod
Your Java application pod keeps getting OOMKilled. The app needs a 512MB heap, but the pod keeps crashing. Fix the resource configuration.
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: java-app
          image: my-java-app
          resources:
            limits:
              memory: "512Mi"
            requests:
              memory: "256Mi"
          env:
            - name: JAVA_OPTS
              value: "-Xmx512m"
💡 Hint 1
The JVM needs memory for more than just the heap (metaspace, thread stacks, native memory, etc.)
💡 Hint 2
A good rule of thumb: container memory limit = heap + 256MB overhead
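Applying the rule of thumb from Hint 2 to this pod gives 512 + 256 = 768, so a possible configuration is sketched below. The metadata name and the app label are assumptions added only to make the manifest complete, and setting the request equal to the limit is a design choice, not something the challenge requires. The real overhead depends on the application, so 768Mi is a starting point to measure against. Also note that JAVA_OPTS only takes effect if the image's start script passes it to the java command.

```yaml
# Sketch only: limit sized as heap (512 MiB) + ~256 MiB non-heap overhead.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: java-app              # assumed name, not in the original
spec:
  selector:
    matchLabels:
      app: java-app           # assumed label
  template:
    metadata:
      labels:
        app: java-app         # must match the selector above
    spec:
      containers:
        - name: java-app
          image: my-java-app
          resources:
            limits:
              memory: "768Mi"   # heap + metaspace, thread stacks, native memory
            requests:
              memory: "768Mi"   # request == limit: the scheduler reserves what the pod may use
          env:
            - name: JAVA_OPTS
              value: "-Xmx512m" # heap stays at 512 MiB, below the container limit
```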
Challenge: Implement Zero-Downtime Deployment
Users report brief 502 errors during deployments. Configure the deployment for true zero-downtime updates with proper health checks.
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
  template:
    spec:
      containers:
        - name: web-app
          image: my-web-app:v2
          ports:
            - containerPort: 8080
💡 Hint 1
Add readiness probes to ensure traffic only goes to ready pods
💡 Hint 2
Configure maxSurge and maxUnavailable in the rolling update strategy
💡 Hint 3
Add a preStop lifecycle hook to handle graceful shutdown
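The three hints can be combined roughly as in the sketch below. Several details are assumptions for illustration: the metadata name and app label, the /healthz endpoint, the probe timings, the 10-second preStop delay, and the presence of a sleep binary in the image. Tune them to how quickly the app starts and how long its requests take to drain.

```yaml
# Sketch only: rolling update that never drops below 3 ready pods.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app                 # assumed name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app              # assumed label
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1               # start one extra pod at a time
      maxUnavailable: 0         # never remove a pod before its replacement is ready
  template:
    metadata:
      labels:
        app: web-app
    spec:
      terminationGracePeriodSeconds: 30   # must exceed the preStop delay
      containers:
        - name: web-app
          image: my-web-app:v2
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz    # hypothetical health endpoint
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          lifecycle:
            preStop:
              exec:
                # Keep serving briefly while endpoints and load balancers
                # stop routing new traffic to this pod.
                command: ["sleep", "10"]
```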
CI/CD Challenges
Challenge: Secure the GitHub Actions Workflow
This workflow has a security vulnerability that could leak secrets. Find and fix the issue before it causes problems.
name: Deploy
on: push
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Deploy
        run: |
          echo "Deploying with key: ${{ secrets.API_KEY }}"
          curl -X POST https://api.example.com/deploy \
            -H "Authorization: ${{ secrets.API_KEY }}"
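💡 Hint 1
Never echo secrets! GitHub masks registered secrets in logs, but the masking is best-effort, and transformed values (for example, base64-encoded ones) can still appear.
💡 Hint 2
Pass secrets to the step through env and read them as shell variables instead of interpolating them into the script; use ::add-mask:: for sensitive values that are not stored as secrets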
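A hedged sketch of how the workflow might look after applying both hints follows. The environment variable name API_KEY is my choice for illustration. Everything else mirrors the original workflow, minus the line that printed the key.

```yaml
# Sketch only: the secret reaches the shell as an environment variable
# and is never written to the log or spliced into the script text.
name: Deploy
on: push
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Deploy
        env:
          API_KEY: ${{ secrets.API_KEY }}   # expanded by the runner, not pasted into the script
        run: |
          echo "Deploying..."
          curl -X POST https://api.example.com/deploy \
            -H "Authorization: $API_KEY"
```

Reading the value from the environment also means special characters in the secret can't change the shell command itself.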
Challenge: Speed Up the Pipeline
This pipeline takes 15 minutes to run. Optimize it to run in under 5 minutes while maintaining all the same checks.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: npm install
      - run: npm run lint
      - run: npm run test
      - run: npm run build
      - run: docker build -t myapp .
💡 Hint 1
Cache node_modules between runs
💡 Hint 2
Run lint and tests in parallel using a matrix strategy
💡 Hint 3
Use Docker layer caching with docker/build-push-action
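One possible shape for the faster pipeline is sketched below. It rests on several assumptions: a package-lock.json is committed (required by npm ci and by the setup-node cache), Node 18 is an acceptable version, and the Dockerfile does not depend on the output of npm run build, so the image can be built in a separate job. The workflow name, trigger, and job names are also mine. Note that setup-node's cache option stores npm's download cache rather than node_modules itself, which is a slight variation on Hint 1.

```yaml
# Sketch only: lint, test, and build run as parallel matrix jobs,
# and the Docker image builds alongside them with layer caching.
name: CI                        # assumed name
on: push                        # assumed trigger
jobs:
  checks:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        task: [lint, test, build]   # one parallel job per npm script
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: 18      # assumed version
          cache: npm            # restores npm's download cache, keyed on the lockfile
      - run: npm ci
      - run: npm run ${{ matrix.task }}
  docker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v5
        with:
          context: .
          push: false
          tags: myapp
          cache-from: type=gha          # reuse layers from earlier runs
          cache-to: type=gha,mode=max   # save all layers for later runs
```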
Infrastructure Challenges
Challenge: Debug the Terraform State Lock
Your team is getting "state lock" errors in Terraform, but no one is running applies. Diagnose and resolve the orphaned lock without losing state.
Error: Error acquiring the state lock
Error message: ConditionalCheckFailedException: The conditional request failed
Lock Info:
  ID:      a1b2c3d4-5678-90ab-cdef
  Path:    s3://my-bucket/terraform.tfstate
  Created: 2024-12-10 14:30:00.123 +0000 UTC
  Info:
💡 Hint 1
Check the DynamoDB table for the orphaned lock entry
💡 Hint 2
terraform force-unlock with the lock ID from the error can help, but use it with caution!
💡 Hint 3
Always make sure the pipeline that created the lock has truly stopped
More Challenges Coming Soon!
I'm constantly adding new challenges based on real-world scenarios. Have a challenge idea? Connect with me on LinkedIn!