Kubernetes – Core Pod Lifecycle & Storage

Task

#week_one – Mastering Pod Management

Duration: 1 week

ForgTech wants to test your ability to deliver its requirements in a local Kubernetes environment; doing so will help you build a strong reputation within the engineering team. The purpose of this task is to prove your ability to manage the complete lifecycle of a Pod, ensure high availability through advanced health checks, and implement persistent storage solutions.

Company Scenario

The ForgTech Cloud Team is moving toward a "Container-First" strategy for all internal microservices. However, the initial deployments have suffered from intermittent downtime during startup and lack data persistence when pods are restarted. To maintain our standard of excellence, you are required to design a pod specification that is resilient, resource-aware, and capable of handling stateful data.

Mission Description

You are tasked with building a robust application environment within a local cluster. You must configure pods that can signal their internal health to the Kubernetes control plane, manage their own resource consumption to avoid cluster-wide starvation, and mount persistent volumes to ensure data survives pod crashes. This is an assignment to build a production-ready pod manifest, not a basic tutorial.

Technical Requirements

  1. Environment Setup: Initialize a local Kubernetes environment using kind or minikube. Do not use cloud-managed services.

  2. Pod Creation: Define a multi-container pod (Sidecar pattern) utilizing a lightweight image (e.g., nginx or alpine).

  3. Health Probes: Implement all three probe types to manage the pod lifecycle:

    • Startup Probe: To handle slow-starting legacy components.

    • Readiness Probe: To ensure the pod only receives traffic when the application is fully initialized.

    • Liveness Probe: To automatically restart the container if the application becomes unresponsive.

  4. Resource Management: Define explicit Requests and Limits for CPU and Memory. The configuration must prevent "noisy neighbor" behavior while allowing for necessary bursts.

  5. Persistent Storage: Configure a Volume and a VolumeMount. For this local lab, use a PersistentVolume (PV) and PersistentVolumeClaim (PVC) backed by local storage to ensure data persistence across pod restarts.

  6. Operational Interaction: Demonstrate the ability to work with a running pod by executing commands internally (e.g., kubectl exec) to verify the mounted volume's integrity.

Constraints & Standards

  • Environment Rules: Use native Kubernetes YAML manifests.

  • Metadata Standards: Every resource (Pods, PVCs, PVs) MUST include the following labels:

    • Key: "Environment", Value: "terraformChamps"

    • Key: "Owner", Value: "<Your_first_name>"

  • Code Quality: Manifests must be clean, modular, and follow standard YAML indentation.

  • Documentation: Write a personal document capturing what you learned, with deep detail on the differences between the three probe types, so you can refresh your knowledge later.

Bonus (Optional)

  1. Architecture Diagram: Build a diagram showing the relationship between the Pod, its Probes, and the Volume lifecycle.

  2. Technical Blog: Create a post on Hashnode or DEV explaining how Requests vs. Limits affect the Kubernetes Scheduler.

  3. ConfigMap Integration: Inject environment variables into your pod using a ConfigMap.


Solution Explanation

Architecture Overview

Before diving into the code, let's understand what we're building:

This architecture implements:

  • Multi-container pod using the Sidecar pattern

  • Three health probes for comprehensive lifecycle management

  • Persistent storage ensuring data survives pod restarts

  • Resource constraints preventing the "noisy neighbor" problem

A "noisy neighbor" occurs when a single Pod consumes so much of a node's resources that other Pods are starved. The remedy is to set resource requests, enforce memory limits, and, in trusted clusters, leave CPU limits unset.


Understanding Health Probes

This is where most teams get tripped up. Let's break down each probe type and understand when and why to use each one.

The Three Probe Types: A Complete Breakdown

1. Startup Probe: The Patient Guardian

Purpose: Protects slow-starting applications from premature termination

The Problem It Solves: Imagine you have a legacy Java application that takes 90 seconds to start. Without a startup probe, Kubernetes might think it's dead and kill it before it finishes initializing.

How It Works:

  • Runs first during pod startup

  • While running, it disables liveness and readiness probes

  • Once it succeeds, it never runs again

  • If it fails after all attempts, the container restarts

Key Configuration:

Startup probes are used for slow-starting applications to give them enough time to initialize, preventing liveness probes from killing the container before the app is actually ready.
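As a sketch of what such a configuration might look like (the endpoint path, port, and timing values below are assumptions for illustration, not taken from any original manifest):

```yaml
startupProbe:
  httpGet:
    path: /               # hypothetical health endpoint
    port: 8080
  periodSeconds: 10       # probe every 10 seconds
  failureThreshold: 12    # 12 x 10s = up to 120s allowed for startup
```

The total startup window is failureThreshold x periodSeconds; the readiness and liveness probes stay disabled until this probe succeeds.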

2. Readiness Probe: The Traffic Controller

Purpose: Controls when the pod receives traffic

The Problem It Solves: Your application is running but not ready to handle requests (e.g., warming up caches, establishing database connections). Without a readiness probe, Kubernetes sends traffic immediately, causing errors.

How It Works:

  • Runs continuously throughout the pod's lifetime

  • When it fails, pod is removed from Service endpoints (no traffic)

  • When it succeeds, pod is added back to Service endpoints (receives traffic)

  • Container is never restarted due to readiness failures

Key Configuration:

Readiness probes do not restart containers; they only control traffic flow by removing an unhealthy but running application from service until it is ready again.
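A minimal sketch (path, port, and timings are illustrative assumptions):

```yaml
readinessProbe:
  httpGet:
    path: /               # hypothetical readiness endpoint
    port: 8080
  periodSeconds: 5        # re-check every 5 seconds
  failureThreshold: 3     # 3 misses remove the pod from Service endpoints
```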

3. Liveness Probe: The Watchdog

Purpose: Detects and recovers from broken application states

The Problem It Solves: Your application is running (the process exists) but completely hung/deadlocked. Without a liveness probe, it stays in this broken state forever.

How It Works:

  • Runs continuously after startup probe succeeds

  • When it fails repeatedly, Kubernetes restarts the container

  • This is the "have you tried turning it off and on again" probe

Key Configuration:

Liveness probes are not for crashed processes; they detect an application that is running but unhealthy. Once the failureThreshold is exceeded, Kubernetes kills and restarts the container to recover it.
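A minimal sketch (values are assumptions for illustration):

```yaml
livenessProbe:
  httpGet:
    path: /               # hypothetical liveness endpoint
    port: 8080
  periodSeconds: 10       # re-check every 10 seconds
  failureThreshold: 3     # 3 consecutive failures restart the container
```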

Critical Timing Relationships

The probes work together in a specific sequence: the startup probe runs first and suppresses the other two; once it succeeds, the readiness and liveness probes take over and run for the rest of the container's life.

Storage Design

Containers are ephemeral by design, but applications often need persistent data. Here's how we solve this.

The Storage Stack

PersistentVolume (PV) - The Storage Resource

What is a PersistentVolume?

A PersistentVolume (PV) is a cluster-wide storage resource that represents a piece of physical storage in your Kubernetes cluster. Think of it as a hard drive or storage space that Kubernetes knows about and can allocate to applications.

Why do we need it?

Containers are ephemeral - when they die, their data dies with them. But many applications need to persist data beyond the container's lifetime (logs, databases, user uploads, etc.). PersistentVolumes solve this by providing storage that exists independently of any pod.

Key Characteristics:

  • Cluster-wide resource: Not tied to any specific namespace

  • Lifecycle independence: Exists before and after pods are created/destroyed

  • Administrator-managed: Usually created by cluster admins

  • Supports various backends: hostPath (local), NFS, cloud storage (EBS, Azure Disk), etc.

In our scenario: We're creating a 2Gi storage volume on the local node using hostPath at /mnt/petclinic_logs. This storage will persist even when our pod is deleted, allowing us to recover logs after pod restarts.

The Manifest:
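A sketch of such a PV, based on the values stated above (2Gi, hostPath at /mnt/petclinic_logs, ReadWriteOnce); the resource name and storageClassName are assumptions:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-petclinicserver        # name is an assumption
  labels:
    Environment: terraformChamps
    Owner: Your_first_name        # replace with your own name
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual        # assumption; must match the PVC
  hostPath:
    path: /mnt/petclinic_logs     # local-only; not for production
```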

Key Points:

  • PV is a cluster-wide resource (not namespaced)

  • Represents actual storage capacity

  • hostPath is for local development only (not production!)

  • ReadWriteOnce means one node can mount it read-write

PersistentVolumeClaim (PVC) - The Storage Request

What is a PersistentVolumeClaim?

A PersistentVolumeClaim (PVC) is a request for storage by a user or application. If a PersistentVolume is the "storage available," then a PVC is the "storage requested." It's like making a reservation for storage space.

Why do we need it?

PVCs provide an abstraction layer between pods and storage. Instead of pods directly mounting PVs (which would require knowing infrastructure details), they request storage through PVCs. This separates concerns: developers request storage through PVCs, and administrators provision storage through PVs.

Key Characteristics:

  • Namespace-scoped: Belongs to a specific namespace (unlike PVs)

  • User-managed: Created by application developers/users

  • Binds to PVs: Kubernetes automatically finds a suitable PV that matches the PVC's requirements

  • Used by pods: Pods reference PVCs, not PVs directly

How PVC binds to PV: Kubernetes looks for a PV that:

  1. Has enough capacity (≥ requested storage)

  2. Matches the access mode (ReadWriteOnce, ReadOnlyMany, etc.)

  3. Matches the storage class (if specified)

In our scenario: We're requesting 1Gi of storage (from our 2Gi PV) with ReadWriteOnce access mode. We explicitly bind to our PV using volumeName, but normally Kubernetes would automatically find a matching PV.

The Manifest:
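A sketch of the claim described above (1Gi, ReadWriteOnce, explicit volumeName binding); the storageClassName is an assumption and must match the PV:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-petclinicserver
  labels:
    Environment: terraformChamps
    Owner: Your_first_name          # replace with your own name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi                  # less than the 2Gi PV capacity
  storageClassName: manual          # assumption; must match the PV
  volumeName: pv-petclinicserver    # explicit binding to our PV
```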

Key Points:

  • PVC is namespaced (belongs to a project)

  • Requests storage from a PV

  • Can request less than PV capacity

  • Binding happens automatically (or explicitly via volumeName)

Using the Volume in a Pod

Testing Persistence:
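One way to verify persistence, assuming the pod, container, and file names used in this guide's examples (petclinic-pod, log-exporter, pod.yaml):

```shell
# Write a marker file into the mounted volume
kubectl exec petclinic-pod -c log-exporter -- \
  sh -c 'echo "persistence test" > /var/log/marker.txt'

# Delete and recreate the pod
kubectl delete pod petclinic-pod
kubectl apply -f pod.yaml

# The file should still exist after the restart
kubectl exec petclinic-pod -c log-exporter -- cat /var/log/marker.txt
```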


Resource Management

Resource management is about two things: guarantees (requests) and limits.

Requests vs. Limits: The Critical Difference

How Kubernetes Uses This Information

CPU Requests & Limits

Memory Requests & Limits
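As a concrete illustration of the two knobs (the numbers are assumptions, not the lab's actual values):

```yaml
resources:
  requests:             # what the scheduler reserves on a node
    cpu: 250m
    memory: 256Mi
  limits:               # hard ceilings enforced at runtime
    cpu: 500m           # excess CPU usage is throttled
    memory: 512Mi       # exceeding this gets the container OOM-killed
```

Requests decide where the pod can be scheduled; limits decide what it may consume once running.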


If a Container specifies its own memory limit, but does not specify a memory request, Kubernetes automatically assigns a memory request that matches the limit. Similarly, if a Container specifies its own CPU limit, but does not specify a CPU request, Kubernetes automatically assigns a CPU request that matches the limit.

Quality of Service (QoS) Classes

Kubernetes assigns QoS classes based on your resource configuration:

QoS Class  | Configuration                                      | Priority | Use Case
Guaranteed | Requests equal limits for every container resource | Highest  | Critical production apps
Burstable  | At least one container has a CPU or memory request | Medium   | Most applications
BestEffort | No requests or limits anywhere in the pod          | Lowest   | Development/testing

Our configuration creates Burstable pods:

With Burstable QoS, the best practice is to set CPU and memory requests, enforce a memory limit to protect the node from memory leaks, and leave the CPU limit unset so pods can better utilize shared CPU capacity without throttling.
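In manifest form, that best practice might look like this (values are illustrative):

```yaml
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    memory: 512Mi       # memory limit only; CPU limit deliberately unset
```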


Pod Manifest

Now that we've covered storage and resource management, let's look at the complete pod configuration that ties everything together.

Pod - The Application Runtime

What is a Pod?

A Pod is the smallest deployable unit in Kubernetes. It's a wrapper around one or more containers that share storage, network, and a specification for how to run the containers. Think of a pod as a "logical host" - just like multiple processes can run on the same physical machine and share resources, multiple containers in a pod share resources.

Why do we need it?

Pods provide:

  1. Container co-location: Run multiple containers that need to work closely together

  2. Shared resources: Containers in a pod share the same network namespace (localhost), storage volumes, and process namespace (if configured)

  3. Lifecycle management: All containers in a pod are scheduled together, start together, and scale together

  4. Health monitoring: Kubernetes can monitor container health through probes

Single Container vs Multi-Container Pods:

Single Container Pod (Most common):

  • One application per pod

  • Simple, straightforward deployment

  • Example: A web server

Multi-Container Pod (Our approach):

  • Multiple containers that need tight coupling

  • Multi-Container Patterns:

    1. Sidecar (our example):

      • Auxiliary functionality (logging, monitoring)

      • Shares pod lifecycle

    2. Ambassador:

      • Proxy to external services

      • Simplifies network configuration

    3. Adapter

      • Standardizes output/formats

      • Useful for heterogeneous environments

      • Example: Main app + a helper that converts its logs or metrics into a standard format

The Sidecar Pattern (What we're using):

The Sidecar pattern involves running a helper container alongside the main application container. The sidecar enhances or extends the main container's functionality without changing it.

Our Sidecar Use Case:

  • Main Container (petclinic): Runs the Spring Boot application

  • Sidecar Container (log-exporter): Collects and writes logs to persistent storage

Benefits of this pattern:

  • Separation of concerns (main app doesn't need to handle log persistence)

  • Independent scaling and updates

  • Reusable sidecars across different applications

  • Shared storage through volume mounts

Health Probes in Action:

Our pod implements all three probe types to ensure maximum reliability:

  1. Startup Probe (120s window):

    • Gives the Spring Boot app time to initialize

    • Prevents premature killing during slow startup

  2. Readiness Probe (every 5s):

    • Checks if app is ready to serve traffic

    • Removes pod from service if health check fails

  3. Liveness Probe (every 10s):

    • Detects if app is deadlocked or hung

    • Restarts container if health check fails repeatedly

Resource Management:

We define both requests and limits:

  • Requests: Guaranteed resources for scheduling

  • Limits: Maximum resources to prevent runaway consumption

Storage Integration:

The pod mounts our PVC (pvc-petclinicserver) as a volume (logs-volume), which is then mounted into the sidecar container at /var/log. This creates a persistent log storage location.

Pod Manifest:
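The sketch below assembles the pieces described above. The pod name, images, probe endpoints, and sidecar command are assumptions; the probe timings, resource shape, volume name, and PVC name follow the text:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: petclinic-pod              # name is an assumption
  labels:
    Environment: terraformChamps
    Owner: Your_first_name         # replace with your own name
spec:
  containers:
    - name: petclinic
      image: springcommunity/spring-framework-petclinic  # assumed image
      ports:
        - containerPort: 8080
      startupProbe:                # 120s window for slow startup
        httpGet:
          path: /
          port: 8080
        periodSeconds: 10
        failureThreshold: 12
      readinessProbe:              # every 5s; gates Service traffic
        httpGet:
          path: /
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
      livenessProbe:               # every 10s; restarts a hung app
        httpGet:
          path: /
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          memory: 512Mi            # Burstable QoS: no CPU limit
    - name: log-exporter           # sidecar writing logs to the volume
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /var/log/app.log; sleep 10; done"]
      volumeMounts:
        - name: logs-volume
          mountPath: /var/log
  volumes:
    - name: logs-volume
      persistentVolumeClaim:
        claimName: pvc-petclinicserver
```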


Validation & Operations

Scripts for Testing

Probe Status Monitor (probe-status.sh)

This script provides intelligent probe status monitoring:

Usage:

Monitor Health Probes
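The script itself is not reproduced here; a minimal sketch of what such a probe monitor might do (the pod name is an assumption):

```shell
#!/bin/sh
# probe-status.sh (sketch): summarize probe-related state for a pod
POD="${1:-petclinic-pod}"

# Ready condition as reported by the kubelet
kubectl get pod "$POD" \
  -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}{"\n"}'

# Restart count hints at liveness failures
kubectl get pod "$POD" \
  -o jsonpath='{.status.containerStatuses[*].restartCount}{"\n"}'

# Probe failure events, if any
kubectl get events --field-selector involvedObject.name="$POD" | grep -i probe
```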

CPU Resource Test (resource-cpu-test.sh)

This script validates CPU throttling behavior:

Usage:

Test Resource Limits
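A hedged sketch of such a test (pod and container names are assumptions; kubectl top requires metrics-server in the cluster):

```shell
#!/bin/sh
# resource-cpu-test.sh (sketch): burn CPU inside the pod, then watch usage
POD="${1:-petclinic-pod}"

# Generate load in the main container (busy loop for roughly 30 seconds)
kubectl exec "$POD" -c petclinic -- sh -c \
  'end=$(($(date +%s) + 30)); while [ $(date +%s) -lt $end ]; do :; done' &

# Observe actual usage against requests and limits
kubectl top pod "$POD"
```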


Step-by-Step Deployment Guide

Step 1: Create Your Cluster

Step 2: Deploy Storage Resources

Step 3: Deploy the Pod

Step 4: Monitor Probe Status

Step 5: Verify Persistent Storage

Step 6: Test Resource Behavior
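The six steps above might look like this in practice (the cluster name and manifest file names are assumptions):

```shell
# Step 1: create a local cluster with kind
kind create cluster --name forgtech-lab

# Step 2: deploy storage resources
kubectl apply -f pv.yaml
kubectl apply -f pvc.yaml
kubectl get pv,pvc          # the PVC should show STATUS "Bound"

# Step 3: deploy the pod
kubectl apply -f pod.yaml

# Step 4: monitor probe status
kubectl get pod petclinic-pod -w
kubectl describe pod petclinic-pod

# Step 5: verify persistent storage
kubectl exec petclinic-pod -c log-exporter -- ls /var/log

# Step 6: test resource behavior (requires metrics-server)
kubectl top pod petclinic-pod
```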


Debugging Common Issues

Issue 1: Pod Stuck in "Pending" State

Issue 2: PVC Not Binding

Issue 3: Container Constantly Restarting

Issue 4: Probe Failures
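Some generic first-response commands for these four issues (resource names follow the examples in this guide; petclinic-pod is an assumption):

```shell
# Issue 1: a Pending pod is usually unschedulable (insufficient
# resources or an unbound PVC); the events explain why
kubectl describe pod petclinic-pod

# Issue 2: a PVC stays Pending when no PV matches its capacity,
# access mode, or storageClassName
kubectl describe pvc pvc-petclinicserver

# Issue 3: constant restarts often mean the liveness probe is too
# aggressive; check the previous container's logs
kubectl logs petclinic-pod -c petclinic --previous

# Issue 4: probe failures show up as events; verify the path and port
kubectl get events --field-selector involvedObject.name=petclinic-pod
```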


Conclusion

Building production-ready Kubernetes pods isn't just about running containers; it's about:

  1. Resilience: Health probes that prevent downtime

  2. Efficiency: Resource management that prevents waste

  3. Reliability: Persistent storage that survives failures

The configuration we built solves three problems:

  • No more startup downtime: Startup probe allows slow initialization

  • No more data loss: PV/PVC ensures persistence

  • No more resource contention: Proper requests and limits

You can see the full implementation on GitHub.

This is the foundation of cloud-native application deployment. Master these concepts, and you'll be ready to run production workloads with confidence.


Questions or Feedback?

If you found this guide helpful, please share it with your team. If you have questions or suggestions, feel free to reach out!

Author: Omar Tamer

Happy Kubernetes-ing!
