Implementing Auto-Scaling for Improved Performance: A Backend Engineer’s Journey

4 min readNov 27, 2023

As a backend engineer, my side project involved building a scalable web application hosted on a Virtual Private Server (VPS). The challenge was to ensure that the application could handle varying loads efficiently without manual intervention. To address this, I decided to implement an auto-scaling model using Docker and custom scripts. In this article, I’ll walk you through the steps I took to achieve this.

I was at the limit of my cloud knowledge and the project needed to be cost-effective, well… cost-effective

Dockerizing the Application

The first step was to containerize my application using Docker. This allowed me to package the application along with its dependencies, ensuring consistency across different environments. I created a Dockerfile that defined the application's environment, dependencies, and configurations. Additionally, I used Docker Compose to manage multiple services within my application.

version: '3'
services:
  web:
    image: my-web-app:latest
    ports:
      - "8080:80"
    deploy:
      replicas: 3  # Initial number of replicas

Setting Up Auto-Scaling with Docker

Auto-Scaling Script

To achieve auto-scaling, I developed a custom bash script (scale.sh) that periodically monitored the CPU usage of my Docker containers and adjusted the number of replicas based on predefined thresholds.

#!/bin/bash

# Define scaling conditions based on metrics (e.g., CPU usage)
SCALE_UP_THRESHOLD=80
SCALE_DOWN_THRESHOLD=20
# Get current CPU usage
CPU_USAGE=$(docker stats --format "{{.CPUPerc}}" $(docker ps --format "{{.Names}}" | grep "my-web-service") | awk -F. '{print $1}')
# Scale up if CPU usage is above the threshold
if [ "$CPU_USAGE" -gt "$SCALE_UP_THRESHOLD" ]; then
  docker service scale my-web-service=$(($(docker service inspect --format='{{.Spec.Mode.Replicated.Replicas}}' my-web-service) + 1))
  echo "$(date): Scaling up due to high CPU usage: $CPU_USAGE%" >> scaling.log
fi
# Scale down if CPU usage is below the threshold
if [ "$CPU_USAGE" -lt "$SCALE_DOWN_THRESHOLD" ]; then
  docker service scale my-web-service=$(($(docker service inspect --format='{{.Spec.Mode.Replicated.Replicas}}' my-web-service) - 1))
  echo "$(date): Scaling down due to low CPU usage: $CPU_USAGE%" >> scaling.log
fi

Scheduling the Auto-Scaling Script

I utilized cron jobs to schedule the execution of the auto-scaling script at regular intervals. This ensured that the application’s performance was consistently monitored and adjusted as needed.

*/5 * * * * /path/to/scale.sh

Monitoring with Prometheus and Grafana

To gain insights into the performance metrics of my application, I set up Prometheus and Grafana. Docker Compose was used to orchestrate these monitoring services.

version: '3'
services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus:/etc/prometheus
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    ports:
      - "3000:3000"
    depends_on:
      - prometheus

Custom Dashboards and Logging

Grafana provided a user-friendly interface to create custom dashboards that visualized key metrics. The auto-scaling activities were logged to a file (scaling.log) for future analysis and auditing.

Testing the Auto-Scaling Script

Ensuring the reliability of the auto-scaling script is crucial for the smooth operation of your application. Here, I’ll outline how I approached testing to verify that the script performs as expected.

1. Unit Testing:

Before deploying the script in a live environment, I conducted unit testing to validate its core functionality. This involved creating a test environment with a simulated workload to mimic varying CPU usage scenarios.

Sample Unit Test Script (test_scale.sh):

#!/bin/bash

# Mock CPU usage for testing
mock_cpu_usage() {
  echo "50"  # Adjust as needed for different scenarios
}
# Source the auto-scaling script
source scale.sh
# Set the CPU usage to simulate a scenario
CPU_USAGE=$(mock_cpu_usage)
# Test scaling up
SCALE_UP_THRESHOLD=80
SCALE_DOWN_THRESHOLD=20
echo "Testing scaling up..."
$CPU_USAGE=85
./scale.sh
# Validate that the script scaled up the replicas appropriately
# Test scaling down
echo "Testing scaling down..."
$CPU_USAGE=15
./scale.sh
# Validate that the script scaled down the replicas appropriately
echo "Unit tests completed successfully!"

2. Integration Testing:

Integration testing involved deploying the application and the auto-scaling script in a controlled environment that closely resembled the production setup. This allowed me to observe how the script interacted with the Docker services in a more realistic scenario.

Sample Integration Test Script (integration_test.sh):

#!/bin/bash
# Set up the test environment
# Deploy your application using Docker Compose
docker-compose up -d
docker-compose -f docker-compose-monitoring.yml up -d
# Source the auto-scaling script
source scale.sh
# Monitor the logs to observe scaling activities
tail -f scaling.log &
# Trigger the script manually or wait for the cron job to execute
# Observe the logs and verify that scaling activities are logged correctly
# Clean up the test environment
docker-compose down
docker-compose -f docker-compose-monitoring.yml down
echo "Integration tests completed successfully!"

3. Continuous Integration (CI) Pipeline:

To automate testing in my development workflow, I integrated the script testing into a continuous integration pipeline. This pipeline would execute the unit and integration tests whenever changes were pushed to the version control repository. This ensured that any modifications to the script were thoroughly tested before deployment.

By incorporating these testing practices, I gained confidence in the reliability of the auto-scaling script. Regular testing helped catch potential issues early in the development process, reducing the risk of disruptions in the production environment.

Conclusion

Implementing auto-scaling for my side project on a VPS using Docker, custom scripts, and monitoring tools significantly improved the application’s performance. The ability to dynamically scale resources based on real-time metrics allowed me to optimize resource usage and enhance user experience. This experience underscores the importance of automation in maintaining a responsive and efficient backend infrastructure.

As a backend engineer, this project not only expanded my knowledge of containerization and orchestration but also highlighted the value of proactive performance management. The combination of Docker, auto-scaling scripts, and monitoring tools proved to be a robust solution for achieving scalability and ensuring a reliable user experience in a dynamic environment.

In future iterations, I plan to explore more advanced auto-scaling techniques, integrate additional monitoring solutions, and fine-tune the scaling thresholds for optimal performance. The journey continues as I strive to keep pace with the evolving landscape of backend engineering and cloud infrastructure.