Automate Everything in Zsh — Backup, Maintain, Deploy #27

Sandy Lane

•April 18, 2026

Video: Automate Everything in Zsh — Backup, Maintain, Deploy #27 by Taught by Celeste AI - AI Coding Coach


Zsh Lesson 27: Automate Everything — Backup, Deploy, Maintain

Three patterns: backup (tar + retention), deploy (build + sync + reload + verify), maintain (log rotation, disk check, cleanup). The shell scripts that turn manual work into one command.

By this point you have the tools. This lesson is about putting them together into automation that actually runs in production.

Backup automation

The pattern:

Create a timestamped archive.
Verify it exists and is non-empty.
Apply retention policy.
Log the result.

#!/usr/bin/env zsh
set -euo pipefail

DATE=$(date +%Y-%m-%d_%H%M)
BACKUP_DIR="/backup/auto"
SOURCE_DIRS=(
  /home/alice/work
  /etc
)

success() { printf "\033[32m✓ %s\033[0m\n" "$1"; }
warn()    { printf "\033[33m⚠ %s\033[0m\n" "$1"; }

mkdir -p "$BACKUP_DIR"

# Create per-source archives and verify each one
for dir in "${SOURCE_DIRS[@]}"; do
  name=$(basename "$dir")
  archive="$BACKUP_DIR/${name}_${DATE}.tar.gz"
  tar -czf "$archive" -C "$(dirname "$dir")" "$name"

  # Verify: the archive must be non-empty and readable end to end
  [[ -s "$archive" ]] || { warn "Empty archive: $archive"; exit 1; }
  tar -tzf "$archive" > /dev/null

  size=$(du -h "$archive" | cut -f1)
  success "Backed up $name ($size)"
done

# Retention: delete archives older than 7 days
old=$(find "$BACKUP_DIR" -name "*.tar.gz" -mtime +7 | wc -l)
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +7 -delete
warn "Removed $old old backups"

success "Total: $(ls "$BACKUP_DIR"/*.tar.gz | wc -l) archives"

Schedule via cron (lesson 24):

0 3 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1

Daily at 3 AM, log everything.
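
Because a cron job can stop without anyone noticing, it can help to have a second, independent check that the log is still being written. The sketch below is one way to do that; the helper name log_is_fresh is hypothetical, it relies on find's -mmin test (present in GNU and BSD find), and the 26-hour window is an arbitrary choice for a daily job.

```shell
# log_is_fresh FILE MINUTES
# Succeeds if FILE exists and was modified within the last MINUTES minutes.
# (Hypothetical helper; not part of the lesson's scripts.)
log_is_fresh() {
  local file="$1" minutes="$2"
  [ -f "$file" ] || return 1
  # find prints the path only if the mtime is newer than the cutoff
  [ -n "$(find "$file" -mmin "-$minutes" 2>/dev/null)" ]
}

# Example: a daily job should have written within the last 26 hours
# log_is_fresh /var/log/backup.log 1560 || echo "backup log is stale" >&2
```

Running this from a separate cron entry, or from a monitoring host, means it still fires when the backup job itself never starts.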

Backup with rsync (incremental)

For large data sets, use rsync with hard links — only changed files take space:

#!/usr/bin/env zsh
set -euo pipefail

SRC=/home/alice/data/
DEST_BASE=/backup/snapshots
DATE=$(date +%Y-%m-%d_%H%M)
LATEST="$DEST_BASE/latest"
NEW="$DEST_BASE/$DATE"

mkdir -p "$DEST_BASE"

# Hard-link from previous if it exists
if [[ -d "$LATEST" ]]; then
  rsync -a --delete --link-dest="$LATEST" "$SRC" "$NEW"
else
  rsync -a "$SRC" "$NEW"
fi

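# rsync -a copies the source directory's mtime onto the snapshot root;
# reset it so the retention step below measures the snapshot's age instead
touch "$NEW"

# Update "latest" symlink
ln -sfn "$NEW" "$LATEST"

# Retention: remove snapshots older than 30 days (never $DEST_BASE itself)
find "$DEST_BASE" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +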

Each snapshot is a full directory tree, but unchanged files are hard-linked — disk usage is just the delta.
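
To see why an unchanged file costs no extra space, it may help to confirm that two snapshot paths really are the same file on disk. This is a minimal sketch that uses plain ln rather than rsync; the helper name same_file and the temporary paths are made up for the demonstration.

```shell
# same_file A B: succeeds if A and B are hard links to the same inode
# (hypothetical helper for illustration)
same_file() {
  [ "$1" -ef "$2" ]
}

demo=$(mktemp -d)
mkdir "$demo/snap1" "$demo/snap2"
echo "unchanged data" > "$demo/snap1/file.txt"

# A hard link, which is what --link-dest creates for an unchanged file
ln "$demo/snap1/file.txt" "$demo/snap2/file.txt"

if same_file "$demo/snap1/file.txt" "$demo/snap2/file.txt"; then
  echo "hard-linked: stored once"
fi
```

With real snapshots, running du -sh on two snapshot directories in a single command counts each hard-linked file only once, so the second total approximates the delta.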

Deploy automation

The pattern:

Check dependencies.
Build.
Save the previous version (rollback safety).
Deploy.
Verify.

#!/usr/bin/env zsh
set -euo pipefail

REMOTE=myserver
APP_DIR=/var/www/myapp
BUILD_DIR=./build

success() { printf "\033[32m✓ %s\033[0m\n" "$1"; }
error()   { printf "\033[31m✗ %s\033[0m\n" "$1" >&2; }
info()    { printf "\033[34mℹ %s\033[0m\n" "$1"; }

# Step 1: Check dependencies
echo "--- Step 1: Check Dependencies ---"
for tool in rsync git ssh curl; do
  if command -v "$tool" > /dev/null; then
    success "$tool found"
  else
    error "Missing: $tool"; exit 1
  fi
done

# Step 2: Build
echo "--- Step 2: Build ---"
rm -rf "$BUILD_DIR"
npm run build -- --output "$BUILD_DIR"
success "Built $(find "$BUILD_DIR" -type f | wc -l) files"

# Step 3: Sync (with rollback safety)
echo "--- Step 3: Deploy ---"
# Drop the old backup first; otherwise mv would nest current/ inside previous/
ssh "$REMOTE" "if [ -d $APP_DIR/current ]; then rm -rf $APP_DIR/previous && mv $APP_DIR/current $APP_DIR/previous; fi"
rsync -avz --delete "$BUILD_DIR/" "$REMOTE:$APP_DIR/current/"
ssh "$REMOTE" "ln -sfn $APP_DIR/current /var/www/active"

# Step 4: Reload
ssh "$REMOTE" "sudo systemctl reload myapp"

# Step 5: Verify
sleep 2
if curl -fsS "https://example.com/health" > /dev/null; then
  success "Deploy verified"
else
  error "Health check failed — rolling back"
  ssh "$REMOTE" "ln -sfn $APP_DIR/previous /var/www/active && sudo systemctl reload myapp"
  exit 1
fi

Each step has clear success/failure. Rollback on failed health check.

Atomic deploys with symlinks

The current/previous pattern generalizes to a releases directory with a symlink pointing at the active release:

/var/www/myapp/
  releases/
    20260508-103000/
    20260508-094500/
    20260507-160000/
  current -> releases/20260508-103000/

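current is a symlink. Deploying repoints it at the new release; rollback is pointing it back at the previous one. Note that ln -sfn new current is not guaranteed to be atomic: some implementations remove the old link and then create the new one, which leaves a brief window with no link at all. For an atomic switch, create the link under a temporary name and rename it over current (mv -T with GNU coreutils).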

This is what tools like Capistrano, Mina, and most modern deploy systems do.
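
The symlink switch can be sketched as a small function. This is an illustration under stated assumptions, not how any particular deploy tool does it: the function name switch_release and the current.tmp name are invented here, and mv -T requires GNU coreutils.

```shell
# switch_release APP_DIR RELEASE_NAME
# Points APP_DIR/current at releases/RELEASE_NAME via rename.
# Assumes GNU mv (-T); the function name and layout are illustrative.
switch_release() {
  local app_dir="$1" release="$2"
  [ -d "$app_dir/releases/$release" ] || return 1
  # Build the new link under a temporary name, then rename it into place.
  ln -sfn "releases/$release" "$app_dir/current.tmp"
  mv -T "$app_dir/current.tmp" "$app_dir/current"
}

# Deploy:   switch_release /var/www/myapp 20260508-103000
# Rollback: switch_release /var/www/myapp 20260508-094500
```

Because the final step is a rename, a reader of current sees either the old target or the new one. Rollback is the same call with an older release name.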

System maintenance

Daily/weekly chores:

#!/usr/bin/env zsh
set -euo pipefail

LOG=/var/log/maintenance.log

run() {
  echo "[$(date)] $*" | tee -a "$LOG"
  "$@" >> "$LOG" 2>&1
}

# 1. Rotate logs
run logrotate -f /etc/logrotate.conf

# 2. Clean apt cache
run apt-get clean

# 3. Remove orphaned packages
run apt-get autoremove -y

# 4. Disk usage report
run df -h /

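# 5. Check large files (over 100 MB)
echo "Largest files over 100M in /var:" >> "$LOG"
# "|| true": unreadable directories make find exit non-zero, which would
# otherwise abort the script under set -e and pipefail
find /var -type f -size +100M -exec du -h {} + 2>/dev/null | sort -rh | head -n 10 >> "$LOG" || true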

# 6. Old temp files
run find /tmp -type f -mtime +7 -delete

# 7. Failed services
run systemctl --failed

Run weekly via cron. The run wrapper logs each command before running.

Self-healing scripts

Add resilience for flaky operations:

retry() {
  local max=$1; shift
  local n=0
  until "$@"; do
    n=$((n + 1))
    if (( n >= max )); then
      echo "Failed after $max attempts" >&2
      return 1
    fi
    echo "Attempt $n failed; retrying in $((n * 2))s..." >&2
    sleep $((n * 2))
  done
}

retry 5 curl -fsSL "$URL" -o output.json

Linear backoff: the waits between the five attempts are 2, 4, 6, and 8 seconds.
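
If the waits should double after every failure rather than grow by a fixed step, a variant could look like the sketch below. The name retry_exp and its base-delay parameter are assumptions introduced for this example.

```shell
# retry_exp MAX BASE_DELAY COMMAND...
# Runs COMMAND up to MAX times, doubling the wait after each failure.
# (Hypothetical variant of the retry function above.)
retry_exp() {
  local max="$1" delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$max" ]; then
      echo "Failed after $max attempts" >&2
      return 1
    fi
    echo "Attempt $n failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    delay=$((delay * 2))
    n=$((n + 1))
  done
}

# Example: waits of 2, 4, 8, and 16 seconds between five attempts
# retry_exp 5 2 curl -fsSL "$URL" -o output.json
```

Doubling backs off faster than the linear version, which tends to matter when the remote service is overloaded rather than briefly unreachable.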

Locking: prevent concurrent runs

LOCK=/tmp/myjob.lock

if [[ -e "$LOCK" ]]; then
  pid=$(cat "$LOCK")
  if kill -0 "$pid" 2>/dev/null; then
    echo "Already running (PID $pid)" >&2
    exit 0
  fi
  rm -f "$LOCK"
fi

echo $$ > "$LOCK"
trap 'rm -f "$LOCK"' EXIT

# ... do work ...

Pid file: store the PID in a lock file. If another instance starts, check if that PID is still alive. Trap EXIT ensures the lock is removed.

For more rigorous locking, use flock (Linux):

exec 9>/tmp/myjob.lock   # zsh only accepts single-digit fd numbers in this form
flock -n 9 || { echo "Already running" >&2; exit 0; }
# ... work ...

flock is atomic — no race conditions.
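
On systems without flock (macOS does not ship it), a commonly used alternative is a lock directory, since mkdir either creates the directory or fails in one step. The following is a minimal sketch; the helper names and the lock path are hypothetical.

```shell
# acquire_lock DIR: succeeds only for the first caller, because mkdir
# either creates the directory or fails in a single step.
# (Hypothetical helper names; the lock path is an example.)
acquire_lock() {
  mkdir "$1" 2>/dev/null
}

release_lock() {
  rmdir "$1" 2>/dev/null
}

# Typical use at the top of a script:
# LOCKDIR=/tmp/myjob.lock.d
# acquire_lock "$LOCKDIR" || { echo "Already running" >&2; exit 0; }
# trap 'release_lock "$LOCKDIR"' EXIT
```

Unlike flock, a lock directory survives a kill -9 or a power loss, so a stale lock has to be removed by hand or by a separate age check.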

Notifications on failure

notify_failure() {
  local message="$1"
  curl -s -X POST -H "Content-Type: application/json" \
    -d "{\"text\":\"❌ $(hostname): $message\"}" \
    "$SLACK_WEBHOOK_URL"
}

trap 'notify_failure "Backup failed: see /var/log/backup.log"' ERR

# ... script body ...

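trap ... ERR fires whenever a command fails outside a condition such as if or &&. Combined with set -e, the script sends the notification and then exits. The same hook can post to Slack, email, or PagerDuty.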

For email: mail -s "Backup failed" admin@example.com < error.log.

Health check pattern

checks=(
  "test -d /var/www/myapp"
  "systemctl is-active myapp.service"
  "curl -fsS https://example.com/health"
  "test $(df / | tail -1 | awk '{print $5}' | tr -d %) -lt 90"
)

failed=0
for check in "${checks[@]}"; do
  if eval "$check" > /dev/null 2>&1; then
    echo "✓ $check"
  else
    echo "✗ $check"
    failed=$((failed + 1))
  fi
done

[[ $failed -eq 0 ]] || exit 1

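A list of one-liner checks, each run through eval. That is safe here only because the strings are constants written in the script; never build them from user input. Note that the $(df ...) in the last entry is expanded once, when the array is defined, so the output shows the resulting number rather than the command.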
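
One way to avoid eval entirely is to write each check as a small function and loop over the function names. This is a sketch under assumptions: the function names, the APP_DIR default, and the DISK_LIMIT variable are illustrative and not part of the lesson's script.

```shell
# Each check is a small function, so nothing is passed through eval.
# (Function names, the directory, and the threshold are illustrative.)
check_app_dir() { [ -d "${APP_DIR:-/var/www/myapp}" ]; }

check_disk_root() {
  local used
  # df -P guarantees one line per filesystem; column 5 is the use percentage
  used=$(df -P / | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
  [ "$used" -lt "${DISK_LIMIT:-90}" ]
}

# run_checks NAME...: runs each function, reports, returns 1 if any failed
run_checks() {
  local failed=0 name
  for name in "$@"; do
    if "$name" > /dev/null 2>&1; then
      echo "ok   $name"
    else
      echo "FAIL $name"
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}

# run_checks check_app_dir check_disk_root || exit 1
```

Functions are evaluated at check time, can take more than one line, and show a readable name in the report.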

Logging library

Reusable logger:

# lib/log.sh
LOG_FILE="${LOG_FILE:-/tmp/script.log}"
LOG_LEVEL="${LOG_LEVEL:-info}"

_log() {
  local level="$1"; shift
  printf "%s [%s] %s\n" "$(date +%Y-%m-%dT%H:%M:%S)" "$level" "$*" >> "$LOG_FILE"
}

log_info()  { _log INFO  "$@"; }
log_warn()  { _log WARN  "$@"; }
log_error() { _log ERROR "$@"; }
# "if" keeps the return status 0 when debug is off, so set -e does not abort the caller
log_debug() { if [[ "$LOG_LEVEL" == "debug" ]]; then _log DEBUG "$@"; fi; }

Source it in scripts:

source lib/log.sh
log_info "Starting backup"
log_warn "Disk 90% full"
log_error "Failed to connect"
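
The logger above only special-cases debug. If LOG_LEVEL should act as a threshold for every level, one possible extension is to rank the levels numerically. The helper names _level_rank and should_log are assumptions for this sketch.

```shell
# Numeric rank for each level name (unknown names are treated as info)
_level_rank() {
  case "$1" in
    debug) echo 0 ;;
    info)  echo 1 ;;
    warn)  echo 2 ;;
    error) echo 3 ;;
    *)     echo 1 ;;
  esac
}

# should_log LEVEL: succeeds if LEVEL is at or above the configured LOG_LEVEL
should_log() {
  [ "$(_level_rank "$1")" -ge "$(_level_rank "${LOG_LEVEL:-info}")" ]
}

# Example wiring (always returns 0, so it is safe under set -e):
# log_info() { if should_log info; then _log INFO "$@"; fi; }
```

With LOG_LEVEL=warn, info and debug messages are dropped while warn and error still reach the log file.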

Common stumbles

Backup that runs but fails silently. Always check tar's exit code; verify the archive isn't empty.

Retention deletes too much. Test with -print before -delete. Or use a "trash" dir: mv to /backup/expired/ and clean later.

Deploy that succeeds without verifying. Always health-check after the reload. A fast curl -fsS against the /health endpoint catches most broken deploys.

Rollback doesn't work. If deploy is broken, rollback might be too. Test rollback regularly — schedule a "deploy + rollback" drill weekly.

Cron silently broken. No mail, no log, you don't notice. Always redirect output to a log file. Periodically check the log exists and is recent.

Locking races. A naive [[ -e lock ]]; touch lock has a race window between the check and the touch. A PID file handles stale locks but keeps that window; prefer flock where it is available.

set -e skipping cleanup. If set -e exits on error, your final cleanup might be skipped. Use trap ... EXIT.

Hardcoded paths. Scripts written in /Users/alice/scripts won't run for bob. Use $HOME and config files.
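
For the retention stumble above, the "trash" directory idea could be sketched as follows. The helper name expire_archives and the directory names in the usage comments are hypothetical; the find tests are the same ones the backup script uses.

```shell
# expire_archives SRC_DIR TRASH_DIR DAYS
# Moves *.tar.gz files older than DAYS days into TRASH_DIR instead of
# deleting them. (Hypothetical helper; directory names are examples.)
expire_archives() {
  local src="$1" trash="$2" days="$3"
  mkdir -p "$trash"
  # -maxdepth 1 keeps find out of TRASH_DIR if it lives under SRC_DIR
  find "$src" -maxdepth 1 -type f -name "*.tar.gz" -mtime "+$days" \
    -exec mv {} "$trash/" \;
}

# Dry run first: list what would be moved
# find /backup/auto -maxdepth 1 -type f -name "*.tar.gz" -mtime +7 -print
# expire_archives /backup/auto /backup/expired 7
```

Since mv keeps each file's modification time, a later cleanup of the trash directory can use the same -mtime test with a longer window.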

What's next

Lesson 28: macOS-specific tricks. Clipboard, Spotlight, disk utilities.

Recap

Backup: tar + retention; rsync --link-dest for incremental snapshots; verify before celebrating.
Deploy: check deps, build, save previous, sync, reload, verify, rollback on failure.
Maintenance: log rotation, package cleanup, disk monitoring, old-temp cleanup.
Use locking (flock or pid file) to prevent concurrent runs.
Always log + redirect cron output.
Trap ERR for failure notifications.

