Frictions and Complexities of "Simple" Scripts

Recently, I have been working on building my home lab. Naturally, I want this to be automated and deterministic. While writing a Bash script to configure and deploy a server running Debian and Paperless-ngx, I was reminded of how even “simple” scripts can rapidly become unwieldy, fragile, and challenging to maintain.

Infrastructure-as-code tools such as Ansible, Chef, Salt, Nix/NixOS, Terraform, and Bicep solve these problems. To varying degrees, these tools employ declarative patterns to achieve idempotency, determinism, and consistency. This is in contrast to an imperative Bash script (or any script). Where declarative tools allow for expressing the desired end state, an imperative script is a series of steps to arrive at the end state.
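As a minimal sketch of that difference in Bash itself: an imperative `echo >> file` appends on every run, while a helper that describes an end state ("this line is present") does nothing once the state holds. The helper name `ensure_line`, the file names, and the `option=value` line are all assumptions made up for illustration, not part of the script in this post.

```shell
#!/bin/bash
# Hypothetical helper: make sure a line is present in a file exactly once.
ensure_line() {
  local line=$1 file=$2
  touch "$file"
  # -F fixed string, -x whole-line match, -q quiet
  if ! grep -Fxq -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}

demo_dir=$(mktemp -d)

# Imperative: every run appends another copy of the line.
echo "option=value" >> "$demo_dir/imperative.conf"
echo "option=value" >> "$demo_dir/imperative.conf"

# Declarative in spirit: repeated runs converge on the same file.
ensure_line "option=value" "$demo_dir/ensured.conf"
ensure_line "option=value" "$demo_dir/ensured.conf"

echo "imperative: $(grep -c '' "$demo_dir/imperative.conf") line(s)"   # 2
echo "ensured: $(grep -c '' "$demo_dir/ensured.conf") line(s)"         # 1
```

Even this tiny guarantee costs a function and a `grep`, which is roughly the bookkeeping that declarative tools do on your behalf.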

Additionally, and again to varying degrees, many declarative infrastructure-as-code tools support diffing between the desired and current state. This allows you to see what the tool intends to change before it changes anything. The average imperative script does not support this, and adding it could easily double or triple the line count.
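To give a feel for what a "plan" step might involve in a script, here is a rough sketch for a single file. The function name `plan_file` and the sample content are hypothetical; it only prints a unified diff and never writes to the file.

```shell
#!/bin/bash
# Hypothetical helper: show what would change in a file without changing it.
# Returns 0 when the file already matches the desired content, 1 otherwise.
plan_file() {
  local file=$1 desired=$2
  local current=/dev/null
  [ -f "$file" ] && current=$file
  # diff exits 0 when both inputs are identical; "-" means standard input
  if printf '%s\n' "$desired" | diff -u "$current" -; then
    echo "ok: $file is already in the desired state"
    return 0
  fi
  return 1
}

demo_file=$(mktemp)
printf 'PasswordAuthentication yes\n' > "$demo_file"

# Prints a diff with one removed and one added line, then the notice below.
plan_file "$demo_file" "PasswordAuthentication no" || echo "changes pending for $demo_file"
```

That is one function for one kind of resource. Packages, services, firewall rules, and directories would each need their own comparison logic, which is where the line count balloons.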

I thought it would be interesting to annotate the script and describe its steps, exposing its complexities, problematic assumptions, and workarounds. Hopefully, this will help explain why “simple” scripts are, more often than not, far from simple - and hopefully make a strong case for using other tools when needed.

Good intentions

Perhaps optimistically, I envisioned the script as straightforward and only a few lines long. However, as I write this, it’s now over a hundred and thirty lines long. I intend to convert this to Ansible because, honestly, writing long Bash scripts is tedious and not how I want to manage my machines.

While I’m on the topic of being honest, I don’t particularly enjoy the Bash syntax or language, and I always feel a slight wave of despair wash over me when I open a load-bearing Bash script hundreds of lines long.

I’m also not proficient in Bash because when I try to use it in anger, I always need to deal with the same frustrating problems - problems that don’t exist or have good solutions in other languages and tools. But once again, I fell into the trap of thinking it would be fine for this task.

I considered using one of the various open-source Bash script templates, such as this one. However, this has over seven hundred lines.

Furthermore, the point of this post is that there are more problems here than simply the language choice for a setup script like this. It’s all the edge cases, footguns, error handling, and the imperative nature of the script.

One of my favourite blogs, rachelbythebay, has some great commentary and observations about exactly this. In her post “Your simple script is someone else’s bad day” she says:

If all of those steps actually succeed, then sure, okay, you win, and it’s probably an improvement over the old manual processes.

Without those checks, what happens if the subsequent steps run, and actually manage to get in some weird state because they ran when they shouldn’t have? It might even make it unable to run again later without manual intervention, since now it won’t be starting from a fresh slate.
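In Bash, the kind of check the quote has in mind might look like the sketch below. The helper names `run_step` and `require_cmd` and the step names are assumptions for illustration; `true` and `false` stand in for real commands.

```shell
#!/bin/bash
# Hypothetical helper: run one named step and report clearly if it fails.
run_step() {
  local name=$1
  shift
  echo "==> $name"
  if ! "$@"; then
    echo "FAILED: $name (command: $*)" >&2
    return 1
  fi
}

# Hypothetical precondition check: refuse to start if a needed tool is missing.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "missing required command: $1" >&2
    return 1
  }
}

# Steps are chained with &&, so a later step never runs after an earlier failure.
deploy() {
  require_cmd sh &&
    run_step "first step" true &&
    run_step "second step (fails)" false &&
    run_step "third step (never reached)" true
}

deploy || echo "deploy stopped early; nothing after the failed step was run"
```

This stops the chain, but it does nothing about the half-finished state left behind by the steps that did run, which is the harder half of the problem the quote describes.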

Assurances and tooling woes

I also would like to be able to test (both unit and integration) more complex scripts, especially as I build more services and servers for my homelab. At that point, mucking around with scripts will become infeasible.

The closest solution for Bash is bash_unit where the getting started instructions are… clone the repo and modify as needed. Ouch! The goal is to reduce line count, not increase it, so that’s not a viable option.
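For comparison, a framework-free test in plain Bash could be sketched as follows. Everything here is hypothetical: `assert_eq` is a hand-rolled helper, and `docker_repo_line` is a small function written only so there is something pure to test; it mirrors the shape of the APT source line this script writes, with `amd64` and `bookworm` as example values.

```shell
#!/bin/bash
# Function under test (hypothetical): builds an APT source line for Docker.
docker_repo_line() {
  local arch=$1 codename=$2
  printf 'deb [arch=%s signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian %s stable\n' \
    "$arch" "$codename"
}

# Hypothetical, hand-rolled assertion helper that counts failures.
failures=0
assert_eq() {
  local expected=$1 actual=$2 label=$3
  if [ "$expected" = "$actual" ]; then
    echo "PASS: $label"
  else
    echo "FAIL: $label"
    echo "  expected: $expected"
    echo "  actual:   $actual"
    failures=$((failures + 1))
  fi
}

assert_eq \
  "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian bookworm stable" \
  "$(docker_repo_line amd64 bookworm)" \
  "repository line for amd64/bookworm"

echo "failures: $failures"
```

It works, but only for code that has been pulled out into side-effect-free functions. Most of a setup script is side effects, and testing those needs a disposable machine or container rather than an assertion helper.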

Both Ansible and Terraform have built-in support for tests.

Requirements

The script needed to perform several tasks:

- Update package lists and upgrade packages via apt
- Install needed dependencies: Git, cURL
- Add package signing keys and repositories for Docker and Tailscale VPN, and then install them
- Authenticate with Tailscale VPN
- Configure SSH: add my public key, disable password-based authentication, and then restart sshd
  - Create /home/$USER/.ssh/authorized_keys
  - Assign appropriate permissions with chmod
- Install and configure UFW firewall
- Create several directories for Paperless-ngx and its users
- Copy configuration files and Docker files to one of these directories
- Start Paperless-ngx via docker compose up

Parsing arguments

The first thing I wanted to write was argument handling. Currently, there are two arguments required: a public SSH key and a Tailscale pre-authentication key. Additionally, I’d like to be able to handle missing arguments. It also needs to be run as root (via sudo), so it asserts that too.

#!/bin/bash

# Exit on error
set -eu

# Configuration
PUBLIC_KEY=""
TAILSCALE_KEY=""

USER=$(who -m | awk '{print $1}')

cd "/home/$USER"

# Function to show usage
usage() {
  echo "Usage: $0 -k '<public-key-string>' -t '<tailscale-auth-key>'"
  exit 1
}

# Parse command line options
while getopts ":k:t:" opt; do
  case ${opt} in
    k )
      PUBLIC_KEY=$OPTARG
      ;;
    t )
      TAILSCALE_KEY=$OPTARG
      ;;
    \? )
      echo "Invalid Option: -$OPTARG" 1>&2
      usage
      ;;
    : )
      echo "Option -$OPTARG requires an argument." 1>&2
      usage
      ;;
  esac
done
shift $((OPTIND -1))

# Validate required option for Public Key
if [ -z "${PUBLIC_KEY}" ]; then
  echo "Missing required argument: -k '<public_key_string>'"
  usage
fi

# Validate required option for Tailscale Key
if [ -z "${TAILSCALE_KEY}" ]; then
  echo "Missing required argument: -t '<tailscale_auth_key>'"
  usage
fi

# Ensure running as root
if [ "$(id -u)" -ne 0 ]; then
  echo "This script must be run as root."
  exit 1
fi

Fifty-seven lines of code, and we’re not even at the stage of installing any packages. This is simply the paperwork¹ to have the script in a usable state. A particularly obnoxious section of code is the option-parsing block, the while getopts loop.

This features a loop, primitive parsing (and not even good parsing, as it only supports single-character options), a case block, variable assignment, and side effects (writing to STDOUT and STDERR). Positively a nightmare for anyone concerned about functional programming or readable code.

To top it off, it needs a bunch of noisy syntax thrown around like magic runes to make it work.
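Because the getopts builtin only understands single-character options, supporting long options means writing the loop by hand. The following is a sketch under assumptions: the option names `--public-key` and `--tailscale-key` and the sample values are invented for illustration and are not in the original script.

```shell
#!/bin/bash
# Hypothetical long-option parser; option names are illustrative only.
PUBLIC_KEY=""
TAILSCALE_KEY=""

parse_args() {
  while [ "$#" -gt 0 ]; do
    case $1 in
      -k|--public-key)
        [ "$#" -ge 2 ] || { echo "Option $1 requires an argument." >&2; return 1; }
        PUBLIC_KEY=$2
        shift 2
        ;;
      -t|--tailscale-key)
        [ "$#" -ge 2 ] || { echo "Option $1 requires an argument." >&2; return 1; }
        TAILSCALE_KEY=$2
        shift 2
        ;;
      --)
        # Conventional end-of-options marker
        shift
        break
        ;;
      *)
        echo "Invalid option: $1" >&2
        return 1
        ;;
    esac
  done
}

parse_args --public-key "ssh-ed25519 EXAMPLE user@host" -t "tskey-example"
echo "public key: $PUBLIC_KEY"
echo "tailscale key: $TAILSCALE_KEY"
```

It reads slightly better at the call site, but it is longer than the getopts version and still has no `--key=value` form, no help text generation, and no type checking.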

Bash, being a text-orientated language and shell, deals primarily with characters and strings.²
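The opening lines of the script show this text-level thinking too: the user name is scraped from the output of `who -m` with awk, and the home directory is assumed to be `/home/<user>`. `who -m` may print nothing when no terminal is attached, so a slightly more defensive sketch could look like this. The function names `resolve_target_user` and `home_of` are hypothetical, and `getent` is assumed to be available, as it is on Debian.

```shell
#!/bin/bash
# Hypothetical helper: work out which user invoked the script.
resolve_target_user() {
  if [ -n "${SUDO_USER:-}" ]; then
    # sudo sets SUDO_USER to the login name of the invoking user
    printf '%s\n' "$SUDO_USER"
  else
    id -un
  fi
}

# Hypothetical helper: look up the home directory in the passwd database
# (field 6) instead of assuming /home/<user>.
home_of() {
  getent passwd "$1" | cut -d: -f6
}

target_user=$(resolve_target_user)
echo "target user: $target_user"
echo "home directory: $(home_of "$target_user")"
```

It is still string handling, just against a better-defined source: an environment variable set by sudo and a colon-delimited database with a fixed field order.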

Adding package repositories

The next step is installing packages, which involves adding package repositories for Tailscale and Docker. Ideally, I’d be using Podman for the Paperless-ngx container, but Paperless-ngx does not work in a rootless container³.

This is a string manipulation-heavy section, involving echo and tee. This is where the script starts to become more tedious to read and write.


# Update packages
echo "Updating and upgrading packages..."
apt-get update -y && apt-get upgrade -y

# Install git, curl, and podman
apt-get install -y git curl podman

# Add Tailscale package (no sudo needed, the script already runs as root)
echo "Installing Tailscale..."
curl -fsSL "https://pkgs.tailscale.com/stable/debian/$(lsb_release -cs).noarmor.gpg" | tee /usr/share/keyrings/tailscale-archive-keyring.gpg >/dev/null
curl -fsSL "https://pkgs.tailscale.com/stable/debian/$(lsb_release -cs).tailscale-keyring.list" | tee /etc/apt/sources.list.d/tailscale.list

# Add Docker package
apt-get install -y ca-certificates
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
chmod a+r /etc/apt/keyrings/docker.asc

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install -y tailscale docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

echo "Authenticating with Tailscale, please wait..."
tailscale up --authkey "$TAILSCALE_KEY"
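One subtlety in the `curl ... | tee ...` pipelines: the script runs with `set -eu`, and by default a pipeline's exit status is that of its last command. A failed download would therefore not stop the script, because `tee` still succeeds. The sketch below demonstrates the effect of `set -o pipefail` with a stand-in function; `failing_download` and its exit code 22 are made up for the demonstration.

```shell
#!/bin/bash
# Stand-in for a failed "curl -f": produces no output and a non-zero status.
failing_download() {
  return 22
}

# Runs "download | consumer" and prints the pipeline's exit status.
pipeline_status() {
  local status=0
  failing_download | cat >/dev/null || status=$?
  echo "$status"
}

set +o pipefail
echo "without pipefail: $(pipeline_status)"   # prints 0, the failure is hidden

set -o pipefail
echo "with pipefail: $(pipeline_status)"      # prints 22, the failure surfaces
```

Under this assumption, changing the script's `set -eu` to `set -euo pipefail` would make a failed key or repository download abort the run instead of leaving an empty file behind.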

Setting up SSH and the firewall

The purpose of this section doesn’t need much of an explanation, but I do want to focus on the way that the script has to update the configuration. Here, I’m using sed to replace a commented-out default with the setting I want, for example turning “yes” into “no” for password authentication. This works, but once again, it reminds me that this is all working at the text level.

There’s a certain persistent feeling of fragility when configuring systems with plain text when those systems all use disparate and made-up formats (essentially schemaless), that require regular expressions and string replacement to update. Another example of where infrastructure-as-code is a clear winner.

I can’t pin this one on Bash, though. This is down to OpenSSH and its maintainers thinking that yet another delimited text format is acceptable when several far safer formats exist that benefit from a defined grammar and syntax, and for which parsing libraries are readily available.

However, this leads to the theme of this article: none of this is particularly elegant or foolproof. What if, for example, this regular expression were used on a file that has one of these “no” or “yes” strings written in a documentation comment?

Well, sed only replaces the first match on a line unless told otherwise with the g flag, but it still acts on every line that matches, comments included. In this scenario, running the script twice falls into the idempotency trap again, potentially replacing the wrong “yes” or “no” strings. To accurately modify only the intended string, the script must distinguish the correct line from irrelevant comments.

However, without a structured file format, identifying the target line involves checking line numbers or differentiating between comments and actual code. This is yet more unnecessary complexity in what should be a simple Bash script.

It works fine for now with this particular file, but there’s no guarantee this temerarious and reckless “apply regular expressions to configuration files and hope for the best” approach would work for other files. This underscores the advantages of using infrastructure-as-code or, more simply, good file formats.
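One way to make such an edit less dependent on the file's current contents is to describe the end state ("this key has this value") rather than a text substitution. The following is only a sketch: `set_config_option` is a hypothetical helper, it assumes a simple "Key value" line format and keys without regular-expression metacharacters, and it is run here against a temporary demo file rather than the real sshd_config.

```shell
#!/bin/bash
# Hypothetical helper: set "Key value" in a config file, idempotently.
# - replaces the first "Key ..." or "#Key ..." line with the desired setting
# - drops any later active duplicates of the same key
# - leaves prose comments such as "# Key is ..." untouched
# - appends the setting if the key is absent (naive for sshd_config, where
#   lines after a Match block only apply to that block)
set_config_option() {
  local file=$1 key=$2 value=$3 tmp
  tmp=$(mktemp)
  awk -v key="$key" -v value="$value" '
    !seen && $0 ~ ("^#?" key "[ \t]") { print key " " value; seen = 1; next }
    $0 ~ ("^" key "[ \t]") { next }
    { print }
    END { if (!seen) print key " " value }
  ' "$file" > "$tmp"
  cat "$tmp" > "$file"   # overwrite in place so ownership and mode are kept
  rm -f "$tmp"
}

demo_conf=$(mktemp)
printf '%s\n' \
  '# PasswordAuthentication yes is the default' \
  '#PasswordAuthentication yes' \
  'PermitRootLogin no' > "$demo_conf"

set_config_option "$demo_conf" "PasswordAuthentication" "no"
set_config_option "$demo_conf" "PasswordAuthentication" "no"   # second run changes nothing
cat "$demo_conf"
```

Unlike the sed one-liner, this also handles a file where the setting is already active with the wrong value. It is also fifteen lines of awk-in-Bash to set one option, which rather proves the point.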


# Setup SSH
echo "Setting up SSH key..."
mkdir -p "/home/$USER/.ssh"
echo "$PUBLIC_KEY" > "/home/$USER/.ssh/authorized_keys"
chown -R "$USER":"$USER" /home/"$USER"/.ssh
chmod 700 /home/"$USER"/.ssh && chmod 600 /home/"$USER"/.ssh/authorized_keys

# Update SSH configuration to disable password authentication
sed -i '/^#PasswordAuthentication yes/c\PasswordAuthentication no' /etc/ssh/sshd_config
sed -i '/^#PermitRootLogin prohibit-password/c\PermitRootLogin without-password' /etc/ssh/sshd_config
systemctl restart sshd

# Install and configure UFW
echo "Configuring firewall (UFW)..."
apt-get install -y ufw
ufw allow OpenSSH
ufw allow http
ufw allow https
ufw --force enable

Setting up Paperless-ngx

After all of this paperwork, as I put it earlier, had been written, it was time to achieve the task I’d set out to do: automate the configuration of a fresh system and the deployment of Paperless-ngx to it.

The only interesting part here is that I create a custom consumption directory structure that Paperless-ngx uses to automatically tag files, which forms part of the workflow I set up.


# Create consume directories for different people
mkdir -v "./paperless-inbox"
# cd ~/paperless-inbox
for personName in <name list here>; do
  mkdir "./paperless-inbox/$personName"
done

# Create paperless-ngx directory
mkdir -v "./paperless-ngx"

# Copy configuration files
cp -a configuration/linux/dpm/. paperless-ngx/

# Start paperless-ngx superuser creation (will prompt for input)
cd paperless-ngx
docker compose run --rm webserver createsuperuser

# Start paperless-ngx
docker compose up -d
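A small illustration of the idempotency problem in this last section: plain `mkdir` exits non-zero when the directory already exists, so under `set -e` a second run of the script would stop at the first `mkdir`. A sketch of a repeatable version follows. The function name `create_inboxes` and the names `alice` and `bob` are placeholders for illustration, and the demo runs in a temporary directory.

```shell
#!/bin/bash
# Hypothetical helper: create one inbox directory per person under a base
# directory. Safe to run repeatedly.
create_inboxes() {
  local base=$1
  shift
  local person
  for person in "$@"; do
    # -p creates missing parents and succeeds if the directory already exists
    mkdir -p "$base/paperless-inbox/$person"
  done
}

demo_home=$(mktemp -d)
people=("alice" "bob")   # placeholder names, not from the original script

create_inboxes "$demo_home" "${people[@]}"
create_inboxes "$demo_home" "${people[@]}"   # second run is a no-op, not an error
ls "$demo_home/paperless-inbox"
```

This only covers directories. The `cp -a` and the interactive `createsuperuser` step would each need their own "has this already been done?" check.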

Thoughts

Overall, I’m glad that I have this script. It does what I set out to achieve. It’s a starting point or at least a reference for what an iteration of it needs to do because it absolutely needs improving.

I described some downsides: terse and difficult-to-scan syntax, error-prone argument parsing, very error-prone and destructive string manipulation in configuration files, a lack of idempotency, and an overall sense that this is the wrong tool for the task.

So, what will I convert this to? Ansible. It’s very declarative and mostly idempotent. It also has a package manager called Galaxy, which is sorely missing in Bash. I touched on this earlier.

I have decided that Ansible will become the language behind my homelab infrastructure. It fits a nice middle ground that some other languages or tools don’t quite reach.

For example, I like the Terraform model of “it’s all just pseudo-JSON”, but for some incomprehensible reason, no one has thought to apply Terraform to bare metal/plain operating system configuration.⁴ Its two main areas are containers and cloud infrastructure, so that’s ruled out.

I also really like the paradigm-shifting Nix/NixOS model: fully declarative, idempotent, with isolated software dependencies, and all the rest. However, it isn’t a tool that would allow me to, for example, network boot totally blank systems with PXE to run a Linux installer.

I’m not at all familiar with Chef, Puppet, or Salt. Some of them require a dedicated server to manage state and monitor nodes, and that’s not really the type of complexity I want to deal with.

This puts Ansible in a favourable position. It has the right balance of being mostly idempotent while allowing for side effects and other system-orientated actions. After all, it simply issues commands over SSH. This makes it a pretty attractive option, and this is why Ansible is so prevalent in Linux system administration circles and with people building infrastructure and homelab environments.

So, my next task is to learn Ansible. It will look something like the following code block. I am pretty excited about the ways I will be able to define and build my homelab servers and services.

Hopefully, you can see the significant jump in capability and readability compared to the Bash script I have currently. The Docker APT repository is added, APT packages are installed, and finally the Docker service is started. Recall how I described a third of the original Bash script as paperwork. Well, a large part of that is taken care of by these few lines.

tasks:
  - name: Add Docker GPG key
    apt_key:
      url: https://download.docker.com/linux/debian/gpg
      state: present

  - name: Add Docker repository
    apt_repository:
      repo: deb [arch=amd64] https://download.docker.com/linux/debian bullseye stable
      state: present

  - name: Install Docker Engine
    apt:
      name:
        - docker-ce
        - docker-ce-cli
        - containerd.io

  - name: Start Docker service
    service:
      name: docker
      state: started

Footnotes

¹ Get it?
² PowerShell, an object-orientated language that I am fond of using (though it has a couple of annoyances - I can think of at least three different ways to define a function), works at a higher level than strings and has a more modern approach to syntax, with better error-handling semantics too.
³ Well, technically, Paperless-ngx supports rootless containers if you don’t want to use multiple OCR languages. I do, so this isn’t an option. This feels like an oversight.
⁴ I don’t understand why this isn’t an area Terraform is pursuing. I’ve read very poorly explained and vague “explanations” online that ultimately fail to explain anything helpful.