By having a Highly Available Kubernetes Pi Cluster, you will have full control over your production-grade environment on-premises.
HA Kubernetes Pi Cluster (Part I)
(Total Setup Time: 25 mins)
On this special day, I would like to wish all Singaporeans and Singapore a Happy 55th National Day!
With the newly purchased 2x Raspberry Pi Model B 8GB and 64GB SD card added to my collection, I will set up a Highly Available Kubernetes Pi Cluster. In this guide, I will set up an external etcd key-value store. In the next article, I will continue with the HA configuration.
Preparing OS
(10 mins)
First, I am using Ubuntu Server (64-bit) as my OS. After burning the image onto my 64GB SD card, create an empty file named ssh in d:/boot. This section is required for each master node.
Second, change the default password (ubuntu) of the default ubuntu user, then upgrade the OS:
sudo apt update
sudo apt upgrade
Third, change the hostname by running:
sudo vi /etc/hostname
sudo vi /etc/hosts
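Fourth, let iptables see bridged traffic:
# Checks if br_netfilter is loaded
lsmod | grep br_netfilter
# Loads it explicitly
sudo modprobe br_netfilter
# Loads it again on every boot (modprobe alone does not survive the reboot below)
echo br_netfilter | sudo tee /etc/modules-load.d/k8s.conf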
# Sees bridged traffic
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
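As a quick sanity check on the sysctl step, the sketch below scans a config file and reports any of the two bridge keys that is not set to 1. The function name missing_bridge_keys is my own, not part of any tool, and it only inspects the file, not the live kernel values.

```shell
# Prints each required bridge sysctl key that is NOT set to 1 in the given file.
# Prints nothing and returns 0 when both keys are present and enabled.
missing_bridge_keys() {
  conf_file=$1
  status=0
  for key in net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables; do
    # Accepts optional whitespace around "=".
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*1[[:space:]]*$" "$conf_file"; then
      echo "$key"
      status=1
    fi
  done
  return $status
}
```

For example, missing_bridge_keys /etc/sysctl.d/k8s.conf should print nothing once the file above has been written.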
Fifth, enable the memory cgroup by appending the following to the existing single line in /boot/firmware/cmdline.txt:
cgroup_enable=cpuset cgroup_enable=memory cgroup_memory=1 swapaccount=1
Finally, add the following to /boot/firmware/usercfg.txt to disable WiFi and Bluetooth:
dtoverlay=disable-wifi
dtoverlay=disable-bt
# Reboots to apply the changes
sudo reboot
# After the reboot, the memory cgroup should report 1 (enabled)
grep mem /proc/cgroups | awk '{ print $4 }'
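The grep above matches any line containing "mem". A slightly stricter variant, sketched here with a helper name of my own (memory_cgroup_enabled), selects only the row whose first column is exactly memory and prints its fourth column (enabled).

```shell
# Reads /proc/cgroups-style text on stdin and prints the "enabled" column
# (4th field) of the memory controller row only.
memory_cgroup_enabled() {
  awk '$1 == "memory" { print $4 }'
}
```

Usage on a node: memory_cgroup_enabled < /proc/cgroups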
Creating Virtual IP
(5 mins)
First, install keepalived on all master nodes, using LVS-NAT-Keepalived as a reference.
# Installs keepalived
sudo apt install keepalived
# Configures keepalived
sudo vi /etc/keepalived/keepalived.conf
# VRRP instance definitions
# state MASTER for the first master, BACKUP for the other master nodes
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 150
authentication {
auth_type PASS
auth_pass seehiong
}
virtual_ipaddress {
192.168.100.200
}
}
# Virtual Servers definitions
virtual_server 192.168.100.200 6443 {
delay_loop 6
lb_algo rr
lb_kind NAT
protocol TCP
real_server 192.168.100.119 6443 {
weight 1
TCP_CHECK {
connect_timeout 3
connect_port 6443
}
}
real_server 192.168.100.173 6443 {
weight 1
TCP_CHECK {
connect_timeout 3
connect_port 6443
}
}
real_server 192.168.100.100 6443 {
weight 1
TCP_CHECK {
connect_timeout 3
connect_port 6443
}
}
}
# Restarts keepalived
sudo systemctl restart keepalived
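Since only the vrrp_instance block changes between the first master and the others, a small helper can print it for a given state and priority. This is a sketch: render_vrrp_instance is a hypothetical name, and the priority of 100 for BACKUP nodes is my assumption (keepalived elects the node with the highest priority, so any value below 150 should do).

```shell
# Prints the vrrp_instance block from this guide for the given state and priority.
# $1: MASTER or BACKUP, $2: VRRP priority (the highest priority holds the VIP)
render_vrrp_instance() {
  state=$1
  priority=$2
  cat <<EOF
vrrp_instance VI_1 {
    state ${state}
    interface eth0
    virtual_router_id 51
    priority ${priority}
    authentication {
        auth_type PASS
        auth_pass seehiong
    }
    virtual_ipaddress {
        192.168.100.200
    }
}
EOF
}
```

For example, render_vrrp_instance MASTER 150 reproduces the block above, while render_vrrp_instance BACKUP 100 gives a variant for the other master nodes.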
Second, test the connection. It is expected to fail at this point because nothing is listening on port 6443 yet:
nc -v 192.168.100.200 6443
# Expected result
nc: connect to 192.168.100.200 port 6443 (tcp) failed: Connection refused
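The same check can be wrapped in a function that returns an exit status instead of a message, which is handy in scripts. This sketch uses bash's /dev/tcp redirection and the coreutils timeout command rather than nc; port_open is a name I made up.

```shell
# Returns 0 if a TCP connection to host:port succeeds within 3 seconds,
# non-zero otherwise. Relies on bash's /dev/tcp and coreutils timeout.
# $1: host or IP, $2: port
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example: port_open 192.168.100.200 6443 && echo "VIP is answering" || echo "VIP not reachable yet"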
Preparing certs for etcd
(5 mins)
First, following the openssl CA guide, configure openssl and create the root cert:
sudo su
# Openssl configuration: in the [ CA_default ] section, set dir as follows
vi /usr/lib/ssl/openssl.cnf
[ CA_default ]
dir = /root/ca
mkdir /root/ca
cd /root/ca
mkdir newcerts certs crl private requests
touch index.txt
echo '1234' > serial
# Root certificate
openssl genrsa -aes256 -out private/cakey.pem 4096
openssl req -new -x509 -key private/cakey.pem -out cacert.pem -days 3650 -set_serial 0 -subj '/C=SG/ST=SG/O=seehiong/CN=master1'
Second, create certs for all master nodes:
# Create master nodes' certificate
cd /root/ca/requests/
openssl genrsa -out etcd-key.pem
openssl req -new -key etcd-key.pem -out etcd.csr -subj '/C=SG/ST=SG/O=seehiong/CN=master1,master2,master3,localhost,cluster-endpoint'
openssl ca -in etcd.csr -out etcd.pem \
-extfile <(printf "subjectAltName=IP:192.168.100.119,IP:192.168.100.173,IP:192.168.100.100,IP:192.168.100.200,IP:127.0.0.1,\
DNS:master1,DNS:master2,DNS:master3,DNS:localhost,DNS:cluster-endpoint")
openssl genrsa -out peer-etcd-key.pem
openssl req -new -key peer-etcd-key.pem -out peer-etcd.csr -subj '/C=SG/ST=SG/O=seehiong/CN=192.168.100.119,192.168.100.173,192.168.100.100,192.168.100.200'
openssl ca -in peer-etcd.csr -out peer-etcd.pem \
-extfile <(printf "subjectAltName=IP:192.168.100.119,IP:192.168.100.173,IP:192.168.100.100,IP:192.168.100.200,IP:127.0.0.1,\
DNS:master1,DNS:master2,DNS:master3,DNS:localhost,DNS:cluster-endpoint")
rm etcd.csr peer-etcd.csr
mv etcd-key.pem peer-etcd-key.pem /root/ca/private/
mv etcd.pem peer-etcd.pem /root/ca/certs/
# Protects /root/ca folder (removes group/other access but keeps directories traversable)
chmod -R go-rwx /root/ca
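The long subjectAltName string is repeated for both certificates, so it can be generated once from the list of IPs and host names. The helper below is only a sketch; build_san is a hypothetical name, and it assumes the values contain no spaces.

```shell
# Builds an openssl subjectAltName line from space-separated lists.
# $1: IP addresses, $2: DNS names
build_san() {
  san=""
  for ip in $1; do
    san="${san:+$san,}IP:$ip"
  done
  for name in $2; do
    san="${san:+$san,}DNS:$name"
  done
  printf 'subjectAltName=%s\n' "$san"
}
```

In bash it could then be used as: openssl ca -in etcd.csr -out etcd.pem -extfile <(build_san "192.168.100.119 192.168.100.173 192.168.100.100 192.168.100.200 127.0.0.1" "master1 master2 master3 localhost cluster-endpoint")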
Third, copy all certs to /srv/etcd-certs/ on all master nodes.
# Copies certs to master1
mkdir -p /srv/etcd-certs
cp /root/ca/cacert.pem /root/ca/private/etcd-key.pem /root/ca/private/peer-etcd-key.pem \
/root/ca/certs/etcd.pem /root/ca/certs/peer-etcd.pem /srv/etcd-certs/
# Copies certs to other master nodes
scp /root/ca/cacert.pem /root/ca/private/etcd-key.pem /root/ca/private/peer-etcd-key.pem \
/root/ca/certs/etcd.pem /root/ca/certs/peer-etcd.pem ubuntu@master2:/tmp
scp /root/ca/cacert.pem /root/ca/private/etcd-key.pem /root/ca/private/peer-etcd-key.pem \
/root/ca/certs/etcd.pem /root/ca/certs/peer-etcd.pem ubuntu@master3:/tmp
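# ssh into the other master nodes and perform these
sudo mkdir -p /srv/etcd-certs
sudo mv /tmp/*.pem /srv/etcd-certs/
# The etcd user is only created in the next section; run this once it exists
sudo chown -R etcd:etcd /srv/etcd-certs/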
Lastly, update the CA certificates on all nodes (update-ca-certificates only picks up files ending in .crt):
sudo cp /srv/etcd-certs/cacert.pem /usr/local/share/ca-certificates/cacert.crt
sudo update-ca-certificates --fresh
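Before moving on, it may help to confirm that every node really has all five files. The following sketch (missing_etcd_certs is my own helper name) lists any file from this guide that is missing or empty in the given directory.

```shell
# Prints every expected cert/key file that is missing or empty.
# $1: certificate directory (defaults to /srv/etcd-certs)
# Returns 0 only when all five files are present.
missing_etcd_certs() {
  cert_dir=${1:-/srv/etcd-certs}
  status=0
  for f in cacert.pem etcd.pem etcd-key.pem peer-etcd.pem peer-etcd-key.pem; do
    if [ ! -s "$cert_dir/$f" ]; then
      echo "$f"
      status=1
    fi
  done
  return $status
}
```

Run it with sudo privileges if the directory is not readable by your user, for example: sudo bash -c "$(declare -f missing_etcd_certs); missing_etcd_certs"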
Setting up etcd
(5 mins)
First, following the v3.4.10 release instructions, set up etcd on each master as follows:
ETCD_VER=v3.4.10
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GITHUB_URL}
# Downloads the arm64 architecture for Raspberry Pi
rm -f /tmp/etcd-${ETCD_VER}-linux-arm64.tar.gz
rm -rf /tmp/etcd-download-test && mkdir -p /tmp/etcd-download-test
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-arm64.tar.gz -o /tmp/etcd-${ETCD_VER}-linux-arm64.tar.gz
# Extracts etcd
tar xzvf /tmp/etcd-${ETCD_VER}-linux-arm64.tar.gz -C /tmp/etcd-download-test --strip-components=1
rm -f /tmp/etcd-${ETCD_VER}-linux-arm64.tar.gz
# Checks version
# etcd fails here with: etcd on unsupported platform without ETCD_UNSUPPORTED_ARCH=arm64 set
/tmp/etcd-download-test/etcd --version
# etcdctl succeeds with: etcdctl version: 3.4.10, API version: 3.4
/tmp/etcd-download-test/etcdctl version
# Moves to /usr/local/bin
sudo cp /tmp/etcd-download-test/etcd /usr/local/bin/
sudo cp /tmp/etcd-download-test/etcdctl /usr/local/bin/
export ETCD_UNSUPPORTED_ARCH=arm64
# Checks version again
# Succeeds with: running etcd on unsupported architecture "arm64" since ETCD_UNSUPPORTED_ARCH is set
etcd --version
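If the same steps are reused on a non-Pi machine, the architecture suffix in the tarball name has to change. Here is a minimal sketch that maps the machine type to the release suffix and builds the download URL; both function names are my own, and only the arm64 and amd64 cases are handled.

```shell
# Maps a machine type (default: uname -m) to the suffix used in etcd release names.
etcd_arch() {
  machine=${1:-$(uname -m)}
  case "$machine" in
    aarch64|arm64) echo arm64 ;;
    x86_64|amd64)  echo amd64 ;;
    *) echo "unsupported machine type: $machine" >&2; return 1 ;;
  esac
}

# Builds the GitHub release URL for a given version and architecture.
# $1: version such as v3.4.10, $2: architecture such as arm64
etcd_tarball_url() {
  echo "https://github.com/etcd-io/etcd/releases/download/$1/etcd-$1-linux-$2.tar.gz"
}
```

For example: curl -L "$(etcd_tarball_url v3.4.10 "$(etcd_arch)")" -o /tmp/etcd.tar.gz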
Second, prepare etcd as a service on master1:
sudo vi /lib/systemd/system/etcd.service
# Inserts the following into etcd.service for master1
[Unit]
Description=etcd key-value store
Documentation=https://etcd.io/docs/v3.4.0/
[Service]
User=etcd
Type=notify
Environment=ETCD_UNSUPPORTED_ARCH=arm64
# Logging flags
Environment=ETCD_LOGGER=zap
# Member flags
Environment=ETCD_NAME=infra1
Environment=ETCD_DATA_DIR=/var/lib/etcd
Environment=ETCD_LISTEN_PEER_URLS=https://192.168.100.119:2380
Environment=ETCD_LISTEN_CLIENT_URLS=https://192.168.100.119:2379,https://127.0.0.1:2379
Environment=ETCD_HEARTBEAT_INTERVAL=1000
Environment=ETCD_ELECTION_TIMEOUT=5000
# Clustering flags
Environment=ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.100.119:2380
Environment=ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
Environment=ETCD_INITIAL_CLUSTER=infra1=https://192.168.100.119:2380,infra2=https://192.168.100.173:2380,infra3=https://192.168.100.100:2380
Environment=ETCD_INITIAL_CLUSTER_STATE=new
Environment=ETCD_ADVERTISE_CLIENT_URLS=https://192.168.100.119:2379
# Security flags
Environment=ETCD_CLIENT_CERT_AUTH=true
Environment=ETCD_TRUSTED_CA_FILE=/srv/etcd-certs/cacert.pem
Environment=ETCD_CERT_FILE=/srv/etcd-certs/etcd.pem
Environment=ETCD_KEY_FILE=/srv/etcd-certs/etcd-key.pem
Environment=ETCD_PEER_CLIENT_CERT_AUTH=true
Environment=ETCD_PEER_TRUSTED_CA_FILE=/srv/etcd-certs/cacert.pem
Environment=ETCD_PEER_CERT_FILE=/srv/etcd-certs/peer-etcd.pem
Environment=ETCD_PEER_KEY_FILE=/srv/etcd-certs/peer-etcd-key.pem
ExecStart=/usr/local/bin/etcd
Restart=always
RestartSec=10s
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
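The unit files for the three masters differ only in five lines: the member name and the four URLs that carry the node's own IP. As a sketch, those lines can be generated per node so they are not edited by hand; render_member_env is a hypothetical helper, not something etcd provides.

```shell
# Prints the five member-specific Environment= lines of etcd.service.
# $1: etcd member name (e.g. infra2), $2: the node's own IP address
render_member_env() {
  name=$1
  ip=$2
  cat <<EOF
Environment=ETCD_NAME=${name}
Environment=ETCD_LISTEN_PEER_URLS=https://${ip}:2380
Environment=ETCD_LISTEN_CLIENT_URLS=https://${ip}:2379,https://127.0.0.1:2379
Environment=ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${ip}:2380
Environment=ETCD_ADVERTISE_CLIENT_URLS=https://${ip}:2379
EOF
}
```

For example, render_member_env infra2 192.168.100.173 prints the lines to use on master2. All other lines of the unit file stay identical on every node.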
Third, create the etcd data folder and system account on master1:
sudo mkdir -p /var/lib/etcd
# etcd fails if file permissions are not set correctly
sudo chmod -R 700 /var/lib/etcd
# Creates system user
sudo adduser --system etcd
sudo addgroup etcd
sudo usermod -aG etcd etcd
# The service runs as the etcd user, so it must own its data and certs
sudo chown -R etcd:etcd /var/lib/etcd /srv/etcd-certs/
Fourth, install etcd as a service:
sudo systemctl daemon-reload
sudo systemctl enable etcd
sudo systemctl stop etcd
sudo systemctl start etcd.service
systemctl status etcd.service
# Check for logs
journalctl -xeu etcd
Fifth, ssh into the other master nodes (verify that step 1 is done by running etcd --version) and perform the following:
ssh ubuntu@master2
sudo vi /lib/systemd/system/etcd.service
# Variations for step 2 for master2
# Inserts the following into etcd.service
[Unit]
Description=etcd key-value store
Documentation=https://etcd.io/docs/v3.4.0/
[Service]
User=etcd
Type=notify
Environment=ETCD_UNSUPPORTED_ARCH=arm64
# Logging flags
Environment=ETCD_LOGGER=zap
# Member flags
Environment=ETCD_NAME=infra2
Environment=ETCD_DATA_DIR=/var/lib/etcd
Environment=ETCD_LISTEN_PEER_URLS=https://192.168.100.173:2380
Environment=ETCD_LISTEN_CLIENT_URLS=https://192.168.100.173:2379,https://127.0.0.1:2379
Environment=ETCD_HEARTBEAT_INTERVAL=1000
Environment=ETCD_ELECTION_TIMEOUT=5000
# Clustering flags
Environment=ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.100.173:2380
Environment=ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
Environment=ETCD_INITIAL_CLUSTER=infra1=https://192.168.100.119:2380,infra2=https://192.168.100.173:2380,infra3=https://192.168.100.100:2380
Environment=ETCD_INITIAL_CLUSTER_STATE=new
Environment=ETCD_ADVERTISE_CLIENT_URLS=https://192.168.100.173:2379
# Security flags
Environment=ETCD_CLIENT_CERT_AUTH=true
Environment=ETCD_TRUSTED_CA_FILE=/srv/etcd-certs/cacert.pem
Environment=ETCD_CERT_FILE=/srv/etcd-certs/etcd.pem
Environment=ETCD_KEY_FILE=/srv/etcd-certs/etcd-key.pem
Environment=ETCD_PEER_CLIENT_CERT_AUTH=true
Environment=ETCD_PEER_TRUSTED_CA_FILE=/srv/etcd-certs/cacert.pem
Environment=ETCD_PEER_CERT_FILE=/srv/etcd-certs/peer-etcd.pem
Environment=ETCD_PEER_KEY_FILE=/srv/etcd-certs/peer-etcd-key.pem
ExecStart=/usr/local/bin/etcd
Restart=always
RestartSec=10s
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
ssh ubuntu@master3
sudo vi /lib/systemd/system/etcd.service
# Variations for step 2 for master3
# Inserts the following into etcd.service
[Unit]
Description=etcd key-value store
Documentation=https://etcd.io/docs/v3.4.0/
[Service]
User=etcd
Type=notify
Environment=ETCD_UNSUPPORTED_ARCH=arm64
# Logging flags
Environment=ETCD_LOGGER=zap
# Member flags
Environment=ETCD_NAME=infra3
Environment=ETCD_DATA_DIR=/var/lib/etcd
Environment=ETCD_LISTEN_PEER_URLS=https://192.168.100.100:2380
Environment=ETCD_LISTEN_CLIENT_URLS=https://192.168.100.100:2379,https://127.0.0.1:2379
Environment=ETCD_HEARTBEAT_INTERVAL=1000
Environment=ETCD_ELECTION_TIMEOUT=5000
# Clustering flags
Environment=ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.100.100:2380
Environment=ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
Environment=ETCD_INITIAL_CLUSTER=infra1=https://192.168.100.119:2380,infra2=https://192.168.100.173:2380,infra3=https://192.168.100.100:2380
Environment=ETCD_INITIAL_CLUSTER_STATE=new
Environment=ETCD_ADVERTISE_CLIENT_URLS=https://192.168.100.100:2379
# Security flags
Environment=ETCD_CLIENT_CERT_AUTH=true
Environment=ETCD_TRUSTED_CA_FILE=/srv/etcd-certs/cacert.pem
Environment=ETCD_CERT_FILE=/srv/etcd-certs/etcd.pem
Environment=ETCD_KEY_FILE=/srv/etcd-certs/etcd-key.pem
Environment=ETCD_PEER_CLIENT_CERT_AUTH=true
Environment=ETCD_PEER_TRUSTED_CA_FILE=/srv/etcd-certs/cacert.pem
Environment=ETCD_PEER_CERT_FILE=/srv/etcd-certs/peer-etcd.pem
Environment=ETCD_PEER_KEY_FILE=/srv/etcd-certs/peer-etcd-key.pem
ExecStart=/usr/local/bin/etcd
Restart=always
RestartSec=10s
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
# Follows step 3
sudo mkdir -p /var/lib/etcd
sudo chmod -R 700 /var/lib/etcd
sudo adduser --system etcd
sudo addgroup etcd
sudo usermod -aG etcd etcd
sudo mkdir -p /srv/etcd-certs/
# Only needed if the certs copied over earlier have not been moved yet
sudo mv /tmp/*.pem /srv/etcd-certs/
sudo chown -R etcd:etcd /srv/etcd-certs/ /var/lib/etcd /usr/local/bin/etcd /usr/local/bin/etcdctl
# Follows step 4
sudo systemctl daemon-reload
sudo systemctl enable etcd
sudo systemctl stop etcd
sudo systemctl start etcd
systemctl status etcd
Finally, following the etcd security guide (https://etcd.io/docs/v3.4.0/op-guide/security/), I test my etcd setup using:
sudo curl -L https://localhost:2379/metrics --cacert /srv/etcd-certs/cacert.pem --cert /srv/etcd-certs/etcd.pem --key /srv/etcd-certs/etcd-key.pem | grep -v debugging
sudo etcdctl --cacert /srv/etcd-certs/cacert.pem --cert /srv/etcd-certs/etcd.pem --key /srv/etcd-certs/etcd-key.pem member list
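To turn the member list into a simple pass/fail check, the output can be piped through a small counter. This sketch assumes the default etcdctl v3.4 output, where each line has comma-separated fields with the status in the second one; started_members is a name I chose.

```shell
# Counts members whose status (2nd field) is "started".
# Expects `etcdctl member list` default output on stdin, e.g.
#   <id>, started, infra1, <peer URLs>, <client URLs>, false
started_members() {
  awk -F', ' '$2 == "started" { n++ } END { print n + 0 }'
}
```

For this three-node cluster, appending | started_members to the member list command above should print 3 once every node has joined.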
Troubleshooting
Request sent was ignored by remote peer due to cluster ID mismatch
I solved mine by changing ETCD_INITIAL_CLUSTER_TOKEN=[something else]. You can then check the endpoint health:
sudo etcdctl --endpoints=https://cluster-endpoint:2379 --cacert=/srv/etcd-certs/cacert.pem --cert=/srv/etcd-certs/etcd.pem --key=/srv/etcd-certs/etcd-key.pem endpoint health
If it still fails, you may try re-creating the data folder (this deletes the etcd data on that member):
sudo rm -rf /var/lib/etcd && sudo mkdir -p /var/lib/etcd
sudo chown -R etcd:etcd /var/lib/etcd && sudo chmod -R 700 /var/lib/etcd
ERROR:There is already a certificate for /C=SG/ST=SG/O=seehiong/CN=192.168.100.119
You may try to revoke the old certificate and sign the CSR again:
openssl ca -revoke /root/ca/newcerts/1240.pem
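The file name under newcerts is the certificate's serial number, which can be looked up in /root/ca/index.txt. The sketch below assumes the usual openssl index format (tab-separated: status, expiry, revocation date, serial, file name, subject) and returns the serial of the still-valid entry with a given CN; serial_for_cn is a hypothetical helper.

```shell
# Prints the serial of the valid ("V") entry whose subject has exactly the given CN.
# $1: path to the CA index.txt, $2: the CN to look for
serial_for_cn() {
  awk -F'\t' -v want="CN=$2" '
    $1 == "V" {
      # The subject looks like /C=SG/ST=SG/O=seehiong/CN=...
      n = split($6, parts, "/")
      for (i = 1; i <= n; i++)
        if (parts[i] == want) { print $4; break }
    }' "$1"
}
```

For example: openssl ca -revoke "/root/ca/newcerts/$(serial_for_cn /root/ca/index.txt 192.168.100.119).pem"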
Replacing a faulty member
You may have to re-configure your etcd cluster when your SD card fails:
# Get member list
sudo etcdctl --cacert /srv/etcd-certs/cacert.pem --cert /srv/etcd-certs/etcd.pem --key /srv/etcd-certs/etcd-key.pem member list
# Delete the member based on the ID
sudo etcdctl --cacert /srv/etcd-certs/cacert.pem --cert /srv/etcd-certs/etcd.pem --key /srv/etcd-certs/etcd-key.pem member remove 34ef554257cff34e
# Add the previous member (previous settings remain the same, e.g. IP address)
sudo etcdctl --cacert /srv/etcd-certs/cacert.pem --cert /srv/etcd-certs/etcd.pem --key /srv/etcd-certs/etcd-key.pem member add infra3 --peer-urls=https://192.168.100.182:2380
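etcdctl then prints the settings to use on the new member (ETCD_NAME, ETCD_INITIAL_CLUSTER, ETCD_INITIAL_ADVERTISE_PEER_URLS and ETCD_INITIAL_CLUSTER_STATE). Apply them to its service file: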
sudo vi /lib/systemd/system/etcd.service
# Make the following changes
Environment=ETCD_INITIAL_CLUSTER_STATE=existing
# Add new configuration
Environment=ETCD_INITIAL_CLUSTER="infra2=https://192.168.100.181:2380,infra3=https://192.168.100.182:2380,infra1=https://192.168.100.180:2380"
# Restart etcd
sudo systemctl daemon-reload
sudo systemctl start etcd
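Rather than copying the member ID by hand, it can be extracted from the member list by name. As above, this is a sketch that assumes the default comma-separated etcdctl v3.4 output with the ID in the first field and the name in the third; member_id_by_name is my own helper name.

```shell
# Prints the ID (1st field) of the member whose name (3rd field) matches.
# Expects `etcdctl member list` default output on stdin.
# $1: member name, e.g. infra3
member_id_by_name() {
  awk -F', ' -v name="$1" '$3 == name { print $1 }'
}
```

For example, pipe the member list command above into member_id_by_name infra3 and pass the result to member remove.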