
From Zero to Openshift in 30 Minutes

Discover how to leverage the power of kcli and libvirt to rapidly deploy a full OpenShift cluster in under 30 minutes, cutting through the complexity often associated with OpenShift installations.  

Prerequisites

Server with 8+ cores, minimum of 64GB RAM (96+ for >1 worker node)
Fast IO
– dedicated NVMe libvirt storage or
– NVMe LVMCache fronting HDD (surprisingly effective!)
OS installed (tested with CentOS Stream 8)
Packages libvirt + git installed (see the example below)
Pull-secret (store in openshift_pull.json) obtained from https://cloud.redhat.com/openshift/install/pull-secret
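
To satisfy the OS and package prerequisites above, a minimal host-preparation sketch for CentOS Stream 8 might look like this (the package selection and group membership are assumptions; adjust for your environment):

[steve@shift ~]$ sudo dnf install -y libvirt libvirt-daemon-kvm qemu-kvm git
[steve@shift ~]$ sudo systemctl enable --now libvirtd
[steve@shift ~]$ sudo usermod -aG libvirt,qemu steve

Log out and back in for the group change to take effect, and save your pull-secret as openshift_pull.json in the directory you'll deploy from.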

Install kcli

[steve@shift ~]$ git clone https://github.com/karmab/kcli.git
[steve@shift ~]$ cd kcli; ./install.sh

Configure parameters.yml


(see https://kcli.readthedocs.io/en/latest/#deploying-kubernetes-openshift-clusters)

example:-

[steve@shift ~]$ cat parameters.yml
cluster: shift413
domain: shift.local
version: stable
tag: '4.13'
ctlplanes: 3
workers: 3
ctlplane_memory: 16384
worker_memory: 16384
ctlplane_numcpus: 8
worker_numcpus: 4

Note 1: To deploy Single Node OpenShift (SNO), set ctlplanes to 1 and workers to 0.
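
For example, a minimal SNO paramfile might look like the following (the cluster name and the 32GB/8 vCPU sizing are illustrative only; size to suit your host):

[steve@shift ~]$ cat parameters_sno.yml
cluster: sno413
domain: shift.local
version: stable
tag: '4.13'
ctlplanes: 1
workers: 0
ctlplane_memory: 32768
ctlplane_numcpus: 8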

Note 2: Even a fast Xeon with NVMe storage may have difficulty deploying more than 3 workers before the installer times out.
An RFE exists to make the timeout configurable, see:

https://access.redhat.com/solutions/6379571
https://issues.redhat.com/browse/RFE-2512

Deploy cluster

[steve@shift ~]$ kcli create kube openshift --paramfile parameters.yml $cluster

Note: openshift_pull.json and parameters.yml should be in your current working directory; adjust the command above as required.

Monitor Progress

If you wish to monitor progress, find the IP of the bootstrap node:-

[steve@shift ~]$ virsh net-dhcp-leases default
 Expiry Time           MAC address         Protocol   IP address           Hostname            Client ID or DUID
---------------------------------------------------------------------------------------------------------------------
 2023-07-19 15:48:02   52:54:00:08:41:71   ipv4       192.168.122.103/24   ocp413-ctlplane-0   01:52:54:00:08:41:71
 2023-07-19 15:48:02   52:54:00:10:2a:9d   ipv4       192.168.122.100/24   ocp413-ctlplane-1   01:52:54:00:10:2a:9d
 2023-07-19 15:46:30   52:54:00:2b:98:2a   ipv4       192.168.122.211/24   ocp413-bootstrap    01:52:54:00:2b:98:2a
 2023-07-19 15:48:03   52:54:00:aa:d7:02   ipv4       192.168.122.48/24    ocp413-ctlplane-2   01:52:54:00:aa:d7:02

then ssh to the bootstrap node as the core user and follow the instructions shown at login:-

[steve@shift ~]$ ssh core@192.168.122.211
journalctl -b -f -u release-image.service -u bootkube.service

Once the cluster is deployed you'll see the following message:-

INFO Waiting up to 40m0s (until 3:42PM) for the cluster at https://api.ocp413.lab.local:6443 to initialize...
INFO Checking to see if there is a route at openshift-console/console...
INFO Install complete!                            
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/root/.kcli/clusters/ocp413/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocp413.lab.local
INFO Login to the console with user: "kubeadmin", and password: "qTT5W-F5Cjz-BIPx2-KWXQx"
INFO Time elapsed: 16m18s                        
Deleting ocp413-bootstrap

Note: Whilst the above credentials can be retrieved later, it’s worthwhile making a note of them now.  I save them to a text file on the host.

Confirm Status

[root@shift ~]# export KUBECONFIG=/root/.kcli/clusters/ocp413/auth/kubeconfig
[root@lab ~]# oc status
In project default on server https://api.ocp413.lab.local:6443
svc/openshift - kubernetes.default.svc.cluster.local
svc/kubernetes - 172.30.0.1:443 -> 6443
View details with 'oc describe <resource>/<name>' or list resources with 'oc get all'.
[root@shift ~]# oc get nodes
NAME                          STATUS   ROLES                  AGE   VERSION
ocp413-ctlplane-0.lab.local   Ready    control-plane,master   68m   v1.26.5+7d22122
ocp413-ctlplane-1.lab.local   Ready    control-plane,master   68m   v1.26.5+7d22122
ocp413-ctlplane-2.lab.local   Ready    control-plane,master   68m   v1.26.5+7d22122
ocp413-worker-0.lab.local     Ready    worker                 51m   v1.26.5+7d22122
ocp413-worker-1.lab.local     Ready    worker                 51m   v1.26.5+7d22122
ocp413-worker-2.lab.local     Ready    worker                 52m   v1.26.5+7d22122
[root@shift ~]# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.4    True        False         42m     Cluster version is 4.13.4

And by logging in via https://console-openshift-console.apps.ocp413.lab.local/

Note: If the cluster is not installed on your workstation, it may be easier to install a browser on the server and forward X connections, rather than maintaining a local hosts file or modifying local DNS to resolve the cluster’s hostnames:

ssh -X user@server 
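
Alternatively, if you do want to reach the cluster directly from your workstation, the /etc/hosts entries would look roughly like this (a sketch only: the address shown is a placeholder for your API/ingress VIP, which must be routable from the workstation, and api and *.apps may use separate VIPs in your deployment):

192.168.122.253   api.ocp413.lab.local
192.168.122.253   console-openshift-console.apps.ocp413.lab.local
192.168.122.253   oauth-openshift.apps.ocp413.lab.local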

Success \o/

For detailed kcli documentation see: https://kcli.readthedocs.io/en/latest/

OpenShift: How to determine the latest version in an update channel.

There are a few ways to find the latest release in a given channel:
  1. Visit the Releases page in the Red Hat OpenShift Console at https://console.redhat.com/openshift/releases
  2. Visit the Red Hat OpenShift Container Platform Update Graph at https://access.redhat.com/labs/ocpupgradegraph/update_channel
  3. Use the CLI (curl & jq):-
curl -s "https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.11" | jq -r '.nodes[].version' | sort -V | tail -n1

Also, to check available upgrade edges:-

curl -s -XGET "https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.11" --header 'Accept:application/json' |jq '. as $graph | $graph.nodes | map(.version == "4.10.36") | index(true) as $orig | $graph.edges | map(select(.[0] == $orig)[1]) | map($graph.nodes[.]) | .[].version'

Further examples can be found at https://access.redhat.com/solutions/4583231

GlusterFS / VG Metrics in Prometheus (OCP)

We had a requirement to gather LVM (VG) metrics via Prometheus to alert when GlusterFS is running low on ‘brick’ storage space. Currently, within OpenShift 3.9 the only available metrics seem to relate to mounted filesystems. A ‘heketi exporter module’ exists but this only reports space within allocated blocks; there doesn’t appear to be any way to pull metrics from the underlying storage.

We solved this by using a Prometheus pushgateway. Metrics are pushed from Gluster hosts using curl (via cron) and then pulled using a standard Prometheus scrape configuration (via prometheus configmap in OCP). Alerts are then pushed via alertmanager and eventually Cloudforms.

Import the pushgateway image:

oc import-image openshift/prom-pushgateway --from=docker.io/prom/pushgateway --confirm
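
Next, create the pod and expose a route. A hedged sketch using the imported image stream (the application name and route hostname are assumptions):

oc new-app -i openshift/prom-pushgateway --name=pushgateway
oc expose svc/pushgateway --hostname=pushgateway-route.example.com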

With the pushgateway pod running and the route exposed, add a scrape config to the prometheus configmap:-


- job_name: openshift-pushgateway
  scrape_interval: 30s
  scrape_timeout: 30s
  metrics_path: /metrics
  scheme: http
  static_configs:
  - targets:
    - pushgateway-route.example.com

On GlusterFS hosts we then gather metrics in whatever way we like and push to the gateway via curl. Example:-

echo "VG_Gluster 42" | curl --data-binary @- http://pushgateway-route.example.com/metrics/job/pv_mon/pv/vg
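
As a more concrete (hedged) sketch of what the cron job on a Gluster host could do, the following gathers free space for the brick VG using vgs and pushes it to the gateway; the VG name, metric name and route are assumptions:

#!/bin/bash
# /usr/local/bin/push_vg_metrics.sh - run from cron, e.g. */5 * * * *
VG=vg_gluster
GATEWAY=http://pushgateway-route.example.com

# Free space in GiB, stripped of whitespace
FREE_GB=$(vgs --noheadings --units g --nosuffix -o vg_free "$VG" | tr -d ' ')

# Push as a gauge, grouped by job and host
cat <<EOF | curl --data-binary @- "$GATEWAY/metrics/job/pv_mon/instance/$(hostname -s)"
# TYPE vg_free_gigabytes gauge
vg_free_gigabytes{vg="$VG"} $FREE_GB
EOF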

The metrics are then visible via the Prometheus UI / Grafana, and alerts are raised via alertmanager and CFME.

Openshift: Recovery from Head Gear (or Node) Failure

This is another question that has been raised several times recently. Perhaps a node vanishes and is unrecoverable: how do we recover from the loss of a head gear? Is it possible to promote a normal gear to head status?

The simple answer appears to be … no.

The solution here is to run backups of /var/lib/openshift on all nodes.

In the case of node failure, a fresh node can be built and added to the district, /var/lib/openshift restored from backup, and ‘oo-admin-regenerate-gear-metadata’ executed. This (as the name suggests) recreates the metadata associated with every gear on the node, including gear entries in the passwd/group files, cgroup rules and limits.conf.
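
A hedged outline of that recovery on the rebuilt node (the backup host, paths and service name are assumptions; only oo-admin-regenerate-gear-metadata itself is the documented step):

# on the rebuilt node, already added to the district
service ruby193-mcollective stop
rsync -aAX backuphost:/backups/node1/var/lib/openshift/ /var/lib/openshift/
oo-admin-regenerate-gear-metadata
service ruby193-mcollective start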

OpenShift: Testing of Resource Limits, CGroups

Recently I’ve had two customers asking the same question.

How can we put sufficient load on a gear or node in order to demonstrate:-
a) cgroup limits
b) guaranteed resource allocation
c) ‘worst case scenario’ performance expectations

This is perhaps a reasonable question but very difficult to answer. Most of the limits in OSE are imposed by cgroups, with clearly defined values (set in the node’s /etc/openshift/resource_limits.conf). The two obvious exceptions are disk space (using quota) and CPU.

Whilst CPU limits are implemented by cgroups, they are defined in terms of shares; you can’t guarantee a gear x CPU cycles, only allocate it a share, always in competition with other gears. However, by default a gear will only use one CPU core.
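
To see what a gear has actually been allocated, you can inspect its cgroup directly on the node. A sketch, assuming the usual RHEL 6 cgroup mount points (paths can differ, and on some installs all controllers are co-mounted under /cgroup/all):

# <uuid> is the gear UUID, as listed under /var/lib/openshift
cat /cgroup/cpu/openshift/<uuid>/cpu.shares
cat /cgroup/memory/openshift/<uuid>/memory.limit_in_bytes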

When trying to create a cartridge to demonstrate behavior under load, I quickly realised the openshift-watchman process is quick to throttle misbehaving gears. If during testing you see unexpected behaviour, remember to test with and without watchman running!

I took the DIY cartridge as an example and modified the start hook to start a ‘stress’ process. Environment variables can be set using rhc to specify the number of CPU, VM, IO and HD threads. This cartridge does not create network load.

http://www.track3.org.uk/~steve/openshift/openshift-snetting-cartridge-stress-0.0.1-1.el6.x86_64.rpm
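
In essence, the start hook does something like the following (a sketch rather than the literal hook shipped in the RPM; the log path is an assumption):

#!/bin/bash
# start hook: launch stress with worker counts taken from gear env vars
CPU=${STRESS_CPU_THREADS:-1}
IO=${STRESS_IO_THREADS:-0}
VM=${STRESS_VM_THREADS:-0}
HD=${STRESS_HD_THREADS:-0}

nohup stress --cpu "$CPU" --io "$IO" --vm "$VM" --hdd "$HD" \
    > "$OPENSHIFT_LOG_DIR/stress.log" 2>&1 &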

Collection and analysis of load/io data is left to the user.

Creating a ‘stress’ application:-


[steve@broker ~]$ rhc app create snstress stress
Using snetting-stress-0.1 (StressTest 0.1) for 'stress'

Application Options
-------------------
Domain: steve
Cartridges: snetting-stress-0.1
Gear Size: default
Scaling: no

Creating application 'snstress' ... done

Disclaimer: Experimental cartridge to stress test a gear (CPU/IO).
Use top/iotop/vmstat/sar to demonstrate cgroup limits and watchman throttling.
STRESS_CPU_THREADS=1
STRESS_IO_THREADS=0
STRESS_VM_THREADS=0
STRESS_HD_THREADS=0
Note: To override these values use 'rhc env-set' and restart gear
See http://tinyurl.com/procgrr for Resource Management Guide
Stress testing started.

Waiting for your DNS name to be available ... done

Initialized empty Git repository in /home/steve/snstress/.git/

Your application 'snstress' is now available.

URL: http://snstress-steve.example.com/
SSH to: 55647297e3c9c34266000137@snstress-steve.example.com
Git remote: ssh://55647297e3c9c34266000137@snstress-steve.example.com/~/git/snstress.git/
Cloned to: /home/steve/snstress

Run 'rhc show-app snstress' for more details about your app.

‘top’ running on the target node (one core at 100% user):-


top - 14:19:49 up 5:50, 1 user, load average: 0.76, 0.26, 0.11
Tasks: 139 total, 3 running, 135 sleeping, 0 stopped, 1 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 99.7%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1019812k total, 474820k used, 544992k free, 75184k buffers
Swap: 835580k total, 440k used, 835140k free, 80596k cached

Using rhc we stop the application, define some variables (add IO worker threads) and restart:-


[steve@broker ~]$ rhc app stop snstress
RESULT:
snstress stopped
[steve@broker ~]$ rhc app-env STRESS_IO_THREADS=1 --app snstress
Setting environment variable(s) ... done
[steve@broker ~]$ rhc app-env STRESS_VM_THREADS=1 --app snstress
Setting environment variable(s) ... done
[steve@broker ~]$ rhc app-env STRESS_HD_THREADS=1 --app snstress
Setting environment variable(s) ... done
[steve@broker ~]$ rhc app start snstress
RESULT:
snstress started

Check node ‘top’ again (note multiple threads):-


top - 14:23:20 up 5:54, 1 user, load average: 0.53, 0.40, 0.20
Tasks: 142 total, 4 running, 137 sleeping, 0 stopped, 1 zombie
Cpu0 : 1.3%us, 0.3%sy, 0.0%ni, 97.7%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.7%us, 11.9%sy, 0.0%ni, 87.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 2.6%us, 7.3%sy, 0.0%ni, 86.8%id, 2.6%wa, 0.0%hi, 0.0%si, 0.7%st
Cpu3 : 6.6%us, 0.3%sy, 0.0%ni, 92.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.3%st
Mem: 1019812k total, 636048k used, 383764k free, 64732k buffers
Swap: 835580k total, 692k used, 834888k free, 68716k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20637 4325 20 0 262m 198m 176 R 12.0 19.9 0:04.35 stress
20635 4325 20 0 6516 192 100 R 9.6 0.0 0:04.33 stress
20636 4325 20 0 6516 188 96 R 8.0 0.0 0:02.42 stress

Not what’s expected?


[root@node1 ~]# service openshift-watchman status
Watchman is running

Hmmm…


[root@node1 node]# tail -f /var/log/messages
May 26 15:33:55 node1 watchman[7672]: Throttler: throttle => 55647297e3c9c34266000137 (99.99)

… demonstrating watchman is doing its job! But, let’s stop watchman and let the abuse begin…


[root@node1 ~]# service openshift-watchman stop
Stopping Watchman

Top (notice high IO Wait)…


top - 14:26:46 up 5:57, 1 user, load average: 0.70, 0.41, 0.22
Tasks: 142 total, 4 running, 137 sleeping, 0 stopped, 1 zombie
Cpu0 : 0.0%us, 5.4%sy, 0.0%ni, 23.7%id, 69.5%wa, 0.3%hi, 0.3%si, 0.7%st
Cpu1 : 0.3%us, 6.0%sy, 0.0%ni, 64.2%id, 27.8%wa, 0.0%hi, 0.7%si, 1.0%st
Cpu2 : 12.2%us, 0.0%sy, 0.0%ni, 87.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.7%st
Cpu3 : 0.7%us, 11.3%sy, 0.0%ni, 76.4%id, 10.6%wa, 0.0%hi, 0.7%si, 0.3%st

Mem: 1019812k total, 910040k used, 109772k free, 66360k buffers
Swap: 835580k total, 692k used, 834888k free, 339780k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22182 4325 20 0 6516 192 100 R 12.3 0.0 0:00.70 stress
22184 4325 20 0 262m 226m 176 R 10.6 22.7 0:00.60 stress
22185 4325 20 0 7464 1264 152 R 7.3 0.1 0:00.53 stress

Further analysis can be done using vmstat, iotop, sar or your tool of preference.

If IO stops after a few seconds it’s also worth tailing your application log:-


[steve@broker ~]$ rhc tail snstress
[2015-05-26 14:25:34] INFO going to shutdown ...
[2015-05-26 14:25:34] INFO WEBrick::HTTPServer#start done.
stress: info: [21775] dispatching hogs: 1 cpu, 1 io, 1 vm, 1 hdd
[2015-05-26 14:25:35] INFO WEBrick 1.3.1
[2015-05-26 14:25:35] INFO ruby 1.8.7 (2013-06-27) [x86_64-linux]
[2015-05-26 14:25:36] INFO WEBrick::HTTPServer#start: pid=21773 port=8080
stress: FAIL: [21780] (591) write failed: Disk quota exceeded
stress: FAIL: [21775] (394) <-- worker 21780 returned error 1
stress: WARN: [21775] (396) now reaping child worker processes
stress: FAIL: [21775] (400) kill error: No such process

I hope someone, somewhere, finds this useful :o)

OSE 2.x Support Node (MongoDB) Firewall

This is effectively a ‘reverse firewall’: allow everything *except* connections to MongoDB, which are permitted only from the brokers and support nodes. A connection to Mongo without authentication can do little more than query db.version(), but some still consider this a security risk.

#!/bin/bash -x
#
# Script to firewall Openshift Support (Mongo) Nodes
# 21/04/15 snetting

IPTABLES=/sbin/iptables

# Add all brokers and support nodes here (use FQDNs)
OSE_HOSTS="broker1.domain
broker2.domain
supportnode1.domain
supportnode2.domain"

# Convert to IPs and add localhost
MONGO_IPS=$(dig $OSE_HOSTS +short)
MONGO_IPS="$(echo $MONGO_IPS | tr ' ' ','),127.0.0.1"

# Add iptables ACCEPT rules
$IPTABLES -A INPUT -p tcp -s $MONGO_IPS --destination-port 27017 -j ACCEPT

# Add iptables REJECT (port 27017)
$IPTABLES -A INPUT -p tcp --destination-port 27017 -j REJECT
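
Note that rules added like this are not persistent; on RHEL 6 with the stock iptables init script, something like the following would save them across reboots:

service iptables save
chkconfig iptables on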