Beyond the Dashboard: How a 14B LLM Brought Real Intelligence to My Server Monitoring

If you manage Linux servers, you already know the morning ritual. You sip your coffee and stare at your monitoring dashboards. Grafana, Zabbix, Datadog – pick your poison. They are excellent at showing you lines on a graph, but let’s be honest: traditional monitoring is fundamentally “dumb.”

Standard monitoring relies on rigid thresholds. If CPU usage hits 95%, you get an alert. But what if that 95% CPU usage is just the scheduled weekly backup running alongside a routine malware scan? Your dashboard doesn’t care. It fires off an alert anyway, contributing to the slow, inevitable creep of alert fatigue.

I wanted something better. I didn’t just want monitoring; I wanted monitoring plus intelligence.

Recently, I decided to put a relatively small, 14-billion parameter Large Language Model (qwen3:14b) to work on a real-life, practical use case: acting as an intelligent filter for my daily systems audits. The result has completely transformed how I handle server alerts.

The Problem: Alert Fatigue and the Need for Context

You can’t just pipe gigabytes of raw logs into an LLM; you’ll blow past the context window in seconds. But you also can’t rely on simple grep scripts to tell you if a server is actually in trouble.

My goal was to build a system that evaluates the holistic health of the server – logs, malware scans, SELinux alerts, HTTP health checks, and system metrics – and makes a decision about whether I actually need to be bothered.

Crucially, I wanted this delivered via email. I don’t want to be tethered to a VPN to check a dashboard. Email notifies me wherever I am, but only if I actually need to look at it.

The Architecture: Bash Meets Python Meets LLM

To make this work, I built a pipeline that marries traditional Linux CLI tools with modern AI:

  1. The Cronjob: Schedules the logic. It runs daily checks in the background, but forces a full “all-clear” overview email once a week.
  2. The Bash Aggregator: This is the workhorse. It uses standard tools to ruthlessly filter out routine log noise. It grabs only the critical errors from Apache, Exim, and systemd. It runs maldet, pings local and external DNS, calculates PHP-FPM memory averages, and checks disk space, formatting it all into a concise text block.
  3. The Python LLM Wrapper: This script takes the aggregated data and POSTs it to my local LLM API.
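
To make the wrapper stage concrete, here is a minimal sketch of it. The endpoint URL, script structure and prompt wording are illustrative assumptions, not my production code:

```python
import json
import urllib.request

# Hypothetical local LLM endpoint - adjust for your own API server
LLM_API = "http://localhost:11434/api/generate"

SYSTEM_PROMPT = (
    "You are a senior sysadmin triaging a daily server audit. "
    "If nothing is operationally threatening, reply with exactly NO_ALERT. "
    "Otherwise, write a short email describing the issue and a recommended fix."
)

def build_payload(report: str) -> dict:
    """Wrap the aggregated audit text in a request for the local LLM."""
    return {
        "model": "qwen3:14b",
        "prompt": f"{SYSTEM_PROMPT}\n\n=== AUDIT DATA ===\n{report}",
        "stream": False,
    }

def ask_llm(report: str) -> str:
    """POST the audit to the local API and return the model's verdict."""
    req = urllib.request.Request(
        LLM_API,
        data=json.dumps(build_payload(report)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    # Demonstrate payload construction only; no network call
    payload = build_payload("disk: 41% used; maldet: clean; exim: no errors")
    print(payload["model"])
```

The silence logic then becomes trivial: if the reply is NO_ALERT, the script exits without sending any mail.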

The Magic: Contextual Intelligence

The brilliance of this setup lies in the system prompt. Because the 14B model is highly capable of following instructions, I don’t just ask it to summarise. I give it operational authority to decide if an alert is warranted.

If the model evaluates the filtered logs and metrics and finds nothing operationally threatening, it stays entirely quiet. I receive nothing. No daily spam, no ignored “All OK” alerts.

I only get a daily email if the LLM detects a genuine issue. However, to ensure the system hasn’t silently died, I have a cronjob flag (--force-email) that triggers every Sunday. If everything is fine on Sunday, the LLM generates a friendly, comprehensive “All Clear” summary of the week.
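
The scheduling piece is plain cron. The script names and times below are illustrative placeholders, not my actual crontab:

```shell
# Daily audit at 06:10 - the wrapper emails only if the LLM flags an issue
10 6 * * * /usr/local/bin/audit-aggregate.sh | /usr/local/bin/llm_audit.py

# Sunday heartbeat - force the weekly "All Clear" summary even when nothing is wrong
10 6 * * 0 /usr/local/bin/audit-aggregate.sh | /usr/local/bin/llm_audit.py --force-email
```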

Real-World Impact

This shifts the use of LLMs from an interesting novelty to a highly practical engineering tool.

Instead of waking up to a generic “High Memory Usage” alert, the LLM evaluates the context. It looks at the available system RAM, checks the average PHP-FPM process size, realises a pool is misconfigured, and sends an email recommending a specific, mathematically safe max_children value that leaves a 250MB buffer to prevent an Out-Of-Memory kernel panic.

It doesn’t replace human oversight, but it entirely automates the diagnostic triage. By combining the data aggregation of standard monitoring with the contextual intelligence of an LLM, I’m saving significant time and curing my alert fatigue.

And on Sundays, when the forced check-in runs, the prompt is instructed to remind my sysadmin colleagues to enjoy a good coffee and a slice of cake. It’s a pretty great way to end the week.

From the Ham Shack to the Edge: Delivering Real Value with Local LLMs

What happens when you combine an NVIDIA RTX 3060, an open-weight 14-billion parameter LLM, and a global network of amateur radio operators? You get a surprisingly perfect example of edge computing.

By day, I help enterprises scale complex architectures as a Red Hat OpenShift Technical Account Manager. But like many in our industry, my passion for problem-solving does not stop when I log off. A colleague recently suggested I stop keeping my side projects to myself and share what I build in my spare time. So, let me show you how I am using small, locally hosted AI to solve real-world problems right here in my amateur radio shack.

The Context: Open Source Digital Radio

Amateur Radio is currently transitioning from analogue to digital. Much of the digital voice landscape has historically been dominated by locked-down proprietary codecs. FreeDV changes that: it is a suite of digital voice modes built entirely on open-source software. No secrets, no proprietary lock-in; just a community free to experiment.

The Problem: Finding the Signal in the Noise

When using FreeDV, operators rely on live data APIs to see who is transmitting. While standard tools are great, I wanted a way to instantly identify opportunities to communicate with rare, distant stations (known as ‘DX’ in the community) without manually parsing through an hour of rolling logs.

The Solution: FreeDV Reporter+

I built a free, web-based tool called FreeDV Reporter+. It connects to the FreeDV Socket.IO API to map live activity, but the real magic is the AI integration.

Every hour, the application analyses the live data stream, condenses the logs, and generates a human-readable summary of the best opportunities on the bands.
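
The condensing step matters as much as the model: an hour of live data has to fit a small context window. A toy version of that aggregation follows; the record format and callsigns are assumptions for illustration, and the real Socket.IO payloads differ:

```python
from collections import Counter

def condense(spots: list[dict], top_n: int = 3) -> str:
    """Reduce an hour of reporter events to a few prompt-sized lines."""
    by_band = Counter(s["band"] for s in spots)      # transmissions per band
    by_call = Counter(s["callsign"] for s in spots)  # activity per station
    lines = [f"{band}: {n} transmissions" for band, n in by_band.most_common(top_n)]
    lines += [f"most active: {call} ({n})" for call, n in by_call.most_common(1)]
    return "\n".join(lines)

spots = [
    {"callsign": "ZL2XYZ", "band": "20m"},
    {"callsign": "G4ABC", "band": "20m"},
    {"callsign": "ZL2XYZ", "band": "40m"},
]
print(condense(spots))
```

Only this handful of lines, not the raw stream, is handed to the LLM for summarisation.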

Under the Hood: Lean AI on Consumer Hardware

The application is a lightweight Python Flask app, containerised and deployed using Podman.

Instead of relying on expensive cloud APIs, the AI summarisation runs locally on a consumer-grade NVIDIA RTX 3060 GPU, using the qwen3-14b model via the OpenWebUI API. This is a perfect example of how a relatively small 14-billion-parameter model can provide real, tangible value by assessing live data right at the source.
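
The container side is deliberately boring. Something along these lines, where the image name, port and environment variable are placeholders rather than the project’s actual values:

```shell
# Build the Flask app image and run it rootless under Podman
podman build -t freedv-reporter-plus .
podman run -d --name reporter -p 8080:8080 \
  -e LLM_API_URL="http://192.168.1.10:3000/api" \
  freedv-reporter-plus
```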

Scaling from the Shack to the Enterprise Edge

Processing data near the physical location of the user is the fundamental definition of edge computing. While my setup is a grassroots project, the architectural principles apply directly to modern enterprise challenges.

If an organisation wanted to scale an architecture like FreeDV Reporter+ across thousands of locations (perhaps telecom base stations, logistics networks or electrical substations), they would need to deploy and manage these AI applications consistently. This is exactly where Red Hat OpenShift comes in.

OpenShift extends Kubernetes to the edge, allowing teams to manage remote deployments with the same operational consistency as their core data centres. By utilising Red Hat OpenShift AI, teams can serve and monitor right-sized models securely at the edge, reducing latency and preserving bandwidth.

Crucially, OpenShift AI incorporates vLLM to manage the costs and performance of inferencing. vLLM is an open-source framework that drastically increases inference throughput and optimises memory usage. When deploying models to constrained edge environments where compute resources are strictly limited, this level of efficiency is absolutely vital to keeping operations lean and responsive.

Furthermore, with tools like RHEL AI and InstructLab, developers can easily fine-tune models for more specialist purposes. You get enterprise-grade accuracy for specific tasks without the massive compute costs associated with general-purpose behemoths.

The Takeaway

You do not need massive data centres to extract real value from AI. By strategically deploying small, fine-tuned LLMs at the edge, you can deliver real-time insights securely and efficiently. Whether you are hunting for rare radio signals across the globe or optimising a production line, the edge is where AI truly gets to work.

If you are a radio amateur, I would love for you to try out FreeDV Reporter+. And if you are interested in running robust AI workloads at the edge, let us connect and talk about OpenShift!

What I Learnt Giving an LLM Agent Control of My Crypto Wallet

UkkoTrader Dashboard Header showing Net Loss from early trades

In my role as a Senior OpenShift Technical Account Manager at Red Hat, I focus on mission-critical stability: helping organisations navigate the shift from cloud-native architectures to AI-ready operations. But there is a distinct difference between advising on a scalable MLOps workflow and trusting a local LLM to trade your own capital in a volatile market.

Would you trust an AI agent with your bank account? I did, and it was a masterclass in ‘Boom or Bust’ logic.

Continue reading What I Learnt Giving an LLM Agent Control of My Crypto Wallet

Fedora 42 Meets CUDA 12.9: The Quest to Build vllm (InstructLab)

Over the past couple of weeks I’ve been wrestling with building vllm (with CUDA support) under Fedora 42. Here’s the short version of what went wrong:-

  1. Python version confusion
    • My virtualenv was pointing at Python 3.11 but CMake kept complaining it couldn’t find “python3.11.”
    • Fix: explicitly passed -DPYTHON_EXECUTABLE=$(which python) to CMake, which got past the Python lookup errors.
  2. CUDA toolkit headers/libs not found
    • Although Fedora’s CUDA 12.9 RPMs were installed, CMake couldn’t locate CUDA_INCLUDE_DIRS or CUDA_CUDART_LIBRARY.
    • Fix: set CUDA_HOME=/usr/local/cuda-12.9 and passed -DCUDA_TOOLKIT_ROOT_DIR & -DCUDA_SDK_ROOT_DIR to CMake.
  3. cuDNN import errors
    • Pip’s PyTorch import of libcudnn.so.9 failed during the vllm build.
    • Fix: reinstalled torch via the official PyTorch cu121 wheel index so that all the nvidia-cudnn-cu12 wheels were in place.
  4. GCC / Clang version mismatches
    • CUDA 12.9’s nvcc choked on GCC 15 (“unsupported GNU version”) and later on Clang 20.
    • Tried installing gcc-14 and symlinking it into PATH, exporting CC=/usr/bin/gcc-14 / CXX=/usr/bin/g++-14, and even passing -DCMAKE_CUDA_HOST_COMPILER, but CMake’s CUDA-ID test was still failing on the Fedora header mismatch.
    • Ultimately we switched to Clang 20 with --allow-unsupported-compiler, which let us get past the version “block.”
  5. Math header noexcept conflicts
    • CMake’s nvcc identification build then ran into four “expected a declaration” errors in CUDA’s math_functions.h, caused by mismatched noexcept(true) on sinpi/cospi vs system headers.
    • I patched those lines (removing or adding, I forget which, the trailing noexcept(true)) so cudafe++ could preprocess happily.
  6. Missing NVToolsExt library
    • After all that, CMake could find CUDA and compile, but the link step failed with: The link interface of target "torch::nvtoolsext" contains: CUDA::nvToolsExt, but the target was not found.
    • Looking under /usr/local/cuda-12.9, there was no libnvToolsExt.so* at all—only the NVTX-3 interop helper (libnvtx3interop.so*) lived under the extracted toolkit tree.
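
Pulling those fixes together, the invocation ended up roughly as follows. This is a sketch reconstructed from the steps above; paths are from my machine, and NVCC_PREPEND_FLAGS is one way of passing --allow-unsupported-compiler through to nvcc:

```shell
# Environment for the vllm CMake build on Fedora 42 + CUDA 12.9
export CUDA_HOME=/usr/local/cuda-12.9
export CC=/usr/bin/clang CXX=/usr/bin/clang++        # Clang 20 as host compiler
export NVCC_PREPEND_FLAGS="--allow-unsupported-compiler"

cmake . \
  -DPYTHON_EXECUTABLE=$(which python) \
  -DCUDA_TOOLKIT_ROOT_DIR="$CUDA_HOME" \
  -DCUDA_SDK_ROOT_DIR="$CUDA_HOME"
```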

Current hurdle
I still don’t have the core NVTX library (libnvToolsExt.so.*) in /usr/local/cuda-12.9/…/lib, so the CMake target CUDA::nvToolsExt remains unavailable. This library appears to be missing from both the Fedora cuda-nvtx package and the NVIDIA nvtx toolkit download/runfile. It seems to be a known issue with recent versions.

Work continues and a full process will be documented, once successful.

From Zero to OpenShift in 30 Minutes

Discover how to leverage the power of kcli and libvirt to rapidly deploy a full OpenShift cluster in under 30 minutes, cutting through the complexity often associated with OpenShift installations.  

Prerequisites

Server with 8+ cores, minimum of 64GB RAM (96GB+ for >1 worker node)
Fast IO:
– dedicated NVMe libvirt storage, or
– NVMe LVMCache fronting HDD (surprisingly effective!)
OS installed (tested with CentOS Stream 8)
Packages libvirt + git installed
Pull secret (stored in openshift_pull.json) obtained from https://cloud.redhat.com/openshift/install/pull-secret

Install kcli

[steve@shift ~]$ git clone https://github.com/karmab/kcli.git
[steve@shift ~]$ cd kcli; ./install.sh

Configure parameters.yml


(see https://kcli.readthedocs.io/en/latest/#deploying-kubernetes-openshift-clusters)

example:-

[steve@shift ~]$ cat parameters.yml
cluster: shift413
domain: shift.local
version: stable
tag: '4.13'
ctlplanes: 3
workers: 3
ctlplane_memory: 16384
worker_memory: 16384
ctlplane_numcpus: 8
worker_numcpus: 4

Note 1: To deploy Single Node OpenShift (SNO) set ctlplanes to 1 and workers to 0.

Note 2: Even a fast Xeon with NVMe storage may have difficulty deploying more than 3 workers before the installer times out.
An RFE exists to make the timeout configurable, see:

https://access.redhat.com/solutions/6379571
https://issues.redhat.com/browse/RFE-2512

Deploy cluster

[steve@shift ~]$ kcli create kube openshift --paramfile parameters.yml $cluster

Note: openshift_pull.json and parameters.yml should be in your current working directory, or adjust above as required

Monitor Progress

If you wish to monitor progress, find the IP of the bootstrap node:-

[steve@shift ~]$ virsh net-dhcp-leases default
 Expiry Time           MAC address         Protocol   IP address           Hostname            Client ID or DUID
---------------------------------------------------------------------------------------------------------------------
 2023-07-19 15:48:02   52:54:00:08:41:71   ipv4       192.168.122.103/24   ocp413-ctlplane-0   01:52:54:00:08:41:71
 2023-07-19 15:48:02   52:54:00:10:2a:9d   ipv4       192.168.122.100/24   ocp413-ctlplane-1   01:52:54:00:10:2a:9d
 2023-07-19 15:46:30   52:54:00:2b:98:2a   ipv4       192.168.122.211/24   ocp413-bootstrap    01:52:54:00:2b:98:2a
 2023-07-19 15:48:03   52:54:00:aa:d7:02   ipv4       192.168.122.48/24    ocp413-ctlplane-2   01:52:54:00:aa:d7:02

then ssh to bootstrap node as core user and follow instructions:-

[steve@shift ~]$ ssh core@192.168.122.211
[core@ocp413-bootstrap ~]$ journalctl -b -f -u release-image.service -u bootkube.service

Once the cluster is deployed you'll receive the following message:-

INFO Waiting up to 40m0s (until 3:42PM) for the cluster at https://api.ocp413.lab.local:6443 to initialize...
INFO Checking to see if there is a route at openshift-console/console...
INFO Install complete!                            
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/root/.kcli/clusters/ocp413/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocp413.lab.local
INFO Login to the console with user: "kubeadmin", and password: "qTT5W-F5Cjz-BIPx2-KWXQx"
INFO Time elapsed: 16m18s                        
Deleting ocp413-bootstrap

Note: Whilst the above credentials can be found later, it’s worthwhile making a note of them now. I save them to a text file on the host.

Confirm Status

[root@shift ~]# export KUBECONFIG=/root/.kcli/clusters/ocp413/auth/kubeconfig
[root@shift ~]# oc status
In project default on server https://api.ocp413.lab.local:6443
svc/openshift - kubernetes.default.svc.cluster.local
svc/kubernetes - 172.30.0.1:443 -> 6443
View details with 'oc describe <resource>/<name>' or list resources with 'oc get all'.
[root@shift ~]# oc get nodes
NAME                          STATUS   ROLES                  AGE   VERSION
ocp413-ctlplane-0.lab.local   Ready    control-plane,master   68m   v1.26.5+7d22122
ocp413-ctlplane-1.lab.local   Ready    control-plane,master   68m   v1.26.5+7d22122
ocp413-ctlplane-2.lab.local   Ready    control-plane,master   68m   v1.26.5+7d22122
ocp413-worker-0.lab.local     Ready    worker                 51m   v1.26.5+7d22122
ocp413-worker-1.lab.local     Ready    worker                 51m   v1.26.5+7d22122
ocp413-worker-2.lab.local     Ready    worker                 52m   v1.26.5+7d22122
[root@shift ~]# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.4    True        False         42m     Cluster version is 4.13.4

Then log in via https://console-openshift-console.apps.ocp413.lab.local/

Note: If the cluster is not installed on your workstation, it may be easier to install a browser on the server and forward X connections, rather than maintaining a local hosts file or modifying local DNS to resolve cluster queries:

ssh -X user@server 
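
If you would rather browse from your workstation, the hosts-file route looks like this. Hosts files cannot wildcard, so each .apps route you use needs its own entry; the IP address is illustrative:

```shell
# /etc/hosts on the workstation (address illustrative - check your cluster)
192.168.122.253  api.ocp413.lab.local
192.168.122.253  console-openshift-console.apps.ocp413.lab.local
192.168.122.253  oauth-openshift.apps.ocp413.lab.local
```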

Success \o/

For detailed kcli documentation see: https://kcli.readthedocs.io/en/latest/

OpenShift: How to determine the latest version in an update channel.

Latest OpenShift Releases using Red Hat OpenShift Console
  1. Visit https://console.redhat.com/openshift/releases, or
  2. Visit the Red Hat OpenShift Container Update Graph at https://access.redhat.com/labs/ocpupgradegraph/update_channel, or
  3. Use the CLI (curl & jq):-

curl -s "https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.11" | jq -r '.nodes[].version' | sort -V | tail -n1

Also, to check available upgrade edges:-

curl -s -XGET "https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.11" --header 'Accept:application/json' |jq '. as $graph | $graph.nodes | map(.version == "4.10.36") | index(true) as $orig | $graph.edges | map(select(.[0] == $orig)[1]) | map($graph.nodes[.]) | .[].version'
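
The same graph walk can be scripted without jq. A sketch in Python, run here against a hand-made miniature of the graph (the live API returns the same nodes/edges shape, but with far more entries):

```python
def latest_version(graph: dict) -> str:
    """Return the highest version in the channel graph (numeric sort)."""
    key = lambda v: tuple(int(x) for x in v.split("."))
    return max((n["version"] for n in graph["nodes"]), key=key)

def upgrade_edges(graph: dict, version: str) -> list[str]:
    """List versions reachable in one hop from the given version."""
    idx = next(i for i, n in enumerate(graph["nodes"]) if n["version"] == version)
    return [graph["nodes"][dst]["version"] for src, dst in graph["edges"] if src == idx]

# Miniature stand-in for the upgrades_info graph JSON
graph = {
    "nodes": [{"version": "4.10.36"}, {"version": "4.11.9"}, {"version": "4.11.10"}],
    "edges": [[0, 1], [0, 2], [1, 2]],
}
print(latest_version(graph))            # 4.11.10
print(upgrade_edges(graph, "4.10.36"))  # ['4.11.9', '4.11.10']
```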

Further examples can be found at https://access.redhat.com/solutions/4583231

Stable Diffusion on CPU

CPU Rendered Cat 512×512

AKA “Stable Diffusion without a GPU” 🙂

Currently, the ‘Use CPU if no CUDA device detected’ [1] pull request has not been merged. Following the instructions at [2] and jumping down the dependency rabbit hole, I finally have Stable Diffusion running on an old dual Xeon server.

[1] https://github.com/CompVis/stable-diffusion/pull/132
[2] https://github.com/CompVis/stable-diffusion/issues/219

Server specs:-
Fujitsu R290 Xeon Workstation
Dual Intel(R) Xeon(R) CPU E5-2670 @ 2.60GHz
96 GB RAM
SSD storage

Sample command line:-

time python3.8 scripts/txt2img.py --prompt "AI image creation using CPU" --plms --ckpt sd-v1-4.ckpt --H 768 --W 512 --skip_grid --n_samples 1 --n_rows 1 --ddim_steps 50

Initial tests show the following:-

Resolution     Steps   RAM (GB)   Time (minutes)
768 x 512      50      ~10        15
1024 x 768     50      ~30        24
1280 x 1024    50      ~65        66
1536 x 1280    50      OOM        N/A
Resolution, Peak RAM and Time to Render

Notes:
1) Typically only 18 (out of 32) cores active regardless of render size.
2) As expected, the calculation is entirely CPU bound.
3) For an unknown reason, even with --n_samples and --n_rows of 1, two images were still created (time halved for a single image in the above table).

Another CPU Rendered Cat 512×512

Conclusion:

It works. We gain resolution at the huge expense of memory and time.

AmigaOS4.1 (PPC) under FS-UAE and QEMU

I recently purchased AmigaOS 4.1 with a plan to familiarise myself with the OS via emulation before purchasing the Freescale QorIQ P1022 e500v2 ‘Tabor’ motherboard. In particular, I wanted to investigate the ssh and X display options, including AmiCygnix.

OS4.1 running under FS-UAE & QEMU, showing config and network status

However, despite being familiar with OS3.1 and FS-UAE I still managed to hit a few gotchas with the OS4 install and configuration.

Installation of the QEMU module was straightforward using the download and instructions from: https://fs-uae.net/download#plugins. In my case this was version 3.8.2qemu2.2.0, installed in ~/Documents/FS-UAE/Plugins/QEMU-UAE/Linux/x86-64/ (your path may vary).

I then tried multiple FS-UAE configurations in order to get the emulated machine to boot with PPC, RTG and network support. A few option combinations clash, resulting in a purple screen on boot. Rather than work through the process from scratch, it’s easier to simply list my config here:-

[fs-uae]
accelerator = cyberstorm-ppc
amiga_model = A4000/OS4
gfx_card_flash_file = /home/snetting/Documents/FS-UAE/Kickstarts/picasso_iv_flash.rom
graphics_card = picasso-iv
graphics_memory = 131072
hard_drive_0 = /home/snetting/Amiga/SteveOS41.hdf
kickstart_file = Kickstart v3.1 rev 40.70 (1993)(Commodore)(A4000).rom
network_card = a2065
zorro_iii_memory = 524288

I used FS-UAE (and FS-UAE-Launcher) version 2.8.3.

Things to note:

  1. See http://eab.abime.net/showthread.php?t=75195 for install advice regarding disk partitioning and FS type. This is important!
  2. Shared folders (between host OS and Emulation) are *not* currently supported when using PPC under FS-UAE. Post install, many additional packages were required, including network drivers which resulted in a catch-22 situation. I worked around this by installing a 3.1.4 instance and mounting both the OS4 and ‘shared’ drives here, copying the required files over then booting back into the OS4 PPC environment.
  3. For networking, UAE.bsdsocket.library in UAE should be disabled but the A2065 network card enabled. The correct driver from aminet is: http://aminet.net/package/driver/net/Ethernet
  4. The latest updates to OS4.1 (final) enable Zorro III RAM to be used in addition to accelerator RAM; essential for AmiCygnix. Once OS4.1 is installed and network configured, use the included update tool to pull OS4.1 FE updates.

The documentation at http://eab.abime.net/showthread.php?t=75195 is definitely useful as a reference but don’t rely on it; it’s dated (2014) and not necessarily accurate.

Whilst I’ve written this from memory, I’ll happily recreate my install from scratch if anyone has any specific questions or issues.

Good luck!

ROMs are available from Cloanto: https://www.amigaforever.com/
OS4.1 and updates from Hyperion: https://www.amigaos.net/

Atari ST and Amiga Desktop Wallpapers

I couldn’t find any good quality 1920×1080 (so-called ‘full HD’) desktop wallpapers featuring either Atari ST GEM or Commodore Amiga Workbench 1.3. So, assembled from parts taken from various images on Google, scaled with the correct aspect ratio maintained, tidied to fill the full resolution, and free of JPEG compression artifacts – here we are:-

Atari GEM Desktop, 1920×1080 PNG
Commodore Amiga Workbench 1.3 + Boing Ball, 1920×1080 PNG

You’re welcome 🙂

Building qtel (Echolink client) under Fedora Linux

Given both my previous bad experience building qtel (the Linux EchoLink client) and recent forum discussions around similar difficulties, I thought I’d identify, resolve and document the issues.

I’m not sure what’s changed but the process is now very simple (Fedora 28):-

git clone https://github.com/sm0svx/svxlink.git
cd svxlink/
cd src
sudo dnf install cmake libsigc++20-devel qt-devel popt-devel libgcrypt-devel gsm-devel tcl-devel
cmake .
make
cp bin/qtel DESTINATION_PATH_OF_CHOICE

Depending on libs already installed, additional packages may be required – as indicated by failures during the ‘cmake’ stage.