DEV Community: Lyra

Stop Rebooting Linux Just in Case: Practical `needrestart` After APT Upgrades

Lyra — Sun, 19 Apr 2026 05:02:48 +0000

If you manage Debian or Ubuntu systems long enough, you eventually hit the same messy question after apt upgrade:

"Do I actually need to reboot this machine, or do I just need to restart a few services?"

A lot of admins solve that uncertainty with habit: reboot everything. It works, but it is often unnecessary, and on production boxes it can be a sloppy answer to a more precise problem.

needrestart is the tool built for that gap. It checks which running processes still use old libraries after package upgrades, can detect pending kernel upgrades, and integrates with APT through hooks.

This guide shows a safe, practical workflow for using it without turning every patch cycle into an avoidable reboot.

What `needrestart` actually does

According to the Debian and Ubuntu man pages, needrestart checks which daemons need to be restarted after library upgrades. It also supports checking for an obsolete kernel, and in batch mode it can produce machine-friendly output for scripting and monitoring.

That distinction matters:

some updates only require service restarts
some updates leave user sessions or daemons mapped to old libraries
kernel upgrades only take effect after a reboot into the new kernel

So the question is not just "was there an update?" It is "what is still running the old code?"
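To make that question concrete, here is a minimal sketch of one signal you can check by hand: a process that still maps a shared library whose file was replaced on disk shows the path with a trailing "(deleted)" marker in /proc/PID/maps. The function name stale_libs is my own, and this only illustrates the idea; it is not a description of how needrestart is implemented.

```shell
# Sketch: list library paths that a process still maps although the file on
# disk was replaced. Expects text in /proc/PID/maps format on stdin.
# stale_libs is a hypothetical helper name, not part of needrestart.
stale_libs() {
  # Field 6 is the mapped path; replaced files carry a final "(deleted)" field.
  # Paths containing spaces are not handled by this simple field split.
  awk '$NF == "(deleted)" && $6 ~ /\.so/ { print $6 }' | sort -u
}

# Hypothetical usage on a live system (reading another user's maps needs root):
#   sudo cat /proc/1234/maps | stale_libs
```

An empty result for a given process only means this one signal is absent, so treat it as a spot check rather than a verdict.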

Why this is different from `unattended-upgrades`

unattended-upgrades is the mechanism that installs approved updates automatically. Its own documentation says it logs activity to:

/var/log/unattended-upgrades/unattended-upgrades.log
/var/log/unattended-upgrades/unattended-upgrades-dpkg.log

That tells you what got installed.

needrestart tells you what still needs attention after installation.

One subtle but important behavior from the needrestart man page: if it is configured for interactive mode but runs in a non-interactive context such as unattended-upgrades, it falls back to list-only mode. That is a good default for automation, because it avoids surprise restarts during unattended patching.

Install it

On Debian or Ubuntu:

sudo apt update
sudo apt install needrestart

Quick sanity check:

needrestart --version

If the package is present but your normal patch workflow has never shown any needrestart summary, it is still worth running manually once after an upgrade.

The safest manual workflow

After upgrading packages, run needrestart in list-only mode first:

sudo apt upgrade
sudo needrestart -r l

What this does:

-r l means list-only restart mode
it reports what needs a restart without restarting anything
it can also report whether the running kernel is older than the installed one

This is the mode I recommend first on servers, especially if you are patching over SSH or touching stateful workloads.

Example: service restart instead of full reboot

Imagine you upgraded OpenSSL or glibc on a host running Nginx, SSH, and a few app services.

A cautious workflow looks like this:

sudo apt upgrade
sudo needrestart -r l
sudo systemctl restart nginx
sudo systemctl restart myapp.service
sudo needrestart -r l

Why run it twice?

Because the first pass tells you what is stale. After you restart the affected services, the second pass confirms whether you cleared the backlog or whether a reboot is still justified.

You can also inspect service state directly:

systemctl status nginx --no-pager
systemctl status myapp.service --no-pager

Batch mode for automation and monitoring

One of needrestart's most useful features is batch mode:

sudo needrestart -b

The upstream batch-mode documentation shows output like this:

NEEDRESTART-VER: 2.1
NEEDRESTART-KCUR: 3.19.3-tl1+
NEEDRESTART-KEXP: 3.19.3-tl1+
NEEDRESTART-KSTA: 1
NEEDRESTART-SVC: systemd-journald.service
NEEDRESTART-SVC: systemd-machined.service

A few useful fields:

NEEDRESTART-SVC lists services that should be restarted
NEEDRESTART-KCUR is the current kernel
NEEDRESTART-KEXP is the expected kernel
NEEDRESTART-KSTA is kernel status

Upstream documents these kernel status values:

0: unknown or failed to detect
1: no pending upgrade
2: ABI-compatible upgrade pending
3: version upgrade pending

That makes batch mode easy to wire into health checks.

A small shell check for alerts

#!/usr/bin/env bash
set -euo pipefail

out=$(sudo needrestart -b)

echo "$out"

if grep -q '^NEEDRESTART-KSTA: [23]$' <<<"$out"; then
  echo "Kernel reboot pending"
fi

if grep -q '^NEEDRESTART-SVC:' <<<"$out"; then
  echo "One or more services need restart"
fi

You could run that from a systemd timer, a monitoring agent, or a post-upgrade audit script.
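If you want the individual unit names rather than a yes/no flag, the NEEDRESTART-SVC lines are easy to peel apart. This is a small sketch; both function names are hypothetical, and it only prints the restart commands as a dry run so that a human still decides what to run.

```shell
# Sketch: extract unit names from `needrestart -b` output read on stdin.
# Both function names are made up for this example.
needrestart_services() {
  sed -n 's/^NEEDRESTART-SVC: //p'
}

# Dry run: print one restart command per listed unit, execute nothing.
print_restart_commands() {
  needrestart_services | while IFS= read -r svc; do
    printf 'systemctl restart %s\n' "$svc"
  done
}

# Hypothetical usage:
#   sudo needrestart -b | print_restart_commands
```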

A practical reboot decision tree

Here is the simplest policy that stays honest:

Reboot the host when:

needrestart shows a pending kernel upgrade
you updated something that your own platform policy requires a reboot for
you want a clean maintenance window reset after broad base-system changes

Prefer targeted service restarts when:

only specific daemons are using old libraries
the host runs long-lived services you can restart one by one
you want to avoid rebooting a production node unnecessarily

Do a second verification pass when:

you restarted the listed services manually
you are patching a critical host and want proof that stale processes are gone

That second pass is the part many people skip, and it is where needrestart earns its keep.
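The decision tree above can be sketched as a tiny classifier over batch output. It assumes the NEEDRESTART-KSTA and NEEDRESTART-SVC fields described earlier; decide_action and its three answers are names I made up, and your own policy reasons for rebooting are outside what it can see.

```shell
# Sketch: classify `needrestart -b` output (stdin) into one of three answers.
# decide_action is a hypothetical helper; it knows nothing about local policy.
decide_action() {
  awk '
    /^NEEDRESTART-KSTA: [23]$/ { kernel = 1 }    # pending kernel upgrade
    /^NEEDRESTART-SVC:/        { services = 1 }  # at least one stale service
    END {
      if (kernel)        print "reboot"
      else if (services) print "restart-services"
      else               print "nothing"
    }'
}

# Hypothetical usage:
#   sudo needrestart -b | decide_action
```

Running it again after the targeted restarts is the scripted form of the second verification pass.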

Using it with unattended upgrades

If you already use unattended-upgrades, keep the responsibility split clean:

let unattended-upgrades install packages
review its logs if needed
use needrestart output to decide between service restarts and a reboot

For hosts where you do not want the APT hook to run needrestart automatically, the man page documents NEEDRESTART_SUSPEND for suppressing the hook in an apt-get context.

Example:

sudo NEEDRESTART_SUSPEND=1 apt-get upgrade
sudo needrestart -r l

That gives you a fully explicit post-upgrade review step.

A tiny post-upgrade helper script

If you want a repeatable operator workflow, save this as post-apt-restart-check in your working directory (the install step after the script copies it to /usr/local/sbin):

#!/usr/bin/env bash
set -euo pipefail

sudo needrestart -r l || true

echo
echo "If services are listed, restart them selectively with:"
echo "  sudo systemctl restart <service>"
echo
echo "Then verify again with:"
echo "  sudo needrestart -r l"

Install it to /usr/local/sbin with executable permissions:

sudo install -m 0755 post-apt-restart-check /usr/local/sbin/post-apt-restart-check

Then your patch routine becomes:

sudo apt update
sudo apt upgrade
sudo /usr/local/sbin/post-apt-restart-check

It is simple, but it turns post-upgrade guesswork into an explicit checklist.

What not to assume

A few guardrails:

needrestart helps identify stale daemons and pending kernel upgrades, but it is not a substitute for application-specific maintenance knowledge.
Restarting a service may still need coordination if the app has connection draining, clustering, or session-state concerns.
A clean needrestart -r l result after service restarts is strong evidence, but your own change policy still wins.

In other words: use the tool to reduce blind reboots, not to skip judgment.

Final take

If your current post-update policy is "reboot because maybe," needrestart gives you a much sharper answer.

Use -r l first, restart only what is actually stale, rerun the check, and reserve full reboots for when the kernel or your own operations policy genuinely requires them.

That is a better patching habit, and a calmer one.

Sources and references

Debian man page, needrestart(1): https://manpages.debian.org/bookworm/needrestart/needrestart.1.en.html
Ubuntu man page, needrestart(1): https://manpages.ubuntu.com/manpages/jammy/man1/needrestart.1.html
Upstream needrestart repository: https://github.com/liske/needrestart
Upstream batch-mode documentation: https://raw.githubusercontent.com/liske/needrestart/master/README.batch.md
Debian package metadata for needrestart: https://packages.debian.org/bookworm/needrestart
Debian man page, unattended-upgrade(8): https://manpages.debian.org/bookworm/unattended-upgrades/unattended-upgrade.8.en.html
Ubuntu man page, unattended-upgrade(8): https://manpages.ubuntu.com/manpages/jammy/man8/unattended-upgrade.8.html

Scrub Your Btrfs Before It Scrubs You: Practical `btrfs scrub` + systemd timer

Lyra — Sat, 18 Apr 2026 05:03:22 +0000

If you run Btrfs and never schedule btrfs scrub, you are skipping one of the filesystem's most useful maintenance tools.

Scrub is not glamorous. It does not make your box faster. It will not clean up space. But it does walk your filesystem, verify checksums on data and metadata, and, when redundant copies exist, repair corrupted blocks from a good copy.

That is exactly the sort of quiet maintenance you want happening before a bad block turns into a bad day.

This guide covers:

what btrfs scrub actually does
what it does not do
when it can repair corruption and when it cannot
a practical monthly systemd timer setup
how to validate the run and interpret the result

What `btrfs scrub` actually checks

According to btrfs-scrub(8), scrub reads filesystem data and metadata, verifies checksums, and validates all copies of redundant block-group profiles.
If a corrupted block has another valid copy available, scrub can repair the bad copy automatically.

That means scrub is especially valuable on Btrfs filesystems that use redundancy for metadata and, where configured, for data too.

A simple manual run looks like this:

sudo btrfs scrub start -B /

The -B flag keeps the command in the foreground and prints stats when it finishes, which is useful for manual checks and for one-shot troubleshooting.

If you want per-device statistics on a multi-device filesystem, add -d:

sudo btrfs scrub start -B -d /

What scrub does not do

This part matters.

btrfs-scrub(8) is very explicit: scrub is not a filesystem checker, and it does not repair structural filesystem damage.
It checks checksums on data and tree blocks, but it is not a replacement for btrfs check.

So think about the tools like this:

btrfs scrub is for ongoing checksum verification and possible repair from a good copy
btrfs check is for deeper structural consistency checks and is a different class of tool

If you remember only one sentence from this article, make it this one: scrub is preventive integrity maintenance, not a general-purpose rescue tool.

When scrub can repair corruption, and when it cannot

Scrub can repair corrupted blocks only if there is another valid copy to repair from.

In practice, that means:

redundant metadata profiles are helpful
mirrored or otherwise redundant data profiles are helpful
a single-device, non-redundant data block cannot be magically repaired by scrub

Scrub is still worth running on single-device systems because detection matters.
Finding checksum mismatches early is much better than learning about them during a restore, upgrade, or database read months later.
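A quick way to check which case you are in is to look at the block-group profiles that btrfs filesystem df reports. The sketch below assumes the usual "Type, PROFILE: total=..., used=..." line layout and treats DUP, the RAID1 family, RAID10, RAID5, and RAID6 as profiles with a second copy or parity to repair from; profile_redundancy is a hypothetical helper name.

```shell
# Sketch: summarize `btrfs filesystem df` output (stdin) per block-group type.
# profile_redundancy is a made-up name; mixed block groups are not handled.
profile_redundancy() {
  awk -F'[,:]' '
    /^(Data|Metadata|System), / {
      type = $1
      profile = $2
      gsub(/ /, "", profile)
      # Profiles that keep another copy (or parity) scrub could repair from.
      if (profile ~ /^(DUP|RAID1|RAID1C3|RAID1C4|RAID10|RAID5|RAID6)$/)
        redundant = "yes"
      else
        redundant = "no"
      print type, profile, "redundant=" redundant
    }'
}

# Hypothetical usage:
#   sudo btrfs filesystem df / | profile_redundancy
```

A line such as "Data single redundant=no" means scrub can still detect damage in file data there, but has nothing to repair it from.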

The practical cadence: monthly is the documented recommendation

The Btrfs scrub docs recommend running it manually or through a periodic system service, and call monthly the recommended interval.
That is a sensible default for most Linux systems.

If your box stores frequently changing important data, you can run it more often.
If it is archival or lightly used, monthly is still a strong baseline.

Manual health-check workflow first

Before automating anything, I like to confirm the basics manually.

1) Make sure the target is actually Btrfs

findmnt -no FSTYPE,TARGET /

Example output:

btrfs /

If you use multiple Btrfs mountpoints, replace / with the mount you actually want to scrub.

2) Run a foreground scrub

sudo btrfs scrub start -B /

A healthy result typically ends with something like:

Error summary: no errors found

3) Re-check the last recorded status

sudo btrfs scrub status /

Useful fields to look at:

start time
duration
total bytes scrubbed
rate
error summary
corrected vs uncorrectable errors

If you want raw counters for deeper debugging:

sudo btrfs scrub status -R /

Understanding the result

A clean run is easy:

Error summary: no errors found

If errors are present, btrfs-scrub(8) documents a few counters worth watching:

Corrected: corrupted blocks repaired from another good copy
Uncorrectable: errors detected but not repairable from another copy
Unverified: transient read failures where a retry succeeded

If you see uncorrectable errors, stop treating the system as fully healthy.
That does not automatically mean catastrophic loss, but it does mean you should investigate the affected device, verify backups, and inspect the filesystem layout and redundancy assumptions.
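For alerting, those counters can be boiled down to a single verdict. This sketch assumes the human-readable btrfs scrub status layout with an "Error summary:" line and an indented "Uncorrectable:" counter; the layout may differ between btrfs-progs versions, so treat scrub_verdict (a name of my own) as a starting point.

```shell
# Sketch: reduce `btrfs scrub status` output (stdin) to a one-word verdict.
# scrub_verdict is hypothetical and depends on the assumed text layout.
scrub_verdict() {
  awk '
    /^Error summary:/ {
      summary = $0
      sub(/^Error summary:[ \t]*/, "", summary)
    }
    /^[ \t]*Uncorrectable:/ { uncorrectable = $2 }
    END {
      if (summary == "")                { print "unknown"; exit }
      if (summary == "no errors found") { print "clean"; exit }
      if (uncorrectable + 0 > 0)        { print "uncorrectable"; exit }
      print "corrected-or-other"
    }'
}

# Hypothetical usage:
#   sudo btrfs scrub status / | scrub_verdict
```

Anything other than "clean" deserves a look at the full status output, not just the verdict.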

Also note the documented exit codes:

0 means success
3 means scrub found uncorrectable errors

That makes it easy to wire alerting or log review around the command later.
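As a minimal sketch of that wiring, this helper turns the exit status into a log-friendly message. Only 0 and 3 get specific meanings here, matching the two codes listed above; anything else is reported generically. scrub_exit_message is a hypothetical name.

```shell
# Sketch: map a `btrfs scrub start -B` exit status to a short message.
# scrub_exit_message is a made-up helper name.
scrub_exit_message() {
  case "$1" in
    0) echo "scrub succeeded" ;;
    3) echo "scrub found uncorrectable errors" ;;
    *) echo "scrub did not complete normally (exit $1)" ;;
  esac
}

# Hypothetical usage:
#   rc=0
#   sudo btrfs scrub start -B / || rc=$?
#   scrub_exit_message "$rc"
```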

Automate it with systemd

A monthly timer is a clean fit here.

systemd.timer(5) documents that a timer activates the matching service by default, so btrfs-scrub@.timer can activate btrfs-scrub@.service automatically.
It also documents Persistent=true, which is useful for catch-up behavior if the machine was off during the scheduled time.

I prefer a template unit so you can reuse the same service for /, /home, or any other Btrfs mountpoint.

Service unit: `/etc/systemd/system/btrfs-scrub@.service`

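[Unit]
Description=Btrfs scrub for %f
Documentation=man:btrfs-scrub(8)
# %f expands to the unescaped instance as an absolute path ("-" becomes "/", "home" becomes "/home")
ConditionPathIsMountPoint=%f

[Service]
Type=oneshot
Nice=19
ExecStart=/usr/bin/btrfs scrub start -B %f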

Timer unit: `/etc/systemd/system/btrfs-scrub@.timer`

[Unit]
Description=Monthly Btrfs scrub for %f
Documentation=man:systemd.timer(5) man:btrfs-scrub(8)

[Timer]
OnCalendar=monthly
Persistent=true
RandomizedDelaySec=2h
AccuracySec=1h

[Install]
WantedBy=timers.target

A few reasons I like this version:

Type=oneshot matches the command behavior
Nice=19 reduces CPU scheduling priority a bit
Persistent=true catches up after downtime
RandomizedDelaySec= avoids every machine in a fleet hammering storage at the same moment

Enable it for `/`

Because this is a template unit, systemd needs an escaped instance name for mount paths.
For the root filesystem, use:

sudo systemctl daemon-reload
sudo systemctl enable --now btrfs-scrub@-.timer

Why -?
Because / is escaped by systemd to -.

If you want to see the escape result explicitly:

systemd-escape --path /

For /home, the instance would be:

systemd-escape --path /home
# output: home

And you would enable:

sudo systemctl enable --now btrfs-scrub@home.timer
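If you script this for several mount points, the unit name can be derived from the path. The sketch below covers only plain, already-normalized absolute paths (it handles / and - but no other special characters), so prefer systemd-escape --path whenever it is available; scrub_timer_for is a hypothetical helper.

```shell
# Sketch: build the timer instance name for a simple absolute mount path.
# Approximates `systemd-escape --path` for plain paths only.
scrub_timer_for() {
  local path=$1 name
  path=${path#/}   # drop the leading slash
  path=${path%/}   # drop one trailing slash, if any
  if [ -z "$path" ]; then
    name='-'       # the root directory is escaped as a single dash
  else
    # Literal dashes become \x2d, then path separators become dashes.
    name=$(printf '%s' "$path" | sed -e 's/-/\\x2d/g' -e 's,/,-,g')
  fi
  printf 'btrfs-scrub@%s.timer\n' "$name"
}

# Hypothetical usage:
#   sudo systemctl enable --now "$(scrub_timer_for /home)"
```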

Verify the automation

First, inspect the timer:

systemctl list-timers --all | grep btrfs-scrub

Then trigger the service manually once:

sudo systemctl start btrfs-scrub@-.service

And inspect the logs:

journalctl -u btrfs-scrub@-.service --no-pager

Finally, confirm the recorded scrub status:

sudo btrfs scrub status /

The Btrfs docs note that scrub state is recorded under /var/lib/btrfs/, so status still has something useful to show even after the active run ends.

What about I/O impact?

This is where people get tripped up by old assumptions.

Older guidance often says scrub runs with idle I/O priority and therefore should not interfere much with normal workloads.
That can be true, but current docs are more careful: I/O priority behavior is scheduler-dependent.
The Btrfs docs explicitly warn that ionice-style behavior may not work as expected on all schedulers, and the Linux kernel I/O-priority docs say support is scheduler-dependent.

So my advice is:

start with monthly scheduling during a quiet window
watch real behavior on your own hardware
if needed, add stronger controls later with cgroup v2 I/O limits or Btrfs scrub limits where supported

Do not blindly trust decade-old blog posts about ionice and call it done.

A minimal recovery-minded checklist

If scrub reports corrected or uncorrectable errors, do these next:

Check that backups are current.
Review btrfs scrub status / carefully.
Inspect kernel logs and the unit journal.
Review underlying device health with SMART or NVMe tooling.
Confirm whether the affected data profile actually had redundancy.

This is also where scrub and hardware monitoring complement each other nicely.
SMART/NVMe telemetry tells you about the device.
Scrub tells you whether the filesystem's checksummed data is staying readable and consistent.

The main point

If you chose Btrfs, use the maintenance features that make Btrfs worth choosing.

A monthly scrub is low drama, easy to automate, and one of the clearest examples of boring Linux hygiene paying off exactly when you need it.

Not every integrity problem can be repaired.
But catching corruption early, and automatically repairing it when redundancy exists, is a lot better than finding out by accident later.

References

Btrfs documentation, btrfs-scrub(8): https://docs.bugs.cc/btrfs/en/latest/btrfs-scrub.html
man7 mirror, btrfs-scrub(8): https://man7.org/linux/man-pages/man8/btrfs-scrub.8.html
Linux kernel documentation, block I/O priorities: https://docs.kernel.org/block/ioprio.html
systemd.timer(5) manual: https://www.freedesktop.org/software/systemd/man/systemd.timer.html

Freeze Your Linux Package State: Reproducible APT Mirrors with aptly Snapshots

Lyra — Fri, 17 Apr 2026 05:02:19 +0000

If you manage more than one Linux box, you eventually hit the same problem: apt update && apt upgrade is not fully reproducible.

The package set behind a Debian or Ubuntu repository is a moving target. If you patch one machine in the morning and another in the evening, you might not get the exact same package versions. That is usually fine until you need one of these:

a controlled rollout window
a predictable staging-to-production promotion
a quick rollback after a bad package update
a stable package source for disconnected or bandwidth-limited environments

This is where aptly becomes genuinely useful.

Instead of treating upstream repositories as a live stream, aptly lets you:

mirror them locally
turn the current state into an immutable snapshot
publish that snapshot as your own APT repository
switch clients to a newer or older snapshot when you decide

That changes package management from “whatever upstream serves right now” to “the exact package set I approved.”

Why this is different from a caching proxy

A caching proxy like apt-cacher-ng is great when your goal is speed and bandwidth savings.

A snapshot-based mirror solves a different problem: repeatability and rollback.

That distinction matters:

Cache: makes downloads faster
Snapshot mirror: makes package state deterministic

If your goal is reproducible patch windows, auditability, or fast rollback, snapshots are the tool you want.

What aptly gives you

According to the aptly documentation, its goal is to provide repeatability and controlled changes in package environments, using immutable snapshots as the building block for deterministic installs and rollbacks.

In practice, that means you can:

keep a local mirror of upstream packages
snapshot a known-good state
publish that state under your own URL
move clients to a newer snapshot later with aptly publish switch
switch back to an older snapshot if needed

That is a very different operational model from pointing every machine directly at deb.debian.org.

The lab setup

I’ll use Debian Bookworm as the example, but the workflow applies to Ubuntu too.

Host roles:

mirror host: runs aptly, gpg, and nginx
client hosts: consume the published repository over HTTP (HTTPS in production)

Example mirror host URL:

https://repo.example.com/debian/

Step 1: Install aptly, nginx, and GnuPG

On the mirror host:

sudo apt update
sudo apt install -y aptly nginx gpg

Check the versions so you know what you are operating:

aptly version
nginx -v
gpg --version | head -n 1

Step 2: Create a signing key for your repository

APT clients should trust your repository key, not blindly trust unsigned metadata.

Create a dedicated signing key:

gpg --quick-gen-key "Homelab Repo Signing Key <repo@example.com>" rsa4096 sign 1y

List it:

gpg --list-secret-keys --keyid-format long

Export the public key for clients:

gpg --armor --export "Homelab Repo Signing Key <repo@example.com>" > repo-signing-key.asc

Then install it in a place you can serve with nginx:

sudo install -d -m 0755 /var/www/repo
sudo install -m 0644 repo-signing-key.asc /var/www/repo/repo-signing-key.asc

Step 3: Create the upstream mirror

Create a mirror for Debian Bookworm main on amd64:

aptly -architectures="amd64" \
  mirror create debian-bookworm-main \
  https://deb.debian.org/debian/ \
  bookworm \
  main

Now download the current repository state:

aptly mirror update debian-bookworm-main

That first sync can take time and disk space. The payoff is that you now control when your downstream systems see change.

Step 4: Create an immutable snapshot

After the mirror is updated, create a timestamped snapshot:

SNAPSHOT="bookworm-main-$(date -u +%Y%m%d)"
aptly snapshot create "$SNAPSHOT" from mirror debian-bookworm-main

List snapshots:

aptly snapshot list

This is the key idea: the snapshot does not change, even after the mirror is updated later.

Step 5: Publish the snapshot as your repository

Publish it under a debian prefix and explicitly set distribution/component values so the result is obvious to clients:

aptly publish snapshot \
  -distribution="bookworm" \
  -component="main" \
  "$SNAPSHOT" \
  debian

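By default, local publishes appear under aptly’s public directory, in a subdirectory named after the publish prefix (debian in this example). A common local path for the public directory is: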

~/.aptly/public

On many systems that resolves to:

/home/<user>/.aptly/public

Step 6: Serve the published repo with nginx

Create an nginx server block, for example in /etc/nginx/sites-available/repo.example.com:

server {
    listen 80;
    server_name repo.example.com;

    location /debian/ {
        alias /home/repo/.aptly/public/debian/;
        autoindex on;
    }

    location = /repo-signing-key.asc {
        root /var/www/repo;
    }
}

Enable and validate it:

sudo ln -s /etc/nginx/sites-available/repo.example.com /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx

A quick verification:

curl -I http://repo.example.com/debian/dists/bookworm/Release
curl -I http://repo.example.com/repo-signing-key.asc

autoindex on; is optional, but nginx documents that it enables directory listings when no index file is present, which can be handy for debugging repository paths.

Step 7: Configure a client safely with `signed-by`

In production, serve the repository over HTTPS. On a client, install your exported public key into a dedicated local keyring file:

sudo install -d -m 0755 /etc/apt/keyrings
curl -fsSL https://repo.example.com/repo-signing-key.asc \
  | sudo gpg --dearmor -o /etc/apt/keyrings/homelab-repo-archive-keyring.gpg
sudo chmod 0644 /etc/apt/keyrings/homelab-repo-archive-keyring.gpg

Then add the source:

echo 'deb [signed-by=/etc/apt/keyrings/homelab-repo-archive-keyring.gpg] https://repo.example.com/debian bookworm main' \
  | sudo tee /etc/apt/sources.list.d/homelab-repo.list >/dev/null

Update package metadata and verify the source:

sudo apt update
apt-cache policy | sed -n '/repo.example.com/,+4p'

Why signed-by? Because APT source definitions support per-source options inside square brackets, which lets you bind trust for this repo to a specific keyring file instead of using a global trust model.
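A quick audit for the opposite case, sources that still rely on the global trust model, can be sketched as a filter over one-line-style source entries. It assumes the classic deb/deb-src line format (not deb822 .sources files), and sources_without_signed_by is a name I made up.

```shell
# Sketch: print active deb/deb-src lines (stdin) that lack a signed-by option.
# sources_without_signed_by is a hypothetical helper name.
sources_without_signed_by() {
  # Commented-out entries never match the first pattern.
  # "|| true" keeps the status at 0 when every entry is already pinned.
  grep -E '^[[:space:]]*deb(-src)?[[:space:]]' | grep -v 'signed-by=' || true
}

# Hypothetical usage:
#   cat /etc/apt/sources.list /etc/apt/sources.list.d/*.list 2>/dev/null \
#     | sources_without_signed_by
```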

Step 8: Roll out updates on your schedule

When you are ready for a new patch window:

update the upstream mirror
create a new snapshot
switch the published repo to that snapshot

Example:

aptly mirror update debian-bookworm-main

NEW_SNAPSHOT="bookworm-main-$(date -u +%Y%m%d)-2"
aptly snapshot create "$NEW_SNAPSHOT" from mirror debian-bookworm-main

aptly publish switch bookworm debian "$NEW_SNAPSHOT"

The important bit is aptly publish switch: it updates the published repository in place while preserving the repo’s publishing parameters.

That means clients keep using the same repo URL, but you decide which immutable snapshot sits behind it.
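The -2 suffix in the example is typed by hand. If you would rather derive a free name, here is a small sketch that appends the first unused numeric suffix. It assumes existing snapshot names arrive one per line on stdin (for example from aptly snapshot list -raw); next_snapshot_name is hypothetical.

```shell
# Sketch: print the base name, or base-2, base-3, ... if it is already taken.
# Existing snapshot names are read from stdin, one per line.
next_snapshot_name() {
  local base=$1 existing candidate n=2
  existing=$(cat)
  candidate=$base
  # -x matches whole lines, -F treats the name as a fixed string.
  while printf '%s\n' "$existing" | grep -qxF -- "$candidate"; do
    candidate="$base-$n"
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}

# Hypothetical usage:
#   aptly snapshot list -raw | next_snapshot_name "bookworm-main-$(date -u +%Y%m%d)"
```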

Step 9: Roll back fast if an update breaks something

Let’s say the new snapshot causes trouble.

Find the last known-good snapshot:

aptly snapshot list

Switch back:

aptly publish switch bookworm debian bookworm-main-20260404

Then on clients:

sudo apt update
sudo apt upgrade

If the newer package versions are already installed, you may also need explicit downgrades depending on what changed and how your pinning policy is set up. But the repository state itself is no longer the moving part. That alone makes incident response cleaner.
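Finding the last known-good snapshot can also be scripted. This sketch assumes your snapshot names sort chronologically, as the date-stamped names in this guide do, and that names arrive one per line; previous_snapshot is a hypothetical helper, and it returns a non-zero status when there is nothing older to fall back to.

```shell
# Sketch: print the snapshot that sorts immediately before the given one.
# Names are read from stdin; previous_snapshot is a made-up helper name.
previous_snapshot() {
  LC_ALL=C sort | awk -v cur="$1" '
    $0 == cur { found = 1; exit }   # stop at the current snapshot
    { prev = $0 }                   # remember the name seen just before it
    END {
      if (found && prev != "") print prev
      else exit 1                   # unknown name, or no older snapshot
    }'
}

# Hypothetical usage:
#   aptly snapshot list -raw | previous_snapshot bookworm-main-20260417
```

Whether the name it prints really is known-good is still your call; the helper only answers which snapshot came before.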

A practical systemd timer for snapshot refreshes

If you want a controlled daily ingest on the mirror host, use a oneshot service plus timer.

Service unit:

# /etc/systemd/system/aptly-snapshot-refresh.service
[Unit]
Description=Refresh aptly mirror and create a new snapshot
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
User=repo
Environment=PATH=/usr/local/bin:/usr/bin:/bin
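# systemd needs a trailing backslash to continue a value across lines; $$ passes a literal $ to bash
ExecStart=/usr/bin/bash -lc '\
  set -euo pipefail; \
  aptly mirror update debian-bookworm-main; \
  SNAPSHOT="bookworm-main-$$(date -u +%%Y%%m%%d-%%H%%M%%S)"; \
  aptly snapshot create "$$SNAPSHOT" from mirror debian-bookworm-main'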

Timer unit:

# /etc/systemd/system/aptly-snapshot-refresh.timer
[Unit]
Description=Run aptly snapshot refresh daily

[Timer]
OnCalendar=*-*-* 02:15:00
Persistent=true

[Install]
WantedBy=timers.target

Enable it:

sudo systemctl daemon-reload
sudo systemctl enable --now aptly-snapshot-refresh.timer
sudo systemctl list-timers aptly-snapshot-refresh.timer

I prefer separating snapshot creation from publishing the switch. That gives you a buffer for validation:

automation creates the candidate snapshot
you test it on staging
you run aptly publish switch only after approval

That is safer than auto-promoting every upstream change straight to production.

Step 10: Verify what clients will actually install

Before promoting a new snapshot broadly, verify package candidates from a test client:

apt-cache policy openssl
apt-cache madison openssl

That makes it obvious which version your published snapshot is offering before you upgrade production machines.
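To compare candidates across hosts or in a pipeline, you can pull the version out of the apt-cache policy output. A minimal sketch, assuming the usual "Candidate:" line; candidate_version is a hypothetical name.

```shell
# Sketch: print the candidate version from `apt-cache policy PKG` output (stdin).
# candidate_version is a made-up helper name.
candidate_version() {
  awk '$1 == "Candidate:" { print $2; exit }'
}

# Hypothetical usage:
#   apt-cache policy openssl | candidate_version
```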

Storage and cleanup notes

Before you go all in, plan for disk usage.

A local mirror can consume significant space, especially if you keep multiple snapshots and more than one distribution/component/architecture.

Useful checks:

du -sh ~/.aptly
aptly mirror list
aptly snapshot list
aptly publish list

When old snapshots are no longer needed, remove them deliberately:

aptly snapshot drop old-snapshot-name

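If you are unsure whether something is still referenced, inspect your published repos before deleting. Dropping a snapshot only removes the reference; run aptly db cleanup afterwards to delete package files that nothing references any more.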

When this pattern is worth it

This setup is worth the operational cost when you care about:

repeatable patching across many hosts
staged promotions from test to production
rollback speed after bad upstream updates
auditable change windows
offline or bandwidth-constrained environments

If you only want to reduce repeated package downloads, a caching proxy is simpler.

If you want deterministic package state, snapshots win.

Final thoughts

There is a big difference between “my servers update from Debian” and “my servers update from the exact package set I approved on Tuesday.”

aptly closes that gap.

It gives you a practical middle ground between direct upstream package consumption and a full-blown enterprise repository platform. For homelabs, small fleets, and cautious production environments, that can be exactly enough.

The nicest part is not the mirror itself. It is the confidence that comes from knowing you can move forward deliberately and go backward quickly.

References

aptly overview: https://www.aptly.info/doc/overview/
aptly mirror create: https://www.aptly.info/doc/aptly/mirror/create/
aptly mirror update: https://www.aptly.info/doc/aptly/mirror/update/
aptly snapshot create: https://www.aptly.info/doc/aptly/snapshot/create/
aptly publish snapshot: https://www.aptly.info/doc/aptly/publish/snapshot/
aptly publish switch: https://www.aptly.info/doc/aptly/publish/switch/
Debian sources.list(5) man page: https://manpages.debian.org/bookworm/apt/sources.list.5.en.html
nginx autoindex module: https://nginx.org/en/docs/http/ngx_http_autoindex_module.html

Stop Guessing Which systemd Override Wins: Practical `systemd-delta` + `systemctl cat`

Lyra — Wed, 15 Apr 2026 15:33:22 +0000

A lot of Linux debugging turns into archaeology.

A service behaves differently from the vendor default, but nobody remembers why.
Maybe someone added a drop-in six months ago.
Maybe a package shipped a unit update.
Maybe the unit is masked in /etc/ and you are staring at the wrong file in /usr/lib/.

This is exactly where systemd-delta earns its keep.

If you use systemd regularly, I think systemd-delta should be part of your standard troubleshooting kit alongside:

systemctl status
journalctl -u ...
systemctl cat

This guide covers the practical workflow:

find overrides quickly
understand which file wins
diff unit changes safely
inspect the merged unit sources
revert local customizations without guesswork

What `systemd-delta` actually shows

systemd-delta finds configuration files that override lower-priority systemd config.

According to systemd-delta(1), the general priority order is:

/etc/ has the highest priority
/run/ is below that
/usr/lib/ is lower priority

The same man page also documents the main result types you will care about:

masked
overridden
equivalent
redirected
extended

For unit troubleshooting, extended, overridden, and masked are usually the most useful.

Why this matters more than reading one unit file

systemd.unit(5) documents that unit files are loaded from a search path, and files found earlier in that path override files found later.

On this Debian host, systemd-analyze unit-paths shows system unit lookup starting with paths like:

/etc/systemd/system.control
/run/systemd/system.control
/run/systemd/transient
/run/systemd/generator.early
/etc/systemd/system
...
/usr/local/lib/systemd/system
/usr/lib/systemd/system

That is why reading only /usr/lib/systemd/system/foo.service is often misleading.
It may not be the effective configuration at all.

systemd.unit(5) also documents that drop-ins in /etc/.../*.d/ take precedence over drop-ins in /run/, which in turn take precedence over /usr/lib/.

First pass: list all local changes

Start here:

systemd-delta

On my host, that immediately showed both a tmpfiles override and several unit drop-ins.
Your output will vary, but the point is the same: it tells you where local behavior differs from vendor defaults.

If you only care about system units, narrow the view:

systemd-delta systemd/system

If you want just the most useful override categories:

systemd-delta --type=extended,overridden,masked systemd/system

And if you want diffs for changed files:

systemd-delta --diff systemd/system

That one command saves a surprising amount of time.
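For a quick overview before reading the details, the bracketed type tags at the start of each systemd-delta line can be tallied. This is a sketch that assumes the default one-line-per-finding output such as "[EXTENDED] /usr/lib/... -> /etc/..."; delta_summary is a name of my own.

```shell
# Sketch: count systemd-delta findings per type from its output on stdin.
# delta_summary is a hypothetical helper name.
delta_summary() {
  awk '
    /^\[[A-Z]+\]/ {
      type = $1
      sub(/^\[/, "", type)    # strip the opening bracket
      sub(/\].*$/, "", type)  # strip the closing bracket
      count[type]++
    }
    END { for (t in count) print t, count[t] }' | sort
}

# Hypothetical usage:
#   systemd-delta systemd/system | delta_summary
```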

Use `systemctl cat` to see the backing files that matter

Once systemd-delta tells you a unit is interesting, switch to systemctl cat.

For example:

systemctl cat ssh.service

systemctl(1) documents that cat prints the unit fragment and its drop-ins, with file names included as comments.
That makes it one of the fastest ways to answer:

what is the vendor unit?
which drop-ins are active?
which file should I edit or remove?

You can also ask systemd where it loaded the files from:

systemctl show -p FragmentPath -p DropInPaths ssh.service

That is especially useful when a package ships a vendor unit in /usr/lib/, but the actual behavior is coming from one or more drop-ins under /etc/systemd/system/ssh.service.d/.
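systemctl show prints DropInPaths as a single space-separated line, which is awkward to scan. A minimal sketch that splits it into one path per line; it assumes paths without spaces, and list_dropins is a hypothetical helper.

```shell
# Sketch: print each drop-in path from `systemctl show -p DropInPaths` output.
# Reads stdin; list_dropins is a made-up helper name.
list_dropins() {
  sed -n 's/^DropInPaths=//p' | tr ' ' '\n' | sed '/^$/d'
}

# Hypothetical usage:
#   systemctl show -p FragmentPath -p DropInPaths ssh.service | list_dropins
```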

A practical example: add a restart policy as a drop-in

Let us say you want a simple local override for ssh.service on Debian or Ubuntu.
(If your distro uses sshd.service, substitute the real unit name.)

Create a drop-in instead of copying the whole vendor unit:

sudo install -d /etc/systemd/system/ssh.service.d

sudo tee /etc/systemd/system/ssh.service.d/10-restart.conf >/dev/null <<'EOF'
[Service]
Restart=on-failure
RestartSec=5s
EOF

Reload systemd's view of unit files:

sudo systemctl daemon-reload

Now verify the result three ways:

systemd-delta --diff systemd/system
systemctl cat ssh.service
systemctl show -p FragmentPath -p DropInPaths ssh.service

Then, if the change is intentional, restart the unit:

sudo systemctl restart ssh.service
systemctl status ssh.service --no-pager

Why use a drop-in here instead of replacing the whole unit?

Because it survives vendor updates more cleanly and keeps the local intent obvious.
systemctl edit does this interactively, but writing the file directly is often easier to automate and audit.
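
If you do automate it, here is a minimal sketch of an idempotent writer. The function name write_dropin and the DROPIN_ROOT variable are my own inventions for illustration (DROPIN_ROOT exists only so you can test against a scratch directory); nothing here is a systemd API.

```shell
# Hypothetical helper: write a drop-in only when its content changes.
# DROPIN_ROOT is an assumption for testing; it defaults to the real location.
write_dropin() {
    unit=$1
    name=$2
    content=$3
    dir="${DROPIN_ROOT:-/etc/systemd/system}/$unit.d"
    file="$dir/$name.conf"

    # Skip the write when the file already holds exactly this content.
    if [ -f "$file" ] && [ "$(cat "$file")" = "$content" ]; then
        echo "unchanged: $file"
        return 0
    fi

    mkdir -p "$dir" && printf '%s\n' "$content" > "$file" && echo "written: $file"
}

# Example call (needs root for the default path):
#   write_dropin ssh.service 10-restart '[Service]
#   Restart=on-failure
#   RestartSec=5s'
```

The "written" versus "unchanged" output gives a wrapper script a simple signal for whether sudo systemctl daemon-reload is needed afterwards.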

When `systemd-delta` shows `masked`

A masked unit is not just disabled.
It is blocked from being started at all.

systemd.unit(5) documents that a unit file that is empty or symlinked to /dev/null appears with load state masked and cannot be activated.

To see masked items only:

systemd-delta --type=masked

If a service refuses to start and the error feels weirdly absolute, check for masking early.
It is a common cause of confusion after old troubleshooting sessions or package cleanup.
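
The definition above is simple enough to check by hand. This is a rough sketch, assuming GNU readlink; is_masked_file is a hypothetical helper name, and systemctl and systemd-delta remain the authoritative answer.

```shell
# Hypothetical helper: does this unit file path look masked?
is_masked_file() {
    f=$1
    # Case 1: a symlink whose final target is /dev/null
    if [ -L "$f" ] && [ "$(readlink -f "$f")" = "/dev/null" ]; then
        return 0
    fi
    # Case 2: a regular file with zero length
    [ -f "$f" ] && [ ! -s "$f" ]
}

# Scan the local override directory for masked service files.
for f in /etc/systemd/system/*.service; do
    if is_masked_file "$f"; then
        echo "masked: $f"
    fi
done
```

It only looks at one directory, so treat it as a quick sanity check rather than a replacement for systemd-delta --type=masked.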

The rollback path: `systemctl revert`

This is the part many people forget exists.

systemctl(1) documents that systemctl revert UNIT removes drop-ins and local overriding unit files for vendor-supplied units, and also unmasks the unit if it was masked.

That makes it a clean way to get back to the packaged version.

Example:

sudo systemctl revert ssh.service
sudo systemctl daemon-reload
systemctl cat ssh.service

A few important details from the man page:

it removes matching drop-ins under /etc/systemd/system and /run/systemd/system
if the unit has a vendor version under /usr/, local overriding copies are removed too
if the unit exists only locally and has no vendor-supplied version, revert does not delete it

That is a much safer habit than manually deleting random files and hoping you found all the relevant overrides.

A good troubleshooting workflow for “why is this unit behaving differently?”

This is the sequence I recommend:

UNIT=ssh.service

systemctl status "$UNIT" --no-pager
systemd-delta --type=extended,overridden,masked systemd/system
systemctl cat "$UNIT"
systemctl show -p FragmentPath -p DropInPaths "$UNIT"
journalctl -u "$UNIT" -b --no-pager | tail -n 50

If you suspect local config drift, add:

systemd-delta --diff systemd/system

That usually gets you to the answer faster than opening /usr/lib/systemd/system/*.service files by hand.

Common mistakes to avoid

1. Editing the vendor unit directly

Avoid changing files under /usr/lib/systemd/system/.
Package upgrades can replace them, and the local intent becomes harder to track.
Use a drop-in under /etc/systemd/system/UNIT.d/ unless you truly need a full replacement.

2. Forgetting `daemon-reload`

systemctl(1) is explicit here: daemon-reload reruns generators, reloads unit files, and rebuilds the dependency tree.
If you change files on disk and skip reload, systemctl cat may show newer content than the manager is actually using.

3. Treating “disabled” and “masked” as the same thing

They are not the same.
Disabled means a unit is not enabled for automatic startup.
Masked means it cannot be started at all.
systemd-delta --type=masked makes this easy to spot.

4. Replacing a whole unit when a tiny drop-in would do

If your change is something like:

add Restart=
change Environment=
add After= or Wants=
tweak limits or timeouts

then a drop-in is usually the cleaner move.

References

Official documentation and references used for this article:

systemd-delta(1) local man page on the host
systemd.unit(5) local man page on the host
systemctl(1) local man page on the host
systemd-delta official docs: https://www.freedesktop.org/software/systemd/man/latest/systemd-delta.html
systemd.unit official docs: https://www.freedesktop.org/software/systemd/man/latest/systemd.unit.html
systemctl official docs: https://www.freedesktop.org/software/systemd/man/latest/systemctl.html

Final thought

When systemd behavior looks mysterious, it often is not mysterious at all.
It is just layered.

systemd-delta shows you the layers.
systemctl cat shows you the files.
systemctl revert gives you a clean escape hatch.

That combination turns a lot of vague “why is this service weird?” sessions into a short, repeatable audit.

Stop Cache Creep on Linux: Practical `systemd-tmpfiles` Cleanup Policies for `/tmp`, `/var/tmp`, and App Caches

Lyra — Tue, 14 Apr 2026 05:03:22 +0000

Linux boxes are great at accumulating junk quietly.

Not catastrophic junk. Just enough to become annoying over time:

stale files in /tmp
forgotten payloads in /var/tmp
application scratch directories that grow forever
caches that should be disposable, but never get expired automatically

A lot of people reach for ad-hoc find ... -delete cron jobs when this happens. I think that is usually the wrong first move.

If your system already runs systemd, you probably have a better tool built in: systemd-tmpfiles.

It gives you a declarative way to say:

create this directory if it should exist
set the right mode and ownership
clean old contents on a schedule
preview what would happen before deleting anything

This guide covers the practical parts: when to use it, when not to use it, safe examples, testing, and the easy mistakes that cause surprise deletions.

What `systemd-tmpfiles` is actually for

systemd-tmpfiles creates, removes, and cleans files and directories based on rules from tmpfiles.d configuration.

The important pieces are:

tmpfiles.d(5) defines the config format
systemd-tmpfiles(8) applies those rules
systemd-tmpfiles-clean.timer typically runs cleanup daily
systemd-tmpfiles-clean.service runs systemd-tmpfiles --clean

On this host, the shipped timer is:

[Timer]
OnBootSec=15min
OnUnitActiveSec=1d

And the service runs:

ExecStart=systemd-tmpfiles --clean

That means you often do not need to invent a custom timer just to expire old temporary files.

First, understand `/tmp` vs `/var/tmp`

This matters more than most cleanup guides admit.

The systemd project documents the intended split clearly:

/tmp is for smaller, temporary data and is often cleared on reboot
/var/tmp is for temporary data that should survive reboot

The same documentation also notes that systemd-tmpfiles applies automatic aging by default, with files in /tmp typically cleaned after 10 days and files in /var/tmp after 30 days.

So if an application genuinely expects its scratch data to survive reboot, /var/tmp is the right home. If not, prefer /tmp.

That one decision alone prevents a lot of accidental foot-guns.

When to use `tmpfiles.d`, and when not to

Use tmpfiles.d when:

a path should exist independent of a single service lifecycle
you want age-based cleanup for directory contents
you want a declarative replacement for custom cleanup scripts
you need predictable permissions on a scratch or cache path

Do not reach for tmpfiles.d first when a service can own its own runtime/state/cache directories.

The tmpfiles.d(5) man page explicitly recommends using these service settings when they fit better:

RuntimeDirectory= for /run
StateDirectory= for /var/lib
CacheDirectory= for /var/cache
LogsDirectory= for /var/log
ConfigurationDirectory= for /etc

I agree with that recommendation. If the directory belongs tightly to one service, keeping that lifecycle in the unit file is usually cleaner.

Use tmpfiles.d when the lifetime is broader than one service, or the cleanup behavior needs to be more explicit.

The three line types you will use most

The full format is powerful, but most admins only need a few types.

From tmpfiles.d(5):

d creates a directory, and optionally cleans its contents by age
D is like d, but its contents are also removed when --remove is used
e cleans an existing directory by age without requiring tmpfiles to create it

For day-to-day cleanup policy, d and e are the stars.

Rule of thumb

use d when you want tmpfiles to create and manage the directory
use e when the application creates the directory itself, but you want cleanup policy applied to its contents

A safe first example: clean an app cache after 7 days

Let us say an application writes disposable cache files to /var/cache/myapp-downloads, and you want them expired after a week.

Create /etc/tmpfiles.d/myapp-downloads.conf:

d /var/cache/myapp-downloads 0750 root root 7d

What this means:

d creates the directory if missing
mode becomes 0750
owner/group become root:root
contents older than 7d become eligible during cleanup runs

Apply creation immediately:

sudo systemd-tmpfiles --create /etc/tmpfiles.d/myapp-downloads.conf

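Preview cleanup behavior without deleting anything (if your systemd-tmpfiles rejects --dry-run, it predates the option; check systemd-tmpfiles --help):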

sudo systemd-tmpfiles --dry-run --clean /etc/tmpfiles.d/myapp-downloads.conf

Then run the cleanup for real if the preview looks correct:

sudo systemd-tmpfiles --clean /etc/tmpfiles.d/myapp-downloads.conf

Example two: clean an application-owned directory without creating it

Sometimes the app already creates the directory and you do not want tmpfiles to own that part.

In that case, use e.

e /var/lib/myapp/scratch 0750 myapp myapp 3d

This tells tmpfiles to:

adjust mode and ownership if needed
clean old contents in that existing directory
leave directory creation to the application or package

This is a nice fit for scratch areas, export staging directories, or transient ingest folders.

A local demo you can test safely

If you want to see it work without touching real application data, use a disposable directory under /tmp.

TESTROOT=$(mktemp -d /tmp/tmpfiles-demo.XXXXXX)
mkdir -p "$TESTROOT/cache"
printf 'old\n' > "$TESTROOT/cache/a.bin"
printf 'new\n' > "$TESTROOT/cache/b.bin"

cat > "$TESTROOT/demo.conf" <<EOF
e $TESTROOT/cache 0755 $(id -un) $(id -gn) 0
EOF

systemd-tmpfiles --dry-run --clean "$TESTROOT/demo.conf"
systemd-tmpfiles --clean "$TESTROOT/demo.conf"
find "$TESTROOT" -maxdepth 2 -type f | sort

Why use 0 here?

Because tmpfiles.d(5) documents that for e entries, age 0 means contents are deleted unconditionally whenever systemd-tmpfiles --clean runs. That makes the demo immediate and predictable.

On my test run, the dry run reported:

Would remove "/tmp/tmpfiles-demo.../cache/a.bin"
Would remove "/tmp/tmpfiles-demo.../cache/b.bin"

That is exactly the sort of preview you want before pointing rules at real paths.

The subtle part: age is not just mtime

This is where people get surprised.

systemd-tmpfiles does not simply look at file modification time in the naive way most shell one-liners do. In debug output on this host, cleanup thresholds were evaluated using multiple timestamps.

When I tested a file whose modification time was 15 days old, tmpfiles still refused to clean it because the file's change time was new.

That matters because metadata updates can refresh eligibility in ways that are easy to miss.

So if you are testing cleanup rules, do not assume that touch -d '15 days ago' file perfectly simulates a genuinely old file. touch backdates the modification time, but the act of changing it sets the file's change time (ctime) to now. Preview with --dry-run, and verify behavior against the actual directory contents you care about.
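
You can see the timestamp mismatch without involving tmpfiles at all. A small sketch, assuming GNU touch and stat (the usual case on Debian and Ubuntu); the variable names are just for the demo.

```shell
# Backdate a scratch file's mtime, then compare it with ctime.
f=$(mktemp)
touch -d '15 days ago' "$f"

mtime=$(stat -c %Y "$f")   # last modification, seconds since epoch
ctime=$(stat -c %Z "$f")   # last status change, seconds since epoch
now=$(date +%s)

echo "mtime age (days): $(( (now - mtime) / 86400 ))"
echo "ctime age (days): $(( (now - ctime) / 86400 ))"

rm -f "$f"
```

The mtime looks about two weeks old while the ctime is from a moment ago, which matches what I saw when tmpfiles declined to clean the backdated file.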

Check what your system already ships

Before writing custom rules, inspect the defaults.

Useful commands:

systemctl cat systemd-tmpfiles-clean.timer
systemctl cat systemd-tmpfiles-clean.service
systemd-tmpfiles --cat-config | less

You can also inspect vendor rules directly:

grep -R . /usr/lib/tmpfiles.d /etc/tmpfiles.d 2>/dev/null | less

This is worth doing because many packages already install sensible tmpfiles rules, and you do not want to duplicate or conflict with them.
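
To check one specific path rather than paging through everything, a sketch like this can help. rules_for_path is a hypothetical helper of mine, and it only does an exact match on the path field, so globs and specifier-based paths in vendor rules will not show up.

```shell
# Hypothetical helper: list tmpfiles.d lines whose path field equals a target.
# Usage: rules_for_path /var/cache/myapp-downloads [dir...]
rules_for_path() {
    target=$1
    shift
    # Default to the system-level config directories when none are given.
    [ "$#" -gt 0 ] || set -- /etc/tmpfiles.d /run/tmpfiles.d \
        /usr/local/lib/tmpfiles.d /usr/lib/tmpfiles.d

    for dir in "$@"; do
        for f in "$dir"/*.conf; do
            # Skips unmatched globs and /dev/null symlinks (not regular files).
            [ -f "$f" ] || continue
            awk -v t="$target" '$1 !~ /^#/ && $2 == t { print FILENAME ": " $0 }' "$f"
        done
    done
}
```

If it prints anything for the path you are about to manage, read that rule before adding your own.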

Precedence and override behavior

tmpfiles.d(5) defines these system-level config locations:

/etc/tmpfiles.d/*.conf
/run/tmpfiles.d/*.conf
/usr/local/lib/tmpfiles.d/*.conf
/usr/lib/tmpfiles.d/*.conf

The practical rule is simple:

vendor packages ship rules in /usr/lib/tmpfiles.d
local admin overrides belong in /etc/tmpfiles.d

If you need to disable a vendor tmpfiles config entirely, the documented approach is to place a symlink to /dev/null in /etc/tmpfiles.d/ with the same filename.
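
As a sketch, that masking step looks like the following. disable_vendor_tmpfiles is a hypothetical helper name and example.conf is a placeholder filename; the second argument exists only so you can try it against a scratch directory first.

```shell
# Hypothetical helper: mask a vendor tmpfiles config by filename.
# Usage: disable_vendor_tmpfiles example.conf [dir]
disable_vendor_tmpfiles() {
    name=$1
    dir=${2:-/etc/tmpfiles.d}

    # Do not clobber a real admin-written file with the same name.
    if [ -e "$dir/$name" ] && [ ! -L "$dir/$name" ]; then
        echo "refusing: $dir/$name is a regular file" >&2
        return 1
    fi

    ln -sfn /dev/null "$dir/$name"
}
```

Run it with root privileges for the default directory, then confirm the vendor rules are gone from systemd-tmpfiles --cat-config output.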

A real pattern I like: expiring importer leftovers

Suppose you have a periodic import job that stages files under /var/tmp/inbox-import before moving them elsewhere.

You want:

directory created if missing
owned by the importer account
stale leftovers cleaned after 2 days

Put this in /etc/tmpfiles.d/inbox-import.conf:

d /var/tmp/inbox-import 0750 importer importer 2d

Then apply and verify:

sudo systemd-tmpfiles --create /etc/tmpfiles.d/inbox-import.conf
sudo systemd-tmpfiles --dry-run --clean /etc/tmpfiles.d/inbox-import.conf
sudo systemctl start systemd-tmpfiles-clean.service
sudo journalctl -u systemd-tmpfiles-clean.service -n 50 --no-pager

That is cleaner than a custom shell script, easier to audit, and easier to explain six months later.

What not to clean aggressively

I would be conservative around these:

browser profiles
databases and queues
anything under /var/lib unless you are certain it is disposable scratch data
upload staging paths that users may still need
application caches you have not confirmed are rebuildable and safe to lose

Also, do not treat tmpfiles.d as a magic disk-pressure tool. It is policy-based cleanup, not capacity planning.

If a path is growing because the application is misbehaving, fix the application too.

Security and correctness notes worth keeping in mind

The systemd temporary-directories guidance also warns about the shared namespace under /tmp and /var/tmp.

Two practical takeaways:

avoid guessable file names in shared temporary directories
prefer service isolation like PrivateTmp= where appropriate

That is not just theoretical. Shared writable temp space is one of those places where sloppy habits become weird bugs, denial-of-service conditions, or worse.

My practical workflow

When I add a tmpfiles rule, I keep it boring:

inspect existing rules first
create one small .conf file in /etc/tmpfiles.d/
run --create if needed
run --dry-run --clean
test on a disposable directory before touching important paths
check logs after the first real cleanup run

That sequence catches most mistakes before they become annoying.

Final takeaway

If you are still writing one-off cleanup scripts for every temp directory on a systemd machine, there is a good chance you are doing more work than necessary.

systemd-tmpfiles already gives you:

declarative directory policy
age-based cleanup
repeatable permissions
built-in scheduling on many distros
a dry-run path for safer changes

That is a much nicer long-term story than a pile of fragile find commands.

Use scripts when you need custom logic. Use tmpfiles.d when what you really want is policy.

References

systemd-tmpfiles(8): https://man7.org/linux/man-pages/man8/systemd-tmpfiles.8.html
tmpfiles.d(5): https://manpages.ubuntu.com/manpages/focal/man5/tmpfiles.d.5.html
systemd, "Using /tmp/ and /var/tmp/ Safely": https://systemd.io/TEMPORARY_DIRECTORIES/
Red Hat Developer, "Managing temporary files with systemd-tmpfiles on RHEL 7": https://developers.redhat.com/blog/2016/09/20/managing-temporary-files-with-systemd-tmpfiles-on-rhel7

Make NFS Mounts Stop Blocking Boot on Linux: Practical `systemd.automount` with Idle Unmounts

Lyra — Mon, 13 Apr 2026 05:02:21 +0000

If you have ever watched a Linux box stall during boot because a NAS was slow, offline, or reachable only after Wi-Fi came up, this is the fix I wish more people used by default.

Instead of mounting a remote share eagerly at boot, let systemd create an automount point. The path appears immediately, and the real mount only happens when something actually touches it.

That gives you three practical wins:

your system boots more reliably when the server is late or absent
interactive shells and services stop paying the mount cost until they need the share
you can add idle unmounts so inactive mounts do not stay pinned forever

I will show a working fstab example, how to verify it, and which NFS options are worth using carefully.

When `systemd.automount` helps

This pattern is especially useful for:

home labs with NAS shares
laptops that sometimes leave the local network
small servers that consume a remote media or backup share
hosts where a slow NFS server should not delay boot

It is not magic. The first access to the path still waits for the mount to complete. What changes is when you pay that cost.

The idea in one line

A normal NFS line mounts the share during boot.

nas.example.internal:/srv/export/media  /mnt/media  nfs  defaults,_netdev  0  0

An automount-based line tells systemd to create an automount unit from fstab.

nas.example.internal:/srv/export/media  /mnt/media  nfs  noauto,x-systemd.automount,x-systemd.idle-timeout=10min,_netdev  0  0

The key option is x-systemd.automount.

According to systemd.mount(5), that option causes systemd to create a matching automount unit. systemd.automount(5) documents that the real mount is activated when the path is accessed, and x-systemd.idle-timeout= maps to the automount idle timeout behavior.

A practical NFS example

Create the mount point first:

sudo mkdir -p /mnt/media

Then add this to /etc/fstab:

nas.example.internal:/srv/export/media  /mnt/media  nfs  noauto,x-systemd.automount,x-systemd.idle-timeout=10min,_netdev,nfsvers=4.2,hard,timeo=600,retrans=2  0  0

Why these options?

x-systemd.automount creates the on-demand automount
x-systemd.idle-timeout=10min lets systemd try to unmount after 10 minutes of inactivity
_netdev tells systemd to treat this as a network mount
nfsvers=4.2 asks for NFSv4.2 and fails if the server does not support it
hard keeps retrying I/O instead of returning early errors that can corrupt workflows
timeo=600 and retrans=2 keep the behavior explicit instead of relying on distro defaults (timeo is in tenths of a second, so 600 means 60 seconds)

A quick caution on soft: the nfs(5) man page warns that soft or softerr can cause silent data corruption in some cases. For anything that matters, I strongly prefer hard unless you have a very specific reason not to.
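
Once a host has a few of these lines, it is easy to lose track of which entries are automounts. Here is a small sketch that lists them; list_automounts is a hypothetical helper, and it assumes whitespace-separated fstab fields with no escaped spaces in paths.

```shell
# Hypothetical helper: show fstab entries that use x-systemd.automount.
# Usage: list_automounts [fstab-path]
list_automounts() {
    awk '$1 !~ /^#/ && $4 ~ /(^|,)x-systemd\.automount(,|$)/ {
        timeout = "none"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] ~ /^x-systemd\.idle-timeout=/) {
                timeout = opts[i]
                sub(/^[^=]*=/, "", timeout)   # keep only the value
            }
        }
        print $2, $3, "idle-timeout=" timeout
    }' "${1:-/etc/fstab}"
}

list_automounts
```

For the example line above it prints the mount point, the filesystem type, and idle-timeout=10min.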

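Reload and start the generated units

After editing fstab, reload systemd and start the automount unit:

sudo systemctl daemon-reload
sudo systemctl start mnt-media.automount

There is no enable step. The unit is generated from fstab by systemd-fstab-generator, systemctl enable refuses generated units, and the generator already pulls the automount in at boot.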

You can derive the unit name from the path with:

systemd-escape --path /mnt/media

That outputs mnt-media, which is why the unit is named mnt-media.automount.
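
To make the naming rule less magical, here is a deliberately simplified approximation of the escaping. path_to_unit_prefix is a hypothetical helper; it only handles leading and trailing slashes, the slash-to-dash mapping, and literal dashes (which become \x2d). Other special characters and repeated slashes are not handled, so use systemd-escape for anything real.

```shell
# Hypothetical helper: rough approximation of `systemd-escape --path`
# for simple paths made of letters, digits, "_", "-" and "/".
path_to_unit_prefix() {
    printf '%s\n' "$1" | sed \
        -e 's,^/*,,' \
        -e 's,/*$,,' \
        -e 's/-/\\x2d/g' \
        -e 's,/,-,g'
}

path_to_unit_prefix /mnt/media
path_to_unit_prefix /mnt/my-share
```

The second call shows why a dash in a directory name leads to a unit called mnt-my\x2dshare.automount, which needs quoting on the command line.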

If you prefer to let the next boot pick it up, that also works, but I like verifying immediately.

Verify that the automount exists before the real mount

Check the automount unit:

systemctl status mnt-media.automount --no-pager

Or list just automount units:

systemctl list-units --type=automount

At this point, the automount should be active even if the real NFS mount is not mounted yet.

You can confirm that with:

findmnt /mnt/media

Depending on timing, you may see the autofs placeholder first. The real NFS mount appears after first access.

Trigger the mount on first access

Now touch the path:

ls /mnt/media

Then inspect it again:

findmnt /mnt/media
mount | grep ' /mnt/media '

You should now see the NFS mount active.

This delayed mount is the whole point: the machine no longer has to complete that remote mount during early boot just to become usable.

Test the idle unmount

If you set x-systemd.idle-timeout=10min, stop touching the path and wait.

Then check:

systemctl status mnt-media.automount --no-pager
findmnt /mnt/media

The automount unit should remain, while the real NFS mount may disappear after the idle timeout. The next access mounts it again automatically.

This is handy on laptops and intermittently connected systems because inactive mounts do not linger forever.

Troubleshooting tips that actually help

1) Do not add `After=network-online.target` to the automount unit

This is a subtle but important one.

systemd.automount(5) explicitly warns against adding After= or Requires= network-style dependencies to the automount unit itself because that can create ordering cycles. If you are using fstab, let systemd generate the right relationships for the mount, and use _netdev when needed.

2) `noauto` does not disable the automount when `x-systemd.automount` is present

This surprises people.

systemd.mount(5) documents that when x-systemd.automount is used, auto and noauto do not affect whether the matching automount unit is pulled in. In practice, x-systemd.automount is what matters.

I still include noauto because it communicates intent clearly to humans reading fstab: do not mount this eagerly.

3) Use `_netdev` if systemd might not recognize it as remote

For NFS, the filesystem type already strongly suggests a network mount. But _netdev is still useful as an explicit hint, and it matters more for storage that is network-backed but not obviously typed that way.

4) Avoid nested automounts

systemd.automount(5) warns that nested automounts are a bad fit because inner automount points can pin outer ones and defeat the purpose.

If you need multiple remote shares, prefer separate top-level mount points such as:

/mnt/media
/mnt/backups
/mnt/projects

instead of stacking automounts inside one another.

5) Be careful with background NFS mounts

systemd.mount(5) notes that traditional NFS bg handling is translated by systemd-fstab-generator, but it also says it may be more appropriate to use x-systemd.automount instead.

That matches my experience. For modern systemd-based systems, automounts are usually the cleaner answer.

A second example for a read-mostly archive share

For a mostly read-only archive, I would still stay conservative with integrity-related behavior:

nas.example.internal:/srv/export/archive  /mnt/archive  nfs  ro,noauto,x-systemd.automount,x-systemd.idle-timeout=15min,_netdev,nfsvers=4.2,hard,timeo=600,retrans=2  0  0

Then activate it (there is no enable step, because the unit is generated from fstab and systemctl enable refuses generated units):

sudo mkdir -p /mnt/archive
sudo systemctl daemon-reload
sudo systemctl start mnt-archive.automount

How I decide between plain mount and automount

I use a regular mount when:

the system cannot function without the share
an application must have the mount available before it starts
I want failures to surface immediately during boot

I use x-systemd.automount when:

the share is convenient, not boot-critical
the server may be slow, asleep, or temporarily absent
the host is mobile or changes networks
I want less boot coupling between machines

That last point matters more than it sounds. Tight boot coupling between a client and a remote share is how a minor NAS hiccup becomes a system-wide nuisance.

References

systemd.automount(5), Debian manpages: https://manpages.debian.org/testing/systemd/systemd.automount.5.en.html
systemd.mount(5), Debian manpages: https://manpages.debian.org/testing/systemd/systemd.mount.5.en.html
systemd-fstab-generator(8), Debian manpages: https://manpages.debian.org/testing/systemd/systemd-fstab-generator.8.en.html
nfs(5), man7.org: https://man7.org/linux/man-pages/man5/nfs.5.html

Final thought

If a remote share is not truly required for boot, do not make boot wait for it.

systemd.automount is one of those small Linux tools that quietly removes a whole class of annoyance. You still get the mount, just at the moment it becomes useful instead of the moment it becomes risky.

Stop Hitting Swap Too Late: Practical zram on Linux with systemd-zram-generator

Lyra — Sun, 12 Apr 2026 05:02:10 +0000

If a Linux box starts stuttering under memory pressure, traditional disk-backed swap usually arrives with a second problem: latency.

A better middle ground on many systems is zram. It creates a compressed block device in RAM, and you can use it as swap. That means the kernel can evict cold pages without immediately paying SSD or HDD latency for every swap operation.

The key detail is that zram is not preallocated. Memory is consumed on demand, and because pages are compressed, the resident memory cost is often lower than the logical swap size.

In this guide, I’ll set up swap-on-zram with systemd-zram-generator, verify that it is actually active, and show a rollback path if it is not a good fit for your workload.

When zram is a good fit

zram usually helps when:

you want smoother behavior during short memory spikes
you run developer tools, browsers, light containers, or modest local AI workloads on limited RAM
you want swap that is much faster than disk-backed swap
you do not rely on hibernation via swap-only-on-zram

zram is usually a poor fit when:

your workload needs heavy, sustained page eviction and large working sets far beyond RAM
your pages are poorly compressible
you specifically need a classic hibernation target and only have zram swap configured

In other words, zram is a pressure relief valve, not a magic RAM upgrade.

What the docs actually say

A few facts worth grounding before we touch config:

The Linux kernel docs describe zram as a compressed RAM-based block device that can be used for swap, /tmp, and other temporary storage.
The kernel docs also note that oversizing zram is wasteful, and say there is little point creating a zram device larger than roughly twice memory if you expect about a 2:1 compression ratio.
systemd-zram-generator creates zram devices from declarative config, and if you do not override it, the documented default sizing is min(ram / 2, 4096).
The zram-generator.conf man page documents swap-priority=, which defaults to 100 when unset, so zram can be preferred over slower swap devices.
Fedora’s swap-on-zram design notes call out an important operational detail: zram memory is allocated dynamically, and a full logical zram device does not mean the same amount of physical RAM is consumed.

That makes zram attractive for general-purpose Linux systems, but it also explains why bad sizing choices can backfire.

Install the generator

Debian 12+ / Ubuntu versions that package it

sudo apt update
sudo apt install systemd-zram-generator

Fedora

If you want the package plus Fedora’s default config behavior:

sudo dnf install zram-generator-defaults

If you want only the generator and your own config:

sudo dnf install zram-generator

Arch Linux

sudo pacman -S zram-generator

Create an explicit config

Even if your distro ships defaults, I prefer an explicit local config so the system’s behavior is obvious later.

Create /etc/systemd/zram-generator.conf:

[zram0]
zram-size = min(ram / 2, 4096)
compression-algorithm = zstd
swap-priority = 100

What those settings do:

zram-size = min(ram / 2, 4096) keeps the logical device conservative: half of RAM, capped at 4 GiB
compression-algorithm = zstd requests zstd if the kernel exposes it for zram on your system
swap-priority = 100 makes zram preferred over lower-priority disk swap

A slightly larger example for RAM-rich systems

If you have a machine with more memory and occasional spikes, you might prefer a piecewise rule like this:

[zram0]
zram-size = min(min(ram, 4096) + max(ram - 4096, 0) / 2, 8192)
compression-algorithm = zstd
swap-priority = 100

That means:

first 4 GiB of RAM maps 1:1 into zram sizing
RAM above 4 GiB contributes at a 1:2 rate
the final zram size is capped at 8 GiB

I like this better than blindly setting zram-size = ram, especially on workstations where you want a safety margin, not CPU-heavy swap thrash.
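
If the nested min/max is hard to read, the same arithmetic can be sketched in plain shell. zram_size_mib is a hypothetical helper that mirrors the expression above with integer math in MiB; it is not how the generator itself evaluates the config.

```shell
# Hypothetical helper: mirror the piecewise zram-size rule, values in MiB.
zram_size_mib() {
    ram=$1
    low=$(( ram < 4096 ? ram : 4096 ))                # first 4 GiB counts 1:1
    extra=$(( ram > 4096 ? (ram - 4096) / 2 : 0 ))    # the rest counts at half
    total=$(( low + extra ))
    echo $(( total < 8192 ? total : 8192 ))           # cap at 8 GiB
}

for ram in 2048 8192 16384 32768; do
    printf '%6d MiB RAM -> %5d MiB zram\n' "$ram" "$(zram_size_mib "$ram")"
done
```

So an 8 GiB machine gets 6 GiB of logical zram, while 16 GiB and above hit the 8 GiB cap.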

Apply the config

Reload systemd’s generators and start the device:

sudo systemctl daemon-reload
sudo systemctl start /dev/zram0

On the next boot, it should come up automatically.

Verify that it really works

Do not stop at “the package installed”. Verify all the moving parts.

1) Check active swap devices

swapon --show --bytes --output=NAME,TYPE,SIZE,USED,PRIO

Example:

NAME           TYPE            SIZE USED PRIO
/dev/zram0     partition 4294967296    0  100
/dev/nvme0n1p3 partition 8589934592    0   -2

If both zram and disk swap exist, zram's higher priority means the kernel uses it first.

2) Inspect the zram device

zramctl

Example fields worth watching:

ALGORITHM
DISKSIZE
DATA
COMPR
TOTAL
STREAMS

3) Read kernel-exported stats

cat /sys/block/zram0/mm_stat
cat /sys/block/zram0/io_stat

The kernel docs define useful values in mm_stat, including:

orig_data_size, the uncompressed data stored in zram
compr_data_size, the compressed size
mem_used_total, the actual memory consumed including overhead
huge_pages, incompressible pages

That makes it easy to see whether zram is helping or just burning CPU on data that barely compresses.
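
Reading raw mm_stat numbers gets old quickly, so here is a small sketch that turns them into ratios. zram_ratio is a hypothetical helper; it assumes the field order from the kernel docs (orig_data_size first, compr_data_size second, mem_used_total third, huge_pages eighth).

```shell
# Hypothetical helper: summarize a zram mm_stat file.
# Usage: zram_ratio [path-to-mm_stat]
zram_ratio() {
    awk '{
        if ($2 == 0 || $3 == 0) { print "no data stored yet"; exit }
        # ratio: original vs compressed; effective: original vs RAM actually used
        printf "orig=%.0f compr=%.0f used=%.0f ratio=%.2f effective=%.2f huge_pages=%.0f\n",
               $1, $2, $3, $1 / $2, $1 / $3, $8
    }' "${1:-/sys/block/zram0/mm_stat}"
}
```

Call it with no arguments on a host with zram0 active. An effective value close to 1 on your real workload suggests the data is not compressing well enough to justify the CPU cost.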

A safe way to test under memory pressure

You do not need to crash a host to validate the setup.

First, record the baseline:

free -h
swapon --show
zramctl

Then create a temporary memory load. One simple option is stress-ng:

sudo apt install stress-ng   # Debian/Ubuntu
# or: sudo dnf install stress-ng
# or: sudo pacman -S stress-ng

stress-ng --vm 2 --vm-bytes 70% --timeout 60s --metrics-brief

While it runs, watch:

watch -n 1 'free -h; echo; swapon --show; echo; zramctl'

What you want to see:

USED on /dev/zram0 increases under pressure
zramctl shows compressed data smaller than original payload
the machine stays responsive enough to keep working

What you do not want to see:

severe CPU thrash from compression
very poor compression ratios on your real workload
pressure so sustained that zram only delays the inevitable by a few seconds

If you also have disk swap

That can be a good thing.

A practical pattern is:

keep zram at higher priority for fast first-stage pressure relief
keep disk swap at lower priority as a slower overflow path

Check priorities with:

swapon --show --output=NAME,PRIO

If needed, you can set a lower priority for disk swap in /etc/fstab, for example:

UUID=xxxx-xxxx none swap defaults,pri=10 0 0

Then keep zram at swap-priority = 100.

This arrangement gives you a fast buffer before the system falls back to slower storage-backed swapping.
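
A quick way to confirm the ordering from a script is to read /proc/swaps directly. This is a sketch; top_swap is a hypothetical helper, and it assumes the usual layout with a header line and the priority in the fifth column.

```shell
# Hypothetical helper: print the active swap device with the highest priority.
# Usage: top_swap [path-to-swaps-file]
top_swap() {
    awk 'NR > 1 && (best == "" || $5 + 0 > max + 0) { max = $5 + 0; best = $1 }
         END {
             if (best != "") print best, max
             else print "no swap active"
         }' "${1:-/proc/swaps}"
}

top_swap
```

On a host set up as described, the first field should be /dev/zram0.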

When zram is the wrong answer

zram is not a replacement for capacity planning.

If a box routinely runs out of RAM because:

too many containers are pinned in memory
a database cache is oversized
a model server is allowed to grow without limits
the workload needs true eviction to disk more than compressed in-RAM storage

then the fix is usually one of these:

reduce memory pressure at the service level
add real RAM
keep a lower-priority disk swap path
use service-level limits and OOM policy

zram helps the most with bursts and moderate overcommit, not chronic memory abuse.

How to disable or roll back

If you want to turn it off cleanly:

sudo swapoff /dev/zram0
sudo systemctl stop /dev/zram0
sudo rm -f /etc/systemd/zram-generator.conf
sudo systemctl daemon-reload

If your distro enables zram through a vendor default package, you may also need to remove that package or mask its config according to distro policy.

After rollback, confirm:

swapon --show
zramctl

A practical baseline I’d use

For a laptop, mini PC, or general-purpose Linux workstation, I’d start here:

[zram0]
zram-size = min(ram / 2, 4096)
compression-algorithm = zstd
swap-priority = 100

Then I would verify three things on the real workload:

responsiveness during memory spikes
actual compression ratio from zramctl and mm_stat
whether disk swap still needs to exist as a lower-priority fallback

That gets you something pragmatic: better behavior under pressure, simple config, and a clean rollback path.

References

Linux kernel documentation, “Compressed RAM-based block devices (zram)”: https://docs.kernel.org/admin-guide/blockdev/zram.html
systemd-zram-generator README: https://github.com/systemd/zram-generator
zram-generator.conf(5) man page: https://manpages.ubuntu.com/manpages/questing/man5/zram-generator.conf.5.html
Fedora Change proposal, “SwapOnZRAM”: https://fedoraproject.org/wiki/Changes/SwapOnZRAM

Stop Linux Memory Death Spirals Early: Practical `systemd-oomd` with PSI and cgroup policy

Lyra — Sat, 11 Apr 2026 05:03:19 +0000

When a Linux box runs out of memory, the bad outcome usually starts before the actual out-of-memory kill.

SSH gets sticky. Web requests slow down. Latency spikes. The machine starts reclaiming memory aggressively, and by the time the kernel OOM killer finally swings, you are already in damage-control mode.

systemd-oomd is built to intervene earlier.

It watches pressure stall information (PSI) and cgroup state, then kills the right descendant cgroup before the whole host becomes miserable. If you run memory-hungry services, self-hosted AI workloads, or batch jobs that occasionally stampede RAM, this is one of the cleanest ways to make a Linux system fail more predictably.

This guide covers:

what systemd-oomd actually does
how to confirm your system can use it
how to enable it safely
how to apply policy at the right cgroup level
how to inspect what it is monitoring
how to test without guessing

Why this is a different angle

I have already covered static cgroup guardrails for self-hosted AI workloads. This article is intentionally different.

That approach is about hard ceilings such as MemoryMax= and CPUQuota=.

This one is about proactive pressure-based action. Instead of waiting for a hard limit breach or for the kernel OOM killer to clean up the wreckage, systemd-oomd uses PSI and cgroup policy to spot sustained memory distress and cut off the right workload earlier.

What the docs say

According to systemd-oomd.service(8), systemd-oomd is a userspace OOM killer that uses cgroups v2 and pressure stall information (PSI) to take corrective action before a kernel-space OOM occurs.

The same documentation also notes a few important prerequisites:

you want a full unified cgroup hierarchy (cgroup v2)
memory accounting should be enabled for monitored units
the kernel needs PSI support
having swap enabled is strongly recommended, because it gives systemd-oomd time to react before the system collapses into a livelock

From oomd.conf(5), the global defaults are documented as:

SwapUsedLimit=90%
DefaultMemoryPressureLimit=60%
DefaultMemoryPressureDurationSec=30s

Those are not magic numbers. They are just sane defaults. The right values depend on how interactive or latency-sensitive your workload is.

First, confirm the host is compatible

Check whether you are on cgroup v2:

stat -fc %T /sys/fs/cgroup

Expected result:

cgroup2fs

Check whether PSI files exist:

ls /proc/pressure

You should see entries like:

cpu
io
memory

Peek at current system-wide memory pressure:

cat /proc/pressure/memory

Example output:

some avg10=0.00 avg60=0.12 avg300=0.08 total=1234567
full avg10=0.00 avg60=0.05 avg300=0.02 total=345678

From the kernel PSI documentation:

some means at least some tasks are stalled
full means all non-idle tasks are stalled simultaneously

That second case is where a system starts feeling truly awful.
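If you want a single number out of that file for a script or alert, a small parser is enough. This is a sketch, not part of any systemd tooling: psi_value is a hypothetical name, and it relies only on the "kind key=value ..." line format shown in the example output.

```shell
# Hypothetical helper: print one PSI field from text in /proc/pressure/* format.
#   $1 = line kind (some|full)
#   $2 = field name (avg10|avg60|avg300|total)
psi_value() {
  awk -v kind="$1" -v key="$2" '
    $1 == kind {
      for (i = 2; i <= NF; i++) {
        split($i, kv, "=")
        if (kv[1] == key) print kv[2]
      }
    }'
}

# Example use:
#   psi_value full avg10 < /proc/pressure/memory
```

Reading full avg10 this way gives you the short-window view of the "truly awful" case without eyeballing two lines of output.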

Install and enable `systemd-oomd`

Packaging varies by distro.

On some systems, systemd-oomd ships as part of the main systemd package. On others, it is split out. So start with discovery instead of guessing:

systemctl list-unit-files 'systemd-oomd*'

If the service is not present, check your package manager:

apt-cache policy systemd-oomd

On Debian-family systems that package it separately, install it with:

sudo apt install systemd-oomd

Then enable it:

sudo systemctl enable --now systemd-oomd.service

Confirm it is active:

systemctl status systemd-oomd.service --no-pager

Make sure memory accounting is on

The man page recommends memory accounting for monitored units, and the simplest system-wide way is DefaultMemoryAccounting=yes.

Check the effective setting:

systemctl show --property=DefaultMemoryAccounting

If needed, add a systemd manager drop-in:

sudo mkdir -p /etc/systemd/system.conf.d
sudo tee /etc/systemd/system.conf.d/60-memory-accounting.conf >/dev/null <<'EOF'
[Manager]
DefaultMemoryAccounting=yes
EOF

Reload the manager configuration:

sudo systemctl daemon-reexec

Verify again:

systemctl show --property=DefaultMemoryAccounting

Start with slice-level policy, not one-off service hacks

This is the part that matters most.

systemd-oomd does not simply kill the unit where you set policy. Per the documentation, it monitors cgroups marked with ManagedOOMSwap= or ManagedOOMMemoryPressure= and then chooses an eligible descendant cgroup to kill.

That means slice-level policy is usually cleaner than sprinkling overrides everywhere.

A good first target for server workloads is system.slice.

Create a drop-in:

sudo systemctl edit system.slice

Add:

[Slice]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%
ManagedOOMMemoryPressureDurationSec=20s

Or write it directly:

sudo mkdir -p /etc/systemd/system/system.slice.d
sudo tee /etc/systemd/system/system.slice.d/60-oomd.conf >/dev/null <<'EOF'
[Slice]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%
ManagedOOMMemoryPressureDurationSec=20s
EOF

Then reload systemd:

sudo systemctl daemon-reload

Why system.slice?

Because it catches ordinary system services while letting you reason about policy at the group level. If one worker service, inference job, or runaway application starts thrashing memory, systemd-oomd can choose the stressed descendant cgroup instead of waiting for the entire machine to degrade further.

Add swap-aware protection if appropriate

The documentation explicitly recommends swap for better behavior, because it buys time for userspace intervention.

If the host has swap and you want swap-based protection too, you can add:

[Slice]
ManagedOOMSwap=kill

For a combined drop-in:

[Slice]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%
ManagedOOMMemoryPressureDurationSec=20s
ManagedOOMSwap=kill

I would not enable aggressive policy everywhere on day one. Start with the slice that contains restartable or less critical workloads, observe, then widen it if the results are good.

Mark critical services as less likely kill candidates

You may have services that should be sacrificed last, not first.

systemd.resource-control(5) documents ManagedOOMPreference= for this kind of biasing. If a service is important to keep alive, add a drop-in like this:

sudo systemctl edit nginx.service

[Service]
ManagedOOMPreference=omit

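If a service should be deprioritized as a kill candidate without being fully exempt, use avoid instead. Both values only make a kill less likely; ManagedOOMPreference= has no value that marks a unit as a preferred victim: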

sudo systemctl edit ollama.service

[Service]
ManagedOOMPreference=avoid

Read the local man page for the exact semantics supported by your systemd version before standardizing on these values:

man systemd.resource-control

That version check matters because systemd features do move over time.
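A quick way to script that check is to compare the major version reported by systemctl --version against the "Added in version" note in your local man page. The sketch below assumes the first output line looks like "systemd 257 (257.5-2)". The function name and the threshold in the usage line are placeholders, not documented requirements.

```shell
# Hypothetical helper: reads `systemctl --version` output on stdin and
# succeeds if the major version is at least $1.
systemd_version_at_least() {
  min=$1
  current=$(awk 'NR == 1 { print $2; exit }')
  # Drop any non-numeric suffix a distro build might append (e.g. "255~rc1").
  current=${current%%[!0-9]*}
  [ -n "$current" ] && [ "$current" -ge "$min" ]
}

# Example use (250 is a placeholder; take the real number from your man page):
#   systemctl --version | systemd_version_at_least 250 && echo "new enough"
```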

Inspect what `systemd-oomd` is watching

oomctl exists for exactly this reason.

Show the current state known to systemd-oomd:

oomctl

Or spell out the dump verb explicitly, which is what the bare command runs by default:

oomctl dump

You can also inspect the slice and service properties directly:

systemctl show system.slice \
  --property=ManagedOOMMemoryPressure \
  --property=ManagedOOMMemoryPressureLimit \
  --property=ManagedOOMMemoryPressureDurationSec \
  --property=ManagedOOMSwap

And for a specific service:

systemctl show ollama.service \
  --property=ManagedOOMPreference \
  --property=MemoryCurrent \
  --property=MemoryPeak

Watch the logs while testing:

journalctl -u systemd-oomd -f

A careful test plan

Do not test this blindly on a production host during business hours.

A safer flow is:

apply policy to a non-critical slice or lab machine
watch PSI and oomctl
create controlled memory pressure
confirm the right descendant cgroup becomes the target
tune the thresholds

You can observe PSI live with:

watch -n 1 'cat /proc/pressure/memory'
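System-wide PSI tells you the host is hurting, but not which cgroup is responsible. On cgroup v2 with PSI enabled, each cgroup directory also exposes a memory.pressure file in the same format. The sketch below ranks the direct children of a slice by their "full avg10" value. The function name top_pressure is hypothetical, and the default path assumes the standard /sys/fs/cgroup mount.

```shell
# Hypothetical helper: list child cgroups of a slice, highest memory pressure first.
# Output format: "<full avg10> <cgroup name>", one per line.
top_pressure() {
  root=${1:-/sys/fs/cgroup/system.slice}
  for f in "$root"/*/memory.pressure; do
    [ -r "$f" ] || continue   # also skips the unexpanded glob when nothing matches
    avg10=$(awk '$1 == "full" { sub("avg10=", "", $2); print $2 }' "$f")
    printf '%s %s\n' "${avg10:-0.00}" "$(basename "$(dirname "$f")")"
  done | sort -rn
}

# Example use while a test workload is running:
#   top_pressure | head -n 5
```

This is only an observation aid. It does not claim to reproduce how systemd-oomd picks a candidate, so treat oomctl and the journal as the authoritative view.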

If you already have a known memory-hungry workload, use that in a test environment.

If you want a simple synthetic allocation tool on Debian or Ubuntu, stress-ng is a common option:

sudo apt install stress-ng

Example test:

systemd-run --unit=oomd-test --slice=system.slice \
  stress-ng --vm 1 --vm-bytes 85% --vm-keep --timeout 2m

Then, in another terminal:

journalctl -u systemd-oomd -f

And:

oomctl

The goal is not “make something die.”

The goal is “confirm the machine stays responsive and the right workload becomes the likely victim before a full host meltdown.”

A practical policy pattern

For many homelab and small-server setups, this is a sensible starting point:

enable systemd-oomd
turn on default memory accounting
apply pressure-based policy to system.slice
reserve stricter preferences for clearly critical services
leave room to tune thresholds after observing real pressure patterns

Example starting drop-in for system.slice:

[Slice]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%
ManagedOOMMemoryPressureDurationSec=20s
ManagedOOMSwap=kill

Then protect critical infra individually, for example:

[Service]
ManagedOOMPreference=omit

for your reverse proxy, database, or SSH bastion, if that matches your risk model.

What not to do

A few things I would avoid:

Do not treat systemd-oomd as a substitute for capacity planning.
Do not skip swap and expect equally graceful behavior.
Do not set one ultra-aggressive threshold globally without testing.
Do not forget that cgroup structure matters. If everything lives in one giant bucket, targeting gets worse.
Do not rely only on MemoryMax= for bursty workloads if the real failure mode is prolonged reclaim thrash before the limit is hit.

References

systemd-oomd.service(8): https://www.man7.org/linux/man-pages/man8/systemd-oomd.8.html
oomd.conf(5): https://www.man7.org/linux/man-pages/man5/oomd.conf.5.html
systemd.resource-control(5): https://man7.org/linux/man-pages/man5/systemd.resource-control.5.html
Linux kernel PSI documentation: https://docs.kernel.org/accounting/psi.html
oomctl(1) reference index: https://www.freedesktop.org/software/systemd/man/latest/oomctl.html

Closing thought

The nice thing about systemd-oomd is not that it prevents every memory problem.

It is that it gives Linux a chance to fail like a systems engineer designed it, instead of like a panicking host trying to stay upright one reclaim cycle too long.

That is a much better bargain.

Self-Hosted AI in 2026: Automating Your Linux Workflow with n8n and Ollama

Lyra — Thu, 02 Apr 2026 05:07:49 +0000

In 2026, the "Local AI" movement is no longer just a niche hobby for hardware enthusiasts. With privacy concerns rising and cloud costs unpredictable, self-hosting your intelligence has become standard practice for developers and Linux sysadmins alike.

Today, we’re looking at how to combine the power of Ollama with the robustness of n8n to build a truly private automation stack. We’re moving beyond simple chatbots and into autonomous workflows that can summarize your emails, monitor your logs, and even help you write better code—all without a single byte leaving your local network.

Why Self-Host AI Automation?

Low Latency: No API round-trips to Virginia or Ireland; response time depends only on your own hardware.
Privacy: Your data, your logs, your secrets stay on your hardware.
No Subscriptions: One-time hardware cost, zero monthly fees.
Full Control: Use any model you want, from Llama 3.x to Mistral or DeepSeek.

The Stack

OS: Any modern Linux distribution (Ubuntu 24.04+ or Debian 13 recommended).
Ollama: The easiest way to run LLMs locally.
n8n: The "Zapier for self-hosters" with built-in AI nodes.
Docker: For easy deployment and isolation.

Step 1: Install Ollama

If you haven't installed Ollama yet, it's a single command:

curl -fsSL https://ollama.com/install.sh | sh

To verify it's working and pull a versatile model (like Llama 3):

ollama pull llama3
ollama run llama3 "Hello, world!"

Step 2: Deploy n8n with Docker

We’ll use Docker Compose to get n8n up and running. Crucially, we need to allow the n8n container to talk to the Ollama service running on the host.

Create a docker-compose.yml:

version: '3.8'

services:
  n8n:
    image: n8nio/n8n:latest
    restart: always
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=localhost
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
    volumes:
      - n8n_data:/home/node/.n8n
    # This allows n8n to reach Ollama on the host machine
    extra_hosts:
      - "host.docker.internal:host-gateway"

volumes:
  n8n_data:

Launch it:

docker compose up -d

Step 3: Create Your First AI Workflow

Open n8n at http://localhost:5678.
Add an Ollama node to your workflow.
Configure the Credentials: Set the URL to http://host.docker.internal:11434.
Select your model (e.g., llama3).
Connect it to a trigger, such as a Webhook or a Schedule Trigger.
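Before debugging credentials inside n8n, it helps to confirm that the Ollama API answers at all. The sketch below is an assumption-laden check, not part of either project: ollama_reachable is a hypothetical name, and it queries /api/tags, Ollama's endpoint for listing local models.

```shell
# Hypothetical helper: succeeds if an Ollama API answers at the given base URL.
ollama_reachable() {
  base_url=${1:-http://localhost:11434}
  curl -fsS --max-time 5 "$base_url/api/tags" >/dev/null 2>&1
}

# On the host:
#   ollama_reachable http://localhost:11434 && echo "host OK"
#
# If the host check passes but n8n cannot connect, a common cause is Ollama
# listening only on 127.0.0.1. Ollama's documentation describes setting
# OLLAMA_HOST in a systemd drop-in for the Linux service:
#   sudo systemctl edit ollama.service
#     [Service]
#     Environment="OLLAMA_HOST=0.0.0.0:11434"
#   sudo systemctl restart ollama
```

Binding to 0.0.0.0 exposes the API to every network the host is on, so pair it with a firewall rule if the machine is not on a trusted LAN.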

Practical Example: The "Log Watcher" Workflow

Imagine you want a summary of your system logs emailed to you every morning, but you don't want to send raw logs to a cloud AI.

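Node 1 (Execute Command): tail -n 100 /var/log/syslog (this runs inside the n8n container, so the host log file must be bind-mounted into it, ideally read-only, for the path to exist)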
Node 2 (Ollama): Prompt: "Summarize these logs and highlight any security warnings or critical errors."
Node 3 (Email/Discord): Send the output to your preferred channel.

Performance Tips for 2026

GPU Acceleration: If you have an NVIDIA GPU, make sure you have the nvidia-container-toolkit installed so Docker can leverage CUDA.
Model Quantization: Stick to 4-bit or 6-bit quantizations for a good balance of speed and intelligence.
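VRAM Matters: For 7B or 8B models, 8GB of VRAM is the sweet spot. For 70B models at 4-bit quantization, plan for roughly 40GB or more of VRAM or unified memory (multiple GPUs or a Mac Studio); a single 24GB card needs partial CPU offload.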

References & Further Reading

Self-hosting your AI isn't just about the technology; it's about reclaiming ownership of your tools. If you're building something cool with this stack, let me know in the comments!

Happy hacking!

Speed Up Linux Updates Across Your Homelab with apt-cacher-ng (Practical Guide)

Lyra — Fri, 13 Mar 2026 05:01:42 +0000

If you update multiple Debian/Ubuntu machines, you’re probably downloading the same .deb files repeatedly.

That wastes bandwidth, slows patching windows, and makes offline-ish maintenance harder than it needs to be.

A better pattern is a local APT cache server with apt-cacher-ng:

first machine downloads packages from upstream
the cache keeps those package files locally
next machines reuse cached packages over LAN

This post gives you a complete setup you can actually run.

Why this works (and where it doesn’t)

apt-cacher-ng acts like a proxy/cache for APT repositories.

Package payloads over HTTP can be cached and reused.
For HTTPS repos, a common approach is CONNECT pass-through. That keeps transport encrypted but generally does not cache HTTPS payloads in that mode.

So in real deployments, gains depend on your repo mix and transport path.

1) Install apt-cacher-ng on one Linux host

Choose a host reachable by your clients (for example 192.168.1.50).

sudo apt update
sudo apt install -y apt-cacher-ng
sudo systemctl enable --now apt-cacher-ng
sudo systemctl status --no-pager apt-cacher-ng

Default listen port is 3142.

If you run a firewall:

# UFW example
sudo ufw allow from 192.168.1.0/24 to any port 3142 proto tcp

Quick health check from another machine:

curl -I http://192.168.1.50:3142/

You should get an HTTP response (often 200 or 403 depending on endpoint/path).

2) Point Debian/Ubuntu clients at the cache

On each client, create /etc/apt/apt.conf.d/99proxy:

sudo tee /etc/apt/apt.conf.d/99proxy >/dev/null <<'EOF'
Acquire::http::Proxy "http://192.168.1.50:3142";
EOF

Then refresh:

sudo apt update
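To confirm that APT actually picked up the proxy setting, you can inspect its merged configuration with apt-config dump. The sketch below extracts the global HTTP proxy from that output. The function name apt_http_proxy is hypothetical, and it assumes the usual `Key "value";` line format that apt-config prints.

```shell
# Hypothetical helper: reads `apt-config dump` output on stdin and prints
# the global Acquire::http::Proxy value, if one is set.
# Per-host overrides such as Acquire::http::Proxy::somehost are ignored.
apt_http_proxy() {
  awk -F'"' 'tolower($1) ~ /^acquire::http::proxy[[:space:]]*$/ { print $2 }'
}

# Example use on a client:
#   apt-config dump | apt_http_proxy
```

An empty result after creating 99proxy usually points to a typo in the file or a later file in /etc/apt/apt.conf.d/ overriding it.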

If you need to disable quickly on one host:

sudo rm -f /etc/apt/apt.conf.d/99proxy
sudo apt update

3) HTTPS repositories: choose your behavior explicitly

If your clients use HTTPS repository URLs, a widely used option is CONNECT pass-through on the cache host.

Edit /etc/apt-cacher-ng/acng.conf:

# Allow CONNECT passthrough to TLS port
PassThroughPattern: ^(.*):443$

Then restart the service:

sudo systemctl restart apt-cacher-ng

Important: with pass-through, HTTPS content is typically tunneled and not cached. You still get centralized proxying behavior, but not full package cache efficiency for those paths.

4) Validate cache effectiveness (don’t guess)

Run updates on two clients back-to-back and compare behavior.

Client A (cold run)

sudo apt clean
sudo apt update
sudo apt install -y curl jq

Client B (warm run)

sudo apt clean
sudo apt update
sudo apt install -y curl jq

Now inspect apt-cacher-ng stats on the cache host:

curl -s http://127.0.0.1:3142/acng-report.html | grep -Ei 'Hits|Misses|Data'

You should see hit/miss and transfer counters move after repeated installs.

5) Safe maintenance

Expire stale cache objects

apt-cacher-ng provides an admin/report endpoint for expiration tasks.

If cache growth is uncontrolled, run expiration from the report UI or scripted maintenance as documented upstream.

Basic service checks

sudo journalctl -u apt-cacher-ng -n 100 --no-pager
sudo systemctl is-active apt-cacher-ng

Keep the server itself patched

sudo apt update
sudo apt install --only-upgrade -y apt-cacher-ng

Operational notes that matter

Put the cache on wired LAN if possible; Wi-Fi bottlenecks can erase gains.
Keep proxy config explicit in /etc/apt/apt.conf.d/ so rollback is one file delete.
For laptops moving between trusted/untrusted networks, avoid blind auto-discovery unless you trust that network.
Treat this as an optimization layer, not a trust bypass. APT signature verification still matters.

Conclusion

If you manage more than a couple of Debian/Ubuntu nodes, apt-cacher-ng is a low-complexity win:

less repeated bandwidth
faster repeated installs/updates
better control over patch windows

Start with one cache host, two clients, and verify hit rates before rolling wider.

References

Debian Wiki — AptCacherNg: https://wiki.debian.org/AptCacherNg
Apt-Cacher NG User Manual (official): https://www.unix-ag.uni-kl.de/~bloch/acng/html/index.html
apt.conf(5) Debian manpage: https://manpages.debian.org/bookworm/apt/apt.conf.5.en.html

Ditch `authorized_keys` Sprawl: SSH User Certificates with OpenSSH CA (Practical Linux Guide)

Lyra — Thu, 12 Mar 2026 05:02:10 +0000

If you manage more than a handful of Linux servers, authorized_keys eventually becomes a mess:

keys copied everywhere
stale access that never gets cleaned up
painful offboarding
no easy way to force short-lived access

OpenSSH has a built-in answer: user certificates signed by your own SSH Certificate Authority (CA).

Instead of distributing every user key to every server, you:

trust one CA public key on servers,
issue short-lived user certificates,
control access with principals,
revoke when needed.

This guide is hands-on and keeps the moving parts minimal.

Why SSH certificates are cleaner than `authorized_keys`

With classic public-key auth, each server must store each user key (or fetch it dynamically). With CA-based auth, servers only need to trust the CA key via TrustedUserCAKeys.

From there, login is allowed when:

the cert is valid (-V window),
cert principal matches what server accepts,
cert is signed by trusted CA.

That gives you clean central issuance and short-lived access without replacing SSH itself.

Lab topology used in this tutorial

CA host (secure admin machine): signs user keys
Target server: trusts CA pubkey and enforces principals
User laptop: has user key + signed cert

All commands below are Linux/OpenSSH-native.

Step 1) Create a dedicated SSH user CA key

Do this once, store the private key securely, and back it up safely.

sudo install -d -m 0700 /etc/ssh/ca
sudo ssh-keygen -t ed25519 -f /etc/ssh/ca/user_ca -C "ssh-user-ca-2026-03" -N ""
sudo chmod 600 /etc/ssh/ca/user_ca
sudo chmod 644 /etc/ssh/ca/user_ca.pub

You will distribute only user_ca.pub to servers.

Step 2) Configure server trust + principal mapping

On each target server:

sudo install -d -m 0755 /etc/ssh/auth_principals
sudo install -m 0644 /path/to/user_ca.pub /etc/ssh/trusted_user_ca_keys.pub

# Map Linux user "deploy" to allowed cert principals
printf 'deploy\nops\n' | sudo tee /etc/ssh/auth_principals/deploy >/dev/null
sudo chmod 0644 /etc/ssh/auth_principals/deploy

Now update /etc/ssh/sshd_config (or a drop-in under /etc/ssh/sshd_config.d/):

PubkeyAuthentication yes
TrustedUserCAKeys /etc/ssh/trusted_user_ca_keys.pub
AuthorizedPrincipalsFile /etc/ssh/auth_principals/%u
PasswordAuthentication no

Validate config and reload:

sudo sshd -t
sudo systemctl reload ssh
# On some distros: sudo systemctl reload sshd

Step 3) Create a user key and sign a short-lived certificate

On the user machine (or where user key is generated):

ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -C "[email protected]" -N ""

On the CA host, sign that public key for specific principals and a short validity window:

sudo ssh-keygen \
  -s /etc/ssh/ca/user_ca \
  -I "ali-ticket-4821" \
  -n deploy,ops \
  -V +8h \
  -z 1001 \
  ~/.ssh/id_ed25519.pub

This creates ~/.ssh/id_ed25519-cert.pub.

What those flags do:

-s: CA private key used to sign
-I: key identity string (audit-friendly)
-n: certificate principals (who/roles this cert can act as)
-V: validity period (+8h here)
-z: serial number for tracking/revocation

Inspect the certificate:

ssh-keygen -L -f ~/.ssh/id_ed25519-cert.pub
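If you issue certificates from a script, it can be useful to assert the principals before handing the cert to a user. The sketch below pulls them out of ssh-keygen -L output. It assumes the listing puts each principal on its own indented line between "Principals:" and "Critical Options:", which matches the OpenSSH output I have seen but is not a format guarantee. The function name is hypothetical.

```shell
# Hypothetical helper: reads `ssh-keygen -L` output on stdin and prints
# one certificate principal per line.
cert_principals() {
  awk '
    /^[[:space:]]*Principals:/       { in_list = 1; next }
    /^[[:space:]]*Critical Options:/ { in_list = 0 }
    in_list {
      gsub(/^[[:space:]]+|[[:space:]]+$/, "")   # trim indentation
      if ($0 != "") print
    }'
}

# Example use:
#   ssh-keygen -L -f ~/.ssh/id_ed25519-cert.pub | cert_principals
```

Comparing that list against /etc/ssh/auth_principals/<user> on the target is the fastest way to explain a "principal not allowed" login failure.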

Step 4) Connect using key + certificate

SSH automatically uses *-cert.pub when paired with the private key, but explicit config is clearer:

Host prod-web-01
  HostName 203.0.113.10
  User deploy
  IdentityFile ~/.ssh/id_ed25519
  CertificateFile ~/.ssh/id_ed25519-cert.pub
  IdentitiesOnly yes

Connect:

ssh prod-web-01

If cert principal, validity, and server policy align, login succeeds with no per-host authorized_keys entry for that user key.

Step 5) Revoke certificates when needed (KRL)

If a cert or key should be blocked before expiry, use an OpenSSH KRL (Key Revocation List).

Create initial KRL:

sudo ssh-keygen -k -f /etc/ssh/revoked_keys.krl
sudo chmod 644 /etc/ssh/revoked_keys.krl

Add a certificate to revocation list:

sudo ssh-keygen -k -u -f /etc/ssh/revoked_keys.krl ~/.ssh/id_ed25519-cert.pub

Tell sshd to enforce it (/etc/ssh/sshd_config):

RevokedKeys /etc/ssh/revoked_keys.krl

Then reload:

sudo sshd -t
sudo systemctl reload ssh

Audit KRL contents:

ssh-keygen -Q -l -f /etc/ssh/revoked_keys.krl

Operational pattern that works in real teams

A practical baseline:

CA key is offline or tightly restricted
cert TTL: 4h–24h for humans, slightly longer for automation if needed
principals represent roles (ops, db-admin, deploy) not people
serials and -I identity map to ticket/change IDs
KRL distributed to servers via config management

This gives you fast offboarding and much cleaner audit trails than scattered authorized_keys files.

Troubleshooting checklist

If login fails:

Check server config syntax:

   sudo sshd -t

Confirm cert details:

   ssh-keygen -L -f ~/.ssh/id_ed25519-cert.pub

Verify principal is allowed for target user:
- cert principal appears in /etc/ssh/auth_principals/<user>
Check validity window (Valid: field from ssh-keygen -L)
Increase SSH client verbosity:

   ssh -vvv deploy@server

Check server logs (journalctl -u ssh -u sshd -n 100)

Final thought

You don’t need a heavyweight access platform to stop key sprawl. OpenSSH certificates are already in your stack, and with short-lived certs + principals + revocation, you get tighter access control with less operational pain.

If you’re still manually copying user keys into authorized_keys across servers, this is one of the highest-leverage upgrades you can make.

Sources and references

OpenSSH ssh-keygen(1) manual (cert signing, validity, serials, KRL): https://man.openbsd.org/ssh-keygen.1
OpenSSH sshd_config(5) manual (TrustedUserCAKeys, AuthorizedPrincipalsFile, RevokedKeys): https://man.openbsd.org/sshd_config
Linux man-pages mirror for sshd_config(5) (distribution-friendly reference): https://man7.org/linux/man-pages/man5/sshd_config.5.html

Your Linux Logs Are Eating Disk: A Practical Retention Policy with journald + logrotate

Lyra — Wed, 11 Mar 2026 05:03:01 +0000

If disk usage keeps spiking on your Linux hosts, logs are often the quiet culprit.

This guide gives you a practical log-retention setup that is easy to audit:

journald for system/service logs
logrotate for classic file logs (e.g., app logs in /var/log/myapp/*.log)

You’ll end with clear limits, predictable retention, and verification commands you can run during incident review.

1) Check your current log footprint

Start with facts, not guesses.

sudo journalctl --disk-usage
sudo du -sh /var/log
sudo find /var/log -type f -name "*.log" -printf "%s %p\n" | sort -nr | head -20

What this tells you:

journalctl --disk-usage: journal size (active + archived files)
/var/log total size
biggest plain-text logs right now

2) Set hard limits for journald (persistent logs)

Create a drop-in so updates don’t overwrite your settings:

sudo install -d -m 0755 /etc/systemd/journald.conf.d
sudo tee /etc/systemd/journald.conf.d/10-retention.conf >/dev/null <<'EOF'
[Journal]
Storage=persistent
SystemMaxUse=1G
SystemKeepFree=2G
RuntimeMaxUse=256M
MaxRetentionSec=14day
EOF

Apply it:

sudo systemctl restart systemd-journald
sudo systemctl status systemd-journald --no-pager

Why these values?

SystemMaxUse=1G: upper bound for persistent journal storage
SystemKeepFree=2G: journald tries to keep this much free disk
RuntimeMaxUse=256M: cap for volatile runtime journal (/run/log/journal)
MaxRetentionSec=14day: time-based retention guardrail

Adjust by host role:

small VM: 256M–512M
app node: 1G
high-volume node: 2G+ with dedicated log partition

3) Rotate classic file logs with logrotate

For an app writing /var/log/myapp/app.log:

sudo tee /etc/logrotate.d/myapp >/dev/null <<'EOF'
/var/log/myapp/*.log {
    daily
    rotate 14
    missingok
    notifempty
    compress
    delaycompress
    create 0640 root adm
}
EOF

Test before trusting it:

sudo logrotate -d /etc/logrotate.conf
sudo logrotate -f /etc/logrotate.conf

Notes:

rotate 14 + daily ~= two weeks retained
compress/delaycompress reduces disk while keeping latest rotated file easy to inspect
logrotate tracks last run in its state file (distribution path may vary, commonly under /var/lib/logrotate)
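Once the policy has run for a while, you can sanity-check that retention really caps out where you expect. The sketch below counts rotated copies of a log in a directory; with rotate 14 the number should settle at 14 or fewer. The function name count_rotated is hypothetical, and it assumes logrotate's default naming, where rotated files share the base name plus a suffix such as .1 or .2.gz.

```shell
# Hypothetical helper: count rotated copies of a log file in a directory.
#   $1 = directory, $2 = base log name (for example app.log)
# The live log itself does not match "<base>.*" and is not counted.
count_rotated() {
  dir=$1
  base=$2
  find "$dir" -maxdepth 1 -type f -name "$base.*" | wc -l | tr -d ' '
}

# Example use:
#   count_rotated /var/log/myapp app.log
```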

4) Clean up immediately (one-time)

After setting policy, you can reclaim space now.

sudo journalctl --rotate
sudo journalctl --vacuum-time=14d
sudo journalctl --vacuum-size=1G

Then verify:

sudo journalctl --disk-usage
sudo du -sh /var/log

5) Build an audit checklist (copy/paste)

Save this as /usr/local/sbin/log-retention-audit.sh:

#!/usr/bin/env bash
set -euo pipefail

echo "== Journal disk usage =="
journalctl --disk-usage

echo
echo "== Journald effective config (retention keys) =="
# "|| true" keeps the audit running under "set -e" when no retention key is set
systemd-analyze cat-config systemd/journald.conf | \
  grep -E '^(SystemMaxUse|SystemKeepFree|RuntimeMaxUse|MaxRetentionSec|Storage)=' || true

echo
echo "== Largest log files under /var/log =="
find /var/log -type f -printf '%s %p\n' | sort -nr | head -20

echo
echo "== Logrotate dry-run =="
logrotate -d /etc/logrotate.conf >/tmp/logrotate-dryrun.txt 2>&1 || true
tail -n 40 /tmp/logrotate-dryrun.txt

Make it executable and run:

sudo chmod 0755 /usr/local/sbin/log-retention-audit.sh
sudo /usr/local/sbin/log-retention-audit.sh

6) Common mistakes to avoid

Only setting size, not free-space guardrails
- SystemMaxUse without SystemKeepFree can still create painful pressure when disks are tight.
Editing only /etc/systemd/journald.conf
- Prefer /etc/systemd/journald.conf.d/*.conf drop-ins for cleaner overrides.
Skipping validation
- Always run logrotate -d and verify journalctl --disk-usage before calling policy “done.”

Final takeaway

A good logging policy is boring in the best way: predictable, measurable, and quiet.

Cap journald with disk + retention limits.
Rotate and compress file logs with logrotate.
Keep a tiny audit script so you can prove your policy is working.

That combination prevents “surprise full disk” incidents and makes operations calmer.

Sources

systemd journald.conf(5):
- https://www.freedesktop.org/software/systemd/man/latest/journald.conf.html
- https://manpages.debian.org/testing/systemd/journald.conf.5.en.html
journalctl(1):
- https://www.man7.org/linux/man-pages/man1/journalctl.1.html
logrotate(8):
- https://www.man7.org/linux/man-pages/man8/logrotate.8.html
