The Mosburn Lab

SSO Across the Stack: Building a Unified Lab Identity Layer

2026-05-17T00:00:00+00:00

The previous post in this series covered deploying a self-hosted development stack — Redmine for issue tracking, a wiki, a Git forge, and a team chat platform. What it didn’t cover was authentication: each service had its own user database, its own login page, and its own password to manage.

That’s not a lab. That’s a collection of services that happen to run on the same machine.

The follow-up work was building a proper identity layer: Keycloak as the OIDC provider, every service authenticated through it, users managed in one place. This post covers what that process actually looked like — including the parts that didn’t work.

Why Keycloak

The short answer is that Keycloak is what you’d run in production. Auth0, Okta, and Entra ID are all excellent — and all subscription-based. For a lab environment where the point is to learn the tools before you need them professionally, running the enterprise-grade open-source equivalent makes more sense than paying for managed identity or wiring up something half-functional.

Keycloak handles:

OIDC and SAML (the two protocols you’ll encounter in enterprise environments)
Realm isolation — multiple tenants on one instance
Fine-grained client scopes and protocol mappers
MFA, brute force protection, and session management
A user federation layer for LDAP and Active Directory integration

The mosburn realm on the lab Keycloak instance is the single source of truth for lab identity. One user account. One password. Works across every service in the stack.

The stack

The lab runs as a Docker Compose stack with an nginx reverse proxy routing traffic by subdomain:

Service	URL	Role
Keycloak	keycloak.mosburn.lab	Identity provider
Redmine	redmine.mosburn.lab	Issue and project tracking
Wiki.js	wiki.mosburn.lab	Documentation and wiki
Forgejo	forgejo.mosburn.lab	Git forge

Each service runs with its own PostgreSQL database. The nginx config proxies *.mosburn.lab subdomains to the relevant containers, and each container has extra_hosts pointing keycloak.mosburn.lab to the host gateway — so backchannel token exchange calls route correctly through nginx rather than failing to resolve the hostname inside the Docker network.

That last part took longer than it should have. OIDC has two distinct communication patterns that get confused in a containerised setup: the browser-facing redirects (which need the public hostname) and the server-to-server token exchange (which needs a hostname the container can actually resolve). Getting both working simultaneously requires understanding which path each call takes.

What the “free tier” actually means for SSO

The first lesson was expensive in time: many self-hosted applications have quietly moved SSO behind a paywall.

Mattermost was the original choice for team chat. It’s widely deployed, has a good mobile app, and seemed like the obvious pick. The OpenID Connect option in System Console has a prominent upsell wall requiring Professional. The GitLab OAuth option — historically free — also now requires Professional. Mattermost as a free self-hosted platform no longer supports SSO in any meaningful sense.

Docmost was the original wiki choice. Clean interface, good editor, PostgreSQL backend. Version 0.80.2 ships with OIDC code visible in the dist bundle under ee/sso — enterprise edition. The OIDC_ENABLED environment variable that appears in older documentation does nothing in current releases.

Docmost was replaced with Wiki.js, which has free OIDC via a Generic OpenID Connect strategy. Mattermost was not replaced with another self-hosted chat platform — and the reasoning is worth spelling out, because it applies more broadly.

The pattern here is worth noting for anyone else building a self-hosted stack: always verify SSO support against the current version before building your deployment around a service. Documentation tends to lag behind licensing changes by months.

Alerts need to reach you

A self-hosted chat platform sounds good in theory. In practice, the primary value of a chat system in a lab context isn’t discussion — it’s automation output. Build failures, deployment events, monitoring alerts, cron job completions. The things that interrupt you, or should.

Self-hosted chat doesn’t deliver those notifications to your phone when you’re out riding. A pipeline that fails overnight doesn’t show up until you’re back at a desk. A monitoring alert at 2am goes nowhere. The Redmine plugin that posts to a webhook might as well be logging to /dev/null if the webhook endpoint isn’t running a client you have in your pocket.

Discord already handles this. Webhooks are first-class, the mobile app is reliable, and the notification model actually works. CI/CD integrations are a few lines of configuration. You get the same “build failed, retry?” message over lunch that you’d see in a professional Slack or Teams setup — without running infrastructure to achieve it.

The lab stack uses Discord for automation output via webhooks. The Redmine redmine_messenger plugin posts issue activity to a dedicated channel. Forgejo CI can post pipeline results the same way. No server to maintain, no database to back up, no container to debug when a MongoDB replica set election decides to happen at an inconvenient time.

Configuring OIDC across heterogeneous services

Each service has a different approach to OIDC configuration, which is instructive.

Redmine uses the devopskube/redmine_openid_connect plugin — a CAS-style patch rather than OmniAuth. Configuration is through the plugin admin UI, not configuration.yml. The discovery document is cached; clearing tmp/cache/ is required when changing realm configuration. There was also a Ruby bug in the plugin where dynamic_config_expiry was passed as a String to Rails.cache.fetch which expects a Numeric — patched in the Dockerfile with a sed one-liner after clone.

Forgejo has native OAuth2 support with an OpenID Connect provider type. Configuration is through the admin UI with the Keycloak discovery URL. The only complication was ensuring the discovery URL used the public hostname without the internal port — Keycloak’s KC_HOSTNAME needs to be set to the full URL (http://keycloak.mosburn.lab) rather than just the hostname, otherwise the issuer includes :8080 and backchannel calls from containers fail.

Wiki.js uses a Generic OpenID Connect strategy (not Generic OAuth2 — they’re different entries and using the wrong one wastes time). The callback URL is UUID-based, not derived from the strategy display name, so you have to check the Configuration Reference section to find the actual redirect URI to register in Keycloak. Setting the Site URL in the admin UI is required to get the correct protocol — Wiki.js will advertise HTTPS if it detects a forwarded-HTTPS header, regardless of what you set in environment variables.

Ansible integration

The Docker Compose stack is the lab testing environment. The production deployments — Redmine, Forgejo, and the others — are managed by the Ansible roles in the mosburn.* namespace and deployed to dedicated VMs via lab.yml.

The compose stack mirrors the production architecture closely enough that configuration validated here translates directly. Client IDs, redirect URIs, and token endpoint URLs carry over. The main difference is the hostname scheme: *.mosburn.lab with subdomains points at the local nginx in the lab, and at real hostnames with TLS in production.

The Keycloak mosburn realm and its client configurations are the portable artifact. When a new service is validated in the lab stack, the same Keycloak client configuration works in production — different URL, same auth flow.

What’s left

The other gap is Vaultwarden — and it’s staying a gap. The technical case is straightforward: 1password ships as a managed package in mosburn.common, Vaultwarden would replace it, and the SSO infrastructure built here would handle authentication with no extra work. The implementation would be an afternoon.

The question is whether it should be on the list at all.

There’s a maintenance cost attached to every self-hosted service. Updates, database backups, the occasional broken container, the plugin that stops working after a version bump. Most of the time that cost is invisible — until it isn’t, and you’re debugging a Redmine cache issue on a Sunday instead of being somewhere else.

I stopped running Gentoo on my daily driver for the same reason. It’s a spectacular operating system and I learned more maintaining it than I did from any formal source. But at some point the ratio of time spent maintaining the environment to time spent doing things in the environment tips the wrong way. A weekend afternoon that could go toward a long ride or an afternoon buried in medieval history is worth more than another self-hosted service I’ll spend three hours debugging when the next major release drops a breaking change.

1Password is paid software. It works. It gets security updates without my involvement. For credentials — which are the one category of data where a self-hosting failure has real consequences — the maintenance-free option is the right call.

The lab exists to learn things that transfer to professional contexts and to run services that genuinely benefit from local control. A password manager that syncs across devices and has a mobile app that works in a car park is not in that category. Neither is a chat platform.

Some things are worth paying for. Identity is infrastructure. Knowing which parts of that infrastructure to outsource is part of the discipline.

The Commute Calculator: What a Hybrid Offer Actually Costs

2026-05-11T00:00:00+00:00

My wife got a hybrid role offer. Good job, real company, worth taking seriously. And almost immediately the conversation turned into: okay, but how do we actually compare this to what she’s doing now?

She’s remote. The new role is three days a week in office. Everyone acts like that’s a minor detail. It’s not.

The instinct is to think about gas. $80 a month, whatever, you factor it in and move on. But that’s not the cost. That’s the part of the cost you can see, and it’s also the smallest part.

The number you’re not thinking about

Here’s the thing about fuel costs: they feel real because you’re physically pumping gas. But run the math on 25 miles each way, three days a week, and you’re at about $950 a year. Yeah. Less than a grand.

The IRS mileage rate — $0.67 a mile, which folds in depreciation, maintenance, insurance, all of it — gets you closer to $4,500. That’s a more honest number. Still not the one that matters.

Your time matters.

At $110k, you’re making about $53 an hour. A 45-minute commute each way, three days a week, is 202 hours a year you’re not getting back. That’s $10,700. Not in gas. Not in wear on the car. In hours.

More than double the vehicle cost. Most people walk into a salary negotiation thinking about $950. They should be walking in thinking about $15,200.

The gross-up problem

Even $15,200 isn’t the right number to use at the table, because salary is pre-tax and commute costs come out of the other side.

You need to gross it up. Figure out how much pre-tax salary you’d actually need to earn to cover what the commute takes from you post-tax.

At a combined marginal rate around 28% — federal plus Colorado — that $15,200 in real post-tax cost turns into $21,100 in required pre-tax salary. That’s what the commute actually costs you in offer terms.

So the hybrid offer at $110k is equivalent to a remote job paying $88,900.

If the remote option is sitting there at $100k, the $110k hybrid isn’t $10k better. It’s $11,100 worse. You’d need them at $121k before you’re genuinely ahead.

That’s not a rounding error you shrug off. That’s a whole different answer to the question “is this offer worth it.”

So I asked Claude to build it

Doing this math by hand every time you tweak a variable is stupid. Change the days per week, adjust the salary, swap in a different commute distance — you’re redoing everything from scratch.

I described the problem to Claude. The inputs, the two modes (IRS rate versus actual fuel cost and MPG), the time cost calculation, 2025 federal tax brackets for the gross-up, hybrid days dialed in separately from total work weeks. It built a full-stack FastAPI backend and React frontend in a day.

This is the self-hosted AI stack doing actual work on an actual problem. Not a proof of concept. The calculator handles commute distance, days per week, salary, parking costs. Shows you the pay period breakdown — exactly how much more per paycheck you’d need. Has an e-bike ROI panel that tells you break-even in months if you can bike part of the route. Metric and imperial toggles.

We used it. That’s the whole point.

What the numbers said

Here’s what the calculator spits out for the scenario above — 25 miles each way, 3 days per week, $110k, 45-minute commute:

	Annual cost
Vehicle cost (IRS rate)	$4,500
Time cost	$10,700
Total commute cost	$15,200
Gross-up to pre-tax (28%)	$21,100
Equivalent remote salary	$88,900

The offer letter says $110k. The number you’re negotiating from is $88,900.

Your numbers will be different. Shorter commute, two days a week, higher salary — the calculator takes all of it. The point isn’t the specific output. It’s that you run it before you walk in, not after.

The calculator doesn’t decide anything. It just makes sure you’re not negotiating against a number you made up in your head.

The repo is at github.com/mosburn/commute-calculator. Clone it, run make install && make run, put in your actual situation. Do it before you negotiate.

Home SOC: Security Research with TheHive and Cortex

2026-05-06T00:00:00+00:00

Security research is an unusual hobby. The tooling is powerful, the learning curve is steep, and the infrastructure requirements are substantial enough that most people doing it at home are either running underpowered setups or spinning up cloud instances they forget to turn off between sessions.

The Mosburn Lab takes a different approach: infrastructure-as-code deployment of professional security tooling, torn down and rebuilt when needed, with reproducible state managed entirely by Ansible.

What a home SOC actually means

A Security Operations Center, in enterprise terms, is a team with a toolchain for detecting, investigating, and responding to incidents. The core stack typically includes:

SIEM — aggregating and correlating logs across the environment
Case management — structured workflows for tracking investigations
Threat intelligence — enriching indicators with external data
Orchestration — automating the mechanical parts of analysis

Running a home SOC isn’t running an enterprise program. It’s having access to the same class of tooling for learning how these systems work before you need them professionally, practicing incident response against known-bad samples in a controlled environment, conducting vulnerability research with proper case tracking, and analyzing malware without touching anything near actual production data.

TheHive: case management that takes investigation seriously

TheHive is an open-source incident response platform built around cases — structured investigations that can hold observables, tasks, timeline entries, and links to related cases.

The workflow is familiar to anyone who’s done professional incident response:

Alert comes in (manually or via integration)
Case opens with relevant observables — IP addresses, domains, file hashes, email headers
Tasks assigned and tracked within the case
Observables sent to Cortex for automated enrichment
Timeline builds as the investigation progresses
Case closes with documented findings

In practice, I use this for malware analysis sessions. Each sample gets a case. Associated infrastructure — C2 servers, distribution domains, related hashes — tracked as observables. Come back to a sample three weeks later and the context is still there.

Cortex: the enrichment engine

Cortex is TheHive’s companion platform. It runs analyzers — integrations with threat intelligence services, OSINT tools, and analysis platforms — against observables submitted from TheHive.

Out of the box:

VirusTotal, MalwareBazaar, abuse.ch
Shodan, Censys
WHOIS, DNS, BGP lookups
URLScan, URLhaus
Hybrid Analysis, Any.run (API key required)
Local analysis tools (YARA, strings, capa)

From TheHive it’s one click: submit an observable to Cortex, pick the analyzers, get enriched results back in the case timeline. What used to be a sequence of manual lookups across a dozen browser tabs becomes a parallelized automated enrichment run.

The infrastructure reality

TheHive and Cortex have real infrastructure requirements. TheHive needs Elasticsearch or OpenSearch for storage. Cortex needs Docker for its analyzer workers. The combination runs best with 8GB dedicated to the stack.

The hive.yml playbook deploys both services using roles targeting a dedicated host. Current deployment is CentOS-based, reflecting the original role architecture. Native Fedora and Ubuntu support is on the roadmap.

The mosburn.elk role provides the Elasticsearch backend TheHive depends on. The mosburn.filebeat role ships logs from other lab hosts into the ELK stack, giving TheHive’s integrated search a view across the entire lab environment.

The controlled research environment

The most important thing about running security tooling at home is isolation. Analyzing malware or testing exploits on a machine that shares a network with family devices and personal data is not responsible research practice.

The Mosburn Lab handles this through network segmentation and VirtualBox-based isolation. Packer builds clean Fedora and Ubuntu images. Ansible provisions research environments from those images. Session ends, the VM is snapshotted or destroyed. The base image stays clean.

The mosburn.vbox role manages VirtualBox installation across supported platforms. Research happens inside VMs. The VMs are disposable. The methodology for creating them isn’t.

What makes this sustainable

Security research infrastructure is only useful if it’s there when you need it. The common failure mode for home security labs: setup is painful enough that you avoid rebuilding after a problem, and eventually the environment is too stale to trust.

The Ansible approach makes rebuild cost low. Full TheHive + Cortex + ELK stack deploys in a single playbook run. The time between “I need a clean research environment” and “I have one” is measured in minutes, not hours.

That’s the dividend from investing upfront in Ansible roles and Molecule tests. The lab doesn’t accumulate debt. When you need it, it works.

And when it doesn’t, you run the playbook again.

The Self-Hosted AI Stack: Privacy, Power, and Local Models

2026-05-04T00:00:00+00:00

If you’re doing serious work in 2026, you’re using AI tools. The question isn’t whether — it’s what you’re handing over when you do.

Cloud AI is capable and convenient. It also logs your requests, uses interactions to improve future models, and builds a picture of what you’re working on. For most tasks that tradeoff is fine. For security research, unreleased code, and infrastructure configs with real hostnames and IP ranges in them, it’s not.

The move isn’t to refuse all cloud AI. It’s to be deliberate about what leaves your network and what doesn’t.

The two-tier model

The Mosburn Lab runs AI at two levels:

Local inference via Ollama, running open-weight models on-device. No network required, no API keys, no logging anywhere but your own machine. Quality is lower than frontier models for complex reasoning — but for code completion, summarization, and exploratory work, it’s often good enough. And it’s always private.

Cloud access via CLI tools for frontier models: Claude, Gemini, Codex. Used when the task actually needs frontier-quality reasoning, with the understanding that requests are processed by the provider. The tradeoff is explicit and accepted, not invisible and assumed.

The mosburn.ai Ansible role manages both tiers. Installs CLI tools via npm across Fedora, Debian, Ubuntu, Arch, and Gentoo. Each tool is independently toggleable. Ollama handled separately — binary install, systemd service registration.

Ollama and local models

Ollama is the most approachable entry point to local LLM inference. It handles model download, quantization selection, and serving through a REST API that’s OpenAI-compatible — tools built against the OpenAI API can point at a local Ollama instance without modification. That compatibility matters more than it sounds.

Models I keep running locally:

Llama 3.1 8B — fast, reasonable quality for most tasks, fits in 8GB VRAM
Qwen2.5-Coder 7B — noticeably better than general models for code completion and explanation
Mistral 7B — solid for summarization and classification

The honest tradeoff: these aren’t GPT-4o or Claude Opus for complex multi-step reasoning. Reviewing a pull request or explaining unfamiliar code — genuinely useful. Designing a distributed system architecture from scratch — reach for a frontier model. The capability gap is real for certain tasks.

Claude CLI and the case for frontier access

Claude is my primary cloud tool. CLI integration means it’s available from any terminal — claude "explain this error" or claude "review this Ansible role" without switching to a browser.

What makes Claude specifically useful for infrastructure work is the combination of context length and instruction-following. Feed it an entire Ansible role and ask for a review of idempotence and error handling, and you get useful output. The same task on a local 7B model produces inconsistent results.

Requests go to Anthropic’s infrastructure. I don’t send security research artifacts, unreleased project code, or anything with internal hostnames through cloud tools. That’s not paranoia — that’s just being deliberate about it.

The mosburn.ai role

# defaults/main.yml
mosburn_ai_install_claude: true
mosburn_ai_install_gemini: true
mosburn_ai_install_codex: false
mosburn_ai_install_ollama: true

Flip the booleans for what you want. The role installs Node.js if it’s not present and uses npm for the CLI tools. Ollama gets binary installation plus systemd. Runs the same across all supported distributions, which matters because my workstations are Fedora and my test VMs are Ubuntu.

The data posture in practice

Rules I actually follow:

Security research artifacts — local only. Malware analysis, exploit research, threat intelligence stays on-device.
Personal project code — local for routine tasks, frontier models for architecture review, with awareness that the latter is logged.
Public or open-source work — cloud tools freely. No privacy concern with code going public anyway.
Infrastructure code — careful. Ansible roles with internal hostnames, IP ranges, and role-specific variable names reveal your environment topology.

The Ansible approach makes this easier to enforce. AI tooling consistently available across all hosts means I can make deliberate choices about which tool for which task without worrying about what’s installed where.

What’s next

The integration I want to build is RAG against the lab’s own documentation. Docmost generates docs. Forgejo stores code. A local vector store and embedding model would let me query both without sending anything off-device.

The technology exists — Ollama for embedding, ChromaDB or Qdrant for vector storage. The missing piece is the ingestion pipeline, which is its own interesting infrastructure problem.

The Self-Hosted Dev Stack: Forgejo, Redmine, and Docmost

2026-05-01T00:00:00+00:00

The default move for developer tooling in 2026 is SaaS. GitHub for code, Linear or Jira for tickets, Notion for docs. The integrated experience is real, reliability is generally fine, and the cost sneaks up on you once you’re paying for seat licenses, API tiers, and storage charges.

But cost isn’t why I moved this in-house. The data posture is.

Code repositories have business logic, security research, personal projects — things you’d rather not see crawled for a training dataset. Documentation has architecture decisions, threat models, operational notes. None of that should live exclusively in someone else’s cloud by default.

This isn’t about not trusting GitHub or Notion specifically. It’s about not having a clean answer when someone asks where that data actually lives and who can see it.

Forgejo: git without the platform baggage

Forgejo is a community fork of Gitea. Full git hosting — web interface, SSH cloning, pull requests, CI hooks, package registry — running on a single modest server.

I picked Forgejo over Gitea for governance reasons. Forgejo is explicitly community-run with a published roadmap. Self-hosted infrastructure should be one acquisition away from nothing, not one acquisition away from a pricing surprise.

In the Mosburn Lab, Forgejo runs on Docker with a PostgreSQL backend. The mosburn.forgejo role handles container deployment and lifecycle, database provisioning, app.ini configuration via Ansible template, and Keycloak OAuth2 provider registration via the Forgejo admin API.

That last piece is worth calling out. After the service starts, the role POSTs to /api/v1/admin/oauth2 to register Keycloak as an OIDC provider. The call is idempotent — a 422 response means the provider already exists, which the role treats as success. First run or fiftieth, the end state is the same.

Redmine: old, stable, zero dollars

Redmine has been around since 2006. Parts of the UI show it. The plugin ecosystem is extremely stable, the API is well-documented, and it’s been free for almost two decades without that changing.

I’ve used Jira, Linear, Plane, Basecamp. For a home lab and personal projects, the overhead of any of those — including self-hosted options — is more than I need. Issue tracker, time logging, wiki. That’s what I use. Redmine has all of it.

The mosburn.redmine role is the most complex in the stack. The core Docker image gets extended with a custom Dockerfile that installs the redmine_openid_connect plugin at build time:

FROM redmine:6
RUN apt-get update -qq \
    && apt-get install -y --no-install-recommends git \
    && git clone --depth 1 \
        https://github.com/CACI-IMG/redmine_openid_connect.git \
        /usr/src/redmine/plugins/redmine_openid_connect \
    && bundle install --without development test

OIDC configuration lives in a configuration.yml template Ansible renders with the Keycloak issuer URL, realm, and client credentials. Redmine picks it up at startup. Users get a “Sign in with Keycloak” button alongside the local auth form.

Docmost: Notion, but yours

Docmost is newer — collaborative docs platform, block-based editor, nested pages, real-time collaboration. Think Notion without the pricing page.

It’s the simplest of the three to deploy. The entire configuration is environment variables:

OIDC_ENABLED: "true"
OIDC_PROVIDER_NAME: "Mosburn"
OIDC_CLIENT_ID: "docmost"
OIDC_CLIENT_SECRET: ""
OIDC_ISSUER: "https:///realms/"

Ansible renders these into the Docker Compose environment block. No config files to manage, no plugin installation, no custom image.

The pattern underneath all three

PostgreSQL for persistent storage. Keycloak for authentication. Ansible for deployment, configuration, and lifecycle. systemd managing the Docker Compose layer on the host.

The native installation path — mosburn_*_use_docker=false — follows the same pattern with the packaging system substituting for Docker. Fedora and Ubuntu task files handle the platform differences; the Ansible interface stays the same.

Updating Forgejo means changing mosburn_forgejo_version in defaults/main.yml and rerunning the playbook. Rotating a database password means updating the value and rerunning. The playbook is the single source of truth for what’s running and how it’s configured.

Not just services that work. Services whose state is fully described in version-controlled code and can be reproduced exactly. That’s the goal.

Infrastructure as Code, Test-First: Ansible TDD for the Home Lab

2026-04-29T00:00:00+00:00

The first time I wrote an Ansible role with tests, I thought I was wasting time. Writing a failing test, then code to make it pass, for something as deterministic as “install this package.” The indirection felt pointless.

Third time I caught a regression before it hit a live host, I stopped complaining about it.

TDD for infrastructure is the same discipline as TDD for application code. Same reasons it matters, same objections people have, same payoff when you stick with it.

Why infrastructure needs tests more than application code

Application code fails loudly. Stack trace. Broken build. Fast feedback.

Infrastructure fails quietly. A misconfigured service starts fine but behaves wrong. A missing package doesn’t matter until something tries to use it three months later. A task that’s idempotent on Fedora silently breaks on Ubuntu. You find out when you’re running a playbook on a production host at midnight and something’s not where it should be.

Tests close that loop. Running Molecule against a role before pushing is what gives you the same confidence in your infrastructure code that a test suite gives you in application code.

The 95% threshold

The Mosburn Lab holds 95% coverage for both tests and documentation:

Every task in every role has a Molecule test verifying the intended end state
Every variable in defaults/main.yml has documentation in the role README
Every playbook has a comment header explaining what it does and how to run it

95% isn’t 100%. The gap is for tasks where testing the outcome is genuinely harder than testing the behavior — waiting on an external API after a service starts, verifying a database migration ran correctly. Those get integration tests rather than unit tests. They still get tests.

The workflow

For every new role or task:

Write a Molecule scenario asserting the desired end state
Run Molecule — confirm the test fails for the right reason
Write the task
Run Molecule — confirm it passes
Run again with --check to verify idempotence

Step 2 is not optional. A test that passes before you write the code isn’t testing anything. It’s wrong documentation waiting to burn you.

# Start a new role
ansible-galaxy role init roles/mosburn.newrole
cd roles/mosburn.newrole
molecule init scenario

# Write the test first
vim molecule/default/verify.yml

# Confirm it fails
molecule test

# Write the role tasks
vim tasks/main.yml

# Confirm it passes
molecule test

# Confirm idempotence
molecule converge
molecule idempotence

What a Molecule scenario looks like

For mosburn.keycloak, the verify step checks that the Docker service is running, the Keycloak container responds on the configured port, and the systemd unit is enabled:

---
- name: Verify Keycloak deployment
  hosts: all
  tasks:
    - name: Assert keycloak systemd unit is enabled and active
      ansible.builtin.systemd:
        name: keycloak
      register: keycloak_service
      failed_when: >
        keycloak_service.status.ActiveState != 'active' or
        keycloak_service.status.UnitFileState != 'enabled'

    - name: Assert Keycloak HTTP endpoint responds
      ansible.builtin.uri:
        url: "http://localhost:8080/realms/master"
        status_code: 200
      retries: 10
      delay: 10

    - name: Assert docker-compose.yml is present
      ansible.builtin.stat:
        path: /opt/keycloak/docker-compose.yml
      register: compose_file
      failed_when: not compose_file.stat.exists

The native installation path gets its own Molecule scenario — different platform image, verify step checking the Keycloak binary and PostgreSQL instead of Docker.

Multi-distro testing

Every role supporting Fedora and Ubuntu gets a matrix that tests both:

# molecule/default/molecule.yml
platforms:
  - name: fedora
    image: docker.io/fedora:latest
    pre_build_image: true
  - name: ubuntu
    image: docker.io/ubuntu:24.04
    pre_build_image: true

This is where the per-distribution task file pattern earns its keep. tasks/Fedora.yml and tasks/Ubuntu.yml test independently in the matrix. A breakage on one platform doesn’t hide behind a passing test on the other.

Documentation coverage

The 95% doc threshold applies to role variables. Every variable in defaults/main.yml needs a README entry covering what it controls, the default value, and any gotchas.

mosburn.keycloak has 12 variables. All 12 documented. When I need to pass mosburn_keycloak_native_version to a playbook run six months from now, I won’t be grepping the role source to figure out what it does.

The discipline pays off

These roles deploy across Fedora, Ubuntu, and Gentoo. Some have been running for months. When I add a new distribution target or change a package name, Molecule tells me before a live host does.

That’s the whole point. Tests aren’t fun to write. But a lab that fails silently isn’t something you can trust, and infrastructure you can’t trust is a liability you happen to own.

Write the test first. Always.

Identity is Infrastructure: Why Keycloak Comes First

2026-04-22T00:00:00+00:00

When I mapped out the Mosburn Lab stack, the list looked like this: Forgejo for code, Redmine for project tracking, Docmost for documentation, TheHive for security research, ELK for log aggregation. Every single one has a login page. Every login page means credentials. And credentials at scale mean password managers, shared secrets, access drift, and eventually the realization that someone’s dev account still has admin rights to something they stopped using a year ago.

Better password hygiene doesn’t fix this. Centralizing authentication before you deploy anything else does.

Why Keycloak

Keycloak is Red Hat’s open source identity and access management platform. OAuth2, OpenID Connect, SAML 2.0. Every major application framework can talk to it. Runs fine on modest hardware. Has an admin interface that doesn’t require you to write LDAP queries.

The alternatives I actually considered:

Authentik is excellent and has a better UI, honestly. I went with Keycloak because the enterprise support story is stronger and the third-party integration docs are more thorough. Either works.

Authelia is lighter and simpler, but it’s primarily a forward-auth proxy rather than a full identity provider. Fine for protecting static services. Not enough for apps that need to issue tokens to their own APIs.

Kanidm is interesting — Rust implementation, strong consistency guarantees — but ecosystem support is still catching up.

Keycloak wins on coverage. If a self-hostable application supports OIDC, Keycloak has a documented path for it.

The architecture

In the Mosburn Lab, Keycloak sits at the center of the auth graph:

                        ┌──────────────────┐
                        │    Keycloak      │
                        │  (mosburn realm) │
                        └────────┬─────────┘
              ┌──────────────────┼──────────────────┐
              │                  │                  │
         ┌────▼───┐        ┌─────▼────┐      ┌─────▼────┐
         │Forgejo │        │ Redmine  │      │ Docmost  │
         └────────┘        └──────────┘      └──────────┘

Each application registers as an OIDC client in the mosburn realm: unique client ID and secret, scoped redirect URIs, claims mapped to what the application actually needs. Users log in once. Keycloak issues a session. Everything else inherits it.

What this changes operationally

Onboarding: One account in Keycloak. Access to every service that role covers, no per-application user creation required.

Offboarding: Disable the account. Access revoked everywhere immediately. No checklist of applications to hunt down.

Audit trail: Authentication events centralized. You can see when anyone last logged into any service from a single view.

Access policy: Role-based access defined once. Apps trust the claims in the Keycloak token instead of maintaining their own permission models.

The Ansible approach

The mosburn.keycloak role deploys via Docker Compose with a PostgreSQL backend and a systemd unit managing the container lifecycle. Structured to support native installation on Fedora and Ubuntu too — useful when you don’t want Docker overhead on a given host.

Configuration in defaults/main.yml. Realm name, admin credentials, database passwords, hostname — all variables. Nothing hardcoded. Deploy to any inventory host with the right credentials via -e or Ansible Vault.

Deployment order is not optional

Keycloak has to be running before you configure the services authenticating against it. This is why lab.yml deploys it first:

- name: Deploy Keycloak
  hosts: keycloak
  become: true
  roles:
    - mosburn.docker_host
    - mosburn.keycloak

- name: Deploy Forgejo
  hosts: forgejo
  ...

The Forgejo role registers Keycloak as an OAuth2 provider via the Forgejo admin API at the end of its run. Keycloak not up? Registration fails. Order is enforced by the playbook, not by hope.

One thing I’d do differently

Realm and client configuration is currently manual after the initial deploy. The keycloak_* modules in community.general can automate client creation, role mapping, and identity provider config — but they need the Keycloak admin REST API available during the Ansible run. Next iteration of the role handles that idempotently, the same way the Forgejo role handles its OAuth2 provider registration. For now, the changeme client secrets in defaults/main.yml make it obvious that post-deploy config is still required.

Identity first. Everything else depends on it.

From Chaos to Control: The Case for a Business-Grade Home Lab

2026-04-15T00:00:00+00:00

There’s a particular kind of dread that comes with SSH-ing into a machine you set up two years ago. You don’t remember what’s running. You’re not sure which config file is authoritative. The service you need is up, but you couldn’t tell anyone why, and you’re afraid to touch anything in case you break it.

That’s not infrastructure. That’s debt with a power cord.

My personal stuff ran exactly like that for years. Every new project meant another server config done by hand, another package installed with a vague plan to document it later, another thing that would stop working on the next rebuild. Automations I’d forgotten were still running. Critical services with no idea how they got configured.

The breaking point was a threat intelligence setup I needed for some security research. I had the tools, I knew how to use them, and standing up a clean environment took longer than the actual research. Next time I needed it I’d be starting from scratch again. That’s not a workflow. That’s a punishment.

The business framing

Something shifts when you start treating home infrastructure the way a small business would treat production systems. Not in budget or complexity — in discipline.

A three-person dev shop doesn’t wing it the way I was winging mine. They have version-controlled config so they know what changed when something breaks. They have reproducible environments so a new person isn’t starting from tribal knowledge. They have centralized access control instead of a spreadsheet of passwords. They know when something stops working before a user does.

None of that requires money. It requires taking it seriously.

The decision to go all-in on Ansible

I looked at the options. NixOS is genuinely elegant but it’s a full mental model shift and I wasn’t ready to commit. Kubernetes is the right answer to a different question. Chef and Puppet are more operational overhead than I wanted for a one-person shop.

Ansible fit because it maps to how I already think. Tasks run on hosts, in order, with predictable outcomes. I could write useful automation in an afternoon. The ceiling is high enough that I still haven’t hit it.

The harder call was committing to test-driven development for the roles. Writing a failing test before writing the task felt like friction. It’s not. Every single time I’ve skipped that discipline I’ve paid for it in debugging time. Without exception.

What “business-grade” actually means here

Not PCI compliance in a spare bedroom. It means:

Every host is built from Ansible. Not “mostly Ansible.” If it’s not in a role, it doesn’t run on my network.
Environments are disposable. Any host can be rebuilt from scratch in one playbook run. Packer images give me a clean starting point every time.
Identity is centralized. Keycloak handles auth for everything. One set of credentials, OIDC across every service.
The lab documents itself. This blog exists because decisions made and then forgotten are the same as decisions never made.

What’s coming

The rest of this series covers the specific choices: why Keycloak has to come before everything else, how TDD changes the way I write roles, what a self-hosted AI development environment actually looks like in practice, and how to run a home security research capability without a SOC budget.

The archaeology phase is over.