Forem: Chikara Inohara

Deep Dive: How Proxmox Actually Keeps Your Cluster in Sync (Corosync & pmxcfs Internals)

Chikara Inohara — Sat, 07 Mar 2026 21:18:49 +0000

⚠️ Fair warning: I'm still learning this stuff, so some details might not be 100% perfect. Take it as a fellow homelab explorer's notes, not official docs!

In my last post, we talked about the outside view of a Proxmox cluster: quorum, split-brain, and how Corosync's strict timeouts decide when a node is declared dead. We looked at token-passing and fencing from a bird's-eye view.

This time, let's crack open the hood and look inside.

Proxmox VE's cluster features are incredibly powerful, but for a lot of us, it feels like a black box. How does it actually stay in sync? What happens byte-by-byte when you change a VM config?

I went down a research rabbit hole diving into the source code of Corosync and pmxcfs, and here's what I found.

🏗️ Architecture Overview: Two Key Components

Everything in a Proxmox cluster boils down to two layers working in tight coordination:

📝 Note: This diagram is reused from my original Japanese article on Qiita — too lazy to redraw it in English, sorry! The key things to spot: pmxcfs lives in RAM on each node, syncs between nodes via Corosync, and the SQLite DB is persisted to disk (shown as USB here — more on why that matters later!).

Corosync (Totem Protocol)

Job: Cluster membership management + message ordering guarantee

It provides something called Virtual Synchrony — every node receives messages in the exact same order. This is achieved through the token-passing mechanism we covered last time.

pmxcfs (Proxmox Cluster File System)

Job: Manages all the config files you see under /etc/pve

Here's the fun part: it's actually a SQLite-backed database. Each node keeps a full copy in memory, and the database file itself is persisted to that node's local disk. It just looks like a regular filesystem thanks to FUSE mounting. Wild, right?

🔄 Corosync / Totem Protocol: The Details

The heart of Corosync is the Totem Single-Ring Protocol. Regardless of your physical network topology, it creates a logical ring of nodes and circulates a special packet called a token around that ring to control who can send messages.

Token Passing

Only the node currently holding the token is allowed to broadcast (multicast) a message. This elegantly prevents write conflicts — no two nodes can write simultaneously. Everything is serialized.

Node A → [Token] → Node B → [Token] → Node C → [Token] → back to Node A

📝 Another one from the Japanese version! Left: node 4 holds the token and multicasts Message1 to all nodes. Right: the token has passed to node 1, which now multicasts Message2. Only the token holder gets to send — everyone else just listens.
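To make the "only the token holder sends" rule concrete, here is a tiny toy model in Python. It is not how Corosync is implemented: the `Node` class and `circulate` function are names I made up for illustration, and the network is just a loop. The point it shows is that when sending is serialized by the token, every node ends up with the same delivery order.

```python
from collections import deque


class Node:
    """Hypothetical ring member: a send queue plus a delivery log."""

    def __init__(self, name):
        self.name = name
        self.outbox = deque()   # messages waiting for the token
        self.delivered = []     # messages in the order this node saw them


def circulate(ring, rotations):
    """Pass the token around the ring; only the holder may multicast."""
    for _ in range(rotations):
        for holder in ring:              # the token visits nodes in ring order
            while holder.outbox:
                msg = holder.outbox.popleft()
                for node in ring:        # multicast: every node, sender included
                    node.delivered.append(msg)


ring = [Node("A"), Node("B"), Node("C")]
ring[2].outbox.append("C: edit 100.conf")
ring[0].outbox.append("A: start VM 101")
ring[2].outbox.append("C: edit 101.conf")
circulate(ring, rotations=1)

orders = [node.delivered for node in ring]
print(orders[0])
# ['A: start VM 101', 'C: edit 100.conf', 'C: edit 101.conf']
```

Even though node C queued its first message before node A did, A's message is delivered first because the token reached A first. What matters is that all three nodes agree on that order.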

ARU (All Received Up to)

The token carries a sequence number called aru — short for "All Received Up to." Think of it as a receipt: "Everyone in the ring has confirmed they got messages up to this point."

When the token completes a full loop and comes back with an updated ARU, the original sender knows with certainty: everyone got it.
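Here is a simplified sketch of the ARU idea in Python. I'm assuming a model where each node lowers the token's aru to its own "received everything up to" value as the token passes by. The real logic in totemsrp.c tracks more state than this, and the function names below are hypothetical.

```python
def contiguous_aru(received):
    """Highest n such that messages 1..n have all been received."""
    aru = 0
    while aru + 1 in received:
        aru += 1
    return aru


def rotate_token(token_seq, nodes_received):
    """One full rotation: each node lowers the token's aru to its own value."""
    token_aru = token_seq              # optimistic start: everything sent so far
    for received in nodes_received.values():
        token_aru = min(token_aru, contiguous_aru(received))
    return token_aru


nodes_received = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 4},        # B missed message 3
    "C": {1, 2, 3, 4},
}
aru = rotate_token(token_seq=4, nodes_received=nodes_received)
print(aru)  # 2
```

Node B has message 4 but not message 3, so the ring as a whole can only vouch for messages up to 2. Once B receives the retransmitted message 3, the next rotation reports 4.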

What Actually Happens When a Token Arrives (`totemsrp.c`)

Based on the Corosync source code (exec/totemsrp.c), here's the processing order:

1. Receive token from the previous node
2. Retransmit check: did I miss any messages? If so, request retransmission
3. Multicast send: flush any pending messages (like pmxcfs config changes) out to the network
4. Update & pass: increment the sequence number, hand the token to the next node
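The four steps above can be sketched as a single handler. This is my own simplified Python model, not a port of totemsrp.c: `Token`, `NodeState`, and `handle_token` are hypothetical names, and I've left out things the real code also does, such as answering other nodes' retransmit requests.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    seq: int = 0                                    # highest sequence number assigned so far
    retransmit: set = field(default_factory=set)    # sequence numbers someone is missing


@dataclass
class NodeState:
    received: set = field(default_factory=set)      # sequence numbers seen on the wire
    pending: list = field(default_factory=list)     # payloads queued by local apps


def handle_token(token, node, multicast):
    # 1. Receive: the token is now ours.
    # 2. Retransmit check: ask for anything up to token.seq that we never saw.
    missing = set(range(1, token.seq + 1)) - node.received
    token.retransmit |= missing
    # 3. Multicast send: flush queued payloads, stamping each with the next seq.
    for payload in node.pending:
        token.seq += 1
        node.received.add(token.seq)
        multicast(token.seq, payload)
    node.pending.clear()
    # 4. Update & pass: the caller forwards the updated token to the next node.
    return token


sent = []
node = NodeState(received={1, 3}, pending=["edit 100.conf"])
token = handle_token(Token(seq=3), node, lambda seq, p: sent.append((seq, p)))
print(token.seq, sorted(token.retransmit), sent)
# 4 [2] [(4, 'edit 100.conf')]
```

The node never saw message 2, so it adds 2 to the retransmit list, then sends its own queued change as message 4 before handing the token on.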

💾 How pmxcfs Syncs Data: The Journey of a Write

Okay, here's where it gets really interesting. What actually happens when you edit a VM config in the Proxmox UI?

Step 1: Write Request from Application

A process like pvedaemon writes to /etc/pve/qemu-server/100.conf. This gets intercepted by FUSE and handed off to the pmxcfs process.

Step 2: CPG Broadcast via Corosync

pmxcfs bundles the change as a transaction and sends it through Corosync's CPG (Closed Process Group) API — essentially asking Corosync to deliver this to every node in the cluster.

The data sits in Corosync's send buffer, waiting for the token to come around.

Step 3: Receive and Immediately Persist ← This is the critical part

When each node's pmxcfs receives the transaction (including the original sender!), it does two things:

1. Update in-memory SQLite DB

The change is applied to the node's in-memory database instantly.

2. fsync() to disk

This is the big one. pmxcfs immediately calls fsync() on the backing SQLite file to flush it to the physical disk.

fsync() blocks until the OS confirms the data has been physically written to storage. No faking it.
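If you haven't used fsync() directly before, here is a minimal Python sketch of the "update memory, then block on disk" pattern. It is only an illustration: I'm using a plain dict and a JSON file as stand-ins for the in-memory database and the SQLite file, and `apply_and_persist` is a name I invented.

```python
import json
import os
import tempfile
import time


def apply_and_persist(state, key, value, path):
    """Update the in-memory copy, then block until the change is on disk."""
    state[key] = value                       # step 1: in-memory update
    start = time.perf_counter()
    with open(path, "w") as f:
        json.dump(state, f)
        f.flush()                            # push Python's buffer to the OS
        os.fsync(f.fileno())                 # step 2: wait for the storage device
    return time.perf_counter() - start       # seconds spent on the durable write


state = {}
path = os.path.join(tempfile.mkdtemp(), "config.json")
elapsed = apply_and_persist(state, "qemu-server/100.conf", "memory: 2048", path)
print(f"persisted in {elapsed * 1000:.2f} ms")
```

The in-memory update is effectively free. The time this function reports is almost entirely the wait inside os.fsync(), and that wait is what differs so much between a USB stick and an SSD.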

Step 4: Transaction Committed

Once every node's fsync() completes and the token comes back with an updated ARU, that transaction is officially committed cluster-wide. Consistency guaranteed. ✅

⚡ Why Your System Disk I/O Matters More Than You Think

Now the architectural picture should make the consequences clear: Proxmox's config sync waits for every node to finish writing to disk.

The Domino Effect of Slow I/O

Slow fsync on Node C
  → pmxcfs on Node C is blocked
    → Corosync process stalls
      → Token circulation delayed
        → Timeout triggered
          → Node declared dead 💀

All because of a slow disk write. That's how tightly coupled these components are.

⚠️ Homelab Warning: Watch Your System Disk!

This is particularly nasty for common homelab setups:

Cheap USB sticks as boot media
Old spinning HDDs for the OS
Network-attached storage running the system

VM/container storage being slow? Usually fine. But Proxmox's own system disk being slow? That can destabilize your entire cluster.

💡 You can measure this yourself! Proxmox ships with a built-in benchmark tool called pveperf. Run it and check the FSYNCS/SECOND line. In my own testing, a USB stick scored 30–50 fsync/s, while an SSD hit 3,000+. That's a 60x to 100x difference!
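If you're curious what a number like that actually measures, here is a rough Python sketch that counts write-plus-fsync cycles per second. It is not a replacement for pveperf and its results won't match pveperf's exactly; `fsyncs_per_second` is a hypothetical helper of mine.

```python
import os
import tempfile
import time


def fsyncs_per_second(directory, duration=1.0):
    """Count how many write+fsync cycles complete in `duration` seconds."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        count = 0
        deadline = time.perf_counter() + duration
        while time.perf_counter() < deadline:
            os.write(fd, b"x")
            os.fsync(fd)                 # block until the device confirms the write
            count += 1
        return count / duration
    finally:
        os.close(fd)
        os.remove(path)


rate = fsyncs_per_second(tempfile.gettempdir(), duration=0.2)
print(f"{rate:.0f} fsync/s")
```

One caveat: the temp directory may live on tmpfs (RAM), where fsync is nearly free. To learn anything about your system disk, pass a directory that really sits on that disk.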

What to Actually Use

Storage Type	Homelab OK?	Notes
NVMe / SATA SSD	✅ Great	Ideal for system disk
Enterprise SSD (with PLP)	✅ Best	Power-loss protection = extra safety
2.5" HDD	⚠️ Okay-ish	Watch for latency spikes
USB stick	❌ Avoid	Way too slow for fsync
SD card	❌ Avoid	Same problem, often worse

📋 Summary

Here's the full picture of what we covered:

Component	Role
Corosync / Totem	Token-passing ring, message ordering, membership
ARU	Confirms all nodes received each message
pmxcfs	In-memory SQLite DB, FUSE-mounted as `/etc/pve`
`fsync()`	Blocks until data hits physical disk on every node
System disk I/O	Directly impacts cluster stability

The Practical Takeaway

When choosing hardware for a Proxmox cluster, most people think: CPU → RAM → Network → Storage. But for cluster stability, you should actually be thinking about fsync latency early in your planning.

Even in a homelab, using a fast SSD for the system disk (not just VM storage) will make your cluster dramatically more stable.

Pair this knowledge with the timeout tuning from the last post, and you'll have a much more resilient setup!

📚 References

The Totem Single-Ring Ordering and Membership Protocol (paper)

Corosync Source Code (exec/totemsrp.c, lib/cpg.c)

Proxmox VE Docs — Cluster Network

If you found this useful, drop a ❤️! And if you spot anything I got wrong, please call it out in the comments — I'm still learning and corrections are very welcome 🙏

🎯 The Heart of a Proxmox Cluster: Understanding Corosync for a Stable Homelab

Chikara Inohara — Tue, 16 Sep 2025 13:31:33 +0000

📝 Introduction

Setting up a Proxmox cluster feels like unlocking a new superpower, doesn't it? You get to:

Manage multiple servers from a single interface
Live-migrate VMs like you're in The Matrix
Feel like a proper sysadmin (even if you're just wearing pajamas)

Please don't judge my messy cluster... pve1 decided to take a vacation and these VMs are just test dummies!

But here's the thing - I never really stopped to think about what's actually happening under the hood to make all this magic work. It just... worked, you know?

What changed my mind?
Recently at work, I had to do some research on cluster technologies, and I fell down the rabbit hole of learning about Corosync - the critical component that keeps Proxmox clusters from falling apart. It was one of those "aha!" moments where everything suddenly clicked!

So today, let's dive into what I learned about Corosync, why it matters, and answer the big question for us homelabbers: "Should I actually care about this stuff?"

🤝 What Exactly is Corosync?

Think of Corosync as the nervous system of your cluster.

It's the open-source software that lets all your Proxmox servers gossip with each other, constantly checking if everyone's still alive and sharing important updates. Without it, your cluster would be like a group chat where nobody knows if anyone else is online.

Corosync's Main Jobs:

📋 Membership Management
- Keeps track of who's in the club
- Knows exactly which nodes are active right now
💬 Messaging
- Makes sure commands reach all nodes
- "Hey everyone, we're starting VM 101 on node 3!"
⚖️ Quorum Management
- The "majority rules" system
- This is the big one! (More on this in a sec)

⚖️ Understanding "Quorum" - The Cluster's Democracy

Deep dive into Proxmox Quorum docs

If you remember just one thing from this post, make it Quorum. It's basically democracy for servers - decisions only happen when the majority agrees.

🧠 The Dreaded "Split-Brain" Problem

Let me paint you a picture of what could go wrong without quorum:

Imagine you have a 4-node cluster, and suddenly your network has a bad day. The cluster splits into two groups of two nodes each.

Without quorum rules, both groups would think:

"The other guys must have crashed!"
"We're the real cluster now!"
"Let's start all those VMs that were on the other nodes!"

Result? Both sides try to run the same VMs, write to the same storage, and basically create digital chaos. This nightmare scenario is called a split-brain, and yes, it's as scary as it sounds! 😱

How Quorum Saves the Day

The solution is elegantly simple:

The Majority Rules
Only the group with MORE than half the total votes can keep operating.

Got 3 nodes and 2 are talking? ✅ You have quorum (2 > 1.5)
Got 4 nodes and only 2 are talking? ❌ No quorum (2 = 2, not greater)
Got 5 nodes and 3 are talking? ✅ You have quorum (3 > 2.5)
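The rule is small enough to write as a one-line check. This Python sketch assumes every node has exactly one vote, and `has_quorum` is just an illustrative name:

```python
def has_quorum(votes_present, total_votes):
    """True only when strictly more than half of all votes are present."""
    return 2 * votes_present > total_votes


for present, total in [(2, 3), (2, 4), (3, 5)]:
    print(f"{present}/{total}: {has_quorum(present, total)}")
# 2/3: True
# 2/4: False
# 3/5: True
```

Comparing `2 * votes_present` against `total_votes` avoids fractions and makes the strict "more than half" explicit: an exact 50/50 split never wins.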

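Any group without a majority goes into "safe mode": it stops accepting cluster changes, and /etc/pve becomes read-only. It might seem harsh, but it's way better than data corruption! Strictly speaking, fencing is the next step up, where a node is forcibly isolated or reset so it cannot touch shared resources.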

When you see this scary red X in Proxmox:

Your node is basically saying: "I'm in the minority, so I'm sitting this one out to avoid causing problems!"

💥 When Things Get Aggressive

Nodes take "safety first" to the extreme. If a node running HA services loses quorum for too long (on the order of a minute), its watchdog can literally reboot it as a precaution!

I learned this the hard way when a brief network hiccup caused one of my nodes to panic and restart. Not fun when you have important VMs running!

You can watch the drama unfold in real time in your system logs, for example with journalctl -u corosync -f.

🔢 Why Odd Numbers are Your Friend

Here's why everyone recommends an odd number of nodes:

$$\text{Quorum} = \left\lfloor \frac{\text{Total Nodes}}{2} \right\rfloor + 1$$

Let me break it down with real examples:

Nodes	Can Survive	Why?
3 nodes	1 failure	2 remaining > 1.5 ✅
4 nodes	1 failure	3 remaining > 2 ✅ (but a 2v2 split leaves 2 = 2 ❌)
5 nodes	2 failures	3 remaining > 2.5 ✅
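You can extend the table to any cluster size with a few lines of Python. This sketch applies the quorum formula as written, assuming one vote per node. The function names are mine.

```python
def quorum(total_nodes):
    """Votes needed for a majority: floor(n / 2) + 1."""
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes):
    """How many nodes can fail while the rest still hold quorum."""
    return total_nodes - quorum(total_nodes)


for n in range(2, 8):
    print(f"{n} nodes: quorum {quorum(n)}, survives {tolerated_failures(n)} failure(s)")
# 2 nodes: quorum 2, survives 0 failure(s)
# 3 nodes: quorum 2, survives 1 failure(s)
# 4 nodes: quorum 3, survives 1 failure(s)
# 5 nodes: quorum 3, survives 2 failure(s)
# 6 nodes: quorum 4, survives 2 failure(s)
# 7 nodes: quorum 4, survives 3 failure(s)
```

Notice that going from 3 to 4 nodes (or 5 to 6) buys no extra failure tolerance. The extra node just raises the quorum threshold.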

The takeaway?
Even numbers = potential 50/50 splits = bad times

Stick with 3, 5, or 7 nodes for a happier cluster life!

🏢 The "Enterprise-Grade" Setup (aka Overkill for Most of Us)

If you're running mission-critical stuff, here's what the pros recommend:

Redundant dedicated networks for Corosync
Separate physical switches just for cluster traffic
Multiple NICs on each node
Basically, treat Corosync traffic like it's made of gold

For a homelab? Yeah... probably not happening. But it's good to know what "best practice" looks like!

🏡 The Realistic Homelab Approach

Here's what I'm actually running (and it works fine!):

Everything goes through a single NIC per node - management, VM traffic, Corosync, the works. Is it perfect? Nope. Does it work? Absolutely!

⚠️ Watch Out For These Gotchas:

Network Saturation
- Don't try to migrate VMs while uploading ISOs while backing up while... you get it
- I've definitely made my cluster unhappy by being too ambitious with simultaneous transfers
Cheap Switches
- That $20 switch might save money but could cause random cluster hiccups
- Invest in something decent if you're having stability issues

My advice? Start simple with single NICs. Only add complexity when you actually hit problems!

🤔 "But I Only Have 2 Nodes!"

A 2-node cluster isn't great for High Availability (since losing one = losing quorum), but it's totally fine if you just want easier management!

The Emergency Recovery Trick

When one node dies in a 2-node cluster, here's your lifeline:

# Check if you've lost quorum
$ pvecm status
# Look for the line: Quorate: No 😱

# Tell the surviving node to expect only 1 vote (emergency use only!)
$ pvecm expected 1

# Check again
$ pvecm status
# Look for the line: Quorate: Yes 🎉

Pro tip: QDevice to the rescue!
You can also add a QDevice - basically a tiny third voter (like a Raspberry Pi) that breaks ties in 2-node clusters. It's a bit more complex to set up, but worth investigating if you're stuck with 2 nodes long-term.


💭 Final Thoughts

So that's what I've learned about Corosync - the unsung hero keeping our Proxmox clusters from descending into chaos!

The TL;DR:

Understand Quorum (majority rules!)
Keep your network stable (especially latency)
Use odd numbers of nodes when possible
Don't overthink it for a homelab

The beauty of homelabbing is learning enterprise concepts and then figuring out what actually matters for your setup. You don't need redundant 10Gb networks and enterprise switches - you just need to understand the principles and adapt them to your reality (and budget)!

What's your cluster setup like? Are you running the recommended odd number of nodes, or living dangerously with an even number? Let me know in the comments!

Found this helpful? Drop a ❤️ and follow for more homelab and DevOps learning adventures! I'm always breaking things and (usually) fixing them, so there's plenty more to come!

My 2-Year Journey to Becoming a DevOps Engineer - The Roadmap

Chikara Inohara — Sun, 31 Aug 2025 09:07:58 +0000

Hello everyone! 👋

I usually write about my homelab setup on Qiita (a Japanese blog site), but today, I want to share something different—a major professional goal I'm setting for myself.

I'm officially starting a two-year journey to become a DevOps Engineer.

This isn't just about learning new tech. It's a public commitment. I plan to document my progress, my struggles, and my victories right here. Think of it as a captain's log for my career voyage. This first post is about sharing the map I'll be using to navigate.

Why DevOps?

What draws me to DevOps is the holistic approach to the software lifecycle. I'm fascinated by the idea of using tools like Infrastructure as Code (IaC) to automate everything, building resilient and reliable systems from the ground up. It's about bridging the gap between development and operations, and I want to be that bridge.

The Inspiration Behind This Journey

Before diving into my roadmap, I want to share what sparked this ambitious plan. I came across several YouTube videos that not only inspired me but also helped crystallize my approach to this career transition:

🎯 "How to become a DevOps Engineer in 2025"

This video is a comprehensive roadmap that provides a structured approach to becoming a DevOps Engineer. It covers everything from setting up a home lab and mastering Linux fundamentals to diving deep into containers, programming, cloud technologies, and Kubernetes. The creator also emphasizes the importance of soft skills and even touches on the role of AI in the future of DevOps, making this an invaluable guide for my journey.

💪 "My Self-Taught Coding Story"

This personal and inspiring story of a career change from a non-technical background (hospital worker) to a software engineer, and eventually a Developer Relations Engineer, was a huge motivation. It's a powerful reminder that with dedication and a willingness to learn, a career transformation like this is entirely possible.

Why these videos matter
These resources didn't just give me technical knowledge—they gave me the confidence that with a structured plan and consistent effort, this career transition is absolutely achievable. Each video addressed different aspects of my journey: the technical roadmap, the process understanding, and the human element of career transformation.

The 2-Year Goal: A Four-Phase Roadmap

My journey is broken down into four distinct phases. This roadmap will be my guide and my promise to myself.

📚 Phase 1: Building the Foundation

Timeline: First 6 Months

Goal & Strategy
Goal: Master the fundamentals of Linux, Networking, and AWS. Make output on GitHub a daily, natural habit.

Strategy: Start with the basics that every DevOps engineer needs. Focus on understanding rather than memorization.

Key Deliverables:

✅ Earn the LPIC-1 certification
✅ Earn the CCNA certification
✅ Earn the AWS Certified Cloud Practitioner certification
✅ Maintain a well-organized GitHub profile with study scripts and notes
✅ Document learning journey through weekly blog posts

☁️ Phase 2: Cloud, IaC, and Containers in Practice

Timeline: Months 7-12

Goal & Strategy
Goal: Move beyond theory to practical application. Leave manual infrastructure setup behind and learn to containerize applications.

Strategy: Every piece of infrastructure should be code. Every application should be containerized. No exceptions.

Key Deliverables:

✅ Earn the AWS Certified Solutions Architect - Associate (SAA) certification
✅ Manage AWS infrastructure entirely with Terraform code
✅ Write custom Dockerfiles for at least 5 different application types
✅ Create a multi-container application with Docker Compose
✅ Implement GitOps practices in personal projects

🔄 Phase 3: Building an Automated Pipeline

Timeline: Months 13-18

Goal & Strategy
Goal: Understand Kubernetes and build a complete CI/CD pipeline that automates everything from code commit to deployment.

Strategy: Build real pipelines for real projects. Learn by breaking things and fixing them.

Key Deliverables:

✅ A fully functional CI/CD pipeline (GitHub Actions/Jenkins)
✅ Deploy applications to a Kubernetes cluster
✅ Implement blue-green and canary deployments
✅ Create a detailed guide on setting up a Kubernetes cluster at home
✅ Contribute to open-source DevOps projects

📊 Phase 4: SRE Practices and Job Hunting

Timeline: Months 19-24

Goal & Strategy
Goal: Learn to monitor and improve the reliability of the systems I've built. Polish my portfolio and begin the job search.

Strategy: Think like an SRE. Measure everything. Automate everything. Document everything.

Key Deliverables:

✅ Build a monitoring dashboard using Prometheus and Grafana
✅ Implement alerting with PagerDuty/Opsgenie
✅ Create chaos engineering experiments
✅ Develop SLIs, SLOs, and error budgets for personal projects
✅ Polish GitHub portfolio with 10+ production-ready projects
✅ Start applying for DevOps positions

🎯 The First 90 Days: Breaking It Down

Week 1-2: Environment Setup

Set up home lab with virtualization → done
Configure Git and GitHub → done
Start daily commits habit

Week 3-4: Linux Deep Dive

Master basic commands and file system → already learned at work but recap
Learn shell scripting basics
Understand permissions and processes
Practice with systemd and services

Week 5-8: Networking Fundamentals

OSI model and TCP/IP stack → already learned at work but recap
Subnetting and VLANs → already learned at work but recap
DNS, DHCP, and routing → already learned at work but recap
Hands-on with virtual networks

Week 9-12: AWS Foundations

Core services (EC2, S3, VPC)
IAM and security best practices
Cost optimization strategies
Prepare for Cloud Practitioner exam

📊 Success Metrics

I'm setting clear, measurable goals to track my progress:

$$\text{Success Rate} = \frac{\text{Completed Milestones}}{\text{Total Planned Milestones}} \times 100\%$$

Monthly Targets:

📝 4 technical blog posts
💻 20+ GitHub commits
📚 40 hours of structured learning
🔨 2 practical projects completed

🚀 Tools & Resources I'll Be Using

My Tech Stack
Learning Platforms:

Homelab
Udemy
YouTube
Official documentation

Hands-On Labs:

AWS Free Tier
Home Lab (Proxmox/Kubernetes)

Core Tools to Master:

Version Control: Git, GitHub
IaC: Terraform, Ansible
Containers: Docker, Kubernetes
CI/CD: GitHub Actions
Monitoring: Prometheus, Grafana, ELK Stack
Cloud: AWS (primary), Azure, GCP (basics)

💡 What Makes This Different?

This isn't just another "learn DevOps" post. Here's what I'm committed to:

Public Accountability: Weekly progress updates right here
Real Projects: Everything I learn gets applied to actual projects
Open Source: All my learning materials and projects will be open source
Community First: I'll help others who are starting their journey

Follow my journey on GitHub - All resources will be open source!

🤝 Join Me on This Journey

Want to follow along?

Star my GitHub repository for updates
Follow me here on DEV and on X for weekly progress posts
Connect on LinkedIn for professional updates

I'm sharing this roadmap to hold myself accountable and to connect with others who might be on a similar path. If you have advice, encouragement, or just want to follow along, I'd love to hear from you.

Let the journey begin! 🚀

Drop a comment below with your thoughts or advice! Are you on a similar journey? What resources have helped you the most?

Next Post Preview: "Week 1: Setting Up My DevOps Home Lab - A Complete Guide"

Follow me to get notified when it drops! 🔔
