A common thing I do at work is upgrading MongoDB Replica Sets.
The more I do it, the more I think about the time I spend doing it.
So I built a lab test that tries to answer:
How long does it take to upgrade a MongoDB Replica Set using Apache Airflow?
Please keep in mind we have the following software requirements for this test:
PREREQ #1: 3x Debian 13 or Rocky Linux 9 Virtual Machines.
Ideally we should set up a test environment as close to production as possible; I suggest you use VMs as much as you can.
In all of my tests, I use my own repo for setting up a libvirt/KVM based deployment of 3 VMs:
https://github.com/ad132p/labs
If you know libvirt/KVM, go take a look!
You need terraform (or opentofu) for deploying the nodes.
tofu plan
tofu apply
PREREQ #2: A P-S-S MongoDB v7.0 replica set
We need a standard MongoDB replica set with 3 nodes: one primary and 2 secondaries.
There are many ways to deploy a MongoDB replica set, but this deployment
uses PSMDB from Percona. It aims to be a drop-in replacement for the MongoDB Community version,
plus it has many add-on features.
Check it out!
https://docs.percona.com/percona-server-for-mongodb/8.0/install/index.html
I could set up the replica set manually, but for better troubleshooting,
I opted to use ansible https://github.com/percona/mongo_terraform_ansible/tree/main/ansible since
it deploys Replica Sets and Sharded environments on RHEL and Debian compatible machines.
Check Ivan's great project:
https://github.com/percona/mongo_terraform_ansible
And see what it can do for you.
Also, you need Ansible and SSH connectivity to the VMs for deploying the Replica Set.
Make sure to use MongoDB/PSMDB 7
mongo_release: psmdb-70
rs0 [primary] test> db.version()
7.0.26-14
rs0 [primary] test>
Warning: From now on, I assume you have access to a mongosh shell similar to this:
[apollo@grow ~]$ mongosh 'mongodb://root:password@db-1:27017,db-2:27017,db-3:27017/?replicaSet=rs0'
Current Mongosh Log ID: 693ca63329e00f9aa9722621
Connecting to: mongodb://<credentials>@db-1:27017,db-2:27017,db-3:27017/?replicaSet=rs0&appName=mongosh+2.5.9
Using MongoDB: 7.0.26-14
Using Mongosh: 2.5.9
mongosh 2.5.10 is available for download: https://www.mongodb.com/try/download/shell
For mongosh info see: https://www.mongodb.com/docs/mongodb-shell/
PREREQ #3: Apache Airflow
For now, just install Apache Airflow.
Let's go with the officially supported pip-based installation for this PoC:
export AIRFLOW_HOME=~/airflow
### Rocky Linux 9
You have to install uv, as recommended by Apache Airflow:
https://docs.astral.sh/uv/getting-started/installation/
Once you have uv, create your Python 3.10 env:
uv python install 3.10.7
uv venv
source .venv/bin/activate
AIRFLOW_VERSION=3.1.3
### Extract the version of Python you have installed. If you're currently using a Python version that is not supported by Airflow, you may want to set this manually.
### See above for supported versions.
PYTHON_VERSION="$(python -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')"
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
### For example, with Airflow 3.1.3 and Python 3.10 this resolves to: https://raw.githubusercontent.com/apache/airflow/constraints-3.1.3/constraints-3.10.txt
uv pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
Finally start airflow:
airflow standalone
Access the Airflow UI at http://localhost:8080/
The username is admin and the password can be found here:
cat ~/airflow/simple_auth_manager_passwords.json.generated
Now open another terminal session in the same directory where you set up your venv, and do the following:
Activate venv:
source .venv/bin/activate
Install SSH provider:
uv pip install apache-airflow-providers-ssh --python /home/apollo/.venv/bin/python
Run this test command:
airflow tasks test example_bash_operator runme_0 2015-01-01
It should work.
From this point on, I assume you have access to an Apache Airflow web interface.
Why Airflow?
Airflow's DAG (Directed Acyclic Graph) structure allows you to define a strict sequence of steps (nodes) and dependencies (edges).
Nearly all MongoDB maintenance tasks require a rolling approach: you must upgrade one node at a time to maintain high availability.
In this lab, all steps I take in the implementation plan are defined as nodes, and dependencies
are enforced by directed edges (arrows).
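To make the node/edge idea concrete, here is a minimal sketch of how such a DAG could be laid out. The DAG and task names are my own illustration, not the actual lab code, and import paths may differ slightly between Airflow versions:

```python
# Minimal sketch only: DAG and task names are illustrative, not the lab's real code.
from datetime import datetime

from airflow.sdk import DAG
from airflow.providers.standard.operators.empty import EmptyOperator

with DAG(
    dag_id="mongodb_rolling_upgrade_sketch",
    start_date=datetime(2025, 1, 1),
    schedule=None,   # run on demand, not on a schedule
    catchup=False,
) as dag:
    pre_checks = EmptyOperator(task_id="pre_checks")
    upgrade_secondary_1 = EmptyOperator(task_id="upgrade_secondary_1")
    upgrade_secondary_2 = EmptyOperator(task_id="upgrade_secondary_2")
    step_down_primary = EmptyOperator(task_id="step_down_primary")
    upgrade_former_primary = EmptyOperator(task_id="upgrade_former_primary")
    set_fcv = EmptyOperator(task_id="set_feature_compatibility_version")

    # The directed edges enforce the rolling-upgrade order, one node at a time.
    (
        pre_checks
        >> upgrade_secondary_1
        >> upgrade_secondary_2
        >> step_down_primary
        >> upgrade_former_primary
        >> set_fcv
    )
```

In the real DAG the EmptyOperator placeholders are replaced by tasks that actually do the work, but the dependency chain stays exactly this shape.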
Please check the example from the documentation:
https://www.mongodb.com/docs/manual/release-notes/8.0-upgrade-replica-set/#upgrade-procedure
Whenever a major upgrade occurs, a few key steps are always taken:
0. Pre-checks:
We need to make sure that the Replica Set is healthy to begin with.
There are many ways to do this, but for now we can check whether any secondary node is lagging behind the primary.
rs0 [primary] test> rs.printSecondaryReplicationInfo()
source: db-2:27017
{
syncedTo: 'Sat Dec 20 2025 00:10:34 GMT+0000 (Coordinated Universal Time)',
replLag: '0 secs (0 hrs) behind the primary '
}
---
source: db-3:27017
{
syncedTo: 'Sat Dec 20 2025 00:10:34 GMT+0000 (Coordinated Universal Time)',
replLag: '0 secs (0 hrs) behind the primary '
}
We can also use commands such as rs.status() to check individual node information.
Under this step we may have extra checks, such as checking whether a backup is running or an index is being built. For simplicity these are omitted here. The key is to know which pre-checks are necessary for you to give your Implementation Plan a GO or NO-GO.
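As a sketch of how this lag check could be automated inside a pre-check task, something like the following could work. The connection string and threshold are illustrative, and it assumes pymongo is installed in the environment the task runs in:

```python
# Sketch of a replication-lag pre-check; adapt the URI and threshold to your cluster.
from pymongo import MongoClient

MAX_LAG_SECONDS = 10  # illustrative threshold

client = MongoClient(
    "mongodb://root:password@db-1:27017,db-2:27017,db-3:27017/?replicaSet=rs0"
)
status = client.admin.command("replSetGetStatus")

primary_optime = next(
    m["optimeDate"] for m in status["members"] if m["stateStr"] == "PRIMARY"
)

for member in status["members"]:
    if member["stateStr"] == "SECONDARY":
        lag = (primary_optime - member["optimeDate"]).total_seconds()
        print(f'{member["name"]}: {lag:.0f}s behind the primary')
        if lag > MAX_LAG_SECONDS:
            raise RuntimeError(f'{member["name"]} is lagging; NO-GO')
```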
After all pre-checks are completed, we are ready to proceed with the Implementation Plan.
Once we are safe to proceed with the main plan, a typical MongoDB replica set gets
upgraded this way:
1. Shut down a secondary replica set member.
2. Upgrade secondary member of the replica set.
3. Repeat steps 1 and 2 on the remaining secondaries.
4. Step down the replica set primary.
5. Upgrade the former primary.
6. setFeatureCompatibilityVersion: "8.0" or latest (it's 2026, folks!)
That's it. And this repetitive process never really changes. In fact, this rolling upgrade
pipeline is followed in nearly all production changes in a replica set or sharded environment.
Let's see how Airflow implements each of these steps:
The 5-Step Workflow
Discovery (Pre-checks, step 0).
The DAG connects to your MongoDB cluster.
It identifies which node is currently the PRIMARY and which are SECONDARIES.
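Under the hood, discovery can be as simple as asking the replica set who the primary is. A minimal sketch, assuming pymongo and an illustrative connection string:

```python
# Sketch of the discovery step: ask the replica set who is primary.
from pymongo import MongoClient

client = MongoClient(
    "mongodb://root:password@db-1:27017,db-2:27017,db-3:27017/?replicaSet=rs0"
)
hello = client.admin.command("hello")

primary = hello["primary"]                       # e.g. 'db-1:27017'
secondaries = [h for h in hello["hosts"] if h != primary]
print("primary:", primary)
print("secondaries:", secondaries)
```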
Upgrade Secondaries Sequentially
It iterates through the list of Secondary nodes one by one.
For each secondary, it performs these actions via SSH:
Stops the mongod service.
Updates the package repositories (enabling Percona 8.0).
Installs the new binaries (upgrading from v7 to v8).
Starts the mongod service.
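As a rough sketch, one of these per-node tasks could look like the SSHOperator below. The connection ID, timeout, and the exact repository/package commands are assumptions for a Debian-based PSMDB node, not the exact lab code:

```python
# Illustrative only: ssh_conn_id and the repo/package commands are assumptions
# for a Debian + Percona Server for MongoDB node.
from airflow.providers.ssh.operators.ssh import SSHOperator

upgrade_db_2 = SSHOperator(
    task_id="upgrade_secondary_db_2",
    ssh_conn_id="ssh_db_2",          # an Airflow SSH connection you define
    cmd_timeout=1800,                # package installs can be slow
    command=(
        "sudo systemctl stop mongod && "
        "sudo percona-release enable psmdb-80 release && "
        "sudo apt-get update && "
        "sudo apt-get install -y percona-server-mongodb && "
        "sudo systemctl start mongod"
    ),
)
```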
Health Check: Crucially, it waits for the node to rejoin the replica set and report a healthy SECONDARY state before moving on to the next node. This ensures the cluster always maintains a majority.
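The health check itself can be a simple poll against replSetGetStatus. A minimal sketch, with an illustrative host name, connection string, and timeout:

```python
# Sketch of the health check: wait until the upgraded node reports SECONDARY again.
import time

from pymongo import MongoClient


def wait_for_secondary(host: str, timeout: int = 600) -> None:
    client = MongoClient(
        "mongodb://root:password@db-1:27017,db-2:27017,db-3:27017/?replicaSet=rs0"
    )
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = client.admin.command("replSetGetStatus")
        member = next(m for m in status["members"] if m["name"] == host)
        if member["stateStr"] == "SECONDARY" and member["health"] == 1:
            return
        time.sleep(10)
    raise TimeoutError(f"{host} did not return to SECONDARY in time")


wait_for_secondary("db-2:27017")
```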
Step Down Primary (step_down_primary)
Once all secondaries are upgraded and healthy, the DAG connects to the current Primary.
It executes rs.stepDown().
Result: The Primary forces an election and becomes a Secondary. One of the already-upgraded nodes becomes the new Primary.
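A minimal sketch of that step-down call via pymongo (connection string illustrative; the 60 is the number of seconds the old primary stays ineligible for re-election):

```python
# Sketch of the step-down task. With a replicaSet URI and the default read
# preference, the command is routed to the current primary.
from pymongo import MongoClient
from pymongo.errors import AutoReconnect

client = MongoClient(
    "mongodb://root:password@db-1:27017,db-2:27017,db-3:27017/?replicaSet=rs0"
)
try:
    client.admin.command("replSetStepDown", 60)
except AutoReconnect:
    # The primary may drop the connection while stepping down; that is expected.
    pass
```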
Upgrade Former Primary (upgrade_former_primary)
The DAG now targets the node that was originally the Primary (now a Secondary).
It performs the same upgrade procedure (Stop → Install → Start).
Finalize (set_feature_compatibility_version)
Finally, it connects to the new Primary.
It sets the featureCompatibilityVersion (FCV) to 8.0.
This marks the upgrade as complete and enables v8-specific features (preventing older nodes from joining).
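A minimal sketch of that final call via pymongo (connection string illustrative; note that MongoDB 7.0+ requires confirm: true when setting the FCV):

```python
# Sketch of the finalize step: set the FCV on the (new) primary.
from pymongo import MongoClient

client = MongoClient(
    "mongodb://root:password@db-1:27017,db-2:27017,db-3:27017/?replicaSet=rs0"
)
client.admin.command({"setFeatureCompatibilityVersion": "8.0", "confirm": True})
```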
Key Features
Zero Downtime: By upgrading secondaries first and stepping down the primary, the cluster remains available for writes (except for the few seconds during election).
Sequential Safety: Nodes are never upgraded in parallel. If one node fails to come back up, the DAG fails, leaving the rest of the cluster untouched and healthy.
Author: epaminondas
Tags: #databases