lfm.sys SysAdmin & Backend Developer
MICRO case_file

Scripted Cassandra Node Decommission

Database Cloud Production

Context

After a workload consolidation, a set of Cassandra nodes was no longer needed. The cluster needed a clean shrink, with each departing node handing off its data first, not a hard removal.

Problem

Removing nodes from a live ring is reversible only up to a point. The change had to respect Cassandra's data ownership semantics, so that the token ranges owned by each departing node were handed to the remaining nodes before it left the ring, and the operation had to be repeatable across multiple nodes.
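To make the ownership point concrete, here is a toy model of a token ring, not Cassandra's actual implementation. It assumes one token per node and looks only at the primary owner, ignoring vnodes and replication. The node names and token values are made up for illustration.

```python
from bisect import bisect_left


def primary_owner(ring: dict[str, int], key_token: int) -> str:
    """Return the node owning key_token in a simplified single-token ring.

    Each node owns the range (previous_token, own_token]; tokens above the
    highest node token wrap around to the node with the lowest token.
    """
    ordered = sorted(ring.items(), key=lambda item: item[1])
    tokens = [token for _, token in ordered]
    index = bisect_left(tokens, key_token) % len(tokens)
    return ordered[index][0]


# Hypothetical three-node ring.
ring = {"node-a": 0, "node-b": 100, "node-c": 200}
owner_before = primary_owner(ring, 150)

# Shrink the ring by taking node-c out: its range falls to the next node
# clockwise, which is why its data has to be streamed there before it leaves.
shrunk = {name: token for name, token in ring.items() if name != "node-c"}
owner_after = primary_owner(shrunk, 150)
```

In this model the key at token 150 moves from node-c to node-a once node-c is gone. A hard removal would change the owner without moving the data, which is the situation a clean decommission avoids.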

My role

Operator: scripted the decommission steps and validated cluster state between them.

Technical actions

  1. Drained and decommissioned nodes following Cassandra's ownership-aware procedure.
  2. Validated cluster topology and token ranges between steps.
  3. Captured the steps as a repeatable script so the next decommission would not start from zero.
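The steps above could be captured roughly as follows. This is a minimal sketch, not the original script. It assumes `nodetool` is on the PATH, that `nodetool status` is answered by a node that stays in the cluster, that remote JMX access via `-h` is enabled, and that `nodetool decommission` returns once streaming has finished. The function names are hypothetical, and the drain step is left out.

```python
import subprocess
from typing import Callable

# A runner takes nodetool arguments and returns stdout. Injecting it keeps
# the checks testable without a live cluster.
Runner = Callable[[list[str]], str]


def run_nodetool(args: list[str]) -> str:
    """Run nodetool and return its stdout; raises if the exit code is non-zero."""
    result = subprocess.run(
        ["nodetool", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def parse_status(output: str) -> dict[str, str]:
    """Map node address to its two-letter code from `nodetool status` (e.g. UN, UL, DN)."""
    states: dict[str, str] = {}
    for line in output.splitlines():
        parts = line.split()
        # Node rows start with Up/Down plus Normal/Leaving/Joining/Moving.
        if (
            len(parts) >= 2
            and len(parts[0]) == 2
            and parts[0][0] in "UD"
            and parts[0][1] in "NLJM"
        ):
            states[parts[1]] = parts[0]
    return states


def ring_is_healthy(states: dict[str, str]) -> bool:
    """True when every listed node is Up and Normal."""
    return bool(states) and all(code == "UN" for code in states.values())


def decommission_node(address: str, run: Runner = run_nodetool) -> None:
    """Decommission one node, validating cluster state before and after."""
    before = parse_status(run(["status"]))
    if address not in before:
        raise ValueError(f"{address} is not in the ring")
    if not ring_is_healthy(before):
        raise RuntimeError(f"ring not healthy before decommission: {before}")

    # Assumption: this call blocks until the node has streamed its data away.
    run(["-h", address, "decommission"])

    after = parse_status(run(["status"]))
    if address in after:
        raise RuntimeError(f"{address} is still listed after decommission")
    if not ring_is_healthy(after):
        raise RuntimeError(f"ring not healthy after decommission: {after}")
```

Shrinking by several nodes is then a loop that calls `decommission_node` one address at a time. A failed check stops the run before the next node is touched.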

Operational impact

The cluster shrank safely after the consolidation. The procedure is now a reusable artifact instead of tribal knowledge.

What this demonstrates

  • Treating cleanup as a first-class operation, not an afterthought.
  • Scripting recurring infrastructure work for repeatability.