Cassandra on AWS - Part 3 - Backup
Cassandra Data Backup Strategy
If a node in a cluster fails, Cassandra can automatically repair it and bring it up to date through replication: the node syncs with the other nodes in the cluster and uses their data to fix itself. It is still advisable to enable incremental backups and take snapshots at regular intervals (say, every 24 hours). The snapshot and incremental backup files can be collected from the nodes and pushed to S3 for use during a restore.
- A snapshot backup should be taken daily at 12:00 A.M. (midnight) or during any other convenient off-peak hour.
- Snapshots & incremental backup files older than 7 days can be deleted from S3.
- The S3 backup location should be named and organized by cluster and node name. On AWS, these correspond to the Region and the AZs.
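The backup routine above can be sketched as a small shell script. This is a minimal sketch under stated assumptions, not a definitive implementation: the bucket name, cluster name, data directory and the cluster/region/AZ/node prefix layout are hypothetical placeholders, and it assumes the `aws` CLI and `nodetool` are installed on the node and that incremental backups are already enabled (`incremental_backups: true` in cassandra.yaml).

```shell
#!/bin/sh
# Sketch of a daily snapshot upload. BUCKET, CLUSTER and DATA_DIR are
# hypothetical placeholders; adjust them to your own environment.
BUCKET="${BUCKET:-my-cassandra-backups}"
CLUSTER="${CLUSTER:-prod-cluster}"
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"

# Build the S3 prefix as bucket/cluster/region/AZ/node/tag (assumed layout).
s3_prefix() {
  printf 's3://%s/%s/%s/%s/%s/%s\n' "$BUCKET" "$CLUSTER" "$1" "$2" "$3" "$4"
}

# Take a snapshot tagged with today's UTC date, then upload the snapshot
# files and the incremental backup files for this node.
run_backup() {
  local region="$1" az="$2" node="$3"
  local tag dest
  tag="$(date -u +%Y-%m-%d)"
  dest="$(s3_prefix "$region" "$az" "$node" "$tag")"

  nodetool -h localhost snapshot -t "$tag" || return 1

  # Snapshot files live under <keyspace>/<table>/snapshots/<tag>/ and
  # incremental backup files under <keyspace>/<table>/backups/.
  aws s3 sync "$DATA_DIR" "$dest" \
    --exclude '*' \
    --include "*/snapshots/$tag/*" \
    --include '*/backups/*'
}
```

To run it at midnight, a cron entry such as `0 0 * * *` could invoke a script that sources these functions and calls `run_backup us-west-2 us-west-2a node1` (example values). For the 7-day retention, an S3 lifecycle expiration rule on the bucket is one option instead of a hand-written cleanup job.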
Cassandra Node Failure Recovery Strategy
If a node in a cluster fails, or if it restarts and stays out of the cluster for some time, it is mandatory to run the node repair tool on that node. Node repair brings the node's data, including tombstones (the markers for deleted data), back in sync with the other replicas, which ensures that deleted data does not resurrect on other nodes as non-deleted data. Execute the following command on the failed node soon after startup:
nodetool repair -dc <datacenter_name> -h localhost
Example:
nodetool repair -dc us-west-2 -h localhost
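Because the repair should run soon after startup, it can help to wait until the node answers nodetool before issuing it. The following is a minimal sketch, assuming `nodetool info` succeeding is a good enough readiness check; the function names, retry count and delay are hypothetical.

```shell
#!/bin/sh
# Sketch: wait for the local node to answer nodetool, then repair one datacenter.

# Poll `nodetool info` until it succeeds or the attempts run out.
# $1 = number of attempts (default 30), $2 = seconds between attempts (default 10)
wait_for_node() {
  local attempts="${1:-30}" delay="${2:-10}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if nodetool -h localhost info >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Run the repair command from the article once the node is reachable.
# $1 = datacenter name, $2 and $3 are passed through to wait_for_node.
repair_after_restart() {
  local dc="$1"
  if [ -z "$dc" ]; then
    echo "usage: repair_after_restart <datacenter_name>" >&2
    return 2
  fi
  if ! wait_for_node "${2:-30}" "${3:-10}"; then
    echo "node did not respond to nodetool" >&2
    return 1
  fi
  nodetool repair -dc "$dc" -h localhost
}
```

For example, `repair_after_restart us-west-2` waits up to about five minutes with the defaults and then runs the repair shown above.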
Reference
[Taking a Snapshot]
[Incremental Backup]
[Restoring from a snapshot]