May 26, 2015

Cassandra on AWS - Part 3 - Backup

This third part will focus on the data backup strategy for Cassandra on AWS.

Cassandra Data Backup Strategy

If a node in a cluster fails, Cassandra can automatically repair it and bring it up to date through replication: the node syncs with the other nodes in the cluster and uses their data to fix itself. It is advisable to enable incremental backups and take snapshots at regular intervals (say, every 24 hours). The snapshots and incremental backups can then be collected from each node and pushed to S3 for use during a restore.
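As a rough sketch of the per-node step described above, the script below turns on incremental backups, takes a tagged snapshot, and lists the directories that would need to be shipped to S3. The data directory path, the `daily-` tag naming, and the function names are assumptions for illustration, not something Cassandra prescribes. `nodetool enablebackup` only lasts until the node restarts; setting `incremental_backups: true` in cassandra.yaml keeps it on.

```shell
#!/bin/sh
# Assumption: default package install location; match your data_file_directories.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"

# Tag for today's snapshot, e.g. daily-20150526 (naming scheme is hypothetical).
snapshot_tag() {
    echo "daily-$(date -u +%Y%m%d)"
}

# Print snapshot directories for a tag.
# Layout: <data_dir>/<keyspace>/<table>/snapshots/<tag>
# Usage: find_snapshot_dirs <data_dir> <tag>
find_snapshot_dirs() {
    find "$1" -type d -path "*/snapshots/$2"
}

# Print incremental backup directories: <data_dir>/<keyspace>/<table>/backups
# Usage: find_backup_dirs <data_dir>
find_backup_dirs() {
    find "$1" -type d -name backups
}

# Enable incremental backups, snapshot all keyspaces, then list what to upload.
# Not called automatically; requires nodetool and a running node.
take_backup() {
    tag=$(snapshot_tag)
    nodetool -h localhost enablebackup
    nodetool -h localhost snapshot -t "$tag"
    find_snapshot_dirs "$DATA_DIR" "$tag"
    find_backup_dirs "$DATA_DIR"
}
```

Calling `take_backup` on a node prints the snapshot and backup directories, one per line. Once a snapshot has been uploaded, `nodetool clearsnapshot -t <tag>` frees the local disk space it holds.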

  • Take a snapshot backup daily at 12:00 A.M. (midnight) or at any other convenient off-peak hour.
  • Snapshots and incremental backup files older than 7 days can be deleted from S3.
  • Name and organize the S3 backup location by cluster and node names. On AWS, these will be the Region and AZs.
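One possible way to lay this out is sketched below: a prefix per cluster, Region, AZ, node and day, a 7-day cutoff check, and an upload that keeps the keyspace/table directory structure. The bucket name, the prefix order, the function names and the cron path are all hypothetical; `retention_cutoff` relies on GNU `date -d`, and `upload_backup` assumes the AWS CLI is installed and configured.

```shell
#!/bin/sh
# Hypothetical bucket name; replace with your own.
BUCKET="${BUCKET:-example-cassandra-backups}"

# Build the S3 prefix for one node's backup on a given day (YYYYMMDD).
# Usage: s3_prefix <cluster> <region> <az> <node> <date>
s3_prefix() {
    echo "s3://$BUCKET/$1/$2/$3/$4/$5"
}

# Succeed if a YYYYMMDD backup date is strictly before the cutoff date.
# Usage: is_expired <backup_date> <cutoff_date>
is_expired() {
    [ "$1" -lt "$2" ]
}

# Cutoff date for 7-day retention (GNU date only; not portable to BSD/macOS).
retention_cutoff() {
    date -u -d "7 days ago" +%Y%m%d
}

# Upload only the tagged snapshot and the incremental backups from the data
# directory. Later filters take precedence, so everything is excluded first.
# Usage: upload_backup <data_dir> <tag> <s3_prefix>
upload_backup() {
    aws s3 sync "$1" "$3" \
        --exclude "*" \
        --include "*/snapshots/$2/*" \
        --include "*/backups/*"
}

# Example crontab entry for a midnight run (script path is hypothetical):
# 0 0 * * * /usr/local/bin/cassandra-backup.sh
```

For example, `s3_prefix prod us-west-2 us-west-2a node1 20150526` yields a per-day prefix, and any day for which `is_expired <day> "$(retention_cutoff)"` succeeds could be removed with `aws s3 rm --recursive` on that prefix. An S3 lifecycle rule with a 7-day expiration is another way to get the same retention without scripting the deletes.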

Cassandra Node Failure Recovery Strategy

If a node in a cluster fails, or restarts and stays out of the cluster for some time, it is mandatory to run the node repair tool on that node. Node repair brings the node's data back in sync with its replicas, including the markers for deleted data (tombstones), and so ensures that deleted data does not resurrect on other nodes as non-deleted data. Execute the following command on the failed node soon after startup:
	
nodetool repair -dc <datacenter> -h localhost

Example:
nodetool repair -dc us-west-2 -h localhost
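Because the repair has to run soon after startup, it can help to wait until the node answers `nodetool` before issuing it. The following is a minimal sketch of that idea; `wait_for` and `repair_after_startup` are hypothetical helper names, and the 30 attempts with a 10 second delay are arbitrary values, not recommendations.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <delay_seconds> <command> [args...]
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Wait for the local node to respond to nodetool, then run the repair
# command from above for the given datacenter.
# Usage: repair_after_startup <datacenter>
repair_after_startup() {
    wait_for 30 10 nodetool -h localhost info || return 1
    nodetool repair -dc "$1" -h localhost
}
```

On a restarted node this would be invoked as `repair_after_startup us-west-2`.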
	


Reference
[Taking a Snapshot]
[Incremental Backup]
[Restoring from a snapshot]