In this post we will set up a highly available MySQL Galera cluster on Docker Swarm.
About
The service is based on the docker-mariadb-cluster repository. It is designed not to have any persistent data attached to the service, relying instead on the “nodes” to replicate the data.
Note that although this proof of concept works, I always recommend using a remote MySQL database outside your cluster, such as RDS.
Since we don’t persist any data on the MySQL cluster, I have added a dbclient service that runs continuous backups, and we persist the path where the backups reside to disk.
The dbclient is configured to be in the same network as the cluster so it can reach the MySQL service. By default it makes a backup every hour (3600 seconds) to the /data/{date}/ path.
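The dbclient's actual backup script is not shown in this post, so the following is only a minimal sketch of what an hourly backup loop of this kind could look like. The function names, the date format and the dump file name are assumptions made for illustration. The dblb host and root/password credentials are the ones used in the queries later in this post, and the 3600-second interval and /data path are the defaults described above.

```shell
# Hypothetical helper: build the backup directory for a given date stamp.
backup_dir() {
  base="$1"    # e.g. /data
  stamp="$2"   # e.g. output of: date +%Y-%m-%d_%H%M (the format is an assumption)
  printf '%s/%s\n' "${base%/}" "$stamp"
}

# Hypothetical helper: dump all databases into a fresh dated directory.
run_backup() {
  dir="$(backup_dir "$1" "$(date +%Y-%m-%d_%H%M)")"
  mkdir -p "$dir" || return 1
  mysqldump -h dblb -uroot -ppassword --all-databases > "$dir/all-databases.sql"
}

# Hypothetical loop: one backup per interval (3600 seconds by default).
backup_forever() {
  while true; do
    run_backup "${1:-/data}" || echo "backup failed" >&2
    sleep "${2:-3600}"
  done
}
```

Calling backup_forever with no arguments would use /data and a one-hour interval. Writing each dump into its own dated directory keeps older backups around if a later dump fails.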
Deploy the stack:
$ docker stack deploy -c docker-compose.yml galera
Creating network dbnet
Creating service galera_dbcluster
Creating service galera_dblb
Creating service galera_dbclient
Have a look to see if all the services are running, for example with docker service ls.
At the moment we only have 1 replica for our MySQL cluster. Let’s go ahead and scale the cluster to 3 replicas:
$ docker service scale galera_dbcluster=3
galera_dbcluster scaled to 3
overall progress: 3 out of 3 tasks
1/3: running [==================================================>]
2/3: running [==================================================>]
3/3: running [==================================================>]
verify: Service converged
$ docker exec -it $(docker ps -f name=galera_dbclient -q) mysql -uroot -ppassword -h dblb -e'select * from mydb.foo;'
+------+
| name |
+------+
| ruan |
+------+
Simulate a Node Failure:
Simulate a node failure by killing one of the MySQL containers:
$ docker kill 9e336032ab52
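The container ID above comes from the original run and will differ on your host. As a sketch, a small helper (the name kill_one_task is hypothetical) can pick the first matching container with the same docker ps name filter used elsewhere in this post and then kill it. It only sees containers on the node where it runs.

```shell
# Hypothetical helper: kill the first local container whose name matches the filter.
kill_one_task() {
  name="$1"   # e.g. galera_dbcluster
  # docker ps only lists containers on the current node.
  id="$(docker ps -f "name=$name" -q | head -n 1)"
  if [ -z "$id" ]; then
    echo "no running container matches $name on this node" >&2
    return 1
  fi
  docker kill "$id"
}
```

For this stack the call would be kill_one_task galera_dbcluster.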
Verify that one container is missing from our service:
$ docker service ls
ID             NAME               MODE         REPLICAS   IMAGE                            PORTS
p8kcr5y7szte   galera_dbcluster   replicated   2/3        toughiq/mariadb-cluster:latest
While the replacement container is provisioning and only 2 out of 3 containers are running, read the data 3 times to test that the round-robin queries don’t hit the affected container (the dblb won’t route traffic to it):
$ docker exec -it $(docker ps -f name=galera_dbclient -q) mysql -uroot -ppassword -h dblb -e'select * from mydb.foo;'
+------+
| name |
+------+
| ruan |
+------+

$ docker exec -it $(docker ps -f name=galera_dbclient -q) mysql -uroot -ppassword -h dblb -e'select * from mydb.foo;'
+------+
| name |
+------+
| ruan |
+------+

$ docker exec -it $(docker ps -f name=galera_dbclient -q) mysql -uroot -ppassword -h dblb -e'select * from mydb.foo;'
+------+
| name |
+------+
| ruan |
+------+
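The three reads above can also be scripted. This is a sketch that assumes it runs on the node hosting the dbclient container. The function name read_through_lb is made up for illustration, and -it is dropped because a loop does not need an interactive terminal.

```shell
# Hypothetical helper: run the same SELECT through the dblb a given number of times.
read_through_lb() {
  count="$1"
  i=1
  while [ "$i" -le "$count" ]; do
    # Stop at the first failed query so a routing problem is not hidden.
    docker exec "$(docker ps -f name=galera_dbclient -q)" \
      mysql -uroot -ppassword -h dblb -e 'select * from mydb.foo;' || return 1
    i=$((i + 1))
  done
}
```

Running read_through_lb 3 repeats the check done by hand above, and a non-zero exit status means at least one query failed.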
Verify that the 3rd container has checked in:
$ docker service ls
ID             NAME               MODE         REPLICAS   IMAGE                            PORTS
p8kcr5y7szte   galera_dbcluster   replicated   3/3        toughiq/mariadb-cluster:latest
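Instead of re-running docker service ls by hand, the check can be polled. The sketch below assumes the REPLICAS value has the running/desired shape shown above. The function names, the retry count and the 2-second pause are arbitrary choices for illustration.

```shell
# Hypothetical helper: succeed when a REPLICAS value such as "3/3" shows all tasks running.
replicas_converged() {
  counts="${1%% *}"   # drop any trailing annotation after the counts
  case "$counts" in
    */*) ;;
    *) return 1 ;;    # empty or unexpected value
  esac
  running="${counts%%/*}"
  desired="${counts##*/}"
  [ -n "$running" ] && [ "$running" = "$desired" ]
}

# Hypothetical helper: poll docker service ls until the service has converged.
wait_for_service() {
  service="$1"
  attempts="${2:-30}"
  while [ "$attempts" -gt 0 ]; do
    # The name filter matches by prefix, so only the first row is used.
    replicas="$(docker service ls --filter "name=$service" --format '{{.Replicas}}' | head -n 1)"
    if replicas_converged "$replicas"; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 2
  done
  return 1
}
```

For this stack, wait_for_service galera_dbcluster returns once the service reports 3/3 again, or fails after the retries run out.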
How to Restore?
I’m deleting the database to simulate the scenario where we need to restore: