Using GTID to attach an asynchronous Slave sounds promising. Let's have a look at the two existing GTID implementations and their integration with Galera.

GTID@MariaDB

There is one GTID used by the whole cluster, and every node increments the common seqno by itself. This works well as long as all transactions are replicated by Galera (simplified: InnoDB), because Galera takes care of the commit order of the transactions on all nodes. So, starting out with identical GTIDs/seqnos, there are no problems.

  node1> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-504 |
  +-----------------+---------+

  node2> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-504 |
  +-----------------+---------+

  node3> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-504 |
  +-----------------+---------+

But think about a DML that is not replicated by Galera. Let's assume we write into a MyISAM/MEMORY table on node1. Then only the seqno of node1 is increased:
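
For illustration, such a non-replicated write might look like this (test.local_stuff is a hypothetical, already existing MyISAM table):

  node1> INSERT INTO test.local_stuff VALUES (1);   -- MyISAM DML: not replicated by Galera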

  node1> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-505 |
  +-----------------+---------+

  node2> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-504 |
  +-----------------+---------+

  node3> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-504 |
  +-----------------+---------+

Galera does not care about the differing seqnos on the hosts. The next transaction replicated by Galera increases the seqno on all nodes by 1:

  node1> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-506 |
  +-----------------+---------+

  node2> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-505 |
  +-----------------+---------+

  node3> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-505 |
  +-----------------+---------+

So we end up with a different mapping between GTID@MariaDB and GTID@Galera on each host. This is far from optimal.

In that situation, think about a slave switching the Master. You will end up with one of two outcomes:

  • Losing transactions
  • Reapplying transactions

If you are lucky, replication simply fails - because the data is inconsistent.

So it is up to you to make sure you only run DMLs on the cluster that are replicated by Galera.
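
A quick sanity check for tables that are not handled by Galera might look like this (a sketch; adjust the schema filter to your setup):

  node1> SELECT table_schema, table_name, engine
           FROM information_schema.tables
          WHERE engine NOT IN ('InnoDB')
            AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');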

GTID@MySQL/Percona

The MySQL-Galera integration looks different:

GTID@MySQL uses the server_uuid combined with a seqno to build its GTID. Galera uses a little trick here: transactions written/replicated by Galera get a separate server_uuid, while all other transactions use the original server_uuid of the server.

Let's have a look at one node:

 node2> show global variables like 'gtid_executed';
 +---------------+-------------------------------------------------+
 | Variable_name | Value                                           |
 +---------------+-------------------------------------------------+
 | gtid_executed | 6d75ac01-ed37-ee1b-6048-592af289b902:1-10,
 933c5612-12c8-11e4-82d2-00163e014ea9:1-6 |
 +---------------+-------------------------------------------------+

6d75ac01-ed37-ee1b-6048-592af289b902 is the server_uuid used for Galera. 933c5612-12c8-11e4-82d2-00163e014ea9 is the server_uuid used for all other transactions.

So let's write into an InnoDB table:

node2> show global variables like 'gtid_executed';
+---------------+--------------------------------------------------+
| Variable_name | Value                                            |
+---------------+--------------------------------------------------+
| gtid_executed | 6d75ac01-ed37-ee1b-6048-592af289b902:1-11,
933c5612-12c8-11e4-82d2-00163e014ea9:1-6 |
+---------------+--------------------------------------------------+

node1>  show global variables like 'gtid_executed';
+---------------+--------------------------------------------------+
| Variable_name | Value                                            |
+---------------+--------------------------------------------------+
| gtid_executed | 6c7225e2-12cc-11e4-8497-00163e5e2a58:1-2,
6d75ac01-ed37-ee1b-6048-592af289b902:1-11 |
+---------------+--------------------------------------------------+

And into a MyISAM table:

node2> show global variables like 'gtid_executed';
+---------------+------------------------------------------------------+
| Variable_name | Value                                                |
+---------------+------------------------------------------------------+
| gtid_executed | 6d75ac01-ed37-ee1b-6048-592af289b902:1-11,
933c5612-12c8-11e4-82d2-00163e014ea9:1-7 |
+---------------+------------------------------------------------------+

So, because of this distinction between the writesets, all Galera data has the same GTID@MySQL to GTID@Galera mapping on every node.

There is still the possibility that replication will break because of the non-Galera transactions, though ;)

Resumé

As long as all your data is handled by Galera replication, both GTID integrations should work fine. But at the latest when you do a Rolling Upgrade (mysql_upgrade), you are lost with GTID@MariaDB.

For GTID@MariaDB it would imho be a good idea to have a separate domain id for Galera only, plus the ability to filter on that domain id on the slave.

Have fun

Erkan :)

Galera, the synchronous Master-Master replication, is quite popular. It is used by Percona XtraDB Cluster and MariaDB Galera Cluster, and even patched MySQL binaries exist. Quite often you want to add a slave to a Galera Cluster, and this works quite well. All you need is to configure at least log_bin, log_slave_updates and server_id on the designated Masters and attach your Slave.

GTID@MariaDB


You can even use traditional (non-GTID) replication, but that is a hassle, as you need to search for the right offset on the new Master to attach your Slave to.

Using GTID promises to be easier, as you simply switch to the new Master and the replication framework finds the new position based on the GTID automatically.

In fact we have two GTID implementations:

  • GTID@MySQL/Percona
  • GTID@MariaDB

There are already blog posts about attaching a Slave to a Galera Cluster without using GTID.

And even about using GTID@MySQL.

We are going to provide a post using GTID@MariaDB :)

We assume there is already a running Galera Cluster. Building one has already been explained elsewhere:

Both setups are quite similar :D

In contrast to those posts, please use wsrep_sst_method=xtrabackup-v2. The current MariaDB release 10.0.12-MariaDB-1~trusty-wsrep-log has a bug that prevents you from using wsrep_sst_method=rsync.

Additional Configuration on the Galera nodes


[mysqld]
log_bin
log_slave_updates
server_id        = 1

log_bin activates the binlog, while log_slave_updates makes sure that all transactions replicated via Galera are also written into that binlog. For didactic purposes we set the server_id to the same value on all Galera nodes.

Configuring the Slave


[mysqld]
binlog_format      = ROW
log_bin
log_slave_updates
server_id          = 2

GTID@MariaDB still lacks good operational integration, so the slave is initially built without using GTID.

There is a fix for mysqldump, and a patch for xtrabackup exists.

Attaching a Slave


MariaDB - unlike MySQL - always replicates the GTID along with the events. So we first attach the slave using mysqldump (with --master-data) and end up with a running replication.
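
The initial, non-GTID attach might have looked roughly like this (credentials, mysqldump options and the log file/position are placeholders, not taken from the post):

  master> GRANT REPLICATION SLAVE ON *.* TO 'replication'@'%' IDENTIFIED BY 'secret';

  $ mysqldump -h 10.0.3.93 -u root --all-databases --single-transaction --master-data=2 > master.sql
  $ mysql -u root < master.sql    # on the slave

  slave> CHANGE MASTER TO MASTER_HOST='10.0.3.93', MASTER_USER='replication',
         MASTER_PASSWORD='secret',
         MASTER_LOG_FILE='mysqld-bin.00000X', MASTER_LOG_POS=XXX;  -- file/pos from the commented CHANGE MASTER line in master.sql
  slave> START SLAVE;

Afterwards the slave is replicating: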

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.3.93
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mysqld-bin.000002
          Read_Master_Log_Pos: 537
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 536
        Relay_Master_Log_File: mysqld-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 537
              Relay_Log_Space: 834
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
             Master_Server_Id: 1
                   Using_Gtid: No
                  Gtid_IO_Pos:

Switching to replication using GTID is quite simple:

slave> stop slave;
slave> change master to master_use_gtid=slave_pos;
slave> start slave;
slave> show slave status\G

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.3.93
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mysqld-bin.000002
          Read_Master_Log_Pos: 873
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 694
        Relay_Master_Log_File: mysqld-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 873
              Relay_Log_Space: 992
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
        Seconds_Behind_Master: 0
             Master_Server_Id: 1
                   Using_Gtid: Slave_Pos
                  Gtid_IO_Pos: 0-1-3

Check out the last two lines:)

Failover


On failover we attach the slave to another Master.

 slave> STOP SLAVE;
 slave> CHANGE MASTER TO MASTER_HOST="$NEW_HOST";
 slave> START SLAVE;

Check Master_Host and be excited: we don't need to care about Master_Log_File and Master_Log_Pos.

 slave> show slave status\G
 *************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 10.0.3.189
                   Master_User: replication
                   Master_Port: 3306
                 Connect_Retry: 10
               Master_Log_File: mysqld-bin.000007
           Read_Master_Log_Pos: 77357
                Relay_Log_File: mysqld-relay-bin.000002
                 Relay_Log_Pos: 46594
         Relay_Master_Log_File: mysqld-bin.000007
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
           Exec_Master_Log_Pos: 77357
               Relay_Log_Space: 46892
               Until_Condition: None
         Seconds_Behind_Master: 0
              Master_Server_Id: 1
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-504

GTID promises to make things much easier. In fact, though, I don't like the way GTID@MariaDB works with Galera replication.

I'm going to tell you why in my next blog post :)

Have fun

Erkan :)

I'm giving some talks this year:

MySQL Hochverfügbar mit Galera

Place: FrOSCon

Topic: Learn about Galera and deploy it using LXC and Ansible

LBaaS-Loadbalancer as a Service

Place: GUUG Frühjahrsgespräche

Topic: It is a workshop (together with Jan Walzer and Jörg Jungermann). We are going to show how to use LXC to provide slim loadbalancers.

Medley der Containertechniken

Place: GUUG Frühjahrsgespräche

Topic: Learn about all the basic techniques vanilla container technology uses/shares (namespaces, cgroups and chroot). Have a look at some of them (LXC, libvirt, systemd-nspawn and Docker).

MySQL Replikation: Von den Anfängen in die Zukunft

Place: DOAG 2014

Topic: Learn about the past and the future of MySQL (and MariaDB) replication.


[UPDATE]

Hands on Docker

Place: CommitterConf.de 

Topic: Let's get started using Docker (workshop)

Docker - I want to break free!

Using Docker you are forced to think about mapping your ports, but most of the time you would rather assign a static IP. Even though every Docker container has an IP, you don't want to use it:

  • The IP is volatile (it is likely you get a different one on the next start)
  • The IP is not routed, so the container is unreachable - via that IP - from another host.

Docker containers run in their own network namespace. We are going to provide an IP via iproute2.

Take this container:

> docker ps
CONTAINER ID        IMAGE               COMMAND              NAMES
8fe4e6c72b90        erkan/nginx:v01     /bin/sh -c nginx     angry_mccarthy

But ip netns does not see the network namespace of the container.

> ip netns
>

This is because ip netns manages its network namespaces via /var/run/netns. To work around that, we:

Get the PID of the running Docker container:

> docker inspect --format='{{ .State.Pid }}' angry_mccarthy
26420

Link its network namespace into /var/run/netns/$nameIlike:

> ln -s /proc/26420/ns/net /var/run/netns/freeme
> ip netns
freeme

Now we create a veth pair, put one end into the network namespace of the container and attach the other end to the (existing) bridge br0.

> ip link add veth-host type veth peer name veth-freeme
> ip link set veth-freeme netns freeme  
> ip link set veth-host master br0
> ip netns exec freeme ip addr add 192.168.178.123/24 dev veth-freeme
> ip netns exec freeme ip link  set veth-freeme up
> ip link set veth-host up

From now on we can access the container via this IP from another host.

ImNotHere:~$ curl  192.168.178.123
<html>
<head>
<title>Welcome to nginx!</title>
</head>
<body bgcolor="white" text="black">
<center><h1>Welcome to nginx!</h1></center>
</body>
</html>

Done \o/

Have fun
Erkan

P.S.: http://www.opencloudblog.com - a great blog by a former colleague.

Ahoi,

I'm giving a DockerHaterHipster talk at the First Docker Meetup Frankfurt.
It is about hating, of course. But to be frank, it is more about understanding that Docker is not just a replacement - it is more a rethinking of infrastructure.

Not that you've got to love it, of course :)

Have Fun
Erkan


Ahoi,

There have already been a couple of blog posts about Docker and Galera in the MySQL community. I've got to confess I love both. But on the other hand I don't think this is a good combination at all. Having a look at the blog posts doing Galera with Docker, I'm still not convinced. I'm going to give some points on why I think so.

I assume Galera is already well known in the MySQL community :)

Docker is not just another technique to virtualize


Docker is more than just another way to virtualize. And this may be one of the biggest points I miss in the other blog posts.

What is the purpose of Docker?


With Docker you build application containers, i.e. a container running just one application. The overhead of containers compared to hypervisor technologies like KVM, VMware etc. is much lower. But instead of running a full OS in a container, you run just one application. This was first seen with LXC, btw. But Docker is much more.

Some points (and yes, we are missing a lot of important points):

  • Application containers

  • Build your images via Dockerfile

  • Don't configure your running containers

  • If you want to upgrade a container, build a new image

  • There is a lot more, but this is sufficient for a basic 'rant'

This post is in some way about the holy grail of Docker :)

Think of an application container as a binary you run on your system. You don't ssh into your mysqld binary, so you don't ssh into your MySQL application container either.

Even though application containers get their own IP, you shall not use it. The IPs are not accessible from outside the host and they are likely to change when you restart a container.

Docker advises you to do port mapping. That is where you map the port of your application (container) to a port on the host. This works on every host, instead of relying on an IP.
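
Such a mapping might look like this (the image name is hypothetical):

> docker run -d --name galera01 -p 3307:3306 erkan/mariadb-galera

Here port 3306 inside the container is published as port 3307 on the host.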

Just think about managing the ports. Remember, you need to manage at least 4 ports per node (times 3 nodes) to do it right. Using IPs would just be simpler.

Configuring Docker


Configuring and deploying Docker reminds me of the old Golden Image era. You upgrade a container by starting from an upgraded image, not by upgrading the running container. Images are built via Dockerfiles, some kind of makefile for images. Besides having fully configured images, you can build images that read env variables or options when starting.
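
A minimal Dockerfile along those lines might look like this (base image, package name and config path are assumptions):

FROM ubuntu:14.04
# install the database into the image, not into a running container
RUN apt-get update && apt-get install -y mariadb-galera-server
# bake the configuration into the image as well
COPY my.cnf /etc/mysql/my.cnf
EXPOSE 3306 4567 4568 4444
CMD ["mysqld"]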

In the end you start your containers, and after that you don't connect to them and configure them. That's the point - and that's exactly what happened in the blog posts: Dockerfiles had been used, but only in a half-hearted way. Afaik there was some basic installation, but they still needed to access the containers (attach, ssh, ansible) and configure them. This is not the holy grail of Docker.

So at least they got a running Galera Cluster


Yes, indeed. We could also talk about the Galera configuration, but let's only talk about Docker here.

Let's summarize some fails.

  1. The Galera Cluster is built while the containers are running.
  2. The Galera Cluster depends on local IPs - IPs you are going to lose if a container restarts. For sure this setup is for playing/testing only.
  3. As the installation depends on local IPs, all containers run on one host. It is most unlikely you want to run your Galera Cluster on a single host in production.
  4. Remember, your datadir is on AUFS. Fine for tests indeed.

...

Is it that bad?


As long as you do it for testing and playing around, it is fine. But when you want to go into production, you've got to rethink your setup.

LXC


I wonder why the bloggers didn't use LXC in the first place. In the end they used Docker containers like (virtual) stand-alone nodes. For that, LXC would be the right fit - and it works for Galera. (Even if it is not hipster at all.)

With LXC you have:

  • Fixed IPs
  • Containers that can communicate over the network using that IP (multi-host deployment)
  • The datadir on your preferred filesystem

So is there a happy end for Docker and Galera?


Nope :)

Of course you can do it. But let's have a look at the ports. There are two ways to manage them:

  • Do a 1:1 mapping of the container ports to the host: 3306 -> 3306 etc.

But this would not scale at all.

  • Do your port management on your own

So you've got to take care of 12 ports for a 3-node Galera Cluster. If you want to do that, you should also know the following Galera options:

wsrep_sst_receive_address
gmcast.listen_addr
ist.recv_addr
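
A sketch of how these might look when the ports are remapped on the host (the addresses and port numbers are example assumptions):

wsrep_sst_receive_address = 192.168.178.10:14444
wsrep_provider_options    = "gmcast.listen_addr=tcp://0.0.0.0:14567; ist.recv_addr=192.168.178.10:14568"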

You've also got to think about volume management.

On the other hand, that's the benefit of Galera: upgraded nodes will simply do an SST. The old data is gone with the old container, and using Docker for small/mid-sized installations this could even work.

Epilogue


In the end everyone can use Docker the way they want to. But then you are not hipster :)

Btw: maybe service discovery to the rescue?! Let's see if I've got some time to investigate it and present the holy grail of Docker and Galera \o/

Enjoy

Erkan

There are upgrades for our Galera Cluster in the repository:

> yum info galera
Installed Packages
Name        : galera
Arch        : x86_64
Version     : 25.3.2
Release     : 1.rhel6
Size        : 29 M
Repo        : installed
From repo   : mariadb

Available Packages
Name        : galera
Arch        : x86_64
Version     : 25.3.5
Release     : 1.rhel6
Size        : 7.6 M
Repo        : mariadb

> yum info MariaDB-Galera-server
Installed Packages
Name        : MariaDB-Galera-server
Arch        : x86_64
Version     : 5.5.36
Release     : 1.el6
Size        : 102 M
Repo        : installed
From repo   : mariadb

Available Packages
Name        : MariaDB-Galera-server
Arch        : x86_64
Version     : 5.5.37
Release     : 1.el6
Size        : 25 M
Repo        : mariadb

Rolling Upgrade

Let’s do a Rolling Upgrade. This nice feature of Galera allows us to upgrade to the new version without taking the cluster offline.

This is done - like with MySQL NDB Cluster - by upgrading one node after the other while the rest of the cluster keeps serving traffic.

Ansible what the f*?!

I just use it to connect to the nodes with less interaction; it is not needed at all. The following command connects to all nodes and asks for the MySQL version and the size and state of the cluster.

$ ansible -i cluster.ini galera -a 'mysql -u root -e 
  "select version();SELECT * from INFORMATION_SCHEMA.GLOBAL_STATUS 
  WHERE VARIABLE_NAME  IN (\"wsrep_cluster_size\",\"wsrep_local_state_comment\")"'
galera01 | success | rc=0 >>
version()
5.5.36-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera02 | success | rc=0 >>
version()
5.5.36-MariaDB-wsrep-log
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera03 | success | rc=0 >>
version()
5.5.36-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

As we see we run on a healthy 5.5.36-MariaDB-wsrep cluster.

What we do next is upgrade one node after the other. So, upgrading the first node:

$ ansible -i cluster.ini galera -l galera01 -a 'yum update -y  MariaDB-Galera-server galera'
galera01 | success | rc=0 >>
[snip]   
Installed:
  galera.x86_64 0:25.3.5-1.rhel6                                                

Updated:
  MariaDB-Galera-server.x86_64 0:5.5.37-1.el6                                   

Replaced:
  galera.x86_64 0:25.3.2-1.rhel6                                                

Complete!

And checking if the cluster is still fine:

$ ansible -i cluster.ini galera -a 'mysql -u root -e 
  "select version();SELECT * from INFORMATION_SCHEMA.GLOBAL_STATUS 
  WHERE VARIABLE_NAME IN (\"wsrep_cluster_size\",\"wsrep_local_state_comment\")"'
galera01 | success | rc=0 >>
version()
5.5.37-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera03 | success | rc=0 >>
version()
5.5.36-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera02 | success | rc=0 >>
version()
5.5.36-MariaDB-wsrep-log
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

Great, we already got one node upgraded. The procedure for the other two nodes is the same (always upgrade one node after the other and check that each node came back fine), so it is skipped here and we do the final check:

$ ansible -i cluster.ini galera -a 'mysql -u root -e 
  "select version();SELECT * from INFORMATION_SCHEMA.GLOBAL_STATUS 
  WHERE VARIABLE_NAME IN (\"wsrep_cluster_size\",\"wsrep_local_state_comment\")"'
galera01 | success | rc=0 >>
version()
5.5.37-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera03 | success | rc=0 >>
version()
5.5.37-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera02 | success | rc=0 >>
version()
5.5.37-MariaDB-wsrep-log
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

That's a damn easy Rolling Upgrade \o/

Epilog

I generally recommend taking the node you are upgrading out of the proxy:

echo "disable server galera_server/galera03" | nc -U /var/run/haproxy.sock
Upgrade Steps galera03
echo "enable server galera_server/galera03" | nc -U /var/run/haproxy.sock

Epilog2

Works with PXC too ;)

Update: sed /enable/disable/

It is no news that Galera is the synchronous multi-master solution for MySQL.
But Galera is just a provider.

What do you mean by provider?

Remember configuring "Galera"? There are two options, wsrep_provider and wsrep_provider_options. These define the provider (Galera) and the options for that provider. The rest of the wsrep_ options are for wsrep itself.
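
For reference, a configuration fragment might look like this (paths and values are examples; gcache.size and gcs.fc_limit are provider options, everything else is handled by the wsrep layer in mysqld):

[mysqld]
wsrep_provider         = /usr/lib64/galera/libgalera_smm.so
wsrep_provider_options = "gcache.size=512M; gcs.fc_limit=16"
wsrep_cluster_name     = deadcandance
wsrep_cluster_address  = "gcomm://node1,node2,node3"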

https://launchpad.net/wsrep

wsrep API defines a set of application callbacks and replication library calls necessary to implement synchronous writeset replication of transactional databases and similar applications. It aims to abstract and isolate replication implementation from application details. Although the main target of this interface is a certification-based multi-master replication, it is equally suitable for both asynchronous and synchronous master/slave replication.

So yes, providers other than Galera are possible \o/

E.g. an asynchronous replication provider could already benefit from the parallel applying provided by wsrep :)

Have fun

Erkan

Hi,
I would like to give you an overview of all my talks until June.

  • MySQL@Ceph (Ceph Day Frankfurt ./. 27.02.2014)
    Yes you missed that already :)
  • MySQL: PerformanceSchema (DOAG SIG - MySQL ./. 27.03.2014)
  • Galera Cluster für MySQL (DOAG SIG - MySQL ./. 27.03.2014)
  • Docker++ ./. Containervirtualisierung von Applikationen mit Mehrwert (Linuxtag ./. 08.05.2014)
  • Docker: Not even a Hypervisor (Containers for OpenStack) (Linuxtag ./. 09.05.2014)
  • Docker: LXC Applikationskontainer für jedermann (SLAC ./. 13.05.2014)

There is also a training about LXC and "newer" Linux features (systemd, Upstart, namespaces, ...) I'm giving at the Linuxhotel in early May.

Have fun :)

Erkan

What is it about?

I used to do some benchmark stuff and blogged about it on my blog written in German. I'm going to do testing and benchmarking again :)

We are going to have a look into 'benchmarking' a 3-node Galera Cluster. The application (sysbench) runs on a separate node accessing one node of the cluster. This would be the case in e.g. a VIP setup.

Setup

3 Galera Nodes

  • Virtual machines (OpenStack) provided by teuto.net
  • VCPU: 4
  • RAM: 4GB
  • OS: Centos 6.4-x86_64
  • MySQL-server-5.6.14wsrep25.1-1.rhel6.x86_64
  • galera-25.3.2-1.rhel6.x86_64

Separate sysbench node

  • Same specs as the Galera nodes
  • sysbench 0.5
  • oltp test on 5 tables with 1,000,000 rows each (approx. 1.2GB); see the invocation sketch after this list
  • A run took 60 seconds
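
The sysbench 0.5 invocation probably looked roughly like this (the lua script path, host and thread count are assumptions):

  sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua \
      --mysql-host=galera01 --mysql-user=root \
      --oltp-tables-count=5 --oltp-table-size=1000000 \
      --max-time=60 --max-requests=0 --num-threads=32 prepare
  # the measured runs use the same options with 'run' instead of 'prepare'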

MySQL Config

[mysqld]
user                          = mysql
binlog_format                 = ROW
default-storage-engine        = innodb

innodb_autoinc_lock_mode      = 2
innodb_flush_log_at_trx_commit= 0
innodb_buffer_pool_size       = 2048M
innodb_log_buffer_size        = 128M
innodb_file_per_table         = 1

query_cache_size              = 0
query_cache_type              = 0
bind-address                  = 0.0.0.0

init_file                     = /etc/mysql/init
max_connections               = 2000

# Galera

wsrep_provider                = "/usr/lib64/galera/libgalera_smm.so"
wsrep_cluster_name            = deadcandance
wsrep_cluster_address         = "gcomm://$useyourown"
wsrep_slave_threads           =         # varied during the tests (see below)
wsrep_certify_nonPK           = 1
wsrep_max_ws_rows             = 131072
wsrep_max_ws_size             = 1073741824
wsrep_sst_method              = rsync

Tests

Reminder

We are running in a hypervisor (OpenStack) setup, so testing is in a way not reliable - not only because of the hypervisor: we also don't know how host, storage and network resources are consumed by other users. So treat small variances as statistically irrelevant.

1. Test: We use different settings for wsrep_slave_threads

for i in 1 4 8 16 24 32 48 64; do set wsrep_slave_threads=$i and run; done

[Graph: throughput for different wsrep_slave_threads values]

This surprised me, as in another test I had different results. I'm not sure if it is the oltp test or the "hardware" that makes changing wsrep_slave_threads kind of useless here.

2. Test: Setting gcs.fc_limit to 512 (instead of the default 16)

We could tune the replication part. See Flow Control.
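
The limit can be raised at runtime via the provider options (it can of course also be set permanently in my.cnf):

  node> SET GLOBAL wsrep_provider_options = 'gcs.fc_limit=512';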

[Graph: throughput with gcs.fc_limit=512 compared to the default]

Ah, ok, this helped. And in our setup it is fine to play with that setting. (Yes, there are more - read the link :) But is it true? How did our Flow Control behave? Let's have a look at the WSREP_FLOW_CONTROL_PAUSED status variable:

[Graph: WSREP_FLOW_CONTROL_PAUSED for both runs]

Ok, there you see the cluster wasn't paused that often anymore. But the values are still too high. We are going to have a look at this setting in future tests. Right now it is quite likely the machines simply couldn't catch up.

3. Test

Now we take one of the Galera runs and compare it with:

  • A stand-alone MySQL having the same configuration.
  • A stand-alone MySQL with sync_binlog and innodb_flush_log_at_trx_commit=1 set.
  • A stand-alone MySQL with sync_binlog and innodb_flush_log_at_trx_commit=1 set and Semisynchronous Replication running.

Semisynchronous Replication is often used for HA setups. The argument is to make sure the data is at least on (one) slave. In fact this is wrong (the master can fall back to asynchronous replication, as we will see below). But this is the use case.

[Graph: Galera compared to the stand-alone MySQL setups and Semisynchronous Replication]

  • We see the Galera Replication ‘overhead’
  • We see the performance drop (overhead) needed to get some local storage consistency. But we still see Group Commit doing a good job at scaling.
  • We see the Semisynchronous Replication ‘overhead’

Let's see another graph comparing two different Galera runs with the Semisynchronous Replication run.

[Graph: two different Galera runs compared with the Semisynchronous Replication run]

Make up your own mind.

So:

  • Galera is faster.
  • Galera is virtually synchronous.
  • Galera makes failover implementations easy because of the multi-master technique.

Fake Semisync

Even though it looks like Semisynchronous Replication is good for setups with higher concurrency, let's have a look at the RPL_SEMI_SYNC_MASTER_NO_TX status variable I monitored while doing the test.
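
It can be polled during the run like this (assuming the semisync master plugin is loaded):

  master> SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_no_tx';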

[Graph: RPL_SEMI_SYNC_MASTER_NO_TX during the run]

So it was not Semisynchronous Replication all the time: it switched back to asynchronous replication. So Semisynchronous Replication couldn't keep up with the workload either, and dropping back into asynchronous replication broke the consistency of the data in that setup. That's where Galera trades some performance (still higher than Semisynchronous Replication) to provide this consistency :)

Ok, that's - my friend - the end:

  • We had a simple setup.
  • Different setups, distributions and 'hardware' are going to be used in future tests.
  • If you have some ideas, feel free to ping/mail me.
  • As I'm missing real(tm) hardware, feel free to make me happy by providing access to real hardware :)

Have fun

Erkan :)