I'm giving some talks this year:

MySQL Hochverfügbar mit Galera

Place: FrOSCon

Topic: Learn about Galera and deploy it using LXC and Ansible.

LBaaS-Loadbalancer as a Service

Place: GUUG Frühjahrsgespräche

Topic: A workshop (together with Jan Walzer and Jörg Jungermann). We are going to show how to use LXC to provide slim load balancers.

Medley der Containertechniken

Place: GUUG Frühjahrsgespräche

Topic: Learn about the basic techniques that container technologies built on the vanilla kernel share (namespaces, cgroups and chroot), and have a look at some of these technologies (LXC, libvirt, systemd-nspawn and Docker).

MySQL Replikation: Von den Anfängen in die Zukunft

Place: DOAG 2014

Topic: Learn about the past and the future of MySQL (and MariaDB) replication.


[UPDATE]

Hands on Docker

Place: CommitterConf.de 

Topic: Let's get started using Docker (workshop)

# Docker - I want to break free!

With Docker you are forced to think about mapping your ports, although most of the time you would rather assign a static IP. Every Docker container does have an IP, but you don’t want to use it:

  • The IP is volatile (so you are likely to get another one on the next start).
  • The IP is not routed, so the container is unreachable via that IP from another host.

Docker containers run in their own namespaces. We are going to provide an IP via iproute2.

Take this container:

> docker ps
CONTAINER ID        IMAGE               COMMAND              NAMES
8fe4e6c72b90        erkan/nginx:v01     /bin/sh -c nginx     angry_mccarthy

But ip netns does not see the network namespace of the container:

> ip netns
>

This is because ip netns manages its network namespaces in /var/run/netns. So we do the following.

Get the PID of the running Docker container:

> docker inspect --format='{{ .State.Pid }}' angry_mccarthy
26420

Link its network namespace into /var/run/netns/ under a name of your choice:

> ln -s /proc/26420/ns/net /var/run/netns/freeme
> ip netns
freeme

Now we create a veth pair, put one end into the network namespace of the container and attach the other end to the (existing) bridge.

> ip link add veth-host type veth peer name veth-freeme
> ip link set veth-freeme netns freeme  
> ip link set veth-host master br0
> ip netns exec freeme ip addr add 192.168.178.123/24 dev veth-freeme
> ip netns exec freeme ip link  set veth-freeme up
> ip link set veth-host up
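
The steps above can be collected in a small helper. This is only a sketch and not part of the original setup: the function name netns_plan is made up, and it just prints the commands so you can review them before piping them to a root shell. It assumes the bridge (br0 here) already exists.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption): print the commands that
# give a running container a routable address. Nothing is executed here.
# Arguments: container PID, namespace name, address with prefix, bridge.
# Note: interface names are limited to 15 characters, so keep the name short.
netns_plan() {
    pid=$1; name=$2; addr=$3; bridge=$4
    cat <<EOF
mkdir -p /var/run/netns
ln -s /proc/$pid/ns/net /var/run/netns/$name
ip link add veth-host type veth peer name veth-$name
ip link set veth-$name netns $name
ip link set veth-host master $bridge
ip netns exec $name ip addr add $addr dev veth-$name
ip netns exec $name ip link set veth-$name up
ip link set veth-host up
EOF
}

# Example with the values used in this post:
netns_plan 26420 freeme 192.168.178.123/24 br0
```

To apply it for real you could feed it the PID from docker inspect and pipe the output to sh as root.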

From now on we can access the container via its IP from another host:

ImNotHere:~$ curl  192.168.178.123
<html>
<head>
<title>Welcome to nginx!</title>
</head>
<body bgcolor="white" text="black">
<center><h1>Welcome to nginx!</h1></center>
</body>
</html>

Done \o/

Have fun, Erkan

P.S.: http://www.opencloudblog.com is a great blog by a former colleague.

Ahoi,

I'm giving an OckerHaterHipster talk at the first Docker Meetup Frankfurt.
It is about hating, of course. But to be frank, it is more about understanding that Docker is not just a replacement. It is rather a rethinking of infrastructure.

Not that you've got to love it of course  :)

Have Fun
Erkan


Ahoi,

There have already been a couple of blog posts about Docker and Galera in the MySQL community. I've got to confess I love both. On the other hand, I don't think this is a good combination at all. Having looked at the blog posts doing Galera with Docker, I'm still not convinced. Here are some of the reasons why.

I assume Galera is already well known in the MySQL community :)

Docker is not just another technique to virtualize


Docker is more than just another way to virtualize. And this may be the biggest point I miss in the other blog posts.

What is the purpose of Docker?


With Docker you build application containers, so a container runs just one application. The overhead of containers compared to hypervisor technologies like KVM, VMware etc. is much lower. And instead of running a full OS in a container, you run just one application. This was first seen with LXC, by the way. But Docker is much more.

Some points (and yes we are missing a lot of important points)

  • Application containers

  • Build your Images via Dockerfile

  • Don't configure your running container

  • If you want to upgrade a container, build a new image.

  • There is a lot more. But this is sufficient for a basic 'rant'

This post is, in a way, about the holy grail of Docker :)

Think of an application container as a binary you run on your system. You don't ssh into your mysql binary, so you don't ssh into your MySQL application container either.

Application containers do get their own IPs, but you should not use them. They are not accessible from outside the host and they are likely to change when you restart a container.

Docker advises you to do port mapping: you map the port of your application (container) to a port on the host. This works on every host and does not rely on an IP.

Just think about Managing Ports. Remember you need to manage at least 4 ports per node (3x) to do it right. Using IPs just

Configuring Docker


Configuring and deploying Docker is reminiscent of the old Golden Image era. You upgrade a container by starting from an upgraded image, not by upgrading the container itself. Images are built via Dockerfiles, which are a kind of makefile for images. Besides fully configured images, you can build images that read environment variables or options at start time.

In the end you start your containers, and after that you don't connect to them and configure them. That's the point, and that's exactly what happened in the blog posts. Dockerfiles were used, but in a half-hearted way. As far as I can tell there was some basic installation, but the authors still had to access the containers (attach, ssh, Ansible) to configure them. This is not the holy grail of Docker.

So at least they got a running Galera Cluster


Yes indeed, and let's not talk about Galera configuration. Let's talk about Docker only.

Let's summarize some fails.

  1. The Galera Cluster is built while the containers are running.
  2. The Galera Cluster depends on local IPs, which you are going to lose when a container restarts. This setup is clearly for playing and testing only.
  3. As the installation depends on local IPs, all containers run on one host. You most likely don't want to run your Galera Cluster on a single host in production.
  4. Remember your datadir is on aufs. Fine for tests indeed.

...

Is it that bad?


As long as you do it for testing and playing around it is fine. But when you want to go into production, you have to rethink your setup.

LXC


I wonder why the bloggers didn't use LXC in the first place. In the end they used Docker like (virtual) standalone nodes. For that, LXC would be the right fit, and it works for Galera (even if it is not hipster at all).

With LXC you have:

  • Fixed IPs
  • Network communication using that IP (multi-host deployment)
  • A datadir on your preferred filesystem

So is there a Happy End for Docker and Galera?


Nope :)

Of course you can do it. But let's have a look at the ports. There are two ways to manage the ports:

  • Do a 1:1 mapping of the container ports to the host. 3306 -> 3306 etc.

But this would not scale at all.

  • Do the port management on your own.

So you have to take care of 12 ports for a 3-node Galera Cluster. If you want to do it, you should also make sure you know the following Galera options:

wsrep_sst_receive_address
gmcast.listen_addr
ist.recv_addr
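
To make the bookkeeping concrete, here is a rough sketch that prints one possible port plan. The four container ports are the usual Galera defaults (3306 for MySQL clients, 4567 for group communication, 4568 for IST, 4444 for SST). The host-port scheme (an offset of 10000 per node) and the function name are pure assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical port plan: 4 ports per node, shifted by 10000 * node number.
# The offset scheme is an assumption, nothing Docker or Galera prescribe.
galera_port_flags() {
    gp_node=$1
    for gp_port in 3306 4444 4567 4568; do
        # Emit one "-p host:container" flag per Galera port.
        printf '%s %d:%d ' '-p' "$((gp_node * 10000 + gp_port))" "$gp_port"
    done
    printf '\n'
}

# Three nodes, four ports each: 12 mappings to keep track of.
for node in 1 2 3; do
    printf 'node%d: ' "$node"
    galera_port_flags "$node"
done
```

Once the host ports differ from the defaults, each node has to be told about them, and that is where the options listed above come in.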

You also have to think about volume management.

On the other hand, that's where Galera helps: upgraded nodes will do an SST. The old data is gone with the old container, and for small to mid-sized installations using Docker this could even work.

Epilogue


In the end everyone can use Docker the way they want to. But then you are not hipster :)

Btw: Maybe service discovery to the rescue?! Let's see if I've got some time to investigate it and present the holy grail of Docker and Galera \o/

Enjoy

Erkan

There are upgrades for our Galera Cluster in the repository:

> yum info galera
Installed Packages
Name        : galera
Arch        : x86_64
Version     : 25.3.2
Release     : 1.rhel6
Size        : 29 M
Repo        : installed
From repo   : mariadb

Available Packages
Name        : galera
Arch        : x86_64
Version     : 25.3.5
Release     : 1.rhel6
Size        : 7.6 M
Repo        : mariadb

> yum info MariaDB-Galera-server
Installed Packages
Name        : MariaDB-Galera-server
Arch        : x86_64
Version     : 5.5.36
Release     : 1.el6
Size        : 102 M
Repo        : installed
From repo   : mariadb

Available Packages
Name        : MariaDB-Galera-server
Arch        : x86_64
Version     : 5.5.37
Release     : 1.el6
Size        : 25 M
Repo        : mariadb

Rolling Upgrade

Let’s do a Rolling Upgrade. This nice feature of Galera allows us to upgrade to the new version without taking the cluster offline.

This is done - like with MySQL NDB Cluster -

Ansible what the f*?!

I just use it to connect to the nodes less interactively. It is not needed at all. The following command connects to all nodes and asks for the MySQL version and the size and state of the cluster.

$ ansible -i cluster.ini galera -a 'mysql -u root -e 
  "select version();SELECT * from INFORMATION_SCHEMA.GLOBAL_STATUS 
  WHERE VARIABLE_NAME  IN (\"wsrep_cluster_size\",\"wsrep_local_state_comment\")"'
galera01 | success | rc=0 >>
version()
5.5.36-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera02 | success | rc=0 >>
version()
5.5.36-MariaDB-wsrep-log
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera03 | success | rc=0 >>
version()
5.5.36-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

As we see we run on a healthy 5.5.36-MariaDB-wsrep cluster.

Next we upgrade one node after another. Starting with the first node:

$ ansible -i cluster.ini galera -l galera01 -a 'yum update -y  MariaDB-Galera-server galera'
galera01 | success | rc=0 >>
[snip]   
Installed:
  galera.x86_64 0:25.3.5-1.rhel6                                                

Updated:
  MariaDB-Galera-server.x86_64 0:5.5.37-1.el6                                   

Replaced:
  galera.x86_64 0:25.3.2-1.rhel6                                                

Complete!

And checking if the cluster is still fine:

$ ansible -i cluster.ini galera -a 'mysql -u root -e 
  "select version();SELECT * from INFORMATION_SCHEMA.GLOBAL_STATUS 
  WHERE VARIABLE_NAME IN (\"wsrep_cluster_size\",\"wsrep_local_state_comment\")"'
galera01 | success | rc=0 >>
version()
5.5.37-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera03 | success | rc=0 >>
version()
5.5.36-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera02 | success | rc=0 >>
version()
5.5.36-MariaDB-wsrep-log
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

Great, we already have one node upgraded. The procedure for the other two nodes is the same. (Always upgrade one node after another and check that each node has upgraded fine.) So it is skipped here and we run the final test:

$ ansible -i cluster.ini galera -a 'mysql -u root -e 
  "select version();SELECT * from INFORMATION_SCHEMA.GLOBAL_STATUS 
  WHERE VARIABLE_NAME IN (\"wsrep_cluster_size\",\"wsrep_local_state_comment\")"'
galera01 | success | rc=0 >>
version()
5.5.37-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera03 | success | rc=0 >>
version()
5.5.37-MariaDB-wsrep
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

galera02 | success | rc=0 >>
version()
5.5.37-MariaDB-wsrep-log
VARIABLE_NAME VARIABLE_VALUE
WSREP_LOCAL_STATE_COMMENT Synced
WSREP_CLUSTER_SIZE  3

That's a damn easy Rolling Upgrade \o/

Epilogue

I generally recommend taking the node you are upgrading out of the proxy first.

echo "disable server galera_server/galera03" | nc -U /var/run/haproxy.sock
# ... upgrade steps for galera03 ...
echo "enable server galera_server/galera03" | nc -U /var/run/haproxy.sock
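
Put together, the per-node procedure could look like the following checklist generator. It is a sketch under assumptions: the function name is made up, the HAProxy backend name galera_server and the inventory cluster.ini are taken from the examples above, and the status check is a shortened variant of the query used earlier. It only prints the commands. Run them node by node and wait for Synced before you re-enable a node and move on.

```shell
#!/bin/sh
# Hypothetical checklist generator (the name is an assumption): print the
# commands for upgrading one node. Nothing is executed here.
upgrade_plan() {
    node=$1
    cat <<EOF
echo "disable server galera_server/$node" | nc -U /var/run/haproxy.sock
ansible -i cluster.ini galera -l $node -a 'yum update -y MariaDB-Galera-server galera'
ansible -i cluster.ini galera -l $node -a 'mysql -u root -e "SHOW STATUS LIKE \"wsrep_local_state_comment\""'
echo "enable server galera_server/$node" | nc -U /var/run/haproxy.sock
EOF
}

# One node after another, never in parallel.
for n in galera01 galera02 galera03; do
    upgrade_plan "$n"
done
```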

Epilogue 2

Works with PXC too ;)

Update: sed /enable/disable/

It is no news that Galera is the synchronous multi-master solution for MySQL.
But Galera is just a provider.

What do you mean by provider?

Remember configuring “Galera”?
There are two options
wsrep_provider and wsrep_provider_options
These define the provider (Galera) and the options for the provider.
The rest of the wsrep_ options are for wsrep.

https://launchpad.net/wsrep

wsrep API defines a set of application callbacks and replication library calls necessary to implement synchronous writeset replication of transactional databases and similar applications. It aims to abstract and isolate replication implementation from application details. Although the main target of this interface is a certification-based multi-master replication, it is equally suitable for both asynchronous and synchronous master/slave replication.

So yes, other providers than galera are possible \o/

E.g. an asynchronous replication provider could already benefit from the parallel applying provided by wsrep :)

Have fun

Erkan

Hi,
I would like to give you an overview of all my talks until June.

  • MySQL@Ceph (Ceph Day Frankfurt ./. 27.02.2014)
    Yes you missed that already :)
  • MySQL: PerformanceSchema (DOAG SIG - MySQL ./. 27.03.2014)
  • Galera Cluster für MySQL (DOAG SIG - MySQL ./. 27.03.2014)
  • Docker++ ./. Containervirtualisierung von Applikationen mit Mehrwert (Linuxtag ./. 08.05.2014)
  • Docker: Not even a Hypervisor (Containers for OpenStack) (Linuxtag ./. 09.05.2014)
  • Docker: LXC Applikationskontainer für jedermann (SLAC ./. 13.05.2014)

There is also a training on LXC and “newer” Linux features (systemd, Upstart, namespaces...) that I'm giving at the Linuxhotel in early May.

Have fun :)

Erkan

What is it about?

I used to do some benchmarking and wrote about it on my German-language blog. I'm going to do tests and benchmarks again :)

We are going to have a look at ‘benchmarking’ a 3-node Galera Cluster. The application (sysbench) runs on a separate node and accesses one node of the cluster. This would be the case in, e.g., a VIP setup.

Setup

3 Galera Nodes

  • Virtual machines (OpenStack) provided by teuto.net
  • VCPU: 4
  • RAM: 4GB
  • OS: Centos 6.4-x86_64
  • MySQL-server-5.6.14wsrep25.1-1.rhel6.x86_64
  • galera-25.3.2-1.rhel6.x86_64

Separate sysbench node

  • Same specs as the Galera nodes
  • sysbench 0.5
  • oltp test on 5 tables 1000000 rows each (ca. 1.2GB)
  • A run took 60 seconds

MySQL Config

[mysqld]
user                          = mysql
binlog_format                 = ROW
default-storage-engine        = innodb

innodb_autoinc_lock_mode      = 2
innodb_flush_log_at_trx_commit= 0
innodb_buffer_pool_size       = 2048M
innodb_log_buffer_size        = 128M
innodb_file_per_table         = 1

query_cache_size              = 0
query_cache_type              = 0
bind-address                  = 0.0.0.0

init_file                     = /etc/mysql/init
max_connections               = 2000

# Galera

wsrep_provider                = "/usr/lib64/galera/libgalera_smm.so"
wsrep_cluster_name            = deadcandance
wsrep_cluster_address         = "gcomm://$useyourown"
wsrep_slave_threads           = 
wsrep_certify_nonPK           = 1
wsrep_max_ws_rows             = 131072
wsrep_max_ws_size             = 1073741824
wsrep_sst_method              = rsync

Tests

Reminder

We are running in a hypervisor (and OpenStack) setup, so the tests are not fully reliable. This is not only because of the hypervisor: we also don’t know how the host, storage and network resources are consumed by other users. So small variances are statistically irrelevant.

1. Test: We use different settings for wsrep_slave_threads

for i in 1 4 8 16 24 32 48 64; do set wsrep_slave_threads=$i and run; done
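
Spelled out a bit more, the loop could look like the sketch below. It only prints what to run. The sysbench options are the ones I remember from sysbench 0.5 and the values mirror the setup above. $OLTP_LUA (path to the oltp.lua script) and $GALERA_NODE are placeholders, and credentials and the number of sysbench threads are left out on purpose, so treat all of it as an assumption rather than the exact invocation used for the graphs.

```shell
#!/bin/sh
# Hypothetical plan for test 1: for each wsrep_slave_threads value print
# the statement to apply and the benchmark call. Nothing is executed here.
# $OLTP_LUA and $GALERA_NODE are placeholders you have to set yourself.
bench_plan() {
    for threads in 1 4 8 16 24 32 48 64; do
        cat <<EOF
mysql -u root -e "SET GLOBAL wsrep_slave_threads = $threads"   # on every Galera node
sysbench --test=\$OLTP_LUA --oltp-tables-count=5 --oltp-table-size=1000000 --mysql-host=\$GALERA_NODE --max-time=60 --max-requests=0 run
EOF
    done
}

bench_plan
```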

[Graph: galera compared]

This surprised me, as I had different results in another test. I'm not sure whether it is the oltp test or the “hardware” that makes changing wsrep_slave_threads more or less useless here.

2. Test: Setting gcs.fc_limit to 512 (instead of the default 16)

We could tune the replication part. See Flow Control.
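
For reference, a minimal sketch of how the limit can be changed at runtime and how to read the pause counter afterwards. The function name is made up. The two statements are standard Galera ones, and the output can be piped into the mysql client on each node.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption): print the SQL that sets
# the flow control limit and reads how long replication was paused.
# wsrep_flow_control_paused is the fraction of time replication was paused
# since the last FLUSH STATUS.
fc_statements() {
    limit=$1
    cat <<EOF
SET GLOBAL wsrep_provider_options = 'gcs.fc_limit=$limit';
SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_paused';
EOF
}

# Value used in this test; apply with: fc_statements 512 | mysql -u root
fc_statements 512
```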

[Graph: galera flow_control]

Ah, ok, this helped. And in our setup it is fine to play with these settings. (Yes, there are more. Read the link :) But is it true? To see how Flow Control behaved, let's have a look at the WSREP_FLOW_CONTROL_PAUSED status variable:

[Graph: galera flow_control]

Ok, there you see the cluster wasn’t paused that often anymore. But the values are still too high. We are going to have a look at this setting in future tests. Right now it is quite likely that the machines couldn’t catch up.

3. Test

Now we take one of the Galera runs and compare them with:

  • A standalone MySQL with the same configuration.
  • A standalone MySQL with sync_binlog and innodb_flush_log_at_trx_commit=1 set.
  • A standalone MySQL with sync_binlog and innodb_flush_log_at_trx_commit=1 set and Semisynchronous Replication running.

Semisynchronous Replication is often used for HA setups. The argument is that it makes sure the data is on at least one slave. In fact this is wrong. But this is the use case.

[Graph: galera flow_control]

  • We see the Galera Replication ‘overhead’
  • We see the performance drop (overhead) to get some local storage consistency. But still we see Group Commit doing a good job in scaling.
  • We see the Semisynchronous Replication ‘overhead’

Let's see another graph comparing two different Galera runs with the Semisynchronous Replication run.

[Graph: galera flow_control]

Make up your own mind.

So

  • Galera is faster.
  • Galera is virtually synchronous.
  • Galera allows easy failover implementations because of the multi-master technique.

Fake Semisync

It may look as if Semisynchronous Replication is good for setups with higher concurrency. But let's have a look at the RPL_SEMI_SYNC_MASTER_NO_TX status variable I monitored while doing the test.

[Graph: galera flow_control]

So it was not Semisynchronous Replication all the time. It switched back to asynchronous replication, which means Semisynchronous Replication couldn’t keep up with the workload either. Dropping back to asynchronous replication broke the consistency of the data in the cluster. That's where Galera reduces the performance (still higher than Semisynchronous Replication) to provide this consistency :)
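
If you want to watch for this fallback yourself, the following statements are a reasonable starting point. Again just a sketch with a made-up function name. The status and variable names are the ones of the standard semisync master plugin.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption): print the SQL to check
# whether the master is still semisynchronous.
# Rpl_semi_sync_master_status is OFF once the master fell back to async,
# Rpl_semi_sync_master_no_tx counts commits not acknowledged by a slave,
# rpl_semi_sync_master_timeout is the wait (in ms) before falling back.
semisync_check_sql() {
    cat <<'EOF'
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_status';
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_no_tx';
SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync_master_timeout';
EOF
}

# Apply with: semisync_check_sql | mysql -u root
semisync_check_sql
```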

OK, that's, my friend, the end

  • We had a simple setup.
  • Different setups, distributions and ‘hardware’ are going to be used in future tests.
  • If you have any ideas, feel free to ping/mail me.
  • As I'm missing real(tm) hardware, feel free to make me happy by providing access to some :)

Have fun

Erkan :)

Galera Phrases


I confess I am a Galera fanboy. This post is going to present two slogans about Galera reminding you about Galera “limitations”. This is for the sake of user experience:)

Galera Phrases:

  • One Cluster
  • Replication

One Cluster

Think about Galera as One Cluster. As every transaction is committed virtually synchronously, the slowest node determines the (DML) speed of the cluster. This is true regarding the network too.

Replication

  • Galera is still some kind of replication. So remember to provide a primary key (PK) on every table, just as for (ROW based) replication. There is a feature request of mine for an option that enforces creating tables with a PK. Please vote for it :)

  • With traditional MySQL Replication the slave might lag. This is not hurting the performance of the master. With Galera there is no lag. So if applying takes time it stalls the cluster (flow control). DDL statements run in Total Order Isolation (TOI) as a default. This will not only block the table. The whole cluster stalls till the DDL finishes.
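
To find the tables that still lack a primary key, a query against information_schema is usually enough. This is a sketch only: the function name is made up and the list of excluded system schemas is an assumption you may want to adjust.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption): print a query that lists
# all base tables without a PRIMARY KEY constraint.
tables_without_pk_sql() {
    cat <<'EOF'
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
  AND c.constraint_name IS NULL;
EOF
}

# Apply with: tables_without_pk_sql | mysql -u root
tables_without_pk_sql
```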

Of course there is much more to tell. These two slogans are just a result of my experience with customers etc.

Have fun

Erkan

Regarding virtualization I am an LXC guy. Nevertheless, Docker has gained a lot of attention and I would like to show how to use MySQL with Docker.

What is Docker?

In fact Docker is a wrapper around LXC. It is fun to use. Docker has the philosophy of virtualizing single applications using LXC. So in our example we are going to start a mysqld in a chroot environment encapsulated in its own namespaces. (You can even set cgroup resources.) One of the main points regarding Docker is the usage of a union filesystem (aufs). So when you start a Docker container it gets its own aufs mount and only changed data is written.

Aufs is great for a lot of applications and sufficient for Database testing. I just want to share a simple - more educational, than effective - Dockerfile. Dockerfiles are the buildscripts for the Docker images.

Let's have a look at the Dockerfile:

FROM ubuntu
MAINTAINER erkan yanar <erkan.yanar@linsenraum.de>

ENV DEBIAN_FRONTEND noninteractive
RUN apt-get install -y  python-software-properties
RUN apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xcbcb082a1bb943db
RUN add-apt-repository 'deb http://mirror2.hs-esslingen.de/mariadb/repo/10.0/ubuntu precise main'
RUN apt-get update
RUN apt-get install -y mariadb-server
RUN echo "[mysqld]"                       >/etc/mysql/conf.d/docker.cnf
RUN echo "bind-address   = 0.0.0.0"      >>/etc/mysql/conf.d/docker.cnf
RUN echo "innodb_flush_method = O_DSYNC" >>/etc/mysql/conf.d/docker.cnf
RUN echo "skip-name-resolve"             >>/etc/mysql/conf.d/docker.cnf
RUN echo "init_file = /etc/mysql/init"   >>/etc/mysql/conf.d/docker.cnf
RUN echo "GRANT ALL ON *.* TO supa@'%' IDENTIFIED BY 'supa';" >/etc/mysql/init

EXPOSE 3306
USER mysql
ENTRYPOINT mysqld

You should change it the way you like. If you understand it, go on and optimize it, e.g. reduce the number of RUN stages :)
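
As a hint for the RUN stages: the same commands can be chained into a single RUN. The sketch below wraps that variant in a small shell function (the function name is made up) so it can be piped straight into docker build. The commands themselves are unchanged and kept in the original order, so whatever applies to the Dockerfile above applies here too.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption): print a variant of the
# Dockerfile above with all RUN stages merged into one.
print_dockerfile() {
    cat <<'EOF'
FROM ubuntu
MAINTAINER erkan yanar <erkan.yanar@linsenraum.de>

ENV DEBIAN_FRONTEND noninteractive
RUN apt-get install -y python-software-properties \
 && apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xcbcb082a1bb943db \
 && add-apt-repository 'deb http://mirror2.hs-esslingen.de/mariadb/repo/10.0/ubuntu precise main' \
 && apt-get update \
 && apt-get install -y mariadb-server \
 && echo "[mysqld]"                       >/etc/mysql/conf.d/docker.cnf \
 && echo "bind-address   = 0.0.0.0"      >>/etc/mysql/conf.d/docker.cnf \
 && echo "innodb_flush_method = O_DSYNC" >>/etc/mysql/conf.d/docker.cnf \
 && echo "skip-name-resolve"             >>/etc/mysql/conf.d/docker.cnf \
 && echo "init_file = /etc/mysql/init"   >>/etc/mysql/conf.d/docker.cnf \
 && echo "GRANT ALL ON *.* TO supa@'%' IDENTIFIED BY 'supa';" >/etc/mysql/init

EXPOSE 3306
USER mysql
ENTRYPOINT mysqld
EOF
}

# Build with: print_dockerfile | docker build -t mysql -
print_dockerfile
```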

Let's quickly build our image (named mysql):

> cat $DOCKERFILENAME | docker build -t mysql -

Great! Just for fun, let’s start 51 containers:

> time for i in $(seq 10 60 ) ; do docker  run -d -p 50$i:3306   mysql ; done                                                              
..      
real    0m27.446s
user    0m0.264s
sys     0m0.211s

All on my laptop. Think about the performance using KVM :)

>  docker ps | grep mysqld |wc -l 
51
> docker ps | head -2
CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS              PORTS                    NAMES
6d3a5181cd56        mysql:latest        /bin/sh -c mysqld   About a minute ago   Up About a minute   0.0.0.0:5060->3306/tcp   lonely_pare

Have fun \o/
Erkan