Docker is kinda awesome, as it unleashes a lot of creativity and lets
us rethink infrastructure. Think about upgrading an application (a Docker container): we just stop the old container and start the new one. Rollback is as easy as stopping the new container and starting again from the old image.

Let’s have a look at nginx. Within the Docker ecosystem, backends come and go, so you profit from writing the nginx configuration in a dynamic way, most likely using confd or consul-template.

After that you stop the container and start a new one from the image.

Kinda silly!

Why?

Sending nginx a SIGHUP would have told it to simply reread the configuration, spawning new worker processes without stopping the server.

Nginx even has a nice trick for upgrades: on SIGUSR2, nginx spawns a new master process running the new binary.

In a standard Docker workflow you don’t use these features.

Regards,

DockerHipster \o/

Galera for Mesos

This time it is basically just a link to another blog post: Galera on Mesos


OK, as a matter of fact I was kinda involved, even though the work was done by Stefan all alone.
We met for a day in a coworking space and discussed Galera and Mesos.

In the end Stefan produced this incredible blogpost and pushed Mesos forward.

What’s the fun about this post?

We already know Galera is the standard in a lot of architectures, for example OpenStack.
Doing consulting work for Docker, I also encourage using Galera for the infrastructures Docker runs on.

Mesos easily runs on 1000+ nodes. It has a nice abstraction of nodes and frameworks. Companies like Airbnb, PayPal, eBay and Groupon use Mesos. Having a Galera PoC for Mesos makes it likely that MySQL (etc.) becomes a native part of Mesos installations.

There was also another customer I was allowed to help deploy Galera on CoreOS \o/.

In any case, I plan to help deploy Galera on other cluster or multi-node solutions for Docker too :)

Stay tuned, as there is a plan to have a little series about Galera@Docker on Codership's website too.

Have Fun
Erkan Yanar

Using master-master for MySQL? To be frank, we need to get rid of that architecture. We are skipping the active-active setup and will show why master-master is the wrong decision even for failover.

So why does a DBA think master-master is good in a failover scenario?

  • The recovered node gets its data automatically.
  • You don’t need to use a backup for recovery.

Please remember: MySQL Replication is async

Again: MySQL Replication is async. Even the so-called semi-sync replication!
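Semi-sync only makes the master wait until one slave has received (not applied!) the transaction, and when no slave acknowledges within the timeout, it silently falls back to plain asynchronous replication. A configuration sketch, assuming the semisync plugins are installed:

```
[mysqld]
# master side
rpl_semi_sync_master_enabled = 1
# fall back to async after 10s without a slave ACK
rpl_semi_sync_master_timeout = 10000
# slave side
rpl_semi_sync_slave_enabled  = 1
```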

So the following scenario is quite likely.

See a nice master-master setup:

 active                standby
+------+     c        +------+
|      |------------->|      |
|abcd  |              |ab    |
|      |              |      |
|      |<-------------|      |
+------+              +------+

Oh my god the node went down:

 RIP                   active
+------+              +------+
|      |-----||------>|      |
|abcd  |              |abc   |
|      |              |      |
|      |<----||-------|      |
+------+              +------+

No problem, we’ve got master-master. After the takeover the recovering node catches up. (In fact it even has one transaction more that never made it to the other node :( )

recovered              active
+------+              +------+
|      |------------->|      |
|abcd  |              |abce  |
|      |      e       |      |
|      |<-------------|      |
+------+              +------+

Great, our data is not in sync anymore!

recovered             active
+------+              +------+
|      |------------->|      |
|abcde |              |abce  |
|      |              |      |
|      |<-------------|      |
+------+              +------+

In fact there is no need for master-master anyway. We’ve got GTID nowadays. Use simple replication. In a failover you can use GTID to check whether there are extra transactions on the recovering node.

If not, simply set up replication and you get all the missing data.

But if there are extra transactions on the recovering node, you have to rebuild the node anyway.
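With GTID@MariaDB such a failover check could look like the following sketch (host names and positions are made up; whether slave_pos or current_pos is the right choice depends on your setup):

```
-- compare the recovering node with the new active master:
new-master> SELECT @@gtid_binlog_pos;
recovering> SELECT @@gtid_binlog_pos;

-- no extra transactions on the recovering node? Then just attach it:
recovering> CHANGE MASTER TO MASTER_HOST='new-master', MASTER_USE_GTID=current_pos;
recovering> START SLAVE;
```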

FYI: This works with GTID@MariaDB and GTID@MySQL.

Welcome to the GTID era! \o/

Viel Spaß

Erkan :)

As the systemd integration keeps moving forward, it can hardly be ignored by us MySQL folks :)

Let’s have a look at a simple problem you are not going to solve the way you used to. (At least on CentOS 7 installing the MariaDB package.)

Increasing table_open_cache used to be just a configuration issue, as mysqld was started as root and then switched to the unix user mysql. On CentOS 7 this does not work anymore, because MariaDB/MySQL is started via a service file that starts the process as user mysql:

[Service]
Type=simple
User=mysql
Group=mysql
..

Not being root (and so lacking the needed capability), it is not able to raise the open files limit. In the error log you will find something like:

150303 11:57:02 [Warning] Changed limits: max_open_files: 1024  
 max_connections: 214  table_cache: 400

Reading the service file, you already get the hint to use LimitNOFILE to configure your open files setting. Quite nice, but who usually reads a service file? :)

And please do as recommended and create a file like /etc/systemd/system/mariadb.service.d/limits.conf. Don’t change the service file in the /usr/lib/systemd/system directory: all files there belong to the package maintainer, and changes are likely to be lost on the next upgrade. systemd lets you make your personal changes under /etc/systemd/system/.
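Such a drop-in could look like this (the limit value is only an example):

```
# /etc/systemd/system/mariadb.service.d/limits.conf
[Service]
LimitNOFILE=65535
```

Afterwards run systemctl daemon-reload and restart the service so the new limit is picked up.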

Viel Spaß

Erkan

PS: The concrete service file execs /usr/bin/mysqld_safe. IMHO this is kinda wrong. I haven’t checked the service files from Percona and MySQL though :)

PPS: Thx to Dirk Deimeke for having a look at the problem :)

There will be a MySQL & Friends Devroom at the FOSDEM 2015 this year.
And surprisingly my talk - not sure you can call 15 min a talk - has been accepted.
(There are other accepted talks of course.) 

It will be a dense talk about Docker/MySQL/Galera :)

We (at least I) are going to have fun \o/
Erkan

I hope there will be an official announcement too.
 

Using GTID to attach an asynchronous slave sounds promising. Let’s have a look at the two existing GTID implementations and their integration with Galera.

GTID@MariaDB

There is one GTID used by the whole cluster, and every node increments the common seqno by itself. This works well as long as all transactions are replicated by Galera (simplified: InnoDB only), because Galera takes care of the commit order of transactions on all nodes. So, having identical GTIDs/seqnos from the start, there are no problems.

  node1> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-504 |
  +-----------------+---------+

  node2> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-504 |
  +-----------------+---------+

  node3> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-504 |
  +-----------------+---------+

But think about a DML statement not replicated by Galera. Let’s assume we write into a MyISAM/MEMORY table on node1. Then only the seqno of node1 is increased:

  node1> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-505 |
  +-----------------+---------+

  node2> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-504 |
  +-----------------+---------+

  node3> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-504 |
  +-----------------+---------+

Galera does not care about the different seqno on the hosts. The next transaction replicated by Galera increases the seqno of all nodes by 1:

  node1> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-506 |
  +-----------------+---------+

  node2> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-505 |
  +-----------------+---------+

  node3> show global variables like 'gtid_binlog_pos';
  +-----------------+---------+
  | Variable_name   | Value   |
  +-----------------+---------+
  | gtid_binlog_pos | 0-1-505 |
  +-----------------+---------+

So we have a different mapping between GTID@MariaDB and GTID@Galera on each host. This is far from optimal.

In that situation, think about a slave switching the master. You will get one of two outcomes:

  • Losing transactions
  • Reapplying transactions

If you are lucky, replication simply fails, as the data is inconsistent.

So it is up to you to make sure the cluster only sees DML that is replicated by Galera.

GTID@MySQL/Percona

The MySQL-Galera integration looks different:

GTID@MySQL combines the server_uuid with a seqno to build its GTID. Galera plays a little trick, using a separate server_uuid for transactions written/replicated by Galera. All other transactions use the original server_uuid of the server.

Let’s have a look on one node:

 node2> show global variables like 'gtid_executed';
 +---------------+-------------------------------------------------+
 | Variable_name | Value                                           |
 +---------------+-------------------------------------------------+
 | gtid_executed | 6d75ac01-ed37-ee1b-6048-592af289b902:1-10,
 933c5612-12c8-11e4-82d2-00163e014ea9:1-6 |
 +---------------+-------------------------------------------------+

6d75ac01-ed37-ee1b-6048-592af289b902 is the server_uuid for Galera. 933c5612-12c8-11e4-82d2-00163e014ea9 is the server_uuid for all other transactions.

So let’s write into an InnoDB table:

node2> show global variables like 'gtid_executed';
+---------------+--------------------------------------------------+
| Variable_name | Value                                            |
+---------------+--------------------------------------------------+
| gtid_executed | 6d75ac01-ed37-ee1b-6048-592af289b902:1-11,
933c5612-12c8-11e4-82d2-00163e014ea9:1-6 |
+---------------+--------------------------------------------------+

node1>  show global variables like 'gtid_executed';
+---------------+--------------------------------------------------+
| Variable_name | Value                                            |
+---------------+--------------------------------------------------+
| gtid_executed | 6c7225e2-12cc-11e4-8497-00163e5e2a58:1-2,
6d75ac01-ed37-ee1b-6048-592af289b902:1-11 |
+---------------+--------------------------------------------------+

And into a MyISAM table:

node2> show global variables like 'gtid_executed';
+---------------+------------------------------------------------------+
| Variable_name | Value                                                |
+---------------+------------------------------------------------------+
| gtid_executed | 6d75ac01-ed37-ee1b-6048-592af289b902:1-11,
933c5612-12c8-11e4-82d2-00163e014ea9:1-7 |
+---------------+------------------------------------------------------+

So, because of this distinction between the writesets, all Galera data has the same GTID@MySQL-to-GTID@Galera mapping on every node.

There is still the possibility that replication breaks because of the non-Galera transactions though ;)

Résumé

As long as all your data is handled by Galera replication, both GTID integrations should work fine. But at the latest when you do a rolling upgrade (mysql_upgrade writes into system tables locally, without Galera replication), you are lost with GTID@MariaDB.

For GTID@MariaDB, IMHO it would be a good idea to have a separate domain id for Galera only, and the ability to filter on that domain id on the slave.

Viel Spaß

Erkan :)

Galera, the synchronous master-master replication, is quite popular. It is used by Percona XtraDB Cluster and MariaDB Galera Cluster, and even patched MySQL binaries exist. Quite often you want to add a slave to a Galera cluster. This works quite well. All you need is to configure at least log_bin, log_slave_updates and server_id on the designated masters and attach your slave.

GTID@MariaDB


You can even use traditional (non-GTID) replication, but that is a hassle: on a master switch you need to search for the right offset on the new master to attach your slave to.

Using GTID promises to be easier: you simply switch to the new master, and the replication framework finds the new position based on the GTID automatically.

In fact we have two GTID implementations:

  • GTID@MySQL/Percona
  • GTID@MariaDB

There are already blogposts about attaching a slave to a Galera cluster without using GTID, and even one using GTID@MySQL.

We are going to provide a post using GTID@MariaDB :)

We assume there is already a running Galera cluster. Building one has already been explained elsewhere:

Both are quite similar :D

In contrast to that blogpost, please use wsrep_sst_method=xtrabackup-v2. The current MariaDB release 10.0.12-MariaDB-1~trusty-wsrep-log has a bug preventing you from using wsrep_sst_method=rsync.

Additional Configuration on the Galera nodes


[mysqld]
log_bin
log_slave_updates
server_id        = 1

log_bin activates the binlog, while log_slave_updates makes sure all transactions replicated via Galera are written into that binlog. For didactic purposes we set server_id to the same value on all Galera nodes.

Configuring the Slave


[mysqld]
binlog_format      = ROW
log_bin
log_slave_updates
server_id          = 2

GTID@MariaDB still lacks good operational integration, so provisioning the slave is done without GTID.

There is a fix for mysqldump, and a patch for xtrabackup exists.
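As a sketch, provisioning the slave could look like this (host name and credentials are made up):

```
# on the slave: point replication at one Galera node first
mysql -e "CHANGE MASTER TO MASTER_HOST='galera-node1',
          MASTER_USER='replication', MASTER_PASSWORD='secret'"

# dump from that node; --master-data=1 embeds the matching
# CHANGE MASTER TO MASTER_LOG_FILE/MASTER_LOG_POS into the dump
mysqldump -h galera-node1 --all-databases --single-transaction \
          --master-data=1 > dump.sql

mysql < dump.sql
mysql -e "START SLAVE"
```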

Attaching a Slave


MariaDB, in contrast to MySQL, always replicates the GTID. So we first attach the slave using mysqldump (--master-data) and get a running replication:

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.3.93
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mysqld-bin.000002
          Read_Master_Log_Pos: 537
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 536
        Relay_Master_Log_File: mysqld-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 537
              Relay_Log_Space: 834
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
             Master_Server_Id: 1
                   Using_Gtid: No
                  Gtid_IO_Pos:

Switching to replication using GTID is quite simple:

slave> stop slave;
slave> change master to master_use_gtid=slave_pos;
slave> start slave;
slave> show slave status\G

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.3.93
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mysqld-bin.000002
          Read_Master_Log_Pos: 873
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 694
        Relay_Master_Log_File: mysqld-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 873
              Relay_Log_Space: 992
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
        Seconds_Behind_Master: 0
             Master_Server_Id: 1
                   Using_Gtid: Slave_Pos
                  Gtid_IO_Pos: 0-1-3

Check out the last two lines :)

Failover


On failover we attach the slave to another Master.

 slave> STOP SLAVE;
 slave> CHANGE MASTER TO MASTER_HOST="$NEW_HOST";
 slave> START SLAVE;

Check Master_Host and be excited: we don't need to care about Master_Log_File and Master_Log_Pos.

 slave> show slave status\G
 *************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 10.0.3.189
                   Master_User: replication
                   Master_Port: 3306
                 Connect_Retry: 10
               Master_Log_File: mysqld-bin.000007
           Read_Master_Log_Pos: 77357
                Relay_Log_File: mysqld-relay-bin.000002
                 Relay_Log_Pos: 46594
         Relay_Master_Log_File: mysqld-bin.000007
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
           Exec_Master_Log_Pos: 77357
               Relay_Log_Space: 46892
               Until_Condition: None
         Seconds_Behind_Master: 0
              Master_Server_Id: 1
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-504

GTID promises to make things much easier. But in fact I don't like the way GTID@MariaDB works with Galera replication.

I'm gonna tell you why in my next blog post :)

Viel Spaß

Erkan :)

I'm giving some talks this year:

MySQL Hochverfügbar mit Galera

Location: FrOSCon

About: Learn about Galera and deploy it using LXC and Ansible

LBaaS-Loadbalancer as a Service

Place: GUUG Frühjahrsgespräche

Topic: It is a workshop ( together with Jan Walzer and Jörg Jungermann). We are going to show how to use LXC to provide slim loadbalancers.

Medley der Containertechniken

Place: GUUG Frühjahrsgespräche

Topic: Learn about the basic techniques all vanilla container technologies use/share (namespaces, cgroups and chroot). Have a look at some of them (LXC, libvirt, systemd-nspawn and Docker).

MySQL Replikation: Von den Anfängen in die Zukunft

Place: DOAG 2014

Topic: Learn about the past and the future of MySQL (and MariaDB) replication.


[UPDATE]

Hands on Docker

Place: CommitterConf.de 

Topic: Let's get started using Docker (workshop)

Docker - I want to break free!

Using Docker you are forced to think about mapping your ports. But sometimes you would like to assign a static IP. Even though every Docker container has an IP, you don’t want to use it:

  • The IP is volatile (it is likely you get another one on the next start)
  • The IP is not routed, so the container is unreachable via that IP from another host.

Docker containers run in their own network namespace. We are going to provide an IP via iproute2.

Taking this Container:

> docker ps
CONTAINER ID        IMAGE               COMMAND              NAMES
8fe4e6c72b90        erkan/nginx:v01     /bin/sh -c nginx     angry_mccarthy

But ip netns does not see the network namespace of the container:

> ip netns
>

This is because ip netns manages its network namespaces via /var/run/netns. So we do the following.

Get the PID of the running Docker container:

> docker inspect --format='{{ .State.Pid }}' angry_mccarthy
26420

Symlink its network namespace into /var/run/netns under a name we like:

> ln -s /proc/26420/ns/net /var/run/netns/freeme
> ip netns
freeme

Now we create a veth pair, put one end into the network namespace of the container, and attach the other end to the (given) bridge:

> ip link add veth-host type veth peer name veth-freeme
> ip link set veth-freeme netns freeme  
> ip link set veth-host master br0
> ip netns exec freeme ip addr add 192.168.178.123/24 dev veth-freeme
> ip netns exec freeme ip link  set veth-freeme up
> ip link set veth-host up

From now on we can reach the container via this IP from another host:

ImNotHere:~$ curl  192.168.178.123
<html>
<head>
<title>Welcome to nginx!</title>
</head>
<body bgcolor="white" text="black">
<center><h1>Welcome to nginx!</h1></center>
</body>
</html>
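One caveat: the symlink under /var/run/netns points at the container's /proc entry, so once the container stops it goes stale. The veth pair disappears together with the namespace, but the symlink has to be removed by hand:

```
> rm /var/run/netns/freeme
```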

Done \o/

Viel Spaß Erkan

P.S.: http://www.opencloudblog.com is a great blog by a former colleague.

Ahoi,

I'm giving a DockerHaterHipster talk at the first Docker Meetup Frankfurt.
It is about hating, of course. But to be frank, it is more about understanding that Docker is not just a replacement; it is a rethinking of infrastructure.

Not that you've got to love it of course  :)

Have Fun
Erkan