# Percona XtraDB Cluster

Developer: Percona · Platform: x86-64 · License: GPLv2 · Website: https://www.percona.com

Percona XtraDB Cluster is a database clustering solution for MySQL. It ensures high availability, prevents downtime and data loss, and provides linear scalability for a growing environment. It integrates Percona Server and Percona XtraBackup with the Codership Galera library of MySQL high availability solutions in a single package that enables you to create a cost-effective MySQL high availability cluster.

## Features of Percona XtraDB Cluster

• Synchronous replication: Data is written to all nodes simultaneously, or not written at all if it fails even on a single node.
• Multi-master replication: Any node can trigger a data update.
• True parallel replication: Multiple threads on the slave perform replication at the row level.
• Automatic node provisioning: You simply add a node and it automatically syncs.
• Data consistency: No more unsynchronized nodes.

Percona XtraDB Cluster is fully compatible with MySQL Server Community Edition, Percona Server, and MariaDB in the following sense:

• Data compatibility: You can use data created by any MySQL variant.
• Application compatibility: No or minimal application changes are required.

A cluster consists of nodes, where each node contains the same set of data synchronized across nodes. The recommended configuration is to have at least 3 nodes, but you can have 2 nodes as well. Each node is a regular MySQL Server instance (for example, Percona Server). You can convert an existing MySQL Server instance to a node and run the cluster using this node as a base. You can also detach any node from the cluster and use it as a regular MySQL Server instance.

Benefits:

• When you execute a query, it is executed locally on the node. All data is available locally; there is no need for remote access.
• No central management. You can lose any node at any point in time, and the cluster will continue to function without data loss.
• Good solution for scaling a read workload. You can direct read queries to any of the nodes.

Drawbacks:

• Overhead of provisioning a new node. A new node has to copy the full data set from one of the existing nodes: if the data set is 100 GB, it copies 100 GB.
• It cannot be used as an effective write-scaling solution. There might be some improvement in write throughput when you direct write traffic to 2 nodes instead of 1, but you cannot expect much: all writes still have to be applied on all nodes.
• You store several duplicates of the data: with 3 nodes, you have 3 copies.

## Restrictions

• Replication currently works only for the InnoDB storage engine. Writes to tables that use another engine are not replicated. However, DDL statements are replicated at the statement level, so changes to the tables in the mysql schema do propagate that way. For example, CREATE USER ... is replicated, but INSERT INTO mysql.user ... is not.
• Unsupported statements: LOCK TABLES / UNLOCK TABLES (essentially impossible in a multi-master scheme) and similar locking functions such as GET_LOCK() and RELEASE_LOCK().
• Query logs cannot be written to a table. If you want to log queries, direct the log to a file.
• The maximum transaction size is determined by the wsrep_max_ws_rows and wsrep_max_ws_size variables. LOAD DATA INFILE commits every 10,000 rows, so large loads are split into a series of smaller transactions.
• Because of cluster-level concurrency control, a transaction issuing COMMIT can still be aborted at that point. If two transactions write to the same row on different nodes, only one of them completes successfully; the other is aborted at the cluster level with Error 1213, SQLSTATE 40001 (ER_LOCK_DEADLOCK). This once again confirms that the cluster scales mainly for heavy reads, not for writes.
• XA transactions are not supported, because they can be rolled back at the commit stage.
• The write throughput of the whole cluster is limited by the weakest node: if one node slows down, it drags the entire cluster down with it. If you need stable performance, back it with adequate hardware.
• The minimum recommended cluster size is 3 nodes.
• enforce_storage_engine=InnoDB is not compatible with wsrep_replicate_myisam=OFF.
• The binlog_rows_query_log_events variable is not supported.
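The transaction-size limits mentioned above are configured in the [mysqld] section. A sketch, with purely illustrative values:

```
[mysqld]
# Illustrative limits; tune to your workload
wsrep_max_ws_rows=131072
wsrep_max_ws_size=1073741824
```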

## Install

Installing on Ubuntu:

gpg --keyserver  hkp://keys.gnupg.net --recv-keys 1C4CBDCDCD2EFD2A
gpg -a --export CD2EFD2A | sudo apt-key add -
sudo apt-get update
sudo apt-get install percona-xtradb-cluster-server-5.5


Sometimes apt complains about unresolvable dependencies. In that case it is enough to install the dependencies first, and then the server package itself.

First, without changing anything in the settings, go to the mysql console (only on the node that will be started first) and add a user for backups:

grant RELOAD, LOCK TABLES, REPLICATION CLIENT, FILE on *.* to backup@localhost identified by 'password';


The problem is that on Ubuntu the sh symlink points to dash, while the cluster startup script is written for bash. If there are no contraindications, you can fix this system-wide:

dpkg-reconfigure dash

Answer "No".


Since the databases will be completely identical after the cluster starts, the system user's password on all nodes will be the same as on the first one. Therefore, copy the /etc/mysql/debian.cnf file from the server that will be started first to the others.

Example configuration:

[mysqld_safe]
wsrep_urls=gcomm://192.168.1.1:3400,gcomm://192.168.1.2:3400,gcomm://

[mysqld]
port=3306
socket=/var/run/mysqld/mysqld.sock
basedir=/usr
user=mysql
log_error=error.log
binlog_format=ROW
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_cluster_name=cluster0
wsrep_sst_method=xtrabackup
wsrep_node_name=node0
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
innodb_buffer_pool_size=5000M
innodb_log_file_size=256M
innodb_log_buffer_size=4M


[client]

port=3306
socket=/var/run/mysqld/mysqld.sock


Check the location of libgalera_smm.so. It is recommended to set wsrep_slave_threads to the number of cores × 4. wsrep_sst_auth specifies the username and password for backups. innodb_buffer_pool_size, innodb_log_file_size, and innodb_log_buffer_size affect performance and are chosen experimentally; in this example, each node has 8 GB of RAM.
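The variables mentioned here but absent from the sample config could be added like this (a sketch: the thread count assumes a 4-core machine, and the credentials are those of the backup user created earlier):

```
[mysqld]
wsrep_slave_threads=16          # cores * 4 on a 4-core machine
wsrep_sst_auth=backup:password  # the user created for backups
```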

To add nodes, append them to the wsrep_urls line (the line should end with an empty gcomm:// entry). All IP addresses occurring in the file (except on the wsrep_urls line) refer to the current node; change them when propagating this file to the other nodes. It is also necessary to change the node name in wsrep_node_name.
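For example, adding a hypothetical third node at 192.168.1.3 would make the line look like this (note the trailing empty gcomm:// entry):

```
[mysqld_safe]
wsrep_urls=gcomm://192.168.1.1:3400,gcomm://192.168.1.2:3400,gcomm://192.168.1.3:3400,gcomm://
```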

In this setup, port 3400 is used for synchronization, port 3500 for transferring the dump, and port 3306 (the standard port) for client connections.

You can run multiple nodes on the same machine by giving them different ports. If you decide to do so, create several configuration files in /etc/mysql and start each instance with a command like:

/usr/bin/mysqld_safe --defaults-file=/etc/mysql/my0.cnf


Note that XtraBackup can only connect via the standard socket /var/run/mysqld/mysqld.sock (it ignores the socket parameters from the config). So in this case use a different SST method: wsrep_sst_method=rsync
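A quick way to produce the per-node config files is to stamp them out from a template. This sketch uses hypothetical paths under /tmp and varies only the client port and node name; each instance is then started with mysqld_safe --defaults-file pointing at its own file:

```shell
#!/bin/sh
# Generate per-node config files from a template (illustrative paths).
dir=/tmp/pxc-demo
mkdir -p "$dir"
cat > "$dir/template.cnf" <<'EOF'
[mysqld]
port=@PORT@
wsrep_node_name=@NAME@
EOF

for i in 0 1 2; do
  # node i gets client port 3306+i and name node<i>
  sed -e "s/@PORT@/$((3306 + i))/" -e "s/@NAME@/node$i/" \
      "$dir/template.cnf" > "$dir/my$i.cnf"
done

grep '^port=' "$dir/my2.cnf"   # → port=3308
```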

And finally, restart the daemon:

sudo service mysql restart

If something went wrong, look at /var/lib/mysql/error.log.
Usually you need to erase /var/lib/mysql/ib_logfile* after changing the log file sizes in the configuration.


Sometimes it is easier to erase the entire /var/lib/mysql/ directory (if the databases contain nothing of value) and recreate the default databases:

mysql_install_db


### Installing on Ubuntu (Percona XtraDB Cluster 5.7)

If you previously had MySQL installed on the server, there might be an AppArmor profile which will prevent Percona XtraDB Cluster nodes from communicating with each other. The best solution is to remove the apparmor package entirely:

sudo apt-get remove apparmor

Fetch the package for configuring Percona software repository:

wget https://repo.percona.com/apt/percona-release_0.1-4.$(lsb_release -sc)_all.deb

Install the downloaded package with dpkg:

sudo dpkg -i percona-release_0.1-4.$(lsb_release -sc)_all.deb

Once you install this package, the Percona repository should be added. You can check the repository configuration in the /etc/apt/sources.list.d/percona-release.list file.

vi /etc/apt/sources.list.d/percona-release.list

Update the local cache:

sudo apt-get update

Install the Percona XtraDB Cluster server package:

sudo apt-get install percona-xtradb-cluster-57

Stop the mysql service:

sudo service mysql stop

Add the following configuration variables to /etc/mysql/my.cnf on the first node:

wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_cluster_name=pxc-cluster
wsrep_node_name=pxc1
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:passw0rd
pxc_strict_mode=ENFORCING
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2

For the second node:

wsrep_node_name=pxc2
wsrep_node_address=192.168.70.62

For the third node:

wsrep_node_name=pxc3


Start the first node using the following command:

sudo /etc/init.d/mysql bootstrap-pxc

To make sure that the cluster has been initialized, run the following:

mysql -uroot -p
show status like 'wsrep%';
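The status output is long; a small script can pull out the values that matter. The sample data below is fabricated for illustration; on a live node you would pipe in the real output of `mysql -e "SHOW STATUS LIKE 'wsrep%'"` instead:

```shell
#!/bin/sh
# Extract key cluster-health values from wsrep status output.
# The variable below holds a fabricated sample for illustration.
wsrep_status='wsrep_local_state_comment Synced
wsrep_cluster_size 3
wsrep_cluster_status Primary
wsrep_ready ON'

# get KEY: print the value for a given wsrep status key
get() { printf '%s\n' "$wsrep_status" | awk -v k="$1" '$1==k {print $2}'; }

echo "state:  $(get wsrep_local_state_comment)"   # → state:  Synced
echo "size:   $(get wsrep_cluster_size)"          # → size:   3
echo "status: $(get wsrep_cluster_status)"        # → status: Primary
```

A healthy, fully initialized node reports wsrep_local_state_comment as Synced, wsrep_cluster_status as Primary, and wsrep_ready as ON.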

Before adding other nodes to your new cluster, create a user for SST and provide necessary privileges for it.

mysql -uroot -p
CREATE USER 'sstuser'@'localhost' IDENTIFIED BY 'passw0rd';
GRANT RELOAD, LOCK TABLES, PROCESS, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
FLUSH PRIVILEGES;

Start the second and third node using the following command:

/etc/init.d/mysql start

## Features of multi-master replication

Multi-master replication means the ability to write to any node in the cluster without worrying about breaking synchronization, as can happen with normal MySQL replication if you accidentally write to the wrong server. This feature has been requested for the last two years, if not longer.

You can write to Percona XtraDB Cluster on any node, and the cluster ensures the consistency of the write: the write either happens on all nodes or on none. For simplicity, this diagram shows the operation for two nodes, but it works in a similar way for any number of nodes:

All statements are executed locally on the node; only COMMIT is processed in a special way. Once the COMMIT command is issued, the transaction must pass certification on all nodes. If it fails, an error is returned. Otherwise, the transaction is applied on the local node.

The response time of COMMIT consists of the following components:

• The network round-trip time
• The certification time
• The local apply time

Two important consequences follow from this architecture:

First, multiple threads can apply events in parallel. This is true parallel replication.

Second, there is a small period of time when a slave is not synchronized with the master. This happens because the master can apply an event faster than the slave. If you read from the slave, you can read data that has not yet changed. This is evident from the diagram. However, this behavior can be altered with the variable wsrep_causal_reads=ON: in that case, a read on the slave waits until the event is applied (which, however, increases the read response time). This phenomenon is the reason why the replication is called "virtually synchronous" rather than truly synchronous.
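To avoid such stale reads on a particular connection, causal reads can be switched on per session (in later Galera versions this variable was superseded by wsrep_sync_wait):

```
SET SESSION wsrep_causal_reads = ON;
-- subsequent SELECTs on this connection wait until the
-- replication queue has been applied before reading
```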

This behavior of the COMMIT command has another important consequence. If writes are performed on two different nodes, the cluster uses an optimistic locking model: a transaction does not check for possible lock conflicts during individual statements, only at the commit stage, so you can get an error on COMMIT. This is important because it is one of the possible incompatibilities with the conventional InnoDB engine: in InnoDB, errors such as DEADLOCK and LOCK TIMEOUT occur in response to individual statements, not to COMMIT. It makes sense to check for errors after the COMMIT statement, although there are applications that do not.

If you plan to take advantage of multi-master and write to several nodes, take care to handle errors on COMMIT.
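One way to handle this is a small retry wrapper around whatever command runs the transaction. The sketch below is generic shell; the commented mysql invocation is hypothetical and assumes the client exits nonzero when the server returns ER_LOCK_DEADLOCK:

```shell
#!/bin/sh
# retry N CMD...: run CMD, retrying up to N attempts on failure.
# Returns 0 on success, 1 if all attempts failed.
retry() {
  max=$1; shift
  n=1
  while ! "$@"; do
    [ "$n" -ge "$max" ] && return 1
    n=$((n + 1))
    sleep 0   # a real application would back off here
  done
  return 0
}

# Hypothetical usage: the whole transaction is replayed on conflict.
# retry 3 mysql -h node1 appdb -e \
#   "START TRANSACTION; UPDATE t SET v=v+1 WHERE id=1; COMMIT;"
```

The important design point is that the whole transaction is retried, not just the COMMIT: after a certification conflict, the transaction has been rolled back and must be executed from the beginning.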

## Load Balancing with HaProxy

Here is an example configuration file HaProxy:

# this config needs haproxy-1.4.20

global
log 127.0.0.1   local0
log 127.0.0.1   local1 notice
maxconn 4096
uid 99
gid 99
daemon
#debug
#quiet

defaults
log     global
mode    http
option  tcplog
option  dontlognull
retries 3
redispatch
maxconn 2000
contimeout      5000
clitimeout      50000
srvtimeout      50000

listen mysql-cluster 0.0.0.0:3306
mode tcp
balance roundrobin
option mysql-check user root

server db01 10.4.29.100:3306 check
server db02 10.4.29.99:3306 check
server db03 10.4.29.98:3306 check


With this configuration, HAProxy will balance the load among the three nodes. However, it only checks whether the mysqld service is listening on port 3306 and does not take the state of the node into account. It may therefore send requests to a node that is in the Disconnected state.
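A common remedy, sketched here under the assumption that the clustercheck script shipped with Percona XtraDB Cluster is exposed via xinetd on port 9200 of every node, is to replace the plain MySQL check with an HTTP check of the node's wsrep state:

```
listen mysql-cluster 0.0.0.0:3306
mode tcp
balance roundrobin
option httpchk

server db01 10.4.29.100:3306 check port 9200 inter 12000 rise 3 fall 3
server db02 10.4.29.99:3306 check port 9200 inter 12000 rise 3 fall 3
server db03 10.4.29.98:3306 check port 9200 inter 12000 rise 3 fall 3
```

With this check, a node that is up but not Synced returns an HTTP error from clustercheck and is taken out of rotation.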