MySQL 5.6 Replication
Enhancing Scalability and Availability: A Tutorial
Table of Contents
1 Introduction
2 Configuring MySQL Replication
    Configure the master & slave cnf files
    Create Replication User
    Lock the Master, Note Binlog Position and Backup Master Database
    Load the Dump File on the Slave
    Initialize Replication
    Basic Checks
        Checking Replication Status
        Suspending Replication
        Viewing Binary Logs
1 Introduction
MySQL Replication enables users to cost-effectively deliver application performance, scalability and high availability. Many of the world's most trafficked web properties, such as eBay, Facebook, Tumblr, Twitter and YouTube, rely on MySQL Replication to elastically scale out beyond the capacity constraints of a single instance, enabling them to serve hundreds of millions of users and handle exponential growth.

By mirroring data between instances, MySQL Replication is also the most common approach to delivering High Availability (HA) for MySQL databases. In addition, the MySQL replication utilities can automatically detect and recover from failures, allowing users to maintain service in the event of outages or planned maintenance.

With the release of MySQL 5.6, a number of enhancements have been made to MySQL Replication, delivering higher levels of data integrity, performance, automation and application flexibility.

This paper provides a simple step-by-step guide on how to install and configure a master/slave topology, as well as handle failover events. It demonstrates how easy it is to provision and scale new services using Global Transaction IDs (GTIDs), introduced in MySQL 5.6.

Before working through this tutorial, you may first want to familiarize yourself with MySQL Replication and the new features in MySQL 5.6 by reading the companion white paper, MySQL 5.6 Replication: An Introduction [1].
- binlog-format: row-based replication is selected in order to test all of the MySQL 5.6 optimisations
- log-slave-updates, gtid-mode, enforce-gtid-consistency, report-port and report-host: used to enable Global Transaction IDs and meet the associated prerequisites
- master-info-repository and relay-log-info-repository: set to TABLE to enable the crash-safe binlog/slave functionality (storing the information in transactional tables rather than flat files)
- sync-master-info: set to 1 to ensure that no information is lost
- slave-parallel-workers: sets the number of parallel threads to be used for applying received replication events when this server acts as a slave. A value of 0 would turn off the multithreaded slave functionality; if the machine has a lot of cores and you are using many databases within the server then you may want to increase this value in order to better exploit multi-threaded replication
- binlog-checksum, master-verify-checksum and slave-sql-verify-checksum: used to enable all of the replication checksum checks
- binlog-rows-query-log-events: enables informational log events (specifically, the original SQL query) in the binary log when using row-based replication; this makes troubleshooting simpler
- log-bin: the server cannot act as a replication master unless binary logging is enabled. If you wish to enable a slave to assume the role of master at some point in the future (i.e. in the event of a failover or switchover), you also need to configure binary logging on it. Binary logging must also be enabled on the slave(s) when using Global Transaction IDs
- server-id: the server_id variable must be unique amongst all servers in the replication topology and is represented by a positive integer value from 1 to 2^32 - 1

[1] http://www.mysql.com/why-mysql/white-papers/mysql-replication-introduction
Following are the .cnf files used for the servers within our test rig in addition to the hosts used to run the utilities.

black.cnf:

[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
enforce-gtid-consistency=true
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
slave-parallel-workers=2
binlog-checksum=CRC32
master-verify-checksum=1
slave-sql-verify-checksum=1
binlog-rows-query-log_events=1
server-id=1
report-port=3306
port=3306
log-bin=black-bin.log
datadir=/home/billy/mysql/data
socket=/home/billy/mysql/sock
report-host=black

blue.cnf:

[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
enforce-gtid-consistency=true
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
slave-parallel-workers=2
binlog-checksum=CRC32
master-verify-checksum=1
slave-sql-verify-checksum=1
binlog-rows-query-log_events=1
server-id=2
report-port=3306
port=3306
log-bin=blue-bin.log
datadir=/home/billy/mysql/data
socket=/home/billy/mysql/sock
report-host=blue
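The prerequisites described above can be checked mechanically before the servers are started. The following Python sketch is an illustration only and is not part of the MySQL tooling: the function names `load_mysqld_options` and `check_topology` are hypothetical, and the expected values are simply the spellings used in black.cnf and blue.cnf (MySQL itself accepts other spellings such as `ON` or `1`, which this simplified check would flag).

```python
import configparser

# GTID prerequisites as spelled in the tutorial's .cnf files (assumption:
# other spellings MySQL accepts, such as "1", are not recognised here).
GTID_PREREQUISITES = {
    "gtid-mode": "on",
    "enforce-gtid-consistency": "true",
    "log-slave-updates": "true",
}


def load_mysqld_options(cnf_text):
    """Return the [mysqld] section of a .cnf file as a dict."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(cnf_text)
    # MySQL treats dashes and underscores in option names as equivalent,
    # so normalise every key to the dashed form.
    return {key.replace("_", "-"): (value or "")
            for key, value in parser["mysqld"].items()}


def check_topology(configs):
    """configs maps a host label to its .cnf text; returns a list of problems."""
    problems = []
    seen_ids = {}
    for host, text in configs.items():
        opts = load_mysqld_options(text)
        for name, expected in GTID_PREREQUISITES.items():
            if opts.get(name, "").lower() != expected:
                problems.append(f"{host}: {name} should be {expected}")
        if "log-bin" not in opts:
            problems.append(f"{host}: log-bin is not set")
        server_id = opts.get("server-id")
        if (server_id is None or not server_id.isdigit()
                or not 1 <= int(server_id) <= 2**32 - 1):
            problems.append(f"{host}: server-id missing or out of range")
        elif server_id in seen_ids:
            problems.append(
                f"{host}: server-id {server_id} duplicates {seen_ids[server_id]}")
        else:
            seen_ids[server_id] = host
    return problems
```

Calling `check_topology({"black": black_text, "blue": blue_text})` with the contents of the two files returns an empty list when every server has binary logging, the GTID options and a unique server-id.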
Note: For the greatest possible durability and consistency in a replication setup using InnoDB with transactions, you should also specify the innodb_flush_log_at_trx_commit=1 and sync_binlog=1 options. MySQL 5.6 includes new group commit functionality for the binary log that significantly reduces the overhead on the master when configuring for maximum data safety.

We will assume that you will be using the InnoDB storage engine (the use of GTIDs is restricted otherwise).

To cover the more complex use case, the example assumes that the MySQL Server which is to be used as the master is already in use and contains data that needs to be replicated to the new slave. If starting with an empty database then Step 3 and Step 4 can be skipped.

Compatibility

If you are attempting to set up replication between two MySQL servers that have already been installed, ensure that the versions of MySQL installed on the master and slave are compatible. For a current list of compatible versions see: http://dev.mysql.com/doc/refman/5.6/en/replication-compatibility.html
Figure 1: Simple master-slave configuration

Figure 1 illustrates the master-slave replication architecture that will be used in this section.
Step 3: Lock the Master, Note Binlog Position and Backup Master Database
This step (along with Step 4) is optional and is only needed if the new master has been receiving updates prior to enabling the binary logs, or has been running for so long that some of the binary logs have now been deleted.

On the master, flush all the tables and block write statements by executing a FLUSH TABLES WITH READ LOCK statement:
[billy@black ~]$ mysql -h 127.0.0.1 -P3306 -u root --prompt='master> '
master> FLUSH TABLES WITH READ LOCK;
While the read lock placed by FLUSH TABLES WITH READ LOCK is in effect, read the value of the current binary log name and offset on the master with the following statement:
master> SHOW MASTER STATUS;
+-------------------+----------+--------------+------------------+------------------------------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
+-------------------+----------+--------------+------------------+------------------------------------------+
| black-bin.000006  |     1174 |              |                  | A0F7E82D-3554-11E2-9949-080027685B56:1-5 |
+-------------------+----------+--------------+------------------+------------------------------------------+
The File column shows the name of the log and the Position column shows the offset within the file. In this example, the binary log file is black-bin.000006 and the Position is 1174. You'll want to make a note of these values as you will need them later when setting up the slave (unless using GTIDs, in which case they are not needed). They represent the replication coordinates at which the slave should begin processing new updates from the master.

Note: If the master has been running previously without binary logging enabled, the log name and position values displayed by SHOW MASTER STATUS will be empty. If this is the case, the values that you need to use later when specifying the slave's log file and position are the empty string ('') and 4 (or, if using GTIDs, they do not need specifying at all).

Next, leaving open the MySQL client window that you used to execute FLUSH TABLES WITH READ LOCK, you'll need to dump the contents of the databases on the master that you want to replicate to the slave. In our example we will dump the contents of the clusterdb database.

Note: You may not want to replicate the mysql system database if the slave server has a different set of user accounts from those that exist on the master. In this case, you should exclude it from the dump process.

You can execute mysqldump either from the command line or graphically with MySQL Workbench.
[billy@black ~]$ mysqldump -h 127.0.0.1 -P3306 -u root -B clusterdb > /home/billy/mysql/clusterdb.sql
You can re-enable write activity on the master with the following statement:
master> UNLOCK TABLES;
Note that if you're using MySQL Enterprise Edition then you can use MySQL Enterprise Backup to back up InnoDB tables without locking (and MySQL Cluster has its own built-in online backup command).
Next, you'll need to issue a CHANGE MASTER statement. If you are using GTIDs then the binary log positioning information is optional, because the MASTER_AUTO_POSITION option ensures that the correct replication events are sent from the master to the slave:
[billy@blue ~]$ mysql -h 127.0.0.1 -P3306 -u root --prompt='slave> '
slave> CHANGE MASTER TO MASTER_HOST='black', MASTER_USER='repl_user', MASTER_PASSWORD='billy', MASTER_AUTO_POSITION=1;
Where:
- MASTER_HOST: the IP address or hostname of the master server, in this example black or 192.168.0.31
- MASTER_USER: the user we granted the REPLICATION SLAVE privilege to in Step 2, in this example repl_user
- MASTER_PASSWORD: the password we assigned to repl_user in Step 2

If not using GTIDs then you must provide positioning information for the master's binary log:
slave> CHANGE MASTER TO MASTER_HOST='192.168.0.31',
    -> MASTER_USER='repl_user',
    -> MASTER_PASSWORD='billy',
    -> MASTER_LOG_FILE='black-bin.000006',
    -> MASTER_LOG_POS=1174;
Where:
- MASTER_LOG_FILE: the file name we determined in Step 3
- MASTER_LOG_POS: the position we determined in Step 3

Finally, start replication on the slave:
slave> START SLAVE;
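The two forms of CHANGE MASTER shown above differ only in how the starting point is specified. As a minimal sketch, assuming you generate the statement from a provisioning script, the helper below (the name `change_master_sql` is hypothetical) emits the GTID form when no coordinates are supplied and the file/position form otherwise. Embedding a password in generated SQL is done here purely for illustration.

```python
def quote(value):
    """Quote a string literal for MySQL, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def change_master_sql(host, user, password, log_file=None, log_pos=None):
    """Build a CHANGE MASTER TO statement.

    Without binary log coordinates the statement relies on GTIDs
    (MASTER_AUTO_POSITION=1); otherwise both the file and the position
    noted from SHOW MASTER STATUS must be supplied.
    """
    if (log_file is None) != (log_pos is None):
        raise ValueError("log_file and log_pos must be given together")
    clauses = [
        f"MASTER_HOST={quote(host)}",
        f"MASTER_USER={quote(user)}",
        f"MASTER_PASSWORD={quote(password)}",
    ]
    if log_file is None:
        clauses.append("MASTER_AUTO_POSITION=1")
    else:
        clauses.append(f"MASTER_LOG_FILE={quote(log_file)}")
        clauses.append(f"MASTER_LOG_POS={int(log_pos)}")
    return "CHANGE MASTER TO " + ", ".join(clauses) + ";"
```

For example, `change_master_sql("black", "repl_user", "billy")` produces the GTID-based statement used in this step, while adding `log_file="black-bin.000006", log_pos=1174` produces the positional one.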
If replication was not already running then there would be no need to stop and then restart the slave IO thread; instead you would just issue START SLAVE.

Check that semisynchronous replication is running from the master's perspective and that at least one slave is connected in semisynchronous mode:
master> SHOW STATUS LIKE 'Rpl_semi_sync_master_status';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_status | ON    |
+-----------------------------+-------+
master> SHOW STATUS LIKE 'Rpl_semi_sync_master_clients';
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| Rpl_semi_sync_master_clients | 1     |
+------------------------------+-------+
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_yes_tx | 1     |
+-----------------------------+-------+
slave> SELECT * FROM clusterdb.simples;
+-----+
| id  |
+-----+
|   1 |
|   2 |
|   3 |
| 100 |
| 999 |
+-----+
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: 5c9a887f-3983-11e2-b48f-080027685b56
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 5C9A887F-3983-11E2-B48F-080027685B56:1-6
Executed_Gtid_Set: 5C9A887F-3983-11E2-B48F-080027685B56:1-6,
5FB9D4B7-3983-11E2-B48F-0800274BDCE7:1-2
Below is some guidance on how to interpret the output:

- Slave_IO_State: indicates the current status of the slave
- Slave_IO_Running: shows whether the IO thread for reading the master's binary log is running
- Slave_SQL_Running: shows whether the SQL thread for executing events in the relay log is running
- Last_Error: shows the last error registered when processing the relay log. Ideally this should be blank, indicating no errors
- Seconds_Behind_Master: shows the number of seconds that the slave SQL thread is behind in processing the master binary log. A high number (or an increasing one) can indicate that the slave is unable to cope with the large number of statements from the master. A value of 0 can usually be interpreted as meaning that the slave has caught up with the master, but there are some cases where this is not strictly true. For example, this can occur if the network connection between master and slave is broken but the slave I/O thread has not yet detected this, that is, slave_net_timeout has not yet elapsed. If replication is stopped (as in the example above) then this will show the value NULL
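The interpretation rules above lend themselves to a simple automated check. The sketch below is an illustration rather than an official tool: it assumes you have captured the text of `SHOW SLAVE STATUS \G` (for example from the mysql client), and the function names and the 60-second lag threshold are hypothetical choices.

```python
import re

# A field line looks like "   Slave_IO_Running: Yes". Continuation lines (such
# as a second UUID in Executed_Gtid_Set) do not start with a bare identifier.
FIELD = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*):\s?(.*)$")


def parse_slave_status(text):
    """Parse the vertical output of SHOW SLAVE STATUS into a dict of strings."""
    fields = {}
    last = None
    for line in text.splitlines():
        if line.lstrip().startswith("*"):
            continue  # row separator such as "*** 1. row ***"
        match = FIELD.match(line)
        if match:
            last = match.group(1)
            fields[last] = match.group(2).strip()
        elif last is not None and line.strip():
            fields[last] += line.strip()  # wrapped value, append to previous field
    return fields


def replication_problems(fields, max_lag_seconds=60):
    """Apply the checks described in the text; returns a list of findings."""
    problems = []
    if fields.get("Slave_IO_Running") != "Yes":
        problems.append("IO thread is not running")
    if fields.get("Slave_SQL_Running") != "Yes":
        problems.append("SQL thread is not running")
    if fields.get("Last_Error"):
        problems.append("Last_Error: " + fields["Last_Error"])
    lag = fields.get("Seconds_Behind_Master", "NULL")
    if lag == "NULL":
        problems.append("Seconds_Behind_Master is NULL (replication stopped or not connected)")
    elif int(lag) > max_lag_seconds:
        problems.append(f"slave is {lag} seconds behind the master")
    return problems
```

An empty list means none of the listed symptoms were found. As noted above, a lag of 0 is not absolute proof that the slave is fully caught up.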
Note that when using GTIDs, Retrieved_Gtid_Set and Executed_Gtid_Set provide the details of which GTIDs have been received and executed (respectively) by the slave. On the master, you can check the status of slaves by examining the list of running processes on the server.
master> SHOW PROCESSLIST \G
Because it is the slave that drives the core of the replication process, very little information is available in this report.
Suspending Replication
You can stop and start the replication of statements on the slave using the STOP SLAVE and START SLAVE statements. To stop execution of the binary log from the slave, use STOP SLAVE:
slave> STOP SLAVE;
When execution is stopped, the slave does not read the binary log from the master via the IO_THREAD and stops processing events from the relay log that have not yet been executed via the SQL_THREAD. You can pause either the IO or SQL threads individually by specifying the thread type. For example:
slave> STOP SLAVE IO_THREAD;
Stopping the SQL thread can be useful if you want to perform an offline backup or other task on a slave that only processes events from the master. The IO thread will continue to read from the master, but the changes will not be applied yet, which will make it easier for the slave to catch up when you start slave operations again. This may be important if you need to fail over and make this slave the new master.

Stopping the IO thread allows the statements in the relay log to be executed up to the point where the relay log ceased to receive new events. This option can be useful when you want to let the slave catch up with events from the master, or when you want to perform administration on the slave while ensuring that it has the latest updates up to a specific point.

To start execution again, use the START SLAVE statement:
slave> START SLAVE;
If necessary, you can start either the IO_THREAD or SQL_THREAD threads individually.
Step 1: Prerequisites
In Section 2 we configured replication for two MySQL Servers in a master-slave configuration. In this section, we extend that in 2 ways:
1. Allow relationships to change: a server that was acting as a slave can become the master (e.g. when we need to shut down the original master for maintenance) and the original master can subsequently become a slave
2. Start with a more complex configuration, as shown in Figure 2: a single master (black) and 2 slaves (blue and green)
Figure 2: Master with dual slaves

The configuration file for the new MySQL Server is very similar to that of the existing servers:

green.cnf:

[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
enforce-gtid-consistency=true
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
slave-parallel-workers=2
binlog-checksum=CRC32
master-verify-checksum=1
slave-sql-verify-checksum=1
binlog-rows-query-log_events=1
server-id=3
report-port=3306
port=3306
log-bin=green-bin.log
datadir=/home/billy/mysql/data
socket=/home/billy/mysql/sock
report-host=green
As every server can act as a master for the other two, replication users must be created on each server, each with permission to connect from one of the other two servers:
black> GRANT REPLICATION SLAVE ON *.* TO repl_user@192.168.0.34 IDENTIFIED BY 'billy';
black> GRANT REPLICATION SLAVE ON *.* TO repl_user@192.168.0.32 IDENTIFIED BY 'billy';
blue> GRANT REPLICATION SLAVE ON *.* TO repl_user@192.168.0.31 IDENTIFIED BY 'billy';
blue> GRANT REPLICATION SLAVE ON *.* TO repl_user@192.168.0.32 IDENTIFIED BY 'billy';
green> GRANT REPLICATION SLAVE ON *.* TO repl_user@192.168.0.31 IDENTIFIED BY 'billy';
green> GRANT REPLICATION SLAVE ON *.* TO repl_user@192.168.0.34 IDENTIFIED BY 'billy';
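The pattern in these grants (each server admits the replication user from every other server) grows quadratically as servers are added, so it can be convenient to generate them. The Python sketch below is illustrative only: `replication_grants` is a hypothetical helper, and the host-to-address mapping in the usage note is inferred from the statements in this paper (black is 192.168.0.31 and blue is 192.168.0.34; green is assumed to be 192.168.0.32 by elimination).

```python
def replication_grants(hosts, user="repl_user", password="billy"):
    """Return (server, statement) pairs for a full mesh of replication users.

    hosts maps each server name to its IP address. For every server, one
    GRANT is produced per other server so that any peer can connect to it
    as a slave.
    """
    escaped = password.replace("'", "''")  # minimal quoting for the literal
    grants = []
    for server in hosts:
        for peer, peer_ip in hosts.items():
            if peer != server:
                grants.append((
                    server,
                    f"GRANT REPLICATION SLAVE ON *.* TO {user}@{peer_ip} "
                    f"IDENTIFIED BY '{escaped}';",
                ))
    return grants
```

With `{"black": "192.168.0.31", "blue": "192.168.0.34", "green": "192.168.0.32"}` the function yields the six statements listed above, each paired with the server on which it should be run.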
CHANGE MASTER and START SLAVE should be run on both blue and green in order to start up the replication architecture shown in Figure 2 as described in Step 5: of Section 2.
Finally, if semisynchronous replication is being used then both the master and slave plugins should be installed on all 3 servers:
black> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
black> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
black> SET GLOBAL rpl_semi_sync_master_enabled = on;
blue> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
blue> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
blue> SET GLOBAL rpl_semi_sync_slave_enabled = on;
blue> STOP SLAVE IO_THREAD; START SLAVE;
green> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
green> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
green> SET GLOBAL rpl_semi_sync_slave_enabled = on;
green> STOP SLAVE IO_THREAD; START SLAVE;
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: 5c9a887f-3983-11e2-b48f-080027685b56
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 5C9A887F-3983-11E2-B48F-080027685B56:1-6
Executed_Gtid_Set: 5C9A887F-3983-11E2-B48F-080027685B56:1-6,
5FB9D4B7-3983-11E2-B48F-0800274BDCE7:1-2
If Seconds_Behind_Master increases while the level of updates being sent to the master has not changed significantly, that could be a signal that the master has problems (particularly if the same is observed by multiple slaves).

MySQL Enterprise Monitor can be used to monitor the status of replication and so can raise the alarm that a failover is required. Alarms can be configured that automatically alert an administrator when a master's status variables exceed pre-defined thresholds. This enables remedial action to be taken before service is impacted.
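As a rough illustration of the heuristic described above, the following Python sketch flags a possible master problem when every slave reports steadily growing lag. It is a simplification and not how MySQL Enterprise Monitor works: the function names, the three-sample window and the "strictly increasing" rule are all assumptions chosen for the example, and the samples are presumed to be Seconds_Behind_Master readings collected at regular intervals.

```python
def lag_is_growing(samples, min_samples=3):
    """True if the most recent `min_samples` lag readings strictly increase.

    samples is a chronological list of Seconds_Behind_Master values;
    None stands for a NULL reading and is ignored.
    """
    recent = [s for s in samples if s is not None][-min_samples:]
    if len(recent) < min_samples:
        return False  # not enough data to call it a trend
    return all(earlier < later for earlier, later in zip(recent, recent[1:]))


def master_suspected(lag_by_slave, min_samples=3):
    """True when every monitored slave shows growing lag at the same time."""
    return bool(lag_by_slave) and all(
        lag_is_growing(samples, min_samples) for samples in lag_by_slave.values()
    )
```

For instance, `master_suspected({"blue": [0, 4, 9], "green": [1, 2, 6]})` returns True, whereas a single lagging slave alongside a healthy one does not trigger the check, which points at that slave rather than the master.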
After this has been executed, the slave's SQL thread will continue to run and apply any remaining updates from the relay log. Once there has been enough time for all of the changes from the relay logs to be applied (wait until Exec_Master_Log_Pos = Read_Master_Log_Pos in the output from SHOW SLAVE STATUS \G or, if using GTIDs, until Retrieved_Gtid_Set = Executed_Gtid_Set), replication can be stopped altogether on the server which will become the new master:
blue> STOP SLAVE;
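When using GTIDs, Executed_Gtid_Set can legitimately contain entries that Retrieved_Gtid_Set does not (for example transactions that originated on the slave itself, as in the sample output earlier), so in practice the condition to wait for is that every retrieved GTID has also been executed. The sketch below shows one way to test that from the two strings reported by SHOW SLAVE STATUS. The function names are hypothetical, and expanding intervals into Python sets is only sensible for small sets; on the server itself the built-in GTID_SUBSET() function performs the same comparison.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:1-2' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        # The UUID comes first, followed by one or more ':'-separated intervals.
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            numbers.update(range(int(start), int(end or start) + 1))
    return result


def all_retrieved_executed(retrieved, executed):
    """True if every GTID in `retrieved` also appears in `executed`."""
    done = parse_gtid_set(executed)
    return all(
        numbers <= done.get(uuid, set())
        for uuid, numbers in parse_gtid_set(retrieved).items()
    )
```

A monitoring loop could poll SHOW SLAVE STATUS and issue STOP SLAVE only once `all_retrieved_executed(...)` returns True.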
If using semisynchronous replication then at this point, activate the master-side plugin on the new master:
blue> SET GLOBAL rpl_semi_sync_master_enabled = on;
Before starting replication from the new master (blue) to the remaining slave (green) you need to determine the current position in the new master's binary log (if not using GTIDs):
blue> SHOW MASTER STATUS;
+-------------------+----------+--------------+------------------+------------------------------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
+-------------------+----------+--------------+------------------+------------------------------------------+
| blue-bin.000006   |      807 |              |                  | A0F7E82D-3554-11E2-9949-080027685B56:1-5 |
+-------------------+----------+--------------+------------------+------------------------------------------+
If using semisynchronous replication then make sure that the remaining slave is registered for semisynchronous replication by querying the new master (the result should be 1):
blue> SHOW STATUS LIKE 'Rpl_semi_sync_master_clients';
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| Rpl_semi_sync_master_clients | 1     |
+------------------------------+-------+
If using semisynchronous replication then activate that plugin on the original master (new slave):
black> SET GLOBAL rpl_semi_sync_slave_enabled = on;
or if not using GTIDs include the binlog file location identified using SHOW MASTER STATUS on the current master (blue):
black> CHANGE MASTER TO MASTER_HOST='192.168.0.34',
    -> MASTER_USER='repl_user',
    -> MASTER_PASSWORD='billy',
    -> MASTER_LOG_FILE='blue-bin.000006',
    -> MASTER_LOG_POS=807;
black> START SLAVE;
At this point, we are back to running with 1 master and 2 slaves as shown in Figure 4.
Figure 4: Back to 1 Master and 2 Slaves

An optional final step would be to run through this procedure again to make the original master (black) the master again.
Conclusion
MySQL Replication has been proven as an effective solution for the extreme scaling of database-driven applications in some of the most demanding environments on the web and in the enterprise. This whitepaper has shown how MySQL 5.6 makes the configuration, monitoring and ongoing maintenance of replication far simpler and more robust through the introduction of Global Transaction IDs. It also stepped through some of the more complex operations needed when those new features are not available to you.
Resources
- MySQL 5.6 Replication: An Introduction (white paper): http://www.mysql.com/why-mysql/white-papers/mysql-replication-introduction
- MySQL 5.6 Download: http://dev.mysql.com/downloads/mysql/
- MySQL Replication user guide: http://dev.mysql.com/doc/refman/5.6/en/replication.html
- MySQL Cluster Replication Documentation: http://dev.mysql.com/doc/mysql-cluster-excerpt/5.5/en/mysql-cluster-replication.html
- MySQL at Ticketmaster (heavy user of MySQL Replication): http://www.mysql.com/customers/view/?id=684
Copyright 2013, Oracle and/or its affiliates. MySQL is a registered trademark of Oracle in the U.S. and in other countries. Other products mentioned may be trademarks of their companies.