A few weeks ago I’ve migrated some services to a new server and this new server was running Debian 9. One of the changes from version 8 to 9 was that the default
mysql-server package installed MariaDB instead of MySQL. This should be OK as MariaDB is supposed to be compatible with MySQL.
This service needed to be a replication slave with another instance which is not directly accessible to the internet, and is running an old version of MySQL. The setup as straightforward: setup the ssh tunnel; imported the current data with the master settings; configured table name translation, issued the
set master to ... and
start slave; commands and voilà. All is well.
In the next few weeks I kept receiving alerts that the replication had stopped. I was always blaming the ssh tunnel that kept going down. But the tunnel was being automatically brought back up and MariaDB was not reconnecting back automatically, needing me to manually issue
stop slave; start slave; in order to bring replication back.
I’ve tried to change the ssh scripts to use autossh (very nice, by the way), but the only thing it changed is that my checked never needed to bring the tunnel back up.
After a bit of search, I’ve found this article in MariaDB Knowledge Base: https://mariadb.com/kb/en/library/replication-slave-loses-connection-and-does-not-recover/. Changing the default character set from
utf8 in the slave (the master does not support the new
utf8mb4 charset), basically search and replace all
utf8. The replication issue stopped happening.
utf8m4 is the “new” UTF-8 character set that was added to MySQL in version 5.5.3 to address the issue that not all UTF-8 codepoints could be stored in 3-bytes, as explained in this article from Thomas Shay.
I just wished that MariaDB had given me nicer error messages explaining why the connection was broken and why it was not trying to reconnect back. All the search for maximum-retries, master_retry_count and global variables were no help at all.