添加链接
link管理
链接快照平台
  • 输入网页链接,自动生成快照
  • 标签化管理网页链接
相关文章推荐
没人理的蚂蚁  ·  中国新闻网·  1 月前    · 
狂野的丝瓜  ·  武汉市统计局·  4 月前    · 

We have mysql 8.0.33-25.1 percona wsrep cluster with 6 nodes. The percona cluster is stable and works as expected.

One of the nodes is also a binlog enabled replication primary.
On the other side is 1 replica.
Those are AWS instances of comparable size, the replica has more dedicated IOPS than the primary, due to the nature of its workload.

This same setup was used on mysql5.7 which we migrated away from a few months ago.

Since the migration the replica is behaving incredibly strangely. I understand that there are probably some under-the-hood changes on mysql replication, but what we are experiencing is not stable at all.
The workload on that replica is mostly reads. On the replica there are databases that do not get replicated that do have writes, but as far as replication is concerned on that replica it is only reads.

Here are 2 example workflows that are running on the replica:

  • DB1 is replicated from the primary and the replica reads data from there, creates “snapshots” of the data and writes to DB2 on the replica (which is not replicated in any server).
  • Tableau reads data sequentially once per day that is then used for reports throughout the day from the replica on a schedule.
  • There are other workloads of smaller scale but same type that happens on that replica.

    The problem that we have is during any significant read query the replication pauses and waits for the read query to finish before continuing. Resulting in often timeouts, as those read queries may run for (sometimes) many hours. This results in the following error:

    2024-03-21T09:49:22.030113Z 2386083 [Warning] [MY-010584] [Repl] Replica SQL for channel '': Worker 1 failed executing transaction 'ANONYMOUS' at source log mysql-bin.000639, end_log_pos 853032011; Could not execute Update_rows event on table DB1.table1; Lock wait timeout exceeded; try restarting transaction, Error_code: 1205; handler error HA_ERR_LOCK_WAIT_TIMEOUT; the event's source log mysql-bin.000639, end_log_pos 853032011, Error_code: MY-001205
    

    We can increase the timeout, but thats not what we are after. Prior to the migration, the 5.7 version of the replica was keeping up to date with replication throughout the day. During these past 2 months, we have tried tweaking various variables in an effort to get back to the same state as in 5.7, but with no success. These are the final variables that we are using at the moment different from the previous defaults that we had:

    replica_preserve_commit_order = "OFF";
    replica_parallel_type = "DATABASE";
    replica_parallel_workers=1;
    innodb_adaptive_hash_index="OFF";
    

    And this is our default cnf (some values are edited out):

    # Template my.cnf for PXC
    # Edit to your requirements.
    [client]
    socket=XXX
    default-character-set=utf8
    [mysqld]
    open-files-limit                = 100000
    table_open_cache                = 20000
    innodb_open_files               = 20000
    character-set-server            = utf8
    collation-server                = utf8_unicode_ci
    datadir                         = XXX
    long_query_time                 = 5
    slow_query_log                  = 1
    slow_query_log_file             = XXX
    log_bin                         = mysql-bin
    log-error                       = XXX
    pid-file                        = XXX
    default_storage_engine          = InnoDB
    pxc_strict_mode                 = ENFORCING
    sql_mode                        = ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
    early-plugin-load               = keyring_file.so
    keyring-file-data               = XXX
    pxc-encrypt-cluster-traffic     = ON
    socket                          = XXX
    server-id                       = 9
    binlog-ignore-db                = information_schema
    binlog-ignore-db                = mysql
    binlog-ignore-db                = performance_schema
    binlog-ignore-db                = XXX
    binlog_format                   = ROW
    log_slave_updates               = ON
    binlog_expire_logs_seconds      = 604800
    ft_min_word_len                 = 1
    ft_stopword_file                = ''
    join_buffer_size                = 256K
    key_buffer_size                 = 16M
    lc-messages-dir                 = /usr/share/mysql
    lower_case_table_names          = 0
    max_allowed_packet              = 512M
    max_connections                 = 400
    max_heap_table_size             = 16M
    memlock                         = 0
    read_buffer_size                = 256K
    read_rnd_buffer_size            = 512K
    replica_preserve_commit_order   = OFF
    replica_parallel_type           = DATABASE
    replica_parallel_workers        = 1
    skip-external-locking
    skip_name_resolve
    sort_buffer_size                = 256K
    sysdate_is_now                  = 1
    table_definition_cache          = 20000
    thread_cache_size               = 20
    thread_stack                    = 256K
    tmp_table_size                  = 16M
    tmpdir                          = XXX
    net_read_timeout                = 60
    # 5.6 backwards compatibility
    explicit_defaults_for_timestamp = 1
    # INNODB OPTIONS - You may want to tune the below depending on number of cores and disk sub
    innodb_autoinc_lock_mode        = 1
    innodb_adaptive_hash_index      = OFF
    innodb_buffer_pool_instances    = 8
    innodb_buffer_pool_size         = 12G
    innodb_doublewrite              = 1
    innodb_file_per_table           = 1
    innodb_flush_log_at_trx_commit  = 3
    innodb_flush_method             = O_DIRECT
    innodb_io_capacity              = 200
    innodb_log_buffer_size          = 8M
    innodb_log_files_in_group       = 2
    innodb_log_file_size            = 128M
    innodb_read_io_threads          = 4
    innodb_thread_concurrency       = 0
    innodb_write_io_threads         = 4
    # avoid statistics update when doing e.g show tables
    innodb_ft_min_token_size        = 0
    innodb_stats_on_metadata        = 0
    [mysqld_safe]
    pid-file                        = XXX
    #syslog
    malloc-lib                      = /usr/lib64/libtcmalloc.so.4.4.5
    

    I am assuming that we are missing a key piece here, but so far unable to find it. As a last resort we are considering changing the default TXN isolation from REPEATABLE READ to READ COMMITTED or UNCOMMITTED, which may help with the locking, but that will require us to do significantly more research into implications in our use cases, so would like to avoid if at all possible.

    Any suggestions are welcomed?

    This is an invalid value. I’m surprised MySQL even starts. Change this to 2.

    ROW-based replication should not impose any locks on replicas, unless FOREIGN KEYs are involved. Do you have any of these?

    It would be helpful to understand the locks. Find a SELECT query that runs on the replica that reproduces the issue. Run the query in session #1. While it is locked, in session #2, select from sys. innodb_lock_waits and see what query is blocking. This might help determine what is happening.

    These were all in effort to get back to mysql 5.7 type of replication and performance. Because with LOGICAL_CLOCK and replica_preserve_commit_order we were getting even worse performance and locking.

    innodb_flush_log_at_trx_commit reverts to a value of 2 at runtime, but yes, i was surprised it works too.

    There are no foreign keys in any database, we removed them years ago due to other performance issues.

    As far as the locks, we know which table is getting locked, due to replication needing to update some of its records. Here is an example:

    wait_started	wait_age	wait_age_secs	locked_table	locked_table_schema	locked_table_name	locked_table_partition	locked_table_subpartition	locked_index	locked_type	waiting_trx_id	waiting_trx_started	waiting_trx_age	waiting_trx_rows_locked	waiting_trx_rows_modified	waiting_pid	waiting_query	waiting_lock_id	waiting_lock_mode	blocking_trx_id	blocking_pid	blocking_query	blocking_lock_id	blocking_lock_mode	blocking_trx_started	blocking_trx_age	blocking_trx_rows_locked	blocking_trx_rows_modified	sql_kill_blocking_query	sql_kill_blocking_connection
    2024-04-09 08:30:33	00:00:09	9	`DB1`.`table1`	DB1	table1	NULL	NULL	idx_contact	RECORD	117850381337	2024-04-09 08:30:33	00:00:09	1	1	1697121	NULL	139791340341016:665006:958203:570:139774144774368	X,GAP,INSERT_INTENTION	421266317044760	2287080	INSERT INTO `DB2`.`table2`.`Value`	139791340334104:665006:958203:570:139784041354192	S,GAP	2024-04-09 08:29:52	00:00:50	16580219	0	KILL QUERY 2287080	KILL 2287080
    

    The blocking thread in question 2287080:

    ID	USER	HOST	DB	COMMAND	TIME	STATE	INFO	TIME_MS	ROWS_SENT	ROWS_EXAMINED
    2287080	XXX	XXX	DB1	Query	58	executing	INSERT INTO `DB2`.`table2`........ SELECT                ...... COUNT(DISTINCT `table1`.`ContactId`) ............ FROM `table3` ........ ON DUPLICATE KEY↵                UPDATE `XXXX` = VALUES(`XXXX`)	57595	0	0
    

    The blocking query is a INSERT SELECT query, it selects from DB1 which is replicated from the primary to the replica and inserts into DB2 which is not replicated anywhere and is only on the replica, used for aggregation.

    While DB1.table1 is locked replication start lagging behind with:

    ID	USER	HOST	DB	COMMAND	TIME	STATE	INFO	TIME_MS	ROWS_SENT	ROWS_EXAMINED
    1697121	system user		DB1	Query	24	Applying batch of row changes (write)	NULL	23867	0	0
    1697120	system user		NULL	Query	21	Reading event from the relay log	NULL	20856	0	0
    

    Some of those queries can have a runtime of several thousand seconds, which then leads to the replication to stop with a timeout error i shared in the first post.

    hristomitev:

    The blocking query is a INSERT … SELECT query, it selects from DB1 which is replicated from the primary

    This, naturally, imposes (S)hared locks on the DB1.table1 table as it is being SELECTed. This prevents table1 from being modified while the “SELECT *” is working. Without this lock, table1 could be modified and the INSERT would no longer be (C)onsistent, nor (I)solated (eg: ACID compliant). This lock also blocks replication.

    Replication always operates in READ-COMMITTED mode and you can’t change that. I would suggest changing your I … S to use READ-COMMITTED as I can see the query is also GAP-locking, which means it’s not only locking the actual existing rows, it’s also locking the “gaps” of missing rows and the “gaps” at the end of the table. R-C does not have gap locks.