加速无索引表引起的主从延迟数据回放
加速无索引表引起的主从延迟数据回放
一、场景
由于某些原因,客户现场存在一张 8千万 的大表,而且该表上无任何索引(也无主键),平时该表上 UPDATE
或 DELETE
只操作几条数据。忽然有一天业务进行了某种操作,DELETE
2万 条数据,悲剧发生了,当在主库上执行了之后,传到从库上之后一直回放,当时评估了下可能会回放10天,后来在经过业务同意之后,对表进行操作,用于加速回放日志,处理该问题。
二、处理思路
停掉复制线程,关闭 Binlog 日志记录,添加索引,重启复制线程。
三、复现步骤
1、准备数据:
创建库:
greatsql> CREATE DATABASE qj;
Query OK, 1 row affected (0.01 sec)
准备数据,准备一张8千万数据的表:
$ sysbench --db-driver=mysql --mysql-host=192.168.139.230 --mysql-port=3307 --mysql-user=greatsql --mysql-password=GreatSQL@2024 --mysql-db=qj --table_size=80000000 --tables=1 oltp_write_only prepare
sysbench 1.1.0-df89d34 (using bundled LuaJIT 2.1.0-beta3)
Creating table 'sbtest1'...
Inserting 80000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
主节点修改表结构,将表的索引去掉
greatsql> USE qj
Database changed
greatsqll> ALTER TABLE sbtest1 MODIFY id int not null;
Query OK, 80000000 rows affected (36 min 29.13 sec)
Records: 80000000 Duplicates: 0 Warnings: 0
greatsql> ALTER TABLE sbtest1 DROP key `k_1`;
Query OK, 0 rows affected (0.96 sec)
Records: 0 Duplicates: 0 Warnings: 0
greatsql> ALTER TABLE sbtest1 DROP PRIMARY KEY;
Query OK, 80000000 rows affected (14 min 29.31 sec)
Records: 80000000 Duplicates: 0 Warnings: 0
2、模拟延迟:
01、主节点:
模拟删除数据
greatsql> DELETE FROM qj.sbtest1 WHERE k<20000;
Query OK, 19894 rows affected (4 min 20.40 sec)
02、从节点:
等待时间超过DELETE语句的时间,观察延迟情况,确认从节点可读可写
greatsql> SELECT sleep(300);
+------------+
| sleep(300) |
+------------+
| 0 |
+------------+
1 row in set (5 min 0.01 sec)
greatsql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 172.17.139.230
Master_User: greatsql
Master_Port: 3307
Connect_Retry: 60
Master_Log_File: binlog.000078
Read_Master_Log_Pos: 953750186
Relay_Log_File: relaylog.000231
Relay_Log_Pos: 949954084
Relay_Master_Log_File: binlog.000078
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 949953874
Relay_Log_Space: 953750683
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 1345
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1000403307
Master_UUID: 1d0963e9-85d9-11ef-80e1-00163e28e06a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1d0963e9-85d9-11ef-80e1-00163e28e06a:1-163235
Executed_Gtid_Set: 1d0963e9-85d9-11ef-80e1-00163e28e06a:1-163234
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
1 row in set, 1 warning (0.00 sec)
greatsql> SHOW GLOBAL VARIABLES LIKE 'read_only';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only | OFF |
+---------------+-------+
1 row in set (0.01 sec)
03、通过操作从节点加速日志回放
找到回放线程,KILL掉回放线程,关闭当前会话binlog日志记录,添加索引,重新启动复制回放线程
greatsql> SHOW PROCESSLIST;
+--------+-----------------+----------------------+------+---------+---------+----------------------------------------------------------+------------------+------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Time_ms | Rows_sent | Rows_examined |
+--------+-----------------+----------------------+------+---------+---------+----------------------------------------------------------+------------------+------------+-----------+---------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 2369454 | Waiting on empty queue | NULL | 2369454039 | 0 | 0 |
| 17 | system user | connecting host | NULL | Connect | 2369447 | Waiting for source to send event | NULL | 2369446555 | 0 | 0 |
| 543448 | greatsql | 172.17.136.93:48298 | NULL | Sleep | 110 | | NULL | 109832 | 0 | 0 |
| 543500 | greatsql | 172.17.139.230:35198 | NULL | Sleep | 586 | | NULL | 585885 | 0 | 0 |
| 543588 | greatsql | 172.17.136.93:57948 | NULL | Sleep | 50 | | NULL | 49878 | 0 | 0 |
| 543663 | greatsql | 172.17.139.230:35726 | NULL | Query | 0 | init | show processlist | 0 | 0 | 0 |
| 543666 | greatsql | 172.17.136.93:58908 | NULL | Sleep | 290 | | NULL | 289861 | 0 | 0 |
| 543708 | system user | | NULL | Query | 477 | Replica has read all relay log; waiting for more updates | NULL | 476520 | 0 | 0 |
| 543709 | system user | | qj | Query | 1383 | Applying batch of row changes (delete) | NULL | 26102 | 0 | 0 |
| 543710 | system user | | NULL | Connect | 477 | Waiting for an event from Coordinator | NULL | 476570 | 0 | 0 |
| 543711 | system user | | NULL | Connect | 477 | Waiting for an event from Coordinator | NULL | 476569 | 0 | 0 |
| 543712 | system user | | NULL | Connect | 477 | Waiting for an event from Coordinator | NULL | 476568 | 0 | 0 |
| 543714 | greatsql | 172.17.136.93:33582 | NULL | Sleep | 470 | | NULL | 469798 | 0 | 0 |
+--------+-----------------+----------------------+------+---------+---------+----------------------------------------------------------+------------------+------------+-----------+---------------+
13 rows in set (0.00 sec)
greatsql> KILL 543709;
Query OK, 0 rows affected (0.00 sec)
greatsql> SHOW PROCESSLIST;
+--------+-----------------+----------------------+------+---------+---------+----------------------------------+------------------+------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Time_ms | Rows_sent | Rows_examined |
+--------+-----------------+----------------------+------+---------+---------+----------------------------------+------------------+------------+-----------+---------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 2369473 | Waiting on empty queue | NULL | 2369473241 | 0 | 0 |
| 17 | system user | connecting host | NULL | Connect | 2369466 | Waiting for source to send event | NULL | 2369465757 | 0 | 0 |
| 543448 | greatsql | 172.17.136.93:48298 | NULL | Sleep | 129 | | NULL | 129034 | 0 | 0 |
| 543500 | greatsql | 172.17.139.230:35198 | NULL | Sleep | 605 | | NULL | 605087 | 0 | 0 |
| 543588 | greatsql | 172.17.136.93:57948 | NULL | Sleep | 9 | | NULL | 9132 | 0 | 0 |
| 543663 | greatsql | 172.17.139.230:35726 | NULL | Query | 0 | init | show processlist | 0 | 0 | 0 |
| 543666 | greatsql | 172.17.136.93:58908 | NULL | Sleep | 309 | | NULL | 309064 | 0 | 0 |
| 543714 | greatsql | 172.17.136.93:33582 | NULL | Sleep | 489 | | NULL | 489000 | 0 | 0 |
+--------+-----------------+----------------------+------+---------+---------+----------------------------------+------------------+------------+-----------+---------------+
8 rows in set (0.00 sec)
greatsql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 172.17.139.230
Master_User: greatsql
Master_Port: 3307
Connect_Retry: 60
Master_Log_File: binlog.000078
Read_Master_Log_Pos: 953750186
Relay_Log_File: relaylog.000231
Relay_Log_Pos: 949954084
Relay_Master_Log_File: binlog.000078
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1317
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '1d0963e9-85d9-11ef-80e1-00163e28e06a:163235' at master log binlog.000078, end_log_pos 950241265. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter: 0
Exec_Master_Log_Pos: 949953874
Relay_Log_Space: 953750683
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1317
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '1d0963e9-85d9-11ef-80e1-00163e28e06a:163235' at master log binlog.000078, end_log_pos 950241265. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1000403307
Master_UUID: 1d0963e9-85d9-11ef-80e1-00163e28e06a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 241105 19:06:59
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1d0963e9-85d9-11ef-80e1-00163e28e06a:1-163235
Executed_Gtid_Set: 1d0963e9-85d9-11ef-80e1-00163e28e06a:1-163234
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
1 row in set, 1 warning (0.00 sec)
greatsql> SET sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)
greatsql> ALTER TABLE qj.sbtest1 ADD index tmp_k(k);
Query OK, 0 rows affected (4 min 51.92 sec)
Records: 0 Duplicates: 0 Warnings: 0
greatsql> START SLAVE;
Query OK, 0 rows affected, 1 warning (0.02 sec)
greatsql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 172.17.139.230
Master_User: greatsql
Master_Port: 3307
Connect_Retry: 60
Master_Log_File: binlog.000078
Read_Master_Log_Pos: 953750186
Relay_Log_File: relaylog.000231
Relay_Log_Pos: 953750396
Relay_Master_Log_File: binlog.000078
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 953750186
Relay_Log_Space: 953750683
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1000403307
Master_UUID: 1d0963e9-85d9-11ef-80e1-00163e28e06a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1d0963e9-85d9-11ef-80e1-00163e28e06a:1-163235
Executed_Gtid_Set: 1d0963e9-85d9-11ef-80e1-00163e28e06a:1-163235
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
1 row in set, 1 warning (0.00 sec)
四、思考
01、如何保持主从一致性
02、正在回放数据时,STOP SLAVE
是否会有问题
03、如果主从结构作为数据节点,上层还有计算节点该如何保持元数据的一致性即计算节点记录的表的结构和数据节点的表结构要一致
关于以上三个问题我的处理方案如下:
1、如何保持主从一致性
#为保持主从表结构一致,主节点添加索引
greatsql> SET sql_log_bin=0;
greatsql> ALTER TABLE qj.sbtest1 ADD index tmp_k(k);
2、正在回放数据时,STOP SLAVE
是否会有问题
greatsql> STOP SLAVE --当该GTID回放很长时间时,stop可能会很长时间,我们可以采用 kill 线程方式结束回放
greatsql> SHOW PROCESSLIST; --找到会话ID
greatsql> KILL 会话号
greatsql> SHOW SLAVE STATUS \G
-- 添加索引即可
3、如果主从结构作为数据节点,上层还有计算节点该如何保持元数据的一致性即计算节点记录的表的结构和数据节点的表结构要一致
-- 当从节点添加上索引之后,在计算节点即代理层给表添加索引,当然该索引名不要与临时添加索引名重复
-- 添加上索引之后,在将从节点上临时添加的索引删除即可
greatsql> STOP SLAVE;
greatsql> SHOW SLAVE STATUS \G
greatsql> SET sql_log_bin=0;
greatsql> ALTER table qj.sbtest1 DROP index tmp_k;
greatsql> START SLAVE;
greatsql> SHOW SLAVE STATUS \G
五、总结
加速无索引表的回放主要是针对该表上进行 DELETE
或 UPDATE
操作时有效,而且我们操作的时候要注意是否能记录binlog日志,保证主从数据一致性,当从节点作为数据节点即存储节点的角色时,要关注计算节点即代理层元数据与存储节点元数据的一致性或者是符合计算节点关于该表的定义规则。
Enjoy GreatSQL :)
关于 GreatSQL
GreatSQL是适用于金融级应用的国内自主开源数据库,具备高性能、高可靠、高易用性、高安全等多个核心特性,可以作为MySQL或Percona Server的可选替换,用于线上生产环境,且完全免费并兼容MySQL或Percona Server。
相关链接: GreatSQL社区 Gitee GitHub Bilibili
GreatSQL社区:
社区有奖建议反馈: https://greatsql.cn/thread-54-1-1.html
社区博客有奖征稿详情: https://greatsql.cn/thread-100-1-1.html
(对文章有疑问或者有独到见解都可以去社区官网提出或分享哦~)
技术交流群:
微信&QQ群:
QQ群:533341697
微信群:添加GreatSQL社区助手(微信号:wanlidbc
)好友,待社区助手拉您进群。

低调大师中文资讯倾力打造互联网数据资讯、行业资源、电子商务、移动互联网、网络营销平台。
持续更新报道IT业界、互联网、市场资讯、驱动更新,是最及时权威的产业资讯及硬件资讯报道平台。
转载内容版权归作者及来源网站所有,本站原创内容转载请注明来源。
-
上一篇
图解系列|DeepSeek-R1的出众推理能力从何而来?
编者按: DeepSeek-R1到底有什么特别之处?它为什么能在推理任务上取得如此出色的表现?这背后的训练方法又蕴含着怎样的创新? 当我们需要模型处理数学题、编程任务,或是进行逻辑分析时,高质量的推理能力显得尤为重要。然而,传统的训练方法往往需要耗费大量人力物力,这对许多研究团队和企业来说都是不小的负担。 今天这篇深度解析 DeepSeek-R1 训练方法的文章,将展示一个令人耳目一新的解决方案:如何通过创新的强化学习方法,在少量高质量人工标注数据的情况下,打造出一个推理能力出众的 AI 模型。文章详细介绍了 DeepSeek 团队如何通过"自动验证机制"来训练模型,这种方法不仅大大降低了对人工标注数据的依赖,还能持续提升模型的推理质量。 作者 | Jay Alammar 编译 | 岳扬 DeepSeek-R1 代表了人工智能发展的又一重要里程碑。对于机器学习领域的研究人员与开发者群体而言,这次发布之所以备受关注,主要有以下两点: 首先,这是一款开源权重的模型,并且提供了更小的、经过蒸馏的版本; 其次,它公布并深入探讨了训练方法,该方法能够复现类似于 OpenAI O1 的推理模型。 ...
-
下一篇
Skyeye 云 VUE 版本 v3.15.6 发布
Skyeye 云智能制造,采用 Springboot + winUI 的低代码平台、移动端采用 UNI-APP。包含 30 多个应用模块、50 多种电子流程,CRM、PM、ERP、MES、ADM、EHR、笔记、知识库、项目、门店、商城、财务、多班次考勤、薪资、招聘、云售后、论坛、公告、问卷、报表设计、工作流、日程、云盘等全面管理,实现智能制造行业一体化管理。实现管理流程 “客户关系 -> 线上 / 线下报价 -> 销售报价 -> 销售合同 -> 生产计划 -> 商品设计 -> 采购 -> 加工制造 -> 入库 -> 发货 -> 售后服务” 的高效运作,同时实现企业员工的管理以及内部运作的流程操作,完善了员工从 “入职 -> 培训 -> 转正 -> 办公 -> 离职” 等多项功能。 常见问题 开发文档 Skyeye 云【源代码】针对 {星球用户} 开源。拿到源码后可进行学习、毕设、企业等使用。 Skyeye 云智能制造 v3.15.6 发布 ,发布内容如下: Skyeye 云已加入 Dromara 社区...
相关文章
文章评论
共有0条评论来说两句吧...
文章二维码
点击排行
推荐阅读
最新文章
- 设置Eclipse缩进为4个空格,增强代码规范
- CentOS关闭SELinux安全模块
- Eclipse初始化配置,告别卡顿、闪退、编译时间过长
- Jdk安装(Linux,MacOS,Windows),包含三大操作系统的最全安装
- CentOS8编译安装MySQL8.0.19
- Springboot2将连接池hikari替换为druid,体验最强大的数据库连接池
- SpringBoot2整合MyBatis,连接MySql数据库做增删改查操作
- SpringBoot2整合Thymeleaf,官方推荐html解决方案
- MySQL数据库在高并发下的优化方案
- SpringBoot2更换Tomcat为Jetty,小型站点的福音