事务处理对持久统计信息自动收集的影响

2025-04-02 118

事务处理对持久统计信息自动收集的影响

持久化统计信息

持久统计信息将统计信息存储到磁盘，使它们在服务器重启后保持不变，优化器更有可能在查询时做出一致的选择，从而提高执行计划的稳定性。

在innodb_stats_persistent=ON（默认值）或表定义使用 stats_persistent=1时，优化器统计信息会持久化保存。

持久统计信息存储在 mysql.innodb_table_stats 和 mysql.innodb_index_stats 表中，last_update 列可以看到上次更新统计信息的时间。

系统变量innodb_stats_auto_recalc（默认ON）控制表行更改超过10%时，是否自动计算统计信息。也可以通过创建或更改表时指定stats_auto_recalc子句为单个表配置自动统计重新计算。

由于自动统计信息收集是一个后台线程，其处理过程与DML操作是异步的，在DML操作超过 10% 的表后，可能不会立即重新计算统计信息。在某些情况下，统计数据重新计算可能会延迟几秒钟。如果立即需要最新的统计信息，执行 ANALYZE TABLE 以启动统计信息的同步计算。

事务的 commit 和 rollback 会影响统计信息的自动收集么？通过下面测试，可以回答这问题。

测试commit和rollback对持久统计信息收集的影响

测试环境的系统变量值：

greatsql> SHOW GLOBAL VARIABLES LIKE 'innodb_stats%';
+--------------------------------------+-------------+
| Variable_name                        | Value       |
+--------------------------------------+-------------+
| innodb_stats_auto_recalc             | ON          |
| innodb_stats_include_delete_marked   | OFF         |
| innodb_stats_method                  | nulls_equal |
| innodb_stats_on_metadata             | OFF         |
| innodb_stats_persistent              | ON          |
| innodb_stats_persistent_sample_pages | 20          |
| innodb_stats_transient_sample_pages  | 8           |
+--------------------------------------+-------------+

2.1 INSERT 操作

greatsql> TRUNCATE TABLE test1;
Query OK, 0 rows affected (0.05 sec)

-- 开启事务，在空表test1中插入10万行数据
greatsql> BEGIN;
Query OK, 0 rows affected (0.00 sec)

greatsql> SELECT now();INSERT INTO test1 SELECT * FROM LIMIT 100000;SELECT now();
+---------------------+
| now()               |
+---------------------+
| 2025-01-07 09:59:19 |
+---------------------+
1 row in set (0.00 sec)

Query OK, 100000 rows affected (2.73 sec)
Records: 100000  Duplicates: 0  Warnings: 0

+---------------------+
| now()               |
+---------------------+
| 2025-01-07 09:59:21 |
+---------------------+
1 row in set (0.00 sec)


-- 事务没有提交，但统计信息已收集
greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:59:19 | n_diff_pfx01 |         11 |           1 | id                                |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:59:19 | n_leaf_pages |          1 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:59:19 | size         |          1 |        NULL | Number of pages in the index      |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:19 | n_diff_pfx01 |         11 |           1 | k                                 |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:19 | n_diff_pfx02 |         11 |           1 | k,id                              |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:19 | n_leaf_pages |          1 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:19 | size         |          1 |        NULL | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
7 rows in set (0.00 sec)

greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:59:29 | n_diff_pfx01 |      98712 |          20 | id                                |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:59:29 | n_leaf_pages |       1371 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:59:29 | size         |       1379 |        NULL | Number of pages in the index      |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:29 | n_diff_pfx01 |      30169 |          20 | k                                 |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:29 | n_diff_pfx02 |     100268 |          20 | k,id                              |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:29 | n_leaf_pages |        128 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:29 | size         |        161 |        NULL | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
7 rows in set (0.00 sec)


-- 回滚事务
greatsql> ROLLBACK;
Query OK, 0 rows affected (2.64 sec)

-- 没有重新收集统计信息，统计信息与表数据不匹配
greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:59:29 | n_diff_pfx01 |      98712 |          20 | id                                |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:59:29 | n_leaf_pages |       1371 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:59:29 | size         |       1379 |        NULL | Number of pages in the index      |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:29 | n_diff_pfx01 |      30169 |          20 | k                                 |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:29 | n_diff_pfx02 |     100268 |          20 | k,id                              |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:29 | n_leaf_pages |        128 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | k_1        | 2025-01-07 09:59:29 | size         |        161 |        NULL | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
7 rows in set (0.00 sec)

greatsql> SELECT COUNT(*) FROM test1;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)


-- analyze重新收集统计信息，统计信息才和表数据一致
greatsql> ANALYZE TABLE test1;
+---------------+---------+----------+----------+
| Table         | Op      | Msg_type | Msg_text |
+---------------+---------+----------+----------+
| testdb1.test1 | analyze | status   | OK       |
+---------------+---------+----------+----------+
1 row in set (0.06 sec)

greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| testdb1       | test1      | PRIMARY    | 2025-01-07 10:01:58 | n_diff_pfx01 |          0 |           1 | id                                |
| testdb1       | test1      | PRIMARY    | 2025-01-07 10:01:58 | n_leaf_pages |          1 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | PRIMARY    | 2025-01-07 10:01:58 | size         |          1 |        NULL | Number of pages in the index      |
| testdb1       | test1      | k_1        | 2025-01-07 10:01:58 | n_diff_pfx01 |          0 |           1 | k                                 |
| testdb1       | test1      | k_1        | 2025-01-07 10:01:58 | n_diff_pfx02 |          0 |           1 | k,id                              |
| testdb1       | test1      | k_1        | 2025-01-07 10:01:58 | n_leaf_pages |          1 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | k_1        | 2025-01-07 10:01:58 | size         |          1 |        NULL | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
7 rows in set (0.00 sec)

2.2 DELETE 操作

greatsql> SELECT COUNT(*) FROM test1;
+----------+
| count(*) |
+----------+
|   100000 |
+----------+
1 row in set (0.15 sec)

-- 开启事务，执行delete操作
greatsql> BEGIN;
Query OK, 0 rows affected (0.00 sec)

greatsql> SELECT now();DELETE FROM test1;SELECT now();
+---------------------+
| now()               |
+---------------------+
| 2025-01-07 09:41:36 |
+---------------------+
1 row in set (0.00 sec)

Query OK, 100000 rows affected (1.87 sec)

+---------------------+
| now()               |
+---------------------+
| 2025-01-07 09:41:38 |
+---------------------+
1 row in set (0.00 sec)

-- 在delete开始时，进行了一次统计信息收集
greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:41:36 | n_diff_pfx01 |      98712 |          20 | id                                |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:41:36 | n_leaf_pages |       1371 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:41:36 | size         |       1379 |        NULL | Number of pages in the index      |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:36 | n_diff_pfx01 |      32313 |          20 | k                                 |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:36 | n_diff_pfx02 |      99244 |          20 | k,id                              |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:36 | n_leaf_pages |        128 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:36 | size         |        161 |        NULL | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
7 rows in set (0.00 sec)

-- delete完成一段时间后（约10秒）进行了第二次统计信息收集
greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:41:46 | n_diff_pfx01 |          0 |          20 | id                                |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:41:46 | n_leaf_pages |       1371 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:41:46 | size         |       1379 |        NULL | Number of pages in the index      |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:46 | n_diff_pfx01 |          0 |          20 | k                                 |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:46 | n_diff_pfx02 |          0 |          20 | k,id                              |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:46 | n_leaf_pages |        128 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:46 | size         |        161 |        NULL | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
7 rows in set (0.00 sec)

-- 回滚事务
greatsql> ROLLBACK;
Query OK, 0 rows affected (1.95 sec)

-- 事务回滚后，统计信息与表数据不匹配
greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:41:46 | n_diff_pfx01 |          0 |          20 | id                                |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:41:46 | n_leaf_pages |       1371 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:41:46 | size         |       1379 |        NULL | Number of pages in the index      |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:46 | n_diff_pfx01 |          0 |          20 | k                                 |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:46 | n_diff_pfx02 |          0 |          20 | k,id                              |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:46 | n_leaf_pages |        128 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | k_1        | 2025-01-07 09:41:46 | size         |        161 |        NULL | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
7 rows in set (0.00 sec)

greatsql> SELECT COUNT(*) FROM test1;
+----------+
| count(*) |
+----------+
|   100000 |
+----------+
1 row in set (0.15 sec)

-- analyze重新收集统计信息，统计信息才和表数据一致
greatsql> ANALYZE TABLE test1;
+---------------+---------+----------+----------+
| Table         | Op      | Msg_type | Msg_text |
+---------------+---------+----------+----------+
| testdb1.test1 | analyze | status   | OK       |
+---------------+---------+----------+----------+
1 row in set (0.08 sec)

greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:47:29 | n_diff_pfx01 |      98712 |          20 | id                                |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:47:29 | n_leaf_pages |       1371 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | PRIMARY    | 2025-01-07 09:47:29 | size         |       1379 |        NULL | Number of pages in the index      |
| testdb1       | test1      | k_1        | 2025-01-07 09:47:29 | n_diff_pfx01 |      32332 |          20 | k                                 |
| testdb1       | test1      | k_1        | 2025-01-07 09:47:29 | n_diff_pfx02 |     100051 |          20 | k,id                              |
| testdb1       | test1      | k_1        | 2025-01-07 09:47:29 | n_leaf_pages |        128 |        NULL | Number of leaf pages in the index |
| testdb1       | test1      | k_1        | 2025-01-07 09:47:29 | size         |        161 |        NULL | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
7 rows in set (0.01 sec)

总结

数据量变化大（超过10%）的DML操作会导致2次统计信息收集，一次是DML开始时，一次是DML完成约10秒后。
DML操作是否COMMIT提交，不影响统计信息收集。
DML操作的rollback回滚，可能造成统计信息与表数据不一致。当大数据DML操作回滚后，可以执行ANALYZE TABLE重新收集表的统计信息。

Enjoy GreatSQL :)

关于 GreatSQL

GreatSQL是适用于金融级应用的国内自主开源数据库，具备高性能、高可靠、高易用性、高安全等多个核心特性，可以作为MySQL或Percona Server的可选替换，用于线上生产环境，且完全免费并兼容MySQL或Percona Server。

相关链接： GreatSQL社区 Gitee GitHub Bilibili

GreatSQL社区：

社区有奖建议反馈： https://greatsql.cn/thread-54-1-1.html

社区博客有奖征稿详情： https://greatsql.cn/thread-100-1-1.html

（对文章有疑问或者有独到见解都可以去社区官网提出或分享哦~）

技术交流群：

微信&QQ群：

QQ群：533341697

微信群：添加GreatSQL社区助手（微信号：wanlidbc ）好友，待社区助手拉您进群。

微信关注我们

原文链接：https://my.oschina.net/GreatSQL/blog/18057671

转载内容版权归作者及来源网站所有！

低调大师中文资讯倾力打造互联网数据资讯、行业资源、电子商务、移动互联网、网络营销平台。持续更新报道IT业界、互联网、市场资讯、驱动更新,是最及时权威的产业资讯及硬件资讯报道平台。

RAG 调优指南：Spring AI Alibaba 模块化 RAG 原理与使用

作者：夏冬，Spring AI Alibaba Contributor RAG 简介什么是 RAG（检索增强生成） RAG（Retrieval Augmented Generation，检索增强生成）是一种结合信息检索和文本生成的技术范式。 🌟 核心设计理念 RAG 技术就像给 AI 装上了「实时百科大脑」，通过先查资料后回答的机制，让 AI 摆脱传统模型的“知识遗忘”困境。 🛠️ 四大核心步骤 1. 文档切割 → 建立智能档案库核心任务: 将海量文档转化为易检索的知识碎片实现方式: 就像把厚重词典拆解成单词卡片采用智能分块算法保持语义连贯性给每个知识碎片打标签（如“技术规格”、“操作指南”） > 📌 关键价值：优质的知识切割如同图书馆分类系统，决定了后续检索效率 2. 向量编码 → 构建语义地图核心转换: 用 AI 模型将文字转化为数学向量使语义相近的内容产生相似数学特征数据存储: 所有向量存入专用数据库建立快速检索索引（类似图书馆书目检索系统） 🎯 示例效果：“续航时间”和“电池容量”会被编码为相似向量 3. 相似检索 → 智能资料猎人应答触发流程...

2025-04-02

255

编者按：在混合专家模型（MoE）的实践中，负载不均衡俨然已成为制约模型性能提升的关键瓶颈之一。传统的均衡策略往往需要引入复杂的辅助损失函数，不仅增加了训练的复杂度，还可能干扰模型的核心学习目标。工程师们在提升模型效率的道路上，一直苦苦追寻着一个优雅而高效的平衡解决方案。 DeepSeek 团队的这项研究，为这一长期困扰业界的技术难题提供了令人耳目一新的解决思路：通过在门控分数中直接添加专家层面的偏置项，在绝大部分不引入额外损失函数的情况下，实现了模型训练过程中的自适应负载均衡。更令人惊叹的是，这一方法不仅保持了模型的因果关系，还显著提升了训练的稳定性和最终性能。作者 | Shirley Li 编译 | 岳扬这是 DeepSeek-V3 系列文章的第三篇，我们将探讨 DeepSeek 模型[1, 2, 3]中与混合专家模型（MoE）相关的另一项关键架构突破：无辅助损失函数的负载均衡（Auxiliary-Loss-Free Load Balancing）[5]。在本文，我们将深入解析 DeepSeek 如何解决 MoE 的隐藏瓶颈------负载均衡，同时还通过消除梯度干扰和严格遵...

2025-04-02

236

资源下载

更多资源

优质分享App

近一个月的开发和优化，本站点的第一个app全新上线。该app采用极致压缩，本体才4.36MB。系统里面做了大量数据访问、缓存优化。方便用户在手机上查看文章。后续会推出HarmonyOS的适配版本。

腾讯云软件源

为解决软件依赖安装时官方源访问速度慢的问题，腾讯云为一些软件搭建了缓存服务。您可以通过使用腾讯云软件源站来提升依赖包的安装速度。为了方便用户自由搭建服务架构，目前腾讯云软件源站支持公网访问和内网访问。

Rocky Linux

Rocky Linux（中文名：洛基）是由Gregory Kurtzer于2020年12月发起的企业级Linux发行版，作为CentOS稳定版停止维护后与RHEL（Red Hat Enterprise Linux）完全兼容的开源替代方案，由社区拥有并管理，支持x86_64、aarch64等架构。其通过重新编译RHEL源代码提供长期稳定性，采用模块化包装和SELinux安全架构，默认包含GNOME桌面环境及XFS文件系统，支持十年生命周期更新。

Sublime Text

Sublime Text具有漂亮的用户界面和强大的功能，例如代码缩略图，Python的插件，代码段等。还可自定义键绑定，菜单和工具栏。Sublime Text 的主要功能包括：拼写检查，书签，完整的 Python API ， Goto 功能，即时项目切换，多选择，多窗口等等。Sublime Text 是一个跨平台的编辑器，同时支持Windows、Linux、Mac OS X等操作系统。