您现在的位置是：首页 > 文章详情

MySQL 隐式转换必知必会

日期：2024-05-21点击：212收藏

在生产环境中经常会有一些隐式类型转换导致SQL索引失效，性能极差，进而影响影响集群负载和业务的情况。本文总结了隐式转换常见的场景，在生产中要尽量避免 SQL 隐式转换的出现。

作者：张洛丹，热衷于数据库技术，不断探索，期望未来能够撰写更有深度的文章，输出更有价值的内容！

爱可生开源社区出品，原创内容未经授权不得随意使用，转载请联系小编并注明来源。

本文约 3000 字，预计阅读需要 10 分钟。

常见的 SQL 产生隐式转换的场景有：

数据类型的隐式转换
字符集的隐式转换

其中，特别是在表连接场景和存储过程中的字符集转换很容易被忽略。

说明：字符集是针对字符类型数据的编码规则，对于数值类型则不需要进行转换字符集。

数据类型的隐式转换

测试表结构

t1 表字段 a 为 VARCHAR 类型，t2 表字段 a 为 INT 类型。

mysql> show create database test1\G *************************** 1. row *************************** Database: test1 Create Database: CREATE DATABASE `test1` /*!40100 DEFAULT CHARACTER SET utf8 */ 1 row in set (0.00 sec) mysql> show create table t1\G *************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `id` int(11) NOT NULL, `a` varchar(20) DEFAULT NULL, `b` varchar(20) DEFAULT NULL, PRIMARY KEY (`id`), KEY `a` (`a`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec) mysql> show create table t2\G *************************** 1. row *************************** Table: t2 Create Table: CREATE TABLE `t2` ( `id` int(11) NOT NULL, `a` int(11) DEFAULT NULL, `b` varchar(20) DEFAULT NULL, PRIMARY KEY (`id`), KEY `a` (`a`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec)

单表示例

这里需要说明的是，有以下两种类型的转换：

当字段类型为字符串类型，参数为整型时，会导致索引失效
而字段类型为整型，传入的参数为字符串类型时，不会导致索引失效

这是因为在字符串与数字进行比较时，MySQL 会将字符串类型转换为数字进行比较，因此当字段类型为字符串时，会在字段上加函数，而导致索引失效。

官方文档说明：Strings are automatically converted to numbers and numbers to strings as necessary.

-- 字段类型为varchar，传参为整数，无法走到索引 mysql> explain select * from t1 where a=1000; +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | t1 | NULL | ALL | a | NULL | NULL | NULL | 498892 | 10.00 | Using where | +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+ 1 row in set, 3 warnings (0.00 sec) mysql> show warnings; +---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------+ | Warning | 1739 | Cannot use ref access on index 'a' due to type or collation conversion on field 'a' | | Warning | 1739 | Cannot use range access on index 'a' due to type or collation conversion on field 'a' | | Note | 1003 | /* select#1 */ select `test1`.`t1`.`id` AS `id`,`test1`.`t1`.`a` AS `a`,`test1`.`t1`.`b` AS `b` from `test1`.`t1` where (`test1`.`t1`.`a` = 1000) | +---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------+ 3 rows in set (0.00 sec) -- 字段类型为int，传参为字符串，可以走到索引 mysql> explain select * from t2 where a='1000'; +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+ | 1 | SIMPLE | t2 | NULL | ref | a | a | 5 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+ 1 row in set, 1 warning (0.00 sec)

至于为什么不能将数字转换为字符串进行比较呢?

下面的比较结果：

字符串的比较是逐个比较字符串的大小，直到找到不同的字符，这样的比较结果和数字的比较结果是不同的。

mysql> select '2000' <'250'; +---------------+ | '2000' <'250' | +---------------+ | 1 | +---------------+ 1 row in set (0.00 sec)

表连接中的数据类型转换

当两个表的连接字段类型不一致时会导致隐式转换（MySQL 内部增加 cast() 函数），无法走到连接字段索引，进而可能无法使用最优的表连接顺序。

原本作为被驱动表的表由于无法使用到索引，而可能作为驱动表。

示例：

如下，正常情况下会选择 t2 表作为驱动表，但由于数据类型不同，实际上执行的 SQL 是：select * from t1 join t2 on cast(t1.a as unsigned)=t2.a where t2.id<1000
如果 t1 作为被驱动表，则没有办法走到 t1.a 的索引，因此选择 t1 表作为驱动表

mysql> explain select * from t1 join t2 on t1.a=t2.a where t2.id<1000; +----+-------------+-------+------------+------+---------------+------+---------+------------+--------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------------+--------+----------+-----------------------+ | 1 | SIMPLE | t1 | NULL | ALL | a | NULL | NULL | NULL | 498892 | 100.00 | Using where | | 1 | SIMPLE | t2 | NULL | ref | PRIMARY,a | a | 5 | test1.t1.a | 1 | 5.00 | Using index condition | +----+-------------+-------+------------+------+---------------+------+---------+------------+--------+----------+-----------------------+ 2 rows in set, 2 warnings (0.00 sec) mysql> show warnings; +---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Warning | 1739 | Cannot use ref access on index 'a' due to type or collation conversion on field 'a' | | Note | 1003 | /* select#1 */ select `test1`.`t1`.`id` AS `id`,`test1`.`t1`.`a` AS `a`,`test1`.`t1`.`b` AS `b`,`test1`.`t2`.`id` AS `id`,`test1`.`t2`.`a` AS `a`,`test1`.`t2`.`b` AS `b` from `test1`.`t1` join `test1`.`t2` where ((`test1`.`t2`.`id` < 1000) and (`test1`.`t1`.`a` = `test1`.`t2`.`a`)) | +---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 2 rows in set (0.01 sec)

字符集的隐式转换

当参数字符集和字段字符集不同时，无法直接进行比较，而需要进行字符集转换，则可能需要在转换字段上加 convert() 函数来转换字符集，导致索引失效。

测试表结构

数据库字符集是 UTF8MB4
t1 表字符集是 UTF8
t2 表字符集是 UTF8MB4

mysql> show create database test\G *************************** 1. row *************************** Database: test Create Database: CREATE DATABASE `test` /*!40100 DEFAULT CHARACTER SET utf8mb4 */ mysql> show create table t1\G *************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `id` int(11) NOT NULL, `a` varchar(20) DEFAULT NULL, `b` varchar(20) DEFAULT NULL, PRIMARY KEY (`id`), KEY `a` (`a`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec) mysql> show create table t2\G *************************** 1. row *************************** Table: t2 Create Table: CREATE TABLE `t2` ( `id` int(11) NOT NULL, `a` varchar(20) DEFAULT NULL, `b` varchar(20) DEFAULT NULL, PRIMARY KEY (`id`), KEY `a` (`a`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 1 row in set (0.01 sec)

单表示例

-- 正常执行时，匹配字段的字符集（没有单独指定时继承表的字符集） mysql> explain select * from t1 where a='1000'; +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+ | 1 | SIMPLE | t1 | NULL | ref | a | a | 63 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+ 1 row in set, 1 warning (0.00 sec) -- 将参数转换不同的字符集，无法走到索引，而是全表扫描 mysql> explain select * from t1 where a=convert('1000' using utf8mb4); +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 2000 | 100.00 | Using where | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec) -- show warnings可以看到优化器进行了转换，在t1.a上加了convert函数，从而无法走到索引 mysql> show warnings; +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | /* select#1 */ select `test`.`t1`.`id` AS `id`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b` from `test`.`t1` where (convert(`test`.`t1`.`a` using utf8mb4) = <cache>(convert('1000' using utf8mb4))) | +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)

另外，需要注意的是：

MySQL 内部会优先将低级的字符集转换为更高级的字符集，例如将 UTF8 转换为 UTF8MB4。

在前面的示例中，convert() 函数加在 t1.a 上，而下面这个示例，convert() 函数加在参数上，而非 t2.a 字段上，这种情况则没有导致性能变差：

mysql> show create table t2\G *************************** 1. row *************************** Table: t2 Create Table: CREATE TABLE `t2` ( `id` int(11) NOT NULL, `a` varchar(20) DEFAULT NULL, `b` varchar(20) DEFAULT NULL, PRIMARY KEY (`id`), KEY `a` (`a`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 1 row in set (0.00 sec) mysql> explain select * from t2 where a=convert('1000' using utf8); +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+ | 1 | SIMPLE | t2 | NULL | ref | a | a | 83 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+ 1 row in set, 1 warning (0.00 sec) mysql> show warnings; +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | /* select#1 */ select `test`.`t2`.`id` AS `id`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b` from `test`.`t2` where (`test`.`t2`.`a` = convert(convert('1000' using utf8) using utf8mb4)) | +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)

综上：

在当表字段字符集为更低级的字符集（如 UTF8），而传入的值为更高级的字符集（如 UTF8MB4），则此时会转换表字段的字符集，相当于字段上使用了函数，索引失效。
当表字段为更高级的字符集（如 UTF8MB4），而传入的值为更低级的字符集（如 UTF8），则此时会将传入的值进行字符集转换，并不会导致索引失效。

但我们通常不会去手工使用 convert() 函数转换参数的字符集，在后文两种场景中可能会出现比较容易忽略的隐式类型转换，引发生产问题。

表连接中的字符集转换

当两个表的连接字段字符集不一致时会导致隐式转换（MySQL 内部增加 convert() 函数），无法走到连接字段索引，进而可能无法使用最优的表连接顺序。

原本作为被驱动表的表由于无法使用到索引，而可能作为驱动表。

示例：

正常情况下，MySQL 会优先小结果集的表作为驱动表，在本例中即为 t2 为驱动表，t1 为被驱动表。
但是由于字符集不同，实际上执行的 SQL 为 show warnings 看到的，对 t1.a 字段加了 convert() 函数进行转换字符集，则无法走到 t1.a 字段的索引而不得不改变连接顺序。

mysql> explain select * from t1 left join t2 on t1.a=t2.a where t2.id<1000; +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-----------------------+ | 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 498649 | 100.00 | NULL | | 1 | SIMPLE | t2 | NULL | ref | PRIMARY,a | a | 83 | func | 1 | 4.79 | Using index condition | +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-----------------------+ 2 rows in set, 1 warning (0.00 sec) mysql> show warnings; +-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | /* select#1 */ select `test`.`t1`.`id` AS `id`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`id` AS `id`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b` from `test`.`t1` join `test`.`t2` where ((`test`.`t2`.`id` < 1000) and (convert(`test`.`t1`.`a` using utf8mb4) = `test`.`t2`.`a`)) | +-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) -- 在下面示例中，虽然也发生了类型转换，但是效率并没有变差，因为原本最优的连接顺序就是t1作为驱动表 mysql> explain select * from t1 left join t2 on t1.a=t2.a where t1.id<1000; +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+ | 1 | SIMPLE | t1 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 999 | 100.00 | Using where | | 1 | SIMPLE | t2 | NULL | ref | a | a | 83 | func | 1 | 100.00 | Using where | +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+ 2 rows in set, 1 warning (0.00 sec) mysql> show warnings; +-------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | /* select#1 */ select `test`.`t1`.`id` AS `id`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`id` AS `id`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b` from `test`.`t1` left join `test`.`t2` on((convert(`test`.`t1`.`a` using utf8mb4) = `test`.`t2`.`a`)) where (`test`.`t1`.`id` < 1000) | +-------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)

存储过程中的字符集转换

这也是比较容易忽略的一种场景，问题的发现是在生产环境存储过程中根据主键更新，但却需要执行 10s+。

存储过程中变量的字符集默认继承自 database 的字符集（也可以在创建时指定），当表字段字符集和 database 的字符集不一样时，就会出现类似前面的隐式字符集类型转换。

示例：

database 的字符集是 UTF8MB4
character_set_client 和 collation_connection 是创建存储过程时会话的 character_set_client 和 collation_connection 的值
经测试存储过程中的变量的字符集是和数据库级别的字符集一致

-- 存储过程信息： Database Collation: utf8mb4_general_ci mysql> show create procedure update_data\G *************************** 1. row *************************** Procedure: update_data sql_mode: ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION Create Procedure: CREATE DEFINER=`root`@`%` PROCEDURE `update_data`() begin declare j int; declare n varchar(100); select charset(n); set j=1; while(j<=2000)do set n = cast(j as char); select 1,now(); update t1 set b=concat(b,'1') where a=n; select 2,now(); select sleep(1); set j=j+1; end while; end character_set_client: utf8mb4 collation_connection: utf8mb4_general_ci Database Collation: utf8mb4_general_ci 1 row in set (0.00 sec) 如下，在执行存储过程后，看到打印的变量n的字符集是utf8mb4 mysql> call update_data(); +------------+ | charset(n) | +------------+ | utf8mb4 | +------------+ 1 row in set (0.00 sec)

根据索引字段 a 更新的语句实际上是变成了下面这样，走的是全表扫描（type:index，key：primary）。

mysql> explain update t1 set b=concat(b,'1') where a=convert('1000' using utf8mb4); +----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+ | 1 | UPDATE | t1 | NULL | index | NULL | PRIMARY | 4 | NULL | 498649 | 100.00 | Using where | +----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+ 1 row in set (0.00 sec) -- 而正常情况下，执行计划为： mysql> explain update t1 set b=concat(b,'1') where a='1000'; +----+-------------+-------+------------+-------+---------------+------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+------+---------+-------+------+----------+-------------+ | 1 | UPDATE | t1 | NULL | range | a | a | 63 | const | 1 | 100.00 | Using where | +----+-------------+-------+------------+-------+---------------+------+---------+-------+------+----------+-------------+ 1 row in set (0.00 sec)

更新时间也由 0.00sec 变为 0.60sec，在表数据量很大的情况下，全表扫描将会对生产产生较大影响。

mysql> update t1 set b=concat(b,'1') where a='1000'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> update t1 set b=concat(b,'1') where a=convert('1000' using utf8mb4); Query OK, 1 row affected (0.60 sec) Rows matched: 1 Changed: 1 Warnings: 0

如何避免隐式转换

对于数据类型的隐式转换：

规范数据类型的选择
SQL 传参与字段数据类型匹配

对于字符集的隐式转换：客户端字符集、服务器端字符集、数据库字符集、表字符集、字段字符集保持一致。

更多技术文章，请访问：https://opensource.actionsky.com/

关于 SQLE

SQLE 是一款全方位的 SQL 质量管理平台，覆盖开发至生产环境的 SQL 审核和管理。支持主流的开源、商业、国产数据库，为开发和运维提供流程自动化能力，提升上线效率，提高数据质量。

SQLE 获取

类型	地址
版本库	https://github.com/actiontech/sqle
文档	https://actiontech.github.io/sqle-docs/
发布信息	https://github.com/actiontech/sqle/releases
数据审核插件开发文档	https://actiontech.github.io/sqle-docs/docs/dev-manual/plugins/howtouse

原文链接：https://my.oschina.net/actiontechoss/blog/11183928

关注公众号

低调大师中文资讯倾力打造互联网数据资讯、行业资源、电子商务、移动互联网、网络营销平台。

持续更新报道IT业界、互联网、市场资讯、驱动更新,是最及时权威的产业资讯及硬件资讯报道平台。

转载内容版权归作者及来源网站所有，本站原创内容转载请注明来源。

Java分享

MySQL 隐式转换必知必会

数据类型的隐式转换

测试表结构

单表示例

表连接中的数据类型转换

字符集的隐式转换

测试表结构

单表示例

表连接中的字符集转换

存储过程中的字符集转换

如何避免隐式转换

关于 SQLE

SQLE 获取

案例分享｜Alluxio在自动驾驶模型训练中的应用与部署

【Java orm 框架比较】十二大表测试新增四十字段测试，优化 jpa 及部分框架使用

相关文章

文章评论

文章二维码

点击排行

推荐阅读

最新文章

Java分享

MySQL 隐式转换必知必会

数据类型的隐式转换

测试表结构

单表示例

表连接中的数据类型转换

字符集的隐式转换

测试表结构

单表示例

表连接中的字符集转换

存储过程中的字符集转换

如何避免隐式转换

关于 SQLE

SQLE 获取

案例分享｜Alluxio在自动驾驶模型训练中的应用与部署

【Java orm 框架比较】十二 大表测试新增四十字段测试，优化 jpa 及部分框架使用

相关文章

文章评论

文章二维码

点击排行

推荐阅读

最新文章

【Java orm 框架比较】十二大表测试新增四十字段测试，优化 jpa 及部分框架使用