rand()函数埋的一个坑,大家注意了
一、背景
在开发博客网站随机文章的时候,刚开始没有注意到这个问题,随便加了一个rand()函数,数据量小的时候性能还可以,一旦数据达到几十万的时候,就会产生性能问题。
二、问题分析
刚开始的代码如下:
-
if(!StringUtils.isEmpty(postParam.getSortType())){ -
if(postParam.getSortType().equals(PostConstant.SORTTYPE_COMMMENT)){ -
example.setOrderByClause(" comment_count desc "); -
}else if(postParam.getSortType().equals(PostConstant.SORTTYPE_DATE)){ -
example.setOrderByClause(" post_date desc "); -
}else if(postParam.getSortType().equals(PostConstant.SORTTYPE_RANDOM)){ -
example.setOrderByClause(" RAND() "); -
log.info("开始加载随机文章列表。。。。"); -
}else if(postParam.getSortType().equals(PostConstant.SORTTYPE_VIEW)){ -
example.setOrderByClause(" post_date desc "); -
} -
}else{ -
example.setOrderByClause(" post_date desc "); -
} -
Page<WpPosts> page =(Page<WpPosts>) wpPostsMapper.selectByExample(example);
启动程序,查看随机文章,后台日志报错
-
INFO | 2018-11-19 18:43:53,040 | JWordpres-v2.0 | [http-nio-80-exec-17-47] (o.s.b.f.x.XmlBeanDefinitionReader:317) | Loading XML bean definitions from class path resource [org/springframework/jdbc/support/sql-error-codes.xml] -
INFO | 2018-11-19 18:43:53,202 | JWordpres-v2.0 | [http-nio-80-exec-17-47] (o.s.j.s.SQLErrorCodesFactory:126) | SQLErrorCodes loaded: [DB2, Derby, H2, HSQL, Informix, MS-SQL, MySQL, Oracle, PostgreSQL, Sybase, Hana] -
ERROR | 2018-11-19 18:43:53,231 | JWordpres-v2.0 | [http-nio-80-exec-17-47] (o.a.c.c.C.[.[.[.[dispatcherServlet]:181) | Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.jdbc.UncategorizedSQLException: -
### Error querying database. Cause: java.sql.SQLException: Incorrect key file for table '/tmp/#sql_7c6d_0.MYI'; try to repair it -
### The error may exist in cn/liuhaihua/web/mapper/WpPostsMapper.java (best guess) -
### The error may involve cn.liuhaihua.web.mapper.WpPostsMapper.selectByExample-Inline -
### The error occurred while setting parameters -
### SQL: SELECT id,post_author,post_date,post_date_gmt,post_content,post_title,post_excerpt,post_status,comment_status,ping_status,post_password,post_name,to_ping,pinged,post_modified,post_modified_gmt,post_content_filtered,post_parent,guid,menu_order,post_type,post_mime_type,comment_count FROM wp_posts WHERE ( post_type = ? and post_status = ? ) order by RAND() LIMIT 10 -
### Cause: java.sql.SQLException: Incorrect key file for table '/tmp/#sql_7c6d_0.MYI'; try to repair it -
; uncategorized SQLException for SQL []; SQL state [HY000]; error code [126]; Incorrect key file for table '/tmp/#sql_7c6d_0.MYI'; try to repair it; nested exception is java.sql.SQLException: Incorrect key file for table '/tmp/#sql_7c6d_0.MYI'; try to repair it] with root cause -
java.sql.SQLException: Incorrect key file for table '/tmp/#sql_7c6d_0.MYI'; try to repair it
查看sql 发现如下代码:
-
SELECT -
id, -
post_author, -
post_date, -
post_date_gmt, -
post_content, -
post_title, -
post_excerpt, -
post_status, -
comment_status, -
ping_status, -
post_password, -
post_name, -
to_ping, -
pinged, -
post_modified, -
post_modified_gmt, -
post_content_filtered, -
post_parent, -
guid, -
menu_order, -
post_type, -
post_mime_type, -
comment_count -
FROM -
wp_posts -
WHERE -
( -
post_type = "post" -
AND post_status = "publish" -
) -
ORDER BY -
RAND() -
LIMIT 10
这个sql会造成严重的性能问题,rand()造成在系统文件上来回排序。非常损耗性能
-
[Err] 126 - Incorrect key file for table '/tmp/#sql_7c6d_0.MYI'; try to repair it
三、优化方案
知道问题所在,优化方案其实也蛮简单的。原理如下
-
1首先 select count(*) from test where $where; (计算所需要的数据的总条数) -
2然后 $id=rand($a[0],$a[1]); 产生一个随机数; -
3最后 SELECT * FROM tablename WHERE id>='$id' LIMIT 1 将上面产生的随机数带入查询;
修改代码如下:
-
if(!StringUtils.isEmpty(postParam.getSortType())){ -
if(postParam.getSortType().equals(PostConstant.SORTTYPE_COMMMENT)){ -
example.setOrderByClause(" comment_count desc "); -
}else if(postParam.getSortType().equals(PostConstant.SORTTYPE_DATE)){ -
example.setOrderByClause(" post_date desc "); -
}else if(postParam.getSortType().equals(PostConstant.SORTTYPE_RANDOM)){ -
//example.setOrderByClause(" RAND() "); -
/*** -
* 首先 select count(*) from test where $where; (计算所需要的数据的总条数) -
*然后 $id=rand($a[0],$a[1]); 产生一个随机数; -
*最后 SELECT * FROM tablename WHERE id>='$id' LIMIT 1 将上面产生的随机数带入查询; -
*/ -
log.info("开始加载随机文章列表。。。。"); -
Random random = new Random(); -
int randId =random.nextInt(count); -
criteria.andGreaterThan("id", randId); -
}else if(postParam.getSortType().equals(PostConstant.SORTTYPE_VIEW)){ -
example.setOrderByClause(" post_date desc "); -
} -
}else{ -
example.setOrderByClause(" post_date desc "); -
} -
Page<WpPosts> page =(Page<WpPosts>) wpPostsMapper.selectByExample(example);
这样修改后之后,系统完美运行,sql查询时间缩短到0.058秒
四、总结
其实mysql官网也说明这种情况了,意思是说当记录超过30万,rand这种方法就不可用,需要更换方案。
-
works for small tables, but once the tables grow larger than 300,000 records or -
so this will be very slow because MySQL will have to process ALL the entries from the table, -
order them randomly and then return the first row of the ordered result, -
and this sorting takes long time. -
Instead you can do it like this (atleast if you have an auto_increment PK):
官方建议修改成这样
-
SELECT MIN(id), MAX(id) FROM tablename; -
Fetch the result into $a -
$id=rand($a[0],$a[1]); -
SELECT * FROM tablename WHERE id>='$id' LIMIT 1
原文发布时间为:2018-11-20
本文作者:HARRIES