想提高查询性能,用GaussDB(DWS) in表达式还是or表达式?
摘要:在本文中,我将重点分析在各种通用场景下,IN 运算符和 OR 运算符查询的性能差异,并探索这些性能差异背后的原因,目的是为了帮助DWS用户最大化的提升其查询性能。
前言
实验数据准备
-- 建表, DWS列存,hstore_opt表,并声明id列为primary key。 CREATE TABLE item( id INTEGER NOT NULL, name VARCHAR(30), price INTEGER, quantity INTEGER, primary key (id) ) WITH (orientation=COLUMN, enable_hstore_opt=TRUE); -- 随机插入两百万行数据。 INSERT INTO item SELECT id, SUBSTR(MD5(RANDOM()::text), 0, 20) AS name, (RANDOM() * 10000)::int AS price, (RANDOM() * 10000)::int AS quantity FROM generate_series(1, 2000000) AS t(id); -- Merge all data from hstore delta table info CU. select hstore_full_merge('item');
单一属性过滤
单个索引属性
-- IN expression SELECT * FROM item WHERE id IN (...); -- OR expression SELECT * FROM item WHERE id = ? OR id = ? OR ... ;
postgres=# explain SELECT * FROM item where id in (1559267,311557,234010,1863199,876092,580136,1116400,575622,380796,1518233); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+------------------------------------------+--------+----------+---------+--------- 1 | -> Row Adapter | 10 | | 43 | 63.79 2 | -> Vector Streaming (type: GATHER) | 10 | | 43 | 63.79 3 | -> CStore Index Heap Scan on item | 10 | 16MB | 43 | 57.79 4 | -> CStore Index Ctid Scan | 10 | 1MB | 0 | 38.12 Predicate Information (identified by plan id) --------------------------------------------------------------------------------------------------------------------------- 3 --CStore Index Heap Scan on item Recheck Cond: (id = ANY ('{1559267,311557,234010,1863199,876092,580136,1116400,575622,380796,1518233}'::integer[])) 4 --CStore Index Ctid Scan Index Cond: (id = ANY ('{1559267,311557,234010,1863199,876092,580136,1116400,575622,380796,1518233}'::integer[]))
postgres=# explain SELECT * FROM item WHERE id = 1559267 OR id = 311557 OR id = 234010 OR id = 1863199 OR id = 876092 OR id = 580136 OR id = 1116400 OR id = 575622 OR id = 380796 OR id = 1518233; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+-----------------------------------------------------------------+--------+----------+---------+--------- 1 | -> Row Adapter | 10 | | 43 | 61.99 2 | -> Vector Streaming (type: GATHER) | 10 | | 43 | 61.99 3 | -> CStore Index Heap Scan on item | 10 | 16MB | 43 | 55.99 4 | -> CStore Index Or(5, 6, 7, 8, 9, 10, 11, 12, 13, 14) | 10 | 1MB | 0 | 36.25 5 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 6 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 7 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 8 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 9 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 10 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 11 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 12 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 13 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 14 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 3 --CStore Index Heap Scan on item Recheck Cond: ((id = 1559267) OR (id = 311557) OR (id = 234010) OR (id = 1863199) OR (id = 876092) OR (id = 580136) OR (id = 1116400) OR (id = 575622) OR (id = 380796) OR (id = 1518233)) 5 --CStore Index Ctid Scan Index Cond: (id = 1559267) 6 --CStore Index Ctid Scan Index Cond: (id = 311557) 7 --CStore Index Ctid Scan Index Cond: (id = 234010) 8 --CStore Index Ctid Scan Index Cond: (id = 1863199) 9 --CStore Index Ctid Scan Index Cond: (id = 876092) 10 --CStore Index Ctid Scan Index Cond: (id = 580136) 11 --CStore Index Ctid Scan Index Cond: (id = 1116400) 12 --CStore Index Ctid Scan Index Cond: (id = 575622) 13 --CStore Index Ctid Scan Index Cond: (id = 380796) 14 --CStore Index Ctid Scan Index Cond: (id = 1518233)
单个未索引属性
-- IN expression SELECT * FROM item WHERE price IN (...); -- OR expression SELECT * FROM item WHERE price = ? OR price = ? OR ... ;
postgres=# explain SELECT * FROM item where price in (1988,5547,6631,4931,5752,2119,9647,3724,5146,873); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+----------------------------------------+--------+----------+---------+---------- 1 | -> Row Adapter | 1921 | | 43 | 19105.85 2 | -> Vector Streaming (type: GATHER) | 1921 | | 43 | 19105.85 3 | -> CStore Scan on item | 1921 | 1MB | 43 | 19015.85 Predicate Information (identified by plan id) ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --------------------------- 3 --CStore Scan on item CU Predicate Filter: (price = ANY ('{1988,5547,6631,4931,5752,2119,9647,3724,5146,873}'::integer[])) Pushdown Predicate Filter: (((price >= 873) AND (price <= 9647)) AND ((price = 1988) OR (price = 5547) OR (price = 6631) OR (price = 4931) OR (price = 5752) OR (price = 2119) OR (price = 9647) OR (price = 3724) OR (price = 5146) OR (price = 873)))
postgres=# explain SELECT * FROM item where price = 1988 OR price = 5547 OR price = 6631 OR price = 4931 OR price = 5752 OR price = 2119 OR price = 9647 OR price = 3724 OR price = 5146 OR price = 873; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+----------------------------------------+--------+----------+---------+---------- 1 | -> Row Adapter | 1921 | | 43 | 31356.73 2 | -> Vector Streaming (type: GATHER) | 1921 | | 43 | 31356.73 3 | -> CStore Scan on item | 1921 | 1MB | 43 | 31266.73 Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 3 --CStore Scan on item Filter: ((price = 1988) OR (price = 5547) OR (price = 6631) OR (price = 4931) OR (price = 5752) OR (price = 2119) OR (price = 9647) OR (price = 3724) OR (price = 5146) OR (price = 873)) Pushdown Predicate Filter: ((price = 1988) OR (price = 5547) OR (price = 6631) OR (price = 4931) OR (price = 5752) OR (price = 2119) OR (price = 9647) OR (price = 3724) OR (price = 5146) OR (price = 873))
postgres=# set enable_cu_predicate_pushdown = off; SET
postgres=# explain SELECT * FROM item where price in (1988,5547,6631,4931,5752,2119,9647,3724,5146,873); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+----------------------------------------+--------+----------+---------+---------- 1 | -> Row Adapter | 1921 | | 43 | 19105.85 2 | -> Vector Streaming (type: GATHER) | 1921 | | 43 | 19105.85 3 | -> CStore Scan on item | 1921 | 1MB | 43 | 19015.85 Predicate Information (identified by plan id) ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --------------------------- 3 --CStore Scan on item Filter: (price = ANY ('{1988,5547,6631,4931,5752,2119,9647,3724,5146,873}'::integer[])) Pushdown Predicate Filter: (((price >= 873) AND (price <= 9647)) AND ((price = 1988) OR (price = 5547) OR (price = 6631) OR (price = 4931) OR (price = 5752) OR (price = 2119) OR (price = 9647) OR (price = 3724) OR (price = 5146) OR (price = 873)))
多属性过滤
两个索引属性
postgres=# create index idx_item_name_quantity on item (name, quantity); CREATE INDEX
SELECT * FROM item WHERE (name, quantity) IN (('a', 1), ('b', 2)); SELECT * FROM item WHERE (name='a' AND quantity=1) OR (name='b' AND quantity=2)
postgres=# explain SELECT * FROM item WHERE (name, quantity) IN (('a', 1), ('b', 2)); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+------------------------------------------+--------+----------+---------+--------- 1 | -> Row Adapter | 2 | | 43 | 17.24 2 | -> Vector Streaming (type: GATHER) | 2 | | 43 | 17.24 3 | -> CStore Index Heap Scan on item | 2 | 16MB | 43 | 11.24 4 | -> CStore Index Or(5, 6) | 2 | 1MB | 0 | 7.22 5 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.61 6 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.61 Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------- 3 --CStore Index Heap Scan on item Recheck Cond: ((((name)::text = 'a'::text) AND (quantity = 1)) OR (((name)::text = 'b'::text) AND (quantity = 2))) 5 --CStore Index Ctid Scan Index Cond: (((name)::text = 'a'::text) AND (quantity = 1)) 6 --CStore Index Ctid Scan Index Cond: (((name)::text = 'b'::text) AND (quantity = 2))
postgres=# drop index idx_item_name_quantity; DROP INDEX
SELECT * FROM item WHERE (name, quantity) IN (('a', 1), ('b', 2)); SELECT * FROM item WHERE (name='a' AND quantity=1) OR (name='b' AND quantity=2)
postgres=# explain SELECT * FROM item WHERE (name, quantity) IN (('a', 1), ('b', 2)); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+----------------------------------------+--------+----------+---------+---------- 1 | -> Row Adapter | 2 | | 43 | 16564.78 2 | -> Vector Streaming (type: GATHER) | 2 | | 43 | 16564.78 3 | -> CStore Scan on item | 2 | 1MB | 43 | 16558.78 Predicate Information (identified by plan id) --------------------------------------------------------------------------------------------------------------------------------------- 3 --CStore Scan on item Filter: ((((name)::text = 'a'::text) AND (quantity = 1)) OR (((name)::text = 'b'::text) AND (quantity = 2))) Pushdown Predicate Filter: ((((name)::text = 'a'::text) AND (quantity = 1)) OR (((name)::text = 'b'::text) AND (quantity = 2)))
postgres=# explain SELECT * FROM item WHERE (name='a' AND quantity=1) OR (name='b' AND quantity=2) QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+----------------------------------------+--------+----------+---------+---------- 1 | -> Row Adapter | 2 | | 43 | 16564.78 2 | -> Vector Streaming (type: GATHER) | 2 | | 43 | 16564.78 3 | -> CStore Scan on item | 2 | 1MB | 43 | 16558.78 Predicate Information (identified by plan id) --------------------------------------------------------------------------------------------------------------------------------------- 3 --CStore Scan on item Filter: ((((name)::text = 'a'::text) AND (quantity = 1)) OR (((name)::text = 'b'::text) AND (quantity = 2))) Pushdown Predicate Filter: ((((name)::text = 'a'::text) AND (quantity = 1)) OR (((name)::text = 'b'::text) AND (quantity = 2)))
一个属性有索引 + 一个属性无索引
SELECT * FROM item WHERE (id, price) IN ((1, 1), (2, 2)); SELECT * FROM item WHERE (id=1 AND price=1) OR (id=2 AND price=2)
postgres=# explain SELECT * FROM item WHERE (id, price) IN ((1, 1), (2, 2)); QUERY PLAN ----------------------------------------------------------------------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+------------------------------------------+--------+----------+---------+--------- 1 | -> Row Adapter | 1 | | 43 | 17.27 2 | -> Vector Streaming (type: GATHER) | 1 | | 43 | 17.27 3 | -> CStore Index Heap Scan on item | 1 | 16MB | 43 | 11.27 4 | -> CStore Index Or(5, 6) | 2 | 1MB | 0 | 7.25 5 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 6 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 Predicate Information (identified by plan id) -------------------------------------------------------------------------- 3 --CStore Index Heap Scan on item Recheck Cond: ((id = 1) OR (id = 2)) Filter: (((id = 1) AND (price = 1)) OR ((id = 2) AND (price = 2))) 5 --CStore Index Ctid Scan Index Cond: (id = 1) 6 --CStore Index Ctid Scan Index Cond: (id = 2)
postgres=# explain SELECT * FROM item WHERE (id=1 AND price=1) OR (id=2 AND price=2); QUERY PLAN ----------------------------------------------------------------------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+------------------------------------------+--------+----------+---------+--------- 1 | -> Row Adapter | 1 | | 43 | 17.27 2 | -> Vector Streaming (type: GATHER) | 1 | | 43 | 17.27 3 | -> CStore Index Heap Scan on item | 1 | 16MB | 43 | 11.27 4 | -> CStore Index Or(5, 6) | 2 | 1MB | 0 | 7.25 5 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 6 | -> CStore Index Ctid Scan | 1 | 1MB | 0 | 3.62 Predicate Information (identified by plan id) -------------------------------------------------------------------------- 3 --CStore Index Heap Scan on item Recheck Cond: ((id = 1) OR (id = 2)) Filter: (((id = 1) AND (price = 1)) OR ((id = 2) AND (price = 2))) 5 --CStore Index Ctid Scan Index Cond: (id = 1) 6 --CStore Index Ctid Scan Index Cond: (id = 2)
总结

低调大师中文资讯倾力打造互联网数据资讯、行业资源、电子商务、移动互联网、网络营销平台。
持续更新报道IT业界、互联网、市场资讯、驱动更新,是最及时权威的产业资讯及硬件资讯报道平台。
转载内容版权归作者及来源网站所有,本站原创内容转载请注明来源。
- 上一篇
从 CephFS 到 JuiceFS:同程旅游亿级文件存储平台构建之路
随着公司业务的快速发展,同程旅行的非结构化的数据突破 10 亿,在 2022 年,同程首先完成了对象存储服务的建设。当时,分布式文件系统方面,同程使用的是 CephFS,随着数据量的持续增长,CephFS 的高复杂性和运维难度逐渐成为瓶颈。考虑到可观测性、稳定性和管理效率等维度,同程最终决定转向 JuiceFS。 目前,同程已在 JuiceFS 上构建了一个企业级存储平台,平台规模涵盖了超过 20 个文件系统和 2000 多个客户端挂载点,能够高效管理亿级文件和百 TiB 级别的数据量。值得一提的是,整个存储平台的日常运维工作仅需一人。该平台主要应用于多个场景,包括 AI 应用、容器云环境以及应用级共享存储需求。 01 文件系统选型:从 CephFS 到 JuiceFS 在使用 JuiceFS 之前,同程内部使用的是 Ceph 来提供对象存储和分布式文件存储服务。然而,Ceph 的技术栈复杂度较高,掌握难度大,对使用经验和运维经验都有较高要求。同时,在可运维性和生态建设方面也存在一定不足,对日常稳定性保障构成了较大挑战。 相比之下,JuiceFS 具有诸多优势。JuiceFS 设计上实...
- 下一篇
Java 内存管理 “通关秘籍”:原理吃透,优化无忧,实战称王
一、引言 在Java编程中,内存管理是一个至关重要的方面,它直接影响着程序的性能、稳定性和可扩展性。Java的内存管理机制由Java虚拟机(JVM)负责,包括内存分配和回收等关键任务。理解Java内存管理的工作原理对于编写高效、可靠的Java程序至关重要。 本文将深入探讨Java内存管理的各个方面,包括内存结构、对象的内存分配、内存回收机制以及相关的优化技巧和最佳实践。通过对这些内容的详细阐述,读者将能够更好地理解Java程序在内存中的运行机制,从而优化程序性能,避免常见的内存问题。 二、Java内存结构 (一)程序计数器(PC Register) 作用与特点 程序计数器是一块较小的内存区域,用于记录当前线程所执行的字节码的行号。它就像是线程执行的“导航仪”,指引着字节码解释器按顺序选取下一条字节码指令执行。例如,在执行循环、分支、方法调用等操作时,程序计数器的值会相应改变,以确保线程在正确的位置继续执行。 每个线程都有独立的程序计数器,它们之间互不影响,这使得多线程能够在同一时刻各自执行不同的字节码指令,实现线程的并发执行。 示例代码 public class PCRegisterE...
相关文章
文章评论
共有0条评论来说两句吧...
文章二维码
点击排行
推荐阅读
最新文章
- Docker使用Oracle官方镜像安装(12C,18C,19C)
- Docker安装Oracle12C,快速搭建Oracle学习环境
- CentOS7安装Docker,走上虚拟化容器引擎之路
- Linux系统CentOS6、CentOS7手动修改IP地址
- Docker快速安装Oracle11G,搭建oracle11g学习环境
- CentOS7编译安装Cmake3.16.3,解决mysql等软件编译问题
- CentOS关闭SELinux安全模块
- CentOS7编译安装Gcc9.2.0,解决mysql等软件编译问题
- Hadoop3单机部署,实现最简伪集群
- CentOS6,7,8上安装Nginx,支持https2.0的开启