2021-06-19

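Performance differences between the two ways of writing a recursive query

For recursive queries, KINGBASE users can use either connect by or with recursive. Below, we work through an example to see how the two differ.

I. Building the test data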

-- Base table: 100000 rows, pid filled in level by level
create table test_recursive(id integer,pid integer,name varchar,description text);
insert into test_recursive(id,name,description) select generate_series(1,100000),'a'||generate_series(1,100000),repeat('desc',500);
update test_recursive set pid=1 where id between 2 and 10;
update test_recursive set pid=mod(id,9)+2 where id between 11 and 100;
update test_recursive set pid=mod(id,90)+11 where id between 101 and 1000;
update test_recursive set pid=mod(id,900)+101 where id between 1001 and 10000;
update test_recursive set pid=mod(id,9000)+1001 where id between 10001 and 100000;

-- Same rows, stored in random physical order (random() needs the parentheses)
create table test_recursive_random(id integer,pid integer,name varchar,description text);
insert into test_recursive_random select * from test_recursive order by random();
create index ind_test_recursive_random_id on test_recursive_random(id);
create index ind_test_recursive_random_pid on test_recursive_random(pid);
vacuum full test_recursive_random;
analyze test_recursive_random;

create index ind_test_recursive_id on test_recursive(id);
create index ind_test_recursive_pid on test_recursive(pid);
vacuum full test_recursive;
analyze test_recursive;

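This example builds five levels of data below the root row (id = 1), in two variants: sorted (test_recursive) and unsorted (test_recursive_random, the same rows inserted in random order).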
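To make the shape of the test data concrete, here is a minimal Python sketch that mirrors the pid assignments from the UPDATE statements and counts how many ids end up at each depth. The function names `parent_of` and `depth_counts` are hypothetical helpers introduced only for this illustration; nothing here runs against the database.

```python
from collections import Counter

def parent_of(node_id):
    """Mirror the UPDATE statements that assign pid in test_recursive."""
    if node_id == 1:
        return None                      # root row: pid stays NULL
    if node_id <= 10:
        return 1
    if node_id <= 100:
        return node_id % 9 + 2
    if node_id <= 1000:
        return node_id % 90 + 11
    if node_id <= 10000:
        return node_id % 900 + 101
    return node_id % 9000 + 1001

def depth_counts(n=100000):
    """Count how many ids sit at each depth (root = depth 0)."""
    depth = {1: 0}
    for node_id in range(2, n + 1):
        # Every pid is smaller than its child's id, so it is already resolved.
        depth[node_id] = depth[parent_of(node_id)] + 1
    counts = Counter(depth.values())
    return [counts[d] for d in sorted(counts)]

level_sizes = depth_counts()
print(level_sizes)
```

Under these assumptions the tree has one root row followed by five levels, each ten times larger than the one above it, which adds up to the 100000 rows returned by every query in this post.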

II. Using connect by

Query performance of connect by: 746 ms

test=# explain analyze select id,pid,name from test_recursive start with id=1 connect by prior id = pid ;
                                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Recursive Union  (cost=0.29..422.37 rows=101 width=14) (actual time=0.038..728.281 rows=100000 loops=1)
   ->  Index Scan using ind_test_recursive_id on test_recursive  (cost=0.29..8.31 rows=1 width=14) (actual time=0.015..0.017 rows=1 loops=1)
         Index Cond: (id = 1)
   ->  Nested Loop  (cost=0.42..41.30 rows=10 width=14) (actual time=0.002..0.003 rows=1 loops=100000)
         ->  WorkTable Scan on "connect"  (cost=0.00..0.02 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=100000)
         ->  Index Scan using ind_test_recursive_pid on test_recursive  (cost=0.42..41.18 rows=10 width=14) (actual time=0.002..0.002 rows=1 loops=100000)
               Index Cond: (pid = (PRIOR test_recursive.id))
 Planning Time: 0.185 ms
 Execution Time: 746.102 ms
(9 rows)

  

III. Kingbase with recursive queries

1. Sorted data: 302 ms

explain analyze with recursive tmp1 as (
select id,pid,name from test_recursive where id=1
union all
select a.id,a.pid,a.name from test_recursive a inner join tmp1 b on a.pid=b.id )
select * from tmp1;
                                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on tmp1  (cost=4013.94..4033.96 rows=1001 width=40) (actual time=0.020..297.856 rows=100000 loops=1)
   CTE tmp1
     ->  Recursive Union  (cost=0.29..4013.94 rows=1001 width=14) (actual time=0.018..257.298 rows=100000 loops=1)
           ->  Index Scan using ind_test_recursive_id on test_recursive  (cost=0.29..8.31 rows=1 width=14) (actual time=0.016..0.018 rows=1 loops=1)
                 Index Cond: (id = 1)
           ->  Nested Loop  (cost=0.42..398.56 rows=100 width=14) (actual time=20.529..38.777 rows=16666 loops=6)
                 ->  WorkTable Scan on tmp1 b  (cost=0.00..0.20 rows=10 width=4) (actual time=0.003..2.150 rows=16667 loops=6)
                 ->  Index Scan using ind_test_recursive_pid on test_recursive a  (cost=0.42..39.74 rows=10 width=14) (actual time=0.001..0.002 rows=1 loops=100000)
                       Index Cond: (pid = b.id)
 Planning Time: 0.207 ms
 Execution Time: 302.244 ms
(11 rows)

2. Unsorted data: 440 ms

test=# explain analyze with recursive tmp1 as (
test(# select id,pid,name from test_recursive_random where id=1
test(# union all
test(# select a.id,a.pid,a.name from test_recursive_random a inner join tmp1 b on a.pid=b.id )
test-# select * from tmp1;
                                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on tmp1  (cost=4206.87..4226.89 rows=1001 width=40) (actual time=0.020..434.721 rows=100000 loops=1)
   CTE tmp1
     ->  Recursive Union  (cost=0.29..4206.87 rows=1001 width=14) (actual time=0.018..397.456 rows=100000 loops=1)
           ->  Index Scan using ind_test_recursive_random_id on test_recursive_random  (cost=0.29..8.31 rows=1 width=14) (actual time=0.017..0.018 rows=1 loops=1)
                 Index Cond: (id = 1)
           ->  Nested Loop  (cost=4.50..417.85 rows=100 width=14) (actual time=33.080..62.311 rows=16666 loops=6)
                 ->  WorkTable Scan on tmp1 b  (cost=0.00..0.20 rows=10 width=4) (actual time=0.007..2.412 rows=16667 loops=6)
                 ->  Bitmap Heap Scan on test_recursive_random a  (cost=4.50..41.67 rows=10 width=14) (actual time=0.002..0.003 rows=1 loops=100000)
                       Recheck Cond: (pid = b.id)
                       Heap Blocks: exact=99557
                       ->  Bitmap Index Scan on ind_test_recursive_random_pid  (cost=0.00..4.49 rows=10 width=0) (actual time=0.001..0.001 rows=1 loops=100000)
                             Index Cond: (pid = b.id)
 Planning Time: 0.304 ms
 Execution Time: 439.563 ms
(14 rows)

3. Using hash join: 260 ms

test=# set enable_nestloop=off;
SET
test=# explain analyze with recursive tmp1 as (
test(# select id,pid,name from test_recursive where id=1
test(# union all
test(# select a.id,a.pid,a.name from test_recursive a inner join tmp1 b on a.pid=b.id )
test-# select * from tmp1;
                                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on tmp1  (cost=24101.58..24121.60 rows=1001 width=40) (actual time=0.018..255.766 rows=100000 loops=1)
   CTE tmp1
     ->  Recursive Union  (cost=0.29..24101.58 rows=1001 width=14) (actual time=0.016..218.427 rows=100000 loops=1)
           ->  Index Scan using ind_test_recursive_id on test_recursive  (cost=0.29..8.31 rows=1 width=14) (actual time=0.015..0.017 rows=1 loops=1)
                 Index Cond: (id = 1)
           ->  Hash Join  (cost=0.33..2407.32 rows=100 width=14) (actual time=13.828..32.571 rows=16666 loops=6)
                 Hash Cond: (a.pid = b.id)
                 ->  Seq Scan on test_recursive a  (cost=0.00..2031.00 rows=100000 width=14) (actual time=0.005..8.240 rows=100000 loops=6)
                 ->  Hash  (cost=0.20..0.20 rows=10 width=4) (actual time=5.114..5.114 rows=16667 loops=6)
                       Buckets: 131072 (originally 1024)  Batches: 2 (originally 1)  Memory Usage: 3073kB
                       ->  WorkTable Scan on tmp1 b  (cost=0.00..0.20 rows=10 width=4) (actual time=0.004..2.068 rows=16667 loops=6)
 Planning Time: 0.196 ms
 Execution Time: 260.360 ms
(13 rows)

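IV. Analysis of the execution plan differences

  • connect by execution logic: the query is driven by pid = prior id, that is, the id of the previous record is passed as the value of pid for an index scan. Logically it can be viewed as querying branch by branch: once one branch has been fully explored, the scan moves on to the next. loops=100000 means the index is visited once for every record.
  • with recursive execution logic: the query proceeds level by level, and the next level is evaluated only after the whole level above has been returned. The lookups can use a Bitmap Index Scan, which collects the matching ctids, sorts them, and only then visits the table, which improves efficiency. In addition, because each level is fully returned before it is joined to find the level below, a hash join can be used to speed up access. rows=16666 loops=6 means the recursive step ran in 6 batches, averaging 16666 records per batch (explain analyze reports per-loop averages).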
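The two evaluation orders described above can be sketched in Python. This is only a model of the description, not Kingbase's actual executor: `build_children`, `level_by_level`, and `branch_by_branch` are hypothetical names, and the in-memory dictionary stands in for the index on pid.

```python
from collections import defaultdict

def build_children(n=100000):
    """Group ids by pid, playing the role of the index on pid."""
    children = defaultdict(list)
    for node_id in range(2, n + 1):
        if node_id <= 10:
            pid = 1
        elif node_id <= 100:
            pid = node_id % 9 + 2
        elif node_id <= 1000:
            pid = node_id % 90 + 11
        elif node_id <= 10000:
            pid = node_id % 900 + 101
        else:
            pid = node_id % 9000 + 1001
        children[pid].append(node_id)
    return children

def level_by_level(children, root=1):
    """Run the recursive step once per working table, until it is empty."""
    work, batches = [root], []
    while work:
        next_level = [c for parent in work for c in children.get(parent, [])]
        batches.append((len(work), len(next_level)))  # (rows in, rows out)
        work = next_level
    return batches

def branch_by_branch(children, root=1):
    """Probe the pid lookup once per visited row, depth first."""
    stack, probes = [root], 0
    while stack:
        node = stack.pop()
        probes += 1
        stack.extend(children.get(node, []))
    return probes

children = build_children()
batches = level_by_level(children)
probes = branch_by_branch(children)
print(len(batches), batches)
print(probes)
```

In this model the level-by-level loop runs six times: five passes that each return one level of the tree, plus a last pass over the deepest level that finds no children and ends the recursion. That matches loops=6 with an average of about 16666 rows per batch in the with recursive plans, while the branch-by-branch loop performs one lookup per row, matching loops=100000 in the connect by plan. Batching is what lets the planner treat each level as a set and pick a hash join for it.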

V. Oracle connect by query performance

The following shows the performance of an Oracle connect by query on the same volume of data:

SQL> select id,pid,name from test_recursive start with id=1 connect by prior id = pid ;

100000 rows selected.

Elapsed: 00:00:00.98

Execution Plan
----------------------------------------------------------
Plan hash value: 2099392185

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                        |    12 |   384 |    18  (12)| 00:00:01 |
|*  1 |  CONNECT BY WITH FILTERING            |                        |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED | TEST_RECURSIVE         |     1 |    32 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                   | IND_TEST_RECURSIVE_ID  |     1 |       |     1   (0)| 00:00:01 |
|   4 |   NESTED LOOPS                        |                        |    11 |   495 |    14   (0)| 00:00:01 |
|   5 |    CONNECT BY PUMP                    |                        |       |       |            |          |
|   6 |    TABLE ACCESS BY INDEX ROWID BATCHED| TEST_RECURSIVE         |    11 |   352 |    12   (0)| 00:00:01 |
|*  7 |     INDEX RANGE SCAN                  | IND_TEST_RECURSIVE_PID |    11 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("PID"=PRIOR "ID")
   3 - access("ID"=1)
   7 - access("connect$_by$_pump$_002"."prior id "="PID")

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - this is an adaptive plan

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     101983  consistent gets
          0  physical reads
          0  redo size
    2337649  bytes sent via SQL*Net to client
      73769  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          8  sorts (memory)
          0  sorts (disk)
     100000  rows processed

Reposted from: http://www.shaoqun.com/a/815631.html
