CASE WHEN vs UNION SELECT - SQL++ - Couchbase Forums

link管理

链接快照平台

输入网页链接，自动生成快照
标签化管理网页链接

相关文章推荐

有胆有识的豆腐 · 现代汽车氢燃料电池系统工厂在广州黄埔竣工 ...· 6 月前 ·

奋斗的针织衫 · 不下赌桌就会有机会，贾伟的疯狂设计人生_中华网· 7 月前 ·

跑龙套的台灯 · qt启动linux命令行 • Worktile社区· 9 月前 ·

逃跑的灭火器 · 问题记录——unity发布android报错 ...· 1 年前 ·

安静的热水瓶 · 优化性能：数据绑定 - WPF .NET ...· 1 年前 ·

let’s say the following N1QL with CASE WHEN

SELECT CASE WHEN condition1(fieldA,fieldB,fieldC) THEN field1
            WHEN condition2(fieldA,fieldD,fieldE) THEN field2
            WHEN condition3(fieldA,fieldF,fieldG) THEN field3
            WHEN condition4(fieldA,fieldF,fieldH) THEN field4
            ELSE MISSING
       END AS field
  FROM default
 WHERE condition(fieldX,fieldY)
AS evaluation process of CASE WHEN  shows this N1QL can rewrite as following
 SELECT field1 AS field
   FROM default
  WHERE condition(fieldX,fieldY)
    AND condition1(fieldA,fieldB,fieldC)
UNION
SELECT field2 AS field
  FROM default
 WHERE condition(fieldX,fieldY)
   AND NOT (condition1(fieldA,fieldB,fieldC))
   AND condition2(fieldA,fieldD,fieldE)
UNION
SELECT field3 AS field
  FROM default
 WHERE condition(fieldX,fieldY)
   AND NOT (condition1(fieldA,fieldB,fieldC))
   AND NOT (condition2(fieldA,fieldD,fieldE))
   AND condition3(fieldA,fieldF,fieldG)
UNION
   SELECT field4 AS field
     FROM default
    WHERE condition(fieldX,fieldY)
      AND NOT (condition1(fieldA,fieldB,fieldC))
      AND NOT (condition2(fieldA,fieldD,fieldE))
      AND NOT (condition3(fieldA,fieldF,fieldG))
      AND condition4(fieldA,fieldF,fieldH)
am I right?
if so,let’s talk about which is better for N1QL query efficiency.
If I can use correct index I think the second N1QL should be more efficiency, because it will use index to get more documents.
the test step (on CB 5.0.0-2194 Enterprise Edition (build-2194)) as following:

I have create 2 index as following
create index idx_case_when_union_test_1 on default(fieldX,fieldY,fieldA,fieldB,fieldC,fieldD,fieldE,fieldF,fieldG,fieldH,field1,field2,field3,field4);
create index idx_case_when_union_test_2 on default(fieldA,fieldB,fieldC,fieldD,fieldE,fieldF,fieldG,fieldH,field1,field2,field3,field4) WHERE (fieldX == "X" AND fieldY == "Y");
and for the first N1QL
SELECT CASE WHEN fieldA == "A" AND fieldB == "B" AND fieldC == "C" THEN field1
            WHEN fieldA == "A" AND fieldD == "D" AND fieldE == "E" THEN field2
            WHEN fieldA == "A" AND fieldF == "F" AND fieldG == "G" THEN field3
            WHEN fieldA == "A" AND fieldF == "F" AND fieldH == "H" THEN field4
            ELSE MISSING
       END AS field
  FROM default
 WHERE fieldX == "X" AND fieldY == "Y"
the explain shows

      gist.github.com
    
https://gist.github.com/atom992/9bd6d34b8f33393fa3176756457628d2
CASE_WHEN_EXPLAIN
  "#operator": "Sequence",
  "#stats": {
    "#phaseSwitches": 2,
    "execTime": "964ns",
    "kernTime": "2.489323ms"
  "~children": [
      "#operator": "Authorize",
This file has been truncated. show original
  FROM default
 WHERE fieldX == "X" AND fieldY == "Y"
   AND fieldA == "A" AND fieldB == "B" AND fieldC == "C"
UNION
SELECT field2 AS field
 FROM default
WHERE fieldX == "X" AND fieldY == "Y"
  AND NOT (fieldA == "A" AND fieldB == "B" AND fieldC == "C")
  AND fieldA == "A" AND fieldD == "D" AND fieldE == "E"
UNION
SELECT field3 AS field
 FROM default
WHERE fieldX == "X" AND fieldY == "Y"
  AND NOT (fieldA == "A" AND fieldB == "B" AND fieldC == "C")
  AND NOT (fieldA == "A" AND fieldD == "D" AND fieldE == "E")
  AND fieldA == "A" AND fieldF == "F" AND fieldG == "G"
UNION
  SELECT field4 AS field
    FROM default
   WHERE fieldX == "X" AND fieldY == "Y"
     AND NOT (fieldA == "A" AND fieldB == "B" AND fieldC == "C")
     AND NOT (fieldA == "A" AND fieldD == "D" AND fieldE == "E")
     AND NOT (fieldA == "A" AND fieldF == "F" AND fieldG == "G")
     AND fieldA == "A" AND fieldF == "F" AND fieldH == "H"
the explain shows

      gist.github.com
    
https://gist.github.com/atom992/68c83a9f455d47e11cfc6ddc642e5c72
UNION_SELECT_EXPLAIN
  "#operator": "Sequence",
  "#stats": {
    "#phaseSwitches": 2,
    "execTime": "805ns",
    "kernTime": "2.835992ms"
  "~children": [
      "#operator": "Authorize",
This file has been truncated. show original
it looks like  both N1QL use cover index with index idx_case_when_union_test_1.

which one is better?
PS: I use USE INDEX (`idx_case_when_union_test_2`) for both N1QL,only the second N1QL works.
 atom_yang:
PS: I use USE INDEX (idx_case_when_union_test_2) for both N1QL,only the second N1QL works.
Query1 cannot use index idx_case_when_union_test_2 because leading index key is not part of the predicate.
Which one is better depends on the data and how many CASE statements etc.
In my opinion CASE WHEN will be better. Pros and Cons
Predicates are pushed exactly and single IndexScan will be used (less load on indexer)
CASE evaluation might be expensive depends on which WHEN condition it matches.
For single query latency might be higher than UNION but loaded system can scale better.
UNION Pros and Cons
As each UNION arm projects different attributes, there will not be any duplicates across UNION arms you can use UNION ALL.
Each UNION query will have different precise IndexScans and each IndexScan will have precise Spans. This result in more IndexScans and there might be load on indexer Also for each Spans all predicates may not pushed to indexer. This may lead to false positives form indexer.

Take example : fieldX == “X” AND fieldY == “Y”

AND fieldA == “A” AND fieldD == “D” AND fieldE == “E”

Given index only filedX,filedY,fieldA can be pushed to indexer not fieldD, fieldE
Query has NOT predicate which needs to be DNF transformed.  NOT (fieldA == “A” AND fieldB == “B” AND fieldC == “C”)  ===>  (fieldA != “A” OR fieldB != “B” OR fieldC != “C”).   This may require more indexes, IndexScans and  face same problem as described above.