添加链接
link管理
链接快照平台
  • 输入网页链接,自动生成快照
  • 标签化管理网页链接

let’s say the following N1QL with CASE WHEN

SELECT CASE WHEN condition1(fieldA,fieldB,fieldC) THEN field1
            WHEN condition2(fieldA,fieldD,fieldE) THEN field2
            WHEN condition3(fieldA,fieldF,fieldG) THEN field3
            WHEN condition4(fieldA,fieldF,fieldH) THEN field4
            ELSE MISSING
       END AS field
  FROM default
 WHERE condition(fieldX,fieldY)

AS evaluation process of CASE WHEN shows this N1QL can rewrite as following

 SELECT field1 AS field
   FROM default
  WHERE condition(fieldX,fieldY)
    AND condition1(fieldA,fieldB,fieldC)
UNION
SELECT field2 AS field
  FROM default
 WHERE condition(fieldX,fieldY)
   AND NOT (condition1(fieldA,fieldB,fieldC))
   AND condition2(fieldA,fieldD,fieldE)
UNION
SELECT field3 AS field
  FROM default
 WHERE condition(fieldX,fieldY)
   AND NOT (condition1(fieldA,fieldB,fieldC))
   AND NOT (condition2(fieldA,fieldD,fieldE))
   AND condition3(fieldA,fieldF,fieldG)
UNION
   SELECT field4 AS field
     FROM default
    WHERE condition(fieldX,fieldY)
      AND NOT (condition1(fieldA,fieldB,fieldC))
      AND NOT (condition2(fieldA,fieldD,fieldE))
      AND NOT (condition3(fieldA,fieldF,fieldG))
      AND condition4(fieldA,fieldF,fieldH)

am I right?

if so,let’s talk about which is better for N1QL query efficiency.

If I can use correct index I think the second N1QL should be more efficiency, because it will use index to get more documents.

the test step (on CB 5.0.0-2194 Enterprise Edition (build-2194)) as following:
I have create 2 index as following

create index idx_case_when_union_test_1 on default(fieldX,fieldY,fieldA,fieldB,fieldC,fieldD,fieldE,fieldF,fieldG,fieldH,field1,field2,field3,field4);
create index idx_case_when_union_test_2 on default(fieldA,fieldB,fieldC,fieldD,fieldE,fieldF,fieldG,fieldH,field1,field2,field3,field4) WHERE (fieldX == "X" AND fieldY == "Y");

and for the first N1QL

SELECT CASE WHEN fieldA == "A" AND fieldB == "B" AND fieldC == "C" THEN field1
            WHEN fieldA == "A" AND fieldD == "D" AND fieldE == "E" THEN field2
            WHEN fieldA == "A" AND fieldF == "F" AND fieldG == "G" THEN field3
            WHEN fieldA == "A" AND fieldF == "F" AND fieldH == "H" THEN field4
            ELSE MISSING
       END AS field
  FROM default
 WHERE fieldX == "X" AND fieldY == "Y"

the explain shows
gist.github.com

https://gist.github.com/atom992/9bd6d34b8f33393fa3176756457628d2

CASE_WHEN_EXPLAIN
"#operator": "Sequence", "#stats": { "#phaseSwitches": 2, "execTime": "964ns", "kernTime": "2.489323ms" "~children": [ "#operator": "Authorize",
This file has been truncated. show original FROM default WHERE fieldX == "X" AND fieldY == "Y" AND fieldA == "A" AND fieldB == "B" AND fieldC == "C" UNION SELECT field2 AS field FROM default WHERE fieldX == "X" AND fieldY == "Y" AND NOT (fieldA == "A" AND fieldB == "B" AND fieldC == "C") AND fieldA == "A" AND fieldD == "D" AND fieldE == "E" UNION SELECT field3 AS field FROM default WHERE fieldX == "X" AND fieldY == "Y" AND NOT (fieldA == "A" AND fieldB == "B" AND fieldC == "C") AND NOT (fieldA == "A" AND fieldD == "D" AND fieldE == "E") AND fieldA == "A" AND fieldF == "F" AND fieldG == "G" UNION SELECT field4 AS field FROM default WHERE fieldX == "X" AND fieldY == "Y" AND NOT (fieldA == "A" AND fieldB == "B" AND fieldC == "C") AND NOT (fieldA == "A" AND fieldD == "D" AND fieldE == "E") AND NOT (fieldA == "A" AND fieldF == "F" AND fieldG == "G") AND fieldA == "A" AND fieldF == "F" AND fieldH == "H"

the explain shows
gist.github.com

https://gist.github.com/atom992/68c83a9f455d47e11cfc6ddc642e5c72

UNION_SELECT_EXPLAIN
"#operator": "Sequence", "#stats": { "#phaseSwitches": 2, "execTime": "805ns", "kernTime": "2.835992ms" "~children": [ "#operator": "Authorize",
This file has been truncated. show original

it looks like both N1QL use cover index with index idx_case_when_union_test_1.
which one is better?

PS: I use USE INDEX (`idx_case_when_union_test_2`) for both N1QL,only the second N1QL works.

atom_yang:

PS: I use USE INDEX (idx_case_when_union_test_2) for both N1QL,only the second N1QL works.

Query1 cannot use index idx_case_when_union_test_2 because leading index key is not part of the predicate.

Which one is better depends on the data and how many CASE statements etc.

In my opinion CASE WHEN will be better. Pros and Cons

  • Predicates are pushed exactly and single IndexScan will be used (less load on indexer)
  • CASE evaluation might be expensive depends on which WHEN condition it matches.
  • For single query latency might be higher than UNION but loaded system can scale better.
  • UNION Pros and Cons

  • As each UNION arm projects different attributes, there will not be any duplicates across UNION arms you can use UNION ALL.
  • Each UNION query will have different precise IndexScans and each IndexScan will have precise Spans. This result in more IndexScans and there might be load on indexer Also for each Spans all predicates may not pushed to indexer. This may lead to false positives form indexer.
    Take example : fieldX == “X” AND fieldY == “Y”
    AND fieldA == “A” AND fieldD == “D” AND fieldE == “E”
    Given index only filedX,filedY,fieldA can be pushed to indexer not fieldD, fieldE
  • Query has NOT predicate which needs to be DNF transformed. NOT (fieldA == “A” AND fieldB == “B” AND fieldC == “C”) ===> (fieldA != “A” OR fieldB != “B” OR fieldC != “C”). This may require more indexes, IndexScans and face same problem as described above.
  •