添加链接
link管理
链接快照平台
  • 输入网页链接,自动生成快照
  • 标签化管理网页链接

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement . We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account Code: 8. DB::Exception: Cannot find column in source stream, there are only columns: (THERE_IS_NO_COLUMN) (version 22.6.2.12 (official build)) #38605 Code: 8. DB::Exception: Cannot find column in source stream, there are only columns: (THERE_IS_NO_COLUMN) (version 22.6.2.12 (official build)) #38605 lucasenabit opened this issue Jun 29, 2022 · 7 comments

I have a problem with the distributed table. When I run the following query;

Select
sum(case when statusTransaction in('DESFEITA_NOK','DESFEITA_OK') then 1 else 0 end)/count()*100 as perc_desfeita
FROM dataflow.transactions
where dateTransaction >'2022-06-20'

I get the error below:

Code: 8. DB::Exception: Cannot find column countIf(in(statusTransaction, ('DESFEITA_NOK', 'DESFEITA_OK'))) in source stream, there are only columns: [sum(if(in(statusTransaction, ('DESFEITA_NOK', 'DESFEITA_OK')), 1, 0)), count()]. (THERE_IS_NO_COLUMN) (version 22.6.2.12 (official build))

Note: when I use the source table, just replicated the query works normally.

Select
sum(case when statusTransaction in('DESFEITA_NOK','DESFEITA_OK') then 1 else 0 end)/count()*100 as perc_desfeita
FROM dataflow.transactions_ori2 --> change to Replicated table, its works
where dateTransaction >'2022-06-20'

I have two other environments that work with distributed tables and I never had this problem. The actual versions of the other environments are 22.6.2.12.

Specifically in the environment I'm having this error, I added a password for the default user. But even removing the error still remains.

All my environments are running on Kubernetes, below the configuration YAML:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "clickhouse"
labels:
enviroment: production
app: clickhouse
spec:
#volumes:
#- name: clickhouse-database

persistentVolumeClaim:

claimNAme: ch-storage

defaults:
templates:
volumeClaimTemplate: clickhouse-volume-data
logVolumeClaimTemplate: clickhouse-template-log
podTemplate: clickhouse-aiops
#clusterServiceTemplate: clickhouse-service-template
serviceTemplate: clickhouse-service-template
templates:
volumeClaimTemplates:
- name: clickhouse-volume-data
reclaimPolicy: Retain

persistentVolumeClaim:

metadata:

name: clickhouse-database

  spec:

storageClassName: default

storageClass: standard

    accessModes: 
      - ReadWriteOnce
    # Dev
    resources:
      requests:
        storage: 60Gi
    # Conectcar
    # resources:
    #  requests:
    #    storage: 200Gi
 #inicial 
podTemplates:
  - name: database
    # zone:
    #   key: "clickhouse-aiops"
    #   values: 
    #     - "allow"
    # podDistribution:
    #   - type: ShardAntiAffinity
    #   - type: MaxNumberPerNode
    #     number: 1
    #     # Apply podDistribution on per-host basis
    #     topologyKey: "kubernetes.io/hostname"
    # distribution: "OnePerHost"
    metadata:
      labels:
        custom.label: "alma-clickhouse"
      annotations:
        custom.annotation: "Alma Plataform"
            # type PodSpec struct {} from k8s.io/core/v1
    spec:
      containers:
        - name: alma-clickhouse
          image: yandex/clickhouse-server:22.1.3.7
          volumeMounts:
            - name: clickhouse-volume-data
              mountPath: /var/lib/clickhouse
          # Dev
          resources:
            requests:
              memory: "1Gi"
              cpu: "250m"
          # BIG (Conectcar)
          #resources:
          #  requests:
          #    memory: "2Gi"
          #    cpu: "1000m"
          #  limits:
          #    memory: "10Gi"
          #    cpu: "2000m"

service template

serviceTemplates: 
  - name: clickhouse-service-template
    generateName: clickhouse-{chi}
    metadata:
    ## DESCOMENTAR ABAIXO PARA ACESSO SOMENTE INTERNO 
      labels:
        custom.label: "custom.value"
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0 # --> acesso interno do loadbalancer aws, gerenciado pelo operator
    spec:
      ports:
        - name: http
          port: 8123
        - name: tcp
          port: 9000
      #   - name: interserver
      #     port: 9009
      ## ALTERAR  de LoadBalancer para clusterIP quando acesso externo for desnecessário
      type: ClusterIP
      # type: LoadBalancer #--> Para acesso externo usando LoadBalancer descomentar esse trecho e ativar o de cima.
      #externalTrafficPolicy: external

configuration:
settings:
max_concurrent_queries: 2000
compression/case/method: zstd
disable_internal_dns_cache: 1
max_server_memory_usage: 20000000000
#merge_tree/max_suspicious_broken_parts: 20000 #remover depois , responsável para liberação do serviço quanto há divergência de parts
# ou files entre nós.
merge_tree/old_parts_lifetime: 30 # tempo total das partes em segundos que foram mergeadas e não deletadas.
merge_tree/parts_to_delay_insert: 150
merge_tree/parts_to_throw_insert: 900 # total de segmentos durante o insert, padrao 300.
#merge_tree/max_delay_to_insert: 2 unicos em uma particao - padrao 300. deve ser aumentando os batchs
merge_tree/max_delay_to_insert: 5 #tempo em milisegundos, calculado entre part_throw e delay
#merge_tree/replicated_max_ratio_of_wrong_parts: 0.5 #permite um valor limite para instancia subir novamente quando há divergência ou erros
# entre a qtde de registros no zookeeper e no arquivo do banco
#parts_to_delay_insert: 400
#skip_check_for_incorrect_settings: 1
# configs adicionais para acessos em diferentes portas nas VM`s
# http_port: 8124
# tcp_port: 9001
# interserver_http_port: 9010

  remote_servers/alma/secret: alma # senha do secret para comunicação intra cluster. É necessário quando requer usuário e senha para autenti
  #cação para ambientes distribuídos e replicados. Não é possível criptografar essa senha de comunicação interna.
users:
  default/networks/ip: "::/0"
  default/password: alma
  default/quota: default
  #default/networks/host_regexp: '(chi-clickhouse-alma-\d+-\d+|clickhouse-alma).default.svc.cluster.local$'
  readonly/networks/ip: "::/0"
  almauser1/networks/ip: "::/0"
  almauser2/networks/ip: "::/0"
  #config adicional para usuários de leitura
  readonly/profile: readonly
  readonly/password: dataread
  readonly/quota: default
  #config para usuários com secret de acesso
  almauser1/k8s_secret_password: alma-v2/clickhouse-credentials/almauser1
  almauser2/k8s_secret_password_sha256_hex: alma-v2/clickhouse-credentials/almauser2
profiles:
  default/max_memory_usage: 12000000000 # test 12GB para consultas
  almauser1/max_memory_usage: 12000000000
  almauser2/max_memory_usage: 12000000000
  default/background_pool_size: 32
zookeeper:
    nodes:
    - host: alma-zookeeper-clickhouse
      port: 2181
clusters:
  - name: "alma"
    layout:
      shardsCount: 2
      replicasCount: 2
Select sum(case when statusTransaction in('DESFEITA_NOK','DESFEITA_OK') then 1 else 0 end)/count()*100 as perc_desfeita
FROM dataflow.transactions
where dateTransaction >'2022-06-20'
settings optimize_move_to_prewhere=0;

I have two other environments that work with distributed tables and I never had this problem.

try at that env.

Select sum(case when statusTransaction in('DESFEITA_NOK','DESFEITA_OK') then 1 else 0 end)/count()*100 as perc_desfeita
FROM dataflow.transactions
prewhere dateTransaction >'2022-06-20'

Also try

Select sum(case when statusTransaction in('DESFEITA_NOK','DESFEITA_OK') then 1 else 0 end)/count()*100 as perc_desfeita
FROM dataflow.transactions
where dateTransaction >'2022-06-20'
settings optimize_fuse_sum_count_avg=0, optimize_rewrite_sum_if_to_count_if=0

den-crane,

its works fine ! When using set optimize_rewrite_sum_if_to_count_if=0.

Thanks a lot for your response

Its possible include that on YAML file, to default ? I Percive that problem just occour in this version. I Have earlier version and never seen it

Select sum(case when statusTransaction in('DESFEITA_NOK','DESFEITA_OK') then 1 else 0 end)/count()*100 as perc_desfeita
FROM dataflow.transactions
where dateTransaction >'2022-06-20'
settings optimize_move_to_prewhere=0;

I have two other environments that work with distributed tables and I never had this problem.

try at that env.

Select sum(case when statusTransaction in('DESFEITA_NOK','DESFEITA_OK') then 1 else 0 end)/count()*100 as perc_desfeita
FROM dataflow.transactions
prewhere dateTransaction >'2022-06-20'

I ran prewhere in the other environment and it worked, I didn't have any errors.

Its possible include that on YAML file, to default ? I Percive that problem just occour in this version. I Have earlier version and never seen it

Yes, in

profile:
  default/optimize_rewrite_sum_if_to_count_if=0
          

repro:

create table t( s String ) Engine=Memory;
insert into t values('a')('b')('c');
SELECT round((sum(multiIf(s IN ('a', 'b'), 1, 0)) / count()) * 100) AS r
FROM cluster('test_cluster_two_shards', currentDatabase(), t)
SETTINGS optimize_rewrite_sum_if_to_count_if = 0;
┌──r─┐
│ 67 │
└────┘
SELECT round((sum(multiIf(s IN ('a', 'b'), 1, 0)) / count()) * 100) AS r
FROM cluster('test_cluster_two_shards', currentDatabase(), t)
SETTINGS optimize_rewrite_sum_if_to_count_if = 1;
Received exception from server (version 22.6.1):
Code: 8. DB::Exception: Received from localhost:9000. 
DB::Exception: Cannot find column `countIf(in(s, ('a', 'b')))` in source stream, there are only 
columns: [sum(if(in(s, ('a', 'b')), 1, 0)), count()]. (THERE_IS_NO_COLUMN)