I am trying to set up K8ssandra on my laptop to evaluate it and start developing.
So far I have followed the steps listed in the documentation and also saw the hints on resource requirements. My laptop is a standard Lenovo ThinkPad with 16GB RAM and an i7 CPU.
For the K8ssandra installation, I am starting a Vagrant VirtualBox VM with 16GB RAM and 8 CPUs.
Everything except the Stargate and Reaper pods is running as expected; those two keep failing. What puzzles me is that I cannot find an error in Stargate’s log.
Reaper says “no datasource available”, which I think is related to the crashed Stargate pod. The only thing that comes to my mind is the warning about RLIMIT_MEMLOCK. However, from what I saw, Kubernetes currently provides no way to pass a ulimit option.
Do you have a tip for me?
Please find below some dumps. I apologize for the plain text pasting here. If there is a preferred way to pass large logs, please let me know.
Using environment for config
Running java -server -XX:+CrashOnOutOfMemoryError -Xms256M -Xmx256M -Dstargate.libdir=./stargate-lib -Djava.awt.headless=true -jar ./stargate-lib/stargate-starter-1.0.18.jar --cluster-name k8ssandra --cluster-version 3.11 --cluster-seed k8ssandra-seed-service.default.svc.cluster.local --listen 10.96.223.164 --dc dc1 --rack default --enable-auth
JAR DIR: ./stargate-lib
Loading persistence backend persistence-cassandra-3.11-1.0.18.jar
Installing bundle persistence-cassandra-3.11-1.0.18.jar
Installing bundle animal-sniffer-annotations-1.9.jar
Installing bundle asm-7.1.jar
Installing bundle asm-analysis-7.1.jar
Installing bundle asm-tree-7.1.jar
Installing bundle auth-api-1.0.18.jar
Installing bundle auth-jwt-service-1.0.18.jar
Installing bundle auth-table-based-service-1.0.18.jar
Installing bundle authnz-1.0.18.jar
Installing bundle commons-beanutils-1.9.4.jar
Installing bundle commons-collections-3.2.2.jar
Installing bundle commons-digester-2.1.jar
Installing bundle commons-logging-1.2.jar
Installing bundle commons-validator-1.7.jar
Installing bundle config-store-api-1.0.18.jar
Installing bundle config-store-yaml-1.0.18.jar
Installing bundle core-1.0.18.jar
Installing bundle cql-1.0.18.jar
Installing bundle graphqlapi-1.0.18.jar
Installing bundle health-checker-1.0.18.jar
Installing bundle org.apache.felix.scr-2.1.20.jar
Installing bundle org.apache.felix.scr.ds-annotations-1.2.10.jar
Installing bundle org.apache.felix.scr.generator-1.18.4.jar
Installing bundle org.osgi.compendium-4.2.0.jar
Installing bundle org.osgi.util.function-1.1.0.jar
Installing bundle org.osgi.util.promise-1.1.1.jar
Installing bundle persistence-api-1.0.18.jar
Installing bundle rate-limiting-global-1.0.18.jar
Installing bundle restapi-1.0.18.jar
Starting bundle io.stargate.db.cassandra_3_11
INFO [main] 2021-06-09 08:56:07,639 BaseActivator.java:92 - Starting persistence-cassandra-3.11 ...
Starting bundle null
Starting bundle org.objectweb.asm
Starting bundle org.objectweb.asm.tree.analysis
Starting bundle org.objectweb.asm.tree
Starting bundle io.stargate.auth.api
INFO [main] 2021-06-09 08:56:09,938 BaseActivator.java:92 - Starting authApiServer ...
Starting bundle io.stargate.auth.jwt
Starting bundle io.stargate.auth.table
INFO [main] 2021-06-09 08:56:12,413 BaseActivator.java:92 - Starting authnTableBasedService and authzTableBasedServie ...
Starting bundle io.stargate.auth
Starting bundle org.apache.commons.commons-beanutils
Starting bundle org.apache.commons.collections
Starting bundle org.apache.commons.digester
Starting bundle org.apache.commons.logging
Starting bundle org.apache.commons.commons-validator
Starting bundle io.stargate.config.store.api
Starting bundle io.stargate.config.store.yaml
INFO [main] 2021-06-09 08:56:14,666 BaseActivator.java:92 - Starting Config Store YAML ...
Starting bundle io.stargate.core
INFO [main] 2021-06-09 08:56:14,674 BaseActivator.java:92 - Starting core services ...
INFO [main] 2021-06-09 08:56:14,725 BaseActivator.java:173 - Registering core services as io.stargate.core.metrics.api.Metrics
INFO [main] 2021-06-09 08:56:18,118 AbstractCassandraPersistence.java:100 - Initializing Apache Cassandra
INFO [main] 2021-06-09 08:56:18,346 DatabaseDescriptor.java:381 - DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO [main] 2021-06-09 08:56:18,349 DatabaseDescriptor.java:439 - Global memtable on-heap threshold is enabled at 61MB
INFO [main] 2021-06-09 08:56:18,351 DatabaseDescriptor.java:443 - Global memtable off-heap threshold is enabled at 61MB
WARN [main] 2021-06-09 08:56:19,797 DatabaseDescriptor.java:579 - Only 31.111GiB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots
INFO [main] 2021-06-09 08:56:19,953 RateBasedBackPressure.java:123 - Initialized back-pressure with high ratio: 0.9, factor: 5, flow: FAST, window size: 2000.
INFO [main] 2021-06-09 08:56:19,955 DatabaseDescriptor.java:773 - Back-pressure is disabled with strategy null.
INFO [main] 2021-06-09 08:56:20,138 GossipingPropertyFileSnitch.java:68 - Unable to load cassandra-topology.properties; compatibility mode disabled
INFO [main] 2021-06-09 08:56:21,510 JMXServerUtils.java:246 - Configured JMX server at: service:jmx:rmi://0.0.0.0/jndi/rmi://0.0.0.0:7199/jmxrmi
INFO [main] 2021-06-09 08:56:21,545 CassandraDaemon.java:489 - Hostname: k8ssandra-dc1-stargate-76c576f7fc-spbp4
INFO [main] 2021-06-09 08:56:21,547 CassandraDaemon.java:496 - JVM vendor/version: OpenJDK 64-Bit Server VM/1.8.0_252
INFO [main] 2021-06-09 08:56:21,612 CassandraDaemon.java:497 - Heap size: 247.500MiB/247.500MiB
INFO [main] 2021-06-09 08:56:21,615 CassandraDaemon.java:502 - Code Cache Non-heap memory: init = 2555904(2496K) used = 5547584(5417K) committed = 5570560(5440K) max = 251658240(245760K)
INFO [main] 2021-06-09 08:56:21,617 CassandraDaemon.java:502 - Metaspace Non-heap memory: init = 0(0K) used = 24982216(24396K) committed = 26214400(25600K) max = -1(-1K)
INFO [main] 2021-06-09 08:56:21,619 CassandraDaemon.java:502 - Compressed Class Space Non-heap memory: init = 0(0K) used = 3254264(3177K) committed = 3670016(3584K) max = 1073741824(1048576K)
INFO [main] 2021-06-09 08:56:21,622 CassandraDaemon.java:502 - Eden Space Heap memory: init = 71630848(69952K) used = 71630848(69952K) committed = 71630848(69952K) max = 71630848(69952K)
INFO [main] 2021-06-09 08:56:21,630 CassandraDaemon.java:502 - Survivor Space Heap memory: init = 8912896(8704K) used = 7103928(6937K) committed = 8912896(8704K) max = 8912896(8704K)
INFO [main] 2021-06-09 08:56:21,637 CassandraDaemon.java:502 - Tenured Gen Heap memory: init = 178978816(174784K) used = 15222048(14865K) committed = 178978816(174784K) max = 178978816(174784K)
INFO [main] 2021-06-09 08:56:21,640 CassandraDaemon.java:504 - Classpath: ./stargate-lib/stargate-starter-1.0.18.jar
INFO [main] 2021-06-09 08:56:21,642 CassandraDaemon.java:506 - JVM Arguments: [-XX:+CrashOnOutOfMemoryError, -Xms256M, -Xmx256M, -Dstargate.libdir=./stargate-lib, -Djava.awt.headless=true]
WARN [main] 2021-06-09 08:56:22,234 NativeLibrary.java:189 - Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK or run Cassandra as root.
WARN [main] 2021-06-09 08:56:22,236 StartupChecks.java:136 - jemalloc shared library could not be preloaded to speed up memory allocations
INFO [main] 2021-06-09 08:56:22,237 StartupChecks.java:176 - JMX is enabled to receive remote connections on port: 7199
INFO [main] 2021-06-09 08:56:22,244 SigarLibrary.java:44 - Initializing SIGAR library
INFO [main] 2021-06-09 08:56:22,420 SigarLibrary.java:57 - Could not initialize SIGAR library org.hyperic.sigar.Sigar.getFileSystemListNative()[Lorg/hyperic/sigar/FileSystem;
INFO [main] 2021-06-09 08:56:22,422 SigarLibrary.java:185 - Sigar could not be initialized, test for checking degraded mode omitted.
INFO [main] 2021-06-09 08:56:23,225 QueryProcessor.java:116 - Initialized prepared statement caches with 10 MB (native) and 10 MB (Thrift)
INFO [main] 2021-06-09 08:56:27,813 ColumnFamilyStore.java:427 - Initializing system.IndexInfo
INFO [main] 2021-06-09 08:56:36,754 ColumnFamilyStore.java:427 - Initializing system.batches
INFO [main] 2021-06-09 08:56:37,046 ColumnFamilyStore.java:427 - Initializing system.paxos
INFO [main] 2021-06-09 08:56:37,349 ColumnFamilyStore.java:427 - Initializing system.local
INFO [main] 2021-06-09 08:56:37,717 ColumnFamilyStore.java:427 - Initializing system.peers
INFO [main] 2021-06-09 08:56:38,050 ColumnFamilyStore.java:427 - Initializing system.peer_events
INFO [main] 2021-06-09 08:56:38,519 ColumnFamilyStore.java:427 - Initializing system.range_xfers
INFO [main] 2021-06-09 08:56:38,915 ColumnFamilyStore.java:427 - Initializing system.compaction_history
INFO [main] 2021-06-09 08:56:39,617 ColumnFamilyStore.java:427 - Initializing system.sstable_activity
INFO [main] 2021-06-09 08:56:40,140 ColumnFamilyStore.java:427 - Initializing system.size_estimates
INFO [main] 2021-06-09 08:56:40,712 ColumnFamilyStore.java:427 - Initializing system.available_ranges
INFO [main] 2021-06-09 08:56:41,158 ColumnFamilyStore.java:427 - Initializing system.transferred_ranges
INFO [main] 2021-06-09 08:56:41,744 ColumnFamilyStore.java:427 - Initializing system.views_builds_in_progress
INFO [main] 2021-06-09 08:56:42,413 ColumnFamilyStore.java:427 - Initializing system.built_views
INFO [main] 2021-06-09 08:56:42,750 ColumnFamilyStore.java:427 - Initializing system.hints
INFO [main] 2021-06-09 08:56:43,146 ColumnFamilyStore.java:427 - Initializing system.batchlog
INFO [main] 2021-06-09 08:56:43,728 ColumnFamilyStore.java:427 - Initializing system.prepared_statements
INFO [main] 2021-06-09 08:56:44,521 ColumnFamilyStore.java:427 - Initializing system.schema_keyspaces
INFO [main] 2021-06-09 08:56:44,946 ColumnFamilyStore.java:427 - Initializing system.schema_columnfamilies
INFO [main] 2021-06-09 08:56:45,176 ColumnFamilyStore.java:427 - Initializing system.schema_columns
INFO [main] 2021-06-09 08:56:45,551 ColumnFamilyStore.java:427 - Initializing system.schema_triggers
INFO [main] 2021-06-09 08:56:45,846 ColumnFamilyStore.java:427 - Initializing system.schema_usertypes
INFO [main] 2021-06-09 08:56:46,434 ColumnFamilyStore.java:427 - Initializing system.schema_functions
INFO [main] 2021-06-09 08:56:46,720 ColumnFamilyStore.java:427 - Initializing system.schema_aggregates
INFO [main] 2021-06-09 08:56:46,726 ViewManager.java:137 - Not submitting build tasks for views in keyspace system as storage service is not initialized
INFO [main] 2021-06-09 08:56:46,743 ClientState.java:102 - Using io.stargate.db.cassandra.impl.StargateQueryHandler as query handler for native protocol queries (as requested with -Dcassandra.custom_query_handler_class)
INFO [main] 2021-06-09 08:56:47,112 ApproximateTime.java:44 - Scheduling approximate time-check task with a precision of 10 milliseconds
INFO [main] 2021-06-09 08:56:47,641 ColumnFamilyStore.java:427 - Initializing system_schema.keyspaces
INFO [main] 2021-06-09 08:56:48,136 ColumnFamilyStore.java:427 - Initializing system_schema.tables
INFO [main] 2021-06-09 08:56:49,094 ColumnFamilyStore.java:427 - Initializing system_schema.columns
INFO [main] 2021-06-09 08:56:49,633 ColumnFamilyStore.java:427 - Initializing system_schema.triggers
INFO [main] 2021-06-09 08:56:49,978 ColumnFamilyStore.java:427 - Initializing system_schema.dropped_columns
INFO [main] 2021-06-09 08:56:50,415 ColumnFamilyStore.java:427 - Initializing system_schema.views
INFO [main] 2021-06-09 08:56:50,753 ColumnFamilyStore.java:427 - Initializing system_schema.types
INFO [main] 2021-06-09 08:56:51,150 ColumnFamilyStore.java:427 - Initializing system_schema.functions
INFO [main] 2021-06-09 08:56:51,554 ColumnFamilyStore.java:427 - Initializing system_schema.aggregates
INFO [main] 2021-06-09 08:56:51,927 ColumnFamilyStore.java:427 - Initializing system_schema.indexes
INFO [main] 2021-06-09 08:56:51,937 ViewManager.java:137 - Not submitting build tasks for views in keyspace system_schema as storage service is not initialized
INFO [MemtableFlushWriter:1] 2021-06-09 08:56:54,946 CacheService.java:100 - Initializing key cache with capacity of 12 MBs.
INFO [MemtableFlushWriter:1] 2021-06-09 08:56:55,008 CacheService.java:122 - Initializing row cache with capacity of 0 MBs
INFO [MemtableFlushWriter:1] 2021-06-09 08:56:55,014 CacheService.java:151 - Initializing counter cache with capacity of 6 MBs
INFO [MemtableFlushWriter:1] 2021-06-09 08:56:55,017 CacheService.java:162 - Scheduling counter cache save to every 7200 seconds (going to save all keys).
INFO [CompactionExecutor:2] 2021-06-09 08:56:56,113 BufferPool.java:234 - Global buffer pool is enabled, when pool is exhausted (max is 61.000MiB) it will allocate on heap
INFO [main] 2021-06-09 08:56:56,534 StorageService.java:639 - Populating token metadata from system tables
INFO [main] 2021-06-09 08:56:57,014 StorageService.java:646 - Token metadata:
INFO [pool-9-thread-1] 2021-06-09 08:56:57,814 AutoSavingCache.java:174 - Completed loading (74 ms; 8 keys) KeyCache cache
INFO [main] 2021-06-09 08:56:58,119 CommitLog.java:142 - No commitlog files found; skipping replay
INFO [main] 2021-06-09 08:56:58,132 StorageService.java:639 - Populating token metadata from system tables
INFO [main] 2021-06-09 08:56:58,333 StorageService.java:646 - Token metadata:
INFO [main] 2021-06-09 08:57:00,218 QueryProcessor.java:163 - Preloaded 0 prepared statements
INFO [main] 2021-06-09 08:57:00,222 StorageService.java:657 - Cassandra version: 3.11.6
INFO [main] 2021-06-09 08:57:00,224 StorageService.java:658 - Thrift API version: 20.1.0
INFO [main] 2021-06-09 08:57:00,226 StorageService.java:659 - CQL supported versions: 3.4.4 (default: 3.4.4)
INFO [main] 2021-06-09 08:57:00,233 StorageService.java:661 - Native protocol supported versions: 3/v3, 4/v4, 5/v5-beta (default: 4/v4)
INFO [main] 2021-06-09 08:57:00,740 IndexSummaryManager.java:87 - Initializing index summary manager with a memory pool size of 12 MB and a resize interval of 60 minutes
INFO [main] 2021-06-09 08:57:00,844 MessagingService.java:750 - Starting Messaging Service on /10.96.223.164:7000 (eth0)
WARN [main] 2021-06-09 08:57:00,941 SystemKeyspace.java:1130 - No host ID found, created e2dc618e-8884-47d7-ad1d-513cb20bf97c (Note: This should happen exactly once per node).
INFO [main] 2021-06-09 08:57:01,218 OutboundTcpConnection.java:108 - OutboundTcpConnection using coalescing strategy DISABLED
INFO [HANDSHAKE-k8ssandra-seed-service.default.svc.cluster.local/10.96.223.171] 2021-06-09 08:57:01,321 OutboundTcpConnection.java:561 - Handshaking version with k8ssandra-seed-service.default.svc.cluster.local/10.96.223.171
INFO [main] 2021-06-09 08:57:02,280 StorageService.java:743 - Loading persisted ring state
INFO [main] 2021-06-09 08:57:02,286 StorageService.java:871 - Starting up server gossip
INFO [MigrationStage:1] 2021-06-09 08:57:05,915 ViewManager.java:137 - Not submitting build tasks for views in keyspace system_auth as storage service is not initialized
INFO [MigrationStage:1] 2021-06-09 08:57:07,160 ColumnFamilyStore.java:427 - Initializing system_auth.resource_role_permissons_index
INFO [MigrationStage:1] 2021-06-09 08:57:08,752 ColumnFamilyStore.java:427 - Initializing system_auth.role_members
INFO [MigrationStage:1] 2021-06-09 08:57:09,918 ColumnFamilyStore.java:427 - Initializing system_auth.role_permissions
INFO [MigrationStage:1] 2021-06-09 08:57:11,133 ColumnFamilyStore.java:427 - Initializing system_auth.roles
INFO [main] 2021-06-09 08:57:11,609 AuthCache.java:177 - (Re)initializing CredentialsCache (validity period/update interval/max entries) (2000/2000/1000)
INFO [main] 2021-06-09 08:57:11,619 StorageService.java:733 - Not joining ring as requested. Use JMX (StorageService->joinRing()) to initiate ring joining
INFO [StorageServiceShutdownHook] 2021-06-09 08:57:11,631 HintsService.java:209 - Paused hints dispatch
INFO [main] 2021-06-09 08:57:11,633 Gossiper.java:1780 - Waiting for gossip to settle...
WARN [StorageServiceShutdownHook] 2021-06-09 08:57:11,634 Gossiper.java:1655 - No local state, state is in silent shutdown, or node hasn't joined, not announcing shutdown
INFO [StorageServiceShutdownHook] 2021-06-09 08:57:11,638 MessagingService.java:985 - Waiting for messaging service to quiesce
INFO [ACCEPT-/10.96.223.164] 2021-06-09 08:57:11,712 MessagingService.java:1346 - MessagingService has terminated the accept() thread
INFO [StorageServiceShutdownHook] 2021-06-09 08:57:12,165 HintsService.java:209 - Paused hints dispatch
It’s not readily obvious to me what the problem is. Could you please post the following?
$ kubectl get nodes
$ kubectl get pods
And a kubectl describe on the Reaper pod please?
Hi guys,
thank you so much for your quick responses. I am happy to provide required logs.
reaper description:
Name: k8ssandra-reaper-7ffb485bb8-ggvl7
Namespace: default
Priority: 0
Node: ubuntu-focal/10.0.2.15
Start Time: Tue, 08 Jun 2021 13:51:46 +0000
Labels: app.kubernetes.io/managed-by=reaper-operator
pod-template-hash=7ffb485bb8
reaper.cassandra-reaper.io/reaper=k8ssandra-reaper
Annotations: cni.projectcalico.org/podIP: 10.96.223.166/32
cni.projectcalico.org/podIPs: 10.96.223.166/32
Status: Running
IP: 10.96.223.166
IP: 10.96.223.166
Controlled By: ReplicaSet/k8ssandra-reaper-7ffb485bb8
Containers:
reaper:
Container ID: docker://534f1ee660cbc87540d0a4e2ac1c5f5e10ca9ac994144c5cb18427d9851e705d
Image: docker.io/thelastpickle/cassandra-reaper:2.2.2
Image ID: docker-pullable://thelastpickle/cassandra-reaper@sha256:752f041e7c933602b052e8c319aef8c9770cd8073a739b7d35233cd230c3eb62
Ports: 8080/TCP, 8081/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 143
Started: Wed, 09 Jun 2021 09:01:41 +0000
Finished: Wed, 09 Jun 2021 09:03:09 +0000
Ready: False
Restart Count: 40
Liveness: http-get http://:8081/healthcheck delay=45s timeout=1s period=15s #success=1 #failure=3
Readiness: http-get http://:8081/healthcheck delay=45s timeout=1s period=15s #success=1 #failure=3
Environment:
REAPER_STORAGE_TYPE: cassandra
REAPER_ENABLE_DYNAMIC_SEED_LIST: false
REAPER_CASS_CONTACT_POINTS: [k8ssandra-dc1-service]
REAPER_AUTH_ENABLED: false
REAPER_JMX_AUTH_USERNAME: <set to the key 'username' in secret 'k8ssandra-reaper-jmx'> Optional: false
REAPER_JMX_AUTH_PASSWORD: <set to the key 'password' in secret 'k8ssandra-reaper-jmx'> Optional: false
REAPER_CASS_AUTH_USERNAME: <set to the key 'username' in secret 'k8ssandra-reaper'> Optional: false
REAPER_CASS_AUTH_PASSWORD: <set to the key 'password' in secret 'k8ssandra-reaper'> Optional: false
REAPER_CASS_AUTH_ENABLED: true
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ngzbk (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-ngzbk:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 5m49s (x252 over 87m) kubelet Back-off restarting failed container
Warning Unhealthy 48s (x80 over 94m) kubelet Liveness probe failed: Get "http://10.96.223.166:8081/healthcheck": dial tcp 10.96.223.166:8081: connect: connection refused
reaper logs:
INFO [2021-06-09 09:01:56,462] [main] i.d.s.DefaultServerFactory - Registering jersey handler with root path prefix: /
INFO [2021-06-09 09:01:56,476] [main] i.d.s.DefaultServerFactory - Registering admin handler with root path prefix: /
INFO [2021-06-09 09:01:56,477] [main] i.d.a.AssetsBundle - Registering AssetBundle with name: assets for path /webui/*
INFO [2021-06-09 09:01:56,793] [main] i.c.ReaperApplication - initializing runner thread pool with 15 threads and 2 repair runners
INFO [2021-06-09 09:01:56,793] [main] i.c.ReaperApplication - initializing storage of type: cassandra
INFO [2021-06-09 09:01:57,224] [main] c.d.d.core - DataStax Java driver 3.10.1 for Apache Cassandra
INFO [2021-06-09 09:01:57,257] [main] c.d.d.c.GuavaCompatibility - Detected Guava >= 19 in the classpath, using modern compatibility layer
INFO [2021-06-09 09:01:59,148] [main] c.d.d.c.ClockFactory - Using native clock to generate timestamps.
INFO [2021-06-09 09:02:00,125] [main] c.d.d.c.NettyUtil - Did not find Netty's native epoll transport in the classpath, defaulting to NIO.
INFO [2021-06-09 09:02:03,142] [main] c.d.d.c.p.DCAwareRoundRobinPolicy - Using data-center name 'dc1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
INFO [2021-06-09 09:02:03,152] [main] c.d.d.c.Cluster - New Cassandra host k8ssandra-dc1-service/10.96.223.171:9042 added
INFO [2021-06-09 09:02:05,261] [main] o.c.c.m.MigrationRepository - Found 12 migration scripts
WARN [2021-06-09 09:02:05,262] [main] i.c.s.CassandraStorage - Starting db migration from 21 to 27…
INFO [2021-06-09 09:02:06,130] [main] o.c.c.m.MigrationRepository - Found 12 migration scripts
WARN [2021-06-09 09:02:08,028] [clustername-worker-0] c.d.d.c.Cluster - Re-preparing already prepared query is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once. Query='insert into schema_migration(applied_successful, version, script_name, script, executed_at) values(?, ?, ?, ?, ?)'
WARN [2021-06-09 09:02:08,040] [clustername-worker-0] c.d.d.c.Cluster - Re-preparing already prepared query is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once. Query='INSERT INTO schema_migration_leader (keyspace_name, leader, took_lead_at, leader_hostname) VALUES (?, ?, dateOf(now()), ?) IF NOT EXISTS USING TTL 300'
WARN [2021-06-09 09:02:08,114] [clustername-worker-0] c.d.d.c.Cluster - Re-preparing already prepared query is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once. Query='DELETE FROM schema_migration_leader where keyspace_name = ? IF leader = ?'
ERROR [2021-06-09 09:02:09,342] [main] i.c.ReaperApplication - Storage is not ready yet, trying again to connect shortly...
org.cognitor.cassandra.migration.MigrationException: Error during migration of script 022_cluster_states.cql while executing 'ALTER TABLE cluster ADD state text;'
at org.cognitor.cassandra.migration.Database.execute(Database.java:269)
at java.util.Collections$SingletonList.forEach(Collections.java:4824)
at org.cognitor.cassandra.migration.MigrationTask.migrate(MigrationTask.java:68)
at io.cassandrareaper.storage.CassandraStorage.migrate(CassandraStorage.java:362)
at io.cassandrareaper.storage.CassandraStorage.initializeCassandraSchema(CassandraStorage.java:293)
at io.cassandrareaper.storage.CassandraStorage.initializeAndUpgradeSchema(CassandraStorage.java:250)
at io.cassandrareaper.storage.CassandraStorage.<init>(CassandraStorage.java:238)
at io.cassandrareaper.ReaperApplication.initializeStorage(ReaperApplication.java:480)
at io.cassandrareaper.ReaperApplication.tryInitializeStorage(ReaperApplication.java:303)
at io.cassandrareaper.ReaperApplication.run(ReaperApplication.java:181)
at io.cassandrareaper.ReaperApplication.run(ReaperApplication.java:98)
at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:43)
at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:87)
at io.dropwizard.cli.Cli.run(Cli.java:78)
at io.dropwizard.Application.run(Application.java:93)
at io.cassandrareaper.ReaperApplication.main(ReaperApplication.java:117)
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Invalid column name state because it conflicts with an existing column
at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:50)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:35)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:293)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:58)
at org.cognitor.cassandra.migration.Database.executeStatement(Database.java:277)
at org.cognitor.cassandra.migration.Database.execute(Database.java:261)
... 15 common frames omitted
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Invalid column name state because it conflicts with an existing column
at com.datastax.driver.core.Responses$Error.asException(Responses.java:181)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:215)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:235)
at com.datastax.driver.core.RequestHandler.access$2600(RequestHandler.java:61)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:1011)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:814)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1287)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1205)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at com.datastax.driver.core.InboundTrafficMeter.channelRead(InboundTrafficMeter.java:38)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1304)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:921)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:135)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:646)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
**k8ssandra.yaml:**
get nodes:
NAME           STATUS   ROLES                  AGE   VERSION
ubuntu-focal   Ready    control-plane,master   24h   v1.21.1
get pods coming…
I suspect there could be some schema disagreements due to Stargate starting at the same time Reaper performs the schema migration.
Could you uninstall your Helm release and redeploy K8ssandra with 0 Stargate replicas? Once everything is up and running (including Reaper), modify your Helm values to add 1 Stargate replica and upgrade your release: helm upgrade <release-name> k8ssandra/k8ssandra -f <values-file.yaml>
Hi Alex,
It seems you were right regarding Stargate. I have now upgraded the installation. Unfortunately, Stargate and Reaper are still restarting.
Here is the “get pods” result:
The Stargate log looks pretty much the same. To me it seems the process is somehow being terminated. No clue…
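One way to quantify "somehow terminated" is to measure how long Stargate ran before the shutdown hook appeared. The sketch below is only an illustration: it uses two lines copied from the Stargate log earlier in this thread, and the helper names `log_time` and `seconds_until_shutdown` are made up for this example.

```python
from datetime import datetime

# Two lines copied verbatim from the Stargate log posted in this thread.
FIRST_LINE = ("INFO [main] 2021-06-09 08:56:07,639 BaseActivator.java:92 - "
              "Starting persistence-cassandra-3.11 ...")
HOOK_LINE = ("INFO [StorageServiceShutdownHook] 2021-06-09 08:57:11,631 "
             "HintsService.java:209 - Paused hints dispatch")


def log_time(line: str) -> datetime:
    """Return the timestamp that follows the '[thread]' field of a log line."""
    stamp = line.split("] ", 1)[1][:23]  # 'YYYY-MM-DD HH:MM:SS,mmm'
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")


def seconds_until_shutdown(first: str, hook: str) -> float:
    """Seconds between the first log line and the first shutdown-hook line."""
    return (log_time(hook) - log_time(first)).total_seconds()


print(f"{seconds_until_shutdown(FIRST_LINE, HOOK_LINE):.3f} s")
```

A shutdown hook firing roughly a minute after startup, without any preceding error, would be consistent with the JVM being asked to stop from outside (for example by the kubelet) rather than crashing on its own, though the log alone does not prove that.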
Hey guys,
I tried to start Stargate with a 512M heap, but it had no effect.
To summarize, starting K8ssandra without Stargate works like a charm, at least from what I can observe from “get pods”.
As soon as I start Stargate, both Reaper and Stargate go into CrashLoopBackOff.
I ran dmesg. There is a lot of output (I’ve cut a bit from the top). I wasn’t able to identify anything in it. Maybe it helps you.
Hmmmm, doesn’t look like it was OOM killed. Have you checked the logs in the server-system-logger container of k8ssandra-dc1-default-sts-0?
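The exit code in the Reaper describe output points the same way. By the usual POSIX shell convention, an exit code above 128 means the process ended because of signal number (code - 128). A minimal sketch of that decoding, where `describe_exit_code` is a hypothetical helper and not part of any Kubernetes tooling:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode a container exit code using the '128 + signal number' convention."""
    if code > 128:
        try:
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            return f"exit status {code} (no matching signal)"
    return f"exited with status {code}"


# 143 is the "Exit Code" shown for the Reaper container in this thread.
print(describe_exit_code(143))
# 137 is what an OOM kill typically produces (SIGKILL), shown for comparison.
print(describe_exit_code(137))
```

On Linux, 143 decodes to SIGTERM, which is what the kubelet sends when it restarts a container, for instance after failed liveness probes. An OOM kill would normally show up as 137 together with the reason OOMKilled.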
By chance, have you tried bringing up Stargate first and then Reaper once everything has settled?
Another option is increasing the probe delays for Stargate. Depending on the size of your schema and your resource constraints, Stargate might not be able to settle gossip in time.
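To get a feel for how much time a probe configuration allows, here is a rough sketch. It uses a simplified model (no jitter, no probe timeout, no termination grace period), so treat the numbers as approximate. The Stargate probe settings are not shown in this thread, so the example plugs in the Reaper values from the describe output above (delay=45s period=15s #failure=3); the larger delay of 120s is purely hypothetical.

```python
def earliest_liveness_kill(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Seconds after container start at which the last allowed probe failure occurs.

    Simplified model: the first probe runs at initial_delay, later probes run
    every period seconds, and the container is restarted once
    failure_threshold consecutive probes have failed.
    """
    return initial_delay + (failure_threshold - 1) * period


# Reaper values taken from the describe output in this thread.
print(earliest_liveness_kill(45, 15, 3))

# Hypothetical larger initial delay, only to show the effect of raising it.
print(earliest_liveness_kill(120, 15, 3))
```

With the Reaper values this gives about 75 seconds, which is in the same range as the 88 seconds between "Started" and "Finished" in the describe output. A container that needs longer than that to answer its health check will keep getting restarted until the delay is raised.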