Hey folks,
I have run into this sort of issue before. I am running a Docker image against a 3-node real-life (non-Docker) mongo cluster. The cluster is alive, online, replicated, etc. This setup works with RocketChat 4.8.1 in Docker. Upgrading to 5.x fails with:
[root@rc01:~/new-docker] # docker-compose up
WARNING: The DEPLOY_PLATFORM variable is not set. Defaulting to a blank string.
Creating network "new-docker_default" with the default driver
Creating new-docker_rocketchat_1 ... done
Attaching to new-docker_rocketchat_1
rocketchat_1 | /app/bundle/programs/server/node_modules/fibers/future.js:313
rocketchat_1 | throw(ex);
rocketchat_1 | ^
rocketchat_1 |
rocketchat_1 | MongoServerSelectionError: Server selection timed out after 30000 ms
rocketchat_1 | at Timeout._onTimeout (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/sdam/topology.js:312:38)
rocketchat_1 | at listOnTimeout (internal/timers.js:557:17)
rocketchat_1 | at processTimers (internal/timers.js:500:7) {
rocketchat_1 | reason: TopologyDescription {
rocketchat_1 | type: 'Single',
rocketchat_1 | servers: Map(3) {
rocketchat_1 | 'rc01.example.com:27017' => ServerDescription {
rocketchat_1 | _hostAddress: HostAddress {
rocketchat_1 | isIPv6: false,
rocketchat_1 | host: 'rc01.example.com',
rocketchat_1 | port: 27017
rocketchat_1 | },
rocketchat_1 | address: 'rc01.example.com:27017',
rocketchat_1 | type: 'Unknown',
rocketchat_1 | hosts: [],
rocketchat_1 | passives: [],
rocketchat_1 | arbiters: [],
rocketchat_1 | tags: {},
rocketchat_1 | minWireVersion: 0,
rocketchat_1 | maxWireVersion: 0,
rocketchat_1 | roundTripTime: -1,
rocketchat_1 | lastUpdateTime: 1166882,
rocketchat_1 | lastWriteDate: 0
rocketchat_1 | },
rocketchat_1 | 'rc02.example.com:27017' => ServerDescription {
rocketchat_1 | _hostAddress: HostAddress {
rocketchat_1 | isIPv6: false,
rocketchat_1 | host: 'rc02.example.com',
rocketchat_1 | port: 27017
rocketchat_1 | },
rocketchat_1 | address: 'rc02.example.com:27017',
rocketchat_1 | type: 'Unknown',
rocketchat_1 | hosts: [],
rocketchat_1 | passives: [],
rocketchat_1 | arbiters: [],
rocketchat_1 | tags: {},
rocketchat_1 | minWireVersion: 0,
rocketchat_1 | maxWireVersion: 0,
rocketchat_1 | roundTripTime: -1,
rocketchat_1 | lastUpdateTime: 1166881,
rocketchat_1 | lastWriteDate: 0
rocketchat_1 | },
rocketchat_1 | 'rc03.example.com:27017' => ServerDescription {
rocketchat_1 | _hostAddress: HostAddress {
rocketchat_1 | isIPv6: false,
rocketchat_1 | host: 'rc03.example.com',
rocketchat_1 | port: 27017
rocketchat_1 | },
rocketchat_1 | address: 'rc03.example.com:27017',
rocketchat_1 | type: 'Unknown',
rocketchat_1 | hosts: [],
rocketchat_1 | passives: [],
rocketchat_1 | arbiters: [],
rocketchat_1 | tags: {},
rocketchat_1 | minWireVersion: 0,
rocketchat_1 | maxWireVersion: 0,
rocketchat_1 | roundTripTime: -1,
rocketchat_1 | lastUpdateTime: 1166886,
rocketchat_1 | lastWriteDate: 0
rocketchat_1 | }
rocketchat_1 | },
rocketchat_1 | stale: false,
rocketchat_1 | compatible: true,
rocketchat_1 | heartbeatFrequencyMS: 10000,
rocketchat_1 | localThresholdMS: 15,
rocketchat_1 | setName: 'rc010',
rocketchat_1 | logicalSessionTimeoutMinutes: undefined
rocketchat_1 | }
rocketchat_1 | }
As I am using a real-life mongo cluster, this is my compose.yml:
volumes:
  rocketchat:
services:
  rocketchat:
    image: registry.rocket.chat/rocketchat/rocket.chat:${RELEASE:-latest}
    restart: unless-stopped
    environment:
      MONGO_URL: mongodb://rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017/rocketchat?authSource=admin&replicaSet=rc010&w=majority&directConnection=true
      MONGO_OPLOG_URL: mongodb://rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017/local?authSource=admin&replicaSet=rc01&directConnection=true
      ROOT_URL: https://chat.example.com
      PORT: ${PORT:-3000}
      DEPLOY_METHOD: docker
      DEPLOY_PLATFORM: ${DEPLOY_PLATFORM}
    expose:
      - ${PORT:-3000}
    ports:
      - "${BIND_IP:-0.0.0.0}:${HOST_PORT:-3000}:${PORT:-3000}"
    volumes:
      - /var/rocketchat:/uploads
The mongo_url worked before; I just added &directConnection=true to the URL, to no avail. My .env file:
RELEASE=5.1.2
DOMAIN=chat.example.com
I can confirm Mongo is up and running at this version:
mongodb-org-database-tools-extra-5.0.12-1.el8.x86_64
mongodb-org-tools-5.0.12-1.el8.x86_64
mongodb-org-mongos-5.0.12-1.el8.x86_64
mongodb-org-database-5.0.12-1.el8.x86_64
mongodb-mongosh-1.5.4-1.el8.x86_64
mongodb-database-tools-100.6.0-1.x86_64
mongodb-org-server-5.0.12-1.el8.x86_64
mongodb-org-shell-5.0.12-1.el8.x86_64
mongodb-org-5.0.12-1.el8.x86_64
The 3-node cluster consists of rc01, rc02 and rc03, each running one instance of rocketchat and one instance of mongo. There are no firewalls between the nodes. Like I said, it does work with 4.8.1, not with 5.x. Switching docker to 4.x makes it work, upgrading to 5.x breaks it.
It must be something trivial. Any help?
-Chris.
I added
MONGODB_ADVERTISED_HOSTNAME: rc01.example.com
to compose.yml, same result. I am guessing at what I should set that to.
It’s great that you took the time to post an extensive 1 1/2 hour video, but it would not go amiss to also post a small “Readme before Update” somewhere. I really appreciate it, though!
And if you could walk with me just a tad further that’s also appreciated!
Yeah, the idea was to do a smaller video, but there was too much to cover, and there is no one solution that fits all, especially when using multiple replica sets, so it’s better to understand the problem at its core so you can choose the best solution.
Try checking the members config of your replica set in mongo and reconfigure it to point to the correct hostname instead of localhost or 127.0.0.1.
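A minimal sketch of that check, assuming the replica-set configuration has already been fetched as a plain dictionary (for example the "config" document returned by the `replSetGetConfig` admin command). The helper name `loopback_members` and the sample data are made up for illustration:

```python
# Hypothetical helper: list replica-set members whose configured host
# is a loopback name that other machines or containers cannot reach.
LOOPBACK_NAMES = {"localhost", "127.0.0.1", "::1"}


def loopback_members(rs_config):
    """Return the 'host' strings of members configured with a loopback address."""
    flagged = []
    for member in rs_config.get("members", []):
        host = member["host"]
        # Member hosts look like "name:port"; drop the port and IPv6 brackets.
        name = host.rsplit(":", 1)[0].strip("[]")
        if name in LOOPBACK_NAMES:
            flagged.append(host)
    return flagged


# Made-up sample shaped like a replica-set config document.
sample = {
    "_id": "rc01",
    "members": [
        {"_id": 0, "host": "localhost:27017"},
        {"_id": 1, "host": "rc02.example.com:27017"},
        {"_id": 2, "host": "rc03.example.com:27017"},
    ],
}
print(loopback_members(sample))
```

In mongosh, `rs.conf().members` shows the same member list directly; any entry flagged here is a host name that a client on another machine or inside a container would resolve to itself.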
I think that’s what’s wrong here. I am not running mongo in a docker container, but natively in the guest OS on the docker hosts. Only RocketChat is run inside a docker container. I assume
MONGODB_ADVERTISED_HOSTNAME
is a variable for a possible mongo-docker container and not rocketchat?
That is right.
Or you can change the configuration of the members of your replica set.
But note, this is more an environment/deployment issue than a Rocket.Chat one =\
Also consider that mongo 5.0 has some CPU incompatibilities. For now we are targeting mongo 4.4, but unless you face this CPU incompatibility, you should be fine.
Thanks!
replSetName: rc01
That’s the entire replication configuration from mongod.conf; all the rest is standard. Did you change anything in your mongo-docker-container?
Help.
The strange part is that Rocket.Chat seems to not be able to reach your MongoDB at
rc01.example.com:27017
Are they on the same network or is there any firewall between them?
What is the content of the members of your replicaset?
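One way to answer the network question is a plain TCP check against each mongo host, run from the Docker host or from any container on the same network that has Python available. This is only a sketch: the function name `can_connect` is invented here, and a successful TCP connect says nothing about replica-set discovery or authentication.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_all(hosts, port=27017):
    """Return a {host: reachable} mapping for the given host names."""
    return {host: can_connect(host, port) for host in hosts}
```

Calling `check_all(["rc01.example.com", "rc02.example.com", "rc03.example.com"])` would then report which of the three nodes accept connections on port 27017.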
Thanks for continuing to help me. So much appreciated!
The rocketchat container 1 is on host 1 and connecting to mongodb on host 1.
The rocketchat container 2 is on host 2 and connecting to mongodb on host 2.
The rocketchat container 3 is on host 3 and connecting to mongodb on host 3.
Well, in a perfect world. Only one mongo is primary, that’s why I am using
MONGO_URL: mongodb://rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017/rocketchat?authSource=admin&replicaSet=rc010&w=majority&directConnection=true
in the config. There are no firewalls in between. Like I said, this setup does work with 4.8.1 currently. Upgrading (only) the rocketchat docker container breaks it; going back to 4.8.1 makes it work again. I can also see access from the upgraded docker containers to mongo, so connectivity is not an issue.
It must be something inside docker container for rocketchat 5.x.
Addendum:
Here are the mongod.log lines for the upgraded docker container.
{"t":{"$date":"2022-09-23T09:17:55.153+02:00"},"s":"I", "c":"NETWORK", "id":51800, "ctx":"conn13894","msg":"client metadata","attr":{"remote":"10.100.0.101:43520","client":"conn13894","doc":{"driver":{"name":"nodejs","version":"4.3.1"},"os":{"type":"Linux","name":"linux","architecture":"x64","version":"4.18.0-372.26.1.el8_6.x86_64"},"platform":"Node.js v14.19.3, LE (unified)|Node.js v14.19.3, LE (unified)"}}}
{"t":{"$date":"2022-09-23T09:18:14.615+02:00"},"s":"I", "c":"NETWORK", "id":22944, "ctx":"conn13894","msg":"Connection ended","attr":{"remote":"10.100.0.101:43520","uuid":"22136257-990b-46d4-a143-7a8b502d2b95","connectionId":13894,"connectionCount":73}}
{"t":{"$date":"2022-09-23T09:18:14.615+02:00"},"s":"I", "c":"NETWORK", "id":22944, "ctx":"conn13893","msg":"Connection ended","attr":{"remote":"10.100.0.101:60100","uuid":"fa7772e0-40b5-44ab-b7fc-49cb417de741","connectionId":13893,"connectionCount":72}}
{"t":{"$date":"2022-09-23T09:18:16.882+02:00"},"s":"I", "c":"NETWORK", "id":22943, "ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.100.0.101:58386","uuid":"182e808a-d530-4e18-9032-e06a5052de1f","connectionId":13895,"connectionCount":73}}
{"t":{"$date":"2022-09-23T09:18:16.893+02:00"},"s":"I", "c":"NETWORK", "id":51800, "ctx":"conn13895","msg":"client metadata","attr":{"remote":"10.100.0.101:58386","client":"conn13895","doc":{"driver":{"name":"nodejs","version":"4.3.1"},"os":{"type":"Linux","name":"linux","architecture":"x64","version":"4.18.0-372.26.1.el8_6.x86_64"},"platform":"Node.js v14.19.3, LE (unified)|Node.js v14.19.3, LE (unified)"}}}
{"t":{"$date":"2022-09-23T09:18:27.407+02:00"},"s":"I", "c":"NETWORK", "id":22943, "ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.100.0.101:33260","uuid":"becf7b18-33e7-41d0-b762-161563e1acac","connectionId":13896,"connectionCount":74}}
{"t":{"$date":"2022-09-23T09:18:27.412+02:00"},"s":"I", "c":"NETWORK", "id":51800, "ctx":"conn13896","msg":"client metadata","attr":{"remote":"10.100.0.101:33260","client":"conn13896","doc":{"driver":{"name":"nodejs","version":"4.3.1"},"os":{"type":"Linux","name":"linux","architecture":"x64","version":"4.18.0-372.26.1.el8_6.x86_64"},"platform":"Node.js v14.19.3, LE (unified)|Node.js v14.19.3, LE (unified)"}}}
{"t":{"$date":"2022-09-23T09:18:36.871+02:00"},"s":"I", "c":"NETWORK", "id":22944, "ctx":"conn13896","msg":"Connection ended","attr":{"remote":"10.100.0.101:33260","uuid":"becf7b18-33e7-41d0-b762-161563e1acac","connectionId":13896,"connectionCount":73}}
{"t":{"$date":"2022-09-23T09:18:36.872+02:00"},"s":"I", "c":"NETWORK", "id":22944, "ctx":"conn13895","msg":"Connection ended","attr":{"remote":"10.100.0.101:58386","uuid":"182e808a-d530-4e18-9032-e06a5052de1f","connectionId":13895,"connectionCount":72}}
So it can connect, but fails going forward.
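To see how long each connection stays open, the JSON log lines can be paired up by connectionId. The following is a rough sketch that relies only on the fields visible in the log above ("t.$date", "msg", "attr.connectionId"); the function name `connection_lifetimes` is made up for this example.

```python
import json
from datetime import datetime


def connection_lifetimes(log_lines):
    """Map connectionId -> seconds between 'Connection accepted' and 'Connection ended'."""
    accepted = {}
    lifetimes = {}
    for line in log_lines:
        entry = json.loads(line)
        msg = entry.get("msg")
        if msg not in ("Connection accepted", "Connection ended"):
            continue  # e.g. "client metadata" entries carry no connectionId
        conn_id = entry["attr"]["connectionId"]
        when = datetime.fromisoformat(entry["t"]["$date"])
        if msg == "Connection accepted":
            accepted[conn_id] = when
        elif conn_id in accepted:
            lifetimes[conn_id] = (when - accepted[conn_id]).total_seconds()
    return lifetimes


# Two entries from the log above, trimmed to the fields the function reads.
lines = [
    '{"t":{"$date":"2022-09-23T09:18:16.882+02:00"},"msg":"Connection accepted","attr":{"connectionId":13895}}',
    '{"t":{"$date":"2022-09-23T09:18:36.872+02:00"},"msg":"Connection ended","attr":{"connectionId":13895}}',
]
print(connection_lifetimes(lines))
```

For connection 13895 this gives roughly 20 seconds between accept and close.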
I noticed a typo:
MONGO_URL: mongodb://rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017/rocketchat?authSource=admin&replicaSet=rc010&w=majority&directConnection=true
I changed “replicaSet=rc010” to “replicaSet=rc01” and it went a little further. Oddly, I copied and pasted the configuration from the old compose.yml to the new one. The typo is in the old compose file too, but it’s working there and failing in 5.x.
With that, it tried to connect to server 01 and failed because 01 is not the primary. That is correct; 02 currently is. The old docker container then connected to 02, whereas the rocketchat 5.x container fails on the spot.
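A typo like this can be caught mechanically by comparing the replicaSet parameter of each URL with the replSetName from mongod.conf. A small sketch using only the standard library follows; the function names `replica_set_name` and `mismatched_urls` are invented for this example, and the URLs are the ones from the compose.yml earlier in the thread.

```python
from urllib.parse import parse_qs


def replica_set_name(mongo_url):
    """Return the replicaSet query parameter of a mongodb:// URL, or None."""
    query = mongo_url.partition("?")[2]
    values = parse_qs(query).get("replicaSet")
    return values[0] if values else None


def mismatched_urls(expected, urls):
    """Return {variable: replicaSet value} for URLs that do not name `expected`."""
    found = {name: replica_set_name(url) for name, url in urls.items()}
    return {name: value for name, value in found.items() if value != expected}


hosts = "rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017"
urls = {
    "MONGO_URL": f"mongodb://{hosts}/rocketchat?authSource=admin&replicaSet=rc010&w=majority&directConnection=true",
    "MONGO_OPLOG_URL": f"mongodb://{hosts}/local?authSource=admin&replicaSet=rc01&directConnection=true",
}
# "rc01" is the replSetName quoted from mongod.conf in this thread.
print(mismatched_urls("rc01", urls))
```

Only MONGO_URL is reported, since MONGO_OPLOG_URL already names rc01.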
By sheer desperation I tried
MONGO_URL: mongodb://rc02.example.net:27017/rocketchat?replicaSet=rc01&w=majority&directConnection=true
MONGO_OPLOG_URL: mongodb://rc02.example.net:27017/local?replicaSet=rc01&directConnection=true
Leaving only the current master in there. It went further, but I stopped it before it got to upgrading the schemas. I can’t take downtime during the day.
Did you change how multiple mongo servers are supplied in some way?