Starting Elasticsearch Service fail after upgrade - Graylog Central (peer support)

Starting Elasticsearch service fails after upgrading Graylog from 4.2.7 to 4.2.8
This is the message that I see after rebooting the system:
apr 15 09:08:34 xxxxxxxx systemd[1]: Starting Elasticsearch...
apr 15 09:09:49 xxxxxxxx systemd[1]: elasticsearch.service: start operation timed out. Terminating.
apr 15 09:09:50 xxxxxxxx systemd[1]: elasticsearch.service: Failed with result 'timeout'.
apr 15 09:09:50 xxxxxxxx systemd[1]: Failed to start Elasticsearch.
xxxxxxxx@xxxxxxxx:~$ sudo systemctl status elasticsearch.service
● elasticsearch.service - Elasticsearch
     Loaded: loaded (/lib/systemd/system/elasticsearch.service; enabled; vendor preset: enabled)
     Active: failed (Result: timeout) since Fri 2022-04-15 09:09:50 CEST; 18s ago
       Docs: https://www.elastic.co
    Process: 759 ExecStart=/usr/share/elasticsearch/bin/systemd-entrypoint -p ${PID_DIR}/elasticsearch.pid --quiet (code=exited, status=143)
   Main PID: 759 (code=exited, status=143)
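A side note on the status=143 in the output above: by shell convention an exit status above 128 encodes 128 plus a signal number, and 143 - 128 = 15, which is SIGTERM. That is consistent with systemd terminating the process once the start timeout expired. A minimal sketch of that arithmetic, where `signal_from_status` is a hypothetical helper name and not part of any tool:

```shell
# Decode an exit status into a signal name, assuming the 128+N convention.
# signal_from_status is a hypothetical helper used only for illustration.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l <number> prints the signal name for that signal number
        kill -l "$((status - 128))"
    else
        echo "exited normally with status $status"
    fi
}

signal_from_status 143
```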
Fortunately, it happened on the test system and not on the main server.

I performed the same operation on two other Graylog systems identical to the one that failed, and the problem did not occur on either of them.

On the system that failed, restarting the service manually gets everything working again.
What could have caused this?

I am reluctant to update the main server before I understand how to fix this problem, should it reappear there.

I would rather not have to resort to restarting the service manually.
Thanks for taking the time.
Hello @alessio.dapelo
There could be a couple of different reasons why it failed to start.

I need to ask a couple of questions.

When Elasticsearch failed to start, what did the log files show before and after the failed start?
vi /var/log/elasticsearch/elasticsearch.log
When this issue occurred, was journalctl checked by any chance? journalctl is used to view the systemd logs.
root#  journalctl
 alessio.dapelo:
On the system that failed, restarting the service manually gets everything working again.
Correct me if I’m wrong, but after you restarted Elasticsearch it started working again?

If so, it might be something with the system and not with Graylog, though I’m not 100% sure. I would need to see more data. If this is correct, it doesn’t seem to be a configuration issue.
 alessio.dapelo:
What could have caused this?
A complete scan of any log files that could show an interruption of the Elasticsearch service would help answer that.
 alessio.dapelo:
I would rather not have to resort to restarting the service manually.
I’m not sure how this upgrade and update was executed. More details would be needed.
Checklist:
1. Ensure Elasticsearch is set to start on reboot.

systemctl enable elasticsearch
2. When upgrades are applied, it is suggested to start Elasticsearch first, wait until the service is fully operational, then start the MongoDB service. Once both of those services are running fine, the next step is to start the Graylog service. These steps have never failed me.
Hope that helps.
EDIT: This may also help:
How to prevent systemd service start operation from timing out | sleeplessbeastie's notes
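The start order from the checklist can be sketched as a small script. This is only an illustration under stated assumptions: Elasticsearch answering on http://127.0.0.1:9200 without authentication, a MongoDB unit named `mongod`, and a Graylog unit named `graylog-server`. None of these are confirmed in the thread, and `wait_for` and `start_stack` are hypothetical helper names.

```shell
# wait_for ATTEMPTS DELAY COMMAND...
# Hypothetical helper: retry COMMAND until it succeeds or ATTEMPTS run out.
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Start the stack in the suggested order; stop at the first step that fails.
# Unit names and the health URL are assumptions, adjust them to your install.
start_stack() {
    systemctl start elasticsearch &&
    wait_for 30 5 curl -fsS http://127.0.0.1:9200/_cluster/health &&
    systemctl start mongod &&
    wait_for 30 2 systemctl is-active --quiet mongod &&
    systemctl start graylog-server
}

# Run as root on the affected host:
#   start_stack || echo "startup sequence failed, check journalctl"
```

The script only defines the functions; nothing is started until `start_stack` is called explicitly.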
Thanks for your answer.

I’m analyzing the logs you indicated.

To answer some of your questions: yes, restarting the elasticsearch service gets everything working again.

As for how the update was done: after setting up the various repositories, I simply ran apt-get update, apt-get upgrade and apt-get dist-upgrade from the terminal, choosing not to replace the server.conf file.

I performed the same procedure on two other identical virtual machines without running into this problem.
Could it be useful to increase the waiting time before a timeout is reported?

This is my current configuration:
XXXX@XXXX:~$ sudo systemctl show elasticsearch | grep ^Timeout
[sudo] password for XXXX:
TimeoutStartUSec=1min 15s
TimeoutStopUSec=infinity
TimeoutAbortUSec=infinity
TimeoutCleanUSec=infinity
XXXX@XXXX:~$
 alessio.dapelo:
Could it be useful to increase the waiting time before a timeout is reported?
I’m not sure that will help. We need to know exactly why that service failed. The timeout is reported because systemd tries to start or restart the service, and when it is unable to bring the service up by the end of those attempts, the start operation times out.
So increasing the timeout may only delay the inevitable, which is the error we’re seeing.
What is needed is more logs to identify the issue.
Executing systemctl status elasticsearch -l may show more details on this issue.

Using journalctl to show the system logs is also helpful when troubleshooting services.
Examples:
journalctl -ef

Jump to the end of the journal (-e) and enable follow mode (-f).
journalctl -u elasticsearch

This will display all messages generated by, and about, elasticsearch.service.
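To narrow the journal down to the start, timeout, and failure events, the output can be piped through a filter. A minimal sketch, assuming message wording like the lines quoted in the first post; `startup_events` is a hypothetical helper name.

```shell
# Keep only the systemd start/timeout/failure lines from journal text on stdin.
# startup_events is a hypothetical helper; the patterns assume the message
# wording seen in the first post of this thread.
startup_events() {
    grep -E 'Starting |start operation timed out|Failed (with result|to start)'
}

# Typical use on the affected host (current boot only, no pager):
#   journalctl -u elasticsearch -b --no-pager | startup_events
```

Comparing the timestamps of the "Starting" and "timed out" lines shows how long systemd waited before giving up.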
 alessio.dapelo:
As for how the update was done: after setting up the various repositories, I simply ran apt-get update, apt-get upgrade and apt-get dist-upgrade from the terminal, choosing not to replace the server.conf file.
I have done that too, but now when I upgrade or update Graylog I make it a point to shut down the Graylog service, run the updates, then start Graylog back up and tail -f the log files. A few times I have caught problems that manifested during the start-up process. If Graylog is the only service that requires updates I do not stop the service, though this also depends on which version is going to be installed.
With Elasticsearch I tend to check the release notes for that version to ensure an easy upgrade process. Sometimes it may require a restart of the ES service. This also depends on which ES version is being updated to.
I’ll do some more checks on the system logs.

For now, using the command
sudo systemctl edit --full elasticsearch.service
I increased the timeout. I then restarted the virtual machine several times to test it, and the elasticsearch service started normally every time.
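Note that `systemctl edit --full` replaces the whole unit file with a local copy under /etc/systemd/system, so later packaged changes to the unit are shadowed by that copy. A drop-in that sets only `TimeoutStartSec` is a narrower alternative. The sketch below writes such a drop-in; the file name `startup-timeout.conf`, the helper name `write_timeout_dropin`, and the value 180 are illustrative assumptions, not values from the thread.

```shell
# write_timeout_dropin DIR SECONDS
# Hypothetical helper: write a systemd drop-in that overrides only the
# start timeout. DIR is the unit's drop-in directory.
write_timeout_dropin() {
    unit_dir=$1
    timeout=$2
    mkdir -p "$unit_dir" &&
    printf '[Service]\nTimeoutStartSec=%s\n' "$timeout" \
        > "$unit_dir/startup-timeout.conf"
}

# As root on the affected host (180 seconds is an example value):
#   write_timeout_dropin /etc/systemd/system/elasticsearch.service.d 180
#   systemctl daemon-reload
#   systemctl show elasticsearch -p TimeoutStartUSec
```

Running `sudo systemctl edit elasticsearch.service` without `--full` opens an editor on an equivalent drop-in and reloads systemd when the editor is closed.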