Ok, I have had nothing but grief after upgrading to 24.9.5 and the problems are still continuing weeks after the upgrade. The mail server randomly stops responding with "unable to start zmconfigd" messages and when I restart the server I get this:
Host mail.xxx.ca
Starting directory server...Done.
Starting config service...Failed.
Starting zmconfigd...Failed to start zmconfigd.
Starting mailbox...Done.
Starting memcached...Done.
Starting proxy...Done.
Starting amavis...Done.
Starting antispam...Done.
Starting antivirus...Done.
Starting opendkim...Done.
Starting mta...Done.
Starting stats...Done.
Starting service webapp...Done.
So I issue the following command to check the status:
zextras@mail:~$ zmcontrol status
Host mail.xxx.ca
amavis Running
antispam Running
antivirus Running
directory-server Running
mailbox Running
memcached Running
mta Running
opendkim Running
proxy Running
service webapp Running
service-discover Running
stats Running
config service Running
And lo and behold, the server reports that it is running normally. But it's not, because when we look at the logs we see:
Oct 25 10:33:11 mail zmconfigd[1499788]: Command not defined for service-discover
Oct 25 10:33:07 mail zmconfigd[1499788]: Command not defined for directory-server
Oct 25 10:32:50 mail zmconfigd[1499788]: All configs fetched in 0.11 seconds
Oct 25 10:32:50 mail zmconfigd[1499788]: Fetching All configs
Oct 25 10:31:49 mail zmconfigd[1499788]: All restarts completed in 0.00 sec
Oct 25 10:31:49 mail zmconfigd[1499788]: All rewrite threads completed in 0.02 sec
Oct 25 10:31:49 mail zmconfigd[1499788]: Watchdog: service antivirus status is OK.
Oct 25 10:31:46 mail zmconfigd[1499788]: Command not defined for service-discover
Oct 25 10:31:43 mail zmconfigd[1499788]: Command not defined for directory-server
Oct 25 10:31:16 mail zmconfigd[1499788]: All configs fetched in 0.36 seconds
Oct 25 10:31:16 mail zmconfigd[1499788]: Fetching All configs
Oct 25 10:30:15 mail zmconfigd[1499788]: All restarts completed in 0.00 sec
Oct 25 10:30:15 mail zmconfigd[1499788]: All rewrite threads completed in 0.13 sec
Oct 25 10:30:14 mail zmconfigd[1499788]: Watchdog: service antivirus status is OK.
Oct 25 10:30:10 mail zmconfigd[1499788]: Command not defined for service-discover
Oct 25 10:30:04 mail zmconfigd[1499788]: Command not defined for directory-server
Oct 25 10:29:48 mail zmconfigd[1499788]: All configs fetched in 0.10 seconds
Oct 25 10:29:48 mail zmconfigd[1499788]: Fetching All configs
Oct 25 10:28:46 mail zmconfigd[1499788]: All restarts completed in 0.00 sec
Oct 25 10:28:46 mail zmconfigd[1499788]: All rewrite threads completed in 0.01 sec
Oct 25 10:28:46 mail zmconfigd[1499788]: Watchdog: service antivirus status is OK.
Oct 25 10:28:43 mail zmconfigd[1499788]: Command not defined for service-discover
Oct 25 10:28:42 mail zmconfigd[1499788]: Command not defined for directory-server
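For anyone comparing notes, it can help to tally which services zmconfigd complains about and how often. The following is only a rough sketch: count_undefined is a hypothetical helper name, and it assumes the zmconfigd lines arrive on stdin in the syslog format shown in this post, with the service name as the last word on the line.

```shell
# Hypothetical helper: tally "Command not defined for <service>" messages.
# Assumes syslog-style zmconfigd lines on stdin, service name as last field.
count_undefined() {
  awk '/Command not defined for / {
         count[$NF]++                 # $NF is the service name
       }
       END {
         for (svc in count) print svc, count[svc]
       }' | sort                      # sort for stable, readable output
}
```

If the messages end up in the systemd journal under the zmconfigd identifier (an assumption, as the thread does not say where the log was read from), something like `journalctl -t zmconfigd | count_undefined` would feed it.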
So I issued a command to register the service, like so:
zextras@mail:~$ zmprov ms `zmhostname` -zimbraServiceEnabled zmconfigd -zimbraServiceInstalled zmconfigd
Doesn't work though, as I am still getting the same log errors as before, which can only mean that the server WILL AGAIN crash at some indeterminate point in the near future! This is really becoming a serious problem for me as I am losing faith in Carbonio as a stable platform. Any ideas how I can fix this?
Could you provide more information about your environment? Could you also provide some logs from around the time the crash occurs?
Just completed an uneventful upgrade to 24.9.7. Let's hope this one works out better than the last upgrade (I'm looking at you, broken ldap module!). BTW, I upgraded with all services running.
Glad to hear it worked smoothly. Keeping services running (especially OpenLDAP) during the upgrade is the way to go now.
That should be explicitly mentioned in the upgrade instructions, btw: "Keeping services running (especially OpenLDAP) during the upgrade is the way to go now."
We removed the zmcontrol stop command from the upgrade procedure; it was responsible for stopping (among others) the OpenLDAP service. We believe that suffices. If you don't agree, we'll be happy to discuss this further.
Happy to. I used this upgrade guide, which does not explicitly state that you should keep your server running when you upgrade to 24.9.7. This is at odds with previous upgrades, which required a server shutdown, so it might confuse people. I only kept the server running as I upgraded because I remembered my nightmare from the previous upgrade with the broken LDAP module, where the fix was to keep your server running when you re-installed the fixed LDAP.
Step one could be:
1. Verify that Carbonio is running with no critical errors:
$ zmcontrol status
# systemctl status carbonio*
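The first check could also be scripted so that it fails loudly. This is just a sketch under assumptions: all_running is a hypothetical helper name, and it assumes zmcontrol status prints a "Host ..." header followed by one service per line with the state as the last word, as in the output posted earlier in this thread.

```shell
# Hypothetical pre-upgrade check: reads `zmcontrol status` output on stdin,
# prints every line whose last field is not "Running", and returns
# non-zero if any such line was found.
all_running() {
  awk '$1 == "Host"     { next }            # skip the host header line
       NF == 0          { next }            # skip blank lines
       $NF != "Running" { print; bad = 1 }  # report anything not Running
       END              { exit bad + 0 }'   # 0 = all good, 1 = problem
}
```

It could then be used as `zmcontrol status | all_running && echo "all services running"`. Note that it only looks at what zmcontrol reports, so a case like the one at the top of this thread, where the status looks fine but the logs do not, would still pass.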