It can be said that I have something of an obsession with watchdogs. There are about six posts on the subject on this site and I am working on a couple of ongoing hardware watchdog projects. I should have installed a watchdog on the Raspberry Pi that is hosting my home automation system before now, but better late than never.
There is a hardware watchdog timer built into the Raspberry Pi SoC (system on a chip), so it should be enabled to restart the Raspberry Pi if the operating system crashes. In addition, I want to monitor Domoticz and restart the system if it stops working while the operating system remains intact. There are two pages on the Domoticz site on the subject.
The latter uses Monit to monitor Domoticz. This is a general solution which could monitor many additional services. I will look into this later. For now, I want to implement the basics:
- Reboot if the hardware watchdog times out.
- Reboot if Domoticz hangs.
- Send an email notification when the system reboots.
There is much information on the Web but some of it is out of date (as this post will be soon enough). The lesson is to check things out before installing anything. A case in point: the watchdog kernel module is already installed in the latest Raspbian image (Stretch Nov. 2018) and nothing needs to be done to that effect.
However, it is not operational. To check this, I ran the "forkbomb" script.
After a while, the Raspberry Pi froze, Domoticz ceased functioning and I could no longer open an ssh session. The only option was to turn off the power and then power the Raspberry Pi back up. To recover from such a situation, install the watchdog service and edit its configuration file.
The first two lines were already in the file but they were comments, so the leading "#" at the start of each of those two lines has to be removed. The third line, about the timeout, is necessary because watchdog sets a default timeout of 60 seconds and the Raspberry Pi watchdog timer only supports a 15-second timeout. If the line is omitted, the following error will be encountered.
Many thanks to Florian Harr for this fix. Next, start the watchdog daemon and verify that everything is working correctly.
Running the "forkbomb" again showed that the watchdog is effective. After a while the open session froze, but eventually, the Raspberry Pi rebooted and a new ssh session could be opened. Looking at the log it was possible to see when the watchdog bit.
Interestingly, it looks like watchdog could send a message about the shutdown using sendmail, which could be the answer to my third objective. I used a different approach, as will be seen later.
For now, let us look at how to monitor Domoticz.
I followed the lead in Setting up the raspberry pi watchdog but without turning on the Domoticz log. Instead, the Linux trick of "touching" an empty file to update its last modified date on a regular basis will be the way to "feed" the watchdog. All it takes is a simple Lua script that will be executed every minute by Domoticz.
Check the file time on a regular basis to ensure that it is updated every minute.
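The check can also be scripted. As a minimal sketch, assuming a hypothetical heartbeat path and threshold (neither is taken from the setup described here), the following Python helpers touch a file and report whether its modification time has gone stale. The function names are made up for illustration.

```python
import os
import time
from pathlib import Path


def touch_heartbeat(path):
    """Create the file if it does not exist and set its modification time to now."""
    Path(path).touch()


def heartbeat_age(path, now=None):
    """Return the number of seconds since the file was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def is_stale(path, max_age):
    """Return True if the file is missing or older than max_age seconds."""
    try:
        return heartbeat_age(path) > max_age
    except FileNotFoundError:
        # A heartbeat file that was never written counts as stale.
        return True


if __name__ == "__main__":
    # Hypothetical path and threshold, for illustration only.
    heartbeat = "/tmp/example.heartbeat"
    touch_heartbeat(heartbeat)
    print("stale" if is_stale(heartbeat, 120) else "fresh")
```

With a file that is touched every minute, a threshold of two minutes or more leaves room for one missed update before the file is reported as stale.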
The watchdog configuration file has to be updated. Two lines at the top need to be changed.
Stop and restart the watchdog service and then wait over five minutes (300 seconds) to ensure that the system is not rebooted. Then stop the Domoticz service and the Raspberry Pi should be rebooted in about five minutes.
Heed the warning about reboots: "[b]e sure to remember this when stopping the domoticz service by hand!"
So what about email notification? Again, I will use the approach proposed in the first reference. It simply sends an email each time the Raspberry Pi is rebooted. This is simpler than having watchdog do it. I am sure the latter is possible, and it would then be possible to send a different email pointing out the reason for the reboot. Something to do later.
In section 3. Mail Alert Using syslog of a recent post, I modified a short Python script to send a set email. I decided to reuse that script, making it a bit more versatile. This avoids installing sendmail, which does look like a rather formidable task.
Here is the Python 3 script.
Downloadable version: pymail.
The script is saved as the file pymail in the pythons directory of the pi home directory. The last bit is to add a cron task to send the email at reboot time.
As can be seen, there was already a task performed at reboot, but it does not matter if another is added. I did impose a one-minute wait before sending the message. That may be excessive, but initially there were problems in sending the email.
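For illustration only, and not the pymail script itself, here is a minimal sketch of sending a fixed reboot notice with Python's standard smtplib and email modules. The server name, port, addresses, credentials and the helper names build_message and send_with_retry are all hypothetical. The sketch assumes an SMTP server that accepts STARTTLS, and the retry loop is offered as an alternative to a fixed wait, on the assumption that early failures come from the network not being ready right after boot.

```python
import smtplib
import time
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_with_retry(msg, host, port=587, user=None, password=None,
                    attempts=3, delay=20):
    """Try to send msg, pausing between failed attempts.

    Returns True once the message is handed to the server,
    False if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            with smtplib.SMTP(host, port, timeout=30) as smtp:
                smtp.starttls()  # assumes the server supports STARTTLS
                if user:
                    smtp.login(user, password)
                smtp.send_message(msg)
            return True
        except (OSError, smtplib.SMTPException):
            # Network or server not ready yet: wait, then try again.
            if attempt < attempts:
                time.sleep(delay)
    return False


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    notice = build_message("pi@example.com", "me@example.com",
                           "Raspberry Pi rebooted",
                           "The home automation server has just rebooted.")
    send_with_retry(notice, "smtp.example.com",
                    user="pi@example.com", password="secret")
```

Keeping the message construction separate from the sending makes it easy to reuse the same sender for other notices.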