Morning all,
I am currently using watchdog version 5.13 and I want to use the temperature-device option.
Here is my watchdog.conf:
I set the limit to 60°C on purpose for testing.
I start it with watchdog -v
and I have attached /var/log/messages.
As you can see, it does not print the correct temperature: it reports 0°C instead of the value I get by reading the file directly (cat /sys/class/thermal/thermal_zone0/temp).
I cannot edit my ticket, and the temperature_cpu attachment is empty.
So here is a replacement file: messages.txt.
Have a good day,
AR
Can you post the results of reading the temperature directly? E.g. this sort of thing:
cat /sys/class/thermal/thermal_zone0/temp > /tmp/result.txt
Also check this when it is running, as there is a known bug that the /sys/class stuff may be enumerated randomly if you have multiple sensor chips and the modules are loaded in parallel (when auto-detected, for speed) rather than in prescribed order from all being in /etc/modules or similar.
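To check which zone is which while watchdog is running, something like the following minimal sketch may help. It assumes the usual sysfs layout in which each thermal_zoneN directory has a "type" and a "temp" file; the function name read_thermal_zones is only illustrative and is not part of watchdog.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return (zone_name, type, raw_temp) tuples, sorted by zone name.

    Assumes each thermal_zoneN directory holds 'type' and 'temp' files.
    The raw temperature is returned as the unmodified string from sysfs.
    """
    root = Path(base)
    if not root.is_dir():
        return []
    zones = []
    for zone in sorted(root.glob("thermal_zone*")):
        try:
            kind = (zone / "type").read_text().strip()
            raw = (zone / "temp").read_text().strip()
        except OSError:
            # Zone vanished or is unreadable; skip it.
            continue
        zones.append((zone.name, kind, raw))
    return zones


for name, kind, raw in read_thermal_zones():
    print(f"{name}: type={kind} temp={raw}")
```

Comparing the "type" column across reboots should show whether thermal_zone0 always refers to the same sensor.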
Regards, Paul
Just looked again at your watchdog.conf file and see it is using "temperature-device" suggesting it is not up to date compared to the current GIT code, as that is using the temperature-sensor keyword to distinguish between the old binary read and the new ASCII read of temperature.
If you can, I suggest you get the current code and rebuild it on your chosen machine and try again. There is a summary of the procedure in my answer to this question:
http://askubuntu.com/questions/272788/how-can-i-tell-how-long-for-watchdog-to-wait-to-stop-all-processes
Hope that helps. Regards, Paul
Morning Paul,
First of all, thank you for the reply.
The result of "cat /sys/class/thermal/thermal_zone0/temp > /tmp/result.txt" is correct and always gives me the current temperature.
I did not include any driver or module for my sensor; I just access it via /sys/class/.... I am asking my vendor which sensor is used in the processor.
I am currently using Yocto to build my OS (http://layers.openembedded.org/layerindex/recipe/122/) and I am probably missing something with "temperature-device".
Let's keep in touch,
Regards, AR
Last edit: AR 2014-04-18
I downloaded watchdog_5.13, and its watchdog.conf.5 contains:
"temperature-device = <temp-dev>
Set the temperature device name. Default is to disable temperature checking."
What did I do wrong?
Is watchdog_5.13 not the current GIT code?
The problem with "version 5.13" comes from the way the GIT copy is updated and distributed. Most distros (like Ubuntu, etc) will get the snapshot that Michael generated on 1 Feb 2013 using the download from:
https://sourceforge.net/projects/watchdog/
This is internally version 5.13, however, the GIT repository has various patches/updates that have been applied since then, and one of those was for the lm-sensors style of temperature monitoring:
https://sourceforge.net/p/watchdog/code/ci/033febf84a12268fece6bfb95b2f5c4fe7e9e742/
Unfortunately, the version number in GIT is still 5.13 as Michael normally updates it when he decides that a new "stable" version is worth releasing.
If you want the latest of the officially maintained code here, then use the command:
git clone git://git.code.sf.net/p/watchdog/code watchdog-code
You may need to install the git program. For a typical Debian/Ubuntu flavour of Linux you would do (as root, or using sudo):
apt-get install git
This will create the sub-directory 'watchdog-code' and populate it with the source code, etc, so use the command in a sensible place, such as one you normally build code in, or in ~/Downloads or similar.
Once you have made the copy, you need to configure it, etc, and probably will need to install the supporting packages, as I reported in the earlier message.
Regards,
Paul
Version 5.13 was released and tagged as such on Feb 1st last year. The git archive does contain a lot of changes for the next release in HEAD. Actually I was planning to do an interim release to get the current version out, but I need some time to add some more patches first.
In other words, using 5.13 always means using the released version. If you're following git, please point that out. The watchdog version number, as Paul correctly pointed out, is only updated when a release is done.
Morning all,
You were right. I updated watchdog to sha="4bf2a299fc2dce711f2261c40edc06f4c87fa16d", and it works now.
I put
temperature-device = /sys/class/thermal/thermal_zone0/temp
max-temperature = 50
in watchdog.conf (note that the keyword is "temperature-device"; "temperature-sensor" is not recognized)
and now I get the following log:
Mar 29 17:48:43 foo daemon.info watchdog[1510]: still alive after 6146 interval(s)
Mar 29 17:48:43 foo daemon.info watchdog[1510]: current temperature is 0.044 for /sys/class/thermal/thermal_zone0/temp
Mar 29 17:48:43 foo daemon.info watchdog[1510]: still alive after 6147 interval(s)
Mar 29 17:48:43 foo daemon.info watchdog[1510]: current temperature is 0.044 for /sys/class/thermal/thermal_zone0/temp
You can close the bug if you want.
Thank you,
AR
Good to hear that you have something working, but this is not looking right:
"current temperature is 0.044 for /sys/class/thermal/thermal_zone0/temp"
I presume that should be 44 deg C but we are seeing it scaled down by a factor of 1000. This suggests your temperature sensor is not reporting in milli-Celsius as the lm-sensors packages should!
Can you tell us exactly what is shown by "cat /sys/class/thermal/thermal_zone0/temp" ?
This may not be a watchdog problem, of course, but we really want to check this point.
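As a rough illustration of the scaling (a sketch only, assuming the raw value is divided by 1000 as the log output suggests, and assuming the file holds a bare integer; the function name is hypothetical and not taken from the watchdog source):

```python
def millidegrees_to_celsius(raw: str) -> float:
    """Interpret a raw sysfs reading as milli-Celsius and return deg C."""
    return int(raw.strip()) / 1000.0


# A sensor that reports whole degrees ends up 1000 times too small:
print(millidegrees_to_celsius("44"))     # 0.044
# A sensor that reports milli-Celsius gives the expected value:
print(millidegrees_to_celsius("44000"))  # 44.0
```

So a file containing 44 rather than 44000 would explain the 0.044 in the log.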
Regards,
Paul
Morning Paul,
Could you tell me what should convert the temperature?
I added all the following packages to my image:
lmsensors-libsensors
lmsensors-libsensors-dbg
lmsensors-libsensors-dev
lmsensors-libsensors-staticdev
lmsensors-libsensors-doc
lmsensors-sensors
lmsensors-sensors-dbg
lmsensors-sensors-doc
lmsensors-sensord
lmsensors-sensord-dbg
lmsensors-sensord-doc
lmsensors-sensorsdetect
lmsensors-sensorsdetect-doc
lmsensors-sensorsconfconvert
lmsensors-config-fancontrol
lmsensors-config-libsensors
lmsensors-config-sensord
I started sensord,
but the conversion did not work...
Do you have any idea?
Here is some information:
Apr 28 08:54:30 foo daemon.info watchdog[1537]: current temperature is 0.046 for /sys/class/thermal/thermal_zone0/temp
root@foo:~# cat /sys/class/thermal/thermal_zone0/temp
46
/usr/sbin/sensors-detect ends with a segfault...
Best regards,
AR
Dear AR,
Unfortunately I don't know enough about the lm-sensors software, or your particular computer's configuration, to be of much help. However, the lm-sensors forum is quite active and you may be able to get some help there.
The reading of "46" from the temperature is in keeping with the watchdog's result of 0.046, but not in keeping with the expectation of the units being milli-Celsius. For example, this documentation:
https://www.kernel.org/doc/Documentation/thermal/sysfs-api.txt
"temp
Current temperature as reported by thermal zone (sensor).
Unit: millidegree Celsius
RO, Required
"
So my conclusion is this is not a watchdog bug, but unfortunately some bug in the temperature system in your configuration.
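If it helps with reporting this to whoever maintains the thermal driver, here is a small heuristic sketch for guessing which unit a raw reading is in. The plausible range of 5 to 125 deg C is purely my assumption, and guess_unit is a made-up helper, not something from watchdog or lm-sensors.

```python
def guess_unit(raw: str, plausible=(5.0, 125.0)) -> str:
    """Guess the unit of a raw sysfs temperature reading.

    'plausible' is an assumed range of sane temperatures in deg C.
    """
    value = int(raw.strip())
    low, high = plausible
    if low <= value / 1000.0 <= high:
        return "millidegree Celsius"
    if low <= value <= high:
        return "degree Celsius"
    return "unknown"


print(guess_unit("46"))     # degree Celsius
print(guess_unit("46000"))  # millidegree Celsius
```

With the reading of 46 shown above, this reports whole degrees, which does not match the documented unit.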
Regards,
Paul