I've noticed that some of our Raspberry Pis running OctoPrint have issues sending web requests. They often fail with "getrandom() initialization failed". This happens both for OctoPrint itself and for a plugin I've made that relies heavily on web requests.
Examples of the error (from the log):
octoprint.plugins.announcements - ERROR - Could not fetch channel _important from https://octoprint.org/feeds/important.xml: HTTPSConnectionPool(host='octoprint.org', port=443): Max retries exceeded with url: /feeds/important.xml (Caused by SSLError(SSLError("bad handshake: Error([('', 'osrandom_rand_bytes', 'getrandom() initialization failed.')],)",),))
SSLError: HTTPSConnectionPool(host='plugins.octoprint.org', port=443): Max retries exceeded with url: /notices.json (Caused by SSLError(SSLError("bad handshake: Error([('', 'osrandom_rand_bytes', 'getrandom() initialization failed.')],)",),))
octoprint.plugins.SimplyPrint - INFO - [SimplyPrint] - Web request FAILED; URLError = getrandom() initialization failed. (_ssl.c:661)
Googling led me to this Stack Overflow question regarding the same issue, where the recommended fix is installing rng-tools. I have yet to test whether installing it makes a difference for the printers that have the issue, as they are all currently mid-print.
If this were only a problem for my plugin, I'd just deal with it locally, but since it affects OctoPrint core functions I thought it better to bring it up here. Has anyone here experienced this issue?
From what I understand, there's a randomization service which works behind the scenes in a variety of situations. I would guess that one of those situations is when a new HTTPS session key needs to be generated, if you have code which tries to reach out to another website.
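As a rough illustration (my assumption, based on the "osrandom_rand_bytes" frame in the error, is that OpenSSL draws from the same getrandom()-backed source the OS exposes), this is the kind of call that would end up failing:

```python
import os

# os.urandom() is backed by getrandom() on modern Linux; a TLS handshake
# needs a similar burst of random bytes for its session keys. If the
# kernel's random pool hasn't finished initializing, a call like this is
# what ultimately blocks or fails.
session_random = os.urandom(32)  # 32 random bytes, e.g. for a client random
print(len(session_random))
```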
At the beginning of OctoPrint's startup and especially first boot of the day, it tries to reach back to the OctoPrint newsfeed to ask for a number of things. In some cases, the server is busy and can't be reached; there are lots of OctoPrint instances out there asking the same feed-related questions.
In other cases, though, I've seen messages in the syslog which amount to "whatever I'm using to generate random bits has been exhausted, so I'm failing for now". I believe typical implementations of this library use mouse and keyboard events, among other things, to seed the generator. On a typical headless OctoPrint computer, though, the service may be starved of this input.
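A quick way to look for this kind of starvation on the Pi is to read the kernel's entropy counter (a sketch; the /proc path is Linux-specific and the threshold below is a rough rule of thumb, not an official number):

```python
import os

ENTROPY_FILE = "/proc/sys/kernel/random/entropy_avail"

def available_entropy(path=ENTROPY_FILE):
    """Return the kernel's available entropy in bits, or None if the
    file doesn't exist (e.g. on non-Linux systems)."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return int(f.read().strip())

bits = available_entropy()
if bits is not None and bits < 1000:
    print("Entropy pool looks starved: %d bits" % bits)
```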
But this isn't just a problem with OctoPrint not getting the newsfeed announcements and such. It also happens with my plugin when requesting a server that has no issues whatsoever. This is not the servers' fault, but the request service's (in my case urllib2 in Python, but other request methods seem to have the same issue, e.g. requests).
I think you're missing the point. You're seeing this systemic error through the OctoPrint logs, but it is probably also reported in your /var/log/syslog on any given day. In theory it affects not only HTTPS from any service running on the machine; it should also negatively affect anything that needs randomness, including things outside the realm of web traffic.
I might also ask whether any of these are Raspberry Pi 4B computers. The thought here is that the nifty new ARM Cortex-A72 isn't yet represented in the compiled hardware list for rng-tools.
Yeah, I get that, but this doesn't just seem to be a problem that I'm having; potentially everyone could experience it. So in terms of making sure it doesn't happen, how would I / we go about that?
The Stack Overflow question I linked might be the solution. I'm trying a lot of different things right now. First step: using the requests library and setting verify=False, which has helped in other cases with similar symptoms.
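For reference, this is roughly what verify=False amounts to with the standard library (a sketch in modern Python; the code in this thread used urllib2). Note it disables certificate checking entirely, so it's a diagnostic step rather than a real fix:

```python
import ssl
import urllib.request

# Build a context that skips certificate verification, the stdlib
# equivalent of requests.get(url, verify=False). This drops
# man-in-the-middle protection, so don't leave it in production code.
context = ssl.create_default_context()
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE

# Example use (commented out to avoid a live network call):
# response = urllib.request.urlopen(
#     "https://octoprint.org/feeds/important.xml", context=context)
```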
Oh, alright. I'm just seeing more incidents of this "starvation" in the Pi 4B threads out there. For what it's worth, the Cortex-M4 specs indicate that hardware-based true random number generation is included there. So any inability to generate entropy suggests to me that the hardware-based seeder isn't bound to the driver.
These are all regular old RPi 3s. Most requests work, but in the odd cases where they don't (which often leads to entire periods of nothing working), it's quite fatal. I'll try some more possible solutions.
True, true. In our case we have our own OctoPi-cloned RPi image, so we'll just include the fix ourselves (if it is indeed installing the package mentioned above), but it would be nice to have a fix for the public image too.
In our case, we send web requests on events like PrintDone, and if our system doesn't get that message, a lot of stuff goes wrong (the print is never recorded as done, etc.).
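Until the root cause is fixed, one mitigation we're considering is retrying failed event requests, since the entropy pool usually recovers after a while. A minimal sketch (a generic helper, not OctoPrint API; the exception types and delay are assumptions for illustration):

```python
import time

def with_retry(fn, attempts=3, delay=2.0, retry_on=(OSError,)):
    """Call fn(), retrying up to `attempts` times on the given exception
    types, sleeping `delay` seconds between tries. Re-raises the last
    error if every attempt fails."""
    last_error = None
    for _ in range(attempts):
        try:
            return fn()
        except retry_on as e:
            last_error = e
            time.sleep(delay)
    raise last_error

# Hypothetical use around an event notification:
# with_retry(lambda: requests.post(url, json=payload),
#            retry_on=(requests.exceptions.SSLError,))
```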
This is still happening on a regular basis. By now all our printers have experienced the issue at least once. @foosel, is this something you have stumbled upon before?
We have also tied this to issues resulting in the printer not stopping correctly. OctoPrint knows the print is stopped, but the printer still thinks it's printing. Once the printer receives any new G-code it realizes it's no longer printing, but this also means it often doesn't cool down (depending on your "end gcode" in the slicer).
A little extra info: installing the pyopenssl package did not help. Neither did installing the rng-tools package, nor using Python requests without SSL verification (requests.get(url, allow_redirects=True, verify=False)), which had fixed it for our local Python scripts but doesn't work in this case. Next up is installing haveged.
So far a mix of the Python package pyopenssl, the apt packages rng-tools and haveged, and the verify=False option in requests.get seems to have done the trick. I'm quite sure one of the apt packages is unnecessary, as they're supposed to do the same thing. I will do some testing later on which one does the job, if not both. It'd be nice if verify=False weren't necessary; I will test whether it still is.
Now official OctoPrint requests never fail, and neither do requests from my external plugin. I'm sceptical about whether this will return, though, as it's been an issue for many months and has only been gone for about a day.
I will note that I sometimes get failed RSS pulls on my own rig. I assumed that was due to server contention, but this might be the actual cause.