ERROR - Could not scan /dev for serial ports on the system

I am running the latest Octoprint on a RPI5/Bookworm.

I notice that I am getting the following error message in my octoprint.log file, repeated approximately ONCE A SECOND!!!!

2024-06-03 19:32:54,153 - octoprint.util.comm - ERROR - Could not scan /dev for serial ports on the system
Traceback (most recent call last):
  File "/home/myuser/octoprint/venv/lib/python3.11/site-packages/octoprint/util/comm.py", line 212, in serialList

The relevant code in comm.py is:

def serialList():
    if os.name == "nt":
        candidates = []
        try:
            key = winreg.OpenKey(
                winreg.HKEY_LOCAL_MACHINE, "HARDWARE\\DEVICEMAP\\SERIALCOMM"
            )
            i = 0
            while True:
                candidates += [winreg.EnumValue(key, i)[1]]
                i += 1
        except Exception:
            pass

    else:
        candidates = []
        try:
            with os.scandir("/dev") as it:
                for entry in it:
                    if regex_serial_devices.match(entry.name):
                        candidates.append(entry.path)
        except Exception:
            logging.getLogger(__name__).exception(
                "Could not scan /dev for serial ports on the system"
            )
...

Looking back on historical logs, this problem started occurring in mid-March.
I am not aware of any changes that occurred then.

Not sure if this is relevant, but I run and always have run octoprint in user space (non-root) without any problems.

Deleting the template does not liberate you from posting essential information:

  • What printer,
  • What OctoPrint and OctoPi version
  • Systeminfo bundle.

  • Printer: Prusa MK3S+
  • octoprint, version 1.10.1
  • I don't use Octopi

octoprint-systeminfo-20240603232316.zip (92.2 KB)

Also, I noticed the following frequent repeats in my octoprint.log file:

2024-06-03 19:45:53,613 - octoprint.server - INFO - Serial port list was updated, refreshing the port list in the frontend
2024-06-03 19:45:53,700 - octoprint - ERROR - Exception on /api/connection [GET]
  File "/home/kosowsky/octoprint/venv/lib/python3.11/site-packages/flask/app.py", line 2529, in wsgi_app
  File "/home/kosowsky/octoprint/venv/lib/python3.11/site-packages/flask/app.py", line 1825, in full_dispatch_request
  File "/home/kosowsky/octoprint/venv/lib/python3.11/site-packages/flask/app.py", line 1823, in full_dispatch_request
  File "/home/kosowsky/octoprint/venv/lib/python3.11/site-packages/flask/app.py", line 1799, in dispatch_request
  File "/home/kosowsky/octoprint/venv/lib/python3.11/site-packages/octoprint/vendor/flask_principal.py", line 196, in _decorated
  File "/home/myuser/octoprint/venv/lib/python3.11/site-packages/octoprint/server/api/connection.py", line 27, in connectionState
  File "/home/myuser/octoprint/venv/lib/python3.11/site-packages/octoprint/server/api/connection.py", line 76, in _get_options
  File "/home/myuser/octoprint/venv/lib/python3.11/site-packages/octoprint/printer/profile.py", line 357, in get_all
  File "/home/myuser/octoprint/venv/lib/python3.11/site-packages/octoprint/printer/profile.py", line 483, in _load_all
  File "/home/myuser/octoprint/venv/lib/python3.11/site-packages/octoprint/printer/profile.py", line 500, in _load_all_identifiers
OSError: [Errno 24] Too many open files: '/home/kosowsky/.octoprint/printerProfiles'

I imagine that may be related????

Have you tested it in safe mode?

2024-06-03 19:32:53,150 - octoprint.util.comm - ERROR - Could not scan /dev for serial ports on the system
Traceback (most recent call last):
  File "/home/myuser/octoprint/venv/lib/python3.11/site-packages/octoprint/util/comm.py", line 212, in serialList
OSError: [Errno 24] Too many open files: '/dev'

Yes, that is indeed related. That's an issue on operating system level. If a reboot doesn't fix it, you could try running in safe mode in case a plugin is responsible for the open file count. It could however also be something else on your system, in which case you'll have to figure that out on your own, as we have no idea what else you might be running on there.

In the meantime, you can disable the serial port scan by setting serial.autorefresh to false in config.yaml (note to self: that needs documentation). However, that will only stop that specific feature from running into the open file problem, not solve it. Your system will not function properly until that is solved.

When this occurs what does the following command yield?

sudo cat /proc/$(pgrep octoprint)/limits
sudo lsof -p $(pgrep octoprint)
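
For anyone who would rather check this from Python, here is a minimal sketch of the same two checks. It assumes Linux with /proc mounted. The function names are made up for illustration, the limit lookup only covers the process that runs the script, and reading another user's /proc/<pid>/fd needs root.

```python
import os
import resource


def open_fd_count(pid="self"):
    """Count the entries in /proc/<pid>/fd, i.e. the descriptors currently open."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def open_file_limits():
    """Return the (soft, hard) RLIMIT_NOFILE pair of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"open fds: {open_fd_count()}  soft limit: {soft}  hard limit: {hard}")
```

Passing a PID, e.g. open_fd_count(12345), counts the descriptors of that process instead. A count that keeps creeping toward the soft limit would point to a descriptor leak.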

Well.. What do you use?

[quote="kantlivelong, post:7, topic:59366, full:true"]

When this occurs what does the following command yield?

sudo cat /proc/$(pgrep octoprint)/limits
sudo lsof -p $(pgrep octoprint)

[/quote]
Will do!

I use Raspbian Bookworm for easy and simple maintenance

  • I have multiple PI across my house using the same basic setup
  • I run multiple (related) programs on each Pi

This makes it very easy for me to add/swap/move RPI's across my house as needed since I can be sure that the setup, updates, tools, security, etc. are the same everywhere. This is efficient and minimizes the number of separate RPi's and distinct setups that I need.

The only exception is HomeAssistant since it seems that if I run HA as a standalone app then I lose some of the core functionality.

OK - well rebooting last night stopped the problem (before I saw your post) and then this AM, I found that the problem recurred:

#sudo cat /proc/$(pgrep -f "/octoprint ")/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    0                    bytes
Max resident set          unlimited            unlimited            bytes
Max processes             30850                30850                processes
Max open files            1024                 1048576              files
Max locked memory         1055604736           1055604736           bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       30850                30850                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
#sudo lsof -p $(pgrep -f "/octoprint ")

See attached file since it's very long
lsof-octoprint.zip (18.6 KB)

The key lines though appear to be 827 copies of lines of the form:

python3 308476 myuser  982u     IPv4             838164      0t0     TCP mymachine.mydomain:http-alt->mymachine.mydomain:40800 (CLOSE_WAIT)

This might also explain why the web interface becomes unresponsive -- i.e., `http://mymachine:5000' just hangs without any response.
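
A rough way to tally those lines from a saved lsof listing is sketched below. The helper name and the sample text are illustrative: the CLOSE_WAIT row is copied from above, while the LISTEN row is invented for the example. The parsing assumes the TCP state is the final parenthesised field, as in that line.

```python
import re
from collections import Counter

# Matches a trailing "(STATE)" field such as "(CLOSE_WAIT)".
STATE_RE = re.compile(r"\((\w+)\)\s*$")

# First row copied from the lsof output above; second row is made up.
SAMPLE = """\
python3 308476 myuser  982u     IPv4             838164      0t0     TCP mymachine.mydomain:http-alt->mymachine.mydomain:40800 (CLOSE_WAIT)
python3 308476 myuser   12u     IPv4             700001      0t0     TCP *:http-alt (LISTEN)
"""


def tcp_state_counts(lsof_text):
    """Count TCP sockets per connection state in plain lsof output."""
    counts = Counter()
    for line in lsof_text.splitlines():
        if "TCP" not in line.split():
            continue  # skip regular files, pipes, UDP sockets, ...
        match = STATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for state, n in tcp_state_counts(SAMPLE).most_common():
        print(f"{state}: {n}")
```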

I also noticed in my octoprint log, that just before I started getting the error message:

OSError: [Errno 24] Too many open files: 

I had a slew of hundreds of lines of form:

2024-06-04 05:36:04,568 - backoff - ERROR - Giving up capture_jpeg(...) after 3 tries (requests.exceptions.ConnectionError: HTTPConnectionPool(host='octopi', port=8080): Max retries exceeded with url: /?action=snapshot (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ffef82de1d0>: Failed to establish a new connection: [Errno 22] Invalid argument')))

So it seems that the JPEG snapshot captures fail, which somehow leaves hung HTTP requests and stale socket handles behind, and those accumulate until the open-file limit is exceeded.

This probably occurs since there is an issue in my camera setup where the camera freezes (randomly) after a certain amount of time...

The question though is why do those file handles remain open rather than being closed...

BTW, if I recall correctly, the snapshots that are being called every few seconds are being triggered by the OctoApp plugin that uses snapshots to send to my Android Gear watch.

Of course, I'm sure I could temporarily solve the problem by either removing the camera and/or disabling OctoApp but there seems to be a bug at the Octoprint App level (or elsewhere) that doesn't recover gracefully from hung jpeg snapshot requests.

It seems like it is not an OS issue. See my full comment below.

In summary though, I believe something like this happens.

  • OctoApp uses the http '/?action=snapshot' to generate a snapshot every few seconds that it sends to my Android wearable watch as a poor-man's video
  • However, my camera setup hangs after a random amount of time (this is a separate bug I am working on)
  • This causes multiple error messages of the following form in my octoprint log file
2024-06-04 05:36:04,568 - backoff - ERROR - Giving up capture_jpeg(...) after 3 tries (requests.exceptions.ConnectionError: HTTPConnectionPool(host='octopi', port=8080): Max retries exceeded with url: /?action=snapshot (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ffef82de1d0>: Failed to establish a new connection: [Errno 22] Invalid argument')))
  • These failed/hung snapshot http requests generate a listing in lsof of the form:
python3 308476 myuser  982u     IPv4             838164      0t0     TCP mymachine.mydomain:http-alt->mymachine.mydomain:40800 (CLOSE_WAIT)
  • After some 800+ of these stale file handles are stuck waiting, I start getting messages in my octoprint log of form:
OSError: [Errno 24] Too many open files: ...

and then repeated streams of errors like:

OSError: [Errno 24] Too many open files: '/dev'
2024-06-04 06:48:39,671 - octoprint.util.comm - ERROR - Could not scan /dev for serial ports on the system
  • And about that time the http interface http://mymachine:5000 becomes unresponsive

So the question is why the failed http snapshot requests hang and accumulate rather than being closed.

Of course, I could probably stop the problem by fixing/disabling my camera or disabling Octoapp, but I would like to fix the root cause which seems to be a bug in the way failed snapshots are handled.

Note that all of the above can be reset by restarting OctoPrint (no reboot needed); the problem then stays away until the cycle repeats.

Note I can also manually verify that once the camera hangs, taking a snapshot adds a `CLOSE_WAIT' socket line to the lsof listing.
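
For background: CLOSE_WAIT is the TCP state of a socket whose peer has already closed its end while the local process has not yet called close() on its own descriptor. The lsof lines above show http-alt (8080) as the local port, so one possible reading, not verified here, is that the unclosed end belongs to whatever serves snapshots on that port inside this process. The sketch below shows the general pattern of a handler that releases its socket even when the capture fails. `handle_snapshot` and the `capture_jpeg` argument are hypothetical stand-ins, not the Obico or OctoPrint code.

```python
import socket


def handle_snapshot(conn, capture_jpeg):
    """Answer one snapshot request and always release the socket afterwards."""
    try:
        conn.settimeout(5)       # do not wait forever on a stalled peer
        conn.recv(65536)         # read (and ignore) the HTTP request
        jpeg = capture_jpeg()    # may raise if the camera is frozen
        head = (
            "HTTP/1.0 200 OK\r\n"
            "Content-Type: image/jpeg\r\n"
            f"Content-Length: {len(jpeg)}\r\n\r\n"
        )
        conn.sendall(head.encode("ascii") + jpeg)
    except Exception:
        # Tell the client the capture failed; ignore errors if it already left.
        try:
            conn.sendall(b"HTTP/1.0 503 Service Unavailable\r\nContent-Length: 0\r\n\r\n")
        except OSError:
            pass
    finally:
        # Without this close, the descriptor stays open after the client
        # hangs up, and the socket then sits in CLOSE_WAIT.
        conn.close()
```

The point of the `finally` is that a frozen camera becomes a quick 503 plus a freed descriptor, rather than one more socket counted against the 1024 soft limit.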

You do know the new nightly builds of octopi ARE Raspbian bookworm...

Yes - but I want to include and maintain the packages I generally need - not just the ones for Octopi. That way I upgrade and maintain all my Pi's identically. Note that it really has been trivial to install Octoprint on top of Raspbian.

I personally don't want to have to install, learn, manage, customize, upgrade etc. another OS flavor for each app I run -- e.g., mythos, octopi, home assistant, etc.
There is a reason OS and Apps are distinct...

Have you actually isolated it back to the OctoApp plugin? There are a couple of other plugins that will generate that backoff line in octoprint.log. For example, I see you also have Obico, and that one will also do stuff with the camera.

As I mentioned above, the problem isn't the OctoApp plugin; that's just an example of something that calls snapshots enough times to exceed the number of open sockets allowed. Indeed, anything that generates snapshots using the '/?action=snapshot' URL can trigger this problem. So I can trigger the same bug if I manually hit the link http://mymachine:8080/?action=snapshot some 800+ times after the camera freezes.

The question is why doesn't the socket free up when it hangs?

Now to be fair I am using a modified version of Obico webstream_stream.py (upgraded to Picamera2) to create the still frames (for snapshots) and the streaming (for video) but not sure why that would lead to the http sockets being kept open rather than released when the jpeg snapshot fails.
Said another way, the snapshots are being called out of Octoprint itself, so if they fail to return, shouldn't Octoprint release the resource?

It's not OctoPrint that's serving the snapshot so why would it be OctoPrint's job to kill those?

Guessing but it could be an issue where requests isn't closing the connection when it throws an exception. Curious if this happens without plugins while using core timelapse functionality.
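
To illustrate the client-side half of that guess, here is a minimal sketch of a snapshot fetch that uses a timeout and closes its connection on every path. It is not how OctoPrint or any plugin fetches snapshots: the standard library's urllib stands in for requests so the example has no dependencies, and `fetch_snapshot` is a made-up name.

```python
import http.client
import urllib.request


def fetch_snapshot(url, timeout=5.0):
    """Return the snapshot bytes, or None if the camera cannot be read in time."""
    try:
        # The with-block closes the response (and its socket) even if read() fails.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except (OSError, http.client.HTTPException):
        # URLError, refused connections and timeouts are all OSError subclasses.
        return None


if __name__ == "__main__":
    # URL taken from the thread; adjust host and port to the actual setup.
    data = fetch_snapshot("http://mymachine:8080/?action=snapshot")
    print("no snapshot" if data is None else f"{len(data)} bytes")
```

Running something like this in a loop against the frozen camera while watching the lsof output could help tell whether the leaked descriptors come from the calling side or the serving side.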

That's a good reason, but running other apps on the Pi while printing can cause stuttering as interrupts from the other apps hit the SoC, etc. You're asking for more issues by loading the Pi up with everything else.
