After the update to 1.5.x I keep running into SerialExceptions

Mine failed after 5 hrs with 1.4.2 too. My next step is to run it in safe mode and see if that works.

Communication error

There was a communication error while talking to your printer. Please consult the terminal output and octoprint.log for details. Error: Too many consecutive timeouts, printer still connected and alive?
Printer: Monoprice Maker Select Plus
Raspberry pi 4
octoprint.log (1.0 MB)

That's a different kind of error from a "SerialException" though - this one is your printer no longer responding to OctoPrint's attempts to keep a communication going, while a SerialException is the whole connection breaking down on a level closer to the bare metal.

Update 3: I have rebuilt the OctoPi instance using the latest OctoPi 0.17.0 image, upgraded the OctoPrint instance to 1.5.2, and upgraded Python to 3.7.3.

Third Party Addons installed and running:

Creality Temperature (1.2.4)
Creality-2x-temperature-reporting-fix (0.0.4)
PrintTimeGenius Plugin (2.2.7)
UI Customizer (0.1.1.1)

Happy to say after a five hour print, no serial issues. I shall continue to add plugins one by one and conduct a >3 hour print each time to ensure no issues.

Addons I plan to reinstall:
Bed Visualizer
Navbar Temperature Plugin
Octolapse

If all the above work as planned and generate no serial issues, I shall install the 'Tuya Smartplug' app.

I'll provide updates as I go.

@foosel, would you like me to try something specific or continue to provide updates?

I reloaded the micro-SD from scratch using the 0.18 OctoPi and 1.4.2 of OctoPrint. I am going to hold off updating to any of the 1.5.x versions for now until I see a fix or solution. Thanks for the help on the rollback; I never could get it to work.

Let us know if you either continue to have issues, or if it is issue free. From our perspective, it would be nice if you could keep a backup of the current SD card (using an imaging tool) & try and update again to see if it causes the same problems... but I appreciate you may not want to. Depends, we still can't find a pattern to these problems.

For now I still lack any kind of clue, so just keep the updates coming I guess. Needless to say I cannot reproduce this at all here.

I don't see a serial exception there, that's a timeout. This topic was about SerialException only, to try and work out a pattern.

I'm gonna jump in here, even though I am not experiencing the problem right now. I try to run with the minimal amount of plug-ins anyway.

I think foosel is both correct (she didn't change any code in the serial port handler) and incorrect, in that the problem did appear when the upgraded code was added.

Here's my take, after years of writing embedded code (which is basically what Octoprint is) and dealing with multiple languages: I think we are seeing the real-time limitations of Python.

Python is a dynamic language running with two parts: a language interpreter and a garbage collector. The garbage collector runs in the background cleaning and scavenging memory of the dynamic parts of the language, both of which are memory and CPU hogs. The language interpreter is also a hog, and it is interesting to note that even Java has a pre-compiler for performance reasons.

Serial port timeouts and serial exceptions are a sure indication that the software can't get back to the USART (the hardware that controls the USB) quickly enough to extract a byte before the next one comes in. In this case, it is the Python library (or Raspbian) that is doing this based on a hardware interrupt by the USART. I don't know how the Python library is written, nor how preemptive scheduling works on Raspbian, but I suspect that the load of Python stuff is preventing the interrupt handler from extracting the data properly from the USART, thus giving the errors. That's probably why 1.4 works fine, but 1.5 doesn't, simply because of the added stuff in 1.5.

What can be done? Dunno. It's no surprise that Marlin (which is written in C) performs so well even on 8-bit processors. C is a static compiled language that needs no background processes running. With a real-time OS (or no OS), timing is generally not an issue unless the software is designed poorly. Software design and software architecture are a major problem when building complex systems, and I have personally "fixed" many complex systems by having the developers document their architecture; when that is done, the problems are immediately obvious. My last position (before I retired) was Head of IT Architecture at a major airline, so I have some experience in these matters.

The higher up the abstraction you get in computer languages, the worse your performance will be.

Perhaps running faster processors will solve the problem. I.e.: Kill it with hardware.

On my Octoprint 1.5 on a Pi 2B+ (with only a couple of plug-ins running), I notice my CPU load approaching 70% and the processor throttling back because of heat. 1.4 never got that type of load. I probably won't upgrade Octoprint without getting a Pi 4B+.

If I were rewriting Octoprint (which I won't be), I probably would write it in C or C++ (C++ can be made static) for performance reasons.

Just my $.02. Better put on my asbestos suit.

I'm using Octoprint 1.5.2 on a Pi3 and even with the webcam stream open in my browser, I barely see system load higher than 0.2, CPU cores stay under 12% most of the time and three quarters of the RAM is free. Still, I experienced some kind of port timeout (even if not the same as most users here).
I'm afraid you can't kill this with hardware.

On the other hand, I installed the "Dry Run" plugin and tried to simulate some Benchy prints after taping the + pin on my USB cable, and it seems that the timeouts are not occurring anymore... I'll try some real print in the next days and see.

@foosel - Update: After printing more or less non-stop since yesterday, running 1.5.2 and Python 3.7.3 after a clean SD card image reload, I am no longer (touch wood) getting serial timeout issues.

I have reloaded all the plugins apart from GCodeEditor, and all is well (plugin list below).

One thing of note, I would get print time out errors when using the BLough-R USB mod (physical device to block the 5V feed), now I am using a bit of electrical tape to block the 5v feedback pin.

But so far, wiping the SD card and reloading OctoPi, bypassing 1.5.0 and 1.5.1 and going right to 1.5.2, along with a Python 2 to 3 upgrade, appears to have fixed my issues.

Let's see how long this lasts.

Plugins I use:
Bed Visualizer (1.0.0)
Creality Temperature (1.2.4)
Creality 2x temperature reporting fix (0.0.4)
HeaterTimeout (0.0.3)
Navbar Temperature Plugin (0.14)
Octolapse (0.4.1)
PrintTimeGenius Plugin (2.2.7)
UI Customizer (0.1.1.1)

Hi,

I have noticed communication timeouts for a few days, causing the printer to stop, and I thought it was due to the recent updates in 1.5.x.

But after reading this thread, I checked my USB plug, which I modded to block the 5V pin, and noticed that the tape masking the 5V pin was a little scratched. I just put on new tape to mask 5V and will see if it works better.

My info:
browser.user_agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36
connectivity.connection_check: 84.200.xx.xx.xx
connectivity.connection_ok: false
connectivity.enabled: true
connectivity.online: false
connectivity.resolution_check: octoprint.org
connectivity.resolution_ok: false
env.hardware.cores: 4
env.hardware.freq: 1500
env.hardware.ram: 1979641856
env.os.bits: 32
env.os.id: linux
env.os.platform: linux2
env.plugins.pi_support.model: Raspberry Pi 4 Model B Rev 1.1
env.plugins.pi_support.octopi_version: 0.17.0
env.plugins.pi_support.throttle_state: 0x0
env.python.pip: 19.3.1
env.python.version: 2.7.16
env.python.virtualenv: true
octoprint.safe_mode: false
octoprint.version: 1.5.2
printer.firmware: Robin

  • Printer model - FLSun Q5, used exclusively with OctoPi for months without problems
  • OctoPrint, Python & (if applicable) 1.5.2 /
  • Whether a connection to the printer was active at point of update or not - Yes, I think
  • Whether the issue persists in safe mode - not tested
  • Whether the issue persists after switching the USB port - not tested
  • Whether the issue persists after a reboot of the system OctoPrint is installed on - Yes
  • If on a Pi:
    • Pi model - 4B rev1.1
    octoprint.log (44.8 KB)

Wrong kind of issue, we are explicitly looking for increases of SerialException here. Communication timeouts as visible in your log are usually caused by the firmware no longer responding to OctoPrint - firmware issues or crashing controllers.
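
Since several reports in this thread mix up the two error types, here is a minimal Python sketch for telling them apart in a log. The two marker strings are taken from the messages quoted in this thread; the function name and the sample lines are hypothetical, and the exact wording in a given octoprint.log may differ between versions.

```python
import re
from collections import Counter

# Marker strings as quoted in this thread. Assumption: a real octoprint.log
# contains them verbatim; adjust the patterns if your version words them differently.
PATTERNS = {
    "serial_exception": re.compile(r"SerialException"),
    "timeout": re.compile(r"Too many consecutive timeouts"),
}


def classify_log_lines(lines):
    """Count how many lines match each error category."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Hypothetical sample lines for illustration only, not copied from a real log.
sample = [
    "Error: Too many consecutive timeouts, printer still connected and alive?",
    "SerialException (hypothetical sample line)",
    "an unrelated line",
]
counts = classify_log_lines(sample)
print(dict(counts))
```

To scan a real file, pass an open file object instead of the sample list, for example `classify_log_lines(open("octoprint.log", errors="replace"))`. A log that only shows the timeout category would be off topic for this thread.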

@Steverino I don't want to rule this out, but I also think that if this was a general issue with 1.5 on Pis and Python, we would be seeing WAY more of this. I'm currently looking at over 33k confirmed instances on 1.5.x with over a million hours of printing between them. Yet we only so far have a handful of reports in this thread here, and so far all but one of these (whose issues turned out to be coincidence caused by a mintemp error thanks to lower temperature) seem to run Creality printers.

Looking at the posix implementation of pyserial it also seems to be utilizing OS functions for the serial implementation, with a heavy utilization of select and ioctl. The specific location where the error we are trying to understand here gets raised is here:

which I read as "this will happen if select.select reported available data to be read from the serial port's file descriptor, yet upon trying to read it it wasn't there". My understanding here is that the kernel driver for the serial device takes care of the serial protocol and timings itself, and gets utilized via its file interface from Python - pretty much being treated as a bidirectional data pipe with some ioctl sugar sprinkled on top. Based on this I'm not sure that the issue is even within anything implemented in Python. I'd be happy to get corrected here however.
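
The condition described above can be reproduced without a printer. The following is a simplified sketch, not the actual pyserial code: it uses an ordinary pipe in place of the serial device, and `PortError` and `read_when_ready` are hypothetical names. It runs on POSIX systems only, since it relies on `select.select` working on pipe file descriptors.

```python
import os
import select


class PortError(Exception):
    """Stand-in for pyserial's SerialException (hypothetical name)."""


def read_when_ready(fd, size=1, timeout=1.0):
    """Wait until fd is readable, then read up to `size` bytes from it."""
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        return b""  # plain timeout: nothing arrived in time
    data = os.read(fd, size)
    if not data:
        # select() said "readable", yet the read came back empty.
        raise PortError("readiness reported but no data returned")
    return data


r, w = os.pipe()
os.write(w, b"ok")
first = read_when_ready(r, size=2)  # normal case: data was available

os.close(w)  # the other end goes away; the read end now reports readable at EOF
try:
    read_when_ready(r)
    outcome = "no error"
except PortError:
    outcome = "error"
finally:
    os.close(r)

print(first, outcome)
```

In this sketch the empty read happens because the writing end was closed. On a real serial port the equivalent would be the device vanishing or a second process reading from the same port, neither of which involves Python-side timing.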

Still having an issue with the USB after the update, but it's running great in safe mode. I'm thinking it's a plugin problem now. The only two plugins I have are AstroPrint (1.6.3) and TP-Link Smartplug (0.9.26). Will do some more testing later today with one plugin at a time. octoprint.log (206.8 KB)

I'd hate to think it's my TPLink plugin, as I use it regularly and have not had SerialException errors at all. If it does turn out to be the culprit, I'd need logs and an issue opened on the plugin's github repo.

Edit: It could be that the plugin is doing what it's told to do and powering off your printer on a temperature runaway setting or something similar?

Gina,

It’s not, although it probably is. I did a bit of research on how the BCM2837 does USB. I couldn’t find the actual hardware manual (I put a request in to Broadcom, but have not heard back), but I found the hardware peripheral info on the BCM2835 (used in the PI2), so it is probably similar to the one used in the PI3B. The operation of the serial ports (including the USB) is similar to the 16550 UART without the DMA.

Anyway, without going into a lot of hardware talk, my suspicions seem to be correct, although I have no idea how Raspbian does its I/O. What happens is the serial data is assembled and written into a register. The processor has a limited amount of time after the serial interrupt to extract that data from the UART before the next byte is written into the register. Timeouts and serial exceptions mean that when the interrupt occurs, the data is not read from the register in time, and so the processor can’t get the serial data. One can calculate this time, but obviously, Raspbian is not getting the data fast enough before the next byte of data comes in.
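
For a sense of scale, the per-byte time mentioned above can be worked out from the baud rate. This is a minimal sketch under stated assumptions: 8N1 framing (1 start bit, 8 data bits, 1 stop bit), and baud rates of 115200 and 250000 chosen only as illustrative values, since the thread does not say which rate any of the printers use. It models an unbuffered receive register as described in the post; it says nothing about buffering done by the kernel driver.

```python
def byte_time_us(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Time on the wire for one framed byte, in microseconds."""
    bits_per_frame = 1 + data_bits + parity_bits + stop_bits  # 1 start bit
    return bits_per_frame / baud * 1_000_000


def service_window_us(baud, fifo_depth=1):
    """Time until a receive buffer of `fifo_depth` bytes would overflow."""
    return fifo_depth * byte_time_us(baud)


for baud in (115200, 250000):  # illustrative rates, an assumption
    print(baud, round(byte_time_us(baud), 1), round(service_window_us(baud, 16), 1))
```

With a single-byte register the window at 115200 baud is roughly 87 microseconds. A 16-byte FIFO, the depth of a 16550, stretches that to about 1.4 milliseconds.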

The problem mainly occurs when the serial handler is poorly written, and priority is not given to the serial interrupt. Remember, I have no idea how Raspbian does their serial I/O (USB), nor how it is programmed, so I can’t pinpoint where the problem lies. Unfortunately, Gina, I don’t think you can do anything about it, except find a real-time linux that will run on the PI.

What makes the Pi different? On big computers, USB is typically handled by DMA (direct memory access), where the data from the UART is written directly into a memory buffer, and therefore, timing is only critical to keep the buffers switched so the data keeps flowing. X86 architecture is like this, and the 16550 UART works like this. The little machines (like the PI or the Arduino) don’t have enough real estate on board to implement this, and so you have timing issues on high data throughput.

My suspicion on Raspbian is that they ported Debian without rewriting (or minimally rewriting) the I/O handling. These tiny chips require a total rewrite of the I/O handlers when used in a high-throughput or real time capacity. A real-time OS (or a real-time Linux) should have done this already.

When it comes to the snake (Python), how the implementation of the language interpreter and garbage collector interfaces with Linux may be part of the problem. If they are part of a Linux thread spawn (at the OS level), then all that software can affect the I/O timing because of the software load. You might not even know this from CPU utilization because CPU utilization might not even be calculating this. I typically go by CPU temperature to determine the load on the CPU, not the utilization number. On my PI 2B+, based on the temperature, the CPU utilization is fairly high because I can see throttling taking place, and the 3D prints show it.
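
Throttling can also be read directly instead of being inferred from temperature. The system info posted earlier in this thread includes `env.plugins.pi_support.throttle_state: 0x0`. Below is a minimal sketch for decoding such a value, assuming the field uses the same bit layout as the Raspberry Pi firmware's `vcgencmd get_throttled` output; the function name is hypothetical.

```python
# Bit meanings as documented for `vcgencmd get_throttled`.
# Assumption: pi_support's throttle_state uses the same layout.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttle_state(value):
    """Return the list of flag descriptions set in a throttle state value."""
    if isinstance(value, str):
        value = int(value, 16)  # accepts strings such as "0x0" or "0x50005"
    return [label for bit, label in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


print(decode_throttle_state("0x0"))      # the value from the posted system info
print(decode_throttle_state("0x50005"))  # hypothetical example value
```

A value of `0x0`, as in the report above, means no under-voltage or throttling flags have been set since boot.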

My only comment is to try to explain what is going on, not specifically how to fix it. The real fix is not to use Raspbian as a real-time OS, and to code certain portions of Octopi in C or C++, but I understand this probably isn’t practical.

Take a look at the source code of Marlin, and you will get a fair idea of writing embedded code. The PI is great for some things, but not great for others. Octoprint might have been better on an Arduino, not a PI. The Arduino is a simpler architecture and is easier to code without an OS.

The PI isn’t a big general purpose computer that does multiple things. It’s a tiny machine, that probably does one or two small things well. I had a similar argument in another part of this forum where I was asking questions about shutting off the main power supply of the printer, which also powers Octoprint. I got arguments as to why the PI should be left on all the time because it can do other things. I countered this argument, but I couldn’t convince some folks. Anyway, I figured out how to do what I wanted to do on my own, and now Octoprint nicely shuts down the printer power supply when it shuts down.

Steve

@Steverino as I said though, if this was a case of "the pi is at its limit with 1.5.x and python is just ill suited here" we'd see way more cases of this, across all possible printers. We don't. In fact we only see a handful. That does not indicate a general fundamental problem rooted in OS and/or architecture - that is identical to literally thousands of setups that print problem free. We are trying to identify a pattern here that differs from the setups known to work. Considering the available information, I find "Raspbian combined with Python" to be the cause here somewhat unlikely.

Depends on what is going on inside the processor.

Just giving my opinion.

@foosel

Just an update for you, printing using 1.5.1 after a few days, I've had no serial connection issues.

I've removed the 'OctoPrint-TuyaSmartplug' and 'GcodeEditor' plugins and I've not had any issues.

Now I know versions of GCodeEditor prior to the latest are included in the plugin blacklist. So that may not be playing nice with OctoPrint.

Also, the TuyaSmartplug plugin was logging errors into the octoprint.log for a while after the install.

Unfortunately I no longer have those logs.

Intermediary summary:

  • :white_check_mark: @Doug_s_Printing:
    • unknown printer w/ Marlin 1.1.9 (ADVi3++ 4.0.6-dev)
    • OctoPrint 1.5.2, Python 2.7.16, OctoPi 0.17.0
    • communication timeout with 1.4.2
    • 1.5.2 works w/o issue in safe mode, suspects Astroprint or TP-Link Smartplug
  • :x: @Rancorbin:
    • Ender 5 plus
    • no further info
  • :no_entry: @Spooky_Wookie:
    • Ender 3 & CR10s Pro
    • false positive, "too many timeouts"
  • :no_entry: @drfish:
    • Prusa Mini
    • false positive, room too cold & mintemp
  • :white_check_mark: @Red_Gorilla:
    • unknown printer
    • switching USB port solved issue
  • :white_check_mark: @apainter2:
    • Ender 3 v2 w/ Marlin E3V2-2.0.x-14-Smith3D.M (Oct 13 2020)
    • OctoPrint 1.5.1, Python 3.7.3, OctoPi 0.17.0
    • downgrade to 1.4.2 w/o reboot still caused issue, after hard reboot printed without any failures in safe mode
    • reflashed card, switch to Python 3.7.3, upgrade to 1.5.2, minimal plugins (creality temperature, creality 2x reporting fix, printtimegenius, ui customizer) -> no issue
    • no issue since removing TuyaSmartplug and GcodeEditor from plugin collection
  • :x: @AevnsGrandpa:
    • Ender 3 w/ MKS Gen L & Marlin TH3D U1.R2.B5
      • no problem on Artillery Sidewinder X1 w/ MKS Gen L & vanilla Marlin
    • OctoPrint 1.5.1, Python 3.7.3, OctoPi 0.17.0
  • :no_entry: @fificap:
    • FLSun Q5
    • false positive, "too many timeouts"

Legend: :x: exception problem reported and unsolved, :white_check_mark: exception problem reported and solved, :no_entry: false positive

OctoPrint 1.5.2
Ender 5 Pro
Debian Buster
Creality board 4.2.2
Creality firmware 1.3.1 w/BLTouch with Adapter
Python 3.7.3

Troubleshooting:
Disabled plugins

Changed USB cable; power feed is taped off.
Applied OS updates
Changed USB port
Enabled logging

Current PLUGINS:
Bed Visualizer
Detailed Progress
Exclude Region
Navbar Temperature Plugin
PrintTimeGenius Plugin
Printer Dialogs
Printer Notifications
Sidebar Webcam
Simple Emergency Stop
Themeify
Thingiverse Plugin
Virtual Printer
Octolapse

I enabled logging after the USB port swap and I haven't been able to duplicate the error since.
I had experienced the serial connection error three times across two separate models, each time five to six hours into an 8-hour print.
To the best of my knowledge my printer was connected and powered on during the upgrade of Octoprint.
Applying the updates in Debian and changing the USB port seems to have possibly resolved my issue. If I can duplicate the error I will publish the serial.log files.

octoprint (2).log (387.6 KB)