Octoprint stops printing, then drops network connection mid-print

Did you add the required sudoers configuration to allow the Network Health plugin to function? The Pi4 definitely has more wifi issues than older devices from what I've seen here and in Discord, and experienced personally. Have you run the sudo apt update and sudo apt upgrade commands via SSH/terminal? That could update some system-level wifi drivers, which might make the connection more stable.

Oh, and I suspect the stopped print and the time shift were possibly caused by a brownout on the Pi, which would explain why the print stopped. When the printer goes offline and stays powered on, the print will continue.
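
If you want to check the brownout theory, the Pi keeps a record of under-voltage and throttling events since boot, readable with vcgencmd get_throttled. Here's a rough sketch that decodes the value. The bit meanings are the ones documented for Raspberry Pi OS as far as I know; the function and variable names are just mine for illustration.

```python
import subprocess

# Bit meanings as documented for `vcgencmd get_throttled` (to my knowledge).
THROTTLE_FLAGS = {
    0: "under-voltage detected (now)",
    1: "ARM frequency capped (now)",
    2: "currently throttled",
    3: "soft temperature limit active (now)",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def decode_throttled(output):
    """Turn a string like 'throttled=0x50005' into a list of readable flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [name for bit, name in THROTTLE_FLAGS.items() if value & (1 << bit)]


if __name__ == "__main__":
    # Only works on a Raspberry Pi where vcgencmd is installed.
    raw = subprocess.run(
        ["vcgencmd", "get_throttled"],
        capture_output=True, text=True, check=True,
    ).stdout
    print(raw.strip())
    flags = decode_throttled(raw)
    print("\n".join(flags) if flags else "no under-voltage or throttling reported")
```

If anything with "under-voltage" shows up after a failed print, the power supply or its cable would be my first suspect.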

I did an apt-get update and apt-get upgrade a couple of weeks ago and I added the required sudoers config when I installed the plugin.

I'm not sure I understand what you're saying here. The print stopped before the network connection dropped this time. OctoPrint had then already cancelled the print job and marked it as failed. Then a few minutes later the connection dropped as well while I was gathering diagnostics to figure out what was going on, and I needed to reboot the Pi.

Edit: I just ordered a new SD card to make sure that the SD card isn't the issue.

Ok, so I ordered a new SD card to rule out file system corruption. Installed Octopi, made a point of not installing any plugins besides the network health plugin that I apparently need (despite no other Pi4 in my network having any connection issues whatsoever) and tried a long print again. It failed again, three hours in, same symptoms as in my original post:

2022-02-05 17:19:20,259 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2022-02-05 17:34:20,261 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2022-02-05 17:49:20,262 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2022-02-05 17:52:40,291 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2022-02-05 17:52:43,301 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2022-02-05 17:52:46,304 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2022-02-05 17:52:49,308 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2022-02-05 17:52:52,309 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2022-02-05 17:52:55,312 - octoprint.util.comm - INFO - No response from printer after 6 consecutive communication timeouts, considering it dead.
2022-02-05 17:52:55,338 - octoprint.util.comm - INFO - Changing monitoring state from "Printing" to "Offline after error"
2022-02-05 17:52:55,350 - octoprint.plugins.action_command_notification - INFO - Notifications cleared
2022-02-05 17:52:55,363 - octoprint.server.util.sockjs - INFO - Client connection closed: <ip>
2022-02-05 17:52:55,517 - octoprint.server.util.flask - INFO - Passively logging in user NMe from <ip>
2022-02-05 17:52:55,517 - octoprint.access.users - INFO - Logged in user: NMe
2022-02-05 17:52:56,406 - octoprint.server.util.flask - INFO - Passively logging in user NMe from <ip>
2022-02-05 17:52:56,407 - octoprint.access.users - INFO - Logged in user: NMe
2022-02-05 17:52:56,906 - octoprint.server.util.sockjs - INFO - New connection from client: <ip>
2022-02-05 17:52:57,029 - octoprint.server.util.flask - INFO - Passively logging in user NMe from <ip>
2022-02-05 17:52:57,030 - octoprint.access.users - INFO - Logged in user: NMe
2022-02-05 17:52:58,489 - octoprint.server.util.sockjs - INFO - User NMe logged in on the socket from client <ip>
2022-02-05 17:53:13,329 - octoprint.util.comm - INFO - Changing monitoring state from "Offline" to "Detecting serial connection"
2022-02-05 17:53:13,355 - octoprint.util.comm - INFO - Serial detection: Performing autodetection with 7 port/baudrate candidates: /dev/ttyUSB0@115200, /dev/ttyUSB0@250000, /dev/ttyUSB0@230400, /dev/ttyUSB0@57600, /dev/ttyUSB0@38400, /dev/ttyUSB0@19200, /dev/ttyUSB0@9600
2022-02-05 17:53:13,355 - octoprint.util.comm - INFO - Serial detection: Trying port /dev/ttyUSB0, baudrate 115200
2022-02-05 17:53:13,355 - octoprint.util.comm - INFO - Connecting to port /dev/ttyUSB0, baudrate 115200
2022-02-05 17:53:13,371 - octoprint.util.comm - INFO - Serial detection: Handshake attempt #1 with timeout 2.0s
2022-02-05 17:53:13,373 - octoprint.util.comm - INFO - M110 detected, setting current line number to 0
2022-02-05 17:53:14,331 - octoprint.util.comm - INFO - Changing monitoring state from "Detecting serial connection" to "Operational"
2022-02-05 17:53:14,372 - octoprint.util.comm - INFO - M110 detected, setting current line number to 0
2022-02-05 17:53:16,497 - octoprint.util.comm - INFO - Printer reports firmware name "Marlin Ver 1.70.0 BLZ- (Creality3D)"
2022-02-05 17:53:16,504 - octoprint.util.comm - INFO - Firmware states that it supports temperature autoreporting

At this point what used to be a fun hobby is turning into an increasingly frustrating can of worms. Octoprint is supposed to help make printing easier but right now all it's doing is causing frustration. Any help anyone could offer would be greatly appreciated. Or even just a way to keep testing if things are working now without having to waste filament.

Have you tried any firmware other than Creality? It is often problematic with lots of bugs across the different versions. You would not believe what we've seen from them.

The problem is with the communication timeouts. The printer is stopping responding to OctoPrint at some point and we don't know why. In this case it is an issue with the serial connection to the printer. So a serial.log might be useful, to see what lines were sent to the printer just before - but it's not enabled by default, as it can impact performance or get quite large.

Some reasons for communication timeouts I've seen include lines that are too long (happens sometimes with high precision or with arc commands), special characters such as percent signs (sometimes injected by plugins) or just generally the firmware crashing.
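
To rule out the first two causes without printing anything, you could scan the G-code file itself. This is only a sketch: the 96 character limit is an assumption (I believe it is Marlin's default command buffer size, but a vendor build may differ), and it assumes comments after a semicolon are not sent to the printer.

```python
import sys

MAX_LINE_LENGTH = 96  # assumption: adjust to whatever your firmware accepts


def find_suspect_lines(lines, max_length=MAX_LINE_LENGTH):
    """Yield (line_number, reason, command) for G-code lines that look risky."""
    for number, raw in enumerate(lines, start=1):
        # Drop the comment part; only the command itself matters here.
        command = raw.split(";", 1)[0].strip()
        if not command:
            continue
        if len(command) > max_length:
            yield number, "longer than %d characters" % max_length, command
        if "%" in command:
            yield number, "contains a percent sign", command
        if not command.isascii():
            yield number, "contains non-ASCII characters", command


if __name__ == "__main__":
    # Usage: python check_gcode.py path/to/file.gcode
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, reason, command in find_suspect_lines(handle):
            print("line %d: %s: %s" % (number, reason, command[:60]))
```

An empty result doesn't prove the file is fine, but a hit close to where the print stopped would be a strong hint.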

I haven't tried any other firmware, and honestly, with all the trouble it may cause if something goes wrong, I'm a bit reluctant to try. Not least because everything worked fine before I installed Octoprint on the new Pi 4; everything was just fine on the Pi 3 until that started overheating.

I'll enable the serial.log, though I'm not looking forward to wasting yet another multi-hour print just to get to the point where it will randomly crash anyway. Why isn't Octoprint at least logging unexpected responses to a file? I get that literally logging everything could cause performance issues but there is a whole lot of gray area between logging everything and logging nothing.

It's been the same print that has been failing, but it has never been consistently failing on the same layer. That should, at least in my mind, exclude anything related to the gcode as a possible cause.

Edit: I ordered some heat sinks for my Pi 3, let's see if the old Pi does better at the same print when I do something to prevent overheating.

It's not the response of a file, but like Charlie said your printer stops responding. It is logged starting at 2022-02-05 17:52:40,291 in your log snippet where all of a sudden your printer is no longer communicating. I would also recommend getting off of any Creality made firmware. If you're daunted by compiling stock Marlin there are a couple of precompiled versions that are acceptable and InsanityAutomation has made very good ones optimized to work with OctoPrint.
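
If it helps to pull those moments out of a long octoprint.log, something like this would list only the timeout and state-change lines from octoprint.util.comm. It's a sketch that assumes the log format shown in the snippets in this thread; the helper names are made up.

```python
import re
import sys
from datetime import datetime

# Matches lines like:
# 2022-02-05 17:52:40,291 - octoprint.util.comm - INFO - Communication timeout ...
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) - (\S+) - (\w+) - (.*)$"
)


def parse_line(line):
    """Return (timestamp, logger, level, message), or None if the line doesn't match."""
    match = LOG_LINE.match(line)
    if not match:
        return None
    stamp, millis, logger, level, message = match.groups()
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    when = when.replace(microsecond=int(millis) * 1000)
    return when, logger, level, message


def comm_events(lines):
    """Collect timeout and state-change messages logged by octoprint.util.comm."""
    events = []
    for line in lines:
        parsed = parse_line(line.rstrip("\n"))
        if parsed is None:
            continue
        when, logger, _level, message = parsed
        if logger != "octoprint.util.comm":
            continue
        if "timeout" in message.lower() or "Changing monitoring state" in message:
            events.append((when, message))
    return events


if __name__ == "__main__":
    # Usage: python comm_events.py path/to/octoprint.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for when, message in comm_events(handle):
            print(when.isoformat(sep=" "), message)
```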

It's not so much that I'm afraid of compiling firmware. I'm a software developer, I can probably figure that out. It's just that I'm on a CR-X Pro and the CR-X is a special piece of hardware that already isn't getting much love from Creality. If I brick it because they did something different than normal or whatever it's a lot of money down the drain. And as I said, I never had any issues with the firmware prior to upgrading to a Pi 4. That's not to say I won't try upgrading the firmware but I won't do it unless I have tried everything else that won't brick my 700 euro device.

On the bright side, I decided to leave the printer on without it actually printing in hopes of it losing its connection without me having to waste filament. And apparently it did:

2022-02-05 21:34:20,275 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2022-02-05 21:49:20,277 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2022-02-05 22:04:20,278 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2022-02-05 22:19:20,279 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2022-02-05 22:34:20,280 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2022-02-05 22:36:58,299 - octoprint.util.comm - INFO - No response from printer after 3 consecutive communication timeouts, considering it dead.
2022-02-05 22:36:58,315 - octoprint.util.comm - INFO - Changing monitoring state from "Operational" to "Offline after error"
2022-02-05 22:36:58,332 - octoprint.plugins.action_command_notification - INFO - Notifications cleared

serial.log leading up to that:

2022-02-05 22:36:08,250 - Recv:  == T:16.87 /0.00 == B:17.97 /0.00 @:0 B@:0
2022-02-05 22:36:10,249 - Recv:  == T:16.87 /0.00 == B:18.05 /0.00 @:0 B@:0
2022-02-05 22:36:12,249 - Recv:  == T:16.95 /0.00 == B:17.97 /0.00 @:0 B@:0
2022-02-05 22:36:14,249 - Recv:  == T:16.87 /0.00 == B:18.05 /0.00 @:0 B@:0
2022-02-05 22:36:16,248 - Recv:  == T:16.87 /0.00 == B:17.97 /0.00 @:0 B@:0
2022-02-05 22:36:18,248 - Recv:  == T:16.87 /0.00 == B:18.01 /0.00 @:0 B@:0
2022-02-05 22:36:20,248 - Recv:  == T:16.87 /0.00 == B:18.01 /0.00 @:0 B@:0
2022-02-05 22:36:22,247 - Recv:  == T:16.87 /0.00 == B:18.01 /0.00 @:0 B@:0
2022-02-05 22:36:24,247 - Recv:  == T:16.87 /0.00 == B:18.05 /0.00 @:0 B@:0
2022-02-05 22:36:58,300 - No response from printer after 3 consecutive communication timeouts, considering it dead. Configure long running commands or increase communication timeout if that happens regularly on specific commands or long moves.
2022-02-05 22:36:58,315 - Changing monitoring state from "Operational" to "Offline after error"
2022-02-05 22:36:58,326 - Connection closed, closing down monitor
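
The telling part of that serial.log is the silence: the temperature reports arrive every 2 seconds until 22:36:24, and then nothing for about 34 seconds until OctoPrint gives up. A small sketch like the one below can find such gaps in a larger serial.log. It assumes every line starts with the timestamp format shown above, and the 10 second threshold is an arbitrary choice.

```python
import sys
from datetime import datetime


def timestamp_of(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' stamp; None if there is none."""
    try:
        return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")
    except ValueError:
        return None


def find_gaps(lines, threshold_seconds=10.0):
    """Return (seconds, previous_line, next_line) for each silence above the threshold."""
    gaps = []
    previous = None
    for line in lines:
        line = line.rstrip("\n")
        when = timestamp_of(line)
        if when is None:
            continue  # continuation lines without a timestamp are skipped
        if previous is not None:
            delta = (when - previous[0]).total_seconds()
            if delta > threshold_seconds:
                gaps.append((delta, previous[1], line))
        previous = (when, line)
    return gaps


if __name__ == "__main__":
    # Usage: python serial_gaps.py path/to/serial.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for seconds, before, after in find_gaps(handle):
            print("%.1f s of silence" % seconds)
            print("  last: ", before[:80])
            print("  next: ", after[:80])
```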

I'll see what good I can do by swapping my Pi3 back in once I receive my heat sinks, and if that still has issues I'll replace the USB cable. If that also doesn't solve it I'll consider upgrading the firmware, assuming no one else has a less invasive idea.

I can't see a configuration example from Marlin for the CR-X at a quick glance, although there are some pre-built versions around - so maybe there's one somewhere I can't see.

The most annoying thing about this problem is that there seems to be no cause. The printer is just sitting there and then something gives up...

Yeah, that's why I changed one thing and am now trying the same thing I tried yesterday: I have Octoprint connected to the printer with the printer on, and I'm waiting to see if it will still eventually lose its connection now that I moved the USB cable to a different USB port to rule out that the port on the Pi is somehow dodgy. In the meantime I'll try to find a mini-USB cable that only transfers data so I don't have to tape a pin (or just order a new one and tape it anyway) so that I can rule out the cable too.

It gets worse, it's not a CR-X but a CR-X Pro, which seems to have barely sold at all so it's not getting much support. It's essentially a CR-X but with all the common upgrades pre-installed: silent board, metal extruders, BLTouch, etc.

Creality printers are very susceptible to EMI and, to make matters worse, are the source of substantial amounts of EMI. The USB cable used needs to be short, high quality, shielded, and have ferrite beads at both ends. Finding a cable that "only transfers data" with these qualities is going to be almost impossible, so taping over the 5V pin will be the easiest solution. There are short adapters available that don't pass the 5V, which can be used if properly shielded.

Cable routing (including power) can be critical. There are multiple topics in this forum about successful re-routing of cables both inside and outside the printer.

It looks like you have two problems: "stops printing" appears to be caused by the USB communications failing, and "drops network connection" is an RPi issue. Both could be caused by EMI, so I think adding shielding and grounding will be the best approach to solving both.

Thanks for your pointers. The thing that weirds me out most is that Octoprint always worked fine using the same cable, same firmware and obviously the same internal cable routing. It suddenly stopped working when the only variable I changed was installing Octoprint on a new Pi 4 to replace the Pi 3 I had been using successfully for over a year until it started overheating on longer prints.

After a week of downtime due to me being ill and unable to grab the packages I had delivered at the office I have now tried a new good quality USB cable and that still has the same disconnection issues. I also got some heat sinks for my old Pi 3 so I can try running that again. If the heatsinks fix the overheating issue and the connectivity issues I've been having with its replacement don't show up I'll finally be able to properly print again. If not...I'm gonna have to risk bricking my printer.

Edit: the old Pi has been running for 5 hours now and has been keeping its connection the entire time. So far that seems pretty promising but I'll leave it on overnight. I haven't hooked up my Raspicam though, so even if it's still running in the morning I'll have to test again with that last piece of the puzzle hooked up as well. Either way, it seems like either my Pi 4 is just busted or the Pi 4 in general doesn't want to play nice with Octoprint.

Hooked everything up with the old Pi again, including the camera. I'm currently 15 hours into a print with temps sitting around 65 degrees C now that it has heat sinks. I guess that whatever was wrong, was wrong either with my specific Pi 4 or with the Pi 4 in general...

Probably not the RPi 4 in general.

I've been running on the Pi 3 for a while now and sadly things are still pretty dodgy. Last weekend I once again had a print that just stopped because of a dropped serial connection after about 8 hours of printing.

I also noticed that my CPU temps go up by like 10 degrees C if I view the webcam stream, so that got me thinking and I found this thread. This bit in particular was interesting to me:

There is another caveat to using a Raspberry Pi as an OctoPrint host with mjpg-streamer, because it shares its USB bandwidth between all the USB peripherals and the Ethernet interface. When using mjpg-streamer at high resolutions, it is possible that the USB bandwidth available for the 3D printer serial interface is not sufficient for some prints. Again, the solution is to step down the video resolution.

I dropped the resolution for my raspicam from 1280x960 to 800x600 which seems to have reduced idle load from about 0.73 to about 0.53, though if I open up the webcam stream in a browser I still end up with a high load (considering the system is otherwise idle):

top - 13:33:07 up 8 min,  1 user,  load average: 1.20, 0.78, 0.42
Tasks: 119 total,   1 running, 118 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  5.6 sy,  0.0 ni, 93.2 id,  0.0 wa,  0.0 hi,  0.9 si,  0.0 st
MiB Mem :    871.7 total,    609.1 free,     98.8 used,    163.7 buff/cache
MiB Swap:    100.0 total,    100.0 free,      0.0 used.    715.4 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   89 root      20   0       0      0      0 D  11.9   0.0   0:06.32 kworker/u8:1+brcmf_wq/mmc1:0001:1

I am starting to feel like my issue is that the webcam is killing the communication with my printer (so hopefully the resolution drop was enough to fix that) and apart from that it seems like simply viewing the cam while printing is enough to trigger temperature warnings.

Maybe I should just move away from using a Pi entirely. Octoprint seems to have an absolutely massive idle load and it feels like the Pi just can't deal with both that and the camera. Is that a fair conclusion? I see lots of people saying they're using Pis and specifically Pi 3Bs so it feels a bit weird to me that I'm seemingly the only one with these issues.

Idle OctoPi 0.18.0, OctoPrint 1.8.0rc5, RPi3

... with a USB webcam, no web browser connected:

top - 14:46:48 up 1 day, 23:04,  1 user,  load average: 0.11, 0.04, 0.01
Tasks: 112 total,   1 running, 111 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.3 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.2 si,  0.0 st
MiB Mem :    871.7 total,    480.5 free,    110.4 used,    280.8 buff/cache
MiB Swap:    100.0 total,    100.0 free,      0.0 used.    695.1 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  503 root       0 -20       0      0      0 I   1.0   0.0  21:48.44 kworker/u9:1-uvcvideo
  317 pi        20   0  238528  70324   9644 S   0.7   7.9  14:55.67 octoprint

... with a USB webcam, web browser & camera feed open:

top - 14:52:42 up 1 day, 23:10,  1 user,  load average: 0.19, 0.06, 0.01
Tasks: 112 total,   1 running, 111 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.5 sy,  0.0 ni, 99.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :    871.7 total,    477.9 free,    112.8 used,    280.9 buff/cache
MiB Swap:    100.0 total,    100.0 free,      0.0 used.    692.6 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  317 pi        20   0  239424  71392   9644 S   1.0   8.0  15:03.75 octoprint
  503 root       0 -20       0      0      0 I   1.0   0.0  21:51.51 kworker/u9:1-uvcvideo

... with a RPi cam, no web browser connected:

top - 14:47:45 up 1 day, 23:49,  1 user,  load average: 0.86, 0.66, 0.65
Tasks: 112 total,   1 running, 111 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.2 sy,  0.0 ni, 99.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :    871.7 total,    447.1 free,    110.3 used,    314.3 buff/cache
MiB Swap:    100.0 total,    100.0 free,      0.0 used.    693.4 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 6159 root      20   0       0      0      0 I   1.3   0.0   0:00.78 kworker/1:1-events_power_efficient
 1145 pi        20   0  221808  70912   9660 S   1.0   7.9  15:53.78 octoprint

... with an RPi cam, web browser & camera feed open:

top - 14:54:10 up 1 day, 23:56,  1 user,  load average: 0.59, 0.63, 0.63
Tasks: 114 total,   1 running, 113 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.3 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :    871.7 total,    446.4 free,    110.8 used,    314.5 buff/cache
MiB Swap:    100.0 total,    100.0 free,      0.0 used.    692.8 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 5853 root      20   0       0      0      0 I   1.0   0.0   1:34.96 kworker/1:0-events_power_efficient
 1145 pi        20   0  221640  70656   9660 S   0.7   7.9  16:01.89 octoprint
 6268 root      20   0       0      0      0 I   0.7   0.0   0:00.54 kworker/u8:0-mmal-vchiq

Admittedly all at default resolution, but I wouldn't call a load of 0.11 while idle "massive", neither would I say that about 0.7% CPU usage.

What you see producing load in your top output when viewing the webcam isn't OctoPrint, it isn't your camera, and it isn't the webcam server accessing the camera on OctoPi and making it available for embedding in OctoPrint's UI. It is the kernel portion of your wifi driver. This load is the very reason why an RPi 0w isn't recommended for use with OctoPrint: unlike the 3b it only has one core, and that one core gets almost saturated by large data transfers over wifi. A simple nc or wget suffices for that; OctoPrint doesn't have to be in the mix.

Looking at your other stats, your Pi has plenty of resources left to drive a printer (a load of below 1 is not even fully utilizing 1 of its 4 cores).
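
To put the load numbers in perspective, you can divide them by the number of cores. A quick sketch (the function name is just for illustration):

```python
import os


def load_per_core(load_averages, cores):
    """Divide the 1, 5 and 15 minute load averages by the number of CPU cores."""
    return tuple(value / cores for value in load_averages)


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    one, five, fifteen = load_per_core(os.getloadavg(), cores)
    print("cores: %d" % cores)
    print("load per core: %.2f (1 min), %.2f (5 min), %.2f (15 min)"
          % (one, five, fifteen))
```

With the 1.20 from your top output and the 4 cores of a 3b, that works out to 0.30 per core. Anything well below 1.0 per core means the CPU itself is not the bottleneck.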

Neither would I, but that is not what I'm seeing. My load is at 1.2 while I actually have the camera stream open. I'm at 0.5-0.6 while idling with a browser tab open that shows Octoprint on its temperature page, so without the camera in view. And this is without any non-stock plugins. Judging from your numbers either 1.8.0 improved a lot or stock resolution (640x480 I assume?) instead of 800x600 is reducing loads by 80%, which I wouldn't think is very likely.

Closing the browser tab entirely does significantly lower the load to something between 0.4 and 0.6 but not quite as low as 0.1.

Apart from the CPU usage, which I can live with, is it possible that my serial issues above could be related to the resolution thing?

One thing I have noticed from a couple of people's log files is that the USB chip can reset due to overheating or overcurrent, or maybe some other reason.

My suggestion would be to check dmesg when you find the printer has disconnected, to see if it gives any reason for the disconnect (if it is a SerialException type).
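
If the dmesg output is long, a small filter can help. This is a sketch only: the keyword list is my guess at what is worth looking for (USB disconnects, the serial port, over-current and under-voltage messages), not an exhaustive or official list, and on some systems dmesg needs sudo.

```python
import subprocess

# Assumed keywords; extend the tuple if your kernel words things differently.
KEYWORDS = ("usb disconnect", "ttyusb", "over-current", "under-voltage")


def interesting_lines(dmesg_output, keywords=KEYWORDS):
    """Return the dmesg lines that mention any of the keywords (case-insensitive)."""
    return [
        line
        for line in dmesg_output.splitlines()
        if any(keyword in line.lower() for keyword in keywords)
    ]


if __name__ == "__main__":
    # -T prints human-readable timestamps so they can be matched to octoprint.log.
    output = subprocess.run(
        ["dmesg", "-T"], capture_output=True, text=True, check=True
    ).stdout
    for line in interesting_lines(output):
        print(line)
```

Compare the timestamps of any hits with the moment OctoPrint reports "Offline after error".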

Yeah, overheating definitely can be an issue, it was for me with the Pi 3 before I installed heat sinks on it. And with the camera at 1280x960 resolution it still happened from time to time even with the heat sinks. I've had a few times where a print failed because of overheating and if I remember correctly the UI actually said that was the reason.

That being said, I was watching temperatures when my last print failed and they were fine, somewhere in the 55-60 range on both CPU and GPU. This is why I suspect I'm somehow filling up the device's serial capabilities between the camera and the printer and it's disconnecting because of it, but I'm not sure because it doesn't say anywhere explicitly that this would be possible as expected behavior.

Also, as stated somewhere above, I've had the printer disconnect when I had it idling. I just turned on the printer, connected Octoprint to it and waited for it to lose its connection, which it then did after some hours. No sign of overheating, just random disconnects. But at the time I was still running the camera at a high resolution, so I'll have to try that again now that I more than halved the pixel count.

After I wrote my previous message I turned on the printer and connected it to Octoprint and left that running for 24 hours. Before I changed the resolution on the camera that would have disconnected at some random point, but it kept its connection. After those 24 hours I intentionally started a longer print to see if the connection with the printer would stay stable and it seems like it is. Currently it has been printing for almost 72 hours and the print will finish in another couple of hours. The printer has been connected to the Pi successfully for about 100 hours now, so I guess it's safe to say it was the serial connection being overloaded with a 1280x960@10fps stream in addition to whatever amount of data Octoprint has to send/receive to/from the printer.

So while the print is not done and it's still only one print, I guess this solves my issue. I'm kind of shocked that a Pi (and not even the Pi 4 apparently) isn't able to handle a simple webcam stream in addition to the obviously required serial communication for printing.

Would I have seen similar results if instead of the resolution I had cut the frame rate? I like making timelapses, and 1 FPS would be fine for that. Would that cause a 90% reduction in serial data transferred? Or is that just something that would be handled in software and still have just as much data transmitted?
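
For a rough feel of the numbers, here is the arithmetic on raw pixels per second. This is only a proxy: the stream is compressed, so actual bytes won't scale exactly with pixel count, and it assumes the frame rate stayed at 10 FPS after the resolution change and that a lower frame rate setting really makes the camera deliver fewer frames.

```python
def pixels_per_second(width, height, fps):
    """Raw pixel throughput of a video stream before any compression."""
    return width * height * fps


def reduction_percent(before, after):
    """How much smaller `after` is than `before`, as a percentage."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    original = pixels_per_second(1280, 960, 10)
    lower_resolution = pixels_per_second(800, 600, 10)
    lower_frame_rate = pixels_per_second(1280, 960, 1)

    print("800x600 at 10 FPS: %.1f%% fewer pixels per second"
          % reduction_percent(original, lower_resolution))
    print("1280x960 at 1 FPS: %.1f%% fewer pixels per second"
          % reduction_percent(original, lower_frame_rate))
```

So on paper the resolution drop removes about 61% of the raw pixels, and going from 10 FPS to 1 FPS at the original resolution would remove 90%.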