Thermal Runaway aborts (Snapmaker 2)

What is the problem?

Model prints fine using Snapmaker Touch Screen with file transferred over WiFi but fails if I try to print the same file using Octoprint connected to printer over USB. Also fails if I print from the touch screen but with Octoprint connected to monitor the temperature. Appears to be due to a BED_TEMP_RUNAWAY exception received from the controller, which in turn seems to be due to a droop in bed temperature two or three layers into the print. The droop seems to be real as it is visible on the touch screen when printing that way, so the touch screen must simply be ignoring it and continuing. Is there a way I can make Octoprint also ignore it?
Ideally, I'd just ignore under temperature, not over for safety.
I realise this is a problem with the printer (probably the PSU is not strong enough) and will eventually need to be addressed at source, but for the moment it is printing OK from the Touch Screen and I'd like it to do the same from Octoprint.

What did you already try to solve it?

Checked the terminal. Here are the relevant lines with the print commands removed:
Recv: T:200.60 /200.00 B:59.61 /65.00 T0:200.60 /200.00 T1:30.90 /0.00 @:0 B@:127 @0:0 @1:0
Recv: T:200.30 /200.00 B:59.58 /65.00 T0:200.30 /200.00 T1:30.90 /0.00 @:0 B@:127 @0:0 @1:0
Recv: T:199.50 /200.00 B:59.61 /65.00 T0:199.50 /200.00 T1:30.90 /0.00 @:127 B@:127 @0:127 @1:0
Recv: T:199.50 /200.00 B:59.61 /65.00 T0:199.50 /200.00 T1:30.90 /0.00 @:127 B@:127 @0:127 @1:0
Recv: Error:Thermal Runaway, system stopped! Heater_ID: bed
Recv: Not handle exception: BED TEMP RUNAWAY
Changing monitoring state from "Printing" to "Error"
Send: M112
Send: N3226 M112*20
Send: N3227 M104 T0 S0*21
Send: N3228 M104 T1 S0*27
Send: N3229 M140 S0*95
Changing monitoring state from "Error" to "Offline after error"
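To put a number on the droop in the log above, the temperature report lines can be parsed with a short script. This is only a sketch: `parse_temps` is a made-up helper name and the regular expression is an assumption based on the line format shown in this log, not something taken from OctoPrint.

```python
import re

# Matches "name:actual /target" pairs such as "B:59.61 /65.00".
# Power fields like "@:127" and "B@:127" have no "/target" part and are skipped.
PAIR = re.compile(r"\b(T\d*|B):(-?\d+(?:\.\d+)?)\s*/(-?\d+(?:\.\d+)?)")

def parse_temps(line):
    """Return {sensor: (actual, target)} for one temperature report line."""
    return {name: (float(actual), float(target))
            for name, actual, target in PAIR.findall(line)}

line = ("Recv: T:199.50 /200.00 B:59.61 /65.00 T0:199.50 /200.00 "
        "T1:30.90 /0.00 @:127 B@:127 @0:127 @1:0")
temps = parse_temps(line)
bed_actual, bed_target = temps["B"]
bed_deficit = round(bed_target - bed_actual, 2)
print(f"bed is {bed_deficit} degrees below target")
```

For the last report before the error, the bed is a little over 5 degrees below its 65 degree target while the hotend is within about half a degree of 200.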

Have you tried running in safe mode?

No

Did running in safe mode solve the problem?

N/A

Systeminfo Bundle

octoprint-systeminfo-20231213122345.zip (13.0 KB)
browser.user_agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36
connectivity.connection_check: 1.1.1.1:53
connectivity.connection_ok: True
connectivity.enabled: True
connectivity.online: True
connectivity.resolution_check: octoprint.org
connectivity.resolution_ok: True
env.hardware.cores: 4
env.hardware.freq: 1800.0
env.hardware.ram: 3976409088
env.os.bits: 64
env.os.id: linux
env.os.platform: linux
env.python.pip: 23.0.1
env.python.version: 3.10.13
env.python.virtualenv: False
octoprint.last_safe_mode.date: unknown
octoprint.last_safe_mode.reason: unknown
octoprint.safe_mode: False
octoprint.version: 1.9.3
systeminfo.generated: 2023-12-13T12:23:45Z
systeminfo.generator: zipapi

Additional information about your setup

Octoprint is running in a Docker container on a Raspberry Pi 4.

Thermal runaway usually is a hardware issue with the printer.

Check for proper installation of thermistor and heater.
Has the PSU enough power for the bed heater?
Did you run a PID tune for the bed?
Is the part cooling fan cooling the bed too much?

Please attach the systeminfo bundle to your next post.

OK added the bundle file. The instructions did not make it easy to find out how to do that so I previously extracted what I thought was the important part as text.

I realise I have a problem with the printer which I have investigated quite extensively. It appears I will have to make some wiring changes to use a second PSU to power the bed. It is a bang-bang servo not a PID and as far as I can tell there is nothing to tune. I think it is just shedding load once the print actually starts to protect the PSU. The dip is actually quite short lived and it recovers a little later without dipping any lower. I think what happened is Snapmaker launched a dual extruder head and this overloaded the PSU which they bodged by ignoring the under temperature alarm in the Touch Screen app rather than widening the threshold in the firmware.

Surely there is some way to make Octoprint ignore this alarm also?

No, it's a printer safety mechanism. Turning it off would be highly questionable and could lead to a fire.
And even if there was a way for OctoPrint to disable it, we wouldn't allow it for the same reason.

Thermal protection is one of the most vital safety features in Marlin, allowing the firmware to catch a bad situation and shut down heaters before it goes too far. Consider what happens when a thermistor comes loose during printing. The firmware sees a low temperature reading so it keeps the heat on. As long as the temperature reading is low, the hotend will continue to heat up indefinitely, leading to smoke, oozing, a ruined print, and possibly even fire.

Marlin offers two levels of thermal protection:

  1. Check that the temperature is actually increasing when a heater is on. If the temperature fails to rise enough within a certain time period (by default, 2 degrees in 20 seconds), the machine will shut down with a “Heating failed” error. This will detect a disconnected, loose, or misconfigured thermistor, or a disconnected heater.
  2. Monitor thermal stability. If the measured temperature drifts too far from the target temperature for too long, the machine will shut down with a “Thermal runaway” error. This error may indicate poor contact between thermistor and hot end, poor PID tuning, or a cold environment.
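The two checks described above can be modelled roughly as follows. This is a simplified Python sketch of the description, not Marlin's actual code (which is C++ and differs in detail). The function names are hypothetical, and the `hysteresis` and `period` values used in the example are assumptions; only the "2 degrees in 20 seconds" default comes from the text.

```python
def heating_failed(samples, watch_increase=2.0, watch_period=20.0):
    """Level 1: True if the reading fails to rise by watch_increase
    degrees within watch_period seconds while the heater is on.
    samples is a list of (time_s, temp_c) in ascending time order."""
    start_t, start_temp = samples[0]
    for t, temp in samples[1:]:
        if temp >= start_temp + watch_increase:
            start_t, start_temp = t, temp   # progress made, restart the window
        elif t - start_t >= watch_period:
            return True                     # window expired without enough rise
    return False

def thermal_runaway(samples, target, hysteresis, period):
    """Level 2: True if the reading stays more than hysteresis degrees
    away from target for at least period seconds."""
    out_since = None
    for t, temp in samples:
        if abs(temp - target) > hysteresis:
            if out_since is None:
                out_since = t               # first reading outside the band
            elif t - out_since >= period:
                return True
        else:
            out_since = None                # back inside the band, reset
    return False

# Bed stuck near 59.6 with a 65 degree target (hysteresis and period are ASSUMED).
stuck = [(t, 59.6) for t in range(0, 30, 5)]
print(thermal_runaway(stuck, target=65.0, hysteresis=2.0, period=20.0))

# A short dip that recovers before the period expires does not trip the check.
dip = [(0, 65.0), (5, 59.6), (10, 59.6), (15, 64.0)]
print(thermal_runaway(dip, target=65.0, hysteresis=2.0, period=20.0))
```

In this model the outcome depends on how long the reading stays outside the band, so a brief dip passes while a sustained one trips the error.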

If it's really psu power related you should switch to a better and higher rated psu.
Imo it's not wise to run the psu at 100% or more all the time. Keep some headroom for spikes and similar things.
You could test it for example with something like a kill-a-watt power consumption meter.
It of course doesn't show you how much the psu is outputting on the DC side, but you know if you're in the ballpark of the output limit.


I understand, but under temperature is not a hazard (nor for that matter is it 'thermal runaway'). In any case this sort of protection ought to be in the firmware, not the UI. Maybe there is a G Code I can send in the preamble to set the thermal limits so the error message never gets sent? I only asked this because the Snapmaker's own UI does not respond to this temperature dip in the way that Octoprint does.

I accept this is definitely a Snapmaker fault: The PSU is only rated at 320W, 24V and the bed has a resistance of 2.6 Ohms, so 222W on its own. Hardly seems sufficient, less than 100W for everything else. I'll have to find a way to fix that, but the PSU is a specific Snapmaker unit and it seems to have more complexity than just a simple pair of wires carrying 24V!

Let's just say 230W bed, 45W hotend, 10W for mainboard and fans. There should still be some headroom and you're "just" at 90% load.
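The arithmetic behind the figures quoted above can be checked in a few lines. This sketch only uses numbers from the posts (24 V, 2.6 Ohms, 320 W, and the rough 230/45/10 W estimates); the variable names are made up, and the estimate covers only the loads listed.

```python
SUPPLY_VOLTAGE = 24.0   # V, PSU output voltage from the post
BED_RESISTANCE = 2.6    # Ohms, measured across the bed heater connections
PSU_RATING = 320.0      # W, PSU rating from the post

# Power drawn by a resistive heater: P = V^2 / R
bed_power = SUPPLY_VOLTAGE ** 2 / BED_RESISTANCE
remaining = PSU_RATING - bed_power

# Rough per-component estimates from the reply
loads = {"bed": 230.0, "hotend": 45.0, "mainboard and fans": 10.0}
total = sum(loads.values())
load_fraction = total / PSU_RATING

print(f"bed: {bed_power:.1f} W, left for everything else: {remaining:.1f} W")
# prints "bed: 221.5 W, left for everything else: 98.5 W"
print(f"estimated load: {total:.0f} W = {load_fraction:.0%} of the PSU rating")
# prints "estimated load: 285 W = 89% of the PSU rating"
```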

Where did you measure the resistance of the bed?
Try to measure it right at the end of the power cables if you haven't done that before.
Also check all the wiring and connectors.
Maybe there's a broken wire or a loose connection.

Nope those settings are hardcoded into the firmware. You can only change them when you build a new firmware.
The default setting for most firmwares is ±2°C. You had about 5°C difference to the set temperature.

Thanks, yes I measured 2.6 Ohms directly across the high current connections on the bed heater plate. There are two more low current wires which seem to go to the temperature sensor. I do not think there is a wiring problem because the bed heats up fine and stabilises at the demanded temperature prior to printing (albeit a little slowly). The dip only starts after the motors start running. Could be a problem with the extruder head fans cooling the bed I suppose, since once the head gets a few layers further up, the temperature recovers. There are complaints on the Snapmaker forum about the PSU. I'll do some more work and take the issue up on that forum. Thanks for your help!

riiiiiight I forgot the steppers in my calculation :man_facepalming:

Yeah then it's even closer to 100%

Yeah, that's definitely possible.
Try a test print with just 20% max fan speed and see if it changes anything
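If the slicer has no setting for this, one way to run that test is to cap the fan commands in the G-code file before printing. A minimal sketch, assuming the file sets the fan with Marlin-style `M106 S<0-255>` commands; `cap_fan_speed` is a hypothetical helper, and whether the Snapmaker firmware treats `M106` exactly this way has not been checked here.

```python
import re

FAN_MAX_PWM = 255  # M106 takes S values from 0 to 255 in Marlin-style G-code

# Captures everything up to the S parameter of an M106 line, then its value.
M106_S = re.compile(r"^(M106\b[^;]*?\bS)(\d+(?:\.\d+)?)", re.IGNORECASE)

def cap_fan_speed(lines, max_fraction=0.20):
    """Return a copy of lines with every M106 S value limited to
    max_fraction of full speed. Other lines pass through unchanged."""
    cap = round(FAN_MAX_PWM * max_fraction)   # 20% of 255 is 51

    def clamp(match):
        value = min(float(match.group(2)), cap)
        return f"{match.group(1)}{value:g}"

    return [M106_S.sub(clamp, line) for line in lines]

gcode = ["M106 S255 ; fan on", "M106 P0 S30", "G1 X10 Y10", "M107"]
for line in cap_fan_speed(gcode):
    print(line)
```

Lines that already request less than the cap are left alone, so only the full-speed fan commands change.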

good luck :slight_smile: :tentacle: