Failure with Thermal Runaway bed at EXACTLY the halfway point

William_Webb · May 11, 2019, 3:46pm

What is the problem?

Print stops at thermal runaway: bed at exactly the same spot in the model at exactly 50% complete

What did you already try to solve it?

Tried the print twice. It stopped with a bed thermal runaway at precisely the same spot both times. I suspect Octoprint because only Octoprint knows where the 50% point on Z is.

Additional information about your setup

Octopi Version 1.3.11rc3
Anet A8 running Marlin 2.0-bugfix

Is this something Octoprint could be causing or am I barking up the wrong tree? I looked at the g-code compiled by Cura 4.0 and I did not see anything scary at the failure point. There were no temperature commands whatsoever.

OutsourcedGuru · May 11, 2019, 4:06pm

Why wouldn't you think that this is a firmware problem? Read some of the text from some of the issue threads on Marlin...

And honestly, wouldn't providing some of that log help us to also see what you're seeing?

William_Webb · May 11, 2019, 4:49pm

The reason that I don't think that it is a firmware problem is that the error occurred at precisely the 50% point in the print, out of 45,000 or so lines. The firmware has no idea where the 50% point is. I am putting a lot of significance on the 50% thing.

Now about a log file... yea, I would like to see that too. I was not aware a log file was created. Never looked. There it is. Thoughts?

octoprint.log (57.8 KB)

OutsourcedGuru · May 11, 2019, 5:20pm

2019-05-10 21:04:31,981 OctoPrint starts
2019-05-10 22:35:23,787 Thermal runaway halt from firmware

So the difference in time is roughly 1h 31min.
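For anyone who wants the exact figure, here is a minimal Python sketch that subtracts the two timestamps quoted above. It assumes the OctoPrint log format shown (date, time, comma, milliseconds); the variable names are only illustrative.

```python
from datetime import datetime

# Timestamp layout as it appears in the two log lines quoted above.
FMT = "%Y-%m-%d %H:%M:%S,%f"

start = datetime.strptime("2019-05-10 21:04:31,981", FMT)
halt = datetime.strptime("2019-05-10 22:35:23,787", FMT)

# Subtracting two datetimes gives a timedelta.
elapsed = halt - start
print(elapsed)  # 1:30:51.806000
```

That is just under 1h 31min between OctoPrint starting and the firmware halt, which is the number to compare against a second failed run if you want to test for a time correlation.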

| Last lines in terminal:
| Send: N233105 G1 F1800 X123.704 Y75.196 E230.95928*10
| Recv: ok
| Send: N233106 G0 F7200 X123.704 Y74.949*74
| Recv: ok
| Send: N233107 G1 F1800 X123.452 Y74.697 E230.96125*9
| Recv: ok
| Send: N233108 G0 F7200 X122.996 Y74.984*65
| Recv: ok
| Send: N233109 G1 F1800 X123.315 Y75.302 E230.96375*12
| Recv: ok
| Send: N233110 G0 F7200 X123.067 Y75.302*75
| Recv: ok
| Send: N233111 G1 F1800 X122.996 Y75.231 E230.9643*50
| Recv:  T:217.08 /217.00 B:57.87 /60.00 @:63 B@:127      # Temperature-related
| Recv: ok
| Send: N233112 G1 F1500 E224.4643*0
| Recv: ok
| Send: N233113 G0 F7200 X122.996 Y76.665*73
| Recv: Error:Thermal Runaway, system stopped! Heater_ID: bed
| Recv: Error:Printer halted. kill() called!

So... before your firmware dumps, you see this back from it:

T:217.08 /217.00 B:57.87 /60.00 @:63 B@:127

I've marked the bed-related information in that. It seems like 60C was set, the bed reported 57.87C and that 127C should be the maximum temperature. I can't find any documentation which describes the @:63 there but it's not high enough to be hotend-related so it's pertinent, one would have to assume. It's probable that those two @-related values are just configured values from your compilation of Marlin or something fetched from EEPROM.

Monitor thermal stability. If the measured temperature drifts too far from the target temperature for too long, the machine will shut down with a “Thermal runaway” error. This error may indicate poor contact between thermistor and hot end, poor PID tuning, or a cold environment.

For false thermal runaways not caused by a loose temperature sensor, try increasing WATCH_TEMP_PERIOD or decreasing WATCH_TEMP_INCREASE. Heating may be slowed in a cold environment, if a fan is blowing on the thermistor, or if the heater has high resistance.
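To make the "too far for too long" idea concrete, here is a rough Python sketch of that kind of check. It is a simplified model and not Marlin's actual code: the parameters `hysteresis` and `period` and their values are hypothetical, the sample readings are made up, and it only looks at the temperature falling below the target.

```python
def first_runaway_time(samples, target, hysteresis=2.0, period=20.0):
    """Return the time at which this simplified check would trip, or None.

    samples: (time_seconds, measured_temp) pairs in chronological order.
    The check trips once the reading has stayed more than `hysteresis`
    degrees below `target` for longer than `period` seconds.
    """
    out_of_band_since = None
    for t, temp in samples:
        if temp < target - hysteresis:
            if out_of_band_since is None:
                out_of_band_since = t  # start timing the excursion
            elif t - out_of_band_since > period:
                return t  # too far from target for too long
        else:
            out_of_band_since = None  # back in band, reset the timer
    return None


# Hypothetical bed readings: one sags and never recovers, one dips briefly.
sagging = [(0, 60.0), (5, 59.0), (10, 57.9), (20, 57.8), (30, 57.9), (40, 57.7)]
brief_dip = [(0, 60.0), (10, 57.9), (20, 59.5), (30, 60.1)]

print(first_runaway_time(sagging, target=60.0))    # 40
print(first_runaway_time(brief_dip, target=60.0))  # None
```

The point of the sketch is that a short dip does not trip the check, while a bed that sits a couple of degrees low and cannot climb back does, even with the heater fully on.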

#define BED_MAXTEMP 127

At this point, I would wonder what's going on in printing the part at this magic point. Is there an abrupt movement which could make the thermistor wiring disconnect in a repeatable way?

I found this interesting, read it all the way through.

William_Webb · May 11, 2019, 6:12pm

Interesting post. Other than being at the 50% point in the print, there is nothing unusual happening. Perhaps this is a red herring, although that would be a remarkable coincidence. I started a new print compiled with Cura 3.4 instead of 4.0. This seems like a long shot considering the code looked just fine around the failure. This is my largest print and only the second since switching.

If I have to recompile Marlin to troubleshoot this, so be it. However, I am not controlling the bed temp with a PID loop. I am using "bang-bang", I think it is called.

OutsourcedGuru · May 11, 2019, 6:26pm

Maybe switching to PID for the bed would make it more responsive...?

William_Webb · May 11, 2019, 6:30pm

Yea, I hoped to do that on the last compile, but I do not have enough memory on this puny stock A8 controller. What I can try, which coincidentally is easier, is putting a piece of cork sheet under the bed as insulation. Unrelated to this sort of problem, people report general goodness from this cheap modification.

foosel · May 13, 2019, 8:10am

Just for the record, a thermal runaway error is entirely the firmware's responsibility. I wouldn't even know how to intentionally trigger one via the serial connection. The fact that it happens at 50% is something I'd deem coincidental here. I'd rather look at the 50% mark in the GCODE file and find out what is happening just before in the print that might be causing temperature fluctuations. Or I'd look if there's a correlation with time instead of percentage.
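A quick way to look at the G-code around a given percentage is sketched below in Python. It assumes the percentage roughly tracks the position within the file, which has not been verified for this setup; the function name and the sample data are made up for illustration.

```python
def lines_around_fraction(gcode_text, fraction, context=3):
    """Return (line_number, line) pairs around the given fraction of the text.

    Position is counted in characters, which equals bytes for plain ASCII G-code.
    """
    lines = gcode_text.splitlines(keepends=True)
    cutoff = fraction * len(gcode_text)
    consumed = 0
    hit = len(lines) - 1
    for index, line in enumerate(lines):
        consumed += len(line)
        if consumed >= cutoff:
            hit = index  # this line contains the requested position
            break
    first = max(0, hit - context)
    last = min(len(lines), hit + context + 1)
    return [(n + 1, lines[n].rstrip("\n")) for n in range(first, last)]


# Tiny made-up file: ten moves, one per line.
sample = "\n".join(f"G1 X{i} Y{i}" for i in range(10)) + "\n"
nearby = lines_around_fraction(sample, 0.5, context=1)
for number, line in nearby:
    print(number, line)
# 4 G1 X3 Y3
# 5 G1 X4 Y4
# 6 G1 X5 Y5
```

For a real file you would read the sliced G-code into a string first and raise `context` to a few hundred lines, then look for fan changes, long travel moves or a sudden change in the kind of moves just before the halt.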

AFAIK those @ signify the power currently being applied to the heaters in question (in 7 bits, so 127 is "full on power") and have nothing to do at all with max temperatures.
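Read that way, the report line from the terminal can be pulled apart with a small Python sketch like the one below. The 0 to 127 power scale is taken from the description above and not checked against the firmware source, the field names are invented for illustration, and the pattern only covers the single-hotend format seen in this log.

```python
import re

# Matches reports shaped like: T:217.08 /217.00 B:57.87 /60.00 @:63 B@:127
REPORT = re.compile(
    r"T:(?P<tool>-?[\d.]+) */(?P<tool_target>-?[\d.]+)"
    r" +B:(?P<bed>-?[\d.]+) */(?P<bed_target>-?[\d.]+)"
    r" +@:(?P<tool_power>\d+) +B@:(?P<bed_power>\d+)"
)


def parse_report(line):
    """Return the temperatures and heater power from a report line, or None."""
    match = REPORT.search(line)
    if match is None:
        return None
    values = {key: float(text) for key, text in match.groupdict().items()}
    # Assumption: power is reported on a 7-bit scale, so 127 means fully on.
    values["tool_duty"] = values["tool_power"] / 127
    values["bed_duty"] = values["bed_power"] / 127
    return values


report = parse_report("Recv:  T:217.08 /217.00 B:57.87 /60.00 @:63 B@:127")
print(report["bed"], report["bed_target"], report["bed_duty"])  # 57.87 60.0 1.0
```

So in the last report before the halt, the bed was about 2C under target with its heater fully on, while the hotend was holding temperature at roughly half power.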

OutsourcedGuru · May 13, 2019, 3:17pm

Okay then, what it's trying to say is that the bed's heating is full on (bang-bang mode) and that it's possibly not returning to 60C fast enough for the firmware. Again, try switching to PID for the bed.

William_Webb · May 13, 2019, 3:18pm

Thanks for the reply. After further thought I concur that OctoPI could not be the culprit. I have a theory.

In another forum, another hobbyist posted a nearly identical problem. Someone replied to his post with a possible explanation. Due to some glitch or design "feature", the slicer creates a ridiculously intricate and complex series of motions. This causes the steppers to draw a lot of power to keep up. The weak A8 power supply cannot handle the combined demand of the bed and the steppers, and the bed gets too cold or overshoots, whatever. This creates the failure. Interesting. However unlikely that sounds, it fits all the facts. Failure at the same point, check. Dependence on a particular slicer, check. Thermal failure when everything was going fine, check. Unrelated changes could make the problem go away, check. Now, how to test for it.

I think we can consider this issue closed (on this forum anyway).

OutsourcedGuru · May 13, 2019, 3:19pm

To be honest, my Robo C2 came with an underpowered 19V power adapter which I then replaced with a beefy 24V brick. The heatup times are much nicer now; it's like it's a whole new printer.

And don't forget the paragraph above "for false thermal runaways"...

Spyder · May 14, 2019, 5:07am

I have 2 of those A8 beasties, and one of the first things I did was to swap out those crappy power supplies and replace them with a 450W ATX supply.

It'll give you plenty of power, plus add a few bonuses, like a good clean 5V for the Pi, and an easy way to add a relay for the enclosure plugin for power on/off switching