Octopi Freezes on RP 3B+


#1

This is my first post so please forgive me if I haven't done this right.

What is the problem?

The network goes down, all printing stops, only a reboot brings it back to life. I cannot ping or SSH to the PI. This has happened on wifi and ethernet. It seems to happen late in the print.

What did you already try to solve it?

I checked the /var/log/messages file and there does not seem to be anything in it that indicates a lockup of the OS, it just freezes.

I thought perhaps it was a problem with the switch and it was maybe just a network interface issue but I've switched from wired to wifi (yes you read that right) and it makes no difference.

Additional information about your setup (OctoPrint version, OctoPi version, printer, firmware, octoprint.log, serial.log or output on terminal tab, ...)

Right now it is in the state that it's locked up. I do have a long hdmi cable that I can hook up to it, and a wireless mouse and keyboard. It's right near the end of a print job so I don't want to touch it in case I can save the job.

I did enable serial debugging after the last crash. I don't think it's a USB communication issue. I think this may help me locate the spot in the print that I can resume from?

It seems to be happening after my upgrade to 1.3.8 but I don't have the latest octopi OS on it.

Any suggestions will be very appreciated.


#2

Not sure, but I think the Raspberry Pi 3B+ needs OctoPi 0.15.x to be happy.



#3

Total newb here, but is there a way to resume my job after I reboot it?

It's printing on a CR-10s which supports pause/resume


#4

In the serial logs it's just chugging away. In /var/log/messages I see that it reboots hard, but no real indication why, and the octoprint log shows a bunch of these messages:

2018-05-03 11:08:23,696 - octoprint.plugins.anywhere - WARNING - Not connected to server ws or connection lost
2018-05-03 11:08:23,698 - backoff - ERROR - Backing off forward_ws(...) for 0.0s (error: [Errno 0] Error)
2018-05-03 11:08:23,744 - octoprint.plugins.anywhere - WARNING - Not connected to server ws or connection lost
2018-05-03 11:08:23,746 - backoff - ERROR - Backing off forward_ws(...) for 0.5s (error: [Errno 0] Error)
2018-05-03 11:08:24,230 - octoprint.plugins.anywhere - WARNING - Not connected to server ws or connection lost
2018-05-03 11:08:24,233 - backoff - ERROR - Backing off forward_ws(...) for 2.8s (error: [Errno 0] Error)
2018-05-03 11:08:27,027 - octoprint.plugins.anywhere - WARNING - Not connected to server ws or connection lost
2018-05-03 11:08:27,029 - backoff - ERROR - Backing off forward_ws(...) for 0.3s (error: [Errno 0] Error)
2018-05-03 11:08:27,367 - octoprint.plugins.anywhere - WARNING - Not connected to server ws or connection lost
2018-05-03 11:08:27,368 - backoff - ERROR - Backing off forward_ws(...) for 4.3s (error: [Errno 0] Error)
2018-05-03 11:08:31,664 - octoprint.plugins.anywhere - WARNING - Not connected to server ws or connection lost
2018-05-03 11:08:31,665 - backoff - ERROR - Backing off forward_ws(...) for 9.6s (error: [Errno 0] Error)
2018-05-03 11:08:41,314 - octoprint.plugins.anywhere - WARNING - Not connected to server ws or connection lost
2018-05-03 11:08:41,316 - backoff - ERROR - Giving up forward_ws(...) after 8 tries (error: [Errno 0] Error)


#5

It is 0.15.0

Linux octopi 4.14.30-v7+ #1102 SMP Mon Mar 26 16:45:49 BST 2018 armv7l

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Thu May 3 16:09:30 2018 from 192.168.0.35


Access OctoPrint from a web browser on your network by navigating to any of:

http://octopi.local
http://192.168.0.36

https is also available, with a self-signed certificate.

OctoPrint version : 1.3.8
OctoPi version : 0.15.0


#6

You're right about that...

From the website...
" OctoPrint version 1.3.8, 0.15 also contains the following changes:
Raspberrypi 3B+ support"


#7

That would be a cool feature but OctoPrint does not support this. There's a queueing mechanism in the RAMPS board itself and once OctoPrint has sent off several GCODE commands to run, it has no knowledge of how many have been actually done yet. At best, restarting after power loss would be guesswork.


#8

Okay, if you have 0.15.0... not sure which of the users here takes care of the OctoPi image but I'm guessing that @foosel would know.


#9

That would be @guysoft


#10

I have had some success with restarting a print after a glitch. The printer needs to remain powered on so that it does not lose its internal position information. You need to determine what layer is currently being printed either by LCD readout or measuring.

You need to edit the gcode file to remove everything before the current layer and then add gcode (G92) to get things synched up again. Resuming in the middle of a layer can be done, but it adds complexity.

Send that edited gcode file to the printer and monitor closely. There will most certainly be a flaw in the print at this point. If the print froze while doing infill, then the flaw may not be noticeable. If you get the layer off by one, it usually shows but may be acceptable.

This process is not for the faint of heart. Understanding gcode is a must have skill. I only attempt this if I'm multiple hours into a long print.
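
A minimal sketch of the "remove everything before the current layer" step, in Python. It assumes the slicer marks each layer with a comment line of the form `;LAYER:n` (Cura-style); other slicers use different markers, so the `marker` parameter and the function name `split_at_layer` are illustrative only. It does not add the G92 line or any other resume preamble.

```python
def split_at_layer(lines, layer, marker=";LAYER:"):
    """Return the gcode lines from the given layer marker to the end.

    Assumes each layer starts with a comment line such as ';LAYER:12'.
    """
    target = f"{marker}{layer}"
    for index, line in enumerate(lines):
        # Exact match, so layer 1 does not also match ';LAYER:10'.
        if line.strip() == target:
            return lines[index:]
    raise ValueError(f"layer marker {target!r} not found")


# Tiny made-up file: start gcode, then three layers.
sample = [
    "G28",
    ";LAYER:0",
    "G1 X10 Y10 E1.0",
    ";LAYER:1",
    "G1 X20 Y10 E2.0",
    ";LAYER:2",
    "G1 X30 Y10 E3.0",
]
remaining = split_at_layer(sample, 1)
```

With a real file you would read it with `open(path).read().splitlines()`, split, and write the remaining lines to a new file rather than overwriting the original.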


#11

I used to do similar things with postscript code years ago on huge print jobs that would fail halfway through. I had to move on and bypass octoprint to get these jobs out but once I get them done, I will set up octoprint again and see if a job fails. If it does, I'll give this a go.


#12

I wrote GetToDahChoppa which does this, splitting a Cura-sliced GCODE file at a certain layer. It's written in Go and open-source if it's useful. It assumes an earlier version of Cura, however. In theory, compiling it on Raspbian would allow you to chop in place in the ~/.octoprint/data/uploads folder itself.

Not sure about the G92. What if the next seen extrusion command then spits out a lot of filament? What's weird is that I've printed perhaps 200 of these (see photo) using this technique and I don't remember inserting any extra G92 commands (although it could have been in my startup GCODE and I didn't know it).

When restarting a job it's crucial to NOT run a G29 or similar autoleveling routine which is often in the startup script either for the sliced job or for OctoPrint. For this reason, I wrote this to ignore the command. The autolevel will either 1) get stupid when it sees plastic under the IR sensor or 2) blindly crash into your parts and knock them off the print bed or off your raft. In other words, if you're restarting a second job, you might need to write a little G0/G1 code to bring the hotend gracefully in and over a collection of towers that you want to complete. Otherwise, it may just make a bee-line straight through your parts since that's usually the shortest path between two points, in toolpath terms.
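
As a rough illustration of ignoring the autolevel command, here is a Python sketch that comments out any G29 line in a resume file. This is not how GetToDahChoppa does it; the function name `neutralize` and the `blocked` parameter are made up for the example, and it assumes the command word is the first thing on the line (no `N` line numbers in front).

```python
def neutralize(lines, blocked=("G29",)):
    """Comment out lines whose command word is in `blocked`."""
    result = []
    for line in lines:
        # Drop any trailing ';' comment, then look at the first word.
        words = line.split(";", 1)[0].split()
        if words and words[0].upper() in blocked:
            result.append("; skipped on resume: " + line)
        else:
            result.append(line)
    return result


# Made-up start script for illustration.
start_script = ["G28", "G29 ; autolevel", "G1 X0 Y0", "; G29 note"]
safe_script = neutralize(start_script)
```

Pass `blocked=("G28", "G29")` if homing should be skipped as well; whether that is safe depends on the printer and where the part sits on the bed.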

[Photo: Scrabble]


#13

I'm not familiar with the language "Go". Is it widely available on a variety of platforms? Is it available on Raspbian?

G92 would be used to set Marlin's E value to something close to the E value of the remaining fragment of gcode. This prevents the next seen E value from spitting out lots of filament. If I'm restarting a print job in the middle, I may manually extrude some filament to make sure there's fresh filament in the nozzle and it hasn't been "cooked". Adding a "G92 E" to the remaining gcode makes sure we start printing correctly.

My G28 and G29 commands are in the start gcode which I am assuming are not part of the remaining gcode. We are talking here about resuming printing after a failure, very much a manual process.


#14

I can answer one of my own questions... Is Go available on Raspbian? Yes, see https://golang.org/doc/install?download=go1.10.2.linux-armv6l.tar.gz

(I think that also answers "Is it widely available..." :smile: )


#15

Go is like the cool/new open-sourced C compiler, if you will. To me, it's still like magic because Go is written almost entirely in Go itself. It has a rich package landscape for adding functionality. A 200-line long program compiles to a 2MB executable, which isn't too bad these days.

So you're saying that G92 E (without a number component) at the beginning of file-two-of-two would work out. Alright, I'll try that.

I fully intend to create an automated process later on. One part of all this is that I'll need to write a watchdog plugin that will record and serialize-to-file the last line number sent to the RAMPS board. It will also create a flag file that can be used upon startup to recognize that a job didn't finish and to offer a resuming of same.
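
A bare-bones sketch of the record-and-flag idea in plain Python, with no OctoPrint plugin API involved. All names here (`record_progress`, `finish_job`, `unfinished_job`, the JSON layout) are hypothetical, and the one progress file doubles as the flag file: if it still exists at startup, the job did not finish. Keep in mind that the last line sent is not necessarily the last line executed, because the firmware buffers commands.

```python
import json
import os
import tempfile


def record_progress(path, job, line_number):
    """Atomically write the job name and last line number sent."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump({"job": job, "line": line_number}, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # Rename over the old file so a crash never leaves a half-written one.
    os.replace(tmp_path, path)


def finish_job(path):
    """Remove the progress file; its absence means the job completed."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def unfinished_job(path):
    """Return the saved progress dict, or None if no job was interrupted."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return None
```

Writing to disk after every line would be hard on an SD card, so a real plugin would probably record every N lines or every layer instead.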


#16

Sorry, my bad. That should have been "G92 En". I originally had the "n" in angle brackets but Discourse made both the angle brackets and the n disappear and I didn't catch it.


#17

So... what should n be in this case? I would guess that it would be the last extrusion offset from the previously-executed G1 line plus whatever you extruded manually.


#18

When you split gcode with absolute E values, the first part is fine but the second part needs to know what the last E value was in the first part so the next E value will extrude the proper amount of filament.

99% of the time, you are splitting a perfectly good gcode file because something went wrong and various manual actions have taken place to fix the problem and to arrive at where the split should be. The firmware is pretty good about recovering X, Y, and Z but E not so much. If you manually played with the filament (I always do) then you need to tell the firmware where its E value should be. G92 En sets the E value without moving the filament.

If your gcode has relative E values, then the split is the same, but no matter where the filament has been moved to manually, the next G0 or G1 with an E value will just extrude "that much more" filament.

By adjusting the E value with a G92, you are essentially "erasing" all of the manually executed filament moves. How much or how little was extruded (or retracted) manually doesn't matter. What matters is what the firmware is going to do with the next E value it sees on a G0 or G1 command.
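
To make the absolute-E case concrete, here is a small Python sketch that scans the first part of a split file for the last E value and builds the matching G92 line for the top of the second part. It assumes absolute E mode and space-separated gcode words; the function names and the sample lines are invented for the example.

```python
def last_e_value(lines):
    """Return the last E value set by a G0, G1 or G92 line, or None."""
    last = None
    for line in lines:
        # Ignore ';' comments and normalize case before splitting into words.
        words = line.split(";", 1)[0].upper().split()
        if not words or words[0] not in ("G0", "G1", "G92"):
            continue
        for word in words[1:]:
            if word.startswith("E") and len(word) > 1:
                last = float(word[1:])
    return last


def resume_prefix(first_part):
    """Build the 'G92 En' line that goes at the top of the second part."""
    value = last_e_value(first_part)
    if value is None:
        raise ValueError("no E value found before the split")
    return f"G92 E{value:.5f}"


# Made-up first part of a split file.
first_part = [
    "G92 E0",
    "G1 X10 Y10 E1.25",
    "G1 X20 Y10 E2.5 ; wall",
    "G0 X0 Y0 F3000",
]
prefix = resume_prefix(first_part)
```

Here `prefix` is `G92 E2.50000`, regardless of how much filament was pushed through by hand in between. For relative-E gcode no such line is needed.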

Does this explain it better?


#19

The relative-E paragraph is easy enough to follow.

Regarding absolute-E and assuming that the printer hasn't been shutdown/restarted, something internally like the RAMPS board ought to have an extrusion offset/audit running and will know what it last did.

So now, let's say you like to extrude 10mm to prime the pump and to push the cooked filament, fair enough. Is your next command then near the top of file-two G92 E10 or something else?


#20

I guess we need an example... I'll work one up and post it later tonight or tomorrow.