The ongoing USB Conspiracy Theory?


Some, but not all. A better (actually engineered) protocol would probably be the actual answer, but as I said, that would/will be met with a TON of resistance. I tried it once a couple years ago and I'm not gonna repeat that experience, it was frustrating at best.


Just to give a summary of my tests. Transferring a 20MByte file to a 32bit board running Marlin with a "pure USB" (no UART) connection gives the following data rates:
Octoprint (running on a Pi 2): 5.4KBytes/second
Repetier Host (running on a PC): 47KBytes/second
Direct USB file access (from Pi 2): 666KBytes/second
So as you can see the data rate achieved by Octoprint is much slower than what is possible from the raw transfer speed. I'd be interested to see results from a similar test to a board that uses a USB/UART connection and that uses just UARTS (so no USB) if anyone has the hardware and time to do it.

Oh, and for anyone who is interested: the protocol used by Repetier Host is identical to that used by OctoPrint (in that the bytes on the wire are the same). The difference is that rather than using a simple send/wait approach, it basically tracks the state of the printer's receive buffer. This allows it to send several lines of gcode before having to wait for an ok to come back from the printer. This typically makes better use of the available USB bandwidth (a smaller number of large packets rather than many small ones) and, probably more importantly, greatly reduces the overall latency, since the ok responses will typically overlap with the ongoing data being sent. It does, however, mean that error recovery becomes more complex, as you may need to go back several gcode lines.
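For anyone who wants to see why overlapping the oks matters, here is a small Python sketch. It is a toy timing model, not Repetier Host's or OctoPrint's actual code: `simulate`, `window`, `latency` and `proc` are names made up for this example, the window is counted in lines rather than bytes, and it assumes the firmware acknowledges strictly in the order it received the lines.

```python
def simulate(n_lines, window, latency, proc):
    """Return the time at which the last 'ok' reaches the host.

    window  -- how many unacknowledged lines the host allows in flight
               (1 is plain send/wait, i.e. ping pong)
    latency -- one-way transport delay, in arbitrary time units
    proc    -- time the printer needs to parse and queue one line
    """
    if n_lines == 0:
        return 0
    ack_times = []        # when the ok for each line arrives back at the host
    send_time = 0         # when the host sends the current line
    printer_free_at = 0   # when the printer has finished the previous line
    for i in range(n_lines):
        if i >= window:
            # The window is full: wait for the ok of the oldest in-flight line.
            send_time = max(send_time, ack_times[i - window])
        arrive = send_time + latency
        start = max(arrive, printer_free_at)
        printer_free_at = start + proc
        ack_times.append(printer_free_at + latency)
    return ack_times[-1]


# Hypothetical numbers: the transport delay is 10x the per-line processing time.
ping_pong = simulate(1000, window=1, latency=10, proc=1)
windowed = simulate(1000, window=4, latency=10, proc=1)
print("ping pong:", ping_pong, "windowed:", windowed)
```

With send/wait every line pays the full round trip, while with a window the round trips of neighbouring lines overlap. The model says nothing about error recovery, which is where the real complexity is.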


I'd like to know how it reliably achieves that tracking across several firmware variants. The thing is, there are horribly buggy firmwares out there that acknowledge commands in a different order than they received them in, and that also don't support any means of tracking the internal buffer state other than command tracking. I tried to implement a more sensible use of the buffer a couple of years ago, before I started running into these kinds of bugs, which pretty much nuked the whole approach and had me start the comm layer rewrite from scratch. Since then I've stuck to ping pong, since it's the only thing that reliably works across 99% of the firmware variants out there that get thrown at OctoPrint, without the user having to fiddle around with settings :woman_shrugging:

With regards to performance, I'd be interested in how OctoPrint would perform in your setup when installed on the same PC as Repetier Host (to negate the effects of the underlying hardware platform). It's cross-platform, so that should be possible to test.

A while ago I did some tests of the pure comm layer (without attached server) vs a C implementation for streaming an SD file to the firmware, and back then there wasn't a huge difference. Adding the full blown running server to it though changed things a bit. OctoPrint's architecture is a bit at its limit now - this is Python, so we have a GIL in the way that prevents utilization of multiple CPU cores for proper pipelining, and it's not just a GCODE sender but it has to do a ton of other stuff as well (the more plugins you have the more overhead).

Due to this I want to experiment with moving the printer communication (completely or partially) into its own process to get around the GIL, maybe split off other parts as well, but that would cause plugins that today rely on being able to inject themselves into that communication (e.g. Octolapse or anything that modifies the sent/received commands in any way) to no longer function, or at least not without severe adjustment, since shared memory would be a bit tricky to achieve here, as would injecting of various bits and pieces.


A good question, and given that they offer the option to enable ping/pong mode I suspect that they don't. But I guess that it works well enough for some (lots?) of people, given that there does not seem to be a large number of people reporting problems. Having said that, I did take a look at the code and it is pretty complex, certainly much more so than that in OctoPrint.

Yes I'd like to be able to eliminate the hardware differences to get a better picture of what is going on. When I have a few moments I will give it a try.

It is particularly interesting now that pure USB stacks are available that in theory can have hardware flow control and pretty much no loss of data. One of the more interesting decisions with the current Marlin USB stack is what we should do at the USB level if we get incoming data but have no place to put it (because Marlin is not processing the data fast enough). In effect the current serial implementation will just drop the data, as will the existing USB implementation (though in a slightly different way), but it is certainly possible (and I have a version that does this) to use the USB flow control to stall the data input so that no data is dropped.
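To illustrate the difference between dropping and stalling, here is a rough Python sketch. It is not Marlin's USB stack: `RxBuffer`, `offer` and `run` are invented for this example, and I am assuming the "stall" is something like a device refusing (NAKing) an incoming packet so that the host holds on to it and retries.

```python
from collections import deque


class RxBuffer:
    """Hypothetical fixed-size receive buffer with two overflow policies."""

    def __init__(self, capacity, stall_when_full):
        self.capacity = capacity
        self.stall_when_full = stall_when_full
        self.data = deque()
        self.dropped = 0

    def offer(self, packet):
        """Return True if the sender may treat the packet as delivered."""
        if len(self.data) < self.capacity:
            self.data.append(packet)
            return True
        if self.stall_when_full:
            return False      # refuse it: the sender keeps the packet and retries
        self.dropped += 1     # overrun: the sender thinks it was delivered, but it is gone
        return True

    def consume(self):
        return self.data.popleft() if self.data else None


def run(stall_when_full, packets=20, capacity=4, consume_every=3):
    """Sender offers one packet per tick; the firmware only reads every few ticks."""
    rx = RxBuffer(capacity, stall_when_full)
    to_send = deque(range(packets))
    received = []
    tick = 0
    while to_send or rx.data:
        tick += 1
        if to_send and rx.offer(to_send[0]):
            to_send.popleft()
        if tick % consume_every == 0:
            item = rx.consume()
            if item is not None:
                received.append(item)
    return received, rx.dropped


print("drop :", run(stall_when_full=False))
print("stall:", run(stall_when_full=True))
```

In the stalling case the sender is simply slowed down to the pace of the consumer and nothing is lost; in the dropping case the sender finishes sooner but the firmware never sees some of the packets.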




Speaking theoretically of course, I wonder what it would be like to have two USB cables between the Raspberry and the controller board? The first would be an out-of-band control channel and for printer-to-host conversations and the second would be for host-to-printer data. In theory, the first would have priority.

So if you want to pause the print job, this happens on the first connection. Since it's prioritized, the controller board would promptly respond to the request.

Returning to reality, perhaps it would be good for an outbound line of gcode to be able to "jump the queue" and be processed next. NodeJS's queuing mechanism can do stuff like this. Might be handy.
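A host-side "jump the queue" could look roughly like this in Python. This is only a sketch with made-up names (`CommandQueue`, `URGENT`, `NORMAL`), not an existing OctoPrint API, and it only reorders lines that are still waiting on the host: anything already sitting in the printer's own receive buffer is not affected.

```python
import heapq
import itertools


class CommandQueue:
    """Hypothetical send queue: urgent lines overtake normal ones, FIFO otherwise."""

    URGENT = 0
    NORMAL = 1

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order per priority

    def put(self, line, priority=NORMAL):
        heapq.heappush(self._heap, (priority, next(self._order), line))

    def get(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = CommandQueue()
queue.put("G1 X10 Y10")
queue.put("G1 X20 Y20")
queue.put("M112", priority=CommandQueue.URGENT)  # emergency stop on Marlin-style firmware

while queue:
    print(queue.get())
```

The urgent line comes out first even though it was queued last, and the two moves keep their original order.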


well if the mapping is bidirectional, so that one can convert between the two, it should not be a problem .. but while the mentioned .bgc files were way safer and faster to transport, afaik they never got accepted by the 3d printing community, and they were being pushed by netfabb, which was an important player at the time...

I see the same thing being pushed by repetier binary protocol v3, it looks 90% the same as what I was working on with netfabb, possibly the same ppl are implementing it now :smiley: for repetier...

the major problem I see is that any firmware will have to keep the plain text gcode parser in parallel to any new protocol out there, just as any host like octoprint needs to still support this pingponging thing in parallel to anything new... how easy it would be to do the protocol as a plugin for octoprint I can't say, but it might be a solution where anyone making a fancy firmware update can also make an octoprint plugin for his specific firmware :smiley:


You don't need two USB cables. A single connection can easily provide two channels that operate at two different priorities. Arguably the existing USB-CDC protocol (which is used to provide a serial-device-like interface over USB) already provides this. The messages that support the USB control signals (like baud rate and the various serial-style hardware flow control mechanisms: RTS, DCD etc.) are already communicated via a different USB mechanism to that used for general data flow, and can be received even if the main data channel is already "full" and paused. So it would be possible to, say, use one of those signals for a control program to tell the printer to stop and dump all of the data in the main data channel so it could receive a more complete "emergency command". Indeed the RTS signal is already used this way: it is typically used to indicate that the host program has closed the connection.


That's all well and good except when the controller board's receive queue is full and it's grinding on something that's taking too much time (tiny segments on a delta). I think you can get into a standoff when that buffer's full. The two-serial approach would mean that you have two receive buffers on the controller.


Just to be clear my comments are about 32bit systems that have a native USB stack, not ones that use a USB to serial interface like most of the Arduinos. There is nothing stopping you having multiple buffers for the USB stack. The Marlin LPC1768 USB stack has lots of them, some for the USB-CDC (serial interface) and some for the USB disk interface. All of the USB traffic is typically handled at interrupt time and will usually get handled no matter how busy the printer may be (especially on a 32bit system which is really what we are talking about here as you really need to have native USB hardware). If the processor will have time to look at two serial receive queues then it will have time to look at two streams provided by a single USB connection.

But anyway the idea of having different inputs with a different priority is certainly a valid one.


Do you know anything about the Smoothieboard while we're at it? (5xC v1.1)


but those boards mostly have zero issues with this lag we are discussing. I have 5+ smoothieware based machines running here, each of them with octoprint, and have never seen a printer stuttering, even with the crazy short movements generated by buggy simplify3d (fixed some time ago)

i do i do :smiley:


First off, where did you get one? :laugh:

What kind of serial rate can you push to this board?


that's a tough one, one original from germany, one original I got as gift, 4 PRC mks sbase clones from ALIEXPRESS, 2 PRC re-arm clones (aliexpress) and one original re-arm.... then I created my own version of the board, I have 4 different ones :smiley: + I have one conversion of teartime printer to smoothieware (cpu replacement)

I run my own fork of smoothieware on these printers and all printers are cartesian (no corexy nor delta)

I never tried more than 2,000,000 but you can probably go higher. these boards are used for laser printing too, they move the laser head (mirror) super duper fast printing raster, so an incredible number of gcodes per second, more than any 3d printer is physically capable of doing... you can also have 2 serial ports on the same usb, so you can use one to shoot code and the other one to monitor the printer (never tried that) ..

now if you have some special test you want me to try I'm happy to do it :smiley:


there's also a very useful:

it can send data much faster than a normal host


smoothieboard works faster through usb than from sdcard in many cases btw :smiley:
ppl get over 1000 pixels per second on laser engravers, each pixel is at least one (if not two) g-code lines

smoothieboard also has a network port that's even slower than sdcard (and of course usb)

EDIT: good read:

We determined that USB serial is capable of around 10,000 gcodes/sec so it isn't the communication bandwidth (although one has to take into account that USB may not be full duplex and a small buffer containing the ok needs to be sent back, plus the overhead of USB handshaking and protocol... but I guess at worst that would be half the bandwidth).


I'd love to see one of your Smoothieware config file examples one of these days. (I'm going to buy one in a few days when they become more easily available.)


I'm just running a test pushing 1,000,000 lines of g-code (all lines are M21, which does nothing on smoothieware but still needs to get sent, ack'ed and parsed, + there's a reply that the sd card is ok .. so actually slower than ideal as there's a "big" reply, not only ok) .. so 500KB file from orange pi one using octoprint to smoothieware

Send: N41610 M21*18
Recv: SD card ok
Recv: ok
Send: N41611 M21*19
Recv: SD card ok
Recv: ok
Send: N41612 M21*16
Recv: SD card ok
Recv: ok
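For reference, the `*18`, `*19` and `*16` at the end of those lines are the usual line checksum: the XOR of every byte before the `*`. A quick Python sketch (the helper names `gcode_checksum` and `frame` are my own, not OctoPrint's) reproduces the three lines from the log above:

```python
from functools import reduce


def gcode_checksum(payload):
    """XOR of all bytes of the line up to, but not including, the '*'."""
    return reduce(lambda acc, byte: acc ^ byte, payload.encode("ascii"), 0)


def frame(line_number, command):
    """Prefix the line number and append the checksum, e.g. 'N41610 M21*18'."""
    payload = "N%d %s" % (line_number, command)
    return "%s*%d" % (payload, gcode_checksum(payload))


for number in (41610, 41611, 41612):
    print(frame(number, "M21"))
```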

using 115200 - took 06:29 so 389sec for 1M gcodes or about 2570 gcodes/second
using 2000000 - took exactly the same 06:29

now same file using the

pi@orangepione:~$ time OctoPrint/venv/bin/python -q .octoprint/uploads/milion.gcode /dev/ttyACM0
Streaming .octoprint/uploads/milion.gcode to /dev/ttyACM0
Waiting for complete...
  Press <Enter> to exit

real    3m23.662s
user    1m3.907s
sys     0m19.444s

now this "press enter to exit" took a few seconds so the time is even less than this 3:23 but.. even 3:23 for a million lines is not bad :slight_smile:

config, that's rather simple :smiley:

root@orangepione:~# cat config.txt
# NOTE Lines must not exceed 132 characters

default_feed_rate                            4000
default_seek_rate                            4000
mm_per_arc_segment                           0.5 
alpha_steps_per_mm                           644
beta_steps_per_mm                            644
gamma_steps_per_mm                           644
planner_queue_size                           32
acceleration                                 1000
acceleration_ticks_per_second                1000
junction_deviation                           0.02
microseconds_per_step_pulse                  1
base_stepping_frequency                      100000
x_axis_max_speed                             30000
y_axis_max_speed                             30000
z_axis_max_speed                             1200
alpha_step_pin                               2.1
alpha_dir_pin                                0.11!
alpha_en_pin                                 2.4!
alpha_current                                1.5
alpha_max_rate                               30000.0
beta_step_pin                                2.0
beta_dir_pin                                 0.5
beta_en_pin                                  2.4!
beta_current                                 1.5
beta_max_rate                                30000.0
gamma_step_pin                               2.2
gamma_dir_pin                                0.20
gamma_en_pin                                 2.4!
gamma_current                                1.5
gamma_max_rate                               1200.0
second_usb_serial_enable                     false
leds_disable                                 true
kill_button_enable                           true
kill_button_pin                              2.12
currentcontrol_module_enable                 false
extruder.hotend.enable                       true
extruder.hotend.steps_per_mm                   800
extruder.hotend.filament_diameter              1.75
extruder.hotend.default_feed_rate              600
extruder.hotend.acceleration                   500
extruder.hotend.max_speed                     5000
extruder.hotend.step_pin                        2.3
extruder.hotend.dir_pin                         0.22
extruder.hotend.en_pin                          2.4!
delta_current                                1.5
laser_module_enable                          false
temperature_control.hotend.enable              true
temperature_control.hotend.sensor              pt100
temperature_control.hotend.thermistor_pin      0.23
temperature_control.hotend.ampmod1_pin         1.20
temperature_control.hotend.ampmod2_pin         1.21
temperature_control.hotend.slope               0.0257604875
temperature_control.hotend.yintercept         -18.54
temperature_control.hotend.heater_pin          2.7
temperature_control.hotend.set_m_code          104
temperature_control.hotend.set_and_wait_m_code 109
temperature_control.hotend.designator          T
temperature_control.hotend.max_temp            500
temperature_control.hotend.min_temp            0
temperature_control.hotend.p_factor            23.0
temperature_control.hotend.i_factor            1.104
temperature_control.hotend.d_factor            120
temperature_control.bed.enable                 true
temperature_control.bed.sensor                 pt100
temperature_control.bed.slope                  0.0234092253
temperature_control.bed.yintercept            -2.85
temperature_control.bed.thermistor_pin         0.24
temperature_control.bed.heater_pin             2.5
temperature_control.bed.set_m_code             140
temperature_control.bed.set_and_wait_m_code    190
temperature_control.bed.designator             B
temperature_control.bed.bang_bang              true
temperature_control.bed.hysteresis             1.0
temperature_control.bed.min_temp               0
temperature_control.bed.max_temp               150
temperature_control.bed.runaway_heating_timeout 0
temperature_control.bed.runaway_cooling_timeout 0
endstops_enable                                true
alpha_min_endstop                              1.26^
alpha_max_endstop                              1.26^
alpha_homing_direction                         home_to_min
alpha_min                                      0
alpha_max                                      140
beta_min_endstop                               1.24^
beta_max_endstop                               1.24^
beta_homing_direction                          home_to_min
beta_min                                       0
beta_max                                       145
gamma_min_endstop                              1.28^
gamma_max_endstop                              1.28^
gamma_homing_direction                         home_to_max
gamma_min                                      0
gamma_max                                      135.4
homing_order                                   ZXY
move_to_origin_after_home                      false
alpha_fast_homing_rate_mm_s                    50
beta_fast_homing_rate_mm_s                     50
gamma_fast_homing_rate_mm_s                     4
alpha_slow_homing_rate_mm_s                    25
beta_slow_homing_rate_mm_s                     25
gamma_slow_homing_rate_mm_s                     2
alpha_homing_retract_mm                         5
beta_homing_retract_mm                          5
gamma_homing_retract_mm                         1
zprobe.enable                                  true
zprobe.probe_pin                               1.30
zprobe.slow_feedrate                           5   
#zprobe.debounce_count                         100 
zprobe.fast_feedrate                           100 
zprobe.probe_height                            10  
#gamma_min_endstop                             nc  
leveling-strategy.three-point-leveling.enable  false
leveling-strategy.rectangular-grid.enable               true
leveling-strategy.rectangular-grid.x_size               140 
leveling-strategy.rectangular-grid.y_size               140 
leveling-strategy.rectangular-grid.size                 7   
leveling-strategy.rectangular-grid.probe_offsets        0,0,0                 true  
leveling-strategy.rectangular-grid.initial_height       10    
leveling-strategy.rectangular-grid.human_readable       true
mm_per_line_segment 1 # necessary for cartesians using rectangular-grid
leveling-strategy.rectangular-grid.height_limit         3
leveling-strategy.rectangular-grid.dampening_start      1
leave_heaters_on_suspend true
panel.enable                                 false
network.enable                               false                            true                  M106                 M107                        1.18                       hwpwm                           100                    true                     true                     0
switch.sw1.fail_safe_set_to                  100

the cheapest way to go smoothieware is this SKR board for 17$ (I have ordered, not tried it yet):

or get rearm clone:

now while this is the cheapest way, I do suggest getting original, it is bit more expensive but the pcb is properly made with proper power distribution and proper emi shielding and high speed traces crosstalk consideration and and and ...



Gonna wait for the top-shelf version in a few days (5xC).


btw tested now fast-streamer from a regular linux box (some old Core2Duo E6850 running fedora), same file, same smoothieboard ...

[root@gedora ~]# time echo | python -q milion.gcode /dev/ttyACM0
Streaming milion.gcode to /dev/ttyACM0
Waiting for complete...
  Press <Enter> to exit
real    1m39.769s
user    0m23.681s
sys     0m8.914s
[root@gedora ~]#

so 1:40 or 10,000 gcode lines per second ... more than twice the speed of orangepi one
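If anyone wants to redo the arithmetic for the two streamer runs, here is a small Python sketch that turns the `real` values printed by `time` into lines per second. `real_seconds` and `lines_per_second` are helper names invented for this example; the only inputs are the 1,000,000 lines and the two `real` times quoted in this thread.

```python
import re

LINES = 1_000_000  # size of the test file, in gcode lines


def real_seconds(value):
    """Convert a `time` value such as '3m23.662s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError("unexpected time format: %r" % value)
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def lines_per_second(value, lines=LINES):
    return lines / real_seconds(value)


runs = {
    "orange pi one": "3m23.662s",
    "Core2Duo PC": "1m39.769s",
}
for host, real in runs.items():
    print("%-14s %8.0f lines/s" % (host, lines_per_second(real)))
```

Both figures include the few seconds spent at the "Press <Enter> to exit" prompt on the orange pi run, so the real rate there is a little higher than what this prints.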

yeah 5xC are available for pre-order on robotseed ...
I think the USA store has them in stock, you might want to check

I was hoping for v2 to come out but I kinda lost interest along the way :frowning: .. and am actually holding myself to start testing klipper first chance I get to find some time :slight_smile: