15mm error while running code

I’ve noticed this too. Not sure why.

Now we just need to figure this out :grinning:

We’ve been doing a binary search to figure out at which version the issue started happening. We’re running the same 40-minute cut over and over on each version at Alt Space (the maker space), where we saw the issue happen before, but we just made it all the way back to 0.76 without the issue showing up :grimacing:
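
For anyone who wants to follow along, here is a rough Python sketch of the kind of search described above. It is not our actual test tooling: `first_bad_version` and `is_bad` are made-up names, and the version list is only a placeholder. It also assumes each version is reliably good or bad, which is a big assumption with an intermittent fault like this one, so in practice `is_bad` would probably need to repeat the cut several times before calling a version good.

```python
from typing import Callable, Optional, Sequence


def first_bad_version(
    versions: Sequence[str], is_bad: Callable[[str], bool]
) -> Optional[str]:
    """Return the earliest version for which is_bad() is True.

    Assumes `versions` is sorted oldest to newest and that every version
    after the first bad one is also bad. Returns None if nothing fails.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid          # the first bad version is at mid or earlier
        else:
            lo = mid + 1      # the first bad version is after mid
    return versions[lo] if lo < len(versions) else None


# Placeholder version list and a fake test that "fails" from index 2 onward.
versions = ["0.76", "0.77", "0.78", "0.78.1"]
culprit = first_bad_version(versions, lambda v: versions.index(v) >= 2)

# If every version passes (which is what we are seeing), there is no culprit.
no_culprit = first_bad_version(versions, lambda v: False)
```

The second call is the situation we are in right now: every version tested clean, so the search cannot point at a change.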

This is FANTASTIC testing. This also matches up with my experience.

When we got really busy packing and shipping kits, we had less time for real projects, so I was doing a lot of testing in the office running dry cuts (router off). I think that in testing that way I somehow introduced a bug which only shows up when the router is on :man_facepalming:. I think it’s reasonable that the bug is both software based and EMF sensitive. We’re reading from the encoders 1000 times a second, so we can survive a decent number of bad packets and still maintain physical precision, but something about the way we handle the encoder reads now seems to be different from before and is giving a bad packet a chance to crash things.
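
To illustrate what "surviving a decent number of bad packets" could look like, here is a small Python sketch. It is not the firmware code: the `ReadMonitor` class and the threshold of 3 are assumptions made up for the example. The idea is that a failed read is skipped and the last good value is kept, and only a long run of consecutive failures counts as a real fault.

```python
from typing import Optional


class ReadMonitor:
    """Tolerate isolated bad encoder reads; flag only long runs of them.

    Hypothetical helper for illustration, not the real firmware logic.
    """

    def __init__(self, max_consecutive_bad: int) -> None:
        self.max_consecutive_bad = max_consecutive_bad
        self.consecutive_bad = 0
        self.last_good: Optional[float] = None

    def update(self, value: Optional[float]) -> bool:
        """Feed one read (None means the read failed).

        Returns True while the encoder is still considered healthy.
        """
        if value is None:
            self.consecutive_bad += 1
        else:
            self.consecutive_bad = 0
            self.last_good = value
        return self.consecutive_bad <= self.max_consecutive_bad


monitor = ReadMonitor(max_consecutive_bad=3)  # threshold is an assumption
reads = [10.0, None, None, 10.1, None, None, None, None]
healthy = [monitor.update(r) for r in reads]
```

With these made-up reads, the first seven are tolerated and only the fourth failure in a row trips the fault.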

We’re working through testing every version right now to figure out which change is the cause of the instability.

What if it’s not an EMF thing, what if it’s a vibration thing? The router vibrates a lot more when cutting than when not cutting

Has anyone tested a cut with the router running but the dust collector not running? It occurs to me that, considering the static every other dust collector generates, if this is EMF the dust flowing through the hose would be the cause.

On another note, with just as much mystery as anything else, I swapped back to all original cat6 encoder cables and started having more frequent issues. As soon as I switched one of them back to the random one I had lying around, it was much better. The issue still comes up, but only about every 3rd cut, not every 2 minutes. Maybe worth noting that this other cable incidentally touches the power cable.

1 Like

Bar wrote:

What if it’s not an EMF thing, what if it’s a vibration thing? The router vibrates a lot more when cutting than when not cutting

Get a bolt, have someone hit it with a MIG welder to make it slightly unbalanced, then chuck that up and try running it (at the lowest speed you can).

David Lang

2 Likes

Yes. I ran this 5 hour cut with the dust collection turned off and removed for the last 4ish hours of the cut.

1 Like

Just a long shot… I have done lots of microcontroller projects in the past and I remember some with nasty bugs caused by very subtle timing problems. Sometimes the problems got worse by adding or even just changing error logging/messaging and sometimes the logging itself actually was the real problem. Could it be that as a result of vibration, there are slightly more encoder read errors that need to be logged, causing timing problems (race conditions, delays, missing interrupts, too many interrupts, …)?
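
One pattern I have used to keep logging out of the time-critical path is to only bump a counter when an error happens and to report a summary at a much lower rate. Below is a rough Python sketch of that idea. I have not looked at how the firmware logs encoder errors, so the class name, the 1000 ms interval and the message text are all made up for illustration.

```python
from typing import Callable


class RateLimitedErrorLog:
    """Count errors cheaply in the fast loop, report them at a slower rate."""

    def __init__(self, interval_ms: int, sink: Callable[[str], None]) -> None:
        self.interval_ms = interval_ms
        self.sink = sink            # where messages go, e.g. print
        self.pending = 0
        self.last_flush_ms = 0

    def note_error(self) -> None:
        # Fast path: no formatting and no I/O, just an increment.
        self.pending += 1

    def flush(self, now_ms: int) -> None:
        # Slow path: call this from non-critical code.
        if self.pending and now_ms - self.last_flush_ms >= self.interval_ms:
            self.sink(f"{self.pending} encoder read errors since last report")
            self.pending = 0
            self.last_flush_ms = now_ms


messages = []
log = RateLimitedErrorLog(interval_ms=1000, sink=messages.append)
for _ in range(5):
    log.note_error()
log.flush(now_ms=500)    # too soon, nothing is written
log.flush(now_ms=1000)   # one summary message for all five errors
```

The point is that a burst of read errors costs five increments in the fast loop instead of five log writes.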

1 Like

Oh yeah it’s absolutely something like that :grinning:

It’s the worst type of bug to track down

Just to add, that 5 hour cut I did was a detail finishing pass. There were some sections where the passes were pretty deep for the bit, but I would hazard there was a lot less resistance to the bit with this job than there is with a standard cut.

I’d also hazard that using a ball-tip endmill may produce a smoother ride as well, which is what I was using for this. When I used the same width but a straight end bit on a more-standard cut, it was the only time I ever saw the 15mm error.

With some people having mitigated EMF pretty robustly and still having the issue, I really wouldn’t be surprised if it had something to do with the extra strain on belts during a cut or with vibration causing misreads or causing compensatory code to run more often and break timings.

1 Like

I’m hopping in here to summarize where I think we are:

  1. Older versions of the firmware seem more stable, but there are still errors, even on the older version (@jadeqp).
  2. EMF may or may not be contributing to it.
  3. Vibration may or may not be contributing to it.
  4. It seems that it fails more often when cutting material, but there have been failures (@Josh_Monroe ) on dry runs too.
  5. Grounding/shielding/separating cords and other suggestions to reduce EMI “may help”?

I don’t see that we have really eliminated anything as a cause (software, EMI, vibration, tension from cutting too aggressively, etc.). Well… maybe I just have a case of the “Mondays”, but this seems to be the case. I was pretty excited to get a successful cut, but it seems like my cut was not nearly long enough to prove that the firmware was the issue.

Also, it occurs to me that the 15mm issue itself may not even be the cause of some of these failures, given that this very long topic has gone through iterations where we actually removed that as a panic condition and allowed cuts to move on past it, and other failures happened anyway.

I wonder if we’d be best off finding a way to produce a very detailed “telemetry dump” of machine data over time during cuts that can be downloaded after a failure to determine what happened. The ESP32 has 2 cores, so it seems like the non-critical core could periodically gather measurements and log them to disk, and we could then download them and maybe see where things went wrong. (Adding a timestamp to all messages would also allow us to correlate the data.)
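
To make the idea concrete, here is a rough sketch of what I mean, written in Python just to show the shape of it (the real thing would live in the firmware). Everything here is hypothetical: the `TelemetryLog` class, the `belt_tl_mm` field name and the capacity are made up. It keeps only the most recent samples in a fixed-size buffer, stamps each one with a timestamp, and dumps them as one JSON object per line.

```python
import json
from collections import deque


class TelemetryLog:
    """Fixed-size buffer of timestamped samples; oldest entries drop off."""

    def __init__(self, capacity: int) -> None:
        self.samples = deque(maxlen=capacity)

    def record(self, timestamp_ms: int, **fields) -> None:
        # Every sample carries its own timestamp so it can be lined up
        # with other log messages later.
        self.samples.append({"t_ms": timestamp_ms, **fields})

    def dump(self) -> str:
        # One JSON object per line, oldest first.
        return "\n".join(json.dumps(s) for s in self.samples)


log = TelemetryLog(capacity=3)            # tiny capacity, just for the demo
for t in range(5):
    log.record(t, belt_tl_mm=100.0 + t)   # hypothetical field name

dump_lines = log.dump().splitlines()      # only the last 3 samples remain
```

A bounded buffer like this means the log never grows without limit during a 5 hour cut, and after a failure we would still have the samples leading up to it.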

2 Likes

Perhaps it’s an EMF issue and the 15mm error is the failure state that manifests in more recent versions of the firmware, whereas random movement and belt snap is how it manifests in 0.64? I agree that some kind of more verbose logging would be useful.

2 Likes

It’s funny that you would mention me now. I have been busy with family and work, but just got back to this tonight. I am trying a dry run on 0.77, but I have the PCB removed and a sandwich of foil, felt and electrical tape between it and the router. The router is on, but not cutting.

If this works, I’ll go ahead and make myself a hat from the foil.

4 Likes

:rofl:

1 Like
  1. shortening length of DC cord may help
  2. adding power filtering capacitors to DC supply may help

FYI, I was able to do a smaller version of my laundry sign (when my sister-in-law saw my wife’s sign she wanted one too :joy:) with the latest 0.78.1 firmware, using an 1/8 bit in 2 passes on 1/2 MDF, feedrate 1500, speed “2” (should be 18,200).
I did not see any 15mm messages at all. After putting a fresh coat of wax on the bottom of my sled and redoing calibration yet again (starting with dlang’s manual measurement calculator, then a calibration), I saw no overcurrent errors either.

I may be getting there as far as trust in this machine for me with my current mods (grounding encoder pin, grounding shield under board, grounding dust hose).

Also, as an aside, these bits are pretty awesome: https://www.amazon.com/gp/product/B09Z6DJ62D/?th=1 I think I could have skipped tabs and done one pass, and I will try that on my next MDF sign.

3 Likes

Following up on my previous tests (https://forums.maslowcnc.com/t/15mm-error-while-running-code/21106/129?u=willemx) I did another test run, now with the router on and cutting, but with the dust collection off and no hose connected. That resulted in a successful cut (and lots of dust all over the place…).
I was thinking maybe I could try to shield the magnetic sensors somehow and I wrapped all sensors in aluminium foil, creating some (very leaky) Faraday cages.

I ran the same test again (router cutting and dust collection on), but this resulted in another fail after a few minutes. It did not fail with a 15mm error though, but with “Update function not being called enough”.

I’m pretty sure that the issue is dust collection, and I’m also pretty sure that the anti-static hose did in fact fix it for us, but I want to get some more testing in to be sure before recommending that anyone buy an $80 hose.

1 Like

I tried the anti-static dust hose that you sent a link for a couple weeks ago and didn’t have any luck with that or with putting a ground wire down the dust hose.

1 Like

That’s good feedback. @zack_rosier, are you able to get the issue to happen without the dust collection on at all, or does the issue happen only when the dust collection is running, regardless of the type of hose?

1 Like

The only time I haven’t gotten the error to show up is when I am doing a dry test with the router on but not actually cutting anything.

Here is a list of the tests/things I’ve tried that were unsuccessful:

Faster RPM
Clean ethernet cables
3/4” pass
1/4” passes
3/16” passes
1/8” passes
New bit
Anti static hose
Bare wire inside dust collector
Test outlets are grounded
Shielded cat 6 wire
Ground router and base
Horizontal frame
Different types of material

1 Like