Settings not pushing to machine?

Hey everyone!

I ran into a strange issue today that’s got me a little stumped so I’m curious if someone else can spot what’s going on.

The issue is that today I was seeing behavior that made me think maybe some of my settings weren’t getting pushed to the machine properly with 0.83. As part of the process of investigating what was happening I started cleaning up some of the code along the way. As part of the clean up I removed the default values for the kinematics math, and something strange happened. When I removed the default values for either D or motorOffsetY (float motorOffsetY = 463.0; vs float motorOffsetY;) the kinematics was wrong. I can (I think) confirm that the values sent from Ground Control are being sent, and the correct values being changed within kinematics, but the kinematics does not work properly without default values for either of these. I checked this by adding a print statement to line 78 in the file kinematics.cpp

In the firmware file kinematics.h:

I’m fairly certain that this speaks to a larger issue that we need to solve.

My branch where I started poking around is here:

https://github.com/MaslowCNC/Firmware/tree/explore-some-settings-not-pushing

Does anyone see what I’m missing?

A couple of things come to mind, but I am at work so I don’t have the arduino ICE with me to test my thoughts.

You say that kinematics are not correct. Can you elaborate? Does it start wrong and then correct? How is the problem manifested?

  1. All variables should have an initial value. Sometimes the compiler\linker has options to ensure this but a good coding practice is to set a variable’s value to something known, otherwise you may get whatever is currently sitting in that memory location. I would set D and motorOffsetY to something reasonable or more probably 0.0.

  2. Because the timer interrupt is setup and started in the setup() routine and the loop() starts later, it is highly possible the the interrupt is firing before the settings get read inside of the loop(). If this was the case, I would expect that its first calculation could be wrong, but then it would be correct after the B03 command was processed. Unless! If the value that is present in D or offsetMotorY is something that could cause an exeception (e.g. division by 0) it is possible that an unhandled exception is occurring and then doesn’t recover.

What you might do is ensure that the settings changes are processed at least once before the timer interrupt can fire. You could either:

  1. call the “settings” processing functions before setting up the timer interrupt (maybe you could call loop() a few times manually from within setup() before starting the timer interrupt although I haven’t looked through the code enough just now to verify this is the best solution) OR
  2. put the timer setup at the very end of the loop() function but only call it the first time that loop() is executed. (I like this option less).

However you decide to do it, I would suggest initializing both variables to “something” in their declaration, and make sure that your settings changes are processed at least once before the timer interrupt starts.

Best guesses for now… I’ll look at it more this weekend with the debugger attached if you don’t have it solved by then.

Welcome to interrupt-driven programming!

1 Like

I’m also going to gently suggest a change that will make things more predictable.

In your interrupt handler, you are doing PID calculations. This is probably not the best place for this. Interrupt service routines(ISR) are meant to be in and out as fast as possible, so that overlapping interrupts (and the problems those cause) are less likely to occur. In your ISR, you should probably just set a bit that indicates that it’s time to calculate PIDs and then if the bit is set, do that calculating within the loop() (and don’t forget to reset the bit).

Best practice when servicing interrupts… minimize processing.

1 Like

I’ll chime in to echo that gentle suggestion on the interrupt processing thing. This has been the solution to many of the ‘funnies’ I’ve dealt with in the past.
As to the settings issue, this does feel like an uninitialized variable to me as well. I was going to try a ‘firstRun’ flag in kinematics to print the values - one version uninitialized, one initialized but identifiable as such. Is there a reason not to store the values in EEPROM and accept updates from GC?

along similar lines, we are wanting to be compatible with grbl senders, so we
need to have these settings, and the PID values, stored locally so that we don’t
have to have them sent to the device every time.

Looking at the values of D and motorOffsetY just before they are updated in updateSettings, they are 0.0 if they were not configured in Kinematics.h. Haven’t looked into Kinematics to see the ramifications…

1 Like

Ramifications at a quick glance… I see division operations with D as part of the divisor, which could mean possible divide by 0 error. The use of motorOffsetY looks pretty benign from that standpoint. In any case, I would bet on a scenario where they are used before being set.

1 Like

I assume everyone knows this but just-in-case…if anyone changes the code to store the settings in flash, make sure to read them out into local variables in RAM at startup… reading from the flash at runtime should only be done once as it is very slow compared to RAM and would probably result in other issues appearing.

1 Like

I wasn’t able to replicate the issue I was seeing last night, I think I was just tired and doing something silly. It seems like all the settings are pushing right now :upside_down_face:

@seware Your point about initializing to a know value instead of just setting it to whatever is in memory is a good one. I’ve set them all to 1 to avoid divide by zero errors.

I agree that our ISR is way too big and I’m open to a better architecture if you have one in mind. The PID loops need to be called with pretty consistent timing which is why the timer is nice, but I agree that if our ISR were to wrap over itself it would be a big mess (and you can make that happen by putting a bunch of prints in the ISR)

@dlang Your point about wanting to make sure connecting from any GRBL controller works is also well taken. I think that the right solution is to store the machine’s settings into EEPROM, that way we only need to push them when the calibration changes. What do you think?

1 Like

I agree that our ISR is way too big and I’m open to a better architecture if you have one in mind. The PID loops need to be called with pretty consistent timing which is why the timer is nice, but I agree that if our ISR were to wrap over itself it would be a big mess (and you can make that happen by putting a bunch of prints in the ISR)

There are two ways to do accurate motion calculations

  1. try to run the code at predictable intervals

  2. when you run the code, fetch the time and include the delay since the last
    time it was run in the calculations.

when you have interrupt driven code running, the second approach is better as
you never really have an ability to run at exactly the right time.

so thge ISR should fetch the value of a timer that’s continually running

@dlang Your point about wanting to make sure connecting from any GRBL
controller works is also well taken. I think that the right solution is to
store the machine’s settings into EEPROM, that way we only need to push them
when the calibration changes. What do you think?

Long term, I think these should be stored in flash, not in EEPROM, since some of
our later targets may not have EEPROM, but you are thinking along thr right
lines.

@dlang, the most recent commits to the ESP32 Arduino include EEPROM :slight_smile: . The Due erases flash when the firmware is changed :frowning:

There are several feasible design patterns, but here’s what I really think:

Best (but most effort): use an RTOS. (See my other diatribes on the subject).

Next best (IMHO):

loop() runs in a endless tight loop, meaning it will be executed as frequently as the code within it allows. Within loop() is the serial polling and g-line processing code. Outside of loop() is an ISR that needs to run trigger some code consistently. So we have a case where we are using both polling and interrupts. This is not a best-practice situation if consistent timing is required. To stay with the current basic architecture I would replace the serial polling with a serial interrupt (yes it can be done with arduino). So both the timer and “serial available” are triggered by interrupt. The ISRs themselves kept simple: In the timer ISR set a bit when it’s time to compute PID; in the serial ISR, set a bit that serial is available (or alternately pull the value out of the UART buffer and put it in a queue, if the UART buffer itself isn’t large enough) This makes the the loop() function much simpler and lessens its overall work. The purpose of minimizing loop() is so that it can now focus on handling computing the PIDs (when timer bit says so) or processing lines of serial instruction (but only when it’s available). This makes it easy to tune the loop() to meet your consistent PID needs: it will only calculate on loop() iterations where the ComputePIDNow bit has been set. If you’re worried about it getting behind, it could also be combined with @dlang’s plan of taking a timestamp when the timer fires, so that adjustments could be made in the edge case that other work in the loop() exceeded the max time between PID calcs.

All this to say, if you want to ensure consistent timing, use a lightweight RTOS; if that’s not feasible, at least make it completely interrupt-driven. Interrupts drive when to do work… loop() does the work when required. Simple, right?

1 Like

I like your interrupt-driven approach, but FreeRTOS might be an interesting branch. Currently the firmware updates the motors at about 6kHz. Is this frequent an update necessary?
FreeRTOS is limited by time slices no less than 15ms. If nothing else got in the way, my very rough calculation suggests we could approach 75 inches per minute at the ~.02 inch accuracy in the 15ms time slice.

Sheesh… in thinking more about computing PIDs within the ISR (and the inherent dangers therein), it also occurred to me that (at least) these three variables in the Axis class: _pidSetpoint, _pidInput, _pidOutput; should be declared as “volatile” because they are read and written by both the main program loop and the ISR. As an example, as coded, it is highly possible that sometimes when _pidOutput is written to the motor, it is using the wrong value, because it might be taking the value from a register and not RAM. Any variable that can be modified from another location than the currently executing thread, should be declared as “volatile”

1 Like

an OS adds overhead, on a 16MHz machine, this overhead can be a real problem.

In our case, it’s not that the PID calculations need to be done at a specific
time, but rather that the calculations need to know what time it is when they
are run.

There are multiple things that we have to track at the same time

  1. encoder interrupts that need to immediatly update our ‘known position’

  2. serial interrupts that indicate the need to either push additional pending
    data, or pull data from the buffer. in both cases, it’s keeping data flowing and
    avoiding loss. When reading from the buffer, we need to look for ‘stop’ and
    similar commands, separate from normal command interpretation.

  3. the tight loop that computes where we should be, where we are, and adjusts
    the power to try and make the two match (the PID loop)

  4. interpreting the serial data so that when we finish the current movement, we
    know what to do next (eventually this should get ahead of what we are currently
    doing so that smart acceleration planning can be done)

Am I missing anything else that the software needs to do?

#1 and #2 are interrupt driven (even if you have a RTOS, they are interrupt
driven)

#3 doesn’t need to be done on a precise schedule, as long as the calculation
knows the time precisely enough to compute where we should be (and as long as
the loop is fast enough that we don’t overshoot too far when stopping). The more
frequently this loop runs, the more accurate the motion.

so the only odd duck is interpreting the serial data. That is where a flag
should be set when new data arrives so that the main loop can spend a little
time processing the serial data in the next pass.

I like your interrupt-driven approach, but FreeRTOS might be an interesting
branch. Currently the firmware updates the motors at about 6kHz. Is this
frequent an update necessary?

the more frequent the updates, the more accurate the motion.

Remember, any cpu cycles not used are wasted (and on these tiny devices, you
don’t even have power saving modes)

FreeRTOS is limited by time slices no less than 15ms. If nothing else got in
the way, my very rough calculation suggests we could approach 75 inches per
minute at the ~.02 inch accuracy in the 15ms time slice.

how well can it respond to the encoder interrupts? remember that it needs to
process one encoder pulse from the motor before the next one arrives or it
cannot figure out which direction things are moving.

8148 pulses/rev @10 rpm is 1358 pulses/sec or one pulse (from that motor) every
0.00073 sec. And there are three motors to monitor

15ms is a very long time in this sort of environment.

The PID calculations appear to be called from the interrupt-driven timer, which is configured for 10ms.

For comparison the Marlin 3D printer firmware will generate somewhere around 15K steps/second on the same hardware platform

There is an extra layer of calculation the Maslow needs to do… :wink: and monitoring the encoder as well.
Marlin is a very mature-looking project, though. They sure have squeezed a lot of juice out of the Mega!

One less axis, and no temperature control helps a little. Marlin started with grbl, although it forked off somewhere in the early 2010s. I’ve been using it since mid 2012. It fell into the open source bickering trap for a while although that might be resolved with the 1.1 effort. Repetier firmware plays games to try and double the step rates, although I’ve forgotten the details. Tinyg does it with brute force, higher clock rates.

Strongly agree with moving as much as possible out if interrupt code, no place for lengthy calculation and certainly if it includes emulated floating point. Had good luck with the flag and non-interrupt time technique. Haven’t done much more than a quick skim of the maslow firmware so can’t make any constructive comments.

1 Like