Here is a discussion of some of the memory problems that can arise in more complex applications on the Arduino. A memory leak is another possibility, though the Arduino IDE developers have done a pretty good job of preventing those.
No, a memory leak is when the computer has lost track of a chunk of memory that
has nothing useful in it, but will never release it.
Fragmentation is when you use memory in a pattern that leaves free chunks of
memory between used chunks of memory, but no single free chunk is large enough to use.
Say you have a 12" length of space, and you can’t move blocks once they are placed
in the space (other than to remove them).
So you allocate (place) a 3" chunk, a 4" chunk, and a 3" chunk (10" used).
You then release the 4" chunk and put in a 2" chunk (using the first 2" of the
4" space). You now have 8" used out of 12", but if you need to place a 3" chunk
there, you don’t have any place to put it, as you have 2x 2" spaces available.
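For anyone who wants to poke at the 12" example, here is a rough sketch of it as a toy first-fit allocator in plain C++. It is only a model of the idea, not how the AVR malloc actually works, and the names (`Space`, `place`, `release`, `freeTotal`, `replayExample`) are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy model of the 12" space: one cell per inch. 0 means free,
// any other value is the id of the block occupying that inch.
constexpr std::size_t kSpace = 12;
using Space = std::array<int, kSpace>;

// First-fit placement. Returns the start offset, or -1 if no single
// free run is long enough. Blocks are never moved once placed.
int place(Space& space, int id, std::size_t length) {
    assert(id != 0 && length > 0);
    std::size_t run = 0;  // length of the current run of free cells
    for (std::size_t i = 0; i < kSpace; ++i) {
        run = (space[i] == 0) ? run + 1 : 0;
        if (run == length) {
            const std::size_t start = i + 1 - length;
            for (std::size_t j = start; j <= i; ++j) space[j] = id;
            return static_cast<int>(start);
        }
    }
    return -1;
}

// Frees every cell owned by the given block id.
void release(Space& space, int id) {
    for (int& cell : space) {
        if (cell == id) cell = 0;
    }
}

// Total free inches, whether or not they are contiguous.
std::size_t freeTotal(const Space& space) {
    std::size_t n = 0;
    for (int cell : space) {
        if (cell == 0) ++n;
    }
    return n;
}

// Replays the walkthrough: 3" + 4" + 3", release the 4", place a 2",
// then ask for 3". Returns true if 4" are free but the 3" request fails.
bool replayExample() {
    Space s{};        // all cells start free
    place(s, 1, 3);   // occupies inches 0-2
    place(s, 2, 4);   // occupies inches 3-6
    place(s, 3, 3);   // occupies inches 7-9
    release(s, 2);    // inches 3-6 are free again
    place(s, 4, 2);   // occupies inches 3-4
    return freeTotal(s) == 4 && place(s, 5, 3) == -1;
}
```

The point to take away is that `freeTotal` reports 4" free while the 3" request still fails, which is why a "free RAM" number alone cannot rule fragmentation in or out.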
Right, no garbage collection in C++. My bad, I certainly understand the difference. The last time I did any serious programming on an 8K RAM machine it was all in assembly and nothing like an ATMega.
Just bitten by the firmware… Best to start another topic.
Kinematics.c does a lot of printing, that might be a place to look for ways to optimize.
Does printing result in stack fragmentation? I thought it was just string operations like copy or append. I’ve been thinking that this is the issue we’re probably dealing with, but I could be wrong because my attempts to fix it haven’t had any effect so far today.
The test file above has been very effective at making the problem replicate for me:
Printing does a lot of string operations.
Did you ever do the reserved space for the strings for g-code parsing?
I did try the reserved space thing. It didn’t seem to help, but I’ve just merged those changes in because it seems like a good thing to do anyway
FWIW this page gives some info on saving sram with strings.
I remember something about const static keeping strings out of data space, but it must not work in Arduino land.
Long long ago one of our programmers left a lot of debug printf’s in an IBM host channel interface controller, which cut the channel speed way down while it waited for the characters to clock out the (no connected terminal) serial port at 9600bps. Do they make a noticeable impact on Arduinos?
So here are some things I’ve learned, I’m not sure how they all connect
1. Pressing the “Stop” button seems to fix the problem
2. This branch here will print the available RAM continuously, and we’re not even using 1/3 of the available RAM at this point, which makes fragmentation less likely
3. It seems to be related to how long the program has been running or the number of lines which have been executed, not any particular lines. Fast forwarding to a place where the trouble started before doesn’t seem to make it start again, but waiting the same amount of time (roughly) will cause the jittery behavior to start again
I’m not sure how all these fit together yet
Could we try building the output message in a character array using C++ functions and using one print statement to send that? The String class is known for profligate heap use and I’ve read somewhere that printing floats is particularly guilty.
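A rough sketch of what that could look like, written as plain C++ so it can be tried off the board. Everything here is an assumption for illustration: the `pos:` message layout, the function names, and the choice to carry positions as integer thousandths instead of floats. The integer route is used because the default avr-libc printf family does not format `%f` unless the float-enabled variant is linked in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Appends "<label><value>" to the NUL-terminated text already in buf.
// The value is given in thousandths and written as [-]I.FFF, so no
// floating-point formatting is needed. Returns false (and leaves buf
// unchanged) if the result would not fit in size bytes.
bool appendMilli(char* buf, std::size_t size, const char* label, long milli) {
    const std::size_t len = std::strlen(buf);
    const std::size_t room = size - len;  // includes the terminator
    const char* sign = (milli < 0) ? "-" : "";
    const unsigned long mag = (milli < 0)
        ? 0UL - static_cast<unsigned long>(milli)
        : static_cast<unsigned long>(milli);
    const int n = std::snprintf(buf + len, room, "%s%s%lu.%03lu",
                                label, sign, mag / 1000, mag % 1000);
    if (n < 0 || static_cast<std::size_t>(n) >= room) {
        buf[len] = '\0';  // roll back the partial write
        return false;
    }
    return true;
}

// Builds one complete message in a caller-supplied buffer so that it can
// be sent with a single print call. No heap allocation is involved.
bool formatPosition(char* buf, std::size_t size, long x, long y, long z) {
    assert(buf != nullptr && size > 0);
    buf[0] = '\0';
    return appendMilli(buf, size, "pos:", x) &&
           appendMilli(buf, size, ",", y) &&
           appendMilli(buf, size, ",", z);
}
```

On the Arduino side the buffer would be a fixed `char` array (global or on the stack) and the whole line would go out with one `Serial.println(buf)`, instead of several prints that each build temporary strings.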
This thread discusses a tool to inspect the heap that might be useful.
That’s a great tool! Let’s see what it tells us. Thanks for sharing, I’ve been looking for exactly that but didn’t find it
Is that the version of the MemoryFree library that allows walking the free list? If so, you don’t need to change libraries … That would give a feeling for fragmentation… Perhaps implement a temporary Bxx gcode to trigger a printout of the free list that could be slipped into a gcode file to debug?
How many prints are relevant for normal, i.e. non developer, users? Can you just ifdef them out in released code?
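In case it helps, the usual shape of that is a macro that compiles to nothing unless a build flag is set. This is only a sketch: the flag name `DEBUG_PRINTS`, the `DEBUG_LOG` macro, and the `executeLine` stand-in are all made up, and on the Arduino the macro body would call `Serial.println` instead of writing to `stderr`.

```cpp
#include <cassert>
#include <cstdio>

// Developer builds define DEBUG_PRINTS (for example with -DDEBUG_PRINTS).
// Release builds leave it undefined, so every DEBUG_LOG line disappears
// at compile time and costs no RAM, flash, or serial time.
#ifdef DEBUG_PRINTS
#define DEBUG_LOG(msg) std::fputs(msg "\n", stderr)
#else
#define DEBUG_LOG(msg) ((void)0)
#endif

// Hypothetical stand-in for a routine that handles one g-code line.
// Returns the running count of non-empty lines executed so far.
int executeLine(const char* line, int linesSoFar) {
    assert(line != nullptr);
    DEBUG_LOG("executeLine: start");
    if (line[0] == '\0') {
        DEBUG_LOG("executeLine: skipped empty line");
        return linesSoFar;
    }
    DEBUG_LOG("executeLine: done");
    return linesSoFar + 1;
}
```

This only works for messages nobody downstream depends on; anything the host program parses has to stay in the release build.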
@blurfl I love the idea of slipping in a temp Bxx code to let us interrogate the state of things and give us a sense of how fragmented things are.
The version of MemoryFree I was using just prints the total memory that is free, but if there is a version which would let us walk the free list that could tell us more. The tool to inspect the heap looks really promising to me.
If you make progress on this front, let me know. I’d love to be able to build off of your work in the morning. If not this sounds like exactly the place I want to start so thanks for the advice!
@mooselake Unfortunately, they’re all relevant for normal users. The ones you don’t recognize are when the machine prints its position and positional error. Those are masked by Ground Control and used to move the cross hairs on the screen. If we can prove that heap fragmentation is the issue, and that it’s caused by printing, we can almost certainly find a way to achieve the same functionality without causing fragmentation
Unless Arduino does things very differently, the only thing in this firmware that could be causing heap fragmentation is strings. Otherwise I don’t find any other use of dynamic memory allocation. i.e. there are no uses of “new” or “malloc” that I can find. So almost all of the elements allocated are on the stack.
I don’t think this is a heap problem. You preallocate the two main strings that are dynamically changed (readycommandstring and gcodeline), so unless they are exceeding the reserved string length (could they be?) then the two strings of size 128 aren’t likely an issue.
Incidentally there is an unused String variable named “temps” in the ringbuffer class that can be deleted. I don’t have git loaded on the machine I’m on, so I can’t make the change just now.
From my experience, I think chasing the “heap” isn’t going to yield much, but I’ve been wrong before. Will keep looking.
One way to be sure is to replace the use of the String class with good old fashioned C-style strings. No dynamically allocated strings means no heap problems. See here for info if you’re not familiar.
Here is another very good tutorial for those who missed the glory days of ‘C’ programming.
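To make the C-style string suggestion a bit more concrete, here is a minimal sketch of a fixed-size line buffer that could stand in for a String when collecting a g-code line one character at a time. It is not taken from the firmware: `LineBuffer`, `feedChar`, and the other names are hypothetical, and the 128-byte capacity just mirrors the reserved size mentioned earlier in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Capacity includes the terminating '\0', so 127 characters fit.
constexpr std::size_t kLineCapacity = 128;

// Fixed storage for one incoming line. Nothing here touches the heap.
struct LineBuffer {
    char text[kLineCapacity];
    std::size_t length;
};

enum class Feed { Pending, LineReady, Overflow };

// Empties the buffer so the next line can be collected.
void clearLine(LineBuffer& line) {
    line.length = 0;
    line.text[0] = '\0';
}

// Feeds one received character into the buffer.
// '\n' ends the line, '\r' is ignored, and a character that would not
// fit is dropped and reported as Overflow instead of growing the buffer.
Feed feedChar(LineBuffer& line, char c) {
    if (c == '\n') return Feed::LineReady;
    if (c == '\r') return Feed::Pending;
    if (line.length + 1 >= kLineCapacity) return Feed::Overflow;
    line.text[line.length++] = c;
    line.text[line.length] = '\0';  // keep the text terminated at all times
    return Feed::Pending;
}

// True if the collected line begins with the given prefix, e.g. "B12".
bool lineStartsWith(const LineBuffer& line, const char* prefix) {
    assert(prefix != nullptr);
    return std::strncmp(line.text, prefix, std::strlen(prefix)) == 0;
}
```

On the board, each byte returned by `Serial.read()` would be passed to `feedChar`, and the buffer would be cleared with `clearLine` once a `LineReady` line has been handled. The trade-off against String is that an over-long line gets flagged and truncated instead of silently reallocating.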
I’ll give it a look tonight
@mooselake, your suggestion is interesting - we could test that by temporarily disabling the print and foregoing the cursor move to see if that affects this issue.
PR#272 in Firmware adds two Bxx codes to put heap info into the log:
- B12 executes Avrheap.freeListWalk
- B13 executes Avrheap.dumpHeap
I didn’t do anything with the strings.
Running the “sled abbreviated” file a few times left me wondering whether it might be an issue in the PID routines. PID winding out beyond limits? Swamp rats? Is there an easy way to switch it on and off? Do we need PID on the z-axis with the current motor/gearbox?
Bar wrote "1. Pressing the “Stop” button seems to fix the problem"
Could it be a serial communications buffer problem, since pressing the “stop” button would stop and reset the serial communications?