tests/sys/cpp11_thread: use less stack for boards with low RAM memory#22417
tests/sys/cpp11_thread: use less stack for boards with low RAM memory#22417crasbe wants to merge 2 commits into
Conversation
|
You are sending PRs by the minute today gawd damn 🤠 |
| # Boards with little RAM (~8KB) can't handle threads with the large default | ||
| # stack size. | ||
| # Beware: The reduced stack size will also suppress the thread stack usage | ||
| # printout! | ||
| LOW_MEMORY_BOARDS := \ | ||
| nucleo-c031c6 \ | ||
| nucleo-f042k6 \ | ||
| nucleo-g031k8 \ | ||
| nucleo-l031k6 \ | ||
| stm32g0316-disco \ | ||
| weact-g030f6 \ | ||
| # | ||
|
|
||
| ifneq (,$(filter $(BOARD),$(LOW_MEMORY_BOARDS))) | ||
| $(shell $(COLOR_ECHO) "$(COLOR_YELLOW)Using a reduced stack size for low" \ | ||
| "memory board \"$(BOARD)\"!$(COLOR_RESET)" 1>&2) | ||
|
|
||
| CFLAGS += -DTHREAD_STACKSIZE_MAIN=512 | ||
| DISABLE_MODULE += test_utils_print_stack_usage | ||
| endif |
There was a problem hiding this comment.
Could this somehow be checked in code by checking for the define? I'm generally fine with this but idk how other people stand on doing it like this
There was a problem hiding this comment.
Something like this?
#if RAM_LEN <= 8096
# if THREAD_STACKSIZE_MAIN > 512
# error "A stack size of more than 512 bytes might lead to a memory overflow!" \
"Consider adding the board to the LOW_MEMORY_BOARDS list in the Makefile."
# endif
#endifDisclaimer: Haven't tested it yet.
There was a problem hiding this comment.
Oh yeah that as well but I meant that in the sense of simply changing the THREAD_STACKSIZE_MAIN in the header, e.g.
#ifndef THREAD_STACKSIZE_MAIN
# define THREAD_STACKSIZE_MAIN 512
#endifso the stack size reduction is closer to the boards itself, I feel like doing this externally might make it harder to spot where the number suddenly changes
maribu
left a comment
There was a problem hiding this comment.
No, let's not do this.
The test is not really testing the board, but the CPU architecture and toolchain. If one Cortex M33 board works, all other will also. It IMO just not worth spending code end effort in making this test runnable on more boards.
There also is the issue that hard coded stack sizes don't really work. On AVR 512 B is plenty, as it is rather stack efficient. On native64, anything below 4096 will run into a stack overflow. We could try using THREAD_STACKSIZE_SMALL instead. But again, why bother if this test doesn't work on some boards, as long as we have at least one board per family supported?
I also fear that some people would read the code here that stack sizes do relate to available RAM. But stack size should only chosen based on what the code needs, regardless of what board we have.
I don't think this is a good way of looking at it. If someone ports a board and runs the compile and test script and a test fails, they'd rightfully assume that something is wrong with the port. Relying on tribal knowledge of some maintainers that certain tests just don't work on some boards for some reason is bad, as that knowledge is quickly lost.
Well the used stack according to the print_whatever module is only 136 Bytes or so. Reducing the stack size to anything less than 512 Bytes will lead to a crash though. If the proposed solution is not acceptable, I would at least want to have the boards with little RAM on the board blacklist (or what is it called now? I don't remember) and an explanation why that is. Although I do think that maybe the #if-#error thing wouldn't be so bad. That would at least point someone who's porting porting a board to the board blacklist when the RAM length is insufficient. Mind you, this took me 1-2 hours to debug and find out. |
|
What exactly is the failure mode here? Does it compile and then fail at the linking stage due to e.g. A failure at runtime is not expected here. |
It compiles successfully and fails at runtime, see the logs in the initial message. |
|
OK, but then just making more RAM available won't fix the underlying issue, it will just cause some boards to no longer be affected by it. |
|
Who knows whether or not that's the expected behavior. There's no documentation. Some tests say that they're expected to fail, so who knows 🤷 |
|
The issue here is that the constructor of /**
* @brief Holds context data for the thread.
*/
struct thread_data {
thread_data() : ref_count{2}, joining_thread{KERNEL_PID_UNDEF} {
// nop
}
/** @cond INTERNAL */
std::atomic<unsigned> ref_count;
kernel_pid_t joining_thread;
std::array<char, THREAD_STACKSIZE_MAIN> stack;
/** @endcond */
};There are a number of red flags in the code here:
IMO we can just remove the whole C++ thread API and call it a day. I think nobody uses this anyway. I'm sure nobody should use it, looking at the code. |
|
I don't have any hardware at hand. Could you test #22418 and see if the failure mode is a bit more boring with that? I still believe dropping the C++ |
Contribution description
Following #22404, I ran
dist/tools/compile_and_test_for_board/compile_and_test_for_board.pyand noticed thattests/sys/cpp11_threadwould crash. The reason is that the amount of stack used just exceeds the RAM available in the microcontroller.Instead of reducing it globally, which would disable printing the stack usage for all microcontrollers, I adapted the
LOW_MEMORY_BOARDSvariable known from other tests and examples.To get the additional boards that work now which didn't before, I set the stack to 512 bytes and ran
make generate-Makefile.ciand copied the boards over afterwards.It is possible that there are boards that would fail the test because they have enough memory for the static compilation check but not for the actual execution though, but without testing them it's kinda hard to know.
One way to find out would be to check the
RAM_LENsomewhere and compile for all boards and add the ones that have <16KB of RAM, but I didn't want to go that far...Testing procedure
Unfortunately I don't have all the hardware for this, but now it works with the
nucleo-g031k8where it previously crashed.With #22404 applied, this is the result from
master:With #22404 applied and this PR:
Issues/PRs references
Declaration of AI-Tools / LLMs usage:
AI-Tools / LLMs that were used are: