[minicoredumper] stack detection fails with kernel >= v5.18

Holger Brunck holger.brunck at hitachienergy.com
Wed Sep 27 13:59:54 CEST 2023


> > I am currently trying to integrate minicoredumper into our embedded SW.
> > I am using the latest version 2.06. With kernel 5.4.x it works pretty
> > well. But with boards using kernel 6.1.x I saw problems. The stacks
> > were missing in the generated core file from the minicoredumper.
> >
> > Our application is multithreaded and minicoredumper reports when
> > processing the core file:
> >
> > Aug 15 08:18:45 unit user.err minicoredumper: unable to find thread #14's (386) stack
> > Aug 15 08:18:45 unit user.err minicoredumper: unable to find thread #15's (387) stack
> > Aug 15 08:18:45 unit user.err minicoredumper: unable to find thread #16's (388) stack
> > Aug 15 08:18:45 unit user.err minicoredumper: unable to find thread #17's (389) stack
> >
> > I was able to reproduce this problem in an x86 QEMU environment with a
> > mainline kernel and parts of our application code. After bisecting the
> > kernel I saw that this was introduced with kernel v5.18 due to the
> > following commit:
> >
> > 7b1b610f  coredump:  Don't perform any cleanups before dumping core
> 
> Yes, that commit broke /proc/PID/stat. The commit does not properly take the
> non-crashing threads into account. It has been on my TODO list for a while to
> post a proper fix. I would also like to add kernel tests because this is not the first
> time that a developer breaks /proc/PID/stat.
> 
> > Btw when using the regular core file gdb is able to show the stacks
> > from the threads as expected.
> 
> gdb manually parses the dump information to retrieve the stack pointers. This is
> quite complicated because different architectures do things differently.
> 
> The minicoredumper project has no interest in implementing all that. (It has
> been suggested in the past [0].) Instead, /proc/PID/stat is used, which already
> provides that information. However, that information is only available for tasks
> that are no longer executing (are shutting down or have crashed). And that is
> what is currently broken in mainline.
> 

Thanks for the explanation.
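
For reference, the /proc lookup described above can be sketched roughly as
follows. This is only an illustration in Python, not code taken from
minicoredumper (which is written in C). The function names are made up for
this example. The field positions (29 = kstkesp, 30 = kstkeip) follow the
proc(5) man page. As explained above, the values are only meaningful for
tasks that are no longer executing; for running tasks they typically read
as 0.

```python
import os


def parse_stat_stack_pointer(stat_line):
    """Return (tid, kstkesp, kstkeip) from one line of a /proc stat file.

    Hypothetical helper for illustration only.
    """
    # Field 2 (comm) is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so split on the LAST ')'.
    head, _, tail = stat_line.rpartition(")")
    tid = int(head.split("(", 1)[0])
    fields = tail.split()
    # fields[0] is field 3 (state), so field N is found at index N - 3.
    kstkesp = int(fields[29 - 3])  # stack pointer
    kstkeip = int(fields[30 - 3])  # instruction pointer
    return tid, kstkesp, kstkeip


def thread_stack_pointers(pid):
    """Map each thread ID of `pid` to its reported stack pointer.

    A value of 0 means the kernel did not expose the stack pointer
    for that thread.
    """
    result = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                _, sp, _ = parse_stat_stack_pointer(f.read())
        except (OSError, ValueError, IndexError):
            continue  # thread vanished or the line was not parseable
        result[int(tid)] = sp
    return result
```

With the regression discussed in this thread, a lookup of this kind would
presumably come back without a usable stack pointer for the non-crashing
threads, which matches the "unable to find thread ... stack" messages.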

> I have attached a workaround-patch for the kernel, that seems to fix the issue.

Yes, this patch does solve the problems in my setup, thanks.

Best regards
Holger


