Discussion:
[Crash-utility] [PATCH] cmdline: Add a new "--machdep stacksize=<value>".
Sean Fu
2018-09-29 09:09:57 UTC
Implemented support for 16k stack size that was introduced by commit
6538b8ea886e472f4431db8ca1d60478f838d14b titled "x86_64: expand kernel
stack to 16K".
Without the patch, kernels that have a 16k stack cause errors in commands
such as "bt" and in any other command that assumes an 8K stack.
Add a new "--machdep stacksize=<value>" option that can be used to
override the default machdep->stacksize value, which is 8k.

Signed-off-by: Sean Fu <***@gmail.com>
---
x86_64.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/x86_64.c b/x86_64.c
index 7d01140..1798f05 100644
--- a/x86_64.c
+++ b/x86_64.c
@@ -5716,6 +5716,15 @@ parse_cmdline_args(void)
 					continue;
 				}
 			}
+		} else if (STRNEQ(arglist[i], "stacksize=")) {
+			p = arglist[i] + strlen("stacksize=");
+			if (strlen(p)) {
+				value = stol(p, RETURN_ON_ERROR|QUIET, &errflag);
+				if (!errflag) {
+					machdep->stacksize = value;
+					continue;
+				}
+			}
 		}
 
 		error(WARNING, "ignoring --machdep option: %s\n", arglist[i]);
--
2.6.2
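
The patch stores whatever number follows "stacksize=" without checking it. As a minimal sketch of the same parsing step with validation added, the helper below uses the standard strtoul() instead of crash's stol(). The function name parse_stacksize_arg and the power-of-two rule are assumptions made for illustration; they are not part of the patch or of crash.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of crash): parse "stacksize=<value>" and
 * accept only a non-zero power of two. Returns 0 on success, -1 otherwise.
 */
int parse_stacksize_arg(const char *arg, unsigned long *out)
{
	static const char prefix[] = "stacksize=";
	const size_t plen = sizeof(prefix) - 1;
	unsigned long value;
	char *end;

	assert(arg != NULL && out != NULL);

	if (strncmp(arg, prefix, plen) != 0)
		return -1;		/* not this option */
	if (arg[plen] == '\0' || arg[plen] == '-')
		return -1;		/* empty or negative value */

	errno = 0;
	value = strtoul(arg + plen, &end, 0);	/* base 0: decimal, 0x hex, 0 octal */
	if (errno != 0 || *end != '\0')
		return -1;		/* overflow or trailing junk */
	if (value == 0 || (value & (value - 1)) != 0)
		return -1;		/* assumed rule: sizes are powers of two */

	*out = value;
	return 0;
}
```

With this sketch, "stacksize=16384" and "stacksize=0x4000" are both accepted as 16384, while "stacksize=", "stacksize=12000" and "stacksize=16k" are rejected.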
Dominique Martinet
2018-10-01 13:35:11 UTC
Post by Sean Fu
Implemented support for 16k stack size that was introduced by commit
6538b8ea886e472f4431db8ca1d60478f838d14b titled "x86_64: expand kernel
stack to 16K".
Without the patch, kernels that have a 16k stack cause errors in commands
such as "bt" and in any other command that assumes an 8K stack.
Add a new "--machdep stacksize=<value>" option that can be used to
override the default machdep->stacksize value, which is 8k.
Instead of making that an option, it should be possible to autodetect this
by looking at the __start_init_task / __end_init_task symbols; the
difference between them should be the proper stack size. (The symbols have
only been around since 91ed140d6c1e168b11bbbddac4f6066f40a0c6b5 in 4.7, so
that might not be old enough for you, as your commit dates back to 3.15.
There might be other methods of getting the stack size that I haven't
thought of; I only grepped in a recent kernel.)
--
Dominique Martinet
Dave Anderson
2018-10-01 13:37:10 UTC
----- Original Message -----
Post by Sean Fu
Implemented support for 16k stack size that was introduced by commit
6538b8ea886e472f4431db8ca1d60478f838d14b titled "x86_64: expand kernel
stack to 16K".
Without the patch, kernels that have a 16k stack cause errors in commands
such as "bt" and in any other command that assumes an 8K stack.
Add a new "--machdep stacksize=<value>" option that can be used to
override the default machdep->stacksize value, which is 8k.
The x86_64 default value of 8K is basically a leftover value that each of
the architectures originally used for setting machdep->stacksize. But for
quite some time now, those values should get overridden later on here
in task_init():

        STRUCT_SIZE_INIT(task_union, "task_union");
        STRUCT_SIZE_INIT(thread_union, "thread_union");

        if (VALID_SIZE(task_union) && (SIZE(task_union) != STACKSIZE())) {
                error(WARNING, "\nnon-standard stack size: %ld\n",
                        len = SIZE(task_union));
                machdep->stacksize = len;
        } else if (VALID_SIZE(thread_union) &&
            ((len = SIZE(thread_union)) != STACKSIZE())) {
                machdep->stacksize = len;
        } else if (!VALID_SIZE(thread_union) && !VALID_SIZE(task_union)) {
                if (kernel_symbol_exists("__start_init_task") &&
                    kernel_symbol_exists("__end_init_task")) {
                        len = symbol_value("__end_init_task");
                        len -= symbol_value("__start_init_task");
                        ASSIGN_SIZE(thread_union) = len;
                        machdep->stacksize = len;
                }
        }

As of Linux 4.18 at least, x86_64 still uses the thread_union declaration.
For example:

crash> thread_union
union thread_union {
    struct task_struct task;
    unsigned long stack[2048];
}
SIZE: 16384
crash>

On what kernel version are you seeing the obsolete 8k stacksize being used?
What does the command above show on your system?

Thanks,
Dave
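
The "SIZE: 16384" in the output above follows from the stack[2048] member: 2048 entries of 8 bytes each. A small sketch of that arithmetic, assuming 4 KiB pages and a THREAD_SIZE_ORDER of 2 (x86_64 without KASAN); the *_ASSUMED names and the model union are illustrative and use uint64_t to stand in for the 8-byte unsigned long:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed configuration: 4 KiB pages, stack order 2 (x86_64, no KASAN). */
#define PAGE_SIZE_ASSUMED		4096UL
#define THREAD_SIZE_ORDER_ASSUMED	2
#define THREAD_SIZE_ASSUMED	(PAGE_SIZE_ASSUMED << THREAD_SIZE_ORDER_ASSUMED)

/* Stand-in for the kernel union; only the stack member matters for size. */
union thread_union_model {
	uint64_t stack[THREAD_SIZE_ASSUMED / sizeof(uint64_t)];
};

/* Compile-time checks: 4096 << 2 is 16 KiB, matching the crash output. */
static_assert(THREAD_SIZE_ASSUMED == 16384, "expected a 16 KiB stack");
static_assert(sizeof(union thread_union_model) == 16384,
	      "model should match SIZE: 16384");

/* Number of 8-byte slots in the modelled stack array. */
size_t thread_union_model_entries(void)
{
	return sizeof(union thread_union_model) / sizeof(uint64_t);
}
```

Under these assumptions thread_union_model_entries() returns 2048, the array length shown by crash.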
Sean Fu
2018-10-09 06:59:26 UTC
Post by Dave Anderson
The x86_64 default value of 8K is basically a leftover value that each of
the architectures originally used for setting machdep->stacksize. But for
quite some time now, those values should get overridden later on here
in task_init():

        STRUCT_SIZE_INIT(task_union, "task_union");
        STRUCT_SIZE_INIT(thread_union, "thread_union");

        if (VALID_SIZE(task_union) && (SIZE(task_union) != STACKSIZE())) {
                error(WARNING, "\nnon-standard stack size: %ld\n",
                        len = SIZE(task_union));
                machdep->stacksize = len;
        } else if (VALID_SIZE(thread_union) &&
            ((len = SIZE(thread_union)) != STACKSIZE())) {
                machdep->stacksize = len;
        } else if (!VALID_SIZE(thread_union) && !VALID_SIZE(task_union)) {
                if (kernel_symbol_exists("__start_init_task") &&
                    kernel_symbol_exists("__end_init_task")) {
                        len = symbol_value("__end_init_task");
                        len -= symbol_value("__start_init_task");
                        ASSIGN_SIZE(thread_union) = len;
                        machdep->stacksize = len;
                }
        }
I compiled the latest kernel and the latest crash, and ran a qemu guest
machine with the newly compiled kernel image.
In this case, STRUCT_SIZE_INIT initialized size_table.task_union and
size_table.thread_union with -1, so machdep->stacksize did NOT get
overridden.
Post by Dave Anderson
As of Linux 4.18 at least, x86_64 still uses the thread_union declaration.
crash> thread_union
union thread_union {
struct task_struct task;
unsigned long stack[2048];
}
SIZE: 16384
crash>
On what kernel version are you seeing the obsolete 8k stacksize being used?
What does the command above show on your system?
The kernel version is upstream Linux 4.18 (commit 94710cac0ef4ee177a63b5227664b38c95bbf703)
(git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git).

The "bt" command in crash shows "bt: invalid RSP: ffffc9000069bc08
bt->stackbase/stacktop: ffffc90000698000/ffffc9000069a000 cpu: 0".

Best regards
Sean
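
The numbers in that error message show the problem directly: stacktop minus stackbase is 0x2000 (8K), and the reported RSP lies above stacktop. A minimal sketch of the range check, assuming the stack is the half-open interval [stackbase, stackbase + stacksize); the function name rsp_in_stack is hypothetical and only models the test, it is not crash's code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the check behind "bt: invalid RSP": the stack
 * pointer must lie inside [stackbase, stackbase + stacksize).
 */
bool rsp_in_stack(uint64_t rsp, uint64_t stackbase, uint64_t stacksize)
{
	uint64_t stacktop = stackbase + stacksize;

	return rsp >= stackbase && rsp < stacktop;
}
```

With rsp = 0xffffc9000069bc08 and stackbase = 0xffffc90000698000, a stacksize of 8192 gives stacktop 0xffffc9000069a000 and the check fails, while 16384 gives stacktop 0xffffc9000069c000 and the check passes.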
Dave Anderson
2018-10-09 13:39:10 UTC
----- Original Message -----
Post by Sean Fu
I compiled the latest kernel and the latest crash, and ran a qemu guest
machine with the newly compiled kernel image.
In this case, STRUCT_SIZE_INIT initialized size_table.task_union and
size_table.thread_union with -1. So machdep->stacksize did NOT get
overridden.
The kernel version is upstream Linux 4.18
(commit 94710cac0ef4ee177a63b5227664b38c95bbf703)
(git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git).
The "bt" command in crash shows "bt: invalid RSP: ffffc9000069bc08
bt->stackbase/stacktop: ffffc90000698000/ffffc9000069a000 cpu: 0".
Best regards
Sean
Ok, the most recent 4.18 kernel I have on hand is this one:

crash> sys | grep RELEASE
RELEASE: 4.18.0-20.el8.x86_64
crash>

and its debuginfo data contains the "thread_union" information:

crash> thread_union
union thread_union {
    struct task_struct task;
    unsigned long stack[2048];
}
SIZE: 16384
crash>

but if it did not, then the code should calculate the stack
size from the difference between the "__start_init_task" and
"__end_init_task" symbols:

crash> sym __start_init_task
ffffffffa7800000 (D) __start_init_task
crash> sym __end_init_task
ffffffffa7804000 (D) __end_init_task
crash>

Does your kernel not show/contain those 2 symbols?

Dave
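
For the two addresses shown above, the difference is 0x4000, i.e. 16384 bytes. A tiny sketch of that subtraction; the function name init_task_span and the "return 0 when the symbols are out of order" convention are assumptions for illustration:

```c
#include <stdint.h>

/*
 * Hypothetical helper: size of the region between the two linker symbols.
 * Returns 0 if the end address is not above the start address.
 */
uint64_t init_task_span(uint64_t start_init_task, uint64_t end_init_task)
{
	if (end_init_task <= start_init_task)
		return 0;
	return end_init_task - start_init_task;
}
```

Called with 0xffffffffa7800000 and 0xffffffffa7804000, it yields 16384, which matches the thread_union size reported by the debuginfo.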
Sean Fu
2018-10-10 04:03:22 UTC
Post by Dave Anderson
Does your kernel not show/contain those 2 symbols?
Sure, my test kernel contains these 2 symbols.

crash> sys | grep RELEASE
RELEASE: 4.18.0-1-default+
crash> thread_union
crash: command not found: thread_union
crash> struct thread_union
struct: invalid data structure reference: thread_union
crash> sym __start_init_task
ffffffff82000000 (D) __start_init_task
crash> sym __end_init_task
ffffffff82004000 (D) __end_init_task

I agree with you: automatically calculating the stack size from the difference
between __start_init_task and __end_init_task would be better.
I think the calculation and assignment of the stack size should be added to
"x86_64_init". Do you agree?

Thanks
Sean
Dave Anderson
2018-10-10 13:45:48 UTC
----- Original Message -----
Post by Sean Fu
Sure, my test kernel contains these 2 symbols.
crash> sys | grep RELEASE
RELEASE: 4.18.0-1-default+
crash> thread_union
crash: command not found: thread_union
crash> struct thread_union
struct: invalid data structure reference: thread_union
crash> sym __start_init_task
ffffffff82000000 (D) __start_init_task
crash> sym __end_init_task
ffffffff82004000 (D) __end_init_task
I agree with you: automatically calculating the stack size from the difference
between __start_init_task and __end_init_task would be better.
I think the calculation and assignment of the stack size should be added to
"x86_64_init". Do you agree?
No, because the calculation is being done in an architecture-neutral manner
by task_init(), here in task.c:

        if (VALID_SIZE(task_union) && (SIZE(task_union) != STACKSIZE())) {
                error(WARNING, "\nnon-standard stack size: %ld\n",
                        len = SIZE(task_union));
                machdep->stacksize = len;
        } else if (VALID_SIZE(thread_union) &&
            ((len = SIZE(thread_union)) != STACKSIZE())) {
                machdep->stacksize = len;
        } else if (!VALID_SIZE(thread_union) && !VALID_SIZE(task_union)) {
                if (kernel_symbol_exists("__start_init_task") &&
                    kernel_symbol_exists("__end_init_task")) {
                        len = symbol_value("__end_init_task");
                        len -= symbol_value("__start_init_task");
                        ASSIGN_SIZE(thread_union) = len;
                        machdep->stacksize = len;
                }
        }

I see that "thread_union" is not found in your debuginfo data, but I don't understand
how your kernel gets past the second "else" segment above where the __start_init_task
and __end_init_task symbol values are checked.

Dave
Sean Fu
2018-10-11 04:39:14 UTC
Post by Dave Anderson
I see that "thread_union" is not found in your debuginfo data, but I don't understand
how your kernel gets past the second "else" segment above where the __start_init_task
and __end_init_task symbol values are checked.
Your code is different from mine. The following is from my task.c:

        if (VALID_SIZE(task_union) && (SIZE(task_union) != STACKSIZE())) {
                error(WARNING, "\nnon-standard stack size: %ld\n",
                        len = SIZE(task_union));
                machdep->stacksize = len;
        } else if (VALID_SIZE(thread_union) &&
            ((len = SIZE(thread_union)) != STACKSIZE()))
                machdep->stacksize = len;

        MEMBER_OFFSET_INIT(pid_namespace_idr, "pid_namespace", "idr");
        MEMBER_OFFSET_INIT(idr_idr_rt, "idr", "idr_rt");

My code repo:
***@linux-zmni:~/work/source/upstream/crash> git remote -v
origin https://github.com/crash-utility/crash.git (fetch)
origin https://github.com/crash-utility/crash.git (push)

What's your crash version?

Thanks
Sean
Dominique Martinet
2018-10-11 04:49:10 UTC
Post by Sean Fu
What's your crash version?
Your checkout is a bit old; the extra check was added back in April in
this commit:

commit 6088a29f7e4ad7160e757679827db63ea41553df
Author: Dave Anderson <***@redhat.com>
Date: Thu Apr 5 11:07:59 2018 -0400

Fix for the "bt" command on 4.16 and later kernels size in which the
"thread_union" data structure is not contained in the vmlinux file's
debuginfo data. Without the patch, the kernel stack size is not
calculated correctly, and defaults to 8K. As a result "bt" fails
with the message "bt: invalid RSP: <address> bt->stackbase/stacktop:
<address>/<address> cpu: <number>".
(***@gmx.de)

which is present in releases since 7.2.2 according to `git tag
--contains`, but development should probably always be done with an
updated tree.
--
Dominique
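
To make the difference between the two checkouts concrete, here is a simplified model of the selection logic discussed in this thread. It is a sketch, not crash's code: the struct, the function name pick_stacksize, and the with_symbol_fallback flag (standing in for "has the April commit") are all made up for illustration, and -1 models the value the size table holds when the debuginfo lacks a type.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_STACKSIZE	8192L	/* leftover x86_64 default */
#define SIZE_UNKNOWN		(-1L)	/* type missing from debuginfo */

/* Hypothetical inputs gathered from debuginfo and the symbol table. */
struct stack_inputs {
	long task_union_size;
	long thread_union_size;
	bool have_init_task_syms;
	uint64_t start_init_task;
	uint64_t end_init_task;
};

/* Model of the three-way choice; the third branch is the newer fallback. */
long pick_stacksize(const struct stack_inputs *in, bool with_symbol_fallback)
{
	long stacksize = DEFAULT_STACKSIZE;

	if (in->task_union_size > 0 && in->task_union_size != stacksize) {
		stacksize = in->task_union_size;
	} else if (in->thread_union_size > 0 &&
		   in->thread_union_size != stacksize) {
		stacksize = in->thread_union_size;
	} else if (with_symbol_fallback &&
		   in->task_union_size <= 0 && in->thread_union_size <= 0 &&
		   in->have_init_task_syms) {
		/* Neither union is known: use the symbol difference. */
		stacksize = (long)(in->end_init_task - in->start_init_task);
	}
	return stacksize;
}
```

Fed with both sizes unknown and the symbol values from the 4.18.0-1-default+ kernel above (0xffffffff82000000 and 0xffffffff82004000), the model stays at 8192 without the fallback and returns 16384 with it, which is consistent with "bt" failing on the old checkout only.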
Sean Fu
2018-10-25 02:50:39 UTC
Post by Dominique Martinet
Your checkout is a bit old; the extra check was added back in April in
commit 6088a29f7e4ad7160e757679827db63ea41553df
Correct. The new crash tool with this patch works fine on my machine.
Thanks