How to access glibc heap metadata
February 18, 2021
I had that exact requirement recently. I needed to restore the glibc heap from a core dump based on a new process. After restoring all original mappings, all it took was patching main_arena
.
How can the difference between two Linux core dumps be identified and why would this even come up? This is going to be lengthy, but will hopefully give you your answer to both of those questions.
Comparing two core dumps is only meaningful if they represent the same process at different points in time. If that’s the case, they could be thought of as process snapshots. Consider an application that triggers a segmentation fault after a random uptime. If the root cause is suspected to be memory corruption and post-mortem debugging does not provide any hints, it would be helpful to go back in time to inspect the memory state before the fatal error. In the best case, all of that should be done with minimal overhead because the issue only occurs in production in our thought experiment. Also, the actual memory locations of interest are unknown, so being able to visualize relevant memory changes before the fatal error would be desirable. A set of core dumps could provide simple low-overhead time travel debugging in respect to process memory.
In most real-world debugging scenarios involving memory corruption, a memory diff would be too large to be useful. In specific cases involving mostly read-only memory and a limited set of debugging alternatives, going the diff route might just be what’s needed to identify the constellation leading to the corruption. With all that talk about comparing two core dumps, How can this diff even be generated?
A core dump is represented by an ELF file that contains metadata and a specific set of memory regions (on Linux, this can be controlled via /proc/[pid]/coredump_filter
) that were mapped into the given process at the time of dump creation.
The obvious way to compare the dumps would be to compare a hex-representation:
$ diff -u <(hexdump -C dump1) <(hexdump -C dump2)
--- /dev/fd/63 2020-05-17 10:01:40.370524170 +0000
+++ /dev/fd/62 2020-05-17 10:01:40.370524170 +0000
@@ -90,8 +90,9 @@
000005e0 00 00 00 00 00 00 00 00 00 00 00 00 80 1f 00 00 |................|
000005f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
The result is rarely useful because you’re missing the context. More specifically, there’s no straightforward way to get from the offset of a value change in the file to the offset corresponding to the process virtual memory address space.
So, more context if needed. The optimal output would be a list of VM addresses including before and after values.
Before we can get on that, we need a test scenario to validate our comparison approach. The following sample includes a use-after-free memory issue that does not lead to a segmentation fault at first (a new allocation with the same size hides the issue). The idea here is to create a core dump using gdb (generate
) during each phase based on break points triggered by the code:
The sample code:
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
int **g_state;
int main()
{
int value = 1;
g_state = malloc(sizeof(int*));
*g_state = &value;
if (g_state && *g_state) {
printf("state: %d\n", **g_state);
}
printf("no corruption\n");
raise(SIGTRAP);
free(g_state);
char **unrelated = malloc(sizeof(int*));
*unrelated = "val";
if (g_state && *g_state) {
printf("state: %d\n", **g_state);
}
printf("use-after-free hidden by new allocation (invalid value)\n");
raise(SIGTRAP);
printf("use-after-free (segfault)\n");
free(unrelated);
int *unrelated2 = malloc(sizeof(intptr_t));
*unrelated2 = 1;
if (g_state && *g_state) {
printf("state: %d\n", **g_state);
}
return 0;
}
Now, the dumps can be generated:
Starting program: test
state: 1
no corruption
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffff7a488df in raise () from /lib64/libc.so.6
(gdb) generate dump1
Saved corefile dump1
(gdb) cont
Continuing.
state: 7102838
use-after-free hidden by new allocation (invalid value)
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffff7a488df in raise () from /lib64/libc.so.6
(gdb) generate dump2
Saved corefile dump2
(gdb) cont
Continuing.
use-after-free (segfault)
Program received signal SIGSEGV, Segmentation fault.
main () at test.c:31
31 printf("state: %d\n", **g_state);
(gdb) generate dump3
Saved corefile dump3
A quick manual inspection shows the relevant differences:
# dump1
(gdb) print g_state
$1 = (int **) 0x602260
(gdb) print *g_state
$2 = (int *) 0x7fffffffe2bc
# dump2
(gdb) print g_state
$1 = (int **) 0x602260
(gdb) print *g_state
$2 = (int *) 0x4008c1
# dump3
$2 = (int **) 0x602260
(gdb) print *g_state
$3 = (int *) 0x1
Based on that output, we can clearly see that *g_state
changed but is still a valid pointer in dump2
. In dump3
, the pointer becomes invalid. Of course, we’d like to automate this comparison.
Knowing that a core dump is an ELF file, we can simply parse it and generate a diff ourselves. What we’ll do:
PROGBITS
sections of the dumpBased on elf.h
, it’s relatively easy to parse ELF files. I created a sample implementation that compares two dumps and prints a diff that is similar to comparing two hexdump
outputs using diff
. The sample makes some assumptions (x86_64, mappings either match in terms of address and size or they only exist in dump1 or dump2), omits most error handling and always chooses a simple implementation approach for the sake of brevity.
#include <elf.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#define MAX_MAPPINGS 1024
struct dump
{
char *base;
Elf64_Shdr *mappings[MAX_MAPPINGS];
};
unsigned readdump(const char *path, struct dump *dump)
{
unsigned count = 0;
int fd = open(path, O_RDONLY);
if (fd != -1) {
struct stat stat;
fstat(fd, &stat);
dump->base = mmap(NULL, stat.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
Elf64_Ehdr *header = (Elf64_Ehdr *)dump->base;
Elf64_Shdr *secs = (Elf64_Shdr*)(dump->base+header->e_shoff);
for (unsigned secinx = 0; secinx < header->e_shnum; secinx++) {
if (secs[secinx].sh_type == SHT_PROGBITS) {
if (count == MAX_MAPPINGS) {
count = 0;
break;
}
dump->mappings[count] = &secs[secinx];
count++;
}
}
dump->mappings[count] = NULL;
}
return count;
}
#define DIFFWINDOW 16
void printsection(struct dump *dump, Elf64_Shdr *sec, const char mode,
unsigned offset, unsigned sizelimit)
{
unsigned char *data = (unsigned char *)(dump->base+sec->sh_offset);
uintptr_t addr = sec->sh_addr+offset;
unsigned size = sec->sh_size;
data += offset;
if (sizelimit) {
size = sizelimit;
}
unsigned start = 0;
for (unsigned i = 0; i < size; i++) {
if (i%DIFFWINDOW == 0) {
printf("%c%016x ", mode, addr+i);
start = i;
}
printf(" %02x", data[i]);
if ((i+1)%DIFFWINDOW == 0 || i + 1 == size) {
printf(" [");
for (unsigned j = start; j <= i; j++) {
putchar((data[j] >= 32 && data[j] < 127)?data[j]:'.');
}
printf("]\n");
}
addr++;
}
}
void printdiff(struct dump *dump1, Elf64_Shdr *sec1,
struct dump *dump2, Elf64_Shdr *sec2)
{
unsigned char *data1 = (unsigned char *)(dump1->base+sec1->sh_offset);
unsigned char *data2 = (unsigned char *)(dump2->base+sec2->sh_offset);
unsigned difffound = 0;
unsigned start = 0;
for (unsigned i = 0; i < sec1->sh_size; i++) {
if (i%DIFFWINDOW == 0) {
start = i;
difffound = 0;
}
if (!difffound && data1[i] != data2[i]) {
difffound = 1;
}
if ((i+1)%DIFFWINDOW == 0 || i + 1 == sec1->sh_size) {
if (difffound) {
printsection(dump1, sec1, '-', start, DIFFWINDOW);
printsection(dump2, sec2, '+', start, DIFFWINDOW);
}
}
}
}
int main(int argc, char **argv)
{
if (argc != 3) {
fprintf(stderr, "Usage: compare DUMP1 DUMP2\n");
return 1;
}
struct dump dump1;
struct dump dump2;
if (readdump(argv[1], &dump1) == 0 ||
readdump(argv[2], &dump2) == 0) {
fprintf(stderr, "Failed to read dumps\n");
return 1;
}
unsigned sinx1 = 0;
unsigned sinx2 = 0;
while (dump1.mappings[sinx1] || dump2.mappings[sinx2]) {
Elf64_Shdr *sec1 = dump1.mappings[sinx1];
Elf64_Shdr *sec2 = dump2.mappings[sinx2];
if (sec1 && sec2) {
if (sec1->sh_addr == sec2->sh_addr) {
// in both
printdiff(&dump1, sec1, &dump2, sec2);
sinx1++;
sinx2++;
}
else if (sec1->sh_addr < sec2->sh_addr) {
// in 1, not 2
printsection(&dump1, sec1, '-', 0, 0);
sinx1++;
}
else {
// in 2, not 1
printsection(&dump2, sec2, '+', 0, 0);
sinx2++;
}
}
else if (sec1) {
// in 1, not 2
printsection(&dump1, sec1, '-', 0, 0);
sinx1++;
}
else {
// in 2, not 1
printsection(&dump2, sec2, '+', 0, 0);
sinx2++;
}
}
return 0;
}
With the sample implementation, we can re-evaluate our scenario above. A excerpt from the first diff:
$ ./compare dump1 dump2
-0000000000601020 86 05 40 00 00 00 00 00 50 3e a8 f7 ff 7f 00 00 [..@.....P>......]
+0000000000601020 00 6f a9 f7 ff 7f 00 00 50 3e a8 f7 ff 7f 00 00 [.o......P>......]
-0000000000602260 bc e2 ff ff ff 7f 00 00 00 00 00 00 00 00 00 00 [................]
+0000000000602260 c1 08 40 00 00 00 00 00 00 00 00 00 00 00 00 00 [..@.............]
-0000000000602280 6e 6f 20 63 6f 72 72 75 70 74 69 6f 6e 0a 00 00 [no corruption...]
+0000000000602280 75 73 65 2d 61 66 74 65 72 2d 66 72 65 65 20 68 [use-after-free h]
-0000000000602290 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
+0000000000602290 69 64 64 65 6e 20 62 79 20 6e 65 77 20 61 6c 6c [idden by new all]
The diff shows that *gstate
(address 0x602260
) was changed from 0x7fffffffe2bc
to 0x4008c1
:
-0000000000602260 bc e2 ff ff ff 7f 00 00 00 00 00 00 00 00 00 00 [................]
+0000000000602260 c1 08 40 00 00 00 00 00 00 00 00 00 00 00 00 00 [..@.............]
The second diff with only the relevant offset:
$ ./compare dump1 dump2
-0000000000602260 c1 08 40 00 00 00 00 00 00 00 00 00 00 00 00 00 [..@.............]
+0000000000602260 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
The diff shows that *gstate
(address 0x602260
) was changed from 0x4008c1
to 0x1
.
There you have it: a core dump diff. Now, whether or not that can prove to be useful depends on various factors, one being the timeframe between the two dumps and the activity that takes place within that window. A large diff will possibly be difficult to analyze, so the aim must be to minimize its size by choosing the diff window carefully.
The more context you have, the easier the analysis will turn out to be. For example, the relevant scope of the diff could be reduced by limiting it to addresses of the .data
and .bss
sections of the executable or library to be debugged if changes in there are relevant to the debugging scenario.
Another approach to reduce the scope: excluding changes to memory that is not referenced by the debugging subject. The relationship between arbitrary heap allocations and the executable or specific libraries is not immediately apparent. Based on the the addresses of changes in your initial diff, you could search for pointers in the .data
and .bss
sections of the executable or library right in the diff implementation. This does not take every possible reference into account (most notably indirect references from other allocations, register and stack references of library-owned threads), but it’s a start.
February 18, 2021
I had that exact requirement recently. I needed to restore the glibc heap from a core dump based on a new process. After restoring all original mappings, all it took was patching main_arena
.
December 11, 2020
For tracing filesystem accesses of Java applications, native tracing facilities are always the first choice. On Windows, use Process Monitor to trace I/O. On Linux, use strace. Other platforms provide similar facilities.