Introduction

This page aims to provide a detailed description of how Breakpad produces stack traces from the information contained within a minidump file.

Details

Starting the Process

Typically, the stack walking process is initiated by instantiating the MinidumpProcessor class and calling the MinidumpProcessor::Process method, providing it a minidump file to process. To produce a useful stack trace, the MinidumpProcessor requires two other objects, which are passed to its constructor: a SymbolSupplier and a SourceLineResolverInterface. The SymbolSupplier object is responsible for locating and providing symbol files that match modules from the minidump. The SourceLineResolverInterface is responsible for loading the symbol files and using the information they contain to provide function and source information for stack frames, as well as information on how to unwind from a stack frame to its caller. These interactions are described in more detail below.

A number of data streams are extracted from the minidump to begin stack walking: the list of threads from the process (MinidumpThreadList), the list of modules loaded in the process (MinidumpModuleList), and information about the exception that caused the process to crash (MinidumpException).

Enumerating Threads

For each thread in the thread list (MinidumpThread), the thread memory containing the stack for the thread (MinidumpMemoryRegion) and the CPU context representing the CPU state of the thread at the time the dump was written (MinidumpContext) are extracted from the minidump. If the thread being processed is the thread that produced the exception, the CPU context is obtained from the MinidumpException object instead; that context represents the CPU state of the thread at the point of the exception. A stack walker is then instantiated by calling the Stackwalker::StackwalkerForCPU method and passing it the CPU context, the thread memory, the module list, and the SymbolSupplier and SourceLineResolverInterface. This method selects the specific Stackwalker subclass based on the CPU architecture of the provided CPU context and returns an instance of that subclass.
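The selection step can be pictured as a small factory. The following is a minimal sketch with hypothetical stand-in types (`ToyCpu`, `ToyCpuContext`, `ToyStackwalker` and its subclasses are invented for illustration); it is not Breakpad's implementation, and returning a null pointer for an unrecognized architecture is a choice made for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins; the real context and walker classes carry far more state.
enum class ToyCpu { kX86, kAmd64, kArm, kUnknown };

struct ToyCpuContext {
  ToyCpu cpu;
};

class ToyStackwalker {
 public:
  virtual ~ToyStackwalker() = default;
  virtual std::string Name() const = 0;
};

class ToyStackwalkerX86 : public ToyStackwalker {
 public:
  std::string Name() const override { return "x86"; }
};

class ToyStackwalkerAmd64 : public ToyStackwalker {
 public:
  std::string Name() const override { return "amd64"; }
};

class ToyStackwalkerArm : public ToyStackwalker {
 public:
  std::string Name() const override { return "arm"; }
};

// Picks a walker subclass from the architecture recorded in the CPU context.
// Returns nullptr when the architecture is not recognized (sketch behavior).
std::unique_ptr<ToyStackwalker> ToyStackwalkerForCpu(const ToyCpuContext& context) {
  switch (context.cpu) {
    case ToyCpu::kX86:
      return std::unique_ptr<ToyStackwalker>(new ToyStackwalkerX86());
    case ToyCpu::kAmd64:
      return std::unique_ptr<ToyStackwalker>(new ToyStackwalkerAmd64());
    case ToyCpu::kArm:
      return std::unique_ptr<ToyStackwalker>(new ToyStackwalkerArm());
    default:
      return nullptr;
  }
}
```

The caller only ever sees the base-class interface, which is why the rest of the walk can be described independently of the CPU architecture.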

Walking a thread's stack

Once a Stackwalker instance has been obtained, the processor calls the Stackwalker::Walk method to obtain a list of frames representing the stack of this thread. The Stackwalker starts by calling the GetContextFrame method, which returns a StackFrame representing the top of the stack, with CPU state provided by the initial CPU context. From there, the stack walker repeats the following steps for each frame in turn:

Finding the Module

The address of the instruction pointer of the current frame is used to determine which module contains the current frame by calling the module list's GetModuleForAddress method.
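As an illustration of this lookup, here is a minimal sketch that models each module as a half-open address range. `ToyModule` and `FindModuleForAddress` are hypothetical names, and the linear search is chosen for brevity; the real module list may organize its data differently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified module record: a name plus the range it occupies.
struct ToyModule {
  std::string name;
  uint64_t base_address;
  uint64_t size;
};

// Returns the module whose [base_address, base_address + size) range contains
// `address`, or nullptr if the address falls outside every known module.
const ToyModule* FindModuleForAddress(const std::vector<ToyModule>& modules,
                                      uint64_t address) {
  for (const ToyModule& module : modules) {
    // Written as a subtraction so that base_address + size cannot overflow.
    if (address >= module.base_address &&
        address - module.base_address < module.size) {
      return &module;
    }
  }
  return nullptr;
}
```

An instruction pointer that maps to no module yields a null result, in which case the symbol steps that follow have nothing to work with for that frame.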

Locating Symbols

If a module is located, the SymbolSupplier is asked to locate symbols corresponding to the module by calling its GetCStringSymbolData method. Typically this is implemented by using the module's debug filename (the PDB filename for Windows dumps) and debug identifier (a GUID plus one extra digit) as a lookup key. The SimpleSymbolSupplier class simply uses these as parts of a file path to locate a flat file on disk.
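To make the lookup key concrete, the sketch below builds a path from a symbol root, the debug filename, and the debug identifier. It assumes the layout `<root>/<debug file>/<debug id>/<debug file without .pdb>.sym`; `ToySymbolPath` is a hypothetical helper, and the identifiers used with it are made up.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper. Assumed layout:
//   <root>/<debug_file>/<debug_id>/<debug_file minus a trailing ".pdb">.sym
std::string ToySymbolPath(const std::string& root,
                          const std::string& debug_file,
                          const std::string& debug_id) {
  std::string stem = debug_file;
  const std::string kPdb = ".pdb";
  // Drop a trailing ".pdb" so that "app.pdb" becomes "app.sym", not "app.pdb.sym".
  if (stem.size() > kPdb.size() &&
      stem.compare(stem.size() - kPdb.size(), kPdb.size(), kPdb) == 0) {
    stem.erase(stem.size() - kPdb.size());
  }
  return root + "/" + debug_file + "/" + debug_id + "/" + stem + ".sym";
}
```

With a made-up identifier, `ToySymbolPath("symbols", "app.pdb", "0123ABCD1")` produces `symbols/app.pdb/0123ABCD1/app.sym`. Because the identifier is part of the path, symbol files for different builds of the same module can sit side by side.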

Loading Symbols

If a symbol file is located, the SourceLineResolverInterface is then asked to load the symbol file by calling its LoadModuleUsingMemoryBuffer method. The BasicSourceLineResolver implementation parses the text-format symbol file into in-memory data structures to make lookups by address of function names, source line information, and unwind information easy.
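As a rough illustration of what parsing the text format involves, the sketch below reads a single FUNC record of the form `FUNC <address> <size> <parameter_size> <name>`, with hexadecimal numeric fields. `ToyFunc` and `ParseFuncLine` are hypothetical names, the optional markers that some symbol files place after the `FUNC` keyword are not handled, and a real resolver also parses the other record types.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical in-memory form of one FUNC record.
struct ToyFunc {
  uint64_t address = 0;
  uint64_t size = 0;
  uint64_t parameter_size = 0;
  std::string name;
};

// Parses "FUNC <address> <size> <parameter_size> <name>" (hex fields).
// Returns false if the line is not a well-formed FUNC record.
bool ParseFuncLine(const std::string& line, ToyFunc* out) {
  assert(out != nullptr);
  std::istringstream in(line);
  std::string tag;
  if (!(in >> tag) || tag != "FUNC") return false;
  ToyFunc parsed;
  if (!(in >> std::hex >> parsed.address >> parsed.size >> parsed.parameter_size)) {
    return false;
  }
  in >> std::ws;                   // skip the space before the name
  std::getline(in, parsed.name);   // the name may itself contain spaces
  if (parsed.name.empty()) return false;
  *out = parsed;
  return true;
}
```

Records parsed this way would typically be stored in an address-ordered container so that later lookups by address are cheap.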

Getting source line information

If a symbol file has been successfully loaded, the SourceLineResolverInterface's FillSourceLineInfo method is called to provide a function name and source line information for the current frame. This is done by subtracting the base address of the module containing the current frame from the instruction pointer of the current frame to obtain a relative virtual address (RVA), which is a code offset relative to the start of the module. This RVA is then used as a lookup into a table of functions (FUNC lines from the symbol file), each of which has an associated address range (function start address, function size). If a function is found whose address range contains the RVA, its name is used. The RVA is then used as a lookup into a table of source lines (line records from the symbol file), each of which also has an associated address range. If a match is found, it provides the file name and source line associated with the current frame. If no match is found in the function table, another table of publicly exported symbols may be consulted (PUBLIC lines from the symbol file). Public symbols contain only a start address, so the lookup simply takes the nearest symbol whose address is at or below the RVA.
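The two lookups can be sketched with ordered maps keyed by start address. This is a simplified model under stated assumptions: `ToyRange`, `ToySymbols`, and `ToyLookupName` are hypothetical, source line records are omitted, and an empty string stands for "no name found".

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

// Hypothetical tables for one module, keyed by start address relative to the module base.
struct ToyRange {
  uint64_t size;
  std::string name;
};

struct ToySymbols {
  std::map<uint64_t, ToyRange> functions;   // from FUNC records
  std::map<uint64_t, std::string> publics;  // from PUBLIC records
};

// Returns the best name for `instruction`, or "" if nothing matches.
std::string ToyLookupName(const ToySymbols& symbols, uint64_t module_base,
                          uint64_t instruction) {
  const uint64_t rva = instruction - module_base;

  // Function table: find the last function starting at or below the RVA,
  // then check that the RVA lies inside its [start, start + size) range.
  auto func = symbols.functions.upper_bound(rva);
  if (func != symbols.functions.begin()) {
    --func;
    if (rva - func->first < func->second.size) return func->second.name;
  }

  // Public table: there is no size, so take the nearest symbol at or below the RVA.
  auto pub = symbols.publics.upper_bound(rva);
  if (pub != symbols.publics.begin()) return std::prev(pub)->second;

  return "";
}
```

Since a public symbol has no size, the fallback can attribute an address to a symbol that lies far below it, so names obtained this way are less trustworthy than names from the function table.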

Finding the caller frame

To find the next frame in the stack, the Stackwalker calls its GetCallerFrame method, passing in the current frame. Each Stackwalker subclass implements GetCallerFrame differently, but there are common patterns.

Typically, the first step is to query the SourceLineResolverInterface for the presence of detailed unwind information. This is done using its FindWindowsFrameInfo and FindCFIFrameInfo methods. These methods look for Windows unwind info extracted from a PDB file (STACK WIN lines from the symbol file) and DWARF CFI extracted from a binary (STACK CFI lines from the symbol file), respectively. The information covers address ranges, so the RVA of the current frame is used for the lookup, as with function and source line information.

If unwind info is found, it provides a set of rules for recovering the register state of the caller frame from the current register state and the thread's stack memory. The rules are evaluated to produce the caller frame.
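To give a feel for rule evaluation, the sketch below evaluates postfix expressions of the kind found in STACK CFI records, for example `.cfa: $esp 4 +` followed by `.ra: .cfa 4 - ^`, where `^` dereferences an address in stack memory. The evaluator is a reduced illustration: `ToyEvalPostfix`, `ToyRegisters`, and `ToyReadMemory` are hypothetical, only `+`, `-`, `^`, register names, and decimal literals are supported, and splitting a record into its individual rules is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using ToyRegisters = std::map<std::string, uint64_t>;
// Reads one word of stack memory; returns false if the address is not readable.
using ToyReadMemory = std::function<bool(uint64_t address, uint64_t* value)>;

// Evaluates a whitespace-separated postfix expression to a single value.
bool ToyEvalPostfix(const std::string& expression, const ToyRegisters& registers,
                    const ToyReadMemory& read_memory, uint64_t* result) {
  assert(result != nullptr);
  std::vector<uint64_t> stack;
  std::istringstream in(expression);
  std::string token;
  while (in >> token) {
    if (token == "+" || token == "-") {
      if (stack.size() < 2) return false;
      const uint64_t rhs = stack.back();
      stack.pop_back();
      const uint64_t lhs = stack.back();
      stack.pop_back();
      stack.push_back(token == "+" ? lhs + rhs : lhs - rhs);
    } else if (token == "^") {
      // Dereference: replace the address on top of the stack with its contents.
      if (stack.empty()) return false;
      uint64_t value = 0;
      if (!read_memory(stack.back(), &value)) return false;
      stack.back() = value;
    } else if (registers.count(token) != 0) {
      stack.push_back(registers.at(token));
    } else {
      char* end = nullptr;
      const uint64_t literal = std::strtoull(token.c_str(), &end, 10);
      if (end == token.c_str() || *end != '\0') return false;  // unknown token
      stack.push_back(literal);
    }
  }
  if (stack.size() != 1) return false;
  *result = stack.back();
  return true;
}
```

In this model the `.cfa` rule would be evaluated first and its result stored in the register map under `.cfa`, so that the remaining rules can refer to it when recovering the return address and other registers.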

If unwind info is not found, the Stackwalker may resort to other methods. Typically, on architectures that define a frame pointer, unwinding by dereferencing the frame pointer is tried next. If that succeeds, the result is used to produce the caller frame.
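Frame pointer unwinding can be sketched as follows, assuming the conventional 32-bit x86 frame layout in which the word at `ebp` holds the caller's saved `ebp` and the word at `ebp + 4` holds the return address. That layout, along with the `ToyStackMemory` and `ToyFrame` types, is an assumption made for this example; other architectures use different conventions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a thread's stack memory: consecutive 32-bit words.
struct ToyStackMemory {
  uint32_t base;                // address of words[0]
  std::vector<uint32_t> words;

  // Reads the aligned word at `address`; fails if it lies outside the region.
  bool Read(uint32_t address, uint32_t* value) const {
    if (address < base || (address - base) % 4 != 0) return false;
    const uint32_t index = (address - base) / 4;
    if (index >= words.size()) return false;
    *value = words[index];
    return true;
  }
};

struct ToyFrame {
  uint32_t eip;
  uint32_t esp;
  uint32_t ebp;
};

// Recovers the caller's registers from the callee's frame pointer.
bool ToyCallerByFramePointer(const ToyStackMemory& stack, const ToyFrame& callee,
                             ToyFrame* caller) {
  assert(caller != nullptr);
  uint32_t saved_ebp = 0;
  uint32_t return_address = 0;
  if (!stack.Read(callee.ebp, &saved_ebp) ||
      !stack.Read(callee.ebp + 4, &return_address)) {
    return false;  // frame pointer does not point into the captured stack
  }
  caller->eip = return_address;
  caller->ebp = saved_ebp;
  caller->esp = callee.ebp + 8;  // just past the saved ebp and return address
  return true;
}
```

The approach only works if the code was compiled to maintain a frame pointer; when it was not, the value in the register is arbitrary, which is why the result still has to be checked for plausibility.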

If no caller frame was found by any other method, most Stackwalker implementations resort to stack scanning. The scan looks at each word on the stack down to a fixed depth (implemented in the Stackwalker::ScanForReturnAddress method) and uses a heuristic to find a plausible return address (implemented in the Stackwalker::InstructionAddressSeemsValid method).
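The scan can be sketched as a bounded loop over stack words. In this simplified model the plausibility check only tests whether a word falls inside a loaded module, which is an assumption made for illustration; the heuristic described above may apply further checks. All names here (`ToyCodeRange`, `ToyAddressSeemsValid`, `ToyScanForReturnAddress`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of where a loaded module's code lives.
struct ToyCodeRange {
  uint32_t start;
  uint32_t size;
};

// Crude plausibility check: does the value point into any loaded module?
bool ToyAddressSeemsValid(const std::vector<ToyCodeRange>& modules, uint32_t address) {
  for (const ToyCodeRange& module : modules) {
    if (address >= module.start && address - module.start < module.size) return true;
  }
  return false;
}

// Scans at most `max_words` words starting at `start_index` and returns the
// index of the first plausible return address, or -1 if none is found.
int ToyScanForReturnAddress(const std::vector<uint32_t>& stack_words,
                            size_t start_index, size_t max_words,
                            const std::vector<ToyCodeRange>& modules) {
  for (size_t i = start_index;
       i < stack_words.size() && i - start_index < max_words; ++i) {
    if (ToyAddressSeemsValid(modules, stack_words[i])) return static_cast<int>(i);
  }
  return -1;
}
```

Scanning is a guess: any stale word that happens to look like a code address will be accepted, so frames recovered this way deserve less trust than frames recovered from unwind info or a frame pointer.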

If no caller frame is found or the caller frame seems invalid, stack walking stops. If a caller frame was found then these steps repeat using the new frame as the current frame.
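Putting the pieces together, the overall loop can be sketched as trying an ordered list of unwinding strategies for each frame (for example unwind info, then the frame pointer, then stack scanning) and stopping when none of them yields a usable caller. `ToyWalkFrame`, `ToyUnwindStrategy`, and `ToyWalk` are hypothetical, the `max_frames` guard is an addition of this sketch, and treating a zero instruction address as invalid is a placeholder for a real validity check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical frame: just the instruction address.
struct ToyWalkFrame {
  uint64_t instruction;
};

// A strategy fills in `caller` and returns true if it could unwind `callee`.
using ToyUnwindStrategy =
    std::function<bool(const ToyWalkFrame& callee, ToyWalkFrame* caller)>;

// Walks from the context frame outward, trying each strategy in order.
std::vector<ToyWalkFrame> ToyWalk(const ToyWalkFrame& context_frame,
                                  const std::vector<ToyUnwindStrategy>& strategies,
                                  size_t max_frames) {
  std::vector<ToyWalkFrame> frames;
  ToyWalkFrame current = context_frame;
  while (frames.size() < max_frames) {
    frames.push_back(current);

    ToyWalkFrame caller{};
    bool found = false;
    for (const ToyUnwindStrategy& strategy : strategies) {
      if (strategy(current, &caller)) {
        found = true;
        break;  // the first strategy that succeeds wins
      }
    }
    // Stop when no strategy produced a caller or the caller looks invalid.
    if (!found || caller.instruction == 0) break;
    current = caller;
  }
  return frames;
}
```

Ordering the strategies from most to least reliable means that weaker methods are only consulted for frames where the stronger ones have nothing to offer.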