[Docs] add markdown docs (converted from Wiki)

BUG=none
R=mark
CC=google-breakpad-dev@googlegroups.com

Review URL: https://codereview.chromium.org/1357773004 .

Patch from Andy Bonventre <andybons@chromium.org>.
This commit is contained in:
Andy Bonventre 2015-09-22 17:29:52 -04:00 committed by Mark Mentovai
parent 4d06db5a1f
commit 0ff15b41ed
17 changed files with 2948 additions and 37 deletions

37
README
View file

@ -1,37 +0,0 @@
Breakpad is a set of client and server components which implement a
crash-reporting system.
-----
Getting started in 32-bit mode (from trunk)
Configure: CXXFLAGS=-m32 CFLAGS=-m32 CPPFLAGS=-m32 ./configure
Build: make
Test: make check
Install: make install
If you need to reconfigure your build be sure to run "make distclean" first.
-----
To request change review:
0. Get a copy of depot_tools repo.
http://dev.chromium.org/developers/how-tos/install-depot-tools
1. Create a new directory for checking out the source code.
mkdir breakpad && cd breakpad
2. Run the `fetch` tool from depot_tools to download all the source repos.
fetch breakpad
3. Make changes. Build and test your changes.
For core code like processor use methods above.
For linux/mac/windows, there are test targets in each project file.
4. Commit your changes to your local repo and upload them to the server.
http://dev.chromium.org/developers/contributing-code
e.g. git commit ... && git cl upload ...
You will be prompted for credential and a description.
5. At https://codereview.chromium.org/ you'll find your issue listed; click on
it, and select Publish+Mail, and enter in the code reviewer and CC
google-breakpad-dev@googlegroups.com

47
README.md Normal file
View file

@ -0,0 +1,47 @@
# Breakpad
Breakpad is a set of client and server components which implement a
crash-reporting system.
## Getting started in 32-bit mode (from trunk)
```sh
# Configure
CXXFLAGS=-m32 CFLAGS=-m32 CPPFLAGS=-m32 ./configure
# Build
make
# Test
make check
# Install
make install
```
If you need to reconfigure your build be sure to run `make distclean` first.
## To request change review:
1. Get a copy of depot_tools repo.
http://dev.chromium.org/developers/how-tos/install-depot-tools
2. Create a new directory for checking out the source code.
mkdir breakpad && cd breakpad
3. Run the `fetch` tool from depot_tools to download all the source repos.
`fetch breakpad`
4. Make changes. Build and test your changes.
For core code like processor use methods above.
For linux/mac/windows, there are test targets in each project file.
5. Commit your changes to your local repo and upload them to the server.
http://dev.chromium.org/developers/contributing-code
e.g. `git commit ... && git cl upload ...`
You will be prompted for credential and a description.
6. At https://codereview.chromium.org/ you'll find your issue listed; click on
it, and select Publish+Mail, and enter in the code reviewer and CC
google-breakpad-dev@googlegroups.com
## Documentation
Visit https://chromium.googlesource.com/breakpad/breakpad/+/master/docs/

1
docs/OWNERS Normal file
View file

@ -0,0 +1 @@
*

BIN
docs/breakpad.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 82 KiB

1023
docs/breakpad.svg Normal file

File diff suppressed because it is too large Load diff

After

Width:  |  Height:  |  Size: 45 KiB

224
docs/client_design.md Normal file
View file

@ -0,0 +1,224 @@
# Breakpad Client Libraries
## Objective
The Breakpad client libraries are responsible for monitoring an application for
crashes (exceptions), handling them when they occur by generating a dump, and
providing a means to upload dumps to a crash reporting server. These tasks are
divided between the “handler” (short for “exception handler”) library linked in
to an application being monitored for crashes, and the “sender” library,
intended to be linked in to a separate external program.
## Background
As one of the chief tasks of the client handler is to generate a dump, an
understanding of [dump files](processor_design.md) will aid in understanding the
handler.
## Overview
Breakpad provides client libraries for each of its target platforms. Currently,
these exist for Windows on x86 and Mac OS X on both x86 and PowerPC. A Linux
implementation has been written and is currently under review.
Because the mechanisms for catching exceptions and the methods for obtaining the
information that a dump contains vary between operating systems, each target
operating system requires a completely different handler implementation. Where
multiple CPUs are supported for a single operating system, the handler
implementation will likely also require separate code for each processor type to
extract CPU-specific information. One of the goals of the Breakpad handler is to
provide a prepackaged cross-platform system that masks many of these
system-level differences and quirks from the application developer. Although the
underlying implementations differ, the handler library for each system follows
the same set of principles and exposes a similar interface.
Code that wishes to take advantage of Breakpad should be linked against the
handler library, and should, at an appropriate time, install a Breakpad handler.
For applications, it is generally desirable to install the handler as early in
the start-up process as possible. Developers of library code using Breakpad to
monitor itself may wish to install a Breakpad handler when the library is
loaded, or may only want to install a handler when calls are made in to the
library.
The handler can be triggered to generate a dump either by catching an exception
or at the request of the application itself. The latter case may be useful in
debugging assertions or other conditions where developers want to know how a
program got in to a specific non-crash state. After generating a dump, the
handler calls a user-specified callback function. The callback function may
collect additional data about the programs state, quit the program, launch a
crash reporter application, or perform other tasks. Allowing for this
functionality to be dictated by a callback function preserves flexibility.
The sender library is also has a separate implementation for each supported
platform, because of the varying interfaces for accessing network resources on
different operating systems. The sender transmits a dump along with other
application-defined information to a crash report server via HTTP. Because dumps
may contain sensitive data, the sender allows for the use of HTTPS.
The canonical example of the entire client system would be for a monitored
application to link against the handler library, install a Breakpad handler from
its main function, and provide a callback to launch a small crash reporter
program. The crash reporter program would be linked against the sender library,
and would send the crash dump when launched. A separate process is recommended
for this function because of the unreliability inherent in doing any significant
amount of work from a crashed process.
## Detailed Design
### Exception Handler Installation
The mechanisms for installing an exception handler vary between operating
systems. On Windows, its a relatively simple matter of making one call to
register a [top-level exception filter]
(http://msdn.microsoft.com/library/en-us/debug/base/setunhandledexceptionfilter.asp)
callback function. On most Unix-like systems such as Linux, processes are
informed of exceptions by the delivery of a signal, so an exception handler
takes the form of a signal handler. The native mechanism to catch exceptions on
Mac OS X requires a large amount of code to set up a Mach port, identify it as
the exception port, and assign a thread to listen for an exception on that port.
Just as the preparation of exception handlers differ, the manner in which they
are called differs as well. On Windows and most Unix-like systems, the handler
is called on the thread that caused the exception. On Mac OS X, the thread
listening to the exception port is notified that an exception has occurred. The
different implementations of the Breakpad handler libraries perform these tasks
in the appropriate ways on each platform, while exposing a similar interface on
each.
A Breakpad handler is embodied in an `ExceptionHandler` object. Because its a
C++ object, `ExceptionHandler`s may be created as local variables, allowing them
to be installed and removed as functions are called and return. This provides
one possible way for a developer to monitor only a portion of an application for
crashes.
### Exception Basics
Once an application encounters an exception, it is in an indeterminate and
possibly hazardous state. Consequently, any code that runs after an exception
occurs must take extreme care to avoid performing operations that might fail,
hang, or cause additional exceptions. This task is not at all straightforward,
and the Breakpad handler library seeks to do it properly, accounting for all of
the minute details while allowing other application developers, even those with
little systems programming experience, to reap the benefits. All of the Breakpad
handler code that executes after an exception occurs has been written according
to the following guidelines for safety at exception time:
* Use of the application heap is forbidden. The heap may be corrupt or
otherwise unusable, and allocators may not function.
* Resource allocation must be severely limited. The handler may create a new
file to contain the dump, and it may attempt to launch a process to continue
handling the crash.
* Execution on the thread that caused the exception is significantly limited.
The only code permitted to execute on this thread is the code necessary to
transition handling to a dedicated preallocated handler thread, and the code
to return from the exception handler.
* Handlers shouldnt handle crashes by attempting to walk stacks themselves,
as stacks may be in inconsistent states. Dump generation should be performed
by interfacing with the operating systems memory manager and code module
manager.
* Library code, including runtime library code, must be avoided unless it
provably meets the above guidelines. For example, this means that the STL
string class may not be used, because it performs operations that attempt to
allocate and use heap memory. It also means that many C runtime functions
must be avoided, particularly on Windows, because of heap operations that
they may perform.
A dedicated handler thread is used to preserve the state of the exception thread
when an exception occurs: during dump generation, it is difficult if not
impossible for a thread to accurately capture its own state. Performing all
exception-handling functions on a separate thread is also critical when handling
stack-limit-exceeded exceptions. It would be hazardous to run out of stack space
while attempting to handle an exception. Because of the rule against allocating
resources at exception time, the Breakpad handler library creates its handler
thread when it installs its exception handler. On Mac OS X, this handler thread
is created during the normal setup of the exception handler, and the handler
thread will be signaled directly in the event of an exception. On Windows and
Linux, the handler thread is signaled by a small amount of code that executes on
the exception thread. Because the code that executes on the exception thread in
this case is small and safe, this does not pose a problem. Even when an
exception is caused by exceeding stack size limits, this code is sufficiently
compact to execute entirely within the stacks guard page without causing an
exception.
The handler thread may also be triggered directly by a user call, even when no
exception occurs, to allow dumps to be generated at any point deemed
interesting.
### Filter Callback
When the handler thread begins handling an exception, it calls an optional
user-defined filter callback function, which is responsible for judging whether
Breakpads handler should continue handling the exception or not. This mechanism
is provided for the benefit of library or plug-in code, whose developers may not
be interested in reports of crashes that occur outside of their modules but
within processes hosting their code. If the filter callback indicates that it is
not interested in the exception, the Breakpad handler arranges for it to be
delivered to any previously-installed handler.
### Dump Generation
Assuming that the filter callback approves (or does not exist), the handler
writes a dump in a directory specified by the application developer when the
handler was installed, using a previously generated unique identifier to avoid
name collisions. The mechanics of dump generation also vary between platforms,
but in general, the process involves enumerating each thread of execution, and
capturing its state, including processor context and the active portion of its
stack area. The dump also includes a list of the code modules loaded in to the
application, and an indicator of which thread generated the exception or
requested the dump. In order to avoid allocating memory during this process, the
dump is written in place on disk.
### Post-Dump Behavior
Upon completion of writing the dump, a second callback function is called. This
callback may be used to launch a separate crash reporting program or to collect
additional data from the application. The callback may also be used to influence
whether Breakpad will treat the exception as handled or unhandled. Even after a
dump is successfully generated, Breakpad can be made to behave as though it
didnt actually handle an exception. This function may be useful for developers
who want to test their applications with Breakpad enabled but still retain the
ability to use traditional debugging techniques. It also allows a
Breakpad-enabled application to coexist with a platforms native crash reporting
system, such as Mac OS X [CrashReporter]
(http://developer.apple.com/technotes/tn2004/tn2123.html) and [Windows Error
Reporting](http://msdn.microsoft.com/isv/resources/wer/).
Typically, when Breakpad handles an exception fully and no debuggers are
involved, the crashed process will terminate.
Authors of both callback functions that execute within a Breakpad handler are
cautioned that their code will be run at exception time, and that as a result,
they should observe the same programming practices that the Breakpad handler
itself adheres to. Notably, if a callback is to be used to collect additional
data from an application, it should take care to read only “safe” data. This
might involve accessing only static memory locations that are updated
periodically during the course of normal program execution.
### Sender Library
The Breakpad sender library provides a single function to send a crash report to
a crash server. It accepts a crash servers URL, a map of key-value parameters
that will accompany the dump, and the path to a dump file itself. Each of the
key-value parameters and the dump file are sent as distinct parts of a multipart
HTTP POST request to the specified URL using the platforms native HTTP
facilities. On Linux, [libcurl](http://curl.haxx.se/) is used for this function,
as it is the closest thing to a standard HTTP library available on that
platform.
## Future Plans
Although weve had great success with in-process dump generation by following
our guidelines for safe code at exception time, we are exploring options for
allowing dumps to be generated in a separate process, to further enhance the
handler librarys robustness.
On Windows, we intend to offer tools to make it easier for Breakpads settings
to be managed by the native group policy management system.
We also plan to offer tools that many developers would find desirable in the
context of handling crashes, such as a mechanism to determine at launch if the
program last terminated in a crash, and a way to calculate “crashiness” in terms
of crashes over time or the number of application launches between crashes.
We are also investigating methods to capture crashes that occur early in an
applications launch sequence, including crashes that occur before a programs
main function begins executing.

View file

@ -0,0 +1,35 @@
# Introduction
Thanks for thinking of contributing to Breakpad! Unfortunately there are some
pesky legal issues to get out of the way, but they're quick and painless.
## Legal
If you're doing work individually, not as part of any employment, you'll need to
sign the <a
href='http://code.google.com/legal/individual-cla-v1.0.html'>Individual
Contributor License Agreement</a>. This agreement can be completed
electronically.
If you're contributing to Breakpad as part of your employment with another
organization, you'll need to sign a <a
href='http://code.google.com/legal/corporate-cla-v1.0.html'> Corporate
Contributor License Agreement</a>. Once completed this document will need to be
faxed.
**_IMPORTANT_**: The authors(you!) of the contributions will maintain all
copyrights; the agreements you sign will grant rights to Google to use your
work.
Thanks, and if you have any questions let me know and I'll loop in the legal guy
here to get you an answer.
## Technical
Once you have signed the agreement you can be added to our contributors list and
have write access to code. For full details on getting started see our trunk
`README`.
## List of people who have signed contributor agreements
None so far.

128
docs/exception_handling.md Normal file
View file

@ -0,0 +1,128 @@
The goal of this document is to give an overview of the exception handling
options in breakpad.
# Basics
Exception handling is a mechanism designed to handle the occurrence of
exceptions, special conditions that change the normal flow of program execution.
`SetUnhandledExceptionFilter` replaces all unhandled exceptions when Breakpad is
enabled. TODO: More on first and second change and vectored v. try/catch.
There are two main types of exceptions across all platforms: in-process and
out-of-process.
# In-Process
In process exception handling is relatively simple since the crashing process
handles crash reporting. It is generally considered unsafe to write a minidump
from a crashed process. For example, key data structures could be corrupted or
the stack on which the exception handler runs could have been overwritten. For
this reason all platforms also support some level of out-of-process exception
handling.
## Windows
In-process exception handling Breakpad creates a 'handler head' that waits
infinitely on a semaphore at start up. When this thread is woken it writes the
minidump and signals to the excepting thread that it may continue. A filter will
tell the OS to kill the process if the minidump is written successfully.
Otherwise it continues.
# Out-of-Process
Out-of-process exception handling is more complicated than in-process exception
handling because of the need to set up a separate process that can read the
state of the crashing process.
## Windows
Breakpad uses two abstractions around the exception handler to make things work:
`CrashGenerationServer` and `CrashGenerationClient`. The constructor for these
takes a named pipe name.
During server start up a named pipe and registers callbacks for client
connections are created. The named pipe is used for registration and all IO on
the pipe is done asynchronously. `OnPipeConnected` is called when a client
attempts to connect (call `CreateFile` on the pipe). `OnPipeConnected` does the
state machine transition from `Initial` to `Connecting` and on through
`Reading`, `Reading_Done`, `Writing`, `Writing_Done`, `Reading_ACK`, and
`Disconnecting`.
When registering callbacks, the client passes in two pointers to pointers: 1. A
pointer to the `EXCEPTION_INFO` pointer 1. A pointer to the `MDRawAssertionInfo`
which handles various non-exception failures like assertions
The essence of registration is adding a "`ClientInfo`" object that contains
handles used for synchronization with the crashing process to an array
maintained by the server. This is how we can keep track of all the clients on
the system that have registered for minidumps. These handles are: *
`server_died(mutex)` * `dump_requested(Event)` * `dump_generated(Event)`
The server registers asynchronous waits on these events with the `ClientInfo`
object as the callback context. When the `dump_requested` event is set by the
client, the `OnDumpRequested()` callback is called. The server uses the handles
inside `ClientInfo` to communicate with the child process. Once the child sets
the event, it waits for two objects: 1. the `dump_generated` event 1. the
`server_died` mutex
In the end handles are "duped" into the client process, and the clients use
`SetEvent` to request events, wait on the other event, or the `server_died`
mutex.
## Linux
### Current Status
As of July 2011, Linux had a minidump generator that is not entirely
out-of-process. The minidump was generated from a separate process, but one that
shared an address space, file descriptors, signal handles and much else with the
crashing process. It worked by using the `clone()` system call to duplicate the
crashing process, and then uses `ptrace()` and the `/proc` file system to
retrieve the information required to write the minidump. Since then Breakpad has
updated Linux exception handling to provide more benefits of out-of-process
report generation.
### Proposed Design
#### Overview
Breakpad would use a per-user daemon to write out a minidump that does not have,
interact with or depend on the crashing process. We don't want to start a new
separate process every time a user launches a Breakpad-enabled process. Doing
one daemon per machine is unacceptable for security concerns around one user
being able to initiate a minidump generation for another user's process.
#### Client/Server Communication
On Breakpad initialization in a process, the initializer would check if the
daemon is running and, if not, start it. The race condition between the check
and the initialization is not a problem because multiple daemons can check if
the IPC endpoint already exists and if a server is listening. Even if multiple
copies of the daemon try to `bind()` the filesystem to name the socket, all but
one will fail and can terminate.
This point is relevant for error handling conditions. Linux does not clean the
file system representation of a UNIX domain socket even if both endpoints
terminate, so checking for existence is not strong enough. However checking the
process list or sending a ping on the socket can handle this.
Breakpad uses UNIX domain sockets since they support full duplex communication
(unlike Windows, named pipes on Linux are half) and the kernal automatically
creates a private channel between the client and server once the client calls
`connect()`.
#### Minidump Generation
Breakpad could use the current system with `ptrace()` and `/proc` within the
daemon executable.
Overall the operations look like: 1. Signal from OS indicating crash 1. Signal
Handler suspends all threads except itself 1. Signal Handler sends
`CRASH_DUMP_REQUEST` message to server and waits for response 1. Server inspects
1. Minidump is asynchronously written to disk by the server 1. Server responds
indicating inspection is done
## Mac OSX
Out-of-process exception handling is fully supported on Mac.

View file

@ -0,0 +1,121 @@
# Introduction
Breakpad is a library and tool suite that allows you to distribute an
application to users with compiler-provided debugging information removed,
record crashes in compact "minidump" files, send them back to your server, and
produce C and C++ stack traces from these minidumps. Breakpad can also write
minidumps on request for programs that have not crashed.
Breakpad is currently used by Google Chrome, Firefox, Google Picasa, Camino,
Google Earth, and other projects.
![http://google-breakpad.googlecode.com/svn/wiki/breakpad.png]
(http://google-breakpad.googlecode.com/svn/wiki/breakpad.png)
Breakpad has three main components:
* The **client** is a library that you include in your application. It can
write minidump files capturing the current threads' state and the identities
of the currently loaded executable and shared libraries. You can configure
the client to write a minidump when a crash occurs, or when explicitly
requested.
* The **symbol dumper** is a program that reads the debugging information
produced by the compiler and produces a **symbol file**, in [Breakpad's own
format](symbol_files.md).
* The **processor** is a program that reads a minidump file, finds the
appropriate symbol files for the versions of the executables and shared
libraries the minidump mentions, and produces a human-readable C/C++ stack
trace.
# The minidump file format
The minidump file format is similar to core files but was developed by Microsoft
for its crash-uploading facility. A minidump file contains:
* A list of the executable and shared libraries that were loaded in the
process at the time the dump was created. This list includes both file names
and identifiers for the particular versions of those files that were loaded.
* A list of threads present in the process. For each thread, the minidump
includes the state of the processor registers, and the contents of the
threads' stack memory. These data are uninterpreted byte streams, as the
Breakpad client generally has no debugging information available to produce
function names or line numbers, or even identify stack frame boundaries.
* Other information about the system on which the dump was collected:
processor and operating system versions, the reason for the dump, and so on.
Breakpad uses Windows minidump files on all platforms, instead of the
traditional core files, for several reasons:
* Core files can be very large, making them impractical to send across a
network to the collector for processing. Minidumps are smaller, as they were
designed to be used this way.
* The core file format is poorly documented. For example, the Linux Standards
Base does not describe how registers are stored in `PT_NOTE` segments.
* It is harder to persuade a Windows machine to produce a core dump file than
it is to persuade other machines to write a minidump file.
* It simplifies the Breakpad processor to support only one file format.
# Overview/Life of a minidump
A minidump is generated via calls into the Breakpad library. By default,
initializing Breakpad installs an exception/signal handler that writes a
minidump to disk at exception time. On Windows, this is done via
`SetUnhandledExceptionFilter()`; on OS X, this is done by creating a thread that
waits on the Mach exception port; and on Linux, this is done by installing a
signal handler for various exceptions like `SIGILL, SIGSEGV` etc.
Once the minidump is generated, each platform has a slightly different way of
uploading the crash dump. On Windows & Linux, a separate library of functions is
provided that can be called into to do the upload. On OS X, a separate process
is spawned that prompts the user for permission, if configured to do so, and
sends the file.
# Terminology
**In-process vs. out-of-process exception handling** - it's generally considered
that writing the minidump from within the crashed process is unsafe - key
process data structures could be corrupted, or the stack on which the exception
handler runs could have been overwritten, etc. All 3 platforms support what's
known as "out-of-process" exception handling.
# Integration overview
## Breakpad Code Overview
All the client-side code is found by visiting the Google Project at
http://code.google.com/p/google-breakpad. The following directory structure is
present in the `src` directory:
* `processor` Contains minidump-processing code that is used on the server
side and isn't of use on the client side
* `client` Contains client minidump-generation libraries for all platforms
* `tools` Contains source code & projects for building various tools on each
platform.
(Among other directories)
* <a
href='http://code.google.com/p/google-breakpad/wiki/WindowsClientIntegration'>Windows
Integration Guide</a>
* <a
href='http://code.google.com/p/google-breakpad/wiki/MacBreakpadStarterGuide'>Mac
Integration Guide</a>
* <a href='http://code.google.com/p/google-breakpad/wiki/LinuxStarterGuide'>
Linux Integration Guide</a>
## Build process specifics(symbol generation)
This applies to all platforms. Inside `src/tools/{platform}/dump_syms` is a tool
that can read debugging information for each platform (e.g. for OS X/Linux,
DWARF and STABS, and for Windows, PDB files) and generate a Breakpad symbol
file. This tool should be run on your binary before it's stripped(in the case of
OS X/Linux) and the symbol files need to be stored somewhere that the minidump
processor can find. There is another tool, `symupload`, that can be used to
upload symbol files if you have written a server that can accept them.

View file

@ -0,0 +1,97 @@
# How To Add Breakpad To Your Linux Application
This document is an overview of using the Breakpad client libraries on Linux.
## Building the Breakpad libraries
Breakpad provides an Autotools build system that will build both the Linux
client libraries and the processor libraries. Running `./configure && make` in
the Breakpad source directory will produce
**src/client/linux/libbreakpad\_client.a**, which contains all the code
necessary to produce minidumps from an application.
## Integrating Breakpad into your Application
First, configure your build process to link **libbreakpad\_client.a** into your
binary, and set your include paths to include the **src** directory in the
**google-breakpad** source tree. Next, include the exception handler header: ```
# include "client/linux/handler/exception_handler.h"
```
Now you can instantiate an `ExceptionHandler` object. Exception handling is active for the lifetime of the `ExceptionHandler` object, so you should instantiate it as early as possible in your application's startup process, and keep it alive for as close to shutdown as possible. To do anything useful, the `ExceptionHandler` constructor requires a path where it can write minidumps, as well as a callback function to receive information about minidumps that were written:
```
static bool dumpCallback(const google_breakpad::MinidumpDescriptor& descriptor,
void* context, bool succeeded) { printf("Dump path: %s\n", descriptor.path());
return succeeded; }
void crash() { volatile int* a = (int*)(NULL); *a = 1; }
int main(int argc, char* argv[]) { google_breakpad::MinidumpDescriptor
descriptor("/tmp"); google_breakpad::ExceptionHandler eh(descriptor, NULL,
dumpCallback, NULL, true, -1); crash(); return 0; } ```
Compiling and running this example should produce a minidump file in /tmp, and
it should print the minidump filename before exiting. You can read more about
the other parameters to the `ExceptionHandler` constructor <a
href='http://code.google.com/p/google-breakpad/source/browse/trunk/src/client/linux/handler/exception_handler.h'>in
the exception_handler.h source file</a>.
**Note**: You should do as little work as possible in the callback function.
Your application is in an unsafe state. It may not be safe to allocate memory or
call functions from other shared libraries. The safest thing to do is `fork` and
`exec` a new process to do any work you need to do. If you must do some work in
the callback, the Breakpad source contains <a
href='http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/linux/linux_libc_support.h'>some
simple reimplementations of libc functions</a>, to avoid calling directly into
libc, as well as <a href='http://code.google.com/p/linux-syscall-support/'>a
header file for making Linux system calls</a> (in **src/third\_party/lss**) to
avoid calling into other shared libraries.
## Sending the minidump file
In a real application, you would want to handle the minidump in some way, likely
by sending it to a server for analysis. The Breakpad source tree contains <a
href='http://code.google.com/p/google-breakpad/source/browse/#svn/trunk/src/common/linux'>some
HTTP upload source</a> that you might find useful, as well as <a
href='http://code.google.com/p/google-breakpad/source/browse/#svn/trunk/src/tools/linux/symupload'>a
minidump upload tool</a>.
## Producing symbols for your application
To produce useful stack traces, Breakpad requires you to convert the debugging
symbols in your binaries to <a
href='http://code.google.com/p/google-breakpad/wiki/SymbolFiles'>text-format
symbol files</a>. First, ensure that you've compiled your binaries with `-g` to
include debugging symbols. Next, compile the `dump_syms` tool by running
`configure && make` in the Breakpad source directory. Next, run `dump_syms` on
your binaries to produce the text-format symbols. For example, if your main
binary was named `test`: `$ google-breakpad/src/tools/linux/dump_syms/dump_syms
./test > test.sym
`
In order to use these symbols with the `minidump_stackwalk` tool, you will need
to place them in a specific directory structure. The first line of the symbol
file contains the information you need to produce this directory structure, for
example (your output will vary): `$ head -n1 test.sym MODULE Linux x86_64
6EDC6ACDB282125843FD59DA9C81BD830 test $ mkdir -p
./symbols/test/6EDC6ACDB282125843FD59DA9C81BD830 $ mv test.sym
./symbols/test/6EDC6ACDB282125843FD59DA9C81BD830
`
You may also find the <a
href='http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/tools/symbolstore.py'>symbolstore.py</a>
script in the Mozilla repository useful, as it encapsulates these steps.
## Processing the minidump to produce a stack trace
Breakpad includes a tool called `minidump_stackwalk` which can take a minidump
plus its corresponding text-format symbols and produce a symbolized stacktrace.
It should be in the **google-breakpad/src/processor** directory if you compiled
the Breakpad source using the directions above. Simply pass it the minidump and
the symbol path as commandline parameters:
`google-breakpad/src/processor/minidump_stackwalk minidump.dmp ./symbols
` It produces verbose output on stderr, and the stacktrace on stdout, so you may
want to redirect stderr.

View file

@ -0,0 +1,47 @@
# Introduction
Linux implements its userland-to-kernel transition using a special library
called linux-gate.so that is mapped by the kernel into every process. For more
information, see
http://www.trilithium.com/johan/2005/08/linux-gate/
In a nutshell, the problem is that the system call gate function,
kernel\_vsyscall does not use EBP to point to the frame pointer.
However, the Breakpad processor supports special frames like this via STACK
lines in the symbol file. If you look in src/client/linux/data you will see
symbol files for linux-gate.so for both Intel & AMD(the implementation of
kernel\_vsyscall changes depending on the CPU manufacturer). When processing
minidumps from Linux 2.6, having these symbol files is necessary for walking the
stack for crashes that happen while a thread is in a system call.
If you're just interested in processing minidumps, those two symbol files should
be all you need!
# Details
The particular details of understanding the linux-gate.so symbol files can be
found by reading about STACK lines inside
src/common/windows/pdb\_source\_line\_writer.cc, and the above link. To
summarize briefly, we just have to inform the processor how to get to the
previous frame when the EIP is inside kernel\_vsyscall, and we do that by
telling the processor how many bytes kernel\_vsyscall has pushed onto the stack
in it's prologue. For example, one of the symbol files looks somewhat like the
following:
MODULE Linux x86 random\_debug\_id linux-gate.so PUBLIC 400 0 kernel\_vsyscall
STACK WIN 4 100 1 1 0 0 0 0 0 1
The PUBLIC line indicates that kernel\_vsyscall is at offset 400 (in bytes) from
the beginning of linux-gate.so. The STACK line indicates the size of the
function(100), how many bytes it pushes(1), and how many bytes it pops(1). The
last 1 indicates that EBP is pushed onto the stack before being used by the
function.
# Warnings
These functions might change significantly depending on kernel version. In my
opinion, the actual function stack information is unlikely to change frequently,
but the Linux kernel might change the address of kernel\_vsyscall w.r.t the
beginning of linux-gate.so, which would cause these symbol files to be invalid.

View file

@ -0,0 +1,184 @@
# How To Add Breakpad To Your Mac Client Application
This document is a step-by-step recipe to get your Mac client app to build with
Breakpad.
## Preparing a binary build of Breakpad for use in your tree
You can either check in a binary build of the Breakpad framework & tools or
build it as a dependency of your project. The former is recommended, and
detailed here, since building dependencies through other projects is
problematic(matching up configuration names), and the Breakpad code doesn't
change nearly often enough as your application's will.
## Building the requisite targets
All directories are relative to the `src` directory of the Breakpad checkout.
* Build the 'All' target of `client/mac/Breakpad.xcodeproj` in Release mode.
* Execute `cp -R client/mac/build/Release/Breakpad.framework <location in your
source tree>`
* Inside `tools/mac/dump_syms` directory, build dump\_syms.xcodeproj, and copy
tools/mac/dump\_syms/build/Release/dump\_syms to a safe location where it
can be run during the build process.
## Adding Breakpad.framework
Inside your application's framework, add the Breakpad.Framework to your
project's framework settings. When you select it from the file chooser, it will
let you pick a target to add it to; go ahead and check the one that's relevant
to your application.
## Copy Breakpad into your Application Package
Copy Breakpad into your Application Package, so it will be around at run time.
Go to the Targets section of your Xcode Project window. Hit the disclosure
triangle to reveal the build phases of your application. Add a new Copy Files
phase using the Contextual menu (Control Click). On the General panel of the new
'Get Info' of this new phase, set the destination to 'Frameworks' Close the
'Info' panel. Use the Contextual Menu to Rename your new phase 'Copy Frameworks'
Now drag Breakpad again into this Copy Frameworks phase. Drag it from whereever
it appears in the project file tree.
## Add a New Run Script build phase
Near the end of the build phases, add a new Run Script build phase. This will be
run before Xcode calls /usr/bin/strip on your project. This is where you'll be
calling dump\_sym to output the symbols for each architecture of your build. In
my case, the relevant lines read:
```
#!/bin/sh
$TOOL_DIR=<location of dump_syms from step 3 above>
"$TOOL_DIR/dump_syms" -a ppc "$PROD" > "$TARGET_NAME ppc.breakpad"
"$TOOL_DIR/dump_syms" -a i386 "$PROD" > "$TARGET_NAME i386.breakpad"
```
## Adjust the Project Settings
* Turn on Separate Strip,
* Set the Strip Style to Non-Global Symbols.
## Write Code!
You'll need to have an object that acts as the delegate for NSApplication.
Inside this object's header, you'll need to add
1. add an ivar for Breakpad and
2. a declaration for the applicationShouldTerminate:(NSApplication`*` sender)
message.
```
#import <Breakpad/Breakpad.h>
@interface BreakpadTest : NSObject {
.
.
.
BreakpadRef breakpad;
.
.
.
}
.
.
- (NSApplicationTerminateReply)applicationShouldTerminate:(NSApplication *)sender;
.
.
@end
```
Inside your object's implementation file,
1. add the following method InitBreakpad
2. modify your awakeFromNib method to look like the one below,
3. modify/add your application's delegate method to look like the one below
```
static BreakpadRef InitBreakpad(void) {
NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
BreakpadRef breakpad = 0;
NSDictionary *plist = [[NSBundle mainBundle] infoDictionary];
if (plist) {
// Note: version 1.0.0.4 of the framework changed the type of the argument
// from CFDictionaryRef to NSDictionary * on the next line:
breakpad = BreakpadCreate(plist);
}
[pool release];
return breakpad;
}
- (void)awakeFromNib {
breakpad = InitBreakpad();
}
- (NSApplicationTerminateReply)applicationShouldTerminate:(NSApplication *)sender {
BreakpadRelease(breakpad);
return NSTerminateNow;
}
```
## Configure Breakpad
Configure Breakpad for your application.
1. Take a look inside the Breakpad.framework at the Breakpad.h file for the
keys, default values, and descriptions to be passed to BreakpadCreate().
2. Add/Edit the Breakpad specific entries in the dictionary passed to
BreakpadCreate() -- typically your application's info plist.
Example from the Notifier Info.plist:
`<key>BreakpadProduct</key><string>Google_Notifier_Mac</string>
<key>BreakpadProductDisplay</key><string>${PRODUCT_NAME}</string>
`
## Build Your Application
Almost done!
## Verify
Double-check:
Your app should have in its package contents:
myApp.app/Contents/Frameworks/Breakpad.framework.
The symbol files have reasonable contents (you can look at them with a text
editor.)
Look again at the Copy Frameworks phase of your project. Are you leaking .h
files? Select them and delete them. (If you drag a bunch of files into your
project, Xcode often wants to copy your .h files into the build, revealing
Google secrets. Be vigilant!)
## Upload the symbol file
You'll need to configure your build process to store symbols in a location that
is accessible by the minidump processor. There is a tool in tools/mac/symupload
that can be used to send the symbol file via HTTP post.
1. Test
Configure breakpad to send reports to a URL by adding to your app's Info.plist:
```
<key>BreakpadURL</key>
<string>upload URL</string>
<key>BreakpadReportInterval</key>
<string>30</string>
```
## Final Notes
Breakpad checks whether it is being run under a debugger, and if so, normally
does nothing. But, you can force Breakpad to function under a debugger by
setting the Unix shell variable BREAKPAD\_IGNORE\_DEBUGGER to a non-zero value.
You can bracket the source code in the above Write The Code step with #if DEBUG
to completely eliminate it from Debug builds. See
//depot/googlemac/GoogleNotifier/main.m for an example. FYI, when your process
forks(), exception handlers are reset to the default for child processes. So
they must reinitialize Breakpad, otherwise exceptions will be handled by Apple's
Crash Reporter.

View file

@ -0,0 +1,84 @@
# Breakpad Crash Reporting for Mozilla
* January 24, 2007
* Links updated February 14, 2007
* Mozilla HQ
* Mark Mentovai
* Brian Ryner
## What is a crash reporter?
* Enables developers to analyze crashes that occur in the wild
* Produces stack backtraces that help identify how a program failed
* Offers higher-level data aggregation (topcrashes, MTBF statistics)
## Motivation
* Talkback is proprietary and unmaintained
* Smaller open-source projects have few options
* Larger projects need flexibility and scalability
## Design Options
* Stackwalking done on client
* Apple CrashReporter
* GNOME BugBuddy
* Client sends memory dump
* Talkback
* Windows Error Reporting
* Breakpad
## Goals
* Provide libraries around which systems can be based
* Open-source
* Cross-platform
* Mac OS X x86, PowerPC
* Linux x86
* Windows x86
* No requirement to distribute symbols
## Client Libraries
* Exception handler installed at application startup
* Spawns a separate thread
* Minidump file written at crash time
* Format used by Windows debuggers
* Separate application invoked to send
* HTTP[S](S.md) POST, can include additional parameters
## Symbols
* Cross-platform symbol file format
* Contents
* Function names
* Source file names and line numbers
* Windows: Frame pointer omission data
* Future: parameters and local variables
* Symbol conversion methods
## Processor
* Examines minidump file and invokes stackwalker
* Symbol files requested from a SymbolSupplier
* Produces stack trace
* Output may be placed where convenient
## Intergation
* Breakpad client present in Gran Paradiso Alpha 1 for Windows
* Disabled by default
* Enable with `MOZ_AIRBAG`
* Proof-of-concept collector
* http://mavra.perilith.com/~luser/airbag-collector/list.pl
* Other platforms coming soon
## More Information
* Project home: http://code.google.com/p/google-breakpad/
* Mailing lists
* [google-breakpad-dev@googlegroups.com]
(http://groups.google.com/group/google-breakpad-dev/)
* [google-breakpad-discuss@googlegroups.com]
(http://groups.google.com/group/google-breakpad-discuss/)
* Ask me (irc.mozilla.org: mento)

230
docs/processor_design.md Normal file
View file

@ -0,0 +1,230 @@
# Breakpad Processor Library
## Objective
The Breakpad processor library is an open-source framework to access the the
information contained within crash dumps for multiple platforms, and to use that
information to produce stack traces showing the call chain of each thread in a
process. After processing, this data is made available to users of the library.
## Background
The Breakpad processor is intended to sit at the core of a comprehensive
crash-reporting system that does not require debugging information to be
provided to those running applications being monitored. Some existing
crash-reporting systems, such as [GNOME](http://www.gnome.org/)s Bug-Buddy and
[Apple](http://www.apple.com/)s [CrashReporter]
(http://developer.apple.com/technotes/tn2004/tn2123.html), require symbolic
information to be present on the end users computer; in the case of
CrashReporter, the reports are transmitted only to Apple, not to third-party
developers. Other systems, such as [Microsoft](http://www.microsoft.com/)s
[Windows Error Reporting](http://msdn.microsoft.com/isv/resources/wer/) and
SupportSofts Talkback, transmit only a snapshot of a crashed process state,
which can later be combined with symbolic debugging information without the need
for it to be present on end users computers. Because symbolic debugging
information consumes a large amount of space and is otherwise not needed during
the normal operation of software, and because some developers are reluctant to
release debugging symbols to their customers, Breakpad follows the latter
approach.
We know of no currently-maintained crash-reporting systems that meet our
requirements, which are to: * allow for symbols to be separate from the
application, * handle crash reports from multiple platforms, * allow developers
to operate their own crash-reporting platform, and to * be open-source. Windows
Error Reporting only functions for Microsoft products, and requires the
involvement of Microsofts servers. Talkback, while cross-platform, has not been
maintained and at this point does not support Mac OS X on x86, which we consider
to be a significant platform. Talkback is also closed-source commercial
software, and has very specific requirements for its server platform.
We are aware of Windows-only crash-reporting systems that leverage Microsofts
debugging interfaces. Such systems, even if extended to support dumps from other
platforms, are tied to using Windows for at least a portion of the processor
platform.
## Overview
The Breakpad processor itself is written in standard C++ and will work on a
variety of platforms. The dumps it accepts may also have been created on a
variety of systems. The library is able to combine dumps with symbolic debugging
information to create stack traces that include function signatures. The
processor library includes simple command-line tools to examine dumps and
process them, producing stack traces. It also exposes several layers of APIs
enabling crash-reporting systems to be built around the Breakpad processor.
## Detailed Design
### Dump Files
In the processor, the dump data is of primary significance. Dumps typically
contain:
* CPU context (register data) as it was at the time the crash occurred, and an
indication of which thread caused the crash. General-purpose registers are
included, as are special-purpose registers such as the instruction pointer
(program counter).
* Information about each thread of execution within a crashed process,
including:
* The memory region used for each threads stack.
* CPU context for each thread, which for various reasons is not the same
as the crash context in the case of the crashed thread.
* A list of loaded code segments (or modules), including:
* The name of the file (`.so`, `.exe`, `.dll`, etc.) which provides the
code.
* The boundaries of the memory region in which the code segment is visible
to the process.
* A reference to the debugging information for the code module, when such
information is available.
Ordinarily, dumps are produced as a result of a crash, but other triggers may be
set to produce dumps at any time a developer deems appropriate. The Breakpad
processor can handle dumps in the minidump format, either generated by an
[Breakpad client “handler”](client_design.md) implementation, or by another
implementation that produces dumps in this format. The
[DbgHelp.dll!MiniDumpWriteDump]
(http://msdn2.microsoft.com/en-us/library/ms680360.aspx) function on Windows
produces dumps in this format, and is the basis for the Breakpad handler
implementation on that platform.
The [minidump format]
(http://msdn.microsoft.com/en-us/library/ms679293%28VS.85%29.aspx) is
essentially a simple container format, organized as a series of streams. Each
stream contains some type of data relevant to the crash. A typical “normal”
minidump contains streams for the thread list, the module list, the CPU context
at the time of the crash, and various bits of additional system information.
Other types of minidump can be generated, such as a full-memory minidump, which
in addition to stack memory contains snapshots of all of a process mapped
memory regions.
The minidump format was chosen as Breakpads dump format because it has an
established track record on Windows, and it can be adapted to meet the needs of
the other platforms that Breakpad supports. Most other operating systems use
“core” files as their native dump formats, but the capabilities of core files
vary across platforms, and because core files are usually presented in a
platforms native executable format, there are complications involved in
accessing the data contained therein without the benefit of the header files
that define an executable formats entire structure. Because minidumps are
leaner than a typical executable format, a redefinition of the format in a
cross-platform header file, `minidump_format.h`, was a straightforward task.
Similarly, the capabilities of the minidump format are understood, and because
it provides an extensible container, any of Breakpads needs that could not be
met directly by the standard minidump format could likely be met by extending it
as needed. Finally, using this format means that the dump file is compatible
with native debugging tools at least on Windows. A possible future avenue for
exploration is the conversion of minidumps to core files, to enable this same
benefit on other platforms.
We have already provided an extension to the minidump format that allows it to
carry dumps generated on systems with PowerPC processors. The format already
allows for variable CPUs, so our work in this area was limited to defining a
context structure sufficient to represent the execution state of a PowerPC. We
have also defined an extension that allows minidumps to indicate which thread of
execution requested a dump be produced for non-crash dumps.
Often, the information contained within a dump alone is sufficient to produce a
full stack backtrace for each thread. Certain optimizations that compilers
employ in producing code frustrate this process. Specifically, the “frame
pointer omission” optimization of x86 compilers can make it impossible to
produce useful stack traces given only a stack snapshot and CPU context. In
these cases, however, compiler-emitted debugging information can aid in
producing useful stack traces. The Breakpad processor is able to take advantage
of this debugging information as supplied by Microsofts C/C++ compiler, the
only compiler to apply such optimizations by default. As a result, the Breakpad
processor can produce useful stack traces even from code with frame pointer
omission optimizations as produced by this compiler.
### Symbol Files
The [symbol files](symbol_files.md) that the Breakpad processor accepts allow
for frame pointer omission data, but this is only one of their capabilities.
Each symbol file also includes information about the functions, source files,
and source code line numbers for a single module of code. A module is an
individually-loadble chunk of code: these can be executables containing a main
program (`exe` files on Windows) or shared libraries (`.so` files on Linux,
`.dylib` files, frameworks, and bundles on Mac OS X, and `.dll` files on
Windows). Dumps contain information about which of these modules were loaded at
the time the dump was produced, and given this information, the Breakpad
processor attempts to locate debugging symbols for the module through a
user-supplied function embodied in a “symbol supplier.” Breakpad includes a
sample symbol supplier, called `SimpleSymbolSupplier`, that is used by its
command-line tools; this supplier locates symbol files by pathname.
`SimpleSymbolSupplier` is also available to other users of the Breakpad
processor library. This allows for the use of a simple reference implementation,
but preserves flexibility for users who may have more demanding symbol file
storage needs.
Breakpads symbol file format is text-based, and was defined to be fairly
human-readable and to encompass the needs of multiple platforms. The Breakpad
processor itself does not operate directly with native symbol formats ([DWARF]
(http://dwarf.freestandards.org/) and [STABS]
(http://sourceware.org/gdb/current/onlinedocs/stabs.html) on most Unix-like
systems, [.pdb files]
(http://msdn2.microsoft.com/en-us/library/yd4f8bd1(VS.80).aspx) on Windows),
because of the complications in accessing potentially complex symbol formats
with slight variations between platforms, stored within different types of
binary formats. In the case of `.pdb` files, the debugging format is not even
documented. Instead, Breakpads symbol files are produced on each platform,
using specific debugging APIs where available, to convert native symbols to
Breakpads cross-platform format.
### Processing
Most commonly, a developer will enable an application to use Breakpad by
building it with a platform-specific [client “handler”](client_design.md)
library. After building the application, the developer will create symbol files
for Breakpads use using the included `dump_syms` or `symupload` tools, or
another suitable tool, and place the symbol files where the processors symbol
supplier will be able to locate them.
When a dump file is given to the processors `MinidumpProcessor` class, it will
read it using its included minidump reader, contained in the `Minidump` family
of classes. It will collect information about the operating system and CPU that
produced the dump, and determine whether the dump was produced as a result of a
crash or at the direct request of the application itself. It then loops over all
of the threads in a process, attempting to walk the stack associated with each
thread. This process is achieved by the processors `Stackwalker` components, of
which there are a slightly different implementations for each CPU type that the
processor is able to handle dumps from. Beginning with a threads context, and
possibly using debugging data, the stackwalker produces a list of stack frames,
containing each instruction executed in the chain. These instructions are
matched up with the modules that contributed them to a process, and the
`SymbolSupplier` is invoked to locate a symbol file. The symbol file is given to
a `SourceLineResolver`, which matches the instruction up with a specific
function name, source file, and line number, resulting in a representation of a
stack frame that can easily be used to identify which code was executing.
The results of processing are made available in a `ProcessState` object, which
contains a vector of threads, each containing a vector of stack frames.
For small-scale use of the Breakpad processor, and for testing and debugging,
the `minidump_stackwalk` tool is provided. It invokes the processor and displays
the full results of processing, optionally allowing symbols to be provided to
the processor by a pathname-based symbol supplier, `SimpleSymbolSupplier`.
For lower-level testing and debugging, the processor library also includes a
`minidump_dump` tool, which walks through an entire minidump file and displays
its contents in somewhat readable form.
### Platform Support
The Breakpad processor library is able to process dumps produced on Mac OS X
systems running on x86, x86-64, and PowerPC processors, on Windows and Linux
systems running on x86 or x86-64 processors, and on Android systems running ARM
or x86 processors. The processor library itself is written in standard C++, and
should function properly in most Unix-like environments. It has been tested on
Linux and Mac OS X.
## Future Plans
There are currently no firm plans or timetables to implement any of these
features, although they are possible avenues for future exploration.
The symbol file format can be extended to carry information about the locations
of parameters and local variables as stored in stack frames and registers, and
the processor can use this information to provide enhanced stack traces showing
function arguments and variable values.
On Mac OS X and Linux, we can provide tools to convert files from the minidump
format into the native core format. This will enable developers to open dump
files in a native debugger, just as they are presently able to do with minidumps
on Windows.

160
docs/stack_walking.md Normal file
View file

@ -0,0 +1,160 @@
# Introduction
This page aims to provide a detailed description of how Breakpad produces stack
traces from the information contained within a minidump file.
# Details
## Starting the Process
Typically the stack walking process is initiated by instantiating the
[MinidumpProcessor]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/minidump_processor.cc)
class and calling the [MinidumpProcessor::Process]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/minidump_processor.cc#61)
method, providing it a minidump file to process. To produce a useful stack
trace, the MinidumpProcessor requires two other objects which are passed in its
constructor: a [SymbolSupplier]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/symbol_supplier.h)
and a [SourceLineResolverInterface]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h).
The SymbolSupplier object is responsible for locating and providing SymbolFiles
that match modules from the minidump. The SourceLineResolverInterface is
responsible for loading the symbol files and using the information contained
within to provide function and source information for stack frames, as well as
information on how to unwind from a stack frame to its caller. More detail will
be provided on these interactions later.
A number of data streams are extracted from the minidump to begin stack walking:
the list of threads from the process ([MinidumpThreadList]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#335)),
the list of modules loaded in the process ([MinidumpModuleList]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#501)),
and information about the exception that caused the process to crash
([MinidumpException]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#615)).
## Enumerating Threads
For each thread in the thread list ([MinidumpThread]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#299)),
the thread memory containing the stack for the thread ([MinidumpMemoryRegion]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#236))
and the CPU context representing the CPU state of the thread at the time the
dump was written ([MinidumpContext]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#171))
are extracted from the minidump. If the thread being processed is the thread
that produced the exception then a CPU context is obtained from the
MinidumpException object instead, which represents the CPU state of the thread
at the point of the exception. A stack walker is then instantiated by calling
the [Stackwalker::StackwalkerForCPU]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/stackwalker.h#77)
method and passing it the CPU context, the thread memory, the module list, as
well as the SymbolSupplier and SourceLineResolverInterface. This method selects
the specific !Stackwalker subclass based on the CPU architecture of the provided
CPU context and returns an instance of that subclass.
## Walking a thread's stack
Once a !Stackwalker instance has been obtained, the processor calls the
[Stackwalker::Walk]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h)
method to obtain a list of frames representing the stack of this thread. The
!Stackwalker starts by calling the GetContextFrame method which returns a
StackFrame representing the top of the stack, with CPU state provided by the
initial CPU context. From there, the stack walker repeats the following steps
for each frame in turn:
### Finding the Module
The address of the instruction pointer of the current frame is used to determine
which module contains the current frame by calling the module list's
[GetModuleForAddress]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/code_modules.h#56)
method.
### Locating Symbols
If a module is located, the SymbolSupplier is asked to locate symbols
corresponding to the module by calling its [GetCStringSymbolData]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/symbol_supplier.h#87)
method. Typically this is implemented by using the module's debug filename (the
PDB filename for Windows dumps) and debug identifier (a GUID plus one extra
digit) as a lookup key. The [SimpleSymbolSupplier]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/simple_symbol_supplier.cc)
class simply uses these as parts of a file path to locate a flat file on disk.
### Loading Symbols
If a symbol file is located, the SourceLineResolverInterface is then asked to
load the symbol file by calling its [LoadModuleUsingMemoryBuffer]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h#71)
method. The [BasicSourceLineResolver]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/basic_source_line_resolver.cc)
implementation parses the text-format [symbol file](symbol_files.md) into
in-memory data structures to make lookups by address of function names, source
line information, and unwind information easy.
### Getting source line information
If a symbol file has been successfully loaded, the SourceLineResolverInterface's
[FillSourceLineInfo]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h#89)
method is called to provide a function name and source line information for the
current frame. This is done by subtracting the base address of the module
containing the current frame from the instruction pointer of the current frame
to obtain a relative virtual address (RVA), which is a code offset relative to
the start of the module. This RVA is then used as a lookup into a table of
functions ([FUNC lines](SymbolFiles#FUNC_records.md) from the symbol file), each
of which has an associated address range (function start address, function
size). If a function is found whose address range contains the RVA, then its
name is used. The RVA is then used as a lookup into a table of source lines
([line records](SymbolFiles#Line_records.md) from the symbol file), each of
which also has an associated address range. If a match is found it will provide
the file name and source line associated with the current frame. If no match was
found in the function table, another table of publicly exported symbols may be
consulted ([PUBLIC lines](SymbolFiles#PUBLIC_records.md) from the symbol file).
Public symbols contain only a start address, so the lookup simply looks for the
nearest symbol that is less than the provided RVA.
### Finding the caller frame
To find the next frame in the stack, the !Stackwalker calls its [GetCallerFrame]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/stackwalker.h#186)
method, passing in the current frame. Each !Stackwalker subclass implements
GetCallerFrame differently, but there are common patterns.
Typically the first step is to query the SourceLineResolverInterface for the
presence of detailed unwind information. This is done using its
[FindWindowsFrameInfo]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h#96)
and [FindCFIFrameInfo]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h#102)
methods. These methods look for Windows unwind info extracted from a PDB file
([STACK WIN](SymbolFiles#STACK_WIN_records.md) lines from the symbol file), or
DWARF CFI extracted from a binary ([STACK CFI](SymbolFiles#STACK_CFI_records.md)
lines from the symbol file) respectively. The information covers address ranges,
so the RVA of the current frame is used for lookup as with function and source
line information.
If unwind info is found it provides a set of rules to recover the register state
of the caller frame given the current register state as well as the thread's
stack memory. The rules are evaluated to produce the caller frame.
If unwind info is not found then the !Stackwalker may resort to other methods.
Typically on architectures which specify a frame pointer unwinding by
dereferencing the frame pointer is tried next. If that is successful it is used
to produce the caller frame.
If no caller frame was found by any other method most !Stackwalker
implementations resort to stack scanning by looking at each word on the stack
down to a fixed depth (implemented in the [Stackwalker::ScanForReturnAddress]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/stackwalker.h#131)
method) and using a heuristic to attempt to find a reasonable return address
(implemented in the [Stackwalker::InstructionAddressSeemsValid]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/stackwalker.h#111)
method).
If no caller frame is found or the caller frame seems invalid, stack walking
stops. If a caller frame was found then these steps repeat using the new frame
as the current frame.

497
docs/symbol_files.md Normal file
View file

@ -0,0 +1,497 @@
# Introduction
Given a minidump file, the Breakpad processor produces stack traces that include
function names and source locations. However, minidump files contain only the
byte-by-byte contents of threads' registers and stacks, without function names
or machine-code-to-source mapping data. The processor consults Breakpad symbol
files for the information it needs to produce human-readable stack traces from
the binary-only minidump file.
The platform-specific symbol dumping tools parse the debugging information the
compiler provides (whether as DWARF or STABS sections in an ELF file or as
stand-alone PDB files), and write that information back out in the Breakpad
symbol file format. This format is much simpler and less detailed than compiler
debugging information, and values legibility over compactness.
# Overview
Breakpad symbol files are ASCII text files, with lines delimited as appropriate
for the host platform. Each line is a _record_, divided into fields by single
spaces; in some cases, the last field of the record can contain spaces. The
first field is a string indicating what sort of record the line represents
(except for line records; these are very common, making them the default saves
space). Some fields hold decimal or hexadecimal numbers; hexadecimal numbers
have no "0x" prefix, and use lower-case letters.
Breakpad symbol files contain the following record types. With some
restrictions, these may appear in any order.
* A `MODULE` record describes the executable file or shared library from which
this data was derived, for use by symbol suppliers. A `MODULE' record should
be the first record in the file.
* A `FILE` record gives a source file name, and assigns it a number by which
other records can refer to it.
* A `FUNC` record describes a function present in the source code.
* A line record indicates to which source file and line a given range of
machine code should be attributed. The line is attributed to the function
defined by the most recent `FUNC` record.
* A `PUBLIC` record gives the address of a linker symbol.
* A `STACK` record provides information necessary to produce stack traces.
# `MODULE` records
A `MODULE` record provides meta-information about the module the symbol file
describes. It has the form:
> `MODULE` _operatingsystem_ _architecture_ _id_ _name_
For example: `MODULE Linux x86 D3096ED481217FD4C16B29CD9BC208BA0 firefox-bin
` These records provide meta-information about the executable or shared library
from which this symbol file was generated. A symbol supplier might use this
information to find the correct symbol files to use to interpret a given
minidump, or to perform other sorts of validation. If present, a `MODULE` record
should be the first line in the file.
The fields are separated by spaces, and cannot contain spaces themselves, except
for _name_.
* The _operatingsystem_ field names the operating system on which the
executable or shared library was intended to run. This field should have one
of the following values: | **Value** | **Meaning** |
|:----------|:--------------------| | Linux | Linux | | mac | Macintosh OSX
| | windows | Microsoft Windows |
* The _architecture_ field indicates what processor architecture the
executable or shared library contains machine code for. This field should
have one of the following values: | **Value** | **Instruction Set
Architecture** | |:----------|:---------------------------------| | x86 |
Intel IA-32 | | x86\_64 | AMD64/Intel 64 | | ppc | 32-bit PowerPC | | ppc64
| 64-bit PowerPC | | unknown | unknown |
* The _id_ field is a sequence of hexadecimal digits that identifies the exact
executable or library whose contents the symbol file describes. The way in
which it is computed varies from platform to platform.
* The _name_ field contains the base name (the final component of the
directory path) of the executable or library. It may contain spaces, and
extends to the end of the line.
# `FILE` records
A `FILE` record holds a source file name for other records to refer to. It has
the form:
> `FILE` _number_ _name_
For example: `FILE 2 /home/jimb/mc/in/browser/app/nsBrowserApp.cpp
`
A `FILE` record provides the name of a source file, and assigns it a number
which other records (line records, in particular) can use to refer to that file
name. The _number_ field is a decimal number. The _name_ field is the name of
the file; it may contain spaces.
# `FUNC` records
A `FUNC` record describes a source-language function. It has the form:
> `FUNC` _address_ _size_ _parameter\_size_ _name_
For example: `FUNC c184 30 0 nsQueryInterfaceWithError::operator()(nsID const&,
void**) const
`
The _address_ and _size_ fields are hexadecimal numbers indicating the start
address and length in bytes of the machine code instructions the function
occupies. (Breakpad symbol files cannot accurately describe functions whose code
is not contiguous.) The start address is relative to the module's load address.
The _parameter\_size_ field is a hexadecimal number indicating the size, in
bytes, of the arguments pushed on the stack for this function. Some calling
conventions, like the Microsoft Windows `stdcall` convention, require the called
function to pop parameters passed to it on the stack from its caller before
returning. The stack walker uses this value, along with data from `STACK`
records, to step from the called function's frame to the caller's frame.
The _name_ field is the name of the function. In languages that use linker
symbol name mangling like C++, this should be the source language name (the
"unmangled" form). This field may contain spaces.
# Line records
A line record describes the source file and line number to which a given range
of machine code should be attributed. It has the form:
> _address_ _size_ _line_ _filenum_
For example: `c184 7 59 4
`
Because they are so common, line records do not begin with a string indicating
the record type. All other record types' names use upper-case letters;
hexadecimal numbers, like a line record's _address_, use lower-case letters.
The _address_ and _size_ fields are hexadecimal numbers indicating the start
address and length in bytes of the machine code. The address is relative to the
module's load address.
The _line_ field is the line number to which the machine code should be
attributed, in decimal; the first line of the source file is line number 1. The
_filenum_ field is a decimal number appearing in a prior `FILE` record; the name
given in that record is the source file name for the machine code.
The line is assumed to belong to the function described by the last preceding
`FUNC` record. Line records may not appear before the first `FUNC' record.
No two line records in a symbol file cover the same range of addresses. However,
there may be many line records with identical line and file numbers, as a given
source line may contribute many non-contiguous blocks of machine code.
# `PUBLIC` records
A `PUBLIC` record describes a publicly visible linker symbol, such as that used
to identify an assembly language entry point or region of memory. It has the
form:
> PUBLIC _address_ _parameter\_size_ _name_
For example: `PUBLIC 2160 0 Public2_1
`
The Breakpad processor essentially treats a `PUBLIC` record as defining a
function with no line number data and an indeterminate size: the code extends to
the next address mentioned. If a given address is covered by both a `PUBLIC`
record and a `FUNC` record, the processor uses the `FUNC` data.
The _address_ field is a hexadecimal number indicating the symbol's address,
relative to the module's load address.
The _parameter\_size_ field is a hexadecimal number indicating the size of the
parameters passed to the code whose entry point the symbol marks, if known. This
field has the same meaning as the _parameter\_size_ field of a `FUNC` record;
see that description for more details.
The _name_ field is the name of the symbol. In languages that use linker symbol
name mangling like C++, this should be the source language name (the "unmangled"
form). This field may contain spaces.
# `STACK WIN` records
Given a stack frame, a `STACK WIN` record indicates how to find the frame that
called it. It has the form:
> STACK WIN _type_ _rva_ _code\_size_ _prologue\_size_ _epilogue\_size_
> _parameter\_size_ _saved\_register\_size_ _local\_size_ _max\_stack\_size_
> _has\_program\_string_ _program\_string\_OR\_allocates\_base\_pointer_
For example: `STACK WIN 4 2170 14 1 0 0 0 0 0 1 $eip 4 + ^ = $esp $ebp 8 + =
$ebp $ebp ^ =
`
All fields of a `STACK WIN` record, except for the last, are hexadecimal
numbers.
The _type_ field indicates what sort of stack frame data this record holds. Its
value should be one of the values of the [StackFrameTypeEnum]
(http://msdn.microsoft.com/en-us/library/bc5207xw%28VS.100%29.aspx) type in
Microsoft's [Debug Interface Access (DIA)]
(http://msdn.microsoft.com/en-us/library/x93ctkx8%28VS.100%29.aspx) API.
Breakpad uses only records of type 4 (`FrameTypeFrameData`) and 0
(`FrameTypeFPO`); it ignores others. These types differ only in whether the last
field is an _allocates\_base\_pointer_ flag (`FrameTypeFPO`) or a program string
(`FrameTypeFrameData`). If more than one record covers a given address, Breakpad
prefers `FrameTypeFrameData` records over `FrameTypeFPO` records.
The _rva_ and _code\_size_ fields give the starting address and length in bytes
of the machine code covered by this record. The starting address is relative to
the module's load address.
The _prologue\_size_ and _epilogue\_size_ fields give the length, in bytes, of
the prologue and epilogue machine code within the record's range. Breakpad does
not use these values.
The _parameter\_size_ field gives the number of argument bytes this function
expects to have been passed. This field has the same meaning as the
_parameter\_size_ field of a `FUNC` record; see that description for more
details.
The _saved\_register\_size_ field gives the number of bytes in the stack frame
dedicated to preserving the values of any callee-saves registers used by this
function.
The _local\_size_ field gives the number of bytes in the stack frame dedicated
to holding the function's local variables and temporary values.
The _max\_stack\_size_ field gives the maximum number of bytes pushed on the
stack in the frame. Breakpad does not use this value.
If the _has\_program\_string_ field is zero, then the `STACK WIN` record's final
field is an _allocates\_base\_pointer_ flag, as a hexadecimal number; this is
expected for records whose _type_ is 0. Otherwise, the final field is a program
string.
## Interpreting a `STACK WIN` record
Given the register values for a frame F, we can find the calling frame as
follows:
* If the _has\_program\_string_ field of a `STACK WIN` record is zero, then
the final field is _allocates\_base\_pointer_, a flag indicating whether the
frame uses the frame pointer register, `%ebp`, as a general-purpose
register.
* If _allocates\_base\_pointer_ is true, then `%ebp` does not point to the
frame's base address. Instead,
* Let _next\_parameter\_size_ be the parameter size of the function
frame F called (**not** this record's _parameter\_size_ field), or
zero if F is the youngest frame on the stack. You must find this
value in F's callee's `FUNC`, `STACK WIN`, or `PUBLIC` records.
* Let _frame\_size_ be the sum of the _local\_size_ field, the
_saved\_register\_size_ field, and _next\_parameter\_size_. > > With
those definitions in place, we can recover the calling frame as
follows:
* F's return address is at `%esp +`_frame\_size_,
* the caller's value of `%ebp` is saved at `%esp
+`_next\_parameter\_size_`+`_saved\_register\_size_`- 8`, and
* the caller's value of `%esp` just before the call instruction was
`%esp +`_frame\_size_`+ 4`. > > (Why do we include
_next\_parameter\_size_ in the sum when computing _frame\_size_ and
the address of the saved `%ebp`? When a function A has called a
function B, the arguments that A pushed for B are considered part of
A's stack frame: A's value for `%esp` points at the last argument
pushed for B. Thus, we must include the size of those arguments
(given by the debugging info for B) along with the size of A's
register save area and local variable area (given by the debugging
info for A) when computing the overall size of A's frame.)
* If _allocates\_base\_pointer_ is false, then F's function doesn't use
`%ebp` at all. You may recover the calling frame as above, except that
the caller's value of `%ebp` is the same as F's value for `%ebp`, so no
steps are necessary to recover it.
* If the _has\_program\_string_ field of a `STACK WIN` record is not zero,
then the record's final field is a string containing a program to be
interpreted to recover the caller's frame. The comments in the
[postfix\_evaluator.h]
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/postfix_evaluator.h#40)
header file explain the language in which the program is written. You should
place the following variables in the dictionary before interpreting the
program:
* `$ebp` and `$esp` should be the values of the `%ebp` and `%esp`
registers in F.
* `.cbParams`, `.cbSavedRegs`, and `.cbLocals`, should be the values of
the `STACK WIN` record's _parameter\_size_, _saved\_register\_size_, and
_local\_size_ fields.
* `.raSearchStart` should be set to the address on the stack to begin
scanning for a return address, if necessary. The Breakpad processor sets
this to the value of `%esp` in F, plus the _frame\_size_ value mentioned
above.
> If the program stores values for `$eip`, `$esp`, `$ebp`, `$ebx`, `$esi`, or
> `$edi`, then those are the values of the given registers in the caller. If the
> value of `$eip` is zero, that indicates that the end of the stack has been
> reached.
The Breakpad processor checks that the value yielded by the above for the
calling frame's instruction address refers to known code; if the address seems
to be bogus, then it uses a heuristic search to find F's return address and
stack base.
# `STACK CFI` records
`STACK CFI` ("Call Frame Information") records describe how to walk the stack
when execution is at a given machine instruction. These records take one of two
forms:
> `STACK CFI INIT` _address_ _size_ _register<sub>1</sub>_:
> _expression<sub>1</sub>_ _register<sub>2</sub>_: _expression<sub>2</sub>_ ...
>
> `STACK CFI` _address_ _register<sub>1</sub>_: _expression<sub>1</sub>_
> _register<sub>2</sub>_: _expression<sub>2</sub>_ ...
For example:
```
STACK CFI INIT 804c4b0 40 .cfa: $esp 4 + $eip: .cfa 4 - ^
STACK CFI 804c4b1 .cfa: $esp 8 + $ebp: .cfa 8 - ^
```
The _address_ and _size_ fields are hexadecimal numbers. Each
_register_<sub>i</sub> is the name of a register or pseudoregister. Each
_expression_ is a Breakpad postfix expression, which may contain spaces, but
never ends with a colon. (The appropriate register names for a given
architecture are determined when `STACK CFI` records are first enabled for that
architecture, and should be documented in the appropriate
`stackwalker_`_architecture_`.cc` source file.)
STACK CFI records describe, at each machine instruction in a given function, how
to recover the values the machine registers had in the function's caller.
Naturally, some registers' values are simply lost, but there are three cases in
which they can be recovered:
* You can always recover the program counter, because that's the function's
return address. If the function is ever going to return, the PC must be
saved somewhere.
* You can always recover the stack pointer. The function is responsible for
popping its stack frame before it returns to the caller, so it must be able
to restore this, as well.
* You should be able to recover the values of callee-saves registers. These
are registers whose values the callee must preserve, either by saving them
in its own stack frame before using them and re-loading them before
returning, or by not using them at all.
(As an exception, note that functions which never return may not save any of
this data. It may not be possible to walk the stack past such functions' stack
frames.)
Given rules for recovering the values of a function's caller's registers, we can
walk up the stack. Starting with the current set of registers --- the PC of the
instruction we're currently executing, the current stack pointer, etc. --- we
use CFI to recover the values those registers had in the caller of the current
frame. This gives us a PC in the caller whose CFI we can look up; we apply the
process again to find that function's caller; and so on.
Concretely, CFI records represent a table with a row for each machine
instruction address and a column for each register. The table entry for a given
address and register contains a rule describing how, when the PC is at that
address, to restore the value that register had in the caller.
There are some special columns:
* A column named `.cfa`, for "Canonical Frame Address", tells how to compute
the base address of the frame; other entries can refer to the CFA in their
rules.
* A column named `.ra` represents the return address.
For example, suppose we have a machine with 32-bit registers, one-byte
instructions, a stack that grows downwards, and an assembly language that
resembles C. Suppose further that we have a function whose machine code looks
like this:
```
func: ; entry point; return address at sp
func+0: sp -= 16 ; allocate space for stack frame
func+1: sp[12] = r0 ; save 4-byte r0 at sp+12
... ; stuff that doesn't affect stack
func+10: sp -= 4; *sp = x ; push some 4-byte x on the stack
... ; stuff that doesn't affect stack
func+20: r0 = sp[16] ; restore saved r0
func+21: sp += 20 ; pop whole stack frame
func+22: pc = *sp; sp += 4 ; pop return address and jump to it
```
The following table would describe the function above:
**code address** | **.cfa** | **r0 (on Google Code)** | **r1 (on Google Code)** | ... | **.ra**
:--------------- | :------- | :---------------------- | :---------------------- | :-- | :-------
func+0 | sp | | | | `cfa[0]`
func+1 | sp+16 | | | | `cfa[0]`
func+2 | sp+16 | `cfa[-4]` | | | `cfa[0]`
func+11 | sp+20 | `cfa[-4]` | | | `cfa[0]`
func+21 | sp+20 | | | | `cfa[0]`
func+22 | sp | | | | `cfa[0]`
Some things to note here:
* Each row describes the state of affairs **before** executing the instruction
at the given address. Thus, the row for func+0 describes the state before we
execute the first instruction, which allocates the stack frame. In the next
row, the formula for computing the CFA has changed, reflecting the
allocation.
* The other entries are written in terms of the CFA; this allows them to
remain unchanged as the stack pointer gets bumped around. For example, to
find the caller's value for r0 (on Google Code) at func+2, we would first
compute the CFA by adding 16 to the sp, and then subtract four from that to
find the address at which r0 (on Google Code) was saved.
* Although the example doesn't show this, most calling conventions designate
"callee-saves" and "caller-saves" registers. The callee must restore the
values of "callee-saves" registers before returning (if it uses them at
all), whereas the callee is free to use "caller-saves" registers without
restoring their values. A function that uses caller-saves registers
typically does not save their original values at all; in this case, the CFI
marks such registers' values as "unrecoverable".
* Exactly where the CFA points in the frame --- at the return address? below
it? At some fixed point within the frame? --- is a question of definition
that depends on the architecture and ABI in use. But by definition, the CFA
remains constant throughout the lifetime of the frame. It's up to
architecture- specific code to know what significance to assign the CFA, if
any.
To save space, the most common type of CFI record only mentions the table
entries at which changes take place. So for the above, the CFI data would only
actually mention the non-blank entries here:
**insn** | **cfa** | **r0 (on Google Code)** | **r1 (on Google Code)** | ... | **ra**
:------- | :------ | :---------------------- | :---------------------- | :-- | :-------
func+0 | sp | | | | `cfa[0]`
func+1 | sp+16 | | | |
func+2 | | `cfa[-4]` | | |
func+11 | sp+20 | | | |
func+21 | | r0 (on Google Code) | | |
func+22 | sp | | | |
A `STACK CFI INIT` record indicates that, at the machine instruction at
_address_, belonging to some function, the value that _register<sub>n</sub>_ had
in that function's caller can be recovered by evaluating
_expression<sub>n</sub>_. The values of any callee-saves registers not mentioned
are assumed to be unchanged. (`STACK CFI` records never mention caller-saves
registers.) These rules apply starting at _address_ and continue up to, but not
including, the address given in the next `STACK CFI` record. The _size_ field is
the total number of bytes of machine code covered by this record and any
subsequent `STACK CFI` records (until the next `STACK CFI INIT` record). The
_address_ field is relative to the module's load address.
A `STACK CFI` record (no `INIT`) is the same, except that it mentions only those
registers whose recovery rules have changed from the previous CFI record. There
must be a prior `STACK CFI INIT` or `STACK CFI` record in the symbol file. The
_address_ field of this record must be greater than that of the previous record,
and it must not be at or beyond the end of the range given by the most recent
`STACK CFI INIT` record. The address is relative to the module's load address.
Each expression is a breakpad-style postfix expression. Expressions may contain
spaces, but their tokens may not end with colons. When an expression mentions a
register, it refers to the value of that register in the callee, even if a prior
name/expression pair gives that register's value in the caller. The exception is
`.cfa`, which refers to the canonical frame address computed by the .cfa rule in
force at the current instruction.
The special expression `.undef` indicates that the given register's value cannot
be recovered.
The register names preceding the expressions are always followed by colons. The
expressions themselves never contain tokens ending with colons.
There are two special register names:
* `.cfa` ("Canonical Frame Address") is the base address of the stack frame.
Other registers' rules may refer to this. If no rule is provided for the
stack pointer, the value of `.cfa` is the caller's stack pointer.
* `.ra` is the return address. This is the value of the restored program
counter. We use `.ra` instead of the architecture-specific name for the
program counter.
The Breakpad stack walker requires that there be rules in force for `.cfa` and
`.ra` at every code address from which it unwinds. If those rules are not
present, the stack walker will ignore the `STACK CFI` data, and try to use a
different strategy.
So the CFI for the example function above would be as follows, if `func` were at
address 0x1000 (relative to the module's load address):
```
STACK CFI INIT 1000 .cfa: $sp .ra: .cfa ^
STACK CFI 1001 .cfa: $sp 16 +
STACK CFI 1002 $r0: .cfa 4 - ^
STACK CFI 100b .cfa: $sp 20 +
STACK CFI 1015 $r0: $r0
STACK CFI 1016 .cfa: $sp
```

View file

@ -0,0 +1,70 @@
# Windows Integration overview
## Windows Client Code
The Windows client code is in the `src/client/windows` directory of the tree.
Since the header files are fairly well commented some specifics are purposely
omitted from this document.
## Integration of minidump-generation
Once you build the solution inside `src/client/windows`, an output file of
`exception_handler.lib` will be generated. You can either check this into your
project's directory or build directly from the source, as the project itself
does.
Enabling Breakpad in your application requires you to `#include
"exception_handler.h"` and instantiate the `ExceptionHandler` object like so:
```
handler = new ExceptionHandler(const wstring& dump_path,
FilterCallback filter,
MinidumpCallback callback,
void* callback_context,
int handler_types,
MINIDUMP_TYPE dump_type,
const wchar_t* pipe_name,
const CustomClientInfo* custom_info);
```
The parameters, in order, are:
* pathname for minidumps to be written to - this is ignored if OOP dump
generation is used
* A callback that is called when the exception is first handled - you can
return true/false here to continue/stop exception processing
* A callback that is called after minidumps have been written
* Context for the callbacks
* Which exceptions to handle - see `HandlerType` enumeration in
exception\_handler.h
* The type of minidump to generate, using the `MINIDUMP_TYPE` definitions in
`DbgHelp.h`
* A pipe name that can be used to communicate with a crash generation server
* A pointer to a CustomClientInfo class that can be used to send custom data
along with the minidump when using OOP generation
You can also see `src/client/windows/tests/crash_generation_app/*` for a sample
app that uses OOP generation.
## OOP Minidump Generation
For out of process minidump generation, more work is needed. If you look inside
`src/client/windows/crash_generation`, you will see a file called
`crash_generation_server.h`. This file is the interface for a crash generation
server, which must be instantiated with the same pipe name that is passed to the
client above. The logistics of running a separate process that instantiates the
crash generation server is left up to you, however.
## Build process specifics(symbol generation, upload)
The symbol creation step is talked about in the general overview doc, since it
doesn't vary much by platform. You'll need to make sure that the symbols are
available wherever minidumps are uploaded to for processing.
## Out in the field - uploading the minidump
Inside `src/client/windows/sender` is a class implementation called
`CrashReportSender`. This class can be compiled into a separate standalone CLI
or in the crash generation server and used to upload the report; it can know
when to do so via one of the callbacks provided by the `CrashGenerationServer`
or the `ExceptionHandler` object for in-process generation.