[Docs] add markdown docs (converted from Wiki)

BUG=none
R=mark
CC=google-breakpad-dev@googlegroups.com

Review URL: https://codereview.chromium.org/1357773004 .

Patch from Andy Bonventre <andybons@chromium.org>.
2015-09-22 17:29:52 -04:00
commit 0ff15b41ed
parent 4d06db5a1f
17 changed files with 2948 additions and 37 deletions
--- a/37
+++ b/37
@@ -1,37 +0,0 @@
-Breakpad is a set of client and server components which implement a
-crash-reporting system.
-
-
-----
-Getting started in 32-bit mode (from trunk)
-Configure: CXXFLAGS=-m32 CFLAGS=-m32 CPPFLAGS=-m32 ./configure
-    Build: make
-     Test: make check
-  Install: make install
-
-If you need to reconfigure your build be sure to run "make distclean" first.
-
-
-----
-To request change review:
-0. Get a copy of depot_tools repo.
-   http://dev.chromium.org/developers/how-tos/install-depot-tools
-
-1. Create a new directory for checking out the source code.
-   mkdir breakpad && cd breakpad
-
-2. Run the `fetch` tool from depot_tools to download all the source repos.
-   fetch breakpad
-
-3. Make changes. Build and test your changes.
-   For core code like processor use methods above.
-   For linux/mac/windows, there are test targets in each project file.
-
-4. Commit your changes to your local repo and upload them to the server.
-   http://dev.chromium.org/developers/contributing-code
-   e.g. git commit ... && git cl upload ...
-   You will be prompted for credential and a description.
-
-5. At https://codereview.chromium.org/ you'll find your issue listed; click on
-   it, and select Publish+Mail, and enter in the code reviewer and CC
-   google-breakpad-dev@googlegroups.com
--- a/README.md
+++ b/README.md
@@ -0,0 +1,47 @@
+# Breakpad
+
+Breakpad is a set of client and server components which implement a
+crash-reporting system.
+
+## Getting started in 32-bit mode (from trunk)
+
+```sh
+# Configure
+CXXFLAGS=-m32 CFLAGS=-m32 CPPFLAGS=-m32 ./configure
+# Build
+make
+# Test
+make check
+# Install
+make install
+```
+
+If you need to reconfigure your build be sure to run `make distclean` first.
+
+## To request change review:
+
+1.  Get a copy of depot_tools repo.
+    http://dev.chromium.org/developers/how-tos/install-depot-tools
+
+2.  Create a new directory for checking out the source code.
+    `mkdir breakpad && cd breakpad`
+
+3.  Run the `fetch` tool from depot_tools to download all the source repos.
+    `fetch breakpad`
+
+4.  Make changes. Build and test your changes.
+    For core code such as the processor, use the methods above.
+    For linux/mac/windows, there are test targets in each project file.
+
+5.  Commit your changes to your local repo and upload them to the server.
+    http://dev.chromium.org/developers/contributing-code
+    e.g. `git commit ... && git cl upload ...`
+    You will be prompted for credentials and a description.
+
+6.  At https://codereview.chromium.org/ you'll find your issue listed; click on
+    it, and select Publish+Mail, and enter in the code reviewer and CC
+    google-breakpad-dev@googlegroups.com
+
+## Documentation
+
+Visit https://chromium.googlesource.com/breakpad/breakpad/+/master/docs/
--- a/docs/OWNERS
+++ b/docs/OWNERS
@@ -0,0 +1 @@
+*
--- a/docs/breakpad.png
+++ b/docs/breakpad.png
--- a/docs/breakpad.svg
+++ b/docs/breakpad.svg
--- a/docs/client_design.md
+++ b/docs/client_design.md
@@ -0,0 +1,224 @@
+# Breakpad Client Libraries
+
+## Objective
+
+The Breakpad client libraries are responsible for monitoring an application for
+crashes (exceptions), handling them when they occur by generating a dump, and
+providing a means to upload dumps to a crash reporting server. These tasks are
+divided between the “handler” (short for “exception handler”) library linked in
+to an application being monitored for crashes, and the “sender” library,
+intended to be linked in to a separate external program.
+
+## Background
+
+As one of the chief tasks of the client handler is to generate a dump, an
+understanding of [dump files](processor_design.md) will aid in understanding the
+handler.
+
+## Overview
+
+Breakpad provides client libraries for each of its target platforms. Currently,
+these exist for Windows on x86 and Mac OS X on both x86 and PowerPC. A Linux
+implementation has been written and is currently under review.
+
+Because the mechanisms for catching exceptions and the methods for obtaining the
+information that a dump contains vary between operating systems, each target
+operating system requires a completely different handler implementation. Where
+multiple CPUs are supported for a single operating system, the handler
+implementation will likely also require separate code for each processor type to
+extract CPU-specific information. One of the goals of the Breakpad handler is to
+provide a prepackaged cross-platform system that masks many of these
+system-level differences and quirks from the application developer. Although the
+underlying implementations differ, the handler library for each system follows
+the same set of principles and exposes a similar interface.
+
+Code that wishes to take advantage of Breakpad should be linked against the
+handler library, and should, at an appropriate time, install a Breakpad handler.
+For applications, it is generally desirable to install the handler as early in
+the start-up process as possible. Developers of library code using Breakpad to
+monitor itself may wish to install a Breakpad handler when the library is
+loaded, or may only want to install a handler when calls are made in to the
+library.
+
+The handler can be triggered to generate a dump either by catching an exception
+or at the request of the application itself. The latter case may be useful in
+debugging assertions or other conditions where developers want to know how a
+program got in to a specific non-crash state. After generating a dump, the
+handler calls a user-specified callback function. The callback function may
+collect additional data about the program’s state, quit the program, launch a
+crash reporter application, or perform other tasks. Allowing for this
+functionality to be dictated by a callback function preserves flexibility.
+
+The sender library also has a separate implementation for each supported
+platform, because of the varying interfaces for accessing network resources on
+different operating systems. The sender transmits a dump along with other
+application-defined information to a crash report server via HTTP. Because dumps
+may contain sensitive data, the sender allows for the use of HTTPS.
+
+The canonical example of the entire client system would be for a monitored
+application to link against the handler library, install a Breakpad handler from
+its main function, and provide a callback to launch a small crash reporter
+program. The crash reporter program would be linked against the sender library,
+and would send the crash dump when launched. A separate process is recommended
+for this function because of the unreliability inherent in doing any significant
+amount of work from a crashed process.
+
+## Detailed Design
+
+### Exception Handler Installation
+
+The mechanisms for installing an exception handler vary between operating
+systems. On Windows, it’s a relatively simple matter of making one call to
+register a
+[top-level exception filter](http://msdn.microsoft.com/library/en-us/debug/base/setunhandledexceptionfilter.asp)
+callback function. On most Unix-like systems such as Linux, processes are
+informed of exceptions by the delivery of a signal, so an exception handler
+takes the form of a signal handler. The native mechanism to catch exceptions on
+Mac OS X requires a large amount of code to set up a Mach port, identify it as
+the exception port, and assign a thread to listen for an exception on that port.
+Just as the preparation of exception handlers differs, the manner in which they
+are called differs as well. On Windows and most Unix-like systems, the handler
+is called on the thread that caused the exception. On Mac OS X, the thread
+listening to the exception port is notified that an exception has occurred. The
+different implementations of the Breakpad handler libraries perform these tasks
+in the appropriate ways on each platform, while exposing a similar interface on
+each.
+
+A Breakpad handler is embodied in an `ExceptionHandler` object. Because it’s a
+C++ object, `ExceptionHandler`s may be created as local variables, allowing them
+to be installed and removed as functions are called and return. This provides
+one possible way for a developer to monitor only a portion of an application for
+crashes.
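
The scoping behaviour described above can be sketched with a small RAII class. This is a minimal illustration under stated assumptions, not Breakpad's implementation: `ScopedSignalHandler`, `CountSignal`, and `MonitoredSection` are hypothetical names, and the standard C `signal`/`raise` functions stand in for the real platform mechanisms.

```cpp
#include <csignal>

// Hypothetical stand-in for ExceptionHandler: installs a handler for one
// signal on construction and restores the previous handler on destruction.
class ScopedSignalHandler {
 public:
  using Handler = void (*)(int);

  ScopedSignalHandler(int signum, Handler handler)
      : signum_(signum), previous_(std::signal(signum, handler)) {}

  ~ScopedSignalHandler() {
    if (previous_ != SIG_ERR) std::signal(signum_, previous_);
  }

  // Non-copyable: exactly one object owns each installation.
  ScopedSignalHandler(const ScopedSignalHandler&) = delete;
  ScopedSignalHandler& operator=(const ScopedSignalHandler&) = delete;

 private:
  int signum_;
  Handler previous_;
};

// Written only from the handler; sig_atomic_t keeps that write well defined.
volatile std::sig_atomic_t g_signals_seen = 0;

void CountSignal(int) { g_signals_seen = g_signals_seen + 1; }

// Only this function is monitored; the handler is removed when it returns.
int MonitoredSection() {
  ScopedSignalHandler guard(SIGTERM, CountSignal);
  std::raise(SIGTERM);  // Simulated fault, delivered to CountSignal.
  return g_signals_seen;
}
```

Calling `MonitoredSection()` runs the handler once; after it returns, whatever disposition `SIGTERM` had before the call is back in place.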
+
+### Exception Basics
+
+Once an application encounters an exception, it is in an indeterminate and
+possibly hazardous state. Consequently, any code that runs after an exception
+occurs must take extreme care to avoid performing operations that might fail,
+hang, or cause additional exceptions. This task is not at all straightforward,
+and the Breakpad handler library seeks to do it properly, accounting for all of
+the minute details while allowing other application developers, even those with
+little systems programming experience, to reap the benefits. All of the Breakpad
+handler code that executes after an exception occurs has been written according
+to the following guidelines for safety at exception time:
+
+*   Use of the application heap is forbidden. The heap may be corrupt or
+    otherwise unusable, and allocators may not function.
+*   Resource allocation must be severely limited. The handler may create a new
+    file to contain the dump, and it may attempt to launch a process to continue
+    handling the crash.
+*   Execution on the thread that caused the exception is significantly limited.
+    The only code permitted to execute on this thread is the code necessary to
+    transition handling to a dedicated preallocated handler thread, and the code
+    to return from the exception handler.
+*   Handlers shouldn’t handle crashes by attempting to walk stacks themselves,
+    as stacks may be in inconsistent states. Dump generation should be performed
+    by interfacing with the operating system’s memory manager and code module
+    manager.
+*   Library code, including runtime library code, must be avoided unless it
+    provably meets the above guidelines. For example, this means that the STL
+    string class may not be used, because it performs operations that attempt to
+    allocate and use heap memory. It also means that many C runtime functions
+    must be avoided, particularly on Windows, because of heap operations that
+    they may perform.
+
+A dedicated handler thread is used to preserve the state of the exception thread
+when an exception occurs: during dump generation, it is difficult if not
+impossible for a thread to accurately capture its own state. Performing all
+exception-handling functions on a separate thread is also critical when handling
+stack-limit-exceeded exceptions. It would be hazardous to run out of stack space
+while attempting to handle an exception. Because of the rule against allocating
+resources at exception time, the Breakpad handler library creates its handler
+thread when it installs its exception handler. On Mac OS X, this handler thread
+is created during the normal setup of the exception handler, and the handler
+thread will be signaled directly in the event of an exception. On Windows and
+Linux, the handler thread is signaled by a small amount of code that executes on
+the exception thread. Because the code that executes on the exception thread in
+this case is small and safe, this does not pose a problem. Even when an
+exception is caused by exceeding stack size limits, this code is sufficiently
+compact to execute entirely within the stack’s guard page without causing an
+exception.
+
+The handler thread may also be triggered directly by a user call, even when no
+exception occurs, to allow dumps to be generated at any point deemed
+interesting.
+
+### Filter Callback
+
+When the handler thread begins handling an exception, it calls an optional
+user-defined filter callback function, which is responsible for judging whether
+Breakpad’s handler should continue handling the exception or not. This mechanism
+is provided for the benefit of library or plug-in code, whose developers may not
+be interested in reports of crashes that occur outside of their modules but
+within processes hosting their code. If the filter callback indicates that it is
+not interested in the exception, the Breakpad handler arranges for it to be
+delivered to any previously-installed handler.
+
+### Dump Generation
+
+Assuming that the filter callback approves (or does not exist), the handler
+writes a dump in a directory specified by the application developer when the
+handler was installed, using a previously generated unique identifier to avoid
+name collisions. The mechanics of dump generation also vary between platforms,
+but in general, the process involves enumerating each thread of execution, and
+capturing its state, including processor context and the active portion of its
+stack area. The dump also includes a list of the code modules loaded in to the
+application, and an indicator of which thread generated the exception or
+requested the dump. In order to avoid allocating memory during this process, the
+dump is written in place on disk.
+
+### Post-Dump Behavior
+
+Upon completion of writing the dump, a second callback function is called. This
+callback may be used to launch a separate crash reporting program or to collect
+additional data from the application. The callback may also be used to influence
+whether Breakpad will treat the exception as handled or unhandled. Even after a
+dump is successfully generated, Breakpad can be made to behave as though it
+didn’t actually handle an exception. This function may be useful for developers
+who want to test their applications with Breakpad enabled but still retain the
+ability to use traditional debugging techniques. It also allows a
+Breakpad-enabled application to coexist with a platform’s native crash reporting
+system, such as Mac OS X’s
+[CrashReporter](http://developer.apple.com/technotes/tn2004/tn2123.html) and
+[Windows Error Reporting](http://msdn.microsoft.com/isv/resources/wer/).
+
+Typically, when Breakpad handles an exception fully and no debuggers are
+involved, the crashed process will terminate.
+
+Authors of both callback functions that execute within a Breakpad handler are
+cautioned that their code will be run at exception time, and that as a result,
+they should observe the same programming practices that the Breakpad handler
+itself adheres to. Notably, if a callback is to be used to collect additional
+data from an application, it should take care to read only “safe” data. This
+might involve accessing only static memory locations that are updated
+periodically during the course of normal program execution.
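
The order in which the two callbacks run can be modelled in a few lines. This is a sketch of the control flow described in the Filter Callback, Dump Generation, and Post-Dump Behavior sections only; the type names, `HandleException`, and the callback signatures are hypothetical, and Breakpad's real signatures differ between platforms.

```cpp
// Hypothetical callback signatures modelled on the text above.
using FilterCallback = bool (*)(void* context);
using PostDumpCallback = bool (*)(const char* dump_path, void* context,
                                  bool dump_succeeded);
using DumpWriter = bool (*)(const char* dump_path);

enum class Outcome {
  kPassedToPreviousHandler,  // Filter declined; no dump was written.
  kHandled,                  // Exception is reported as handled.
  kUnhandled                 // Dump attempted, exception left unhandled.
};

struct HandlerCallbacks {
  FilterCallback filter;       // Optional; nullptr means "always continue".
  PostDumpCallback post_dump;  // Optional; nullptr means "use dump result".
  void* context;               // Passed through to both callbacks untouched.
};

Outcome HandleException(const HandlerCallbacks& callbacks,
                        DumpWriter write_dump, const char* dump_path) {
  // Step 1: the filter may decline before any dump is written.
  if (callbacks.filter && !callbacks.filter(callbacks.context)) {
    return Outcome::kPassedToPreviousHandler;
  }
  // Step 2: write the dump.
  const bool dump_succeeded = write_dump(dump_path);
  // Step 3: the post-dump callback has the final say on handled/unhandled.
  const bool handled =
      callbacks.post_dump
          ? callbacks.post_dump(dump_path, callbacks.context, dump_succeeded)
          : dump_succeeded;
  return handled ? Outcome::kHandled : Outcome::kUnhandled;
}
```

A post-dump callback that always returns `false` reproduces the "behave as though it didn’t actually handle an exception" case, which leaves a debugger or the platform's native crash reporter free to act.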
+
+### Sender Library
+
+The Breakpad sender library provides a single function to send a crash report to
+a crash server. It accepts a crash server’s URL, a map of key-value parameters
+that will accompany the dump, and the path to a dump file itself. Each of the
+key-value parameters and the dump file are sent as distinct parts of a multipart
+HTTP POST request to the specified URL using the platform’s native HTTP
+facilities. On Linux, [libcurl](http://curl.haxx.se/) is used for this function,
+as it is the closest thing to a standard HTTP library available on that
+platform.
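
To make the request layout concrete, here is a minimal sketch that assembles a `multipart/form-data` body by hand. It is not the sender library's code: `BuildMultipartBody` is a hypothetical helper, the file field name `upload_file_minidump` and the boundary string are assumptions, and a real sender would leave this to the platform's HTTP facilities.

```cpp
#include <map>
#include <string>

// Builds a multipart/form-data body: one part per key-value parameter,
// followed by one part carrying the dump bytes. No escaping is performed, so
// names must not contain quotes and the boundary must not occur in any value.
std::string BuildMultipartBody(const std::map<std::string, std::string>& params,
                               const std::string& dump_filename,
                               const std::string& dump_bytes,
                               const std::string& boundary) {
  std::string body;
  for (const auto& param : params) {
    body += "--" + boundary + "\r\n";
    body += "Content-Disposition: form-data; name=\"" + param.first +
            "\"\r\n\r\n";
    body += param.second + "\r\n";
  }
  // The dump itself travels as a file part. The field name is an assumption.
  body += "--" + boundary + "\r\n";
  body += "Content-Disposition: form-data; name=\"upload_file_minidump\"; "
          "filename=\"" + dump_filename + "\"\r\n";
  body += "Content-Type: application/octet-stream\r\n\r\n";
  body += dump_bytes + "\r\n";
  // The closing delimiter carries two extra trailing hyphens.
  body += "--" + boundary + "--\r\n";
  return body;
}
```

The request would be sent with a `Content-Type: multipart/form-data; boundary=...` header naming the same boundary string.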
+
+## Future Plans
+
+Although we’ve had great success with in-process dump generation by following
+our guidelines for safe code at exception time, we are exploring options for
+allowing dumps to be generated in a separate process, to further enhance the
+handler library’s robustness.
+
+On Windows, we intend to offer tools to make it easier for Breakpad’s settings
+to be managed by the native group policy management system.
+
+We also plan to offer tools that many developers would find desirable in the
+context of handling crashes, such as a mechanism to determine at launch if the
+program last terminated in a crash, and a way to calculate “crashiness” in terms
+of crashes over time or the number of application launches between crashes.
+
+We are also investigating methods to capture crashes that occur early in an
+application’s launch sequence, including crashes that occur before a program’s
+main function begins executing.
--- a/docs/contributing_to_breakpad.md
+++ b/docs/contributing_to_breakpad.md
@@ -0,0 +1,35 @@
+# Introduction
+
+Thanks for thinking of contributing to Breakpad! Unfortunately there are some
+pesky legal issues to get out of the way, but they're quick and painless.
+
+## Legal
+
+If you're doing work individually, not as part of any employment, you'll need to
+sign the <a
+href='http://code.google.com/legal/individual-cla-v1.0.html'>Individual
+Contributor License Agreement</a>. This agreement can be completed
+electronically.
+
+If you're contributing to Breakpad as part of your employment with another
+organization, you'll need to sign a <a
+href='http://code.google.com/legal/corporate-cla-v1.0.html'> Corporate
+Contributor License Agreement</a>. Once completed this document will need to be
+faxed.
+
+**_IMPORTANT_**: The authors (you!) of the contributions will maintain all
+copyrights; the agreements you sign will grant rights to Google to use your
+work.
+
+Thanks, and if you have any questions let me know and I'll loop in the legal guy
+here to get you an answer.
+
+## Technical
+
+Once you have signed the agreement you can be added to our contributors list and
+have write access to code. For full details on getting started see our trunk
+`README`.
+
+## List of people who have signed contributor agreements
+
+None so far.
--- a/docs/exception_handling.md
+++ b/docs/exception_handling.md
@@ -0,0 +1,128 @@
+The goal of this document is to give an overview of the exception handling
+options in breakpad.
+
+# Basics
+
+Exception handling is a mechanism designed to handle the occurrence of
+exceptions, special conditions that change the normal flow of program execution.
+
+On Windows, Breakpad calls `SetUnhandledExceptionFilter` to install its own
+top-level filter, which then receives every unhandled exception while Breakpad
+is enabled. TODO: More on first and second chance and vectored v. try/catch.
+
+There are two main types of exception handling across all platforms: in-process
+and out-of-process.
+
+# In-Process
+
+In-process exception handling is relatively simple since the crashing process
+handles crash reporting. However, it is generally considered unsafe to write a
+minidump from a crashed process. For example, key data structures could be
+corrupted or the stack on which the exception handler runs could have been
+overwritten. For this reason all platforms also support some level of
+out-of-process exception handling.
+
+## Windows
+
+For in-process exception handling, Breakpad creates a 'handler thread' at
+startup that waits indefinitely on a semaphore. When this thread is woken it
+writes the minidump and signals to the excepting thread that it may continue. If
+the minidump was written successfully, the filter tells the OS to kill the
+process; otherwise, exception handling continues.
+
+# Out-of-Process
+
+Out-of-process exception handling is more complicated than in-process exception
+handling because of the need to set up a separate process that can read the
+state of the crashing process.
+
+## Windows
+
+Breakpad uses two abstractions around the exception handler to make things work:
+`CrashGenerationServer` and `CrashGenerationClient`. The constructor for these
+takes a named pipe name.
+
+During server startup, the server creates a named pipe and registers callbacks
+for client connections. The named pipe is used for registration and all IO on
+the pipe is done asynchronously. `OnPipeConnected` is called when a client
+attempts to connect (calls `CreateFile` on the pipe). `OnPipeConnected` does the
+state machine transition from `Initial` to `Connecting` and on through
+`Reading`, `Reading_Done`, `Writing`, `Writing_Done`, `Reading_ACK`, and
+`Disconnecting`.
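
The sequence of pipe states can be written down as a small transition function. This is a simplified sketch: `PipeState` and `NextState` are hypothetical names, only the success path is modelled, and wrapping from `Disconnecting` back to `Initial` (so the pipe can serve the next client) is an assumption rather than something stated above.

```cpp
// States named in the text above, in the order a successful registration
// passes through them.
enum class PipeState {
  kInitial,
  kConnecting,
  kReading,
  kReadingDone,
  kWriting,
  kWritingDone,
  kReadingAck,
  kDisconnecting
};

// Returns the state that follows `state` when the pending asynchronous I/O
// completes successfully. Error paths are deliberately left out.
PipeState NextState(PipeState state) {
  switch (state) {
    case PipeState::kInitial:       return PipeState::kConnecting;
    case PipeState::kConnecting:    return PipeState::kReading;
    case PipeState::kReading:       return PipeState::kReadingDone;
    case PipeState::kReadingDone:   return PipeState::kWriting;
    case PipeState::kWriting:       return PipeState::kWritingDone;
    case PipeState::kWritingDone:   return PipeState::kReadingAck;
    case PipeState::kReadingAck:    return PipeState::kDisconnecting;
    case PipeState::kDisconnecting: return PipeState::kInitial;  // Assumed.
  }
  return PipeState::kInitial;  // Not reached for valid enumerators.
}
```

Because all I/O on the pipe is asynchronous, a function like this would be driven from the completion callback rather than from a loop.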
+
+When registering callbacks, the client passes in two pointers to pointers:
+
+1.  A pointer to the `EXCEPTION_INFO` pointer
+2.  A pointer to the `MDRawAssertionInfo`, which handles various non-exception
+    failures like assertions
+
+The essence of registration is adding a `ClientInfo` object that contains
+handles used for synchronization with the crashing process to an array
+maintained by the server. This is how we can keep track of all the clients on
+the system that have registered for minidumps. These handles are:
+
+*   `server_died` (mutex)
+*   `dump_requested` (event)
+*   `dump_generated` (event)
+
+The server registers asynchronous waits on these events with the `ClientInfo`
+object as the callback context. When the `dump_requested` event is set by the
+client, the `OnDumpRequested()` callback is called. The server uses the handles
+inside `ClientInfo` to communicate with the child process. Once the child sets
+the event, it waits for two objects:
+
+1.  the `dump_generated` event
+2.  the `server_died` mutex
+
+In the end handles are "duped" into the client process, and the clients use
+`SetEvent` to request a dump and then wait on the other event or the
+`server_died` mutex.
+
+## Linux
+
+### Current Status
+
+As of July 2011, Linux had a minidump generator that was not entirely
+out-of-process. The minidump was generated from a separate process, but one that
+shared an address space, file descriptors, signal handlers and much else with
+the crashing process. It worked by using the `clone()` system call to duplicate
+the crashing process, and then used `ptrace()` and the `/proc` file system to
+retrieve the information required to write the minidump. Since then Breakpad has
+updated Linux exception handling to provide more benefits of out-of-process
+report generation.
+
+### Proposed Design
+
+#### Overview
+
+Breakpad would use a per-user daemon to write out a minidump without running
+in, or depending on, the crashing process. We don't want to start a new separate
+process every time a user launches a Breakpad-enabled process. One daemon per
+machine is unacceptable because of security concerns: one user could initiate
+minidump generation for another user's process.
+
+#### Client/Server Communication
+
+On Breakpad initialization in a process, the initializer would check if the
+daemon is running and, if not, start it. The race condition between the check
+and the initialization is not a problem because each daemon can check whether
+the IPC endpoint already exists and whether a server is listening. Even if
+multiple copies of the daemon try to `bind()` the socket to the same filesystem
+name, all but one will fail and can terminate.
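
The "all but one will fail" behaviour comes from `bind()` itself, which refuses a path that already exists. The following is a minimal sketch assuming POSIX sockets; `TryBecomeDaemon` is a hypothetical helper and not part of Breakpad.

```cpp
#include <cerrno>
#include <cstring>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

// Returns a listening UNIX domain socket if this process won the race to own
// `path`. Returns -1 otherwise, with errno set; errno == EADDRINUSE means the
// path is already bound, so this copy of the daemon can simply terminate.
int TryBecomeDaemon(const char* path) {
  sockaddr_un addr;
  std::memset(&addr, 0, sizeof(addr));
  addr.sun_family = AF_UNIX;
  if (std::strlen(path) >= sizeof(addr.sun_path)) {
    errno = ENAMETOOLONG;
    return -1;
  }
  std::strcpy(addr.sun_path, path);  // Length was checked above.

  const int fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) return -1;

  // bind() creates the filesystem entry atomically; only one caller succeeds.
  if (bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) != 0 ||
      listen(fd, 8) != 0) {
    const int saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return -1;
  }
  return fd;
}
```

A second call with the same path fails with `EADDRINUSE` for as long as the filesystem entry exists, whether or not the process that created it is still alive.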
+
+This point is relevant for error handling conditions. Linux does not clean up
+the file system representation of a UNIX domain socket even if both endpoints
+terminate, so checking for existence is not strong enough. However, checking the
+process list or sending a ping on the socket can handle this.
+
+Breakpad uses UNIX domain sockets since they support full duplex communication
+(unlike on Windows, named pipes on Linux are half duplex) and the kernel
+automatically creates a private channel between the client and server once the
+client calls `connect()`.
+
+#### Minidump Generation
+
+Breakpad could use the current system with `ptrace()` and `/proc` within the
+daemon executable.
+
+Overall the operations look like:
+
+1.  Signal from OS indicating crash
+2.  Signal handler suspends all threads except itself
+3.  Signal handler sends `CRASH_DUMP_REQUEST` message to server and waits for
+    response
+4.  Server inspects the crashing process
+5.  Minidump is asynchronously written to disk by the server
+6.  Server responds indicating inspection is done
+
+## Mac OS X
+
+Out-of-process exception handling is fully supported on Mac.
--- a/docs/getting_started_with_breakpad.md
+++ b/docs/getting_started_with_breakpad.md
@@ -0,0 +1,121 @@
+# Introduction
+
+Breakpad is a library and tool suite that allows you to distribute an
+application to users with compiler-provided debugging information removed,
+record crashes in compact "minidump" files, send them back to your server, and
+produce C and C++ stack traces from these minidumps. Breakpad can also write
+minidumps on request for programs that have not crashed.
+
+Breakpad is currently used by Google Chrome, Firefox, Google Picasa, Camino,
+Google Earth, and other projects.
+
+![http://google-breakpad.googlecode.com/svn/wiki/breakpad.png](http://google-breakpad.googlecode.com/svn/wiki/breakpad.png)
+
+Breakpad has three main components:
+
+*   The **client** is a library that you include in your application. It can
+    write minidump files capturing the current threads' state and the identities
+    of the currently loaded executable and shared libraries. You can configure
+    the client to write a minidump when a crash occurs, or when explicitly
+    requested.
+
+*   The **symbol dumper** is a program that reads the debugging information
+    produced by the compiler and produces a **symbol file**, in [Breakpad's own
+    format](symbol_files.md).
+
+*   The **processor** is a program that reads a minidump file, finds the
+    appropriate symbol files for the versions of the executables and shared
+    libraries the minidump mentions, and produces a human-readable C/C++ stack
+    trace.
+
+# The minidump file format
+
+The minidump file format is similar to core files but was developed by Microsoft
+for its crash-uploading facility. A minidump file contains:
+
+*   A list of the executable and shared libraries that were loaded in the
+    process at the time the dump was created. This list includes both file names
+    and identifiers for the particular versions of those files that were loaded.
+
+*   A list of threads present in the process. For each thread, the minidump
+    includes the state of the processor registers, and the contents of the
+    thread's stack memory. These data are uninterpreted byte streams, as the
+    Breakpad client generally has no debugging information available to produce
+    function names or line numbers, or even identify stack frame boundaries.
+
+*   Other information about the system on which the dump was collected:
+    processor and operating system versions, the reason for the dump, and so on.
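
As a small, hedged illustration of the format: a minidump file starts with the four ASCII bytes `MDMP`, the signature field of Microsoft's `MINIDUMP_HEADER`. The helper below is hypothetical and checks only that signature; it does not validate the thread list, module list, or any other stream.

```cpp
#include <cstddef>
#include <cstring>

// Returns true if the buffer starts with the minidump signature "MDMP".
// This is a quick sanity check, not a parser.
bool LooksLikeMinidump(const unsigned char* data, std::size_t size) {
  static const char kSignature[4] = {'M', 'D', 'M', 'P'};
  return size >= sizeof(kSignature) &&
         std::memcmp(data, kSignature, sizeof(kSignature)) == 0;
}
```

A check like this can be useful on the receiving side to reject uploads that are obviously not minidumps before handing them to the processor.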
+
+Breakpad uses Windows minidump files on all platforms, instead of the
+traditional core files, for several reasons:
+
+*   Core files can be very large, making them impractical to send across a
+    network to the collector for processing. Minidumps are smaller, as they were
+    designed to be used this way.
+
+*   The core file format is poorly documented. For example, the Linux Standards
+    Base does not describe how registers are stored in `PT_NOTE` segments.
+
+*   It is harder to persuade a Windows machine to produce a core dump file than
+    it is to persuade other machines to write a minidump file.
+
+*   It simplifies the Breakpad processor to support only one file format.
+
+# Overview/Life of a minidump
+
+A minidump is generated via calls into the Breakpad library. By default,
+initializing Breakpad installs an exception/signal handler that writes a
+minidump to disk at exception time. On Windows, this is done via
+`SetUnhandledExceptionFilter()`; on OS X, this is done by creating a thread that
+waits on the Mach exception port; and on Linux, this is done by installing a
+signal handler for various exceptions such as `SIGILL` and `SIGSEGV`.
+
+Once the minidump is generated, each platform has a slightly different way of
+uploading the crash dump. On Windows & Linux, a separate library of functions is
+provided that can be called into to do the upload. On OS X, a separate process
+is spawned that prompts the user for permission, if configured to do so, and
+sends the file.
+
+# Terminology
+
+**In-process vs. out-of-process exception handling** - it's generally considered
+that writing the minidump from within the crashed process is unsafe - key
+process data structures could be corrupted, or the stack on which the exception
+handler runs could have been overwritten, etc. All 3 platforms support what's
+known as "out-of-process" exception handling.
+
+# Integration overview
+
+## Breakpad Code Overview
+
+All the client-side code is found by visiting the Google Project at
+http://code.google.com/p/google-breakpad. The following directory structure is
+present in the `src` directory:
+
+*   `processor` Contains minidump-processing code that is used on the server
+    side and isn't of use on the client side
+*   `client` Contains client minidump-generation libraries for all platforms
+*   `tools` Contains source code & projects for building various tools on each
+    platform.
+
+(Among other directories)
+
+*   <a
+    href='http://code.google.com/p/google-breakpad/wiki/WindowsClientIntegration'>Windows
+    Integration Guide</a>
+*   <a
+    href='http://code.google.com/p/google-breakpad/wiki/MacBreakpadStarterGuide'>Mac
+    Integration Guide</a>
+*   <a href='http://code.google.com/p/google-breakpad/wiki/LinuxStarterGuide'>
+    Linux Integration Guide</a>
+
+## Build process specifics (symbol generation)
+
+This applies to all platforms. Inside `src/tools/{platform}/dump_syms` is a tool
+that can read debugging information for each platform (e.g. for OS X/Linux,
+DWARF and STABS, and for Windows, PDB files) and generate a Breakpad symbol
+file. This tool should be run on your binary before it's stripped (in the case of
+OS X/Linux) and the symbol files need to be stored somewhere that the minidump
+processor can find. There is another tool, `symupload`, that can be used to
+upload symbol files if you have written a server that can accept them.
--- a/docs/linux_starter_guide.md
+++ b/docs/linux_starter_guide.md
@@ -0,0 +1,97 @@
+# How To Add Breakpad To Your Linux Application
+
+This document is an overview of using the Breakpad client libraries on Linux.
+
+## Building the Breakpad libraries
+
+Breakpad provides an Autotools build system that will build both the Linux
+client libraries and the processor libraries. Running `./configure && make` in
+the Breakpad source directory will produce
+**src/client/linux/libbreakpad\_client.a**, which contains all the code
+necessary to produce minidumps from an application.
+
+## Integrating Breakpad into your Application
+
+First, configure your build process to link **libbreakpad\_client.a** into your
+binary, and set your include paths to include the **src** directory in the
+**google-breakpad** source tree. Next, include the exception handler header:
+
+```cpp
+#include "client/linux/handler/exception_handler.h"
+```
+
+Now you can instantiate an `ExceptionHandler` object. Exception handling is active for the lifetime of the `ExceptionHandler` object, so you should instantiate it as early as possible in your application's startup process, and keep it alive for as close to shutdown as possible. To do anything useful, the `ExceptionHandler` constructor requires a path where it can write minidumps, as well as a callback function to receive information about minidumps that were written:
+
+```cpp
+#include <stdio.h>
+
+// Called after a minidump has been written, or has failed to be written.
+static bool dumpCallback(const google_breakpad::MinidumpDescriptor& descriptor,
+                         void* context, bool succeeded) {
+  printf("Dump path: %s\n", descriptor.path());
+  return succeeded;
+}
+
+// Deliberately crashes by writing through a null pointer.
+void crash() {
+  volatile int* a = (int*)(NULL);
+  *a = 1;
+}
+
+int main(int argc, char* argv[]) {
+  google_breakpad::MinidumpDescriptor descriptor("/tmp");
+  google_breakpad::ExceptionHandler eh(descriptor, NULL, dumpCallback, NULL,
+                                       true, -1);
+  crash();
+  return 0;
+}
+```
+
+Compiling and running this example should produce a minidump file in /tmp, and
+it should print the minidump filename before exiting. You can read more about
+the other parameters to the `ExceptionHandler` constructor <a
+href='http://code.google.com/p/google-breakpad/source/browse/trunk/src/client/linux/handler/exception_handler.h'>in
+the exception_handler.h source file</a>.
+
+**Note**: You should do as little work as possible in the callback function.
+Your application is in an unsafe state. It may not be safe to allocate memory or
+call functions from other shared libraries. The safest thing to do is `fork` and
+`exec` a new process to do any work you need to do. If you must do some work in
+the callback, the Breakpad source contains <a
+href='http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/linux/linux_libc_support.h'>some
+simple reimplementations of libc functions</a>, to avoid calling directly into
+libc, as well as <a href='http://code.google.com/p/linux-syscall-support/'>a
+header file for making Linux system calls</a> (in **src/third\_party/lss**) to
+avoid calling into other shared libraries.
+
+## Sending the minidump file
+
+In a real application, you would want to handle the minidump in some way, likely
+by sending it to a server for analysis. The Breakpad source tree contains <a
+href='http://code.google.com/p/google-breakpad/source/browse/#svn/trunk/src/common/linux'>some
+HTTP upload source</a> that you might find useful, as well as <a
+href='http://code.google.com/p/google-breakpad/source/browse/#svn/trunk/src/tools/linux/symupload'>a
+minidump upload tool</a>.
+
+## Producing symbols for your application
+
+To produce useful stack traces, Breakpad requires you to convert the debugging
+symbols in your binaries to <a
+href='http://code.google.com/p/google-breakpad/wiki/SymbolFiles'>text-format
+symbol files</a>. First, ensure that you've compiled your binaries with `-g` to
+include debugging symbols. Next, compile the `dump_syms` tool by running
+`./configure && make` in the Breakpad source directory. Then run `dump_syms` on
+your binaries to produce the text-format symbols. For example, if your main
+binary was named `test`:
+
+```
+$ google-breakpad/src/tools/linux/dump_syms/dump_syms ./test > test.sym
+```
+
+In order to use these symbols with the `minidump_stackwalk` tool, you will need
+to place them in a specific directory structure. The first line of the symbol
+file contains the information you need to produce this directory structure, for
+example (your output will vary):
+
+```
+$ head -n1 test.sym
+MODULE Linux x86_64 6EDC6ACDB282125843FD59DA9C81BD830 test
+$ mkdir -p ./symbols/test/6EDC6ACDB282125843FD59DA9C81BD830
+$ mv test.sym ./symbols/test/6EDC6ACDB282125843FD59DA9C81BD830
+```
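
The directory layout described above can also be computed in code. The following is a minimal sketch, assuming the `MODULE <os> <arch> <id> <name>` layout shown in the example; `SymbolPathFromModuleLine` is a hypothetical helper, not a Breakpad function.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: given the first line of a symbol file and a symbol
// root directory, returns the path where the symbol file should be stored,
// i.e. <root>/<name>/<id>/<name>.sym. Returns "" if the line is malformed.
std::string SymbolPathFromModuleLine(const std::string& module_line,
                                     const std::string& symbol_root) {
  std::istringstream in(module_line);
  std::string tag, os, arch, id, name;
  if (!(in >> tag >> os >> arch >> id) || tag != "MODULE") return "";
  // The module name is the rest of the line.
  std::getline(in >> std::ws, name);
  if (name.empty()) return "";
  return symbol_root + "/" + name + "/" + id + "/" + name + ".sym";
}
```

For the example module above, called with a root of `./symbols`, this yields `./symbols/test/6EDC6ACDB282125843FD59DA9C81BD830/test.sym`.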
+
+You may also find the <a
+href='http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/tools/symbolstore.py'>symbolstore.py</a>
+script in the Mozilla repository useful, as it encapsulates these steps.
+
+## Processing the minidump to produce a stack trace
+
+Breakpad includes a tool called `minidump_stackwalk` which can take a minidump
+plus its corresponding text-format symbols and produce a symbolized stacktrace.
+It should be in the **google-breakpad/src/processor** directory if you compiled
+the Breakpad source using the directions above. Simply pass it the minidump and
+the symbol path as command-line parameters:
+
+```
+google-breakpad/src/processor/minidump_stackwalk minidump.dmp ./symbols
+```
+
+It produces verbose output on stderr, and the stack trace on stdout, so you may
+want to redirect stderr.
--- a/docs/linux_system_calls.md
+++ b/docs/linux_system_calls.md
@ -0,0 +1,47 @@
+# Introduction
+
+Linux implements its userland-to-kernel transition using a special library
+called linux-gate.so that is mapped by the kernel into every process. For more
+information, see
+
+http://www.trilithium.com/johan/2005/08/linux-gate/
+
+In a nutshell, the problem is that the system call gate function,
+kernel\_vsyscall, does not use EBP as a frame pointer.
+
+However, the Breakpad processor supports special frames like this via STACK
+lines in the symbol file. If you look in src/client/linux/data you will see
+symbol files for linux-gate.so for both Intel & AMD (the implementation of
+kernel\_vsyscall changes depending on the CPU manufacturer). When processing
+minidumps from Linux 2.6, having these symbol files is necessary for walking the
+stack for crashes that happen while a thread is in a system call.
+
+If you're just interested in processing minidumps, those two symbol files should
+be all you need!
+
+# Details
+
+The particular details of understanding the linux-gate.so symbol files can be
+found by reading about STACK lines inside
+src/common/windows/pdb\_source\_line\_writer.cc, and the above link. To
+summarize briefly, we just have to inform the processor how to get to the
+previous frame when the EIP is inside kernel\_vsyscall, and we do that by
+telling the processor how many bytes kernel\_vsyscall has pushed onto the stack
+in its prologue. For example, one of the symbol files looks somewhat like the
+following:
+
+```
+MODULE Linux x86 random_debug_id linux-gate.so
+PUBLIC 400 0 kernel_vsyscall
+STACK WIN 4 100 1 1 0 0 0 0 0 1
+```
+
+The PUBLIC line indicates that kernel\_vsyscall is at offset 400 (in bytes) from
+the beginning of linux-gate.so. The STACK line indicates the size of the
+function (100), how many bytes it pushes (1), and how many bytes it pops (1). The
+last 1 indicates that EBP is pushed onto the stack before being used by the
+function.
+
+# Warnings
+
+These functions might change significantly depending on kernel version. In my
+opinion, the actual function stack information is unlikely to change frequently,
+but the Linux kernel might change the address of kernel\_vsyscall w.r.t the
+beginning of linux-gate.so, which would cause these symbol files to be invalid.
--- a/docs/mac_breakpad_starter_guide.md
+++ b/docs/mac_breakpad_starter_guide.md
@ -0,0 +1,184 @@
+# How To Add Breakpad To Your Mac Client Application
+
+This document is a step-by-step recipe to get your Mac client app to build with
+Breakpad.
+
+## Preparing a binary build of Breakpad for use in your tree
+
+You can either check in a binary build of the Breakpad framework & tools or
+build it as a dependency of your project. The former is recommended, and
+detailed here, since building dependencies through other projects is
+problematic (matching up configuration names), and the Breakpad code doesn't
+change nearly as often as your application's will.
+
+## Building the requisite targets
+
+All directories are relative to the `src` directory of the Breakpad checkout.
+
+*   Build the 'All' target of `client/mac/Breakpad.xcodeproj` in Release mode.
+*   Execute `cp -R client/mac/build/Release/Breakpad.framework <location in your
+    source tree>`
+*   Inside `tools/mac/dump_syms` directory, build dump\_syms.xcodeproj, and copy
+    tools/mac/dump\_syms/build/Release/dump\_syms to a safe location where it
+    can be run during the build process.
+
+## Adding Breakpad.framework
+
+Inside your application's framework, add the Breakpad.framework to your
+project's framework settings. When you select it from the file chooser, it will
+let you pick a target to add it to; go ahead and check the one that's relevant
+to your application.
+
+## Copy Breakpad into your Application Package
+
+Copy Breakpad into your Application Package, so it will be around at run time.
+
+Go to the Targets section of your Xcode Project window. Hit the disclosure
+triangle to reveal the build phases of your application. Add a new Copy Files
+phase using the contextual menu (Control-click). On the General panel of the
+'Get Info' window for this new phase, set the destination to 'Frameworks', then
+close the 'Info' panel. Use the contextual menu to rename your new phase 'Copy
+Frameworks'. Now drag Breakpad again into this Copy Frameworks phase. Drag it
+from wherever it appears in the project file tree.
+
+## Add a New Run Script build phase
+
+Near the end of the build phases, add a new Run Script build phase. This will be
+run before Xcode calls /usr/bin/strip on your project. This is where you'll be
+calling dump\_syms to output the symbols for each architecture of your build. In
+my case, the relevant lines read:
+
+```
+#!/bin/sh
+TOOL_DIR=<location of dump_syms from step 3 above>
+
+"$TOOL_DIR/dump_syms" -a ppc "$PROD" > "$TARGET_NAME ppc.breakpad"
+
+"$TOOL_DIR/dump_syms" -a i386 "$PROD" > "$TARGET_NAME i386.breakpad"
+```
+
+## Adjust the Project Settings
+
+*   Turn on Separate Strip,
+*   Set the Strip Style to Non-Global Symbols.
+
+## Write Code!
+
+You'll need to have an object that acts as the delegate for NSApplication.
+Inside this object's header, you'll need to add
+
+1.  an ivar for Breakpad and
+2.  a declaration for the `applicationShouldTerminate:(NSApplication *)sender`
+    message.
+
+```
+#import <Breakpad/Breakpad.h>
+
+@interface BreakpadTest : NSObject {
+   .
+   .
+   .
+   BreakpadRef breakpad;
+   .
+   .
+   .
+}
+.
+.
+- (NSApplicationTerminateReply)applicationShouldTerminate:(NSApplication *)sender;
+.
+.
+@end
+```
+
+Inside your object's implementation file,
+
+1.  add the following function, `InitBreakpad`,
+2.  modify your `awakeFromNib` method to look like the one below, and
+3.  modify/add your application's delegate method to look like the one below.
+
+```
+static BreakpadRef InitBreakpad(void) {
+  NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
+  BreakpadRef breakpad = 0;
+  NSDictionary *plist = [[NSBundle mainBundle] infoDictionary];
+  if (plist) {
+    // Note: version 1.0.0.4 of the framework changed the type of the argument 
+    // from CFDictionaryRef to NSDictionary * on the next line:
+    breakpad = BreakpadCreate(plist);
+  }
+  [pool release];
+  return breakpad;
+}
+
+- (void)awakeFromNib {
+  breakpad = InitBreakpad();
+}
+
+- (NSApplicationTerminateReply)applicationShouldTerminate:(NSApplication *)sender {
+  BreakpadRelease(breakpad);
+  return NSTerminateNow;
+}
+```
+
+## Configure Breakpad
+
+Configure Breakpad for your application.
+
+1.  Take a look inside the Breakpad.framework at the Breakpad.h file for the
+    keys, default values, and descriptions to be passed to BreakpadCreate().
+2.  Add/Edit the Breakpad specific entries in the dictionary passed to
+    BreakpadCreate() -- typically your application's info plist.
+
+Example from the Notifier Info.plist:
+
+```
+<key>BreakpadProduct</key><string>Google_Notifier_Mac</string>
+<key>BreakpadProductDisplay</key><string>${PRODUCT_NAME}</string>
+```
+
+## Build Your Application
+
+Almost done!
+
+## Verify
+
+Double-check:
+
+Your app should have in its package contents:
+myApp.app/Contents/Frameworks/Breakpad.framework.
+
+The symbol files have reasonable contents (you can look at them with a text
+editor.)
+
+Look again at the Copy Frameworks phase of your project. Are you leaking .h
+files? Select them and delete them. (If you drag a bunch of files into your
+project, Xcode often wants to copy your .h files into the build, revealing
+Google secrets. Be vigilant!)
+
+## Upload the symbol file
+
+You'll need to configure your build process to store symbols in a location that
+is accessible by the minidump processor. There is a tool in tools/mac/symupload
+that can be used to send the symbol file via HTTP post.
+
+## Test
+
+Configure breakpad to send reports to a URL by adding to your app's Info.plist:
+
+```
+<key>BreakpadURL</key>
+<string>upload URL</string>
+<key>BreakpadReportInterval</key>
+<string>30</string>
+```
+
+## Final Notes
+
+Breakpad checks whether it is being run under a debugger, and if so, normally
+does nothing. But, you can force Breakpad to function under a debugger by
+setting the Unix shell variable BREAKPAD\_IGNORE\_DEBUGGER to a non-zero value.
+You can bracket the source code in the above Write Code! step with `#if !DEBUG`
+to completely eliminate it from Debug builds. See
+//depot/googlemac/GoogleNotifier/main.m for an example. FYI, when your process
+calls fork(), exception handlers are reset to the default for child processes,
+so they must reinitialize Breakpad; otherwise exceptions will be handled by
+Apple's Crash Reporter.
--- a/docs/mozilla_brown_bag_talk.md
+++ b/docs/mozilla_brown_bag_talk.md
@ -0,0 +1,84 @@
+# Breakpad Crash Reporting for Mozilla
+
+*   January 24, 2007
+    *   Links updated February 14, 2007
+*   Mozilla HQ
+*   Mark Mentovai
+*   Brian Ryner
+
+## What is a crash reporter?
+
+*   Enables developers to analyze crashes that occur in the wild
+*   Produces stack backtraces that help identify how a program failed
+*   Offers higher-level data aggregation (topcrashes, MTBF statistics)
+
+## Motivation
+
+*   Talkback is proprietary and unmaintained
+*   Smaller open-source projects have few options
+*   Larger projects need flexibility and scalability
+
+## Design Options
+
+*   Stackwalking done on client
+    *   Apple CrashReporter
+    *   GNOME BugBuddy
+*   Client sends memory dump
+    *   Talkback
+    *   Windows Error Reporting
+    *   Breakpad
+
+## Goals
+
+*   Provide libraries around which systems can be based
+*   Open-source
+*   Cross-platform
+    *   Mac OS X x86, PowerPC
+    *   Linux x86
+    *   Windows x86
+*   No requirement to distribute symbols
+
+## Client Libraries
+
+*   Exception handler installed at application startup
+    *   Spawns a separate thread
+*   Minidump file written at crash time
+    *   Format used by Windows debuggers
+*   Separate application invoked to send
+    *   HTTP(S) POST, can include additional parameters
+
+## Symbols
+
+*   Cross-platform symbol file format
+*   Contents
+    *   Function names
+    *   Source file names and line numbers
+    *   Windows: Frame pointer omission data
+    *   Future: parameters and local variables
+*   Symbol conversion methods
+
+## Processor
+
+*   Examines minidump file and invokes stackwalker
+*   Symbol files requested from a SymbolSupplier
+*   Produces stack trace
+*   Output may be placed where convenient
+
+## Integration
+
+*   Breakpad client present in Gran Paradiso Alpha 1 for Windows
+    *   Disabled by default
+    *   Enable with `MOZ_AIRBAG`
+*   Proof-of-concept collector
+    *   http://mavra.perilith.com/~luser/airbag-collector/list.pl
+*   Other platforms coming soon
+
+## More Information
+
+*   Project home: http://code.google.com/p/google-breakpad/
+*   Mailing lists
+    *   [google-breakpad-dev@googlegroups.com](http://groups.google.com/group/google-breakpad-dev/)
+    *   [google-breakpad-discuss@googlegroups.com](http://groups.google.com/group/google-breakpad-discuss/)
+*   Ask me (irc.mozilla.org: mento)
--- a/docs/processor_design.md
+++ b/docs/processor_design.md
@ -0,0 +1,230 @@
+# Breakpad Processor Library
+
+## Objective
+
+The Breakpad processor library is an open-source framework to access the
+information contained within crash dumps for multiple platforms, and to use that
+information to produce stack traces showing the call chain of each thread in a
+process. After processing, this data is made available to users of the library.
+
+## Background
+
+The Breakpad processor is intended to sit at the core of a comprehensive
+crash-reporting system that does not require debugging information to be
+provided to those running applications being monitored. Some existing
+crash-reporting systems, such as [GNOME](http://www.gnome.org/)’s Bug-Buddy and
+[Apple](http://www.apple.com/)’s
+[CrashReporter](http://developer.apple.com/technotes/tn2004/tn2123.html), require symbolic
+information to be present on the end user’s computer; in the case of
+CrashReporter, the reports are transmitted only to Apple, not to third-party
+developers. Other systems, such as [Microsoft](http://www.microsoft.com/)’s
+[Windows Error Reporting](http://msdn.microsoft.com/isv/resources/wer/) and
+SupportSoft’s Talkback, transmit only a snapshot of a crashed process’ state,
+which can later be combined with symbolic debugging information without the need
+for it to be present on end users’ computers. Because symbolic debugging
+information consumes a large amount of space and is otherwise not needed during
+the normal operation of software, and because some developers are reluctant to
+release debugging symbols to their customers, Breakpad follows the latter
+approach.
+
+We know of no currently-maintained crash-reporting systems that meet our
+requirements, which are to:
+
+*   allow for symbols to be separate from the application,
+*   handle crash reports from multiple platforms,
+*   allow developers to operate their own crash-reporting platform, and to
+*   be open-source.
+
+Windows Error Reporting only functions for Microsoft products, and requires the
+involvement of Microsoft’s servers. Talkback, while cross-platform, has not been
+maintained and at this point does not support Mac OS X on x86, which we consider
+to be a significant platform. Talkback is also closed-source commercial
+software, and has very specific requirements for its server platform.
+
+We are aware of Windows-only crash-reporting systems that leverage Microsoft’s
+debugging interfaces. Such systems, even if extended to support dumps from other
+platforms, are tied to using Windows for at least a portion of the processor
+platform.
+
+## Overview
+
+The Breakpad processor itself is written in standard C++ and will work on a
+variety of platforms. The dumps it accepts may also have been created on a
+variety of systems. The library is able to combine dumps with symbolic debugging
+information to create stack traces that include function signatures. The
+processor library includes simple command-line tools to examine dumps and
+process them, producing stack traces. It also exposes several layers of APIs
+enabling crash-reporting systems to be built around the Breakpad processor.
+
+## Detailed Design
+
+### Dump Files
+
+In the processor, the dump data is of primary significance. Dumps typically
+contain:
+
+*   CPU context (register data) as it was at the time the crash occurred, and an
+    indication of which thread caused the crash. General-purpose registers are
+    included, as are special-purpose registers such as the instruction pointer
+    (program counter).
+*   Information about each thread of execution within a crashed process,
+    including:
+    *   The memory region used for each thread’s stack.
+    *   CPU context for each thread, which for various reasons is not the same
+        as the crash context in the case of the crashed thread.
+*   A list of loaded code segments (or modules), including:
+    *   The name of the file (`.so`, `.exe`, `.dll`, etc.) which provides the
+        code.
+    *   The boundaries of the memory region in which the code segment is visible
+        to the process.
+    *   A reference to the debugging information for the code module, when such
+        information is available.
+
+Ordinarily, dumps are produced as a result of a crash, but other triggers may be
+set to produce dumps at any time a developer deems appropriate. The Breakpad
+processor can handle dumps in the minidump format, either generated by a
+[Breakpad client “handler”](client_design.md) implementation, or by another
+implementation that produces dumps in this format. The
+[DbgHelp.dll!MiniDumpWriteDump](http://msdn2.microsoft.com/en-us/library/ms680360.aspx)
+function on Windows
+produces dumps in this format, and is the basis for the Breakpad handler
+implementation on that platform.
+
+The [minidump format](http://msdn.microsoft.com/en-us/library/ms679293%28VS.85%29.aspx) is
+essentially a simple container format, organized as a series of streams. Each
+stream contains some type of data relevant to the crash. A typical “normal”
+minidump contains streams for the thread list, the module list, the CPU context
+at the time of the crash, and various bits of additional system information.
+Other types of minidump can be generated, such as a full-memory minidump, which
+in addition to stack memory contains snapshots of all of a process’ mapped
+memory regions.
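
To illustrate the container layout, here is a minimal sketch that inspects the header at the start of a minidump. It relies on the publicly documented `MINIDUMP_HEADER` layout (a 32-byte header whose first field is the signature `MDMP` and whose third field is the stream count) and handles little-endian dumps only. `MinidumpStreamCount` is a hypothetical helper, not part of the processor's `Minidump` classes.

```cpp
#include <cstddef>
#include <cstdint>

// Reads a little-endian 32-bit value starting at `p`.
static uint32_t ReadLE32(const unsigned char* p) {
  return static_cast<uint32_t>(p[0]) |
         (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) |
         (static_cast<uint32_t>(p[3]) << 24);
}

// Hypothetical helper: returns the number of streams announced by the
// header, or -1 if the buffer is too small or lacks the "MDMP" signature.
int64_t MinidumpStreamCount(const unsigned char* data, size_t size) {
  const size_t kHeaderSize = 32;  // Size of MINIDUMP_HEADER in bytes.
  if (size < kHeaderSize) return -1;
  if (ReadLE32(data) != 0x504d444dU) return -1;  // Bytes 'M' 'D' 'M' 'P'.
  return static_cast<int64_t>(ReadLE32(data + 8));  // NumberOfStreams field.
}
```

A real reader would go on to follow the stream directory offset stored in the header and locate each stream by its type.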
+
+The minidump format was chosen as Breakpad’s dump format because it has an
+established track record on Windows, and it can be adapted to meet the needs of
+the other platforms that Breakpad supports. Most other operating systems use
+“core” files as their native dump formats, but the capabilities of core files
+vary across platforms, and because core files are usually presented in a
+platform’s native executable format, there are complications involved in
+accessing the data contained therein without the benefit of the header files
+that define an executable format’s entire structure. Because minidumps are
+leaner than a typical executable format, a redefinition of the format in a
+cross-platform header file, `minidump_format.h`, was a straightforward task.
+Similarly, the capabilities of the minidump format are understood, and because
+it provides an extensible container, any of Breakpad’s needs that could not be
+met directly by the standard minidump format could likely be met by extending it
+as needed. Finally, using this format means that the dump file is compatible
+with native debugging tools at least on Windows. A possible future avenue for
+exploration is the conversion of minidumps to core files, to enable this same
+benefit on other platforms.
+
+We have already provided an extension to the minidump format that allows it to
+carry dumps generated on systems with PowerPC processors. The format already
+allows for variable CPUs, so our work in this area was limited to defining a
+context structure sufficient to represent the execution state of a PowerPC. We
+have also defined an extension that allows minidumps to indicate which thread of
+execution requested a dump be produced for non-crash dumps.
+
+Often, the information contained within a dump alone is sufficient to produce a
+full stack backtrace for each thread. Certain optimizations that compilers
+employ in producing code frustrate this process. Specifically, the “frame
+pointer omission” optimization of x86 compilers can make it impossible to
+produce useful stack traces given only a stack snapshot and CPU context. In
+these cases, however, compiler-emitted debugging information can aid in
+producing useful stack traces. The Breakpad processor is able to take advantage
+of this debugging information as supplied by Microsoft’s C/C++ compiler, the
+only compiler to apply such optimizations by default. As a result, the Breakpad
+processor can produce useful stack traces even from code with frame pointer
+omission optimizations as produced by this compiler.
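
The following sketch shows why a stack snapshot and CPU context are enough when frame pointers are present. It assumes the conventional 32-bit x86 frame layout, in which `[EBP]` holds the caller's EBP and `[EBP + 4]` holds the return address. `StackSnapshot` and `WalkFramePointers` are hypothetical names for illustration; this is not the processor's `Stackwalker` implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stack snapshot: 32-bit words copied from a thread's stack,
// with the first word located at address `base`.
struct StackSnapshot {
  uint32_t base;
  std::vector<uint32_t> words;

  // Reads the word at `address`; returns false if it is outside the snapshot.
  bool Read(uint32_t address, uint32_t* value) const {
    if (address < base || (address - base) % 4 != 0) return false;
    size_t index = (address - base) / 4;
    if (index >= words.size()) return false;
    *value = words[index];
    return true;
  }
};

// Returns one instruction pointer per frame, found by following the chain of
// saved EBP values, starting from the context's EIP and EBP.
std::vector<uint32_t> WalkFramePointers(const StackSnapshot& stack,
                                        uint32_t eip, uint32_t ebp) {
  std::vector<uint32_t> frames;
  frames.push_back(eip);  // The innermost frame comes from the CPU context.
  uint32_t saved_ebp = 0;
  uint32_t return_address = 0;
  while (stack.Read(ebp, &saved_ebp) &&
         stack.Read(ebp + 4, &return_address)) {
    frames.push_back(return_address);
    if (saved_ebp <= ebp) break;  // The chain must move toward older frames.
    ebp = saved_ebp;
  }
  return frames;
}
```

If a function omits its frame pointer, EBP no longer points at a saved-EBP/return-address pair, so this chain breaks; that is the case in which the debugging information described above is needed.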
+
+### Symbol Files
+
+The [symbol files](symbol_files.md) that the Breakpad processor accepts allow
+for frame pointer omission data, but this is only one of their capabilities.
+Each symbol file also includes information about the functions, source files,
+and source code line numbers for a single module of code. A module is an
+individually loadable chunk of code: these can be executables containing a main
+program (`.exe` files on Windows) or shared libraries (`.so` files on Linux,
+`.dylib` files, frameworks, and bundles on Mac OS X, and `.dll` files on
+Windows). Dumps contain information about which of these modules were loaded at
+the time the dump was produced, and given this information, the Breakpad
+processor attempts to locate debugging symbols for the module through a
+user-supplied function embodied in a “symbol supplier.” Breakpad includes a
+sample symbol supplier, called `SimpleSymbolSupplier`, that is used by its
+command-line tools; this supplier locates symbol files by pathname.
+`SimpleSymbolSupplier` is also available to other users of the Breakpad
+processor library. This allows for the use of a simple reference implementation,
+but preserves flexibility for users who may have more demanding symbol file
+storage needs.
+
+Breakpad’s symbol file format is text-based, and was defined to be fairly
+human-readable and to encompass the needs of multiple platforms. The Breakpad
+processor itself does not operate directly with native symbol formats
+([DWARF](http://dwarf.freestandards.org/) and
+[STABS](http://sourceware.org/gdb/current/onlinedocs/stabs.html) on most
+Unix-like systems,
+[.pdb files](http://msdn2.microsoft.com/en-us/library/yd4f8bd1(VS.80).aspx) on
+Windows),
+because of the complications in accessing potentially complex symbol formats
+with slight variations between platforms, stored within different types of
+binary formats. In the case of `.pdb` files, the debugging format is not even
+documented. Instead, Breakpad’s symbol files are produced on each platform,
+using specific debugging APIs where available, to convert native symbols to
+Breakpad’s cross-platform format.
+
+### Processing
+
+Most commonly, a developer will enable an application to use Breakpad by
+building it with a platform-specific [client “handler”](client_design.md)
+library. After building the application, the developer will create symbol files
+for Breakpad’s use using the included `dump_syms` or `symupload` tools, or
+another suitable tool, and place the symbol files where the processor’s symbol
+supplier will be able to locate them.
+
+When a dump file is given to the processor’s `MinidumpProcessor` class, it will
+read it using its included minidump reader, contained in the `Minidump` family
+of classes. It will collect information about the operating system and CPU that
+produced the dump, and determine whether the dump was produced as a result of a
+crash or at the direct request of the application itself. It then loops over all
+of the threads in a process, attempting to walk the stack associated with each
+thread. This process is achieved by the processor’s `Stackwalker` components, of
+which there is a slightly different implementation for each CPU type that the
+processor is able to handle dumps from. Beginning with a thread’s context, and
+possibly using debugging data, the stackwalker produces a list of stack frames,
+one for each instruction in the call chain. These instructions are
+matched up with the modules that contributed them to a process, and the
+`SymbolSupplier` is invoked to locate a symbol file. The symbol file is given to
+a `SourceLineResolver`, which matches the instruction up with a specific
+function name, source file, and line number, resulting in a representation of a
+stack frame that can easily be used to identify which code was executing.
+
+The results of processing are made available in a `ProcessState` object, which
+contains a vector of threads, each containing a vector of stack frames.
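
The shape of that result can be pictured with a small sketch. The types below are hypothetical stand-ins that only mirror the "vector of threads, each a vector of frames" structure described above; they are not the real `ProcessState` API.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one resolved stack frame.
struct Frame {
  std::string function;
  std::string source_file;
  int line;
};

// Hypothetical stand-in for a thread: its frames, innermost first.
typedef std::vector<Frame> Thread;

// Renders every thread and frame as text, one frame per line.
std::string FormatThreads(const std::vector<Thread>& threads) {
  std::ostringstream out;
  for (size_t t = 0; t < threads.size(); ++t) {
    out << "Thread " << t << "\n";
    for (size_t f = 0; f < threads[t].size(); ++f) {
      const Frame& frame = threads[t][f];
      out << "  " << f << "  " << frame.function << " ["
          << frame.source_file << ":" << frame.line << "]\n";
    }
  }
  return out.str();
}
```

A consumer of the library would walk the real structure in the same nested fashion, for example to pick out the crashing thread's top frame for aggregation.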
+
+For small-scale use of the Breakpad processor, and for testing and debugging,
+the `minidump_stackwalk` tool is provided. It invokes the processor and displays
+the full results of processing, optionally allowing symbols to be provided to
+the processor by a pathname-based symbol supplier, `SimpleSymbolSupplier`.
+
+For lower-level testing and debugging, the processor library also includes a
+`minidump_dump` tool, which walks through an entire minidump file and displays
+its contents in somewhat readable form.
+
+### Platform Support
+
+The Breakpad processor library is able to process dumps produced on Mac OS X
+systems running on x86, x86-64, and PowerPC processors, on Windows and Linux
+systems running on x86 or x86-64 processors, and on Android systems running on ARM
+or x86 processors. The processor library itself is written in standard C++, and
+should function properly in most Unix-like environments. It has been tested on
+Linux and Mac OS X.
+
+## Future Plans
+
+There are currently no firm plans or timetables to implement any of these
+features, although they are possible avenues for future exploration.
+
+The symbol file format can be extended to carry information about the locations
+of parameters and local variables as stored in stack frames and registers, and
+the processor can use this information to provide enhanced stack traces showing
+function arguments and variable values.
+
+On Mac OS X and Linux, we can provide tools to convert files from the minidump
+format into the native core format. This will enable developers to open dump
+files in a native debugger, just as they are presently able to do with minidumps
+on Windows.
--- a/docs/stack_walking.md
+++ b/docs/stack_walking.md
@ -0,0 +1,160 @@
+# Introduction
+
+This page aims to provide a detailed description of how Breakpad produces stack
+traces from the information contained within a minidump file.
+
+# Details
+
+## Starting the Process
+
+Typically the stack walking process is initiated by instantiating the
+[MinidumpProcessor](http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/minidump_processor.cc)
+class and calling the
+[MinidumpProcessor::Process](http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/minidump_processor.cc#61)
+method, providing it a minidump file to process. To produce a useful stack
+trace, the MinidumpProcessor requires two other objects which are passed in its
+constructor: a
+[SymbolSupplier](http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/symbol_supplier.h)
+and a
+[SourceLineResolverInterface](http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h).
+The SymbolSupplier object is responsible for locating and providing SymbolFiles
+that match modules from the minidump. The SourceLineResolverInterface is
+responsible for loading the symbol files and using the information contained
+within to provide function and source information for stack frames, as well as
+information on how to unwind from a stack frame to its caller. More detail will
+be provided on these interactions later.
+
+A number of data streams are extracted from the minidump to begin stack walking:
+the list of threads from the process
+([MinidumpThreadList](http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#335)),
+the list of modules loaded in the process
+([MinidumpModuleList](http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#501)),
+and information about the exception that caused the process to crash
+([MinidumpException](http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#615)).
+
+## Enumerating Threads
+
+For each thread in the thread list
+([MinidumpThread](http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#299)),
+the thread memory containing the stack for the thread
+([MinidumpMemoryRegion](http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#236))
+and the CPU context representing the CPU state of the thread at the time the
+dump was written
+([MinidumpContext](http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/minidump.h#171))
+are extracted from the minidump. If the thread being processed is the thread
+that produced the exception then a CPU context is obtained from the
+MinidumpException object instead, which represents the CPU state of the thread
+at the point of the exception. A stack walker is then instantiated by calling
+the
+[Stackwalker::StackwalkerForCPU](http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/stackwalker.h#77)
+method and passing it the CPU context, the thread memory, the module list, as
+well as the SymbolSupplier and SourceLineResolverInterface. This method selects
+the specific Stackwalker subclass based on the CPU architecture of the provided
+CPU context and returns an instance of that subclass.
+
+## Walking a thread's stack
+
+Once a Stackwalker instance has been obtained, the processor calls the
+[Stackwalker::Walk](http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h)
+method to obtain a list of frames representing the stack of this thread. The
+Stackwalker starts by calling the GetContextFrame method which returns a
+StackFrame representing the top of the stack, with CPU state provided by the
+initial CPU context. From there, the stack walker repeats the following steps
+for each frame in turn:
+
+### Finding the Module
+
+The address of the instruction pointer of the current frame is used to determine
+which module contains the current frame by calling the module list's
+[GetModuleForAddress]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/code_modules.h#56)
+method.
+
+### Locating Symbols
+
+If a module is located, the SymbolSupplier is asked to locate symbols
+corresponding to the module by calling its [GetCStringSymbolData]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/symbol_supplier.h#87)
+method. Typically this is implemented by using the module's debug filename (the
+PDB filename for Windows dumps) and debug identifier (a GUID plus one extra
+digit) as a lookup key. The [SimpleSymbolSupplier]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/simple_symbol_supplier.cc)
+class simply uses these as parts of a file path to locate a flat file on disk.
+
+### Loading Symbols
+
+If a symbol file is located, the SourceLineResolverInterface is then asked to
+load the symbol file by calling its [LoadModuleUsingMemoryBuffer]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h#71)
+method. The [BasicSourceLineResolver]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/basic_source_line_resolver.cc)
+implementation parses the text-format [symbol file](symbol_files.md) into
+in-memory data structures to make lookups by address of function names, source
+line information, and unwind information easy.
+
+### Getting source line information
+
+If a symbol file has been successfully loaded, the SourceLineResolverInterface's
+[FillSourceLineInfo]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h#89)
+method is called to provide a function name and source line information for the
+current frame. This is done by subtracting the base address of the module
+containing the current frame from the instruction pointer of the current frame
+to obtain a relative virtual address (RVA), which is a code offset relative to
+the start of the module. This RVA is then used as a lookup into a table of
+functions ([FUNC lines](SymbolFiles#FUNC_records.md) from the symbol file), each
+of which has an associated address range (function start address, function
+size). If a function is found whose address range contains the RVA, then its
+name is used. The RVA is then used as a lookup into a table of source lines
+([line records](SymbolFiles#Line_records.md) from the symbol file), each of
+which also has an associated address range. If a match is found it will provide
+the file name and source line associated with the current frame. If no match was
+found in the function table, another table of publicly exported symbols may be
+consulted ([PUBLIC lines](SymbolFiles#PUBLIC_records.md) from the symbol file).
+Public symbols contain only a start address, so the lookup simply looks for the
+nearest symbol that is less than the provided RVA.
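
The lookup order described above can be sketched as follows. This is a minimal illustration, not the BasicSourceLineResolver implementation: `SymbolTables`, `FunctionEntry`, and `FindName` are hypothetical names, and the sketch assumes a public symbol located exactly at the RVA also counts as a match.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

// Hypothetical in-memory tables built from a symbol file.
struct FunctionEntry {
  uint64_t size;     // length of the function's code in bytes
  std::string name;
};

struct SymbolTables {
  std::map<uint64_t, FunctionEntry> functions;     // FUNC records by start RVA
  std::map<uint64_t, std::string> public_symbols;  // PUBLIC records by RVA
};

// Resolves an absolute instruction address to a name. Returns false if
// neither table yields a match.
bool FindName(const SymbolTables& tables, uint64_t module_base,
              uint64_t instruction, std::string* name) {
  assert(name != nullptr);
  assert(instruction >= module_base);
  // The RVA is the code offset relative to the start of the module.
  const uint64_t rva = instruction - module_base;

  // Candidate function: the last FUNC starting at or before the RVA. It only
  // matches if the RVA falls inside [start, start + size).
  auto func = tables.functions.upper_bound(rva);
  if (func != tables.functions.begin()) {
    --func;
    if (rva - func->first < func->second.size) {
      *name = func->second.name;
      return true;
    }
  }

  // Fallback: PUBLIC symbols carry no size, so take the nearest one at or
  // below the RVA (assumption: an exact hit counts).
  auto pub = tables.public_symbols.upper_bound(rva);
  if (pub != tables.public_symbols.begin()) {
    *name = std::prev(pub)->second;
    return true;
  }
  return false;
}
```

Keying both tables by start address keeps each lookup logarithmic in the number of records.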
+
+### Finding the caller frame
+
+To find the next frame in the stack, the Stackwalker calls its [GetCallerFrame]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/stackwalker.h#186)
+method, passing in the current frame. Each Stackwalker subclass implements
+GetCallerFrame differently, but there are common patterns.
+
+Typically the first step is to query the SourceLineResolverInterface for the
+presence of detailed unwind information. This is done using its
+[FindWindowsFrameInfo]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h#96)
+and [FindCFIFrameInfo]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/source_line_resolver_interface.h#102)
+methods. These methods look for Windows unwind info extracted from a PDB file
+([STACK WIN](SymbolFiles#STACK_WIN_records.md) lines from the symbol file), or
+DWARF CFI extracted from a binary ([STACK CFI](SymbolFiles#STACK_CFI_records.md)
+lines from the symbol file) respectively. The information covers address ranges,
+so the RVA of the current frame is used for lookup as with function and source
+line information.
+
+If unwind info is found it provides a set of rules to recover the register state
+of the caller frame given the current register state as well as the thread's
+stack memory. The rules are evaluated to produce the caller frame.
+
+If unwind info is not found, the Stackwalker may resort to other methods.
+Typically, on architectures which specify a frame pointer, unwinding by
+dereferencing the frame pointer is tried next. If that is successful, it is used
+to produce the caller frame.
+
+If no caller frame was found by any other method, most Stackwalker
+implementations resort to stack scanning by looking at each word on the stack
+down to a fixed depth (implemented in the [Stackwalker::ScanForReturnAddress]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/stackwalker.h#131)
+method) and using a heuristic to attempt to find a reasonable return address
+(implemented in the [Stackwalker::InstructionAddressSeemsValid]
+(http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/processor/stackwalker.h#111)
+method).
+
+If no caller frame is found or the caller frame seems invalid, stack walking
+stops. If a caller frame was found then these steps repeat using the new frame
+as the current frame.
--- a/docs/symbol_files.md
+++ b/docs/symbol_files.md
@ -0,0 +1,497 @@
+# Introduction
+
+Given a minidump file, the Breakpad processor produces stack traces that include
+function names and source locations. However, minidump files contain only the
+byte-by-byte contents of threads' registers and stacks, without function names
+or machine-code-to-source mapping data. The processor consults Breakpad symbol
+files for the information it needs to produce human-readable stack traces from
+the binary-only minidump file.
+
+The platform-specific symbol dumping tools parse the debugging information the
+compiler provides (whether as DWARF or STABS sections in an ELF file or as
+stand-alone PDB files), and write that information back out in the Breakpad
+symbol file format. This format is much simpler and less detailed than compiler
+debugging information, and values legibility over compactness.
+
+# Overview
+
+Breakpad symbol files are ASCII text files, with lines delimited as appropriate
+for the host platform. Each line is a _record_, divided into fields by single
+spaces; in some cases, the last field of the record can contain spaces. The
+first field is a string indicating what sort of record the line represents
+(except for line records; these are so common that making them the default
+saves space). Some fields hold decimal or hexadecimal numbers; hexadecimal numbers
+have no "0x" prefix, and use lower-case letters.
+
+Breakpad symbol files contain the following record types. With some
+restrictions, these may appear in any order.
+
+*   A `MODULE` record describes the executable file or shared library from which
+    this data was derived, for use by symbol suppliers. A `MODULE` record should
+    be the first record in the file.
+
+*   A `FILE` record gives a source file name, and assigns it a number by which
+    other records can refer to it.
+
+*   A `FUNC` record describes a function present in the source code.
+
+*   A line record indicates to which source file and line a given range of
+    machine code should be attributed. The line is attributed to the function
+    defined by the most recent `FUNC` record.
+
+*   A `PUBLIC` record gives the address of a linker symbol.
+
+*   A `STACK` record provides information necessary to produce stack traces.
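
Because record keywords are upper-case and hexadecimal numbers are lower-case, a reader can classify each line from its first field alone. The sketch below illustrates that idea; `RecordType` and `ClassifyRecord` are hypothetical names, and treating every unrecognized first field as a line record is a simplifying assumption (a real parser would validate the fields too).

```cpp
#include <cassert>
#include <string>

enum class RecordType { kModule, kFile, kFunc, kPublic, kStack, kLine };

// Classifies one line of a symbol file by its first space-delimited field.
RecordType ClassifyRecord(const std::string& line) {
  assert(!line.empty());
  // find() returns npos when there is no space, so substr() takes the whole
  // line in that case.
  const std::string keyword = line.substr(0, line.find(' '));
  if (keyword == "MODULE") return RecordType::kModule;
  if (keyword == "FILE") return RecordType::kFile;
  if (keyword == "FUNC") return RecordType::kFunc;
  if (keyword == "PUBLIC") return RecordType::kPublic;
  if (keyword == "STACK") return RecordType::kStack;
  // Line records have no keyword: they start with a hexadecimal address.
  return RecordType::kLine;
}
```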
+
+# `MODULE` records
+
+A `MODULE` record provides meta-information about the module the symbol file
+describes. It has the form:
+
+> `MODULE` _operatingsystem_ _architecture_ _id_ _name_
+
+For example: `MODULE Linux x86 D3096ED481217FD4C16B29CD9BC208BA0 firefox-bin`
+
+These records provide meta-information about the executable or shared library
+from which this symbol file was generated. A symbol supplier might use this
+information to find the correct symbol files to use to interpret a given
+minidump, or to perform other sorts of validation. If present, a `MODULE` record
+should be the first line in the file.
+
+The fields are separated by spaces, and cannot contain spaces themselves, except
+for _name_.
+
+*   The _operatingsystem_ field names the operating system on which the
+    executable or shared library was intended to run. This field should have one
+    of the following values:
+
+    | **Value** | **Meaning**       |
+    |:----------|:------------------|
+    | Linux     | Linux             |
+    | mac       | Macintosh OSX     |
+    | windows   | Microsoft Windows |
+
+*   The _architecture_ field indicates what processor architecture the
+    executable or shared library contains machine code for. This field should
+    have one of the following values:
+
+    | **Value** | **Instruction Set Architecture** |
+    |:----------|:---------------------------------|
+    | x86       | Intel IA-32                      |
+    | x86\_64   | AMD64/Intel 64                   |
+    | ppc       | 32-bit PowerPC                   |
+    | ppc64     | 64-bit PowerPC                   |
+    | unknown   | unknown                          |
+
+*   The _id_ field is a sequence of hexadecimal digits that identifies the exact
+    executable or library whose contents the symbol file describes. The way in
+    which it is computed varies from platform to platform.
+
+*   The _name_ field contains the base name (the final component of the
+    directory path) of the executable or library. It may contain spaces, and
+    extends to the end of the line.
+
+# `FILE` records
+
+A `FILE` record holds a source file name for other records to refer to. It has
+the form:
+
+> `FILE` _number_ _name_
+
+For example: `FILE 2 /home/jimb/mc/in/browser/app/nsBrowserApp.cpp`
+
+A `FILE` record provides the name of a source file, and assigns it a number
+which other records (line records, in particular) can use to refer to that file
+name. The _number_ field is a decimal number. The _name_ field is the name of
+the file; it may contain spaces.
+
+# `FUNC` records
+
+A `FUNC` record describes a source-language function. It has the form:
+
+> `FUNC` _address_ _size_ _parameter\_size_ _name_
+
+For example: `FUNC c184 30 0 nsQueryInterfaceWithError::operator()(nsID const&,
+void**) const`
+
+The _address_ and _size_ fields are hexadecimal numbers indicating the start
+address and length in bytes of the machine code instructions the function
+occupies. (Breakpad symbol files cannot accurately describe functions whose code
+is not contiguous.) The start address is relative to the module's load address.
+
+The _parameter\_size_ field is a hexadecimal number indicating the size, in
+bytes, of the arguments pushed on the stack for this function. Some calling
+conventions, like the Microsoft Windows `stdcall` convention, require the called
+function to pop parameters passed to it on the stack from its caller before
+returning. The stack walker uses this value, along with data from `STACK`
+records, to step from the called function's frame to the caller's frame.
+
+The _name_ field is the name of the function. In languages that use linker
+symbol name mangling like C++, this should be the source language name (the
+"unmangled" form). This field may contain spaces.
+
+# Line records
+
+A line record describes the source file and line number to which a given range
+of machine code should be attributed. It has the form:
+
+> _address_ _size_ _line_ _filenum_
+
+For example: `c184 7 59 4`
+
+Because they are so common, line records do not begin with a string indicating
+the record type. All other record types' names use upper-case letters;
+hexadecimal numbers, like a line record's _address_, use lower-case letters.
+
+The _address_ and _size_ fields are hexadecimal numbers indicating the start
+address and length in bytes of the machine code. The address is relative to the
+module's load address.
+
+The _line_ field is the line number to which the machine code should be
+attributed, in decimal; the first line of the source file is line number 1. The
+_filenum_ field is a decimal number appearing in a prior `FILE` record; the name
+given in that record is the source file name for the machine code.
+
+The line is assumed to belong to the function described by the last preceding
+`FUNC` record. Line records may not appear before the first `FUNC` record.
+
+No two line records in a symbol file cover the same range of addresses. However,
+there may be many line records with identical line and file numbers, as a given
+source line may contribute many non-contiguous blocks of machine code.
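
The mix of number bases is easy to get wrong, so here is a small sketch of parsing a line record: the first two fields are hexadecimal, the last two decimal. `LineRecord` and `ParseLineRecord` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

struct LineRecord {
  uint64_t address;  // hexadecimal in the file
  uint64_t size;     // hexadecimal in the file
  int line;          // decimal; the first source line is 1
  int file_number;   // decimal; refers to a prior FILE record
};

// Parses "address size line filenum". Returns false on malformed input and
// leaves *out untouched.
bool ParseLineRecord(const std::string& text, LineRecord* out) {
  assert(out != nullptr);
  std::istringstream in(text);
  LineRecord record;
  // Switch the stream from hex to decimal between the second and third field.
  if (!(in >> std::hex >> record.address >> record.size >> std::dec >>
        record.line >> record.file_number)) {
    return false;
  }
  *out = record;
  return true;
}
```

For the record `c184 a 10 4`, the size is ten bytes (`a` in hexadecimal) and the line number is also ten (`10` in decimal).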
+
+# `PUBLIC` records
+
+A `PUBLIC` record describes a publicly visible linker symbol, such as that used
+to identify an assembly language entry point or region of memory. It has the
+form:
+
+> `PUBLIC` _address_ _parameter\_size_ _name_
+
+For example: `PUBLIC 2160 0 Public2_1`
+
+The Breakpad processor essentially treats a `PUBLIC` record as defining a
+function with no line number data and an indeterminate size: the code extends to
+the next address mentioned. If a given address is covered by both a `PUBLIC`
+record and a `FUNC` record, the processor uses the `FUNC` data.
+
+The _address_ field is a hexadecimal number indicating the symbol's address,
+relative to the module's load address.
+
+The _parameter\_size_ field is a hexadecimal number indicating the size of the
+parameters passed to the code whose entry point the symbol marks, if known. This
+field has the same meaning as the _parameter\_size_ field of a `FUNC` record;
+see that description for more details.
+
+The _name_ field is the name of the symbol. In languages that use linker symbol
+name mangling like C++, this should be the source language name (the "unmangled"
+form). This field may contain spaces.
+
+# `STACK WIN` records
+
+Given a stack frame, a `STACK WIN` record indicates how to find the frame that
+called it. It has the form:
+
+> `STACK WIN` _type_ _rva_ _code\_size_ _prologue\_size_ _epilogue\_size_
+> _parameter\_size_ _saved\_register\_size_ _local\_size_ _max\_stack\_size_
+> _has\_program\_string_ _program\_string\_OR\_allocates\_base\_pointer_
+
+For example: `STACK WIN 4 2170 14 1 0 0 0 0 0 1 $eip 4 + ^ = $esp $ebp 8 + =
+$ebp $ebp ^ =`
+
+All fields of a `STACK WIN` record, except for the last, are hexadecimal
+numbers.
+
+The _type_ field indicates what sort of stack frame data this record holds. Its
+value should be one of the values of the [StackFrameTypeEnum]
+(http://msdn.microsoft.com/en-us/library/bc5207xw%28VS.100%29.aspx) type in
+Microsoft's [Debug Interface Access (DIA)]
+(http://msdn.microsoft.com/en-us/library/x93ctkx8%28VS.100%29.aspx) API.
+Breakpad uses only records of type 4 (`FrameTypeFrameData`) and 0
+(`FrameTypeFPO`); it ignores others. These types differ only in whether the last
+field is an _allocates\_base\_pointer_ flag (`FrameTypeFPO`) or a program string
+(`FrameTypeFrameData`). If more than one record covers a given address, Breakpad
+prefers `FrameTypeFrameData` records over `FrameTypeFPO` records.
+
+The _rva_ and _code\_size_ fields give the starting address and length in bytes
+of the machine code covered by this record. The starting address is relative to
+the module's load address.
+
+The _prologue\_size_ and _epilogue\_size_ fields give the length, in bytes, of
+the prologue and epilogue machine code within the record's range. Breakpad does
+not use these values.
+
+The _parameter\_size_ field gives the number of argument bytes this function
+expects to have been passed. This field has the same meaning as the
+_parameter\_size_ field of a `FUNC` record; see that description for more
+details.
+
+The _saved\_register\_size_ field gives the number of bytes in the stack frame
+dedicated to preserving the values of any callee-saves registers used by this
+function.
+
+The _local\_size_ field gives the number of bytes in the stack frame dedicated
+to holding the function's local variables and temporary values.
+
+The _max\_stack\_size_ field gives the maximum number of bytes pushed on the
+stack in the frame. Breakpad does not use this value.
+
+If the _has\_program\_string_ field is zero, then the `STACK WIN` record's final
+field is an _allocates\_base\_pointer_ flag, as a hexadecimal number; this is
+expected for records whose _type_ is 0. Otherwise, the final field is a program
+string.
+
+## Interpreting a `STACK WIN` record
+
+Given the register values for a frame F, we can find the calling frame as
+follows:
+
+*   If the _has\_program\_string_ field of a `STACK WIN` record is zero, then
+    the final field is _allocates\_base\_pointer_, a flag indicating whether the
+    frame uses the frame pointer register, `%ebp`, as a general-purpose
+    register.
+    *   If _allocates\_base\_pointer_ is true, then `%ebp` does not point to the
+        frame's base address. Instead,
+        *   Let _next\_parameter\_size_ be the parameter size of the function
+            frame F called (**not** this record's _parameter\_size_ field), or
+            zero if F is the youngest frame on the stack. You must find this
+            value in F's callee's `FUNC`, `STACK WIN`, or `PUBLIC` records.
+        *   Let _frame\_size_ be the sum of the _local\_size_ field, the
+            _saved\_register\_size_ field, and _next\_parameter\_size_.
+
+        With those definitions in place, we can recover the calling frame as
+        follows:
+
+        *   F's return address is at `%esp +`_frame\_size_,
+        *   the caller's value of `%ebp` is saved at `%esp
+            +`_next\_parameter\_size_`+`_saved\_register\_size_`- 8`, and
+        *   the caller's value of `%esp` just before the call instruction was
+            `%esp +`_frame\_size_`+ 4`.
+
+        (Why do we include _next\_parameter\_size_ in the sum when computing
+        _frame\_size_ and the address of the saved `%ebp`? When a function A
+        has called a function B, the arguments that A pushed for B are
+        considered part of A's stack frame: A's value for `%esp` points at the
+        last argument pushed for B. Thus, we must include the size of those
+        arguments (given by the debugging info for B) along with the size of
+        A's register save area and local variable area (given by the debugging
+        info for A) when computing the overall size of A's frame.)
+    *   If _allocates\_base\_pointer_ is false, then F's function doesn't use
+        `%ebp` at all. You may recover the calling frame as above, except that
+        the caller's value of `%ebp` is the same as F's value for `%ebp`, so no
+        steps are necessary to recover it.
+*   If the _has\_program\_string_ field of a `STACK WIN` record is not zero,
+    then the record's final field is a string containing a program to be
+    interpreted to recover the caller's frame. The comments in the
+    [postfix\_evaluator.h]
+    (http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/postfix_evaluator.h#40)
+    header file explain the language in which the program is written. You should
+    place the following variables in the dictionary before interpreting the
+    program:
+    *   `$ebp` and `$esp` should be the values of the `%ebp` and `%esp`
+        registers in F.
+    *   `.cbParams`, `.cbSavedRegs`, and `.cbLocals`, should be the values of
+        the `STACK WIN` record's _parameter\_size_, _saved\_register\_size_, and
+        _local\_size_ fields.
+    *   `.raSearchStart` should be set to the address on the stack to begin
+        scanning for a return address, if necessary. The Breakpad processor sets
+        this to the value of `%esp` in F, plus the _frame\_size_ value mentioned
+        above.
+
+> If the program stores values for `$eip`, `$esp`, `$ebp`, `$ebx`, `$esi`, or
+> `$edi`, then those are the values of the given registers in the caller. If the
+> value of `$eip` is zero, that indicates that the end of the stack has been
+> reached.
+
+The Breakpad processor checks that the value yielded by the above for the
+calling frame's instruction address refers to known code; if the address seems
+to be bogus, then it uses a heuristic search to find F's return address and
+stack base.
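
For records without a program string, the arithmetic above can be written out directly. The sketch below only transcribes the formulas from this section; `FpoFrameInfo`, `CallerLayout`, and `LocateCaller` are hypothetical names, and it does not read stack memory or perform the sanity check on the resulting instruction address.

```cpp
#include <cassert>
#include <cstdint>

// Fields taken from a STACK WIN record whose has_program_string is zero.
struct FpoFrameInfo {
  uint32_t saved_register_size;
  uint32_t local_size;
  bool allocates_base_pointer;
};

// Stack addresses at which the caller's state can be found.
struct CallerLayout {
  uint32_t return_address_location;  // where F's return address is stored
  uint32_t caller_esp;               // caller's %esp just before the call
  bool ebp_saved_on_stack;           // false: caller's %ebp equals F's %ebp
  uint32_t saved_ebp_location;       // meaningful only if ebp_saved_on_stack
};

// esp is F's %esp; next_parameter_size is the parameter size of the function
// F called, or zero if F is the youngest frame.
CallerLayout LocateCaller(const FpoFrameInfo& info, uint32_t esp,
                          uint32_t next_parameter_size) {
  const uint32_t frame_size =
      info.local_size + info.saved_register_size + next_parameter_size;
  CallerLayout layout;
  layout.return_address_location = esp + frame_size;
  layout.caller_esp = esp + frame_size + 4;
  layout.ebp_saved_on_stack = info.allocates_base_pointer;
  layout.saved_ebp_location =
      info.allocates_base_pointer
          ? esp + next_parameter_size + info.saved_register_size - 8
          : 0;
  return layout;
}
```

With `%esp` at 0x1000, 0x10 bytes of locals, 8 bytes of saved registers, and 4 bytes of outgoing arguments, _frame\_size_ is 0x1c, so the return address is read from 0x101c and the caller's `%esp` was 0x1020.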
+
+# `STACK CFI` records
+
+`STACK CFI` ("Call Frame Information") records describe how to walk the stack
+when execution is at a given machine instruction. These records take one of two
+forms:
+
+> `STACK CFI INIT` _address_ _size_ _register<sub>1</sub>_:
+> _expression<sub>1</sub>_ _register<sub>2</sub>_: _expression<sub>2</sub>_ ...
+>
+> `STACK CFI` _address_ _register<sub>1</sub>_: _expression<sub>1</sub>_
+> _register<sub>2</sub>_: _expression<sub>2</sub>_ ...
+
+For example:
+
+```
+STACK CFI INIT 804c4b0 40 .cfa: $esp 4 + $eip: .cfa 4 - ^
+STACK CFI 804c4b1 .cfa: $esp 8 + $ebp: .cfa 8 - ^
+```
+
+The _address_ and _size_ fields are hexadecimal numbers. Each
+_register_<sub>i</sub> is the name of a register or pseudoregister. Each
+_expression_ is a Breakpad postfix expression, which may contain spaces, but
+never ends with a colon. (The appropriate register names for a given
+architecture are determined when `STACK CFI` records are first enabled for that
+architecture, and should be documented in the appropriate
+`stackwalker_`_architecture_`.cc` source file.)
+
+STACK CFI records describe, at each machine instruction in a given function, how
+to recover the values the machine registers had in the function's caller.
+Naturally, some registers' values are simply lost, but there are three cases in
+which they can be recovered:
+
+*   You can always recover the program counter, because that's the function's
+    return address. If the function is ever going to return, the PC must be
+    saved somewhere.
+
+*   You can always recover the stack pointer. The function is responsible for
+    popping its stack frame before it returns to the caller, so it must be able
+    to restore this, as well.
+
+*   You should be able to recover the values of callee-saves registers. These
+    are registers whose values the callee must preserve, either by saving them
+    in its own stack frame before using them and re-loading them before
+    returning, or by not using them at all.
+
+(As an exception, note that functions which never return may not save any of
+this data. It may not be possible to walk the stack past such functions' stack
+frames.)
+
+Given rules for recovering the values of a function's caller's registers, we can
+walk up the stack. Starting with the current set of registers --- the PC of the
+instruction we're currently executing, the current stack pointer, etc. --- we
+use CFI to recover the values those registers had in the caller of the current
+frame. This gives us a PC in the caller whose CFI we can look up; we apply the
+process again to find that function's caller; and so on.
+
+Concretely, CFI records represent a table with a row for each machine
+instruction address and a column for each register. The table entry for a given
+address and register contains a rule describing how, when the PC is at that
+address, to restore the value that register had in the caller.
+
+There are some special columns:
+
+*   A column named `.cfa`, for "Canonical Frame Address", tells how to compute
+    the base address of the frame; other entries can refer to the CFA in their
+    rules.
+
+*   A column named `.ra` represents the return address.
+
+For example, suppose we have a machine with 32-bit registers, one-byte
+instructions, a stack that grows downwards, and an assembly language that
+resembles C. Suppose further that we have a function whose machine code looks
+like this:
+
+```
+func:                                ; entry point; return address at sp
+func+0:      sp -= 16                ; allocate space for stack frame
+func+1:      sp[12] = r0             ; save 4-byte r0 at sp+12
+             ...                     ; stuff that doesn't affect stack
+func+10:     sp -= 4; *sp = x        ; push some 4-byte x on the stack
+             ...                     ; stuff that doesn't affect stack
+func+20:     r0 = sp[16]             ; restore saved r0
+func+21:     sp += 20                ; pop whole stack frame
+func+22:     pc = *sp; sp += 4       ; pop return address and jump to it
+```
+
+The following table would describe the function above:
+
+**code address** | **.cfa** | **r0**    | **r1** | ... | **.ra**
+:--------------- | :------- | :-------- | :----- | :-- | :-------
+func+0           | sp       |           |        |     | `cfa[0]`
+func+1           | sp+16    |           |        |     | `cfa[0]`
+func+2           | sp+16    | `cfa[-4]` |        |     | `cfa[0]`
+func+11          | sp+20    | `cfa[-4]` |        |     | `cfa[0]`
+func+21          | sp+20    |           |        |     | `cfa[0]`
+func+22          | sp       |           |        |     | `cfa[0]`
+
+Some things to note here:
+
+*   Each row describes the state of affairs **before** executing the instruction
+    at the given address. Thus, the row for func+0 describes the state before we
+    execute the first instruction, which allocates the stack frame. In the next
+    row, the formula for computing the CFA has changed, reflecting the
+    allocation.
+
+*   The other entries are written in terms of the CFA; this allows them to
+    remain unchanged as the stack pointer gets bumped around. For example, to
+    find the caller's value for r0 at func+2, we would first compute the CFA by
+    adding 16 to the sp, and then subtract four from that to find the address at
+    which r0 was saved.
+
+*   Although the example doesn't show this, most calling conventions designate
+    "callee-saves" and "caller-saves" registers. The callee must restore the
+    values of "callee-saves" registers before returning (if it uses them at
+    all), whereas the callee is free to use "caller-saves" registers without
+    restoring their values. A function that uses caller-saves registers
+    typically does not save their original values at all; in this case, the CFI
+    marks such registers' values as "unrecoverable".
+
+*   Exactly where the CFA points in the frame --- at the return address? below
+    it? At some fixed point within the frame? --- is a question of definition
+    that depends on the architecture and ABI in use. But by definition, the CFA
+    remains constant throughout the lifetime of the frame. It's up to
+    architecture-specific code to know what significance to assign the CFA, if
+    any.
+
+To save space, the most common type of CFI record only mentions the table
+entries at which changes take place. So for the above, the CFI data would only
+actually mention the non-blank entries here:
+
+**insn** | **cfa** | **r0**    | **r1** | ... | **ra**
+:------- | :------ | :-------- | :----- | :-- | :-------
+func+0   | sp      |           |        |     | `cfa[0]`
+func+1   | sp+16   |           |        |     |
+func+2   |         | `cfa[-4]` |        |     |
+func+11  | sp+20   |           |        |     |
+func+21  |         | r0        |        |     |
+func+22  | sp      |           |        |     |
+
+A `STACK CFI INIT` record indicates that, at the machine instruction at
+_address_, belonging to some function, the value that _register<sub>n</sub>_ had
+in that function's caller can be recovered by evaluating
+_expression<sub>n</sub>_. The values of any callee-saves registers not mentioned
+are assumed to be unchanged. (`STACK CFI` records never mention caller-saves
+registers.) These rules apply starting at _address_ and continue up to, but not
+including, the address given in the next `STACK CFI` record. The _size_ field is
+the total number of bytes of machine code covered by this record and any
+subsequent `STACK CFI` records (until the next `STACK CFI INIT` record). The
+_address_ field is relative to the module's load address.
+
+A `STACK CFI` record (no `INIT`) is the same, except that it mentions only those
+registers whose recovery rules have changed from the previous CFI record. There
+must be a prior `STACK CFI INIT` or `STACK CFI` record in the symbol file. The
+_address_ field of this record must be greater than that of the previous record,
+and it must not be at or beyond the end of the range given by the most recent
+`STACK CFI INIT` record. The address is relative to the module's load address.
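
To find the rules in force at a given address, a reader starts from the covering `STACK CFI INIT` record and applies each later `STACK CFI` record up to that address. The following is a minimal sketch of that accumulation, assuming the records have already been parsed and are in file order; `CfiRecord`, `RuleSet`, and `RulesInForce` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Register name -> postfix expression text.
using RuleSet = std::map<std::string, std::string>;

struct CfiRecord {
  bool is_init;      // true for STACK CFI INIT
  uint64_t address;  // relative to the module's load address
  uint64_t size;     // only meaningful when is_init is true
  RuleSet rules;
};

// Computes the rules in force at `address`. Returns false if no
// STACK CFI INIT range covers the address.
bool RulesInForce(const std::vector<CfiRecord>& records, uint64_t address,
                  RuleSet* out) {
  assert(out != nullptr);
  RuleSet current;
  bool covered = false;
  for (const CfiRecord& record : records) {
    if (record.is_init) {
      if (covered) break;  // the range containing `address` has ended
      covered = record.address <= address &&
                address - record.address < record.size;
      current = record.rules;
    } else if (covered) {
      if (record.address > address) break;  // later changes do not apply yet
      // A non-INIT record overrides only the registers it mentions.
      for (const auto& rule : record.rules) current[rule.first] = rule.second;
    }
  }
  if (covered) *out = current;
  return covered;
}
```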
+
+Each expression is a breakpad-style postfix expression. Expressions may contain
+spaces, but their tokens may not end with colons. When an expression mentions a
+register, it refers to the value of that register in the callee, even if a prior
+name/expression pair gives that register's value in the caller. The exception is
+`.cfa`, which refers to the canonical frame address computed by the `.cfa` rule in
+force at the current instruction.
+
+The special expression `.undef` indicates that the given register's value cannot
+be recovered.
+
+The register names preceding the expressions are always followed by colons. The
+expressions themselves never contain tokens ending with colons.
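
The colon convention is what makes the rule list parseable without quoting: a token ending in a colon starts a new rule, and everything up to the next such token is that rule's expression. A minimal sketch follows; `ParseRules` is a hypothetical name, and ignoring tokens that precede the first register name is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Splits the rule portion of a STACK CFI record (the text after the address
// and optional size) into register name -> expression pairs.
std::map<std::string, std::string> ParseRules(const std::string& text) {
  std::map<std::string, std::string> rules;
  std::istringstream in(text);
  std::string token;
  std::string current_register;
  while (in >> token) {
    assert(!token.empty());
    if (token.back() == ':') {
      // A register name: strip the colon and start an empty expression.
      current_register = token.substr(0, token.size() - 1);
      rules[current_register];
    } else if (!current_register.empty()) {
      // Part of the current expression: append it, space-separated.
      std::string& expression = rules[current_register];
      if (!expression.empty()) expression += ' ';
      expression += token;
    }
  }
  return rules;
}
```

Applied to `.cfa: $esp 4 + $eip: .cfa 4 - ^`, this yields the expression `$esp 4 +` for `.cfa` and `.cfa 4 - ^` for `$eip`.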
+
+There are two special register names:
+
+*   `.cfa` ("Canonical Frame Address") is the base address of the stack frame.
+    Other registers' rules may refer to this. If no rule is provided for the
+    stack pointer, the value of `.cfa` is the caller's stack pointer.
+
+*   `.ra` is the return address. This is the value of the restored program
+    counter. We use `.ra` instead of the architecture-specific name for the
+    program counter.
+
+The Breakpad stack walker requires that there be rules in force for `.cfa` and
+`.ra` at every code address from which it unwinds. If those rules are not
+present, the stack walker will ignore the `STACK CFI` data, and try to use a
+different strategy.
+
+So the CFI for the example function above would be as follows, if `func` were at
+address 0x1000 (relative to the module's load address):
+
+```
+STACK CFI INIT 1000 17 .cfa: $sp .ra: .cfa ^
+STACK CFI      1001 .cfa: $sp 16 +
+STACK CFI      1002 $r0: .cfa 4 - ^
+STACK CFI      100b .cfa: $sp 20 +
+STACK CFI      1015 $r0: $r0
+STACK CFI      1016 .cfa: $sp
+```
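
To see these rules at work, the sketch below evaluates the expressions in force at `func+2` against a made-up stack. It is a deliberately reduced evaluator, not the one in `postfix_evaluator.h`: it supports only `+`, `-`, and `^` (dereference), and it assumes integer literals are decimal, as the example's `16` and `20` suggest. `Evaluate`, `Registers`, and `Memory` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using Registers = std::map<std::string, uint32_t>;  // callee values plus .cfa
using Memory = std::map<uint32_t, uint32_t>;        // address -> 32-bit word

// Evaluates a postfix expression. Returns false on an unknown register, an
// unreadable address, or a malformed expression.
bool Evaluate(const std::string& expression, const Registers& registers,
              const Memory& memory, uint32_t* result) {
  assert(result != nullptr);
  std::vector<uint32_t> stack;
  std::istringstream in(expression);
  std::string token;
  while (in >> token) {
    if (token == "+" || token == "-") {
      if (stack.size() < 2) return false;
      const uint32_t rhs = stack.back();
      stack.pop_back();
      const uint32_t lhs = stack.back();
      stack.pop_back();
      stack.push_back(token == "+" ? lhs + rhs : lhs - rhs);
    } else if (token == "^") {
      // Dereference: replace the address on top with the word stored there.
      if (stack.empty()) return false;
      auto word = memory.find(stack.back());
      if (word == memory.end()) return false;
      stack.back() = word->second;
    } else if (token[0] == '$' || token[0] == '.') {
      auto reg = registers.find(token);
      if (reg == registers.end()) return false;
      stack.push_back(reg->second);
    } else {
      // Assumption of this sketch: literals are decimal.
      stack.push_back(
          static_cast<uint32_t>(std::strtoul(token.c_str(), nullptr, 10)));
    }
  }
  if (stack.size() != 1) return false;
  *result = stack.back();
  return true;
}
```

Suppose the callee's `$sp` is 0x8000 at `func+2`. Evaluating `$sp 16 +` gives a CFA of 0x8010; with `.cfa` set to that value, `.cfa ^` reads the return address from 0x8010 and `.cfa 4 - ^` reads the saved `r0` from 0x800c.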
--- a/docs/windows_client_integration.md
+++ b/docs/windows_client_integration.md
@ -0,0 +1,70 @@
+# Windows Integration overview
+
+## Windows Client Code
+
+The Windows client code is in the `src/client/windows` directory of the tree.
+Since the header files are fairly well commented some specifics are purposely
+omitted from this document.
+
+## Integration of minidump-generation
+
+Once you build the solution inside `src/client/windows`, an output file of
+`exception_handler.lib` will be generated. You can either check this into your
+project's directory or build directly from the source, as the project itself
+does.
+
+Enabling Breakpad in your application requires you to `#include
+"exception_handler.h"` and instantiate the `ExceptionHandler` object like so:
+
+```
+  handler = new ExceptionHandler(const wstring& dump_path,
+                                 FilterCallback filter,
+                                 MinidumpCallback callback,
+                                 void* callback_context,
+                                 int handler_types,
+                                 MINIDUMP_TYPE dump_type,
+                                 const wchar_t* pipe_name,
+                                 const CustomClientInfo* custom_info);
+```
+
+The parameters, in order, are:
+
+*   The pathname that minidumps are written to - this is ignored if
+    out-of-process (OOP) dump generation is used
+*   A callback that is called when the exception is first handled - you can
+    return true/false here to continue/stop exception processing
+*   A callback that is called after minidumps have been written
+*   Context for the callbacks
+*   Which exceptions to handle - see the `HandlerType` enumeration in
+    `exception_handler.h`
+*   The type of minidump to generate, using the `MINIDUMP_TYPE` definitions in
+    `DbgHelp.h`
+*   A pipe name that can be used to communicate with a crash generation server
+*   A pointer to a `CustomClientInfo` class that can be used to send custom
+    data along with the minidump when using OOP generation
+
+You can also see `src/client/windows/tests/crash_generation_app/*` for a sample
+app that uses OOP generation.
+
+## OOP Minidump Generation
+
+For out-of-process minidump generation, more work is needed. If you look inside
+`src/client/windows/crash_generation`, you will see a file called
+`crash_generation_server.h`. This file is the interface for a crash generation
+server, which must be instantiated with the same pipe name that is passed to the
+client above. The logistics of running a separate process that instantiates the
+crash generation server are left up to you, however.
+
+## Build process specifics (symbol generation, upload)
+
+The symbol creation step is covered in the general overview doc, since it
+doesn't vary much by platform. You'll need to make sure that the symbols are
+available wherever minidumps are uploaded for processing.
+
+## Out in the field - uploading the minidump
+
+Inside `src/client/windows/sender` is a class implementation called
+`CrashReportSender`. This class can be compiled into a separate standalone CLI
+or into the crash generation server and used to upload the report. It can
+determine when to do so via one of the callbacks provided by the
+`CrashGenerationServer` or, for in-process generation, by the
+`ExceptionHandler` object.