forked from eden-emu/eden
		
	
		
			
	
	
		
			224 lines
		
	
	
	
		
			13 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
		
		
			
		
	
	
			224 lines
		
	
	
	
		
			13 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
|  | # Breakpad Client Libraries
 | |||
|  | 
 | |||
|  | ## Objective
 | |||
|  | 
 | |||
|  | The Breakpad client libraries are responsible for monitoring an application for | |||
|  | crashes (exceptions), handling them when they occur by generating a dump, and | |||
|  | providing a means to upload dumps to a crash reporting server. These tasks are | |||
|  | divided between the “handler” (short for “exception handler”) library linked in | |||
|  | to an application being monitored for crashes, and the “sender” library, | |||
|  | intended to be linked in to a separate external program. | |||
|  | 
 | |||
|  | ## Background
 | |||
|  | 
 | |||
|  | As one of the chief tasks of the client handler is to generate a dump, an | |||
|  | understanding of [dump files](processor_design.md) will aid in understanding the | |||
|  | handler. | |||
|  | 
 | |||
|  | ## Overview
 | |||
|  | 
 | |||
|  | Breakpad provides client libraries for each of its target platforms. Currently, | |||
|  | these exist for Windows on x86 and Mac OS X on both x86 and PowerPC. A Linux | |||
|  | implementation has been written and is currently under review. | |||
|  | 
 | |||
|  | Because the mechanisms for catching exceptions and the methods for obtaining the | |||
|  | information that a dump contains vary between operating systems, each target | |||
|  | operating system requires a completely different handler implementation. Where | |||
|  | multiple CPUs are supported for a single operating system, the handler | |||
|  | implementation will likely also require separate code for each processor type to | |||
|  | extract CPU-specific information. One of the goals of the Breakpad handler is to | |||
|  | provide a prepackaged cross-platform system that masks many of these | |||
|  | system-level differences and quirks from the application developer. Although the | |||
|  | underlying implementations differ, the handler library for each system follows | |||
|  | the same set of principles and exposes a similar interface. | |||
|  | 
 | |||
|  | Code that wishes to take advantage of Breakpad should be linked against the | |||
|  | handler library, and should, at an appropriate time, install a Breakpad handler. | |||
|  | For applications, it is generally desirable to install the handler as early in | |||
|  | the start-up process as possible. Developers of library code using Breakpad to | |||
|  | monitor itself may wish to install a Breakpad handler when the library is | |||
|  | loaded, or may only want to install a handler when calls are made in to the | |||
|  | library. | |||
|  | 
 | |||
|  | The handler can be triggered to generate a dump either by catching an exception | |||
|  | or at the request of the application itself. The latter case may be useful in | |||
|  | debugging assertions or other conditions where developers want to know how a | |||
|  | program got in to a specific non-crash state. After generating a dump, the | |||
|  | handler calls a user-specified callback function. The callback function may | |||
|  | collect additional data about the program’s state, quit the program, launch a | |||
|  | crash reporter application, or perform other tasks. Allowing for this | |||
|  | functionality to be dictated by a callback function preserves flexibility. | |||
|  | 
 | |||
|  | The sender library is also has a separate implementation for each supported | |||
|  | platform, because of the varying interfaces for accessing network resources on | |||
|  | different operating systems. The sender transmits a dump along with other | |||
|  | application-defined information to a crash report server via HTTP. Because dumps | |||
|  | may contain sensitive data, the sender allows for the use of HTTPS. | |||
|  | 
 | |||
|  | The canonical example of the entire client system would be for a monitored | |||
|  | application to link against the handler library, install a Breakpad handler from | |||
|  | its main function, and provide a callback to launch a small crash reporter | |||
|  | program. The crash reporter program would be linked against the sender library, | |||
|  | and would send the crash dump when launched. A separate process is recommended | |||
|  | for this function because of the unreliability inherent in doing any significant | |||
|  | amount of work from a crashed process. | |||
|  | 
 | |||
|  | ## Detailed Design
 | |||
|  | 
 | |||
|  | ### Exception Handler Installation
 | |||
|  | 
 | |||
|  | The mechanisms for installing an exception handler vary between operating | |||
|  | systems. On Windows, it’s a relatively simple matter of making one call to | |||
|  | register a [top-level exception | |||
|  | filter](http://msdn.microsoft.com/library/en-us/debug/base/setunhandledexceptionfilter.asp) | |||
|  | callback function. On most Unix-like systems such as Linux, processes are | |||
|  | informed of exceptions by the delivery of a signal, so an exception handler | |||
|  | takes the form of a signal handler. The native mechanism to catch exceptions on | |||
|  | Mac OS X requires a large amount of code to set up a Mach port, identify it as | |||
|  | the exception port, and assign a thread to listen for an exception on that port. | |||
|  | Just as the preparation of exception handlers differ, the manner in which they | |||
|  | are called differs as well. On Windows and most Unix-like systems, the handler | |||
|  | is called on the thread that caused the exception. On Mac OS X, the thread | |||
|  | listening to the exception port is notified that an exception has occurred. The | |||
|  | different implementations of the Breakpad handler libraries perform these tasks | |||
|  | in the appropriate ways on each platform, while exposing a similar interface on | |||
|  | each. | |||
|  | 
 | |||
|  | A Breakpad handler is embodied in an `ExceptionHandler` object. Because it’s a | |||
|  | C++ object, `ExceptionHandler`s may be created as local variables, allowing them | |||
|  | to be installed and removed as functions are called and return. This provides | |||
|  | one possible way for a developer to monitor only a portion of an application for | |||
|  | crashes. | |||
|  | 
 | |||
|  | ### Exception Basics
 | |||
|  | 
 | |||
|  | Once an application encounters an exception, it is in an indeterminate and | |||
|  | possibly hazardous state. Consequently, any code that runs after an exception | |||
|  | occurs must take extreme care to avoid performing operations that might fail, | |||
|  | hang, or cause additional exceptions. This task is not at all straightforward, | |||
|  | and the Breakpad handler library seeks to do it properly, accounting for all of | |||
|  | the minute details while allowing other application developers, even those with | |||
|  | little systems programming experience, to reap the benefits. All of the Breakpad | |||
|  | handler code that executes after an exception occurs has been written according | |||
|  | to the following guidelines for safety at exception time: | |||
|  | 
 | |||
|  | *   Use of the application heap is forbidden. The heap may be corrupt or | |||
|  |     otherwise unusable, and allocators may not function. | |||
|  | *   Resource allocation must be severely limited. The handler may create a new | |||
|  |     file to contain the dump, and it may attempt to launch a process to continue | |||
|  |     handling the crash. | |||
|  | *   Execution on the thread that caused the exception is significantly limited. | |||
|  |     The only code permitted to execute on this thread is the code necessary to | |||
|  |     transition handling to a dedicated preallocated handler thread, and the code | |||
|  |     to return from the exception handler. | |||
|  | *   Handlers shouldn’t handle crashes by attempting to walk stacks themselves, | |||
|  |     as stacks may be in inconsistent states. Dump generation should be performed | |||
|  |     by interfacing with the operating system’s memory manager and code module | |||
|  |     manager. | |||
|  | *   Library code, including runtime library code, must be avoided unless it | |||
|  |     provably meets the above guidelines. For example, this means that the STL | |||
|  |     string class may not be used, because it performs operations that attempt to | |||
|  |     allocate and use heap memory. It also means that many C runtime functions | |||
|  |     must be avoided, particularly on Windows, because of heap operations that | |||
|  |     they may perform. | |||
|  | 
 | |||
|  | A dedicated handler thread is used to preserve the state of the exception thread | |||
|  | when an exception occurs: during dump generation, it is difficult if not | |||
|  | impossible for a thread to accurately capture its own state. Performing all | |||
|  | exception-handling functions on a separate thread is also critical when handling | |||
|  | stack-limit-exceeded exceptions. It would be hazardous to run out of stack space | |||
|  | while attempting to handle an exception. Because of the rule against allocating | |||
|  | resources at exception time, the Breakpad handler library creates its handler | |||
|  | thread when it installs its exception handler. On Mac OS X, this handler thread | |||
|  | is created during the normal setup of the exception handler, and the handler | |||
|  | thread will be signaled directly in the event of an exception. On Windows and | |||
|  | Linux, the handler thread is signaled by a small amount of code that executes on | |||
|  | the exception thread. Because the code that executes on the exception thread in | |||
|  | this case is small and safe, this does not pose a problem. Even when an | |||
|  | exception is caused by exceeding stack size limits, this code is sufficiently | |||
|  | compact to execute entirely within the stack’s guard page without causing an | |||
|  | exception. | |||
|  | 
 | |||
|  | The handler thread may also be triggered directly by a user call, even when no | |||
|  | exception occurs, to allow dumps to be generated at any point deemed | |||
|  | interesting. | |||
|  | 
 | |||
|  | ### Filter Callback
 | |||
|  | 
 | |||
|  | When the handler thread begins handling an exception, it calls an optional | |||
|  | user-defined filter callback function, which is responsible for judging whether | |||
|  | Breakpad’s handler should continue handling the exception or not. This mechanism | |||
|  | is provided for the benefit of library or plug-in code, whose developers may not | |||
|  | be interested in reports of crashes that occur outside of their modules but | |||
|  | within processes hosting their code. If the filter callback indicates that it is | |||
|  | not interested in the exception, the Breakpad handler arranges for it to be | |||
|  | delivered to any previously-installed handler. | |||
|  | 
 | |||
|  | ### Dump Generation
 | |||
|  | 
 | |||
|  | Assuming that the filter callback approves (or does not exist), the handler | |||
|  | writes a dump in a directory specified by the application developer when the | |||
|  | handler was installed, using a previously generated unique identifier to avoid | |||
|  | name collisions. The mechanics of dump generation also vary between platforms, | |||
|  | but in general, the process involves enumerating each thread of execution, and | |||
|  | capturing its state, including processor context and the active portion of its | |||
|  | stack area. The dump also includes a list of the code modules loaded in to the | |||
|  | application, and an indicator of which thread generated the exception or | |||
|  | requested the dump. In order to avoid allocating memory during this process, the | |||
|  | dump is written in place on disk. | |||
|  | 
 | |||
|  | ### Post-Dump Behavior
 | |||
|  | 
 | |||
|  | Upon completion of writing the dump, a second callback function is called. This | |||
|  | callback may be used to launch a separate crash reporting program or to collect | |||
|  | additional data from the application. The callback may also be used to influence | |||
|  | whether Breakpad will treat the exception as handled or unhandled. Even after a | |||
|  | dump is successfully generated, Breakpad can be made to behave as though it | |||
|  | didn’t actually handle an exception. This function may be useful for developers | |||
|  | who want to test their applications with Breakpad enabled but still retain the | |||
|  | ability to use traditional debugging techniques. It also allows a | |||
|  | Breakpad-enabled application to coexist with a platform’s native crash reporting | |||
|  | system, such as Mac OS X’ [CrashReporter](http://developer.apple.com/technotes/tn2004/tn2123.html) | |||
|  | and [Windows Error Reporting](http://msdn.microsoft.com/isv/resources/wer/). | |||
|  | 
 | |||
|  | Typically, when Breakpad handles an exception fully and no debuggers are | |||
|  | involved, the crashed process will terminate. | |||
|  | 
 | |||
|  | Authors of both callback functions that execute within a Breakpad handler are | |||
|  | cautioned that their code will be run at exception time, and that as a result, | |||
|  | they should observe the same programming practices that the Breakpad handler | |||
|  | itself adheres to. Notably, if a callback is to be used to collect additional | |||
|  | data from an application, it should take care to read only “safe” data. This | |||
|  | might involve accessing only static memory locations that are updated | |||
|  | periodically during the course of normal program execution. | |||
|  | 
 | |||
|  | ### Sender Library
 | |||
|  | 
 | |||
|  | The Breakpad sender library provides a single function to send a crash report to | |||
|  | a crash server. It accepts a crash server’s URL, a map of key-value parameters | |||
|  | that will accompany the dump, and the path to a dump file itself. Each of the | |||
|  | key-value parameters and the dump file are sent as distinct parts of a multipart | |||
|  | HTTP POST request to the specified URL using the platform’s native HTTP | |||
|  | facilities. On Linux, [libcurl](http://curl.haxx.se/) is used for this function, | |||
|  | as it is the closest thing to a standard HTTP library available on that | |||
|  | platform. | |||
|  | 
 | |||
|  | ## Future Plans
 | |||
|  | 
 | |||
|  | Although we’ve had great success with in-process dump generation by following | |||
|  | our guidelines for safe code at exception time, we are exploring options for | |||
|  | allowing dumps to be generated in a separate process, to further enhance the | |||
|  | handler library’s robustness. | |||
|  | 
 | |||
|  | On Windows, we intend to offer tools to make it easier for Breakpad’s settings | |||
|  | to be managed by the native group policy management system. | |||
|  | 
 | |||
|  | We also plan to offer tools that many developers would find desirable in the | |||
|  | context of handling crashes, such as a mechanism to determine at launch if the | |||
|  | program last terminated in a crash, and a way to calculate “crashiness” in terms | |||
|  | of crashes over time or the number of application launches between crashes. | |||
|  | 
 | |||
|  | We are also investigating methods to capture crashes that occur early in an | |||
|  | application’s launch sequence, including crashes that occur before a program’s | |||
|  | main function begins executing. |