Skip to content

Latest commit

 

History

History
336 lines (248 loc) · 15.3 KB

overview.md

File metadata and controls

336 lines (248 loc) · 15.3 KB
Interface Overview Tutorial Slides FAQ Example Download

A garbage collector for C and C++

  • Platforms
  • Some collector details
  • Further reading
  • Information provided on the BDWGC site
  • Documentation files
  • More background information
  • Contacts and new release announcements

[ This is an updated version of the page formerly at www.hpl.hp.com/personal/Hans_Boehm/gc/, before that at http://reality.sgi.com/boehm/gc.html and before that at ftp://ftp.parc.xerox.com/pub/gc/gc.html. ]

The Boehm-Demers-Weiser conservative Garbage Collector (BDWGC) can be used as a garbage collecting replacement for C malloc or C++ new. It allows you to allocate memory basically as you normally would, without explicitly deallocating memory that is no longer useful. The collector automatically recycles memory when it determines that it can no longer be otherwise accessed. A simple example of such a use is given here.

The collector is also used by a number of programming language implementations that either use C as intermediate code, want to facilitate easier interoperation with C libraries, or just prefer the simple collector interface. For a more detailed description of the interface, see here.

Alternatively, the garbage collector may be used as a leak detector for C or C++ programs, though that is not its primary goal.

Typically several versions are offered for downloading: preview, stable, legacy. Usually you should use the one marked as the latest stable release. Preview versions may contain additional features, platform support, but are likely to be less well tested. The list of changes for each version is specified on the releases page. The development version (snapshot) is available in the master branch of bdwgc git repository on GitHub.

The arguments for and against conservative garbage collection in C and C++ are briefly discussed here.

The garbage collector code is copyrighted by Hans-J. Boehm, Alan J. Demers, Xerox Corporation, Silicon Graphics, Hewlett-Packard Company, Ivan Maidanski, and partially by some others. It may be used and copied without payment of a fee under minimal restrictions. See the LICENSE file in the distribution for more details. IT IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED OR IMPLIED. ANY USE IS AT YOUR OWN RISK.

Empirically, this collector works with most unmodified C programs, simply by replacing malloc and calloc with GC_malloc calls, replacing realloc with GC_realloc calls, and removing free calls. Exceptions are discussed here.

Platforms

The collector is not completely portable, but the distribution includes ports to most standard PC and UNIX/Linux platforms. The collector should work on Linux, Android, BSD variants, OS/2, Windows (Win32 and Win64), MacOS X, iOS, HP/UX, Solaris, Tru64, Irix, Symbian and other operating systems. Some platforms are more polished (better supported) than others.

Irix pthreads, Linux threads, Windows threads, Solaris threads (pthreads only), HP/UX 11 pthreads, Tru64 pthreads, and MacOS X threads are supported.

See also here for the instructions on how to port the library to new platforms.

Some Collector Details

The collector uses a mark-sweep algorithm. It provides incremental and generational collection under operating systems which provide the right kind of virtual memory support. (Currently this includes SunOS[45], IRIX, OSF/1, Linux, and Windows, with varying restrictions.) It allows finalization code to be invoked when an object is collected. It can take advantage of type information to locate pointers if such information is provided, but it is usually used without such information. See the README and gc.h files in the distribution for more details.

For an overview of the implementation, see here.

The garbage collector distribution includes a C string (cord.h) package that provides for fast concatenation and substring operations on long strings. A simple curses- and Windows-based editor that represents the entire file as a cord is included as a sample application. See cords.md file for the details.

Performance of the non-incremental collector is typically competitive with malloc/free implementations. Both space and time overhead are likely to be only slightly higher for programs written for malloc/free (see Detlefs, Dosser and Zorn's Memory Allocation Costs in Large C and C++ Programs). For programs allocating primarily very small objects, the collector may be faster; for programs allocating primarily large objects it will be slower. If the collector is used in a multi-threaded environment and configured for thread-local allocation, it may in some cases significantly outperform malloc/free allocation in time.

We also expect that in many cases any additional overhead will be more than compensated for by e.g. decreased copying if programs are written and tuned for garbage collection.

Further reading

The following provide information on garbage collection in general:

Paul Wilson's garbage collection ftp archive and GC survey.

The Ravenbrook Memory Management Reference.

David Chase's GC FAQ.

Richard Jones' Garbage Collection page and his book mentioned on the page.

The following papers describe the collector algorithms we use and the underlying design decisions at a higher level:

(Some of the lower level details can be found here.)

The first one is not available electronically due to copyright considerations. Most of the others are subject to ACM copyright.

Boehm, H., Dynamic Memory Allocation and Garbage Collection, Computers in Physics 9, 3, May/June 1995, pp. 297-303. This is directed at an otherwise sophisticated audience unfamiliar with memory allocation issues. The algorithmic details differ from those in the implementation. There is a related letter to the editor and a minor correction in the next issue.

Boehm, H., and M. Weiser, Garbage Collection in an Uncooperative Environment, Software Practice and Experience, September 1988, pp. 807-820.

Boehm, H., A. Demers, and S. Shenker, Mostly Parallel Garbage Collection, Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and Implementation, SIGPLAN Notices 26, 6 (June 1991), pp. 157-164.

Boehm, H., Space Efficient Conservative Garbage Collection, Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design and Implementation, SIGPLAN Notices 28, 6 (June 1993), pp. 197-206.

Boehm, H., Reducing Garbage Collector Cache Misses, Proceedings of the 2000 International Symposium on Memory Management. Official version. Describes the prefetch strategy incorporated into the collector for some platforms. Explains why the sweep phase of a mark-sweep collector should not really be a distinct phase.

M. Serrano, H. Boehm, Understanding Memory Allocation of Scheme Programs, Proceedings of the Fifth ACM SIGPLAN International Conference on Functional Programming, 2000, Montreal, Canada, pp. 245-256. Official version. Includes some discussion of the collector debugging facilities for identifying causes of memory retention.

Boehm, H., Fast Multiprocessor Memory Allocation and Garbage Collection. Discusses the parallel collection algorithms, and presents some performance results.

Boehm, H., Bounding Space Usage of Conservative Garbage Collectors, Proceedings of the 2002 ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Jan. 2002, pp. 93-100. Official version. Includes a discussion of a collector facility to much more reliably test for the potential of unbounded heap growth.

The following papers discuss language and compiler restrictions necessary to guaranteed safety of conservative garbage collection:

We thank John Levine and JCLT for allowing us to make the second paper available electronically, and providing PostScript for the final version.

Boehm, H., Simple Garbage-Collector-Safety, Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design and Implementation.

Boehm, H., and D. Chase, A Proposal for Garbage-Collector-Safe C Compilation, Journal of C Language Translation 4, 2 (December 1992), pp. 126-141.

Other related information:

The Detlefs, Dosser and Zorn's Memory Allocation Costs in Large C and C++ Programs. This is a performance comparison of the Boehm-Demers-Weiser collector to malloc/free, using programs written for malloc/free.

Joel Bartlett's Mostly Copying Conservative Garbage Collector for C++.

John Ellis and David Detlef's "Safe, Efficient Garbage Collection for C++" proposal.

Henry Baker's Archive of Research Papers.

Slides for Hans Boehm's Allocation and GC Myths talk.

Information provided on the BDWGC site

Known BDWGC users list.

Tutorial slides from an ISMM 2004: The Boehm-Demers-Weiser Conservative Garbage Collector.

A FAQ (frequently asked questions) list.

Directory containing the distribution files of all garbage collector releases. It duplicates Download page on GitHub.

Documentation files

The following documents are not platform-specific in general.

A simple illustration of how to build and use the collector.

Description of alternate interfaces to the garbage collector.

How to use the garbage collector as a leak detector.

Some hints on debugging garbage collected applications.

An overview of the implementation of the garbage collector.

The data structure used for fast pointer lookups.

Scalability of the collector to multiprocessors.

Instructions on building the library using autoconf/configure.

Instructions on building the library using cmake.

List of environment variables that affect the collector operation at runtime.

List of compile time macros that affect the library when built.

Details on the finalization facility.

Instructions on how to port the library to new platforms.

Description of the cord library built on top of GC.

More background information

An attempt to establish a bound on space usage of conservative garbage collectors.

Mark-sweep versus copying garbage collectors and their complexity.

"Why Conservative Garbage Collectors?" (pros and cons of conservative garbage collectors, in comparison to other collectors).

Advantages and Disadvantages of Conservative Garbage Collection (issues related to garbage collection versus manual memory management in C/C++).

An example of Expensive Explicit Deallocation in which the garbage collection results in a much faster implementation as a result of reduced synchronization.

Slide set discussing Performance of Non-Moving Garbage Collectors.

Slide set discussing "Destructors, Finalizers, and Synchronization", POPL 2003 (and the corresponding paper).

A Java/Scheme/C/C++ garbage collection benchmark.

Slides for talk on Memory Allocation Myths.

Slides for OOPSLA 98 Garbage Collection Talk.

Some other related papers.

Contacts and new release announcements

GitHub and Stack Overflow are the major two places for communication.

Technical questions (how to, how does it work, etc.) should be posted to Stack Overflow with boehm-gc tag.

To contribute, please rebase your code to the latest master and submit a pull request to GitHub.

To report a bug, or propose (request) a new feature, create a GitHub issue. Please make sure it has not been reported yet by someone else.

To receive notifications on every release, please subscribe to Releases RSS feed. Notifications on all issues and pull requests are available by watching the project.

Mailing lists (bdwgc-announce@lists.opendylan.org, bdwgc@lists.opendylan.org, and the former gc-announce@linux.hpl.hp.com and gc@linux.hpl.hp.com) are not used at this moment. Their content is available in bdwgc-announce and bdwgc archive files, respectively. The gc list archive may also be read at Narkive.

Some prior discussion of the collector has taken place on the gcc java mailing list, whose archives appear here, and also on gclist@iecc.com.