
A Garbage Collector for the C Programming Language

Internal guides: Ms. Padma, Ms. Aarthi
External guide: Dr. B. Prabhakar
Done by: R. Baskar, S. Prince Philemon Raj, V. Ramakrishnan, V. S. Vijay Raj

Dec 2001 - March 2002

Static Vs. Dynamic Allocation

Early versions of Fortran
- All memory was static

C
- Mix of static and dynamic allocation
- Dynamic allocation must be managed entirely by the programmer
- malloc, realloc, calloc, free

Garbage Collection
Sometimes called automatic memory management.
Affects the design of programs:
- Tendency to use painless features
- Does have a cost
Part of the overall heap management problem.
Supported by many languages, such as Java and C# (C sharp).

What Is Garbage Collection?


- The program(mer) requests allocation of memory from the heap.
- If the allocation is granted, memory is allocated and its address is returned and stored in a pointer variable.
- The contents of a pointer variable may be copied, so multiple pointers may point to the same location.
- The allocated area becomes "garbage" once it is no longer referenced by any pointer.
- Typically, garbage collection occurs when the runtime system no longer has any free memory to allocate.
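The life cycle above can be illustrated with a few lines of C. This is only a sketch: the function name garbage_demo is hypothetical, and the block is lost on purpose to show the moment at which it turns into garbage.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative only: deliberately loses one heap block.
   Returns the number of blocks that became garbage, or -1 if malloc failed. */
int garbage_demo(void) {
    int *p = malloc(sizeof *p);  /* request memory from the heap */
    if (p == NULL)
        return -1;               /* the allocation was not granted */

    int *q = p;                  /* copy the pointer: two pointers, one block */
    assert(q == p);

    p = NULL;                    /* still reachable through q: not garbage yet */
    *q = 42;

    q = NULL;                    /* last pointer gone: the block is now garbage */
    return 1;                    /* without a collector, this block is leaked */
}
```

Without a collector the block stays allocated until the process exits, which is exactly the situation a garbage collector is meant to detect.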

What Is Garbage Collection?

Garbage collection is the process of automatically detecting which memory is no longer in use and reclaiming it. Memory to which no pointers refer is considered dead, or garbage. A garbage collector needs to be able to identify root pointers and pointers contained within allocated objects.

Types of Garbage Collection


1) Accurate GC
2) Conservative GC

Accurate vs. Conservative GC

- A conservative GC treats any memory word that could be a heap address as if it were a pointer.
- An accurate garbage collector relies on the type system to distinguish pointers from other data.
- For accurate GC, pointers can be tagged at run time, or static type information can be derived by the compiler.
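A minimal sketch of the conservative test, assuming the collector knows the address range of its heap. The names HeapRange, looks_like_pointer and count_candidates are hypothetical and are not taken from the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap bounds; a real collector would record these itself. */
typedef struct {
    uintptr_t start;  /* address of the first heap byte */
    uintptr_t end;    /* one past the last heap byte */
} HeapRange;

/* Conservative test: a word is treated as a pointer if its value lies
   inside the heap and is aligned like a pointer. */
int looks_like_pointer(const HeapRange *h, uintptr_t word) {
    assert(h != NULL);
    return word >= h->start && word < h->end
        && word % sizeof(void *) == 0;
}

/* Scan a region (for example a stack or global area) word by word and
   count the values that must be treated as pointers. */
size_t count_candidates(const HeapRange *h, const uintptr_t *words, size_t n) {
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (looks_like_pointer(h, words[i]))
            hits++;
    return hits;
}
```

An integer that happens to hold a heap address passes this test too, which is why a conservative collector may keep some dead blocks alive.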

Now let's have a look at some of the other popularly used garbage collection techniques.

Various GC Algorithms
a) Mark-Sweep Garbage Collection
b) Copying Collectors
c) Generational Collectors
d) Real-Time GC
e) Reference Counting
amongst others.

Fully Copying Garbage Collection

- Incrementally copy objects from from-space into to-space (easy defragmentation).
- Redirect memory accesses between from-space and to-space.
- However, only 50% of memory is available for allocation.
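The copying idea can be sketched as follows. This is a simplified stop-the-world version in the style of Cheney's algorithm, not the incremental variant with redirected accesses described above; the Cell layout, the fixed semispace size and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SEMI_CELLS 64            /* cells per semispace (arbitrary) */

typedef struct Cell {
    struct Cell *left, *right;   /* child pointers, NULL if absent */
    struct Cell *forward;        /* new address once the cell is copied */
    int value;
} Cell;

static Cell space_a[SEMI_CELLS], space_b[SEMI_CELLS];
static Cell *from_space = space_a, *to_space = space_b;
static size_t next_free = 0;     /* bump pointer into from_space */

Cell *cell_alloc(int value, Cell *left, Cell *right) {
    if (next_free == SEMI_CELLS)
        return NULL;             /* space exhausted: caller should collect */
    Cell *c = &from_space[next_free++];
    c->left = left; c->right = right; c->forward = NULL; c->value = value;
    return c;
}

/* Copy one cell into to-space unless it was copied already. */
static Cell *evacuate(Cell *c, size_t *top) {
    if (c == NULL)
        return NULL;
    if (c->forward == NULL) {
        assert(*top < SEMI_CELLS);
        Cell *copy = &to_space[(*top)++];
        *copy = *c;              /* forward is still NULL in the copy */
        c->forward = copy;       /* leave a forwarding address behind */
    }
    return c->forward;
}

/* Copy everything reachable from the roots, then swap the two spaces. */
void collect(Cell **roots[], size_t nroots) {
    size_t top = 0, scan = 0;
    for (size_t i = 0; i < nroots; i++)
        *roots[i] = evacuate(*roots[i], &top);
    while (scan < top) {         /* scan copied cells, copying their children */
        Cell *c = &to_space[scan++];
        c->left = evacuate(c->left, &top);
        c->right = evacuate(c->right, &top);
    }
    Cell *tmp = from_space; from_space = to_space; to_space = tmp;
    next_free = top;             /* survivors are now packed at the start */
}

size_t cells_in_use(void) { return next_free; }
```

After collect returns, the survivors sit contiguously at the start of the new from-space, which is where the defragmentation comes from; the other semispace sits idle, which is the 50% cost.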

Generational Garbage Collection


- Most objects die young.
- Memory allocation is performed from a region designated as the nursery.
- Garbage collection focuses its effort on the nursery, because that's where the greatest percentage of objects is likely to be dead.
- Objective: for one tenth of the effort required to perform a full GC, reclaim significantly more than one tenth of the available garbage.

Real-Time GC
Mission-critical embedded applications may require real-time GC.
A real-time GC algorithm must have the following properties:
- Incremental
- Fully accurate
- Defragmenting

Reference Counters

Each allocated block of memory contains a counter.
- Each time another pointer starts pointing to the block, the counter is incremented.
- Each time a pointer stops pointing at the block, the counter is decremented.
- If the counter reaches 0, the block is returned to the free memory list.

Problems
- If the blocks are small, the storage taken up by the counters becomes significant.
- Execution-time penalty: the counters must be updated every time a pointer is assigned.
- Circular structures pose difficulties (not insurmountable).
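The counting rules can be sketched in C as below. The names rc_alloc, rc_retain and rc_release are hypothetical, the counter lives in a header in front of each block, and the sketch hands dead blocks to free() instead of maintaining its own free list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of every block; the union keeps the payload
   aligned for any object type. */
typedef union {
    size_t count;        /* number of pointers referring to the block */
    max_align_t align;
} RcHeader;

static size_t live_blocks = 0;   /* bookkeeping for the sketch only */

static RcHeader *header_of(void *p) { return (RcHeader *)p - 1; }

void *rc_alloc(size_t size) {
    RcHeader *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->count = 1;                /* the caller holds the first reference */
    live_blocks++;
    return h + 1;
}

/* Call when another pointer starts pointing to the block. */
void *rc_retain(void *p) {
    if (p != NULL)
        header_of(p)->count++;
    return p;
}

/* Call when a pointer stops pointing at the block. */
void rc_release(void *p) {
    if (p == NULL)
        return;
    RcHeader *h = header_of(p);
    assert(h->count > 0);
    if (--h->count == 0) {       /* no references left: reclaim at once */
        live_blocks--;
        free(h);
    }
}

size_t rc_live_blocks(void) { return live_blocks; }
```

Two blocks that hold references to each other never reach a count of 0 under this scheme, which is the circular-structure problem listed above.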

Comparative Benefits of Different GC Approaches

Copying:
- Fully defragments the heap on each pass
- Relatively expensive read (especially) and write barriers

Mark and Sweep:
- Typical heap utilization of 75%
- Vulnerable to arbitrarily poor worst-case memory utilization due to fragmentation

Mark Sweep Algorithm


- First algorithm for automatic storage reclamation (McCarthy, 1960).
- A stop-and-run algorithm: the program is halted while the collector runs.
- A tracing garbage collection technique.
- Works in 2 phases:
  1) Mark all live nodes by traversal.
  2) Sweep the heap by a linear scan.

Mark-Sweep Algorithm
Benefits:
a) No overhead on pointer manipulations.
b) Low space cost: a simple mark field/bit per cell.
Drawbacks:
a) Computation is halted during GC.
b) Every live cell is visited during marking.
c) All cells are then examined again by the sweep.
d) Recursive marking is costly in both time and stack space.

Mark-Sweep Algorithm

- Each block, or its corresponding bookkeeping record, must contain a mark bit.
- Initially all blocks are unmarked.
- Starting at each symbol (root), perform a search, marking all reachable blocks (marked means in use).
- Sweep through all blocks:
  - If marked: unmark.
  - If unmarked: move to the free list (i.e. free it).

Note:
- The algorithm must be the only thing running.
- Garbage collection is only done when necessary, i.e. when the free list is empty.
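The steps above can be sketched over a small fixed heap. This is an illustration under stated assumptions, not the project's implementation: the Block layout, the heap size and the function names are hypothetical, and roots are passed in explicitly instead of being discovered by scanning.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_BLOCKS 16
#define MAX_REFS 2

typedef struct Block {
    int marked;                   /* the mark bit */
    int in_use;                   /* 0 while the block sits on the free list */
    struct Block *refs[MAX_REFS]; /* pointers held inside the block */
    struct Block *next_free;      /* free-list link */
} Block;

static Block heap[HEAP_BLOCKS];
static Block *free_list = NULL;

void heap_init(void) {
    free_list = NULL;
    for (size_t i = 0; i < HEAP_BLOCKS; i++) {
        heap[i] = (Block){0};     /* all blocks start unmarked */
        heap[i].next_free = free_list;
        free_list = &heap[i];
    }
}

Block *block_alloc(void) {
    Block *b = free_list;
    if (b == NULL)
        return NULL;              /* free list empty: time to collect */
    free_list = b->next_free;
    *b = (Block){ .in_use = 1 };
    return b;
}

/* Mark phase: depth-first search from one block. The recursion is the
   time and stack-space cost noted as a drawback. */
static void mark(Block *b) {
    if (b == NULL || b->marked)
        return;
    b->marked = 1;
    for (size_t i = 0; i < MAX_REFS; i++)
        mark(b->refs[i]);
}

/* Runs both phases and returns the number of blocks moved to the free list. */
size_t collect(Block *roots[], size_t nroots) {
    size_t freed = 0;
    for (size_t i = 0; i < nroots; i++)
        mark(roots[i]);
    for (size_t i = 0; i < HEAP_BLOCKS; i++) {   /* sweep: linear scan */
        Block *b = &heap[i];
        if (b->marked) {
            b->marked = 0;        /* unmark, ready for the next collection */
        } else if (b->in_use) {
            b->in_use = 0;        /* unmarked and allocated: garbage */
            b->next_free = free_list;
            free_list = b;
            freed++;
        }
    }
    return freed;
}
```

A caller would typically invoke collect when block_alloc returns NULL and then retry the allocation.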

Characteristics of Mark and Sweep Algorithm

- Can place an upper bound on the total time required to perform a complete GC; use this to pace GC against ongoing allocation needs.
- No time-intrusive overheads to interfere with the calculation of worst-case task execution times.
- But cannot guarantee to avoid fragmentation.

Consider the following arrangement of nodes

[Figure: internal representation of the example nodes. Labels in the figure: X, Y, foo, blarg, bar, baz, ().]

[Figure sequence: the collector running on this heap, shown frame by frame in three stages: Mark, Sweep, Done. Each frame shows the Free pointer and the Free List alongside the blocks reachable from X and Y.]

Here, X and Y correspond to the root pointers, and the block pointed to by Free corresponds to the free list.

As Simple as ......


Our Algorithm
The algorithm we follow in our garbage collector is the periodic conservative mark-sweep GC algorithm. By "periodic" we mean that our GC routine runs whenever allocation exceeds a predefined limit. This limit can also be set by the user. Conservative GC algorithms have the following properties.

Comments on Conservative GC
- Conservative GC cannot guarantee to reclaim all of the dead memory.
- Conservative GC can be guaranteed not to reclaim live memory.
Nevertheless, conservative GC:
- Has very low run-time overhead, since no run-time type information is needed
- Works well in practice
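The "periodic" trigger described under Our Algorithm can be sketched as an allocation wrapper that counts bytes and runs the collector once a limit is exceeded. All names here (counted_alloc, set_gc_limit, run_collector) are hypothetical, and run_collector is a stand-in that only counts invocations instead of marking and sweeping.

```c
#include <assert.h>
#include <stdlib.h>

static size_t gc_limit = 1024;        /* bytes allowed between collections */
static size_t allocated_since_gc = 0;
static unsigned collections = 0;

/* Stand-in for the real collector: a mark-sweep pass would run here. */
static void run_collector(void) {
    collections++;
    allocated_since_gc = 0;
}

/* Lets the user change the limit, as the slide describes. */
void set_gc_limit(size_t limit) {
    assert(limit > 0);
    gc_limit = limit;
}

/* Allocate, collecting first if this request would exceed the limit. */
void *counted_alloc(size_t size) {
    if (allocated_since_gc + size > gc_limit)
        run_collector();
    void *p = malloc(size);
    if (p != NULL)
        allocated_since_gc += size;
    return p;
}

unsigned collections_run(void) { return collections; }
```

A lower limit makes collections more frequent but shorter; a higher limit trades memory for fewer pauses.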

Detecting Garbage
Roots


The C Interface for our Collector


The following functions are defined in our product:
void *GC_malloc(size_t size)
void *GC_malloc_uncollectable(size_t size)
void *GC_malloc_finalizer(size_t size, void (*func)())
void GC_free(void *waste)
void GC_set_gc_rate(size_t max_heap_size)
void GC_collect()

void *GC_malloc(size_t size)

Memory allocated by this routine is subject to garbage collection by our GC algorithm. The argument size holds the size of the object to be allocated.
The return value is:
- a pointer to the allocated object if successful.
- NULL if the request cannot be satisfied.

void GC_collect(void)

The programmer can use this function to invoke the collector at any point in time.

void *GC_malloc_uncollectable(size_t size)

Memory allocated by this routine is NOT subject to garbage collection by our GC algorithm. The argument size holds the size of the object to be allocated.
The return value is:
- a pointer to the allocated object if successful.
- NULL if the request cannot be satisfied.
Memory allocated in this manner must be freed explicitly using the function GC_free().

C with Finalizers
A finalizer is a routine that gets called just before an object is freed. It may be used to perform any clean-up the object needs just before it is destroyed. C has no inherent support for finalizers. We have developed a module that implements finalizers for the C programming language.

void *GC_malloc_finalizer(size_t size, void (*func)())

Memory allocated by this routine is subject to garbage collection by our GC algorithm. The argument size holds the size of the object to be allocated. The argument func holds the address of the function (finalizer) that our GC routine calls before collecting the allocated object.
The return value is:
- a pointer to the allocated object if successful.
- NULL if allocation fails.
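One way such a finalizer hook could work is to store the function pointer in a header in front of the object and call it just before the block is released. This is a sketch, not the project's module: fin_alloc, fin_reclaim and count_close are hypothetical names, and passing the object to the finalizer is an assumption, since the slide declares func with an unspecified parameter list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed finalizer shape: it receives the object about to be freed. */
typedef void (*Finalizer)(void *object);

/* Header stored in front of each object; the union keeps the payload aligned. */
typedef union {
    Finalizer fn;        /* called just before the block is released */
    max_align_t align;
} FinHeader;

void *fin_alloc(size_t size, Finalizer fn) {
    FinHeader *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->fn = fn;
    return h + 1;
}

/* What a collector would do once it decides the object is dead. */
void fin_reclaim(void *object) {
    if (object == NULL)
        return;
    FinHeader *h = (FinHeader *)object - 1;
    if (h->fn != NULL)
        h->fn(object);   /* clean up while the object is still valid */
    free(h);
}

/* Example finalizer: counts how many objects have been finalized. */
static int closed_count = 0;
void count_close(void *object) {
    (void)object;
    closed_count++;
}
```

A finalizer written this way should not store the object pointer anywhere, because the memory is released as soon as it returns.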

void GC_free(void *waste)

This function frees (adds to the free list) the object pointed to by the pointer argument waste.

How to use our Product?


1) Replace calls to void *malloc(size_t size) with our GC function void *GC_malloc(size_t size).
2) No corresponding free(void *ptr) is needed for memory allocated using GC_malloc(). Memory allocated using GC_malloc() is collected automatically by our GC routine.
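A sketch of what the change looks like in client code. So that the example compiles on its own, GC_malloc is defined here as a stand-in that simply forwards to malloc; in a real program it would come from the collector instead. The function greeting is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in only: the real GC_malloc is provided by the collector.
   This version forwards to malloc, so nothing is actually collected. */
void *GC_malloc(size_t size) { return malloc(size); }

/* Before the change this function would call malloc(n) and its caller
   would have to remember to free() the result. */
char *greeting(const char *name) {
    assert(name != NULL);
    size_t n = strlen("Hello, ") + strlen(name) + 1;
    char *s = GC_malloc(n);      /* step 1: malloc replaced by GC_malloc */
    if (s == NULL)
        return NULL;
    strcpy(s, "Hello, ");
    strcat(s, name);
    return s;                    /* step 2: no matching free() anywhere */
}
```

The caller uses the returned string and then simply drops the pointer; with the real collector linked in, the block would be reclaimed once it is no longer reachable.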

Programmer's Choice!
We do not force the programmer to use only our memory allocation routines. The programmer is free to use either standard library functions such as malloc() or our GC routines. However, memory allocated with malloc() and other standard routines is not garbage collected by us. Freeing it remains, as usual, the job of the programmer.

Our Product is...

- Freeware (free download from our college website)
- Open source
- Platform dependent
- Tested to work properly, with little overhead, on the following operating systems (as of now):
  - FreeBSD 2.2.8-RELEASE
  - FreeBSD 4.3-RELEASE
  - Red Hat Linux 7.2

GC vs. Explicit Memory Management


- GC implies a time lag between when memory becomes garbage and when it is recycled.
- Incremental and real-time GC impose extra overhead on applications (5-15%).
- Properly paced defragmenting real-time GC offers more reliable real-time operation than explicit management.
- Expect a program using GC to run 10-20% slower and be 25% larger than one using explicit memory management.
- Saves 40% of the total cost of development (according to Rovner).

Key Benefits of GC:

- 40% productivity boost during development and maintenance
- Improved reliability:
  - No dangling pointers
  - Simpler application software
  - Reduced memory leaks

Summary
Garbage collection, as described, has many advantages. C, one of the most widely used programming languages, lacks this feature. Through this project we have extended C by designing a garbage collector for it. How this was done is described in more detail in our manual.

Our Sincere Thanks to


1) Dr. C. Aravindan, HOD, CSE Dept.
2) Mr. Nick Brans, Research Scholar
3) Mr. Hans Boehm, Hewlett Packard
4) Mr. Karl, Debian Linux
5) Mr. Dave, Sun Microsystems
who have helped us a lot during the different stages of our project.