
Garbage Collection in Data Structure

Garbage collection (GC) is a dynamic technique for memory management and heap allocation that examines and identifies dead memory blocks before reallocating storage for reuse. Garbage collection's primary goal is to reduce memory leaks. Garbage collection frees the programmer from having to deallocate and return objects to the memory system manually. Garbage collection can account for a considerable amount of a program's total processing time, and as a result, can have a significant impact on performance. Stack allocation, region inference, memory ownership, and combinations of various techniques are examples of related techniques.

The basic principle of garbage collection is to find data objects in a program that cannot be accessed in the future and to reclaim the resources used by those objects. Garbage collection does not usually handle resources other than memory, such as network sockets, database handles, user-interface windows, files, and device descriptors. Methods for managing such resources, particularly destructors, may be sufficient to manage them without requiring GC. Some GC systems can associate other resources with a memory region so that collecting the region triggers the reclamation of those resources as well.

Many programming languages, such as RPL, Java, C#, Go, and most scripting languages, require garbage collection either as part of the language specification or effectively for practical implementation (for example, formal languages like lambda calculus); these are referred to as garbage-collected languages. Other languages, such as C and C++, were designed for use with manual memory management but included garbage-collected implementations. Some languages, such as Ada, Modula-3, and C++/CLI, allow for both garbage collection and manual memory management in the same application by using separate heaps for collected and manually managed objects; others, such as D, are garbage-collected but allow the user to delete objects manually and completely disable garbage collection when speed is required.

Garbage collection's dynamic approach to automatic heap management addresses common and costly faults that, if left undetected, can cause serious failures in real-world programs.

Allocation errors are costly because they are difficult to detect and correct. As a result, many programmers regard garbage collection as an essential language feature that simplifies the programmer's job by reducing manual heap allocation management.

Now let us look at two of the most well-known and commonly implemented garbage collection techniques.

  • Mark and Sweep
  • Reference Counting

Mark and Sweep

The mark-sweep algorithm is as straightforward as its name suggests. It consists of two phases: a mark phase and a sweep phase. During the mark phase, the collector traverses all the roots (global variables, local variables, stack frames, virtual and hardware registers, and so on) and marks every object it encounters, typically by setting a bit in or near that object. During the sweep phase, it walks the heap and reclaims the memory of every unmarked object.

The fundamental algorithm is outlined in pseudo-code in Python below. The collector is assumed to be single-threaded in this example, although there might be several mutators. While the collector is running, all mutator threads are paused. This stop-the-world technique may appear inefficient, but it vastly simplifies the collector implementation because mutators cannot affect the state beneath it.
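A minimal single-threaded sketch of the algorithm follows. The `Obj` layout, `roots`, and `heap` here are simplified illustrative assumptions, not the structures of a real runtime:

```python
# Illustrative mark-sweep sketch. Objects are plain records with a mark
# bit and a list of child references; `roots` and `heap` stand in for a
# real runtime's root set and heap.

class Obj:
    def __init__(self):
        self.marked = False
        self.children = []   # outgoing references to other objects

def mark_from_roots(roots):
    candidates = list(roots)          # explicit stack instead of recursion
    while candidates:
        obj = candidates.pop()
        if not obj.marked:
            obj.marked = True
            candidates.extend(obj.children)   # trace child fields

def sweep(heap):
    live = []
    for obj in heap:                  # linear walk over the whole heap
        if obj.marked:
            obj.marked = False        # reset the bit for the next collection
            live.append(obj)
        # unmarked objects are simply dropped (their memory is reclaimed)
    return live

def collect(roots, heap):
    mark_from_roots(roots)
    return sweep(heap)
```

Note the explicit candidate stack: marking never uses recursive procedure calls, for the reasons discussed below.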


It is evident from the pseudo-code that mark-sweep does not identify garbage directly. Instead, it first identifies everything that is not garbage, namely the live objects, and then concludes that everything else is garbage. Marking is an inherently recursive process: after detecting a live reference, we recurse into its child fields, and so on. Because of the time cost and the risk of stack overflow, recursive procedure calls are not a suitable way to mark; that is why we use an explicitly defined stack. This technique makes the space and time overhead of the marking phase explicit. The maximum depth of the candidate stack is determined by the length of the longest path that must be traced through the object graph.

Theoretically, the worst case equals the number of nodes on the heap, but most real-world applications produce rather shallow stacks. Even so, a robust GC implementation must handle the unusual cases. In our implementation, we call mark() immediately after adding a new object to the candidates to keep the stack size under control. The awkward problem with marking is that GC is needed precisely because memory is scarce, yet the auxiliary stack demands even more space; a large application can cause the garbage collector itself to run out of memory.

There are a variety of approaches to detecting overflow. One advantage of using an explicit stack is that an overflow can be immediately identified and a recovery procedure initiated. A straightforward approach is an inline is-full check on each push. A somewhat more efficient solution is to use a guard page and trigger recovery after trapping the guard-violation exception. The tradeoffs of both techniques must be weighed in the context of the underlying operating system and hardware. In the first technique, the is-full test costs a few instructions (a test followed by a branch), but it is performed every time we visit an object. The second technique requires catching access-violation exceptions, which are expensive but rare.
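The inline is-full check can be sketched as follows. The capacity constant and the recovery action (setting aside overflowed objects for a later rescan) are illustrative assumptions, not a prescribed design:

```python
# Sketch of the per-push overflow check on an explicit mark stack.
# MAX_DEPTH is deliberately tiny for illustration; a real collector
# would size the stack from available memory.

MAX_DEPTH = 4

def push_candidate(stack, obj, overflowed):
    if len(stack) >= MAX_DEPTH:   # the inline is-full test on every push
        overflowed.append(obj)    # recovery: remember the object so a
        return                    # later pass can resume marking from it
    stack.append(obj)
```

The guard-page alternative removes this per-push branch entirely, at the price of handling a hardware access-violation trap when the stack finally spills into the protected page.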

sweep() is a simple function with a straightforward implementation: it linearly traverses the heap, freeing any object that is not marked. This does impose a parseability requirement on our heap layout: a next_object(address) routine must be able to return the object that follows the one at the given address. In most cases, the heap only has to be parseable in one direction. In most GC-enabled language runtimes, an object's data is tagged with an object header, which holds details about the object such as its type, size, hash code, mark bits, sync block, and so on.

The header of an object is usually placed before the object's data. As a result, the object's reference points not at the first byte of the allocated heap cell but at the byte immediately after the object header, i.e., into the middle of the cell. This layout makes it easy to parse the heap from the bottom up. In most cases, free(address) fills the freed cell with a predetermined filler pattern that the heap-parsing routine recognizes.
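One-way heap parsing can be sketched with a toy heap of integers, where each cell starts with a one-word header holding the cell size and freed cells are stamped with a filler pattern. The layout and the `FILLER` sentinel are illustrative assumptions:

```python
# Toy parseable heap: a flat list of words. Each cell is
# [size, data, data, ...], where size counts the header word too.
# Freed cells have their first data word stamped with FILLER.

FILLER = -1   # stand-in for a recognizable fill pattern

def next_object(heap, address):
    """Return the address of the cell after the one at `address`."""
    size = heap[address]          # the header word stores the cell size
    return address + size

def parse_heap(heap):
    """Yield the addresses of all live (non-filler) cells, in order."""
    address = 0
    while address < len(heap):
        if heap[address + 1] != FILLER:   # inspect first data word
            yield address
        address = next_object(heap, address)
```

A real heap parser works in bytes and must respect alignment, but the essential contract is the same: given any cell's address, the header alone is enough to find the next cell.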

Advantages and Disadvantages of the Mark and Sweep Algorithm

  • The usage efficiency of the hardware cache is usually the deciding factor in the performance of most applications. The L1-L3 caches can be accessed in 2 to 10 CPU cycles, whereas RAM can take up to 100 cycles. Caches help applications with good temporal and spatial locality perform better. A program has good temporal locality when it accesses memory locations that were recently accessed, and good spatial locality when it accesses nearby memory regions in a scan-like pattern. Unfortunately, the mark phase of the mark-sweep algorithm fares poorly on both temporal and spatial locality. In mark(), the header of an object is normally read and written just once (assuming that most objects are referenced by only a single pointer): we read the mark bit, and if the object has not been marked yet, it will not be accessed again. Hardware prefetching (whether speculative or via explicit prefetch instructions) copes badly with such erratic pointer chasing. One typical strategy for improving cache behavior is to place the mark bits in a separate bitmap instead of in the object headers. The bitmap's format, position, and size depend on various parameters, including heap size, object alignment requirements, and hardware cache sizes. Marking bitmaps benefit the mark-sweep algorithm in several ways: marking does not need to modify objects; many objects can be marked with a single instruction (bit whacking against a bitmap word); fewer words are modified, so fewer cache lines are dirtied and fewer cache flushes result; and sweeping does not need to read any live objects, since it can rely entirely on the bitmap for heap scanning.
  • The mark phase has O(L) complexity, where L is the size of the live objects reachable from all roots. The sweep phase's time complexity is O(H), where H is the size of the heap. Given that H > L, one might assume that O(H) dominates O(L); in practice, however, the sweep phase has excellent cache performance owing to its high spatial locality, while the overall collection pace is dominated by O(L) because of all the cache-unfriendly pointer chasing during marking.
  • Because marking is a costly procedure, it is only done on a limited basis (only when required). The mark-sweep approach uses less space and can cleanly handle cyclical structures without any pointer manipulation complexity compared to reference counting techniques. It, like other tracing algorithms, demands certain heap headroom to function. Additionally, because mark-sweep does not compact the heap, the system may experience increased internal fragmentation, resulting in lower heap utilization (especially for larger allocations).
  • Mark-sweep adds essentially no coordination overhead to the mutator's read and write operations. The only interaction with the mutators is through the object allocation function, and even there the overhead is small.
  • In general, mark-sweep systems require sophisticated allocators that understand and support heap parsing and bitmap manipulation, and heap managers may need non-trivial implementation strategies to cope with internal fragmentation. On the other hand, because mark-sweep does not move objects, it is a good candidate for use in non-cooperative environments where the language runtime does not coordinate with the garbage collector (as can happen when GC is introduced as an afterthought in the language design). Another benefit of not moving objects is that addresses never change, so no pointer patching is required after the sweep phase.
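The side-bitmap idea from the list above can be sketched as follows. The 64-bit word layout and slot-indexed addressing are illustrative assumptions:

```python
# Sketch of a side mark bitmap: one mark bit per heap slot, stored
# apart from the object headers so marking never touches the objects.
# Slot indices stand in for object addresses.

class MarkBitmap:
    def __init__(self, num_slots):
        # one 64-bit word covers 64 heap slots
        self.words = [0] * ((num_slots + 63) // 64)

    def mark(self, slot):
        self.words[slot >> 6] |= 1 << (slot & 63)    # bit whacking

    def is_marked(self, slot):
        return bool(self.words[slot >> 6] & (1 << (slot & 63)))

    def clear(self):
        # resetting all marks dirties only the bitmap, not the heap
        self.words = [0] * len(self.words)
```

Because 64 mark bits share one word, marking many neighboring objects dirties far fewer cache lines than writing a bit into each object's header would.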

Reference Counting

The method of reference counting is conceptually simple. It is based on counting how many pointer references each allocated object has. It is a straightforward, inherently incremental solution because the memory management overhead is distributed throughout the program's execution. Aside from memory management, reference counting is widely used in operating systems as a resource management tool for system resources such as files and sockets.

Each allocated object in the reference counting technique has a reference count field. The memory manager is in charge of ensuring that the reference count of each object is equal to the number of direct pointer references to that object at all times. Below is a simplified version of the algorithm.
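A simplified sketch of the algorithm follows. The `RCObj` layout and helper names are illustrative assumptions standing in for the memory manager's internals:

```python
# Simplified reference-counting manager. `rc` is the per-object count;
# freeing an object recursively releases the references it holds.

class RCObj:
    def __init__(self):
        self.rc = 0          # number of direct pointer references
        self.children = []   # references this object holds

def inc_ref(obj):
    obj.rc += 1

def dec_ref(obj, freed):
    obj.rc -= 1
    if obj.rc == 0:
        freed.append(obj)            # reclaim this cell immediately
        for child in obj.children:   # its outgoing references die with it
            dec_ref(child, freed)

def set_child(parent, child):
    parent.children.append(child)
    inc_ref(child)                   # every stored pointer is counted
```

Note that the cost of a single dec_ref() is proportional to the sub-graph it releases, which is why deallocation pauses are usually, but not always, small.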


The inability to reclaim cyclic storage is the most significant disadvantage of reference counting. Cyclic data structures, such as doubly-linked lists and graphs containing cycles, cannot be reclaimed by a simple reference counting technique and will leak memory.
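The leak is easy to demonstrate with a self-contained toy counter (the `Node` layout and helpers here are illustrative): two objects that reference each other keep each other's count above zero even after the last external reference is dropped.

```python
# Self-contained demo of the cycle leak under naive reference counting.

class Node:
    def __init__(self):
        self.rc = 0
        self.refs = []

def add_ref(src, dst):
    src.refs.append(dst)
    dst.rc += 1

def drop(obj, freed):
    obj.rc -= 1
    if obj.rc == 0:
        freed.append(obj)
        for r in obj.refs:
            drop(r, freed)

a, b = Node(), Node()
a.rc += 1            # an external root holds a
add_ref(a, b)        # a -> b
add_ref(b, a)        # b -> a: the cycle
freed = []
drop(a, freed)       # the root releases a ...
# ... but a.rc is still 1 (held by b), so nothing is ever freed.
```

Both counts settle at 1 and the pair is unreachable yet unreclaimed, which is exactly why production systems pair reference counting with a cycle detector or a backup tracing collector.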

Advantages and Disadvantages of Reference Counting

  • Compared to tracing collectors, the memory management cost is dispersed across the application's execution, resulting in a significantly smoother and more responsive system. It is worth noting, however, that the cost of releasing a pointer is proportional to the size of the sub-graph referenced by that last pointer, which is not always trivial.
  • A reference counting system's spatial locality is usually no worse than that of the actual client program, and it's usually better than that of tracing GCs, which must trace across all living objects.
  • Unlike tracing collectors, which leave unreachable memory unreclaimed until the collector runs (usually on heap exhaustion), the reference counting technique allows freed memory to be reused right away. Immediate reuse gives caches better temporal locality, resulting in fewer page faults. It also makes resource cleanup easier, because finalizers can be called immediately, leading to faster release of system resources. Immediate reuse of space also enables optimizations such as in-place data-structure modifications.
  • In terms of technical detail, reference counting is the simplest garbage collection approach to implement. The implementation is especially straightforward when the language runtime does not permit raw pointer manipulation and the collector does not need to determine or manipulate the object roots.
  • A reference counting technique gives the programmer a degree of control over the allocation and deallocation of objects: a programmer may optimize away the reference counting cost in places judged safe. However, this creates a correctness risk and therefore demands a greater level of code discipline. Even in the absence of such optimizations, a client application is tightly coupled to the reference counting machinery: clients must correctly invoke the operations that increment and decrement the counts.
  • Each object carries the space overhead of the reference-count field, which can theoretically approach 50% for very small objects. This cost must be weighed against the fact that memory cells can be reused immediately and that reference counting does not consume extra heap space during collection. Instead of using a full word for the count, a reference counting system can save space by using a single byte; such systems rely on a fall-back tracing mechanism (such as mark-sweep) to collect objects whose counts have maxed out, as well as cyclic structures.
  • Unlike tracing techniques, where pointer updates are free, reference counting imposes a significant cost on every pointer update, since each one requires adjusting two reference counts (incrementing the new target's and decrementing the old target's) to keep the counts accurate.
  • As previously stated, reference counting's major flaw is its inability to reclaim cyclic storage without additional machinery such as a cycle detector or a backup tracing collector.
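The two-count pointer update from the list above can be sketched as a write routine every reference store must go through. The `Cell` layout and helper name are illustrative assumptions:

```python
# Sketch of a reference-counting pointer update: every store adjusts
# two counts (the new target goes up, the old target goes down).

class Cell:
    def __init__(self):
        self.rc = 0
        self.ref = None   # a single outgoing reference slot

def write_ref(holder, new_target, freed):
    if new_target is not None:
        new_target.rc += 1        # count the incoming reference first
    old = holder.ref
    holder.ref = new_target
    if old is not None:
        old.rc -= 1
        if old.rc == 0:
            freed.append(old)     # the old target just became garbage
```

Incrementing before decrementing matters: it keeps the counts correct even when the old and new targets are the same object.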


In this article, we have covered garbage collection in data structures and its importance in making data structures more memory-efficient. We also examined the two major garbage collection algorithms, mark-sweep and reference counting, how each of them works, and the principal advantages and tradeoffs of each.
