Saturday, November 29, 2008

Memory Management in PIRC

PIRC is a new implementation of the PIR language, and currently under heavy construction. While the code is documented, it does not provide an easy overview of implementation issues and design decisions. In order to solve this documentation gap, I'm writing these design decisions in a series of articles. This very short article discusses the memory management of PIRC. 

The current PIR compiler, IMCC, has had a lot of memory leaks, most of which have been solved by now. Although conceptually manual memory management through malloc() and free() is very simple and straightforward, things become ugly once you loose overview of when datastructures go out of scope, and as a result you forget to free the memory. In PIRC a different approach is taken.
Note that instead of using malloc() and free() directly, you should use mem_sys_allocate() and mem_sys_free(), respectively, which are provided by Parrot.

PIRC parses the PASM or PIR input, builds a data structure, and generates a Parrot Byte Code file (a so-called Packfile). (at the moment of writing, the PBC is not generated yet, but work is underway to fix this; might take a while, though). The data structures that are created represent the PASM or PIR program being parsed, and as such can be considered an Abstract Syntax Tree.
As these data structures are used throughout the compilation phase, PIRC can cheat with memory management. Instead of freeing the memory of the data structures by manually calling mem_sys_free(), PIRC keeps track of all allocated memory blocks. Only when PIRC is done with the bytecode generation will it release all resources. This is fine, because the data structures are needed up to the end anyway, so memory is not occupied longer than necessary.

Poor Man's Garbage Collector
Whenever PIRC needs memory, a block of memory is allocated through Parrot's memory allocation function, mem_sys_allocate() (or a variant, which zeroes out all bytes). Before returning a pointer to the allocated block of memory, however, PIRC stores a pointer to the block of memory as well (in a list), and only then is the pointer to the block returned.
When PIRC is done with compiling, it will go through the list of memory pointers, and release the memory pointed to by each of the pointers. In a sense, you could consider this a garbage collector, except that there's no reuse of memory (but there's no need to anyway).

"Wait a minute," you might say, where are these pointers stored then? Well, of course, these pointers are stored in an extensible list, as we don't know the number of pointers to store beforehand. Surely the list cannot be allocated and store a pointer to itself.  This is fine, because the problem of keeping track of memory is now isolated to a single point in the program. The nodes in the list of allocated memory pointers are allocated directly through Parrot's memory functions. When all stored pointers are mem_sys_free()d, we only have to remember to mem_sys_free() these nodes. 

In this way, there's no need to worry about when pointers should be freed: it's done automatically, as long as PIRC allocates its memory through its built-in memory management system.

List of Pointers
Obviously, we don't want to create a new node for each pointer to store. Instead, a node can store a number of pointers, currently set to 512. So, after allocating memory for 512 times, a new node is created. This number of 512 was decided upon after some experimentation, but might prove too low for real world PIR input. As it's #define'd, it's easy to change, though.

In this very short article, I described how PIRC does its memory management. PIRC cheats a bit, by storing each pointer to an allocated block of memory in a special list. Once PIRC is done with compiling, all these pointers are passed to mem_sys_free(), Parrot's free() function (one by one, obviously). This way, you don't have to worry about when to free the memory in PIRC.


Justin said...

That's one way to do it. Why not the APR way of things? Creating memory pools in which memory can only be allocated (not freed) and after each step is finished you simpy reclaim the entire pool (no fragmentation, no pointer tracking and they even use it to keep track of filedescriptors).

Another trick would be to simply allocate sizeof(void*) in each allocation and use that to build a single-linked-list (NULL marks the end). This keeps you from having to allocate additional pointer blocks and it's more CPU cache friendly (since you will only need to access the last memory allocation (probably cache-hot) and the new allocation (which will probably be needed soon).

But these are just simple alternatives and I haven't dug in to see whether there are other reasons for not using these schemes. Or having looked at the implementation at all for that matter, just my $0.02 ;-)

kjs said...

Justin, thanks for sharing your thoughts. I'm not sure what "APR" way of things is.. Creating memory pools and reclaiming the pool after each step: all allocated memory is needed till the end of the compilation, so there's no different phases or stages regarding memory requirements. Otherwise, yes, that'd be a nice approach. Your second option, allocating a sizeof (void*) in each allocation to build a single linked list: if I understand correctly, that does mean there's a node for each allocation, which is sizeof (void *) + sizeof (pointer to node), which means there's a 100% overhead on the "next" pointers. In the implemented solution, there's only one "next" pointer needed for each 512 allocations (and that number could be changed).

Justin said...

APR is the Apache Runtime library. It's centered around the idea that all allocations are done for some specific request/task and that most (or even all) of that memory will be needed for the duration of that request/task.

In the other case I meant that the additional pointer for the singly linked list is part of the allocation itself (not of a meta data-structure). So if you need to allocate 12 bytes you would allocate 12 bytes + sizeof(pointer) and return the pointer of that allocation + sizeof(pointer) bytes so that the person who requested that memory does not see this overhead.

Allocations using the APR trick work from allocation pools (each pool is allocated using malloc-like things but serves multiple memory allocation till it's full).
The other trick uses a malloc-like thing per allocation but uses pointer tricks to hide it's own meta-data.
I have no idea if this is allowed/acceptable in this context or that the allocations themselves must be pure malloc-like allocations.