Dynamic Memory Allocation Summary
Preface
This is the hard part of the engine, it took me a little bit longer to understand the detail, and make me about 1 week behind the class.
In the assignment PA2, it’s important to understand this. Here are my reading notes for Game Engine Architecture Page 205 – 215.
Why we need to do this?
C++ malloc(), free(), new, delete operator is slow, because they have to meet any different situation and need to context-switch from user mode to kernel mode.
Solution
Implement custom dynamic allocation for games. Pre-allocate the memory at the beginning of the game.
A rule in game development:
Keep heap allocations to a minimum, and never allocate from the heap within a tight loop.
Memory Allocator Implementation
Several ways to implement the custom memory allocators
Stack-Based Allocators
Allocate a large contiguous block by malloc(), new, or declaring a global array of bytes.
A pointer to top of the stack is maintained.
Add/Delete from top, memory cannot be freed in an arbitrary order.
Double-Ended Stack Allocators
A single memory can contain two stack allocators.
One from top, one from down.
More efficient.
Pool Allocators
Perfect for the allocation of lots of small blocks of memory which are the same size.
Pre-allocate a large block of memory.
Pre-allocation size should be exact multiple of the size of the elements that will be allocated.
Each element in within the pool is added to a linked list of free elements.
Whenever an allocation request is made, simply grab the next free element off the free list and return it.
When an element is freed, track back it onto the free list.
Single linked list pointer can stored in the free block.
Aligned Allocators
Every data and variable has an alignment requirement.
All memory allocator must be capable of returning aligned memory blocks.
Implement
Allocate a little bit more memory than was actually requested.
Adjust the address of the memory block upward slightly so that it is aligned properly.
Mostly, additional bytes allocated is equal to the alignment.
Determine the adjustment bytes. (By using a mask calculation)
Add adjustment bytes to the original address.
Need to store original address in the byte immediately preceding the adjusted address for free().
Adjustment will never be more than 256 so only 1 bytes will be used.
Return the adjusted address.
Alignment Graph
Code Example
// Aligned allocation function. IMPORTANT: 'alignment' // must be a power of 2 (typically 4 or 16 void* allocateAligned(U32 size_bytes, U32 alignment) { // Clients must call allocateUnaligned() and // freeUnaligned() if alignment == 1. ASSERT(alignment > 1); // Determine total amount of memory to allocate. U32 expandedSize_bytes = size_bytes + alignment; // Allocate an unaligned block & convert address to a // U32 U32 rawAddress = (U32)allocateUnaligned(expandedSize_bytes); // Calculate the adjustment by masking off the lower // bits of the address, to determine how "misaligned" // it is. U32 mask = (alignment - 1); U32 misalignment = (rawAddress & mask); U32 adjustment = alignment - misalignment; // calculate the adjusted address, and return as a // pointer. U32 alignedAddress = rawAddress + adjustment; // Store the adjustment in the four bytes immediately // preceding the adjusted address that we're // returning. U32* pAdjustment = (U32*)(alignedAddress - 4); *pAdjustment = adjustment; // adjustment will never be more than 256 // so only 1 bytes will be used return (void*)alignedAddress; } // corresponding freeAligned() void freeAligned(void* p) { U32 alignedAddress = (U32)p; U8* pAdjustment = (U8*)(alignedAddress - 4); U32 adjustment = (U32)*pAdjustment; U32 rawAddress = alignedAddress - adjustment; freeUnaligned((void*)rawAddress); }
Single-Frame and Double-Buffered Memory Allocators
For temporary data storage for games.
Single-Frame Allocators
A block of memory can be allocated and used the only in the current frame. Clear every frame.
Benefits
Allocated memory needn’t ever be freed
Negative
Requires a reasonable level of discipline on the part of the programmer.
Only valid on current frame.
Never cache a pointer to a single-frame memory block across the frame boundary.
Double-Buffered Allocators
A block of memory can be allocated on frame i to be used on frame (i+1);
Useful for caching the results of asynchronous processing on a multicore game console.
Jin Han
February 18, 2013