The Heap ======== Arrays and structs are stored in the heap. Variables only store a pointer to the actual data. Local variables and intermediate results are stored on the stack. Allocated variables are actually stored on a shadow stack. The first entry on the shadow stack is a struct variable which represents all global data. (global data must be mocked up like a struct data) All allocated blocks are accessible through the 'shadow' stack. (the shadow stack must be mocked up like an array during GC run) Allocated data may again contain (pointers to) allocated data. The heap is defined by 3 pointers: ptr heap_start; ptr heap_ptr; ptr heap_end; Data blocks are stored as follows: dw size; // data size without uint size itself db data[size]; // raw data 'size' contains flags in msbits: size.bit15 GC bit: normally cleared set or cleared by the recursive runs during GC indicate at some point that a block was found and is still in use size.bit14 deep bit: 1 => this block contains allocated items. size.bit13 if bit14 is set: 1 => struct; 0 => array. This imposes a maximum sizes on arrays: 0x3fff bytes for flat arrays and structs (bit14=0) 0x1fff bytes for arrays and structs with allocated items (bit14=1) Structs which contain data members of allocated data types start with an uint8 array which defines the offsets of these data members in the block data. This array is shared between all instances of the same type and must be located outside the heap, e.g. directly in the program code. dw size; // raw data size without uint size itself dw offsets; // this is only the address of the array! db data[size-2]; // struct data The Garbage Collection ====================== This is a 'Java'-style garbage collection: it finds all used blocks and compacts the heap by removing unused blocks and moving the used blocks down and updating the addresses in all variables. Limitations: No finalizer function Pointers to items inside allocated blocks can't be updated and are not allowed (while a GC can occur). You must use indexes to iterate over arrays. Flat arrays and structs with only flat items with const data can be used in ROM. Deep arrays or structs with allocated items can't be used in ROM. No blocks can be used directly in EEPROM. (unless special code added) Blocks accessed by interrupt service routines must be outside heap or in the lowest range of never moving blocks. then function Array.count() must handle bit15 which may be set. Method of work: Visit all variables and mark them as used. Allocated a temp buffer bu[]. Divide the used heap into 3 ranges for already processed, currently processed and not yet processed blocks. We'll start the current range at heap_start and move up to heap_end. Ideally the current range covers the whole heap from heap_start to heap_end, then we need only one run. For each current range: For each used block store block.size in bu[i] and store the address of bu[i] in block.size. Visit all variables and if they point into the current range then set their address to bu[i]. For each used block move down the block to fill the gap, restore block.size from bu[i] and set bu[i] to the new address. Visit all variables and if they point into bu[] then set their address to the address in bu[i]. The GC needs a temp. buffer bu[] to hold addresses or sizes. This can be statically allocated or reserved at the end of the heap or allocated on the stack. A larger buffer increases speed. ptr bu | dw size +----> dw data[size/2] The used heap is divided into 4 areas, delimited by 5 pointers. It starts with everything in range 4 and ends when everything is in range 1. ptr heap_start start of range1 of finished block ptr gc_a start of range2 which only contains free space ptr gc_p start of range3 which is currently compacted ptr gc_e start of range4 of unprocessed blocks ptr heap_ptr start of free space in heap The GC needs to access the shadow stack like an array, for simplicity. This is set up by the function which retrieves the shadow stack. ptr locals | dw size +----> dw lvars[size/2] The recursion function to process all used blocks takes an argument 'i'. this is used to prevent that long lists of nested or linked items result in deep recursion with an unpredictably high stack usage. the assumption is that linking in such a case is always done with an item at the same position. Then we'll effectively iterate and not recurse over this list. This ('i') is the index of the current block in it's parent struct (if any). If this block again is a struct with allocated items, then this index will be processed last without a recursive call, but by replacing the current block with the item at this index and restarting the function. Data in ROM: The recursion function toggles bit15 in block.size to prevent infinite recursion in circular lists. Flat arrays (and structs with only flat items) can be stored in ROM. Though the result will be, that recursive calls may try to set this bit multiple times. Deep arrays and structs (with allocated items) can NOT be stored in ROM. In recursion, when clearing bit15 indicates processed blocks, then the block will never be processed, because it looks like processed from the start. Arrays and Structs, whether flat or deep, can NOT be stored in EEPROMS, unless the code actively supports this and prevents writing to block.size.bit15! GC Run: ======= // variables are visited using a recursion helper function. // this is a template for the 'payload' function: bool fu(var,block) // return 1 to resume, 0 to exit { assert (block!=0) if (block.size.bit15) return 0 // already processed set block.size.bit15 // set marker // do something return 1 } // asm implementation: // this function set's bit15 in all used variables. // this is what is called first. // in: hl -> var // de -> block.data // out: ret: exit, already visited // else jp gc_recursion set_bit15: dec hl bit 7,(hl) ret nz ; already set set 7,(hl) jp gc_recursion // because we have 3 variable runs we need a 4th to reset bit15: clr_bit15: dec hl bit 7,(hl) ret z ; already cleared clr 7,(hl) jp gc_recursion // recursion helper: function visit_all_vars(hl=var,de=block,ix=fu,i) { L1: if(fu(var,block)==0) return; // already processed if (block.size.bit14==0) return // no allocated items // handle recursion: // // for all allocated items: // if item != null call self(var,block,fu,i) if (block.size.bit13) // struct { offsets = peekw(block+0); n = offsets.size a = offsets e = a + n while(a block 1 (finished) = empty gc_p = gc_a // => block 2 (free space) = empty gc_e = gc_a // => block 3 (in work) = empty bu = get_temp_buffer(); // decide where & how large, setup bu.size and return ptr to bu.data locals = get_shadow_stack(); // calculate and prefix stack with locals.size and return ptr to locals.data // find all allocated blocks: fu(var) {/*nop*/} visit_all_vars(locals, fu, 0, 1); •STATE: ranges: only range4 is used (initial state) variables: all variables point to their blocks blocks: all used blocks have bit15 set all unused blocks haven't bu[]: unused // grow range1 (finished): step over the first used blocks which will not move: while(gc_a < heap_ptr) { if(gc_a.size.bit15 == 0) break; } gc_e = gc_p = gc_a; LOOP: •STATE: ranges: range1 (finished) and range4 (unprocessed) used variables: all variables point to their blocks blocks: all used blocks have bit15 set all unused blocks haven't bu[]: unused // grow range3 (current work): // store block.size in bu[i] and the address of bu[i] in block.size bui = 0; // buffer index while (gc_e < heap_ptr) { if (gc_e.size.bit15) // used { if (bui == bu.count) break; bu[bui] = gc_e.size; gc_e.size = &bu[bui]; bui++; } gc_e += 2 + gc_e.size; // step over block } •STATE: ranges: range1 (finished), range2 (empty space), range3 (in work) range4 (unprocessed) variables: all variables point to their blocks blocks: range gc_p to gc_e: used blocks: block.size = &bu[i] bu[i] = block size, bit15 set unused blocks: bit15 cleared others: used blocks: bit15 set unused blocks: bit15 cleared bu[]: block sizes of used blocks in range gc_p to gc_e // in all variables which point to blocks in range gc_p to gc_e: // replace address with address of bu[i] // recursion will be controlled by bit15 which is cleared in this go. **DOIT()** fu(var) { block = peekw(var) if(block ≥ gc_p && block < gc_e) { pokew(var, peekw(block-2) + 2); } } visit_all_vars(locals, fu, 0, 0); •STATE: variables: range gc_p to gc_e: variables point to &bu[i+1] others: variables point to their blocks blocks: range gc_p to gc_e: used blocks: block.size = &bu[i] bu[i] = block size, bit15 cleared unused blocks: bit15 cleared others: used blocks: bit15 cleared unused blocks: bit15 cleared bu[]: block sizes of used blocks in range gc_p to gc_e // move blocks in range gc_p to gc_e down to gc_a // these can be found in bu[] // restore block.size from bu[i] // store new block address in bu[i] **DOIT()** **TODO** •STATE: ranges: gc_p to gc_e contains the free space variables: range gc_a to gc_p: variables point to &bu[i+1] others: variables point to their blocks blocks: used blocks: bit15 cleared unused blocks: bit15 cleared bu[]: pointers to block.data of used blocks in range gc_a to gc_p // update addresses for variables which point to bu[] // if variable points to bu[] then replace address with new address from bu[] // recursion will be controlled by bit15 which is set in this go. **DOIT()** fu(var) { block = peekw(var); if(block-2 ≥ bu+0 && block-2 < bu+bu.size ) { pokew(var,peekw(block-2)); } } visit_all_vars(locals, fu, 0, 1); •STATE: ranges: gc_p to gc_e contains the free space variables: variables point to their blocks blocks: used blocks: bit15 set unused blocks: bit15 cleared bu[]: void // cleanup // in all variables: clear bit15 in block.size // recursion is controlled by bit15, which is cleared (sic!) // update gc_a and gc_p // test whether we are done or loop gc_a = gc_p gc_p = gc_e if(gc_e < heap_ptr) goto LOOP memclear(gc_e to heap_ptr) note: if bu[] was allocated in heap, then clear gc_e to heap_end heap_ptr = gc_e // clear bit15 fu(var) {/*nop*/} visit_all_vars(locals, fu, 0, 0); return