-------------------------------------
  Z80 Return-oriented Virtual Machine 
  (c) 2016 - 2017 kio@little-bat.de
-------------------------------------

Register Usage:

	SP  = VM PC
	AF  = 
	BC  = VM variables stack (top down)
	DE  = TOP: top value of the vstack
	HL  = AUX: high word of a long TOP value, or 2nd vstack value if possible
	IX  =
	IY  =
	AF' =
	BC' = VM return stack (bottom up) 
	DE' =
	HL' =

Interrupt Handling:

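	SP is the VM PC and points into the virtual program. If interrupts are enabled
	during virtual code execution, then an accepted interrupt pushes the CPU return
	address below SP and overwrites at least one opcode of the virtual program.
	This must be handled: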

	V1: The VM runs with interrupts disabled and polls for interrupts in branching opcodes.
	    Interrupts must be pollable -> e.g. K1-Bus SRAM Board
	V2: The VM runs with interrupts enabled and restores the overwritten word from a copy.
	    The system must have enough RAM for a full copy -> ZX Spectrum 128k
	V3: The VM runs with interrupts enabled and recalculates the overwritten word from checksums.
	    Last resort: time consuming -> ZX Spectrum 48k
	V4: Not planned: the virtual code is in a ROM and writes go to a shadow RAM.
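
	The effect of an interrupt and the V2 repair can be modelled in a few lines.
	This is only a sketch in Python, not the VM's Z80 code: CODE_BASE, COPY_BASE
	and the function names are hypothetical, and it assumes that exactly one word
	(the CPU return address) is written below SP.

```python
# Model of variant V2: restore the word clobbered by an interrupt from a pristine copy.
# CODE_BASE, COPY_BASE and all function names are hypothetical.

CODE_BASE = 0x8000   # assumed start of the virtual program
COPY_BASE = 0xC000   # assumed start of the pristine copy
CODE_SIZE = 8

mem = bytearray(0x10000)
program = bytes([0x34, 0x12, 0x78, 0x56, 0xBC, 0x9A, 0xF0, 0xDE])
mem[CODE_BASE:CODE_BASE + CODE_SIZE] = program
mem[COPY_BASE:COPY_BASE + CODE_SIZE] = program

def accept_interrupt(sp, cpu_pc):
    """Z80 interrupt acknowledge: push the CPU PC, high byte first."""
    mem[(sp - 1) & 0xFFFF] = cpu_pc >> 8
    mem[(sp - 2) & 0xFFFF] = cpu_pc & 0xFF
    return (sp - 2) & 0xFFFF

def restore_word(sp_before_interrupt):
    """Copy the clobbered word (just below the old SP) back from the copy."""
    addr = sp_before_interrupt - 2
    offset = addr - CODE_BASE
    mem[addr:addr + 2] = mem[COPY_BASE + offset:COPY_BASE + offset + 2]

vm_pc = CODE_BASE + 4                 # two opcodes already fetched
sp = accept_interrupt(vm_pc, 0x1234)  # clobbers the opcode at CODE_BASE + 2
damaged = bytes(mem[CODE_BASE:CODE_BASE + CODE_SIZE])
restore_word(vm_pc)                   # repair step of variant V2
repaired = bytes(mem[CODE_BASE:CODE_BASE + CODE_SIZE])
```

	If the interrupt handler pushes further registers while SP still points into
	the virtual code, more than one word would have to be restored the same way.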

Restrictions for the Checksum Version (zx48k):

	The virtual program code must be WORD-ALIGNED.
	The virtual program code is NOT ROMABLE.
	All virtual program code must be kept in a SINGLE SEGMENT.
	All virtual code and data in this segment MUST NEVER BE MODIFIED.
	  Actually all memory pages used by virtual code must be const.
	  Therefore this segment should start on a 256-byte page boundary
	  and the remaining bytes in the last page cannot be used for variable data.
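
	The notes do not say how the checksums are organized. Purely as an assumption,
	one scheme that would fit the restrictions above is a 16-bit XOR checksum per
	256-byte page: a word-aligned word never straddles a page, and a page that
	never changes keeps its checksum valid. The sketch below (Python, hypothetical
	names) recomputes one clobbered word that way.

```python
# Hypothetical checksum scheme: one 16-bit XOR checksum per 256-byte page.
PAGE = 256

def words(page_bytes):
    """Split a page into little-endian 16-bit words."""
    return [page_bytes[i] | (page_bytes[i + 1] << 8)
            for i in range(0, len(page_bytes), 2)]

def page_checksum(page_bytes):
    """XOR of all words in the page, computed once while the code is intact."""
    acc = 0
    for w in words(page_bytes):
        acc ^= w
    return acc

def recover_word(page_bytes, offset, checksum):
    """Recompute the word at an even offset from the checksum and all other words."""
    acc = checksum
    for i, w in enumerate(words(page_bytes)):
        if i != offset // 2:
            acc ^= w
    return acc

page = bytearray((i * 7 + 3) & 0xFF for i in range(PAGE))  # stand-in for virtual code
checksum = page_checksum(page)
original = page[10] | (page[11] << 8)
page[10:12] = b"\x34\x12"                                  # interrupt clobbers one word
recovered = recover_word(page, 10, checksum)
```

	The loop over all other words of the page illustrates why this variant is
	time consuming compared to restoring from a copy.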


Storage types handled by the VM:

	immediate value
	local variable on vstack
	global variable
	data in array
	data in struct


Data types as seen by the VM:

VOID	void	value in register DE (TOP) may be void
	void	value in register HL (AUX) may be void

BYTE	8	when writing: byte
	u8	when reading: unsigned byte
	s8	when reading: signed byte

	when reading, bytes are extended to word size (u8 zero-extended, s8 sign-extended)	TODO: int = int8 ?
	local byte variables on vstack actually have word size,
	but the high byte of s8 variables may be void!
	when writing to local s8 variables the high byte IS NOT UPDATED!
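
	The byte rules can be illustrated with a small model. This is a sketch in
	Python with hypothetical helper names; it assumes 16-bit two's complement
	words and is not taken from the VM source.

```python
def read_u8(byte):
    """u8 read: zero-extend to a 16-bit word."""
    return byte & 0xFF

def read_s8(byte):
    """s8 read: sign-extend to a 16-bit word (two's complement)."""
    byte &= 0xFF
    return byte | 0xFF00 if byte & 0x80 else byte

def write_local_s8(slot, value):
    """Write to a local s8 variable: only the low byte of the word slot changes."""
    return (slot & 0xFF00) | (value & 0xFF)

slot = 0xAB00                      # word-sized local with a stale high byte
slot = write_local_s8(slot, 0xFE)  # store -2; the high byte stays 0xAB
value = read_s8(slot & 0xFF)       # reading as s8 uses the low byte only
```

	The stale high byte is why a plain word copy of such a variable (DUP, OVER)
	must not be trusted as a valid word.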

WORD	standard data type, min. stack size 				TODO: int = int8 ?

	<no suffix>	reading or writing: word
	u		unsigned word: some functions distinguish between signed and unsigned
	s		signed word:   some functions distinguish between signed and unsigned

  pointers:
	var¢	pointer to (handle to) struct data
	b[]¢	pointer to (handle to) byte array data
	w[]¢	word array 
	l[]¢	long array
	p8	pointer to raw byte
	p16	pointer to raw word
	p32	pointer to raw long

LONG (4 bytes)
	32	reading or writing
	u32	unsigned: some functions distinguish between signed and unsigned
	s32	signed:   some functions distinguish between signed and unsigned
	f32	float


Stacks:

Values on stacks are always 2 bytes. 					TODO: int = int8 ?

vstack:	This is a top-down stack, like the Z80 hardware stack, so it can be used as the real machine return stack in mcode.
	All data on the variables stack is 2*N bytes in size.
	Pushing and popping is expensive!

	TOP: DE may contain the top value of the vstack 
	     or may be void to reduce pushing and popping on the vstack.
	AUX: HL may contain the 2nd value or the high word of a long value of the vstack.

rstack:	This is a bottom-up stack. B'C' points behind the last entry. 
	The vstack and rstack typically share a common memory block.
	All data on the return stack is 2*N bytes in size.
	Pushing and popping is expensive!
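
	A minimal model of both stacks in one shared block, as a Python sketch. The
	block addresses and function names are hypothetical, and the little-endian
	byte order and the layout with both stacks growing towards each other are
	assumptions.

```python
# Model of the two VM stacks sharing one block; names and sizes are hypothetical.
BLOCK_START, BLOCK_END = 0x4000, 0x4010

mem = bytearray(0x10000)
bc = BLOCK_END        # BC : vstack pointer, grows downwards
bc_alt = BLOCK_START  # BC': rstack pointer, points behind the last entry

def vpush(word):
    global bc
    bc -= 2
    mem[bc] = word & 0xFF
    mem[bc + 1] = word >> 8

def vpop():
    global bc
    word = mem[bc] | (mem[bc + 1] << 8)
    bc += 2
    return word

def rpush(word):
    global bc_alt
    mem[bc_alt] = word & 0xFF
    mem[bc_alt + 1] = word >> 8
    bc_alt += 2

def rpop():
    global bc_alt
    bc_alt -= 2
    return mem[bc_alt] | (mem[bc_alt + 1] << 8)

def free_bytes():
    """In this layout the stacks collide when the gap reaches 0."""
    return bc - bc_alt

vpush(0x1111); vpush(0x2222)   # two values on the vstack
rpush(0x8004)                  # one return address on the rstack
```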

Opcodes: All opcodes are addresses (2 bytes). 
	If they have immediate arguments, these are always multiples of words.
	Fetching the next opcode (with 'ret') and fetching immediate data (with 'pop hl') is fast.
	To reduce expensive work with the vstack, typical opcode sequences with a balanced vstack 
	should be combined into combi opcodes.
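
	The dispatch described above can be modelled as follows. This is a Python
	sketch, not Z80 code: the routine addresses, op_halt and the exact stack
	effect of IVAL are assumptions; fetch_word stands in for both 'ret' and
	'pop hl'.

```python
# Model of the dispatch: SP walks the virtual program, each opcode is a routine address.
# Routine addresses and names are hypothetical.
mem = bytearray(0x10000)
state = {"sp": 0x8000, "de": None, "running": True}   # de = TOP, None = void

def fetch_word():
    """What 'ret' (next opcode) and 'pop hl' (inline data) do: read word at SP, SP += 2."""
    sp = state["sp"]
    state["sp"] = sp + 2
    return mem[sp] | (mem[sp + 1] << 8)

def op_ival():          # IVAL, assumed ( void -- n ): load inline word into TOP
    state["de"] = fetch_word()

def op_addi():          # ADDI ( n1 $w -- n ): add inline word to TOP
    state["de"] = (state["de"] + fetch_word()) & 0xFFFF

def op_halt():          # hypothetical opcode that ends the model run
    state["running"] = False

ROUTINES = {0x6000: op_ival, 0x6010: op_addi, 0x6020: op_halt}

program = [0x6000, 1000, 0x6010, 234, 0x6020]   # IVAL 1000, ADDI 234, halt
for i, w in enumerate(program):
    mem[0x8000 + 2 * i] = w & 0xFF
    mem[0x8000 + 2 * i + 1] = w >> 8

while state["running"]:
    ROUTINES[fetch_word()]()    # 'ret': jump to the address found at SP
```

	Neither opcode touches the vstack in memory: IVAL followed by ADDI keeps the
	value in TOP the whole time.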

Byte Variables:								TODO: int = int8 ?
	For simplicity, bytes are immediately extended to word size (signed or unsigned) when read.
	-> local byte variables (on the vstack) are actually 2 bytes.
	-> the high byte of a signed byte local variable is invalid. (be careful with DUP or OVER)

Procedures:
	Procedures are called like VM opcodes: the procedure is entered in native Z80 code!
	If the procedure has arguments then DE = TOP, else DE = void.
	The macro p_enter pushes the VM PC on the VM rstack and switches to VM code.
	The procedure must return with RETURN (or a variant).

	foo ( … n -- n ) 	in: DE = TOP,  out: DE = TOP
	foo ( -- n )		in: DE = void, out: DE = TOP	
	foo ( … n -- )		in: DE = TOP,  out: DE = void
	foo ( -- )		in: DE = void, out: DE = void
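
	Procedure entry and exit can be modelled like this. It is a Python sketch
	under assumptions: all addresses are hypothetical, INC and HALT are made-up
	opcodes for the demonstration, and memory is simplified to a dictionary of
	words.

```python
# Model of procedure entry and exit; all addresses and routine names are hypothetical.
mem = {}                      # simplified memory: address -> 16-bit word
rstack = []                   # VM return stack
vm = {"sp": 0, "de": None, "running": True}   # sp = VM PC, de = TOP (None = void)

def fetch_word():
    word = mem[vm["sp"]]
    vm["sp"] += 2
    return word

def p_enter(body):
    """Push the caller's VM PC on the rstack and continue with the virtual code at body."""
    rstack.append(vm["sp"])
    vm["sp"] = body

def op_return():              # RETURN: resume the caller
    vm["sp"] = rstack.pop()

def op_ival():                # IVAL: inline word -> TOP
    vm["de"] = fetch_word()

def op_inc():                 # hypothetical INC ( n -- n )
    vm["de"] = (vm["de"] + 1) & 0xFFFF

def op_halt():                # hypothetical opcode that ends the model run
    vm["running"] = False

def proc_foo():               # foo ( n -- n ): entered like an opcode, DE = TOP
    p_enter(0x9000)

ROUTINES = {0x6000: op_ival, 0x6010: op_inc, 0x6020: op_return,
            0x6030: op_halt, 0x7000: proc_foo}

def load(addr, words):
    for i, w in enumerate(words):
        mem[addr + 2 * i] = w

load(0x8000, [0x6000, 40, 0x7000, 0x6030])   # main: IVAL 40, foo, HALT
load(0x9000, [0x6010, 0x6020])               # foo:  INC, RETURN
vm["sp"] = 0x8000
while vm["running"]:
    ROUTINES[fetch_word()]()
```

	In the calling code the word 0x7000 looks exactly like an opcode; only the
	routine behind it differs.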


Runtime optimization:

opcode ( -- n )
	Opcodes without arguments should exist in one version with and one without a preceding PUSH_DE.
	As a rule, the code for PUSH_DE can simply be prepended to the opcode 'without'. (6 bytes)
	In addition, there should be a version that uses HL as SOT (2nd on top).

	Affected opcodes:
	LVAR, GVAR, IVAL and variants
	
		PUSH_LVAR ( -- n )		; push DE, then get LVAR in DE
		LVAR      ( void -- n )		; just overwrite DE with LVAR
		LVAR_hlde ( de:n1 -- hl:n1 de:n2 ) ; move TOP into SOT and load LVAR into TOP
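
	The difference between the three variants, as a Python sketch. The frame
	list standing in for the local variables, the index argument and reset()
	are hypothetical simplifications.

```python
# Model of the three LVAR variants; the frame layout and indices are hypothetical.
vstack = []            # memory part of the vstack, last element = top
frame = [111, 222]     # stand-in for local variables
regs = {"de": None, "hl": None}    # de = TOP, hl = SOT, None = void

def push_de():
    vstack.append(regs["de"])      # the expensive part: a real memory write

def LVAR(i):           # ( void -- n ): TOP is void, just overwrite DE
    regs["de"] = frame[i]

def PUSH_LVAR(i):      # ( -- n ): spill the live TOP first, then behave like LVAR
    push_de()
    LVAR(i)

def LVAR_hlde(i):      # ( de:n1 -- hl:n1 de:n2 ): keep old TOP in HL, no memory access
    regs["hl"] = regs["de"]
    regs["de"] = frame[i]

def reset():
    vstack.clear()
    regs["de"] = regs["hl"] = None

reset(); LVAR(0); PUSH_LVAR(1)
via_memory = (list(vstack), regs["de"])          # n1 was spilled to memory

reset(); LVAR(0); LVAR_hlde(1)
via_hl = (list(vstack), regs["hl"], regs["de"])  # n1 stays in HL
```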
	
opcode ( n -- n )
	are ideal. No stack access needed. No variants needed.
	They are the goal of the stack optimization!
	
opcode ( n1 n2 -- n )
	should exist in one version with n1 on the stack and one with n1 in HL.
	Where it makes sense, also with n2 as an inline argument or with an implicit n2.

	Affected opcodes:
	Operators etc.
	
		ADD      ( n1 n2 -- n )		; n1 on the stack
		ADD_hlde ( hl:n1 de:n2 -- n )	; n1 in HL
		ADDI     ( n1 $w -- n )		; n2 = inline argument
		ADD2     ( n1 -- n )		; n2 = implicit
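
	The four ADD variants in a small Python model. The names come from the
	list above; the 16-bit wraparound and the value 2 as the implicit operand
	of ADD2 are assumptions.

```python
MASK = 0xFFFF
vstack = []                          # memory part of the vstack, last element = top
regs = {"de": None, "hl": None}      # de = TOP, hl = SOT

def ADD():            # ( n1 n2 -- n ): n1 must be popped from memory
    regs["de"] = (vstack.pop() + regs["de"]) & MASK

def ADD_hlde():       # ( hl:n1 de:n2 -- n ): both operands already in registers
    regs["de"] = (regs["hl"] + regs["de"]) & MASK
    regs["hl"] = None

def ADDI(w):          # ( n1 $w -- n ): n2 is an inline word following the opcode
    regs["de"] = (regs["de"] + w) & MASK

def ADD2():           # ( n1 -- n ): n2 implicit, assumed here to be 2
    regs["de"] = (regs["de"] + 2) & MASK

vstack.append(30000)
regs["de"] = 40000
ADD()
r_add = regs["de"]    # sum wrapped to 16 bits

regs["hl"] = 5
regs["de"] = 7
ADD_hlde()
r_hlde = regs["de"]

ADDI(100)
r_addi = regs["de"]

ADD2()
r_add2 = regs["de"]
```

	Only ADD needs a memory access; the other three work on registers and
	inline data alone.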

opcode ( * -- void )
	Functions without a return value (rval) leave DE void.
	
	Affected opcodes:
	Assignment operators
	
call procedure:
	must be consistent for calling procedures (and opcodes!) via ProcPointer.
	Procedures and opcodes should not differ in how they are called.
	
	CALLPROCPTR currently does not distinguish by number of arguments.
	However, a further CALLPROCPTR for procedures without arguments should be introduced.

	Procedures with arguments are called with DE = TOP,
	because opcodes with the signature "( n -- n )" are called that way.

	Procedures without arguments are called with TOP = void.