#compress <segment_name>
#compress <segment_name1> to <segment_name2>
new in version 4.1.0
This defines that the named code segment or range of code segments shall be compressed using Einar Saukas' "optimal" LZ77 packer. The named segments must be defined before the #compress directive and they must be #code segments. Compression of #data segments makes no sense, as they are not stored in the output file. zasm currently only supports bottom-up compression, where the decompression increments like in 'ldir'. If multiple segments are compressed then all but the first must not have an explicit address set. They must follow each other without any address gap.
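As a minimal sketch of the range form, assuming two hypothetical segments LEVEL1 and LEVEL2 and a hypothetical start address $8000 (these names and values are for illustration only):

```asm
#code LEVEL1, $8000             ; first segment of the range: may have an explicit address
        defb    1,2,3,4         ; placeholder data
#code LEVEL2                    ; later segment: no explicit address, so it follows LEVEL1 without a gap
        defb    5,6,7,8         ; placeholder data
#compress LEVEL1 to LEVEL2      ; both segments are defined above, so the range can be compressed
```

Both segments are declared before the #compress directive, and only the first one carries an address, which is what the rules above ask for.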
Z80 decompressors are included in the zasm distribution or can be downloaded from Einar Saukas' Dropbox.
#compress will define some labels for you to use in your source:
SEGMENT_csize
SEGMENT_cgain
SEGMENT_cdelta
SEGMENT1_to_SEGMENT2_size
SEGMENT1_to_SEGMENT2_csize
SEGMENT1_to_SEGMENT2_cgain
SEGMENT1_to_SEGMENT2_cdelta
Note: zasm always defines the following labels for all segments:
SEGMENT
SEGMENT_size
SEGMENT_end
SEGMENT_size is the uncompressed size of the segment(s).
SEGMENT_csize is the compressed size of the segment(s) as stored in the output file.
SEGMENT_cgain is the difference between uncompressed and compressed size. If it's negative then compression actually increased the size of your data.
SEGMENT_cdelta: A common scenario is that the compressed data and the decompressed code segment overlap and the decompressed data overwrites the compressed data while it grows. Then the compressed data must be loaded high enough above the decompressed data so that no unprocessed bytes are overwritten. Label 'cdelta' defines the minimum difference between compressed data end and uncompressed data end.
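These labels can be used in ordinary expressions. A small sketch, assuming a #code segment named MCODE has been defined earlier in the source:

```asm
#compress MCODE                 ; MCODE must already be defined as a #code segment
#assert MCODE_cgain > 0         ; stop assembly if compression did not shrink the segment
```

This only guards against a block that does not compress at all. It does not account for the size of the decompressor routine, which you would have to subtract yourself if the total file size matters.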
Compression unavoidably leads to relocated code. So let's start with code relocation first.
Imagine you load code at a 'load_address' and move it down to 'code_address' before it is executed:
#code LOADER, load_address
loader: ld   hl, LOADER_end      ; address of #code MCODE after loading
        ld   de, code_address    ; to be moved here
        ld   bc, MCODE_size
        ldir
        jp   reset               ; = code_address
#code MCODE, code_address
reset:  di
        ld   sp,0
        ...
Note: you must be careful to get your label values right. This can be a real headache.
The whole code is loaded to (or initially located at) 'load_address'. Therefore we declare our LOADER segment to start with this physical address, so that all labels inside this segment are right. (Currently there's only one label, 'loader'.)
The MCODE segment is copied to address 'code_address' and executed there. Therefore we declare our MCODE segment to start with this physical address, so that all labels inside this segment are right.
It becomes obvious that source and destination block must not overlap and the loader code must not be overwritten as well, at least not the last 5 bytes: 'ldir' and 'jp reset'.
Since large code loaded from tape can frequently not be loaded high or low enough so that the relocated block does not overwrite the loader, we have to make one first improvement: If we move the block downwards then put the loader at the end, else use lddr instead of ldir. The lddr version is not discussed here as it is not yet supported by zxsp. So let's put the loader at the end:
#code MCODE, code_address
reset:  di
        ld   sp,0
        ...
#code LOADER, load_address + MCODE_size
loader: ld   hl, load_address    ; #code MCODE after loading
        ld   de, code_address    ; to be moved here
        ld   bc, MCODE_size
        ldir
        jp   reset               ; = code_address
Now let's add compression: This is very similar except that the code on the tape is compressed and therefore the block will grow during 'ldir'.
#code MCODE, code_address
reset:  di
        ld   sp,0
        ...
#code LOADER, load_address + MCODE_csize
loader: ld   hl, load_address    ; #code MCODE after loading
        ld   de, code_address    ; to be moved here
        ;ld  bc, MCODE_size      ; not required
        call decompress_zx7_standard ; replaces ldir
        jp   reset               ; = code_address
#include "decompress_zx7_standard.s" ; include the zx7 decompressor
#compress MCODE                  ; define MCODE to be compressed
#assert code_address+MCODE_size+MCODE_cdelta <= load_address+MCODE_csize
#assert MCODE_end+MCODE_cdelta <= LOADER
Mind the difference: the LOADER is no longer located at load_address+MCODE_size but now at load_address+MCODE_csize. Right after loading the bytes from load_address to load_address+MCODE_csize are compressed and not executable at all.
#compress MCODE defines that the code segment 'MCODE' shall be written compressed to the output file.
Finally there's a check to assert that the decompressed code will not overwrite the not yet processed compressed data. Both #asserts are identical in this example.
It's not obvious but zasm resolves a cyclic dependency here: As the LOADER address depends on the compressed MCODE size, all code labels in the LOADER segment are not valid until the compressed MCODE size is valid. But the compressed MCODE size cannot be valid unless all labels in the source are valid, because a not-yet valid label may be used in the MCODE segment and the compressed size may change when the label value is finally resolved. Therefore MCODE_csize never becomes valid.
zasm uses a 'preliminary' state here to solve this problem and therefore requires at least one more pass to assemble the source.
Sometimes 'preliminary' labels may toggle between each pass or infinitely grow and grow and grow, depending on your source. Then assembly fails. The way to solve this situation depends on the kind and location of the problem. Frequently the problem results from code alignments in your source. E.g. there is a '.align 2' which adds no space in one pass; then zasm adjusts the segment address by one and now the alignment inserts a byte, undoing zasm's adjustment. Similarly the compressed data size can toggle between two values when a label used in the uncompressed code depends on the compressed data size.
This can sometimes be solved by adding a dummy byte. The effect will not last forever: if the problem shows up again, remove the byte.
Normally the assembly success should be predictable. If the uncompressed code contains assembly-time dependent data, such as __date__ and __time__, then assembly may succeed or fail unpredictably: you assemble: it fails, you assemble again: it succeeds, you assemble again: it fails, and so on. Using the 1-byte trick may eliminate the problem for a while, but you may also just put this data in an uncompressed section instead.
Though the zx7 compression is pretty fast on small code blocks, it can become very slow on larger ones because it has quadratic time complexity. Additionally larger sources with lots of segments tend to require more passes for the assembler. If assembly time becomes too long then you can do two things:
• Split the compressed blocks. This will speed up assembly time but grow your program, both for additional calls to the decompressor and a not so good compression result.
• Store and compress data in separate files and #insert them. This is the traditional method anyway. This is good for static data with no references to other data but does not work if it contains references to non-const labels in the rest of the program.
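The second option could look roughly like this. It is only a sketch: the file name "level1.zx7", the labels level1_z and level_buffer, and the segment name LEVELS are hypothetical, and the file is assumed to have been packed beforehand with an external zx7 compressor. The decompressor entry point and its register use (hl = source, de = destination) are taken from the loader example above.

```asm
; somewhere in your program code:
        ld   hl, level1_z            ; source: pre-compressed data (hypothetical label)
        ld   de, level_buffer        ; destination: hypothetical work buffer
        call decompress_zx7_standard ; same decompressor as in the loader example

; in a segment of its own:
#code LEVELS                         ; hypothetical segment holding pre-compressed data
level1_z:
#insert "level1.zx7"                 ; hypothetical file, compressed outside of zasm
```

Because the file is already compressed, zasm only copies the bytes and no #compress directive is involved for this segment, so it adds nothing to the packing time per pass.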
Compression can be used with most file formats. There are certain restrictions on what can be compressed, e.g. the file headers of snapshot files cannot be compressed. While with TAP files the major benefit is the reduced loading time, the major benefit for snapshot files probably is that you can store more game data like levels.
Currently (version 4.1.0) Z80 files cannot be compressed. This will be implemented as soon as possible.