
Last regenerated: 2022-04-20 17:31:36 kio

zasm - Z80 Assembler – Version 4.4

Assembler directives


#compress

Defined Labels

Implementation

Resolving Cyclic Dependencies

Compression Time

Supported Target Files

#compress <segment_name>
#compress <segment_name1> to <segment_name2>

new in version 4.1.0

This defines that the named code segment or range of code segments shall be compressed using Einar Saukas' "optimal" LZ77 packer. The named segments must be defined before the #compress directive and they must be #code segments. Compression of #data segments makes no sense, as they are not stored in the output file. zasm currently only supports bottom-up compression, where the decompression increments like in 'ldir'. If multiple segments are compressed then all but the first must not have an explicit address set. They must follow each other without any address gap.
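As an illustration of the range form, here is a minimal sketch. The target type, the segment names LEVELS and SPRITES, the address and the data bytes are all assumptions made up for this example; only the #code and #compress syntax is taken from this page.

```asm
#target bin                     ; assumption: plain binary output file

data_address    equ 0x8000      ; hypothetical address

#code LEVELS, data_address      ; first segment of the range: may have an explicit address
        defb    1,2,3,4         ; placeholder data

#code SPRITES                   ; no explicit address: follows LEVELS without a gap
        defb    5,6,7,8         ; placeholder data

#compress LEVELS to SPRITES     ; both segments are packed as one block
```

With this range, the labels listed under "Defined Labels" would be named LEVELS_to_SPRITES_size, LEVELS_to_SPRITES_csize and so on.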

Z80 decompressors are included in the zasm distribution or can be downloaded from Einar Saukas' Dropbox.

Defined Labels

#compress will define some labels for you to use in your source:

#compress SEGMENT

SEGMENT_csize
SEGMENT_cgain
SEGMENT_cdelta

#compress SEGMENT1 to SEGMENT2

SEGMENT1_to_SEGMENT2_size
SEGMENT1_to_SEGMENT2_csize
SEGMENT1_to_SEGMENT2_cgain
SEGMENT1_to_SEGMENT2_cdelta

Note: zasm always defines the following labels for all segments:

SEGMENT
SEGMENT_size
SEGMENT_end

SEGMENT_size and SEGMENT1_to_SEGMENT2_size

This is the uncompressed size of the segment(s).

SEGMENT_csize and SEGMENT1_to_SEGMENT2_csize

This is the compressed size of the segment(s) as stored in the output file.

SEGMENT_cgain and SEGMENT1_to_SEGMENT2_cgain

This is the difference between uncompressed and compressed size. If it's negative then compression actually increased the size of your data.
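One possible use of the gain label is a build-time guard. The following is a sketch only: the target type, the segment name TABLE, the address and the fill data are assumptions chosen for this example.

```asm
#target bin                     ; assumption: plain binary output file

#code TABLE, 0x8000             ; hypothetical segment and address
        defs    256, 0          ; highly repetitive placeholder data

#compress TABLE
#assert TABLE_cgain > 0         ; stop the build if packing made TABLE larger
```

If the assertion fails, the segment is probably better stored uncompressed.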

SEGMENT_cdelta and SEGMENT1_to_SEGMENT2_cdelta

A common scenario is that the compressed data and decompressed code segment overlap and the decompressed data overwrites the compressed data while it grows. Then the compressed data must be loaded high enough above the decompressed data so that no unprocessed bytes are overwritten. Label 'cdelta' defines the minimum difference between compressed data end and uncompressed data end.

Implementation

Compression unavoidably leads to relocated code. So let's start with code relocation first.

Imagine you load code at a 'load_address' and move it down to 'code_address' before it is executed:

#code LOADER, load_address
loader: ld      hl, LOADER_end      ; address of #code MCODE after loading
        ld      de, code_address    ; to be moved here
        ld      bc, MCODE_size
        ldir
        jp      reset               ; = code_address

#code MCODE, code_address
reset:  di
        ld      sp,0
        ...

Note: you must be careful to get your label values right. This can be a real headache.

The whole code is loaded to (or initially located at) 'load_address'. Therefore we declare our LOADER segment to start with this physical address, so that all labels inside this segment are right. (currently there's only one label 'loader').

The MCODE segment is copied to address 'code_address' and executed there. Therefore we declare our MCODE segment to start with this physical address, so that all labels inside this segment are right.

It becomes obvious that source and destination block must not overlap and the loader code must not be overwritten as well, at least not the last 5 bytes: 'ldir' and 'jp reset'.

Since large code loaded from tape frequently cannot be loaded high or low enough for the relocated block not to overwrite the loader, we have to make a first improvement: If we move the block downwards, then put the loader at the end, else use lddr instead of ldir. The lddr version is not discussed here as it is not yet supported by zxsp. So let's put the loader at the end:

#code MCODE, code_address
reset:  di
        ld      sp,0
        ...

#code LOADER, load_address + MCODE_size
loader: ld      hl, load_address    ; #code MCODE after loading
        ld      de, code_address    ; to be moved here
        ld      bc, MCODE_size
        ldir
        jp      reset               ; = code_address

Now let's add compression: This is very similar except that the code on the tape is compressed and therefore the block will grow during 'ldir'.

#code MCODE, code_address
reset:  di
        ld      sp,0
        ...

#code LOADER, load_address + MCODE_csize
loader: ld      hl, load_address        ; #code MCODE after loading
        ld      de, code_address        ; to be moved here
        ;ld     bc, MCODE_size          ; not required
        call    decompress_zx7_standard ; instead of ldir
        jp      reset                   ; = code_address
#include "decompress_zx7_standard.s"    ; include the zx7 decompressor

#compress MCODE                         ; define MCODE to be compressed
#assert code_address+MCODE_size+MCODE_cdelta <= load_address+MCODE_csize
#assert MCODE_end+MCODE_cdelta <= LOADER

Mind the difference: the LOADER is no longer located at load_address+MCODE_size but now at load_address+MCODE_csize. Right after loading, the bytes from load_address to load_address+MCODE_csize are compressed and not executable at all.

#compress MCODE defines that the code segment 'MCODE' shall be written compressed to the output file.

Finally there's a check to assert that the decompressed code will not overwrite the not yet processed compressed data. Both #asserts are identical in this example.

Resolving Cyclic Dependencies

It's not obvious but zasm resolves a cyclic dependency here: As the LOADER address depends on the compressed MCODE size, all code labels in the LOADER segment are not valid until the compressed MCODE size is valid. But the compressed MCODE size cannot be valid unless all labels in the source are valid, because a not-yet valid label may be used in the MCODE segment and the compressed size may change when the label value is finally resolved. Therefore MCODE_csize never becomes valid.

zasm uses a 'preliminary' state here to solve this problem and therefore requires at least one more pass to assemble the source.

Sometimes 'preliminary' labels may toggle between each pass or infinitely grow and grow and grow, depending on your source. Then assembly fails. The way to solve this situation depends on the kind and location of the problem. Frequently the problem results from code alignments in your source. E.g. there is a '.align 2' and in one pass it adds no space, zasm adjusts the segment address by one and now it inserts a byte, annihilating zasm's effort. Similarly the compressed data size can toggle between two values when a label used in the uncompressed code depends on the compressed data size.

This can sometimes be solved by adding a dummy byte. The effect will not last forever; if it happens again, remove the byte.
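A sketch of how such a dummy byte could be made easy to switch on and off. This is a fragment based on the MCODE segment from the Implementation section, not a complete source. The label use_dummy_byte is a hypothetical name, and placing the byte at the end of the compressed segment is an assumption; the best place depends on where the toggling originates.

```asm
use_dummy_byte  equ 1           ; hypothetical switch: set to 0 if the sizes start toggling again

#code MCODE, code_address
reset:  di
        ld      sp,0
        ...
#if use_dummy_byte
        defb    0               ; dummy byte: changes the segment size by one
#endif
```

Keeping the byte behind a switch means it can be removed again without hunting for it in the source.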

date and time

Normally the assembly success should be predictable. If the uncompressed code contains assembly-time dependent data, such as __date__ and __time__, then assembly may succeed or fail unpredictably: you assemble: it fails, you assemble again: it succeeds, you assemble again: it fails, and so on. Using the 1-byte trick may eliminate the problem for a while, but you may also just put this data in an uncompressed section instead.

Compression Time

Though the zx7 compression is pretty fast on small code blocks, it can become very slow on larger ones because it has quadratic time complexity. Additionally, larger sources with lots of segments tend to require more passes for the assembler. If assembly time becomes too long then you can do two things:

• Split the compressed blocks. This will reduce assembly time but grow your program, both because of the additional calls to the decompressor and because of a less effective compression result.
• Store and compress data in separate files and #insert them. This is the traditional method anyway. This is good for static data with no references to other data but does not work if it contains references to non-const labels in the rest of the program.
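The second option could look roughly like the following fragment. The file name level1.zx7, the labels and the buffer address are hypothetical, and the sketch assumes that the file was packed beforehand with an external zx7 packer and that the decompressor from the Implementation example is included elsewhere in the source.

```asm
level_buffer    equ 0xC000              ; hypothetical destination address

unpack_level1:
        ld      hl, level1_packed       ; source: compressed data
        ld      de, level_buffer        ; destination: unpacked data
        call    decompress_zx7_standard
        ret

level1_packed:
#insert "level1.zx7"                    ; hypothetical pre-compressed file, stored as is
```

Since the file is inserted as raw bytes, zasm does not have to run the packer for it in any pass.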

Supported Target Files

Compression can be used with most file formats. There are certain restrictions on what can be compressed, e.g. the file headers of snapshot files cannot be compressed. While with TAP files the major benefit is the reduced loading time, the major benefit for snapshot files probably is that you can store more game data like levels.

Currently (version 4.1.0) Z80 files cannot be compressed. This will be implemented as soon as possible.
