****************** GENERAL INFORMATION ABOUT ZX-SPECTRUM ***************

This section is based on the text contributed by Gerton Lunter, author of the "Z80" Spectrum emulator. I allowed myself to make some changes which don't change the content.

1. Z80 CPU:

Most Z80 opcodes are one byte long, not counting a possible byte or word operand. The four opcodes CB, DD, ED and FD are shift opcodes: they change the meaning of the opcode following them.

a) CB opcodes:

There are 248 different CB opcodes. The block CB 30 to CB 37 is missing from the official list. These instructions, usually denoted by the mnemonic SLL, Shift Left Logical, shift the operand left and always set bit 0 to one. These instructions are quite commonly used. For example, Bounder and Enduro Racer use them.

b) DD and FD opcodes:

The DD and FD opcodes precede instructions using the IX and IY registers. If you look at the instructions carefully, you see how they work:

        2A nn           LD HL,(nn)
        DD 2A nn        LD IX,(nn)
        7E              LD A,(HL)
        DD 7E d         LD A,(IX+d)

A DD opcode simply changes the meaning of HL in the next instruction. If a memory byte is addressed indirectly via HL, as in the second example, a displacement byte is added. Otherwise the instruction simply acts on IX instead of HL (A notational awkwardness that will only bother assembler and disassembler writers: JP (HL) is not indirect; it should have been denoted by JP HL). If a DD opcode precedes an instruction that doesn't use the HL register pair at all, the instruction is executed as usual. However, if the instruction uses the H or L register, it will now use the high or low half of the IX register! Example:

        44              LD B,H
        FD 44           LD B,IYh

These types of unofficial instructions are used in very many programs. By the way, many DD or FD opcodes after each other will effectively be NOPs, each doing nothing except setting the flag "treat HL as IX" (or IY) again and taking up 4 T states (but try to let MONS disassemble such a block).
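The effect of a DD or FD prefix on the operands, as described in b), can be sketched in a few lines of Python. This is only an illustration working on mnemonic strings: the name apply_prefix and the string format are assumptions made for the example, and known exceptions on real hardware (EX DE,HL, for instance, is not affected by a prefix) are deliberately not modelled.

```python
# Sketch: how a DD/FD prefix re-maps HL, H and L in the following opcode.
# apply_prefix and the mnemonic strings are hypothetical, for illustration only.
# Exceptions such as EX DE,HL (which ignores the prefix) are NOT modelled.

INDEX_NAME = {0xDD: "IX", 0xFD: "IY"}


def apply_prefix(prefix, mnemonic):
    """Return the mnemonic executed when `mnemonic` follows a DD/FD prefix."""
    index = INDEX_NAME[prefix]
    op, _, operands = mnemonic.partition(" ")
    parts = operands.split(",") if operands else []
    if op != "JP" and "(HL)" in parts:
        # Memory is addressed via HL: a displacement byte is added.
        # On the real Z80 a plain H or L in the other operand is then
        # left untouched (DD 66 d is LD H,(IX+d)).
        parts = [f"({index}+d)" if p == "(HL)" else p for p in parts]
    else:
        # JP (HL) is not really indirect, so it gets no displacement.
        swap = {
            "HL": index,
            "(HL)": f"({index})",
            "H": index + "h",
            "L": index + "l",
        }
        parts = [swap.get(p, p) for p in parts]
    return op + (" " + ",".join(parts) if parts else "")


if __name__ == "__main__":
    for text in ("LD HL,(nn)", "LD A,(HL)", "LD B,H", "JP (HL)", "LD B,C"):
        print(f"{text:12} -> {apply_prefix(0xDD, text)}")
```

An instruction that does not mention HL, H or L at all comes back unchanged, which matches the remark that such a prefixed instruction is executed as usual.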

c) ED opcodes:

There are a number of unofficial ED instructions, but none of them are very useful. The ED opcodes in the range 00-3F and 80-FF (except for the block instructions of course) do nothing at all except take up 8 T states and increment the R register by 2. Most of the unlisted opcodes in the range 40-7F do have an effect, however. The complete list: (* = not official)

        ED40   IN B,(C)        ED60   IN H,(C)
        ED41   OUT (C),B       ED61   OUT (C),H
        ED42   SBC HL,BC       ED62   SBC HL,HL
        ED43   LD (nn),BC      ED63 * LD (nn),HL
        ED44   NEG             ED64 * NEG
        ED45   RETN            ED65 * RET
        ED46   IM 0            ED66 * IM 0
        ED47   LD I,A          ED67   RRD
        ED48   IN C,(C)        ED68   IN L,(C)
        ED49   OUT (C),C       ED69   OUT (C),L
        ED4A   ADC HL,BC       ED6A   ADC HL,HL
        ED4B   LD BC,(nn)      ED6B * LD HL,(nn)
        ED4C * NEG             ED6C * NEG
        ED4D   RETI            ED6D * RET
        ED4E * IM 0/1          ED6E * IM 0/1
        ED4F   LD R,A          ED6F   RLD

        ED50   IN D,(C)        ED70 * IN (C)
        ED51   OUT (C),D       ED71 * OUT (C),0
        ED52   SBC HL,DE       ED72   SBC HL,SP
        ED53   LD (nn),DE      ED73   LD (nn),SP
        ED54 * NEG             ED74 * NEG
        ED55 * RET             ED75 * RET
        ED56   IM 1            ED76 * IM 1
        ED57   LD A,I          ED77 * NOP
        ED58   IN E,(C)        ED78   IN A,(C)
        ED59   OUT (C),E       ED79   OUT (C),A
        ED5A   ADC HL,DE       ED7A   ADC HL,SP
        ED5B   LD DE,(nn)      ED7B   LD SP,(nn)
        ED5C * NEG             ED7C * NEG
        ED5D * RET             ED7D * RET
        ED5E   IM 2            ED7E * IM 2
        ED5F   LD A,R          ED7F * NOP

The ED70 instruction reads from port (C), just like the other IN instructions, but throws away the result. It does change the flags in the same way as the other IN instructions, however. The ED71 instruction OUTs a byte zero to port (C), interestingly. These instructions "should", by regularity of the instruction set, use (HL) as operand, but since from the processor's point of view accessing memory or accessing I/O devices is almost the same thing, and since the Z80 cannot access memory twice in one instruction (disregarding instruction fetch of course), it can't fetch or store the data byte (A hint in this direction is that, even though the NOP-synonyms LD B,B, LD C,C etcetera do exist, LD (HL),(HL) is absent and replaced by the HALT instruction).
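The list above is regular enough to be generated from the bits of the second opcode byte. The following Python sketch does just that for the range 40-7F. The function name decode_ed and the table names are made up for this example, and the mnemonics are simply those used in the list, so treat it as an illustration of the pattern rather than a full disassembler.

```python
# Sketch: rebuild the ED 40-7F mnemonics from the opcode bit fields.
# decode_ed, REG8, REG16, IM_MODE and MISC are hypothetical names.

REG8 = ["B", "C", "D", "E", "H", "L", None, "A"]   # None: the "(HL)" slot
REG16 = ["BC", "DE", "HL", "SP"]
IM_MODE = ["0", "0/1", "1", "2"]
MISC = ["LD I,A", "LD R,A", "LD A,I", "LD A,R", "RRD", "RLD", "NOP", "NOP"]


def decode_ed(opcode):
    """Return the mnemonic for ED <opcode>, with opcode in 0x40..0x7F."""
    if not 0x40 <= opcode <= 0x7F:
        raise ValueError("only the range 40-7F is covered by this sketch")
    y = (opcode >> 3) & 7        # bits 3-5: register / variant select
    z = opcode & 7               # bits 0-2: operation select
    pair = REG16[y >> 1]
    if z == 0:
        return "IN (C)" if REG8[y] is None else f"IN {REG8[y]},(C)"
    if z == 1:
        return "OUT (C),0" if REG8[y] is None else f"OUT (C),{REG8[y]}"
    if z == 2:
        return f"ADC HL,{pair}" if y & 1 else f"SBC HL,{pair}"
    if z == 3:
        return f"LD {pair},(nn)" if y & 1 else f"LD (nn),{pair}"
    if z == 4:
        return "NEG"
    if z == 5:
        return {0: "RETN", 1: "RETI"}.get(y, "RET")
    if z == 6:
        return "IM " + IM_MODE[y & 3]
    return MISC[y]


if __name__ == "__main__":
    for code in range(0x40, 0x80):
        print(f"ED{code:02X}  {decode_ed(code)}")
```

Seen this way, ED70 and ED71 are just the entries where the register field selects the slot that would otherwise mean (HL).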

The IM 0/1 instruction puts the processor in either IM 0 or IM 1; I couldn't figure out which on my own Spectrum.

d) About the R register:

This is not really an undocumented feature, although I have never seen any thorough description of it anywhere. The R register is a counter that is updated every instruction, where DD, FD, ED and CB are to be regarded as separate instructions. So a shifted instruction will increase R by two. There's an interesting exception: doubly-shifted opcodes, the DDCB and FDCB ones, increase R by two too. LDI increases R by two, LDIR increases it by 2 times BC, as does LDDR etcetera. The sequence LD R,A/LD A,R increases A by two, except for the highest bit: this bit of the R register is never changed. This is because in the old days everyone used 16 Kbit chips. Inside the chip the bits were grouped in a 128x128 matrix, needing a 7 bit refresh cycle. Therefore ZiLOG decided to count only the lowest 7 bits.

You can easily check that the R register is really crucial to memory refresh. Assemble this program:

        ORG 32768
        DI              ; no interrupts while refresh is suppressed
        LD B,0          ; outer counter for DJNZ (256 passes)
L1:     XOR A
        LD R,A          ; keep resetting the refresh counter to zero
        DEC HL          ; inner 16 bit counter
        LD A,H
        OR L            ; HL = 0 yet?
        JR NZ,L1
        DJNZ L1
        EI
        RET

It will take about three minutes to run. Look at the upper 32K of memory, for instance the UDG graphics. It will have faded. Only the first few bytes of each 256 byte block will still contain zeros, because they were refreshed during the execution of the loop. The ULA took care of the refreshing of the lower 16K (This example won't work on the emulator, of course!).

e) Undocumented flags:

This undocumented "feature" of the Z80 has its effect on programs like Sabre Wulf, Ghosts'n Goblins and Speedlock. Bits 3 and 5 of the F register are not used. They can contain information, as you can readily figure out by PUSHing AF onto the stack and then POPping it into another pair of registers. Furthermore, sometimes their values change.
I found the following empirical rule: The values of bits 7, 5 and 3 follow the values of the corresponding bits of the last 8 bit result of an instruction that changed the usual flags. For instance, after an ADD A,B those bits will be identical to the bits of the A register (Bit 7 of F is the sign flag, and fits the rule exactly). An exception is the CP x instruction (x=register, (HL) or direct argument). In this case the bits are copied from the argument. If the instruction is one that operates on a 16 bit word, the 8 bits of the rule are the highest 8 bits of the 16 bit result - that was to be expected since the S flag is extracted from bit 15.

Ghosts'n Goblins uses the undocumented flag due to a programming error. The rhino in Sabre Wulf walks backward or keeps running in little circles in a corner, if the (in this case undocumented) behaviour of the sign flag in the BIT instruction isn't right. I quote:

        AD86 DD CB 06 7E        BIT 7,(IX+6)
        AD8A F2 8F AD           JP P,#AD8F

An amazing piece of code! Speedlock does so many weird things that all must be exactly right for it to run. Finally, the '128 ROM uses the AF register to hold the return address of a subroutine for a while.

f) Interrupt flip-flops IFF1 and IFF2:

There seems to be a little confusion about these. These flip-flops are simultaneously set or reset by the EI and DI instructions. IFF1 determines whether interrupts are allowed, but its value cannot be read. The value of IFF2 is copied to the P/V flag by LD A,I and LD A,R. When an NMI occurs, IFF1 is reset, thereby disallowing further [maskable] interrupts, but IFF2 is left unchanged. This enables the NMI service routine to check whether the interrupted program had enabled or disabled maskable interrupts. So, Spectrum snapshot software can only read IFF2, but most emulators will emulate both, and then the one that matters most is IFF1.

2. ZX-Spectrum Hardware:

At the hardware level, the Spectrum is a very simple machine.
There's the 16K ROM which occupies the lowest part of the address space, and 48K of RAM which fills up the rest. An ULA which reads the lowest 6912 bytes of RAM to display the screen, and contains the logic for just one I/O port, completes the machine, from a software point of view at least.

Every even I/O address will address the ULA, but to avoid problems with other I/O devices only port FE should be used. If this port is written to, bits have the following meaning:

        Bit   7   6   5   4   3   2   1   0
            +-------------------------------+
            |   |   |   | E | M |   Border  |
            +-------------------------------+

The lowest three bits specify the border color; a zero in bit 3 activates the MIC output, and a one in bit 4 activates the EAR output (which sounds the internal speaker). The real Spectrum also activates the MIC when the ear is written to. The upper three bits are unused.

If port FE is read from, the highest eight address lines are important too. A zero on one of these lines selects a particular half-row of five keys:

        IN:    Reads keys (bit 0 to bit 4 inclusive)

        #FEFE  SHIFT, Z, X, C, V            #EFFE  0, 9, 8, 7, 6
        #FDFE  A, S, D, F, G                #DFFE  P, O, I, U, Y
        #FBFE  Q, W, E, R, T                #BFFE  ENTER, L, K, J, H
        #F7FE  1, 2, 3, 4, 5                #7FFE  SPACE, SYM SHFT, M, N, B

A zero in one of the five lowest bits means that the corresponding key is pressed. If more than one address line is made low, the result is the logical AND of all single inputs, so a zero in a bit means that at least one of the appropriate keys is pressed. For example, no key is pressed only if each of the five lowest bits of the result of reading from port 00FE (for instance by XOR A/IN A,(FE)) is one.

A final remark about the keyboard. It is connected in a matrix-like fashion, with 8 rows of 5 columns, as is obvious from the above remarks. Any two keys pressed simultaneously can be uniquely decoded by reading from the IN ports. However, if more than two keys are pressed, decoding may not be uniquely possible.
For instance, if you press Caps shift, B and V, the Spectrum will think the Space key is pressed as well, and react by giving the "Break into Program" report. Without this matrix behaviour Zynaps, for instance, won't pause when you press 5,6,7,8 and 0 simultaneously.

Bit 6 (value 64) of IN-port FE is the ear input bit. When the line is silent, its value is zero, except in the early Model 2 of the Spectrum, where it was one. When there is a signal, this bit toggles. The Spectrum loading software is not sensitive to the polarity of this bit (which it definitely should not be, not only because of this model difference, but also because you cannot be sure the tape recorder doesn't change the polarity of the signal recorded!). Some old programs rely on the fact that bit 6 is always one (for instance Spinads). Bits 5 and 7 are always one.

The ULA with the lower 16K of RAM, and the processor with the upper 32K RAM and 16K ROM, are working independently of each other. The data and address buses of the Z80 and the ULA are connected by small resistors; normally, these do effectively decouple the buses. However, if the Z80 wants to read or write the lower 16K, the ULA halts the processor if it is busy reading, and after it's finished lets the processor access lower memory through the resistors. A very fast, cheap and neat design indeed!

If you run a program in the lower 16K of RAM, or read or write in that memory, the processor is halted sometimes. This part of memory is therefore somewhat slower than the upper 32K block. This is also the reason that you cannot write a sound- or save-routine in lower memory; the timing won't be exact, and the music will sound harsh. Also, INning from port FE will halt the processor, because the ULA has to supply the result. Therefore, INning from port FE is a tiny bit slower on average than INning from other ports; whilst normally an IN A,(nn) instruction would take 11 T states, it takes 12.15 T states on average if nn=FE.
See below for more exact information.

If the processor reads from a non-existing IN port, for instance FF, the ULA won't stop, but nothing will put anything on the data bus. Therefore, you'll read a mixture of FF's (idle bus), and screen and ATTR data bytes (the latter being very scarce, by the way). This will only happen when the ULA is reading the screen memory, about 60% of the 1/50th second time slice in which a frame is generated. The other 40% the ULA is building the border or generating a vertical retrace. This behaviour is actually used in some programs, for instance, in Arkanoid.

Finally, there is an interesting bug in the ULA which also has to do with this split bus. After each instruction fetch cycle of the processor, the processor puts the I-R register "pair" (not the 8 bit internal Instruction Register, but the Interrupt and R registers) on the address bus. The lowest 7 bits, the R register, are used for memory refresh. However, the ULA gets confused if I is in the range 64-127, because it thinks the processor wants to read from lower 16K ram very, very often. The ULA can't cope with this read-frequency, and regularly misses a screen byte. Instead of the actual byte, the byte previously read is used to build up the video signal. The screen seems to be filled with 'snow'; however, the Spectrum won't crash, and the program will continue to run normally. There's one program I know of that uses this to generate a nice effect: Vectron (which has very nice music too, by the way).

The processor has three interrupt modes, selected by the instructions IM 0, IM 1 and IM 2. In mode 1, the processor simply executes an RST #38 instruction if an interrupt is requested. This is the mode the Spectrum is normally in. The other mode that is commonly used is IM 2. If an interrupt is requested, the processor first builds a 16 bit address by combining the I register (as the high byte) with whatever the interrupting device places on the data bus. The 16 bit word stored at this address is then taken as the address of the subroutine, which is called.
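As a rough illustration of how the IM 2 address is formed, here is a minimal Python sketch. The function name im2_handler_address, the flat 64K bytearray and the example values are assumptions made for this example, not part of any emulator. It relies on the documented Z80 behaviour that the address built from I and the bus byte is used to fetch a two byte vector, low byte first, and that this vector is the routine that gets called.

```python
# Sketch: IM 2 vector lookup on a flat 64K memory image.
# im2_handler_address and the example values are hypothetical.

def im2_handler_address(i_register, bus_byte, memory):
    """Return (pointer, handler) for an IM 2 interrupt.

    pointer is I*256 + bus_byte; handler is the little-endian 16 bit
    word stored at pointer and pointer+1 (wrapping at 64K).
    """
    pointer = ((i_register & 0xFF) << 8) | (bus_byte & 0xFF)
    low = memory[pointer]
    high = memory[(pointer + 1) & 0xFFFF]
    return pointer, (high << 8) | low


if __name__ == "__main__":
    memory = bytearray(65536)
    memory[0xFEFF] = 0x34          # low byte of the vector
    memory[0xFF00] = 0x12          # high byte of the vector
    pointer, handler = im2_handler_address(0xFE, 0xFF, memory)
    print(f"pointer {pointer:04X}, handler {handler:04X}")
```

Note that nothing in the lookup requires the low byte of the pointer to be even; an odd bus byte simply makes the vector straddle two addresses.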
Rodnay Zaks in his book "Programming the Z80" states that only even bytes are allowed as low index byte, but that isn't true. The normal Spectrum contains no hardware to place a byte on the bus, and the bus will therefore always read FF (because the ULA also doesn't read the screen if it generates an interrupt), so the resulting index address is 256*I+255. However, some not-so-neat hardware devices put things on the data bus when they shouldn't, so later programs didn't assume the low index byte was FF. These programs contain a 257 byte table of equal bytes starting at 256*I, and the interrupt routine is placed at an address that is a multiple of 257. A useful but seldom used trick is to make the table contain FF's (or use the ROM for this) and put a byte 18 hex, the opcode for JR, at FFFF. The first byte of the ROM is a DI, F3 hex, so the JR will jump to FFF4, where a long JP to the actual interrupt routine is put.

In interrupt mode 0, the processor executes the instruction that the interrupting device places on the data bus. On a standard Spectrum this will be the byte FF, coincidentally (...) the opcode for RST #38. But for the same reasons as above, this is not really reliable.

The 50 Hz interrupt is synchronized with the video signal generation by the ULA; both the interrupt and the video signal are generated by it. Many programs use the interrupt to synchronize with the frame cycle. Some use it to generate fantastic effects, such as full-screen characters, full-screen horizon (Aquaplane) or pixel colour (Uridium for instance). Very many modern programs use the fact that the screen is "written" (or "fired") to the CRT in a finite time to do as much time-consuming screen calculations as possible without causing character flickering: although the ULA has started displaying the screen for this frame already, the electron beam will for a moment not "pass" this or that part of the screen so it's safe to change something there.
So the exact time in the 1/50 second time-slice at which the screen is updated is very important.

Each line takes exactly 224 T states. After an interrupt occurs, 64 line times pass before the byte 16384 is displayed. At least the last 48 of these are actual border-lines. I could not determine whether my monitor didn't display the others or whether it was in vertical retrace, but luckily that's not really important. Then the 192 screen+border lines are displayed, followed by about 56 border lines again. 56.5 border lines would make up exactly 70000 T states, 1/50th of 3500000. However, I noticed that the frequency of the 50 Hz interrupt (measured in 1/T states!) changes very slightly when my Spectrum gets hot (I think it has something to do with the relative change of the frequencies of the two crystals in the Spectrum), so the time between interrupts will probably not be exactly 70000 T states. Anyway, whether the final border block is of fixed or variable length doesn't concern us either; the timings of the start and end of the screen, which are the timings of real interest, are fixed.

Now for the timings of each line itself. I define a screen line to start with 256 screen pixels, then border, then horizontal retrace, and then border again. All this takes 224 T states. Every half T state a pixel is written to the CRT, so if the ULA is reading bytes it does so each 4 T states (and then it reads two: a screen and an ATTR byte). The border is 48 pixels wide at each side. A video screen line is therefore timed as follows: 128 T states of screen, 24 T states of right border, 48 T states of horizontal retrace and 24 T states of left border.

When an interrupt occurs, the running instruction has to be completed first. So the start of the interrupt is fixed relative to the start of the frame up to the length of the last instruction in T states.
If the processor was executing a HALT (which, according to the Z80 books I read, is effectively many NOPs), the interrupt routine starts at most 3 T states away from the start of the frame. Of course the processor also needs some T states to store the program counter on the stack, read the interrupt vector and jump to the routine, but since I cannot determine that by only using the Spectrum, it is useless information by that very reason alone!

Now when to OUT to the border to change it at the place you want? First of all, you cannot change the border within a "byte", an 8-pixel chunk. If we forget about the screen for a moment, if you OUT to port FE after 14326 to 14329 T states (including the OUT) from the start of the IM 2 interrupt routine, the border will change at exactly the position of byte 16384 of the screen. The other positions can be computed by remembering that 8 pixels take 4 T states, and a line takes 224 T states. You would think that OUTing after 14322 to 14325 T states, the border would change at 8 pixels left of the upper left corner of the screen. This is right for 14322, 14323 and 14324 T states, but if you wait 14325 T states the ULA happens to be reading byte 16384 (or 22528, or both) and will halt the processor for a while, thereby making you miss the 8 pixels. This exception happens again after 224 T states, and again after 448, and so forth. These 192 exceptions left of the actual screen rectangle are the only ones; similar things don't happen at the right edge because the ULA doesn't need to read things there - it has just finished!

As noted above, reading or writing in low ram (or OUTing to the ULA) causes the ULA to halt the processor. When and how much? The processor is halted each time you want to access the ULA or low memory and the ULA is busy reading. Of the 312.5 'lines' the ULA generates, only 192 contain actual screen pixels, and the ULA will only read bytes during 128 of the 224 T states of each screen line.
But if it does, the processor is halted for exactly 4 T states.

3. Interface I:

The Interface I is quite complicated. It uses three different I/O ports, and contains logic to page and unpage an 8K ROM if new commands are used. The ROM is paged if the processor executes the instruction at ROM address 0008 or 1708 hexadecimal, the error and close# routines. It is inactivated when the Z80 executes the RET at address 0700.

a) Port E7:

I/O port E7 is used to send or receive data to and from the microdrive. Accessing this port will halt the Z80 until the Interface I has collected 8 bits from the microdrive head; therefore, if the microdrive motor isn't running, or there is no formatted cartridge in the microdrive, the Spectrum hangs. This is the famous 'IN 0 crash'.

b) Port EF:

        Bit    7   6    5    4    3    2    1     0
             +---------------------------------------+
         READ|   |   |    |busy| dtr |gap| sync|write|
             |   |   |    |    |     |   |     |prot.|
             |---+---+----+----+-----+---+-----+-----|
        WRITE|   |   |wait| cts|erase|r/w|comms|comms|
             |   |   |    |    |     |   | clk | data|
             +---------------------------------------+

Bits DTR and CTS are used by the RS232 interface. The WAIT bit is used by the Network to synchronise. GAP, SYNC, WR_PROT, ERASE, R/_W, COMMS CLK and COMMS DATA are used by the microdrive system. If the microdrive is not being used, the COMMS DATA output selects the function of bit 0 of out-port F7:

        Bit      7    6   5   4   3   2   1       0
             +------------------------------------------+
         READ|txdata|   |   |   |   |   |   |    net    |
             |      |   |   |   |   |   |   |   input   |
             |------+---+---+---+---+---+---+-----------|
        WRITE|      |   |   |   |   |   |   |net output/|
             |      |   |   |   |   |   |   |   rxdata  |
             +------------------------------------------+

TXDATA and RXDATA are the input and output of the RS232 port. COMMS DATA determines whether bit 0 of F7 is output for the RS232 or the network.
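
As a small illustration of the READ layout of port EF given in b), the following Python sketch splits a byte read from that port into its named bits. The names PORT_EF_READ_BITS and decode_port_ef are invented for this example. It returns the raw bit levels only, since the text above does not say which level means "active" for each signal.

```python
# Sketch: name the bits of a value read from Interface I port EF.
# PORT_EF_READ_BITS and decode_port_ef are hypothetical names; the bit
# positions are taken from the READ row of the port EF table.

PORT_EF_READ_BITS = {
    0: "write protect",
    1: "sync",
    2: "gap",
    3: "dtr",
    4: "busy",
}


def decode_port_ef(value):
    """Return a dict mapping each signal name to its raw bit (0 or 1)."""
    return {name: (value >> bit) & 1 for bit, name in PORT_EF_READ_BITS.items()}


if __name__ == "__main__":
    for name, level in decode_port_ef(0x15).items():
        print(f"{name:14} {level}")
```

Bits 5 to 7 are left out because the table shows no READ function for them.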