5.1 The Spectrum

The Spectrum is at the hardware level a very simple machine. There's the 16K ROM, which occupies the lowest part of the address space, and 48K of RAM, which fills up the rest. A ULA, which reads the lowest 6912 bytes of RAM to display the screen and contains the logic for just one I/O port, completes the machine, from a software point of view at least.

Every even I/O address will address the ULA, but to avoid problems with other I/O devices only port FE should be used. If this port is written to, the bits have the following meaning:

    Bit   7   6   5   4   3   2   1   0
        +---+---+---+---+---+---+---+---+
        |   |   |   | E | M |   Border  |
        +---+---+---+---+---+---+---+---+

The lowest three bits specify the border colour; a zero in bit 3 activates the MIC output, and a one in bit 4 activates the EAR output (which sounds the internal speaker). The real Spectrum also activates the MIC when the EAR is written to; the emulator doesn't. This is no problem; MIC is only used for saving, and when saving the Spectrum never sounds the internal speaker. The upper three bits are unused.

If port FE is read from, the highest eight address lines are important too. A zero on one of these lines selects a particular half-row of five keys:

    IN:    Reads keys (bit 0 to bit 4 inclusive, in that order)

    #FEFE  SHIFT, Z, X, C, V          #EFFE  0, 9, 8, 7, 6
    #FDFE  A, S, D, F, G              #DFFE  P, O, I, U, Y
    #FBFE  Q, W, E, R, T              #BFFE  ENTER, L, K, J, H
    #F7FE  1, 2, 3, 4, 5              #7FFE  SPACE, SYM SHFT, M, N, B

A zero in one of the five lowest bits means that the corresponding key is being pressed. If more than one address line is made low, the result is the logical AND of all single inputs, so a zero in a bit means that at least one of the corresponding keys is pressed. For example, no key is pressed only if each of the five lowest bits of the result of reading port 00FE (for instance by XOR A/IN A,(FE)) is one.

A final remark about the keyboard.
It is connected in a matrix-like fashion, with 8 rows of 5 columns, as is obvious from the above remarks. Any two keys pressed simultaneously can be uniquely decoded by reading from the IN ports; however, if more than two keys are pressed, decoding may not be uniquely possible. For instance, if you press Caps Shift, B and V, the Spectrum will think the Space key is pressed as well (B shares a half-row with Space, and V shares one with Caps Shift), and react by giving the 'Break into Program' report. This matrix behaviour is also emulated - without it, Zynaps for instance won't pause when you press 5, 6, 7, 8 and 0 simultaneously.

Bit 6 (value 64) of IN-port FE is the ear input bit. When the line is silent, its value is zero, except in the early Model 2 of the Spectrum, where it was one. When there is a signal, this bit toggles. The Spectrum loading software is not sensitive to the polarity of this bit (which it definitely should not be, not only because of this model difference, but also because you cannot be sure the tape recorder doesn't change the polarity of the signal recorded!). Some old programs rely on the fact that bit 6 is always one (for instance Spinads); for these programs the emulator can mimic a Model 2 Spectrum. Bits 5 and 7 are always one.

The ULA with the lower 16K of RAM, and the processor with the upper 32K of RAM and the 16K ROM, work independently of each other. The data and address buses of the Z80 and the ULA are connected by small resistors; normally, these effectively decouple the buses. However, if the Z80 wants to read or write the lower 16K, the ULA halts the processor if it is busy reading, and after it has finished it lets the processor access lower memory through the resistors. A very fast, cheap and neat design indeed!

If you run a program in the lower 16K of RAM, or read or write in that memory, the processor is halted sometimes. This part of memory is therefore somewhat slower than the upper 32K block.
This is also the reason that you cannot write a sound or save routine in lower memory; the timing won't be exact, and the music will sound harsh. Also, INning from port FE will halt the processor, because the ULA has to supply the result. Therefore, INning from port FE is a tiny bit slower on average than INning from other ports; whilst normally an IN A,(nn) instruction would take 11 T states, it takes 12.15 T states on average if nn=FE. See below for more exact information.

If the processor reads from a non-existing IN port, for instance FF, the ULA won't stop, but nothing will put anything on the data bus. Therefore, you'll read a mixture of FF's (idle bus), and screen and ATTR data bytes (the latter being very scarce, by the way). This will only happen when the ULA is reading the screen memory, which is 61.5% (192/312) of the 1/50th second time slice in which a frame is generated. The other 38.5% of the time the ULA is building the border or generating a vertical retrace. This behaviour is actually used in some programs, for instance by Arkanoid, and Z80 also emulates this.

Finally, there is an interesting bug in the ULA which also has to do with this split bus. After each instruction fetch cycle of the processor, the processor puts the I-R register 'pair' (not the 8 bit internal Instruction Register, but the Interrupt and R registers) on the address bus. The lowest 7 bits, the R register, are used for memory refresh. However, the ULA gets confused if I is in the range 64-127, because it thinks the processor wants to read from the lower 16K of RAM very, very often. The ULA can't cope with this read frequency, and regularly misses a screen byte. Instead of the actual byte, the byte previously read is used to build up the video signal. The screen seems to be filled with 'snow'; however, the Spectrum won't crash, and the program will continue to run normally. There's one program I know of that uses this to generate a nice effect: Vectron (which has very nice music too, by the way).
This effect has not been implemented however - it's a bit useless (but maybe I'll include it in the future).

The processor has three interrupt modes, selected by the instructions IM 0, IM 1 and IM 2. In mode 1, the processor simply executes a RST #38 instruction if an interrupt is requested. This is the mode the Spectrum is normally in.

The other mode that is commonly used is IM 2. If an interrupt is requested, the processor first builds a 16 bit address by combining the I register (as the high byte) with whatever the interrupting device places on the data bus. The processor then fetches the 16 bit address stored at this interrupt table entry, and finally CALLs the subroutine at that address. Rodnay Zaks in his book 'Programming the Z80' states that only even bytes are allowed as low index byte, but that isn't true. The normal Spectrum contains no hardware to place a byte on the bus, and the bus will therefore always read FF (because the ULA also doesn't read the screen when it generates an interrupt), so the resulting index address is 256*I+0FF. However, some not-so-neat hardware devices put things on the data bus when they shouldn't, so later programs didn't assume the low index byte was 0FF. These programs contain a 257 byte table of equal bytes starting at 256*I, and the interrupt routine is placed at an address that is a multiple of 257. A useful but not much used trick is to make the table contain FF's (or use the ROM for this) and put a byte 18 hex, the opcode for JR, at FFFF. The first byte of the ROM is a DI, F3 hex, so the JR will jump to FFF4, where a long JP to the actual interrupt routine is put.

In interrupt mode 0, the processor executes the instruction that the interrupting device places on the data bus. On a standard Spectrum this will be the byte FF, coincidentally (...) the opcode for RST #38. But for the same reasons as above, this is not really reliable.
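
The IM 2 address arithmetic described above can be checked with a few lines of Python. This is only a sketch, not how the emulator does it: the function names, the choice of I = FE and the fill byte FD are my own assumptions for illustration, and memory is modelled as a flat 64K byte array.

```python
def im2_vector_address(i_reg, bus_byte=0xFF):
    """Table entry read in IM 2: I is the high byte, the bus byte the low byte."""
    return ((i_reg & 0xFF) << 8) | (bus_byte & 0xFF)


def im2_handler_address(memory, i_reg, bus_byte=0xFF):
    """Little-endian 16 bit address stored at the table entry."""
    entry = im2_vector_address(i_reg, bus_byte)
    return memory[entry] | (memory[(entry + 1) & 0xFFFF] << 8)


def fill_vector_table(memory, i_reg, fill_byte):
    """Write the 257 equal bytes starting at 256*I."""
    base = (i_reg & 0xFF) << 8
    for offset in range(257):
        memory[(base + offset) & 0xFFFF] = fill_byte


def jr_target(pc, displacement):
    """Destination of a JR located at pc, given its displacement byte."""
    signed = displacement - 256 if displacement >= 128 else displacement
    return (pc + 2 + signed) & 0xFFFF


memory = bytearray(0x10000)
fill_vector_table(memory, 0xFE, 0xFD)   # hypothetical choice: I = FE, fill byte FD

# Whatever byte is on the bus, the handler address comes out the same.
handlers = {im2_handler_address(memory, 0xFE, b) for b in range(256)}
print([hex(h) for h in handlers])        # a single address, 257 * 0xFD

# The JR at FFFF takes the first ROM byte (F3) as its displacement.
print(hex(jr_target(0xFFFF, 0xF3)))
```

The table needs 257 bytes rather than 256 because a bus byte of FF makes the processor read the high byte of the address from the first location of the next page.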
The 50 Hz interrupt is synchronized with the video signal generation by the ULA; both the interrupt and the video signal are generated by it. Many programs use the interrupt to synchronize with the frame cycle. Some use it to generate fantastic effects, such as full-screen characters, full-screen horizon (Aquaplane) or pixel colour (Uridium for instance). Many modern programs use the fact that the screen is 'written' (or 'fired') to the CRT in a finite time to do as many time-consuming screen calculations as possible without causing character flickering: although the ULA has already started displaying the screen for this frame, the electron beam will not 'pass' this-or-that part of the screen for a moment, so it's safe to change something there. So the exact time in the 1/50 second time-slice at which the screen is updated is very important.

Normally the emulator updates the entire screen at once (50 times a second), and no best solution can be given as to when exactly the screen should be updated. The user can select one of three possibilities (low, normal and high video synchronisation, corresponding to a screen update 1/200, 2/200 or 3/200 of a (relative) second after a Z80 interrupt) to try to get the best results. Try for instance Zynaps; with normal video synchronisation the top four or five lines of the background move out-of-phase with the rest, and your space-ship flickers in that region. With low video synchronisation the background moves smoothly but the sprites flicker in all parts of the screen. Only with high video sync does everything move smoothly without flickering.

In Hi-resolution colour emulation mode, however, the emulator makes a copy of every screen- and attribute-line in a buffer at the exact time the ULA would display it. Also, the exact times at which the border colour is changed are stored.
Using this information the emulator builds the screen; in this way, what you see on your PC monitor is exactly what a real Spectrum would display on a television. Remember Aquaplane, with its full-width horizon?

Each line takes exactly 224 T states. After an interrupt occurs, 64 line times pass before the byte 16384 is displayed. At least the last 48 of these are actual border-lines. I could not determine whether my monitor didn't display the others or whether it was in vertical retrace, but luckily that's not really important. Then the 192 screen+border lines are displayed, followed by 56 border lines again. This makes a total of 312 lines of 224 T states, or 69888 T states, which is, at 3.5 MHz, very nearly 1/50th of a second.

Now for the timings of each line itself. I define a screen line to start with 256 screen pixels, then border, then horizontal retrace, and then border again. All this takes 224 T states. Every half T state a pixel is written to the CRT, so if the ULA is reading bytes it does so every 4 T states (and then it reads two: a screen and an ATTR byte). The border is 48 pixels wide at each side. A video screen line is therefore timed as follows: 128 T states of screen, 24 T states of right border, 48 T states of horizontal retrace and 24 T states of left border.

When an interrupt occurs, the running instruction has to be completed first. So the start of the interrupt is fixed relative to the start of the frame up to the length of the last instruction in T states. If the processor was executing a HALT (which, according to the Z80 books I read, is effectively many NOPs), the interrupt routine starts at most 3 T states away from the start of the frame. The slowest instructions (INC/DEC (IX+d), RL etc. (IX+d), EX (SP),IX) take 23 T states. Of course the processor also needs some T states to store the program counter on the stack, read the interrupt vector and jump to the routine.
In interrupt mode 1, this takes 13 T states; in interrupt mode 0, and assuming a RST #38 opcode is supplied, it takes 12 T states; a mode 2 interrupt takes 19 T states. Finally, a Non Maskable Interrupt is the fastest: it takes 11 T states. The ZX81 hardware generates a WAIT only 16 T states before it generates an NMI, which, by some combined hardware and software wizardry, generates one scanline on the television screen. It seems therefore that by executing a whole lot of slow instructions in a block, it is possible to jam the horizontal synchronisation of the ZX81 video signal. Has this ever been tried?

Now when to OUT to the border to change it at the place you want? First of all, you cannot change the border within a 'byte', an 8-pixel chunk. Forgetting about the screen for a moment: if you OUT to port FE after 14326 to 14329 T states (including the OUT) from the start of the IM 2 interrupt routine, the border will change at exactly the position of byte 16384 of the screen. The other positions can be computed by remembering that 8 pixels take 4 T states, and a line takes 224 T states. You would think that by OUTing after 14322 to 14325 T states, the border would change 8 pixels left of the upper left corner of the screen. This is right for 14322, 14323 and 14324 T states, but if you wait 14325 T states the ULA happens to be reading byte 16384 (or 22528, or both) and will halt the processor for a while, thereby making you miss the 8 pixels. This exception happens again after 224 T states, and again after 448, and so forth. These 192 exceptions left of the actual screen rectangle are the only ones; similar things don't happen at the right edge because the ULA doesn't need to read things there - it has just finished!

As noted above, reading or writing in low RAM (or OUTing to the ULA!) causes the ULA to halt the processor. When and how much? The processor is halted each time you want to access the ULA or low memory and the ULA is busy reading.
Of the 312 'lines' the ULA generates, only 192 contain actual screen pixels, and the ULA will only read bytes during 128 of the 224 T states of each screen line. But if it does, the processor seems to be halted for 64 T states. It is not clear to me when, and for how long exactly, the ULA halts the processor. Sometimes the ULA even stops the processor when it is not interfering with it (when it is busy making the border left or right of the screen rectangle). Also, the timings on the 128K Spectrum are different. The 128 ULA seems to be more relaxed about giving the processor access to screen memory. I do not have any hard information on this at the moment.
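
The line and frame timings given earlier in this section can be put together in a short Python sketch. It is an idealised model only: the names are my own, time is counted from the start of the frame, and it deliberately ignores the interrupt response time and the processor halts whose exact timing is unknown, as just discussed.

```python
T_PER_LINE = 224            # T states per video line
LINES_BEFORE_SCREEN = 64    # line times before byte 16384 is displayed
SCREEN_LINES = 192          # lines containing screen pixels
LINES_AFTER_SCREEN = 56     # border lines after the screen
LINES_PER_FRAME = LINES_BEFORE_SCREEN + SCREEN_LINES + LINES_AFTER_SCREEN
T_PER_FRAME = LINES_PER_FRAME * T_PER_LINE
CLOCK_HZ = 3_500_000


def describe_t_state(t):
    """Map a T state count since the start of the frame to (line, t_in_line, region)."""
    line, t_in_line = divmod(t % T_PER_FRAME, T_PER_LINE)
    if t_in_line < 128:
        # 256 pixels at two pixels per T state; border colour outside the screen rows
        on_screen_row = LINES_BEFORE_SCREEN <= line < LINES_BEFORE_SCREEN + SCREEN_LINES
        region = "screen" if on_screen_row else "border"
    elif t_in_line < 152:
        region = "right border"
    elif t_in_line < 200:
        region = "retrace"
    else:
        region = "left border"
    return line, t_in_line, region


print(LINES_PER_FRAME, T_PER_FRAME)     # 312 lines, 69888 T states
print(CLOCK_HZ / T_PER_FRAME)           # frame rate in Hz, very nearly 50
print(describe_t_state(64 * 224))       # the moment byte 16384 is displayed
```

In this model the first screen byte appears at T state 14336; the figures of 14326 to 14329 quoted above for the border are measured from the start of the IM 2 routine and include the OUT itself, so they are not directly comparable.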