Proposals for the new ULA                                   kio 11.jun.00
-------------------------


(1) Addressing the ULA

    Any I/O with A0 = 0 selects the ULA.
    Bits A7-A5 = %111 select the classic functions.
    Bits A7-A5 = %011 select the new functions.
    Bits A15-A8 select the key row for classic keyboard reading.


(2) Classic functions

    Classic functions are addressed with Port %xxxx.xxxx.111x.xxx0

    When the ULA is locked to classic mode, we may consider ignoring
    address bits A7-A5 too.


    OUT to the ULA at the classic address:

        OUT (%1111.1110),%xxxe.mbbb

        o   set ear and mic output bit
        o   set border color

        D7   = nc
        D6   = nc
        D5   = nc
        D4   = ear
        D3   = mic
        D2-0 = border color index

        there are no new functions on this port, to maintain max.
        compatibility in the new enhanced mode and to ease maintaining
        100% compatibility in locked classic mode.
        (changing nothing is easiest.)

        Bits D2-D0 are interpreted as an index into the new color lookup
        table. This is fine in the locked classic mode too, because there
        the first 8 colors are set to the 8 classic colors.


    IN from ULA at the classic address:

        IN %1e1k.kkkk,(%rrrr.rrrr.1111.1110)

        o   read key row from keyboard matrix
        o   read ear input bit

        D7    = 1
        D6    = ear
        D5    = 1   *)
        D4-0  = key states from selected key row
        A15-8 = key row selection mask

        there are no new functions on this port, to maintain max.
        compatibility in the new enhanced mode and to ease maintaining
        100% compatibility in locked classic mode.
        (changing nothing is easiest.)

        the upper address bits A15-A8 select the keyboard rows to read.
        multiple rows can be selected at once.
        the pressed keys are returned in D0..D4.

        (*) We could consider using bit D5 too. this would give us
        8 additional keys to detect in classic and enhanced mode.
        if we attach an AT keyboard to the ULA, it would be the ULA's
        responsibility to read this keyboard and translate its state to
        the classic keyboard state; at least in locked classic mode.
        in enhanced mode we could use a register in the ULA for this purpose.
        but even in enhanced mode it would be easiest to supply key states
        on the classic port too. (changing nothing is easiest.)


(3) Addressing enhanced functions

    Enhanced functions are addressed with Port %xxxx.xxxx.011x.xxx0.

    All enhanced functions work via 128 registers in the ULA, which are
    readable and writable, though some may have no effect when written to
    and others may be inaccurate when read.

    To select a register, do an OUT to this port with D7 set and D6-D0
    containing the number of the desired register.

    To write to the selected register, do an OUT to this port with D7
    cleared and D6-D0 containing the data for the desired register.
    Writing to the selected register automatically increments the address
    register, so a subsequent OUT will write to the next register.
    Since the upper address byte A15-A8 is not used, you can use OTIR to
    write to a range of registers.

    To read from the selected register, do an IN to this port.
    Most registers will contain only 7 bit data with D7 = 0.
    Only very few, read-only registers may contain full 8 bits, if any.
    Reading from the selected register does not automatically increment
    the address register. this makes read-modify-write possible, e.g. for
    setting individual bits in a register.


(4) Enhanced functions & the ULA registers

    All enhanced functions are performed via 128 registers accessible as
    described in (3). All registers are readable and writable.
    Their functions are as described below:


    Color palettes:

        reg[0]      = ink 0, red
        reg[1]      = ink 0, green
        reg[2]      = ink 0, blue
        ...
        reg[31*3+0] = ink 31, red
        reg[31*3+1] = ink 31, green
        reg[31*3+2] = ink 31, blue

        nickname:   clut[i]     where i is the ink and clut[i] addresses
                                all 3 bytes
                    clut[i][c]  where i is the ink and c the color component

        All colors for screen output are derived from a color table,
        whether screen modes use attributes or not. the border color is
        also derived from this table.
        Any screen mode can display max. 32 distinct colors.
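
        the register numbering listed above (reg[0] ... reg[31*3+2]) can be
        sketched in C as follows. this is only an illustration: the names
        clut_reg, CLUT_RED, CLUT_GREEN and CLUT_BLUE are made up here and
        are not part of the proposal.

```c
#include <assert.h>

/* Color component numbers c as used in clut[i][c] (illustrative names). */
enum { CLUT_RED = 0, CLUT_GREEN = 1, CLUT_BLUE = 2 };

/* Register number of clut[ink][component]:
   three consecutive registers per ink, inks 0..31. */
static int clut_reg(int ink, int component)
{
    assert(ink >= 0 && ink < 32);
    assert(component >= 0 && component < 3);
    return ink * 3 + component;
}
```

        with this numbering, clut_reg(31, CLUT_BLUE) is register 95, the
        last register of the color table.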
        each color entry requires 3 bytes in total for its red, green and
        blue components, so the color table has 96 bytes (32*3) in total.
        each color component is 7 bits wide, with D7 = 0.

        whether the ULA can really provide 7 bit color resolution is not
        certain. for instance the ULA may be capable of displaying 4 bit
        colors only. this can be detected by reading back the color written
        to a color register: all neglected low bits will be 0.

        in case we really run out of registers, or if it is easier for the
        ULA that way, we could compress the r, g and b components into
        2 bytes (14 bits). then the color table is only 64 bytes in size.


    General setting:

        reg[96] = haupt

            D6 = 1  lock ula until reset
            D5 = 1  set ula to classic mode
                    (ignore all other register contents)
            D4 = 1  z80 clock = 3.5 MHz (else 21 MHz)
            D3 = 1  classic screen line interleaving
                    (swap A5-7 <-> A8-10 for pixels)
            D2 = 1  classic flashing (else foreground pixels only)

        all control bits are defined such that they must be set to switch
        to locked classic mode.

        if bit D6 is set, the ULA will disable its new ports. this is done
        by ORing this bit and A7 in the new_mode_addressing stage. when D6
        is set, all IN and OUT with A0 = 0 will work as in classic mode.
        this does not automatically imply that all ULA registers are set up
        to behave like a classic ULA! bit D6 can only be cleared by a reset
        of the ULA.

        if bit D5 is set, the ULA will set up all registers to behave as in
        classic mode. a hard wired ULA may obey this bit at once, a micro
        controller version will probably examine it only once per frame.
        whether this bit is reset after the ULA has initialized its
        registers is not specified. whether writing to other registers is
        completely ignored, or whether modifications are reset to the
        classic mode settings once per frame as long as this bit is set, is
        not specified either. we may specify this behaviour as soon as a
        real ULA is implemented.

        D4 enables an additional divider stage for the Z80 clock output.
        the main system clock is input to the ULA, which divides it as
        required and feeds the various clock sinks, e.g. its own micro
        controller, the Z80 clock and the pixel shift register. in a classic
        ZX Spectrum the input clock was 14 MHz and divided by 4 for the Z80.
        the enhanced ULA requires a higher input clock to provide higher
        pixel clocks and a higher Z80 clock. the exact input clock required
        will be determined later. but this much is clear: there will be an
        additional divider stage for the Z80 clock which divides the clock
        by 6. this stage can be enabled by setting D4 = 1, resulting in a
        Z80 clock of 3.5 MHz. if D4 = 0 then the additional divider stage is
        bypassed and the resulting clock for the Z80 CPU is 21 MHz.

        D3 controls the address bit swapping required to achieve the strange
        scanline interleaving we know from our Specci. if D3 is set, then
        the video_pixel_addressing stage swaps A5-A7 with A8-A10. this is
        done by inserting selection gates in the A5-A10 output lines.
        this only applies to the video pixel addressing, no other addressing!
        example for swapping of A5 and A8 in C syntax:

            A5 = haupt.D3 ? A8 : A5

        D2 = 1 enables classic flashing, which swaps fore- and background
        colors. if D2 is cleared, only the foreground pixels toggle between
        foreground color and background color; the background pixels stay
        steady at background color. this seems to be more useful for
        blinking graphic items, e.g. in games. this effect is achieved in
        the attribute_evaluation stage by inserting selection gates into
        D5-D0.
        example for swapping of a background color attribute bit in C syntax:

            D3 = flashphase && flashing && haupt.D2 ? D0 : D3

        example for swapping of a foreground color attribute bit in C syntax:

            D0 = flashphase && flashing ? D3 : D0


    Programmable video output:

        reg[97]  = blocks_per_frame   = total blocks per frame
        reg[98]  = block_screen_start = block no. for screen start
        reg[99]  = block_screen_end   = block no. for screen end
        reg[100] = block_interrupt    = block no. for interrupt (0 = classic)
        reg[101] = block_current      = current block

        reg[102] = lines_per_block    = total scan lines per block
        reg[103] = line_current       = current scan line

        reg[104] = bytes_per_line     = total bytes per scan line
        reg[105] = byte_screen_start  = byte no. of screen start
        reg[106] = byte_screen_end    = byte no. of screen end
                 = byte_current       = current byte (real register in µC)

        reg[107] = pixel_bank_1       = pixels bank 1
        reg[108] = pixel_bank_2       = pixels bank 2 *)
        reg[109] = pixel_base_lo
        reg[110] = pixel_base_hi      = pixels base address
        reg[111] = pixel_current_lo
        reg[112] = pixel_current_hi   = current pixel address **)

        reg[113] = attr_bank_1        = attributes bank 1
        reg[114] = attr_bank_2        = attributes bank 2 *)
        reg[115] = attr_base_lo
        reg[116] = attr_base_hi       = attributes base address
        reg[117] = attr_block_lo
        reg[118] = attr_block_hi      = current attribute block base address
        reg[119] = attr_current_lo
        reg[120] = attr_current_hi    = current attribute address **)

        reg[121] = video_ctrl         = misc. video control bits

            D6 = block_attr = within a block, attributes start at the same
                              address on every line
            D5 = has_attr   = screen mode uses attributes
            D4 = dual_bank  = screen mode reads pixels from 2 pages

        video output is performed in multiple nested loops. each loop's
        control variables are stored in i/o registers and can be read and
        written from the Z80 program.

        byte_current can't be read because in a µC version of the ULA we'll
        probably need to keep it in a real register for performance reasons.
        also (**) pixel_current and attr_current may need to be kept in real
        registers of the µC and would therefore not be accessible via ULA
        registers. then we could still read/write these cells once per line.

        the (*) second banks pixel_bank_2 and attr_bank_2 are not required
        if we accept the following limitations on the overall video
        capabilities:

        -   pixels always come from only one bank throughout the entire
            screen, and
        -   the same applies to attributes, though they may come from a
            different bank.
        -   this will simplify the overall design.
            it will not limit the size of screens (16 kB pixels should be
            enough) but a TS2068 compatible highres mode will not be
            possible, because in this mode pixels come interleaved from
            bank 1 and 2. also the transparent layers mode won't be
            possible, which I consider a must to include. but the mode
            where 1 attribute byte from bank 2 is associated with 1 pixel
            byte from bank 1 will be possible.
        -   note: the current PCB of the +2 will probably allow only 2 bytes
            to be fetched for one classic character cell width. this means
            either 1 byte pixels plus 1 byte attributes (which may come from
            the same or a different bank) or 2 bytes pixels with no
            attributes.


    what are the benefits of a freely programmable video output?

        -   all registers presented to the outside world are required
            anyway, even for the implementation of one single video mode.
            the only difference is that we can read and write the registers,
            with all the benefits we can take from this feature. on the
            other hand, some control parameters which we would like to keep
            in the µC's real registers can't be kept there, because we want
            them to be stored in i/o accessible registers.
        -   we are not limited to TV compatible modes. if someone connects
            the ULA to a monitor he can program other modes, e.g. with less
            flicker, which are compatible with regard to the video ram
            layout.
        -   furthermore we need not discuss which modes we would like to
            implement. though we'll get this discussion again when we start
            on a GUI...
        -   for game and demo writers this will be great stuff to play with.
            they can work with weird screen sizes as they like, as long as
            they can be displayed on a TV set. they can switch between modes
            at a defined position in the screen, so they can have a lowres
            graphics area and a highres text area. and they can use the
            programmable screen start address to implement vertical hardware
            scrolling and, with some limitations to the resolution,
            horizontal hardware scrolling too.


    how does it work?
        there is an infinite loop over all frames. each time this loop is
        entered for another frame, it performs all once_per_frame tests,
        e.g. checking haupt.D5 for a switch to classic mode, and emits the
        vertical retrace signal.

        then there are three sections which correspond to the upper border,
        the screen and the lower border. this approach is especially
        suitable for software, that is: a µC. it is not so good for a hard
        wired ULA.

        each section consists of a loop over blocks of scan lines. this
        seems to be a good idea, because the number of scan lines per
        section cannot be expressed in 7 bits, which is the limit for the
        ULA registers, and because this way the blocks of lines with the
        same attributes can be implemented.

        each block starts with some initial actions, e.g. a test whether
        this is the block in which the interrupt should be generated. it is
        also necessary to check for screen mode changes within a frame, and
        this is best done once per block. so screen mode changes are limited
        to whole blocks.

        within a block there is a loop over the individual scan lines. each
        scan line starts with the horizontal sync signal. then border,
        screen and again border pixels are sent to the video shifter.

        now an example for the implementation in C syntax:

            main()
            {
            // ---- loop over frames ----
                for(;;)
                {
                // start frame actions
                    if (haupt.D5) Reset_Registers_To_Classic_Mode();
                    block_current = 0;
                    pixel_current = pixel_base;
                    attr_current  = attr_base;
                    attr_block    = attr_base;
                    Do_Vertical_Retrace_Signal();

                // border above screen
                    while ( block_current < block_screen_start )
                    {
                        Do_New_Block_Actions();
                        for ( line_current=0; line_current [...]

            [...] >> 7
            attr_bank_1   = pixel_bank_1
            attr_base_lo  = ($4000+24*32) & $7F
            attr_base_hi  = ($4000+24*32) >> 7
            video_ctrl    = block_attr | has_attr   (no dual_bank)


    ---- more to come ----

        handling of the attribute modes will come later...

        reg[] = %0abc.defg
                a = 1   text mode (some registers have new meaning)
                e = 1   transparent layer mode
        reg[] = pixel size (bits), ((more attr. specs))
        reg[] = pixel clock divider
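

    sketch: register access as described in (3)

        the following is a minimal software model in C of the register
        select / write / read protocol from section (3), meant only to make
        the auto-increment and read-modify-write behaviour concrete. the
        type UlaModel and the functions ula_out, ula_in and ula_set_bits are
        invented for this sketch. the wrap-around from register 127 to
        register 0 is an assumption; the proposal does not specify it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the ULA register file (names are illustrative). */
typedef struct {
    uint8_t reg[128];   /* 128 registers, 7 data bits each */
    uint8_t addr;       /* currently selected register, 0..127 */
} UlaModel;

/* OUT to the enhanced port:
   D7 = 1 selects register D6-D0,
   D7 = 0 writes D6-D0 to the selected register and selects the next one. */
static void ula_out(UlaModel *ula, uint8_t value)
{
    if (value & 0x80) {
        ula->addr = value & 0x7F;
    } else {
        ula->reg[ula->addr] = value;
        ula->addr = (ula->addr + 1) & 0x7F;   /* wrap-around: assumption */
    }
}

/* IN from the enhanced port: returns the selected register.
   The address register is not incremented. */
static uint8_t ula_in(const UlaModel *ula)
{
    return ula->reg[ula->addr];
}

/* Read-modify-write: set individual bits in one register. */
static void ula_set_bits(UlaModel *ula, uint8_t regno, uint8_t mask)
{
    assert(regno < 128 && mask < 128);
    ula_out(ula, 0x80 | regno);          /* select the register */
    ula_out(ula, ula_in(ula) | mask);    /* read, modify, write back */
}
```

        for example, ula_set_bits(&ula, 96, 0x10) would set D4 of haupt
        without changing the other control bits. because the write
        increments the address register, register 97 is selected afterwards.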