Sunday 24 December 2023

Making the Compact Flash Coreboard more accessible (and reliable!)

I really enjoyed the design exercise for the FreeBee main board, especially doing the whole thing without using any PLDs. It had me drawing Karnaugh maps and trying different methods of simplifying TTL logic, which I find really satisfying.

So over the Christmas break I figured I'd revisit an older project, the Compact Flash Coreboard. The last time I worked on this was back in 2016. It basically featured a compact flash socket, some RAM, ROM, a floppy controller, and three Atmel 44 pin CPLDs. The design of these CPLDs was an exercise in frustration, tracking down glitches everywhere due to the big difference in their speed compared to that of the Microbee that was running them. One of the key issues was that I had convinced myself that in order to talk IDE I needed to do 16 bit transfers, when this simply isn't the case. This greatly complicated the design.

Recently, however, I've been reading about RCBUS designs, including a compact flash interface in just three TTL chips (see this work by Tadeusz Pycio) that corrects for timing differences between the Z80 (rd, wr, iorq) and 8086 (iord, iowr) buses, meaning it will work reliably with any smallish flash card, not just a select few.

The board essentially replicates what my old rev 0.4 coreboard did, with up to 512K of RAM, up to 512K of EEPROM, a floppy disk controller, and the Compact Flash interface. It's the "I want to have the CP/M bee experience" equivalent of the SuperPAK coreboard.

By allowing for up to 512K of EEPROM, my hope is that the machine can run a ported version of ROMWBW, which is an unencumbered version of CP/M developed by the retrocomputing community.

IC13, a 74HCT138, decodes the I/O ports into eight-port chunks, from ports 40h-47h through to ports 78h-7Fh.
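
As a rough illustration of that decode, here is a small Python sketch. It assumes the '138 is enabled whenever the port falls in 40h-7Fh and that address bits A5-A3 drive its select inputs. That wiring is inferred from the port ranges above rather than read off the schematic, and the function name is made up for the example.

```python
def decode_port_chunk(port):
    """Return the decoder output (0-7) selected by an I/O port,
    or None if the port is outside 40h-7Fh.

    Assumption: the decoder is enabled when A7=0 and A6=1, and its
    select inputs are A5..A3, so each output covers eight ports.
    """
    port &= 0xFF
    if (port & 0xC0) != 0x40:      # not in 40h-7Fh
        return None
    return (port >> 3) & 0x07      # A5..A3 pick one of eight chunks


for p in (0x40, 0x48, 0x50, 0x60, 0x7F, 0x80):
    print(f"{p:02X}h -> {decode_port_chunk(p)}")
```

Under those assumptions the floppy latch at 48h, the bank register at 50h and the compact flash interface at 60h each land on their own decoder output.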

Bank switching is performed to allow up to 512K of RAM and 512K of EPROM on the Microbee or FreeBee mainboards, in a way that works with the standard Microbee BIOS. This involves a write-only port at 50h, with the following bit assignments:

  • Bit 0: RAM bank bit 0
  • Bit 1: RAM bank bit 1
  • Bit 2: Video RAM disable. When reset, video memory is given precedence in the map over all other memory.
  • Bit 3: ROM disable. When set, RAM bank 0 appears in upper memory (8000h - FFFFh) and bit 1 is negated. When reset, ROM bank zero appears in upper memory.
  • Bit 4: Video RAM location. When reset, video RAM is at F000h to FFFFh. When set, it is from 8000h to 8FFFh.
  • Bit 5: ROM bank select. When set, we can use the four bank bits to select a ROM bank to appear in lower memory (0000h to 7FFFh) when ROM is enabled. When reset, RAM exists here. Note that setting it leaves no RAM in the system except for Video RAM. This bit breaks compatibility with the bee bank scheme when set, but that shouldn't matter as I've never seen it used.
  • Bit 6: RAM bank bit 2.
  • Bit 7: RAM bank bit 3.

    There's a bit of weirdness with the Microbee bank select that we have to account for as well. Essentially bank select bit 1 is exclusive ORed with ROM disable. I'm not completely sure why they did this, and it's really hard to figure out from the contradictory documentation. When the computer boots and the register is cleared, RAM bank 0 appears from 0000h to 7FFFh, ROM bank 0 from 8000h to EFFFh, and video memory from F000h to FFFFh. Setting bit 2 then sees RAM bank 2 from 0000h to 7FFFh, RAM bank 0 from 8000h to EFFFh, and video memory from F000h to FFFFh.
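
The bank bits and the exclusive OR can be sketched in a few lines of Python. This is only a model of the description above, not of the actual logic on the board. The function names are made up, and ROM disable is passed in as a plain flag so the sketch does not depend on which register bit carries it.

```python
def ram_bank_bits(reg):
    """Gather the four RAM bank bits from a port 50h value.

    Register bits 0, 1, 6 and 7 carry bank bits 0, 1, 2 and 3.
    """
    low = reg & 0x03            # register bits 0-1 -> bank bits 0-1
    high = (reg >> 6) & 0x03    # register bits 6-7 -> bank bits 2-3
    return (high << 2) | low


def lower_ram_bank(bank, rom_disable):
    """RAM bank seen at 0000h-7FFFh.

    Bank bit 1 is exclusive ORed with ROM disable, as described above.
    """
    return bank ^ (0x02 if rom_disable else 0x00)


print(lower_ram_bank(ram_bank_bits(0x00), rom_disable=False))
print(lower_ram_bank(ram_bank_bits(0x00), rom_disable=True))
```

With the bank bits all clear, disabling the ROM moves the lower 32K from bank 0 to bank 2, which matches the behaviour described for the real machine.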

    The compact flash interface is composed of just two dedicated chips, IC26 and IC28. IC26B, a 74HCT32, creates a shortened IORD* pulse by delaying the start of RD* by one CPU clock. IC26A lengthens CFSEL* by a CPU clock. The two gates on IOWR* simply delay this signal by two gate delays.

    The compact flash and IDE interface exists from P60h to P68h. This maintains compatibility with the Microbee CF8 BIOS.

    The reset circuit is straight from the standard SRAM coreboard. Note that we are not asserting NMI instead of reset with a jump latch, as they do for the Microbee DRAM coreboards. I believe that is done to ensure refresh to the RAM continues. As we're not using DRAM, we don't need to do this. As with the SuperPAK board, some adjustment of D4 might be necessary depending on what supply voltage you run your bee on.

    The rest of the board is the Floppy Disk Interface. This is really only interesting if you have old floppy disks to read. There are a pile of changes from the "standard" Microbee floppy interface. Firstly, the board should work (with some code changes) with either WD2793 or WD2797 FDC chips. This is done by not using the ENMF* input (WD2793) to do a divide by two on the clock. The divide is instead done separately using a flip-flop.

    IC15 is a four bit latch at port 48h. Bit 0 selects the floppy drive (A or B). Bit 2 selects the side. Bit 3 is used to select double density (MFM encoding) on the FDC. Bit 4 (normally unused on the Microbee) selects high density (8", 5.25" 1.2MB, or 3.5" 1.44MB) disks. When set it doubles the FDC clock and selects a different set of precompensation trimpots, as well as doubling the pump frequency (by halving the capacitance on the pump pin). This is based on the FloppyIO board from MSPP.
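
To make the bit assignments concrete, here is a minimal Python sketch that builds a value for the latch. The constant and function names are invented for the example, and the drive select polarity (0 for A, 1 for B) is an assumption.

```python
DRIVE_B = 0x01   # bit 0: drive select (assumed 0 = A, 1 = B)
SIDE_1 = 0x04    # bit 2: side select
DDEN = 0x08      # bit 3: double density (MFM)
HDEN = 0x10      # bit 4: high density


def fdc_latch(drive_b=False, side=0, double_density=True, high_density=False):
    """Build the value to write to the latch at port 48h."""
    value = 0
    if drive_b:
        value |= DRIVE_B
    if side:
        value |= SIDE_1
    if double_density:
        value |= DDEN
    if high_density:
        value |= HDEN
    return value


print(f"{fdc_latch():02X}")                   # double density only
print(f"{fdc_latch(high_density=True):02X}")  # double and high density
```

The two values printed are the ones used in the monitor during disk controller setup: 8 for DDEN alone, and 18 for DDEN plus HDEN.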

    There's some jumpers for selecting Head Load Timeout delays. I've labelled them 3 (fast), 5 (medium) and 8 (glacial) to correspond to the varying delays that must be incorporated for various hardware.

    The last tidbit from the FDC sheet is the NMI logic. The microbee is not fast enough normally to keep up with the data rate from a high density drive. Tony Ellis did a lot of work to develop a faster FDC interface, and worked out that if you use the INTRQ output from the FDC to trigger an NMI and heavily optimise your code, you can _just_ keep up at 3.375 MHz.

    So IC20D, IC18A, and IC23B do that. When HD floppies are enabled and halt is active (i.e. the CPU is waiting for an interrupt), the INTRQ or DRQ output of the FDC is gated to NMI.

    The PCB design is a simple 269 x 107mm, two layer board. The Compact Flash socket dictates 8 thou (0.2mm) clearance, but otherwise it's much the same as other FreeBee boards. I've taken a lot of care to get grounds low impedance, and added a 40mm IDE socket, so if you're not brave enough to solder on the 0.635mm pitch SMD Compact Flash socket, you can just buy a cheap eBay adapter and use that.

    It's part of the FreeBee family, so has been treated to the same attention to detail as other boards from the family. Nice rounded tracks and a clean hand-done layout, with generous elliptical pads for all ICs making for ease of construction.

    It's a completely open source design. Design files are on my Google drive. Having gotten the prototype working, I'm in the process of updating the design files for the production version.

    Here's the prototype ready for smoke test...

    There are two small stuff-ups with the prototype. Firstly, I forgot to connect the video enable signal back to the coreboard connector. A bodge wire is needed from IC16 pin 4 to X4 pin 13. Secondly, the drive A and drive B select signals are reversed.

    There is a process documented on the MSPP site for generating the CF images to work with this, as well as the BIOS ROM, in the "tech" repository, under Microbee/Software/Compact_Flash/IDE_CF_Adapter.

    Disk controller setup is as follows:

    Boot into monitor with ctrl-M (no CF card installed).

    Install a jumper across the TEST header (This has to be done after boot).

    In monitor type O 48 8. This will enable DDEN mode, but keep HDEN mode disabled.

    • Check the clock frequency (pin 24 IC19), it should be 1 MHz.
    • Adjust RV4 to make the pulse on the TG43 test point 500ns.
    • Adjust C23 to make the pulse on the DIRC test point 2µs.
    • Adjust RV2 to make the pulse on the WD test point 250ns.

    Now type O 48 18 in monitor. This enables HDEN mode, as well as DDEN mode.

    • Check the clock frequency (pin 24 IC19), it should be 2 MHz.
    • Adjust RV3 to make the pulse on the TG43 test point 250ns.
    • Check that the pulse length on DIRC is now 1µs.
    • Adjust RV1 to make the pulse on the WD test point 125ns.

    Monday 30 October 2023

    FreeBee - A Microbee Compatible Single Board Computer

    This one is perhaps a little more ambitious than most of my vintage computing shenanigans. It's been rattling around my head for a good number of years, as a vague idea about how I could simplify the video hardware on a Microbee. It's had a few false starts - mainly because as soon as the design process starts I start adding things - Z180 processors, per-pixel colour, blah blah. Once I start designing PLDs in, I know it's not going to see the light of day, as I simply don't enjoy the PLD design process.

    So this time around I set myself some really strict rules. I wanted to design the least possible computer I could that's capable of running Emu Joust. Rules are:

    • All through-hole DIP.
    • Must be able to play Emu Joust.
    • Absolutely no PLDs. Really. Designing a PLD in is just shorthand for saying "I couldn't be bothered doing that bit, I'll leave it for later". And the bloody things just raise the bar for everyone. So many products use a PLD simply to make them hard to copy.
    • Which brings me to the next bit. Open source.
    • A machine for validating my video memory scheme.
    • As much as possible built from easy to obtain, current production chips.

    The last point is a bit difficult, as the key component for running Microbee software is the R6545 CRT controller. This is the granddaddy of modern graphics processors. It doesn't actually touch the graphics information, but it contains a pile of highly configurable counters that generate all the addresses for working through video memory and pushing the data out to a CRT.

    It's the "highly configurable" bit that's problematic. Other machines of a similar era, even those that used ASICs for video (like the Sinclair Spectrum), had counters to clock through video memory, but weren't nearly as configurable as the 6545. This is why the Harlequin is able to do Spectrum video in TTL. No configurability. Anyway, yes, 6545s aren't in current production, but we'll just have to deal with that.

    So let's get this video memory scheme bedded down first. It's the core part of the design. The majority of the computer is the video display circuitry.

    The R6545, like its very close relative the Motorola 6845, comes from the bad old days of computing when people were impressed by seeing characters on a screen. Nowadays computers have pixel-addressable screens, but back in the late seventies that was super high-end. Think of the amount of data that even the modest Microbee 512 x 256 screen resolution entails (16kB in glorious monochrome), and then imagine moving the whole screen up one line (like you would when you scroll) with an LDIR block copy command. This takes 21 clocks per byte, or a whopping 6.2µs at 3.375 MHz. Our screen takes about 102ms to move (much longer if we have to wait for retrace times to do our moving), and the CPU isn't doing _anything_ else during that time.

    So the whole character thing makes our lives simpler. If we base our display on ASCII, and split the display up into (say) 16 rows of 64 columns (like the Bee does) then we've only got 1kB of data to store (and move!) for the whole screen.
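
The arithmetic behind those figures is easy to check. The short Python sketch below uses only numbers quoted above (21 clocks per byte for LDIR, a 3.375 MHz CPU clock, a 512 x 256 monochrome bitmap and a 64 x 16 character screen). The function name is just for illustration.

```python
CPU_HZ = 3.375e6
LDIR_CLOCKS_PER_BYTE = 21


def ldir_seconds(nbytes, cpu_hz=CPU_HZ):
    """Time for LDIR to copy nbytes, ignoring retrace waits."""
    return nbytes * LDIR_CLOCKS_PER_BYTE / cpu_hz


bitmap_bytes = 512 * 256 // 8   # one bit per pixel
text_bytes = 64 * 16            # one byte per character cell

print(f"per byte:    {ldir_seconds(1) * 1e6:.2f} us")
print(f"full bitmap: {ldir_seconds(bitmap_bytes) * 1e3:.1f} ms")
print(f"text screen: {ldir_seconds(text_bytes) * 1e3:.1f} ms")
```

The character screen is a sixteenth of the data, so the same block move costs a sixteenth of the time.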

    The key to character based screens is a double memory access. The CRTC outputs a counter that points to the character position on the screen. Each character is made up of a number of rows, and the CRTC outputs a row address that indexes into a character ROM. The same people who made the CRT Controllers also sold ROMs with ASCII character data.

    Here's a diagram from the SY6545 application note showing the scheme:

    And here's what's in the character ROM. This little guy is the Motorola MCM66740, from the late seventies. If you've spent as long looking at Microbee screens as I have, this will be instantly recognisable. It's the 'bee font!

    When generating the display, the CRTC scans across the screen, outputting the relevant screen address for the characters being displayed. At the end of the line a horizontal sync pulse is output, and then the same thing happens for the next row in the same characters. Once all the rows for the characters in the first row are done, the screen address is updated for the second row, and the next lot of characters are clocked out.
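
Here is a toy Python model of that double lookup, for anyone who finds code easier to follow than prose. The glyph data, the four-row character height and the four-column screen are all made up to keep the example small. Only the order of the two accesses reflects the scheme described above.

```python
ROWS_PER_CHAR = 4   # tiny made-up glyphs, not the real font height
COLS = 4            # made-up screen width in characters

# Hypothetical character ROM: character code -> one pixel byte per glyph row.
CHAR_ROM = {
    0x20: [0x00, 0x00, 0x00, 0x00],
    0x2D: [0x00, 0x7E, 0x7E, 0x00],
    0x7C: [0x18, 0x18, 0x18, 0x18],
}


def scanlines(screen_ram):
    """Yield one list of pixel bytes per scanline."""
    char_rows = len(screen_ram) // COLS
    for char_row in range(char_rows):
        for ra in range(ROWS_PER_CHAR):           # row address from the CRTC
            line = []
            for col in range(COLS):
                ma = char_row * COLS + col        # screen address from the CRTC
                code = screen_ram[ma]             # first access: screen memory
                line.append(CHAR_ROM[code][ra])   # second access: character ROM
            yield line


screen = [0x7C, 0x2D, 0x2D, 0x7C,
          0x20, 0x7C, 0x7C, 0x20]
for line in scanlines(screen):
    bits = "".join(f"{b:08b}" for b in line)
    print(bits.replace("0", ".").replace("1", "#"))
```

Note that the screen address repeats for every scanline of a character row, and only the row address changes.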

    Now how do we get graphics capability without the massive increase in CPU load that comes from pixel addressable graphics? The Exidy Sorcerer (and Microbee) made use of programmable characters. Essentially a RAM was added alongside the character ROM. The first 128 characters (standard ASCII) go to the ROM, and the upper 128 go to the RAM. Loading data into the RAM is a bit laborious - you need another set of address multiplexers, and another data transceiver. In the classic Microbee there are six multiplexer chips, and two transceivers, plus the shift register for the video output.

    It gets a lot more complicated in later bees. They have an extra RAM for colour for each character, plus yet another for "attribute" which really just increases the number of characters we can have (from 256 to conceivably 64K), meaning enough memory to have an individual bit per pixel, at the expense of more memory and more data bus transceivers.

    So this is where we enter the scene. We're not interested in colour or attributes (yet), but we are interested in running Emu Joust, which requires monochrome with 128 "Programmable Character Graphics" characters.

    It’s possible to simplify this a bit, if you don’t mind all characters being in RAM (which means you have to pre-load them at power up). The key to understanding how to do this is to examine the timing. In order to display 80 characters at the standard PAL horizontal rate of 15.625 kHz, we need to use a dot clock of 13.5 MHz, which gives us a character rate (each character being 8 pixels wide) of 1.6875 MHz (592ns). This is a long time. Even when the Bee was very first built in ’82, you could buy 6116 static RAMs with an access time of 200ns or better. So it’s entirely reasonable to look up the screen data, then use the result a second time to look up the character data using the same physical RAM.

    Our budget ends up looking something like:

    • 9ns input mux + 200ns RAM + 22ns screen latch + 200ns RAM = 431ns.

    Whereas a normal Microbee does:

    • 9ns screen mux + 200ns screen RAM = 209ns, at the same time as
    • 9ns character mux + 200ns PCG RAM = 209ns, or
    • 9ns character mux + 450ns Character ROM = 459ns.

    It’s pretty clear the incredibly slow 2532 is letting the show down for everyone. If we can just get rid of it, we can make some real changes.
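
The budget can be tallied against the character period in a few lines of Python. The delays are the ones listed above, and the period comes from the 13.5 MHz dot clock and 8 pixel wide characters. The dictionary labels are just descriptive names for the example.

```python
DOT_CLOCK_HZ = 13.5e6
PIXELS_PER_CHAR = 8

# Time available for one character, in nanoseconds.
char_period_ns = PIXELS_PER_CHAR / DOT_CLOCK_HZ * 1e9

# Access paths and their delays in nanoseconds, as listed above.
paths = {
    "single RAM, two accesses": [9, 200, 22, 200],
    "classic, screen RAM": [9, 200],
    "classic, PCG RAM": [9, 200],
    "classic, character ROM": [9, 450],
}

print(f"character period: {char_period_ns:.1f} ns")
for name, delays in paths.items():
    total = sum(delays)
    print(f"{name}: {total} ns, slack {char_period_ns - total:.1f} ns")
```

On these numbers the two back-to-back RAM accesses finish sooner than a single read of the old character ROM.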

    Let's use just one 8K RAM for everything. At the start of a character cycle the screen address is presented to the RAM (gated using tri-state buffers). The RAM outputs the screen data. At the half-way point through the character, this data is latched, and wrapped around back to the RAM address lines (using a second tri-state buffer) along with the row address. The data from the RAM is now our video which may be latched into the shift register. CPU accesses to either screen or character use a third set of tri-state buffers for the address, and a transceiver for the data.
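
A minimal Python sketch of one character cycle with a single RAM follows. The memory layout here (screen codes at 0000h, glyph data at 1000h, 16 rows per glyph) is purely hypothetical and chosen to make the example work. It is not the address mapping used on the board, and the function name is invented.

```python
RAM_SIZE = 8 * 1024
SCREEN_BASE = 0x0000   # hypothetical: 2K of screen codes
GLYPH_BASE = 0x1000    # hypothetical: 256 glyphs x 16 rows = 4K


def character_cycle(ram, ma, ra):
    """Return the pixel byte for one character time.

    ma is the screen address and ra the row address from the CRTC.
    """
    # First half of the cycle: the screen address drives the RAM.
    code = ram[SCREEN_BASE + (ma & 0x7FF)]
    # Half-way point: the code is latched and wrapped back around to
    # the address lines, together with the row address.
    glyph_addr = GLYPH_BASE + (code << 4) + (ra & 0x0F)
    # Second half: the RAM output is the video byte for the shift register.
    return ram[glyph_addr]


ram = bytearray(RAM_SIZE)
ram[SCREEN_BASE + 5] = 0x41                # character code at screen position 5
ram[GLYPH_BASE + (0x41 << 4) + 3] = 0xAA   # row 3 of that glyph
print(f"{character_cycle(ram, ma=5, ra=3):02X}")
```

Both reads hit the same array, which is the whole point: one memory chip, accessed twice per character.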

    Note that 8K is more than we ostensibly need for screen RAM (2K), Character RAM (2K), and PCG RAM (2K). Let's use the last 2K for a second "small" font, which will be useful for an 80 x 24 screen. Turns out the Motorola MCM6674 character ROM has just the font we need, which (surprising nobody) is exactly the same as the Microbee 80 column font. To keep our Microbee compatibility, we'll enable this little guy using the MA13 output from the 6545, so by changing screen start address from 0000h to 2000h, we select the small font.

    So here's our schematics for CRT controller, video memory, and video memory access control. There's some complexity, sure, but we can break it down a bit. The first sheet is the CRT controller, along with the video output circuitry and keyboard.

    Here IC9 is our CRT Controller. It generates Video Memory addresses (MA0-10 and RA0-3) for the video memory array. The keyboard makes use of a "light pen strobe" input on the CRTC. Some of the video memory addresses are decoded and passed through a keyboard array. If you press a key, as the addresses scan to the key column and row, this match is passed through to the CRTC, which dutifully latches the address and raises a status bit to tell the CPU a key has been pressed.

    The DE (Display Enable), HS (Horizontal Sync), VS (Vertical Sync) and Cursor outputs from the CRTC, along with the video bitstream from the memory page, are used to build the actual output to the CRT. HS and VS tell the CRT when to start a new row (HS) or screen (VS). Display enable is used to gate the video output, so we don't put rubbish in the margins, and cursor is a neat signal that can be used to invert the video for a programmable number of rows in a programmable character position. IC16 delays the cursor and display enable signals to line up nicely with the video stream, and IC26 and 37 do the gating and cursor inversion.

    Moving on to the video memory page, we have a few things to do. IC8 and 10 gate the CPU address to the video RAM (IC14) and IC21 gates the CPU data bus. This allows us to read and write to the video memory. IC13 and 13 gates the CRTC address to the memory for a screen access, and IC11 does similarly for the row addresses that are used during character access. IC17 allows us to feed the results of the screen lookup back to the RAM for characters, and IC15 serialises the final output to send to the screen. Gating for the CRTC is incredibly easy - it's just done on alternate phases of the character clock. When CCLK is low we do screen address, and when CCLK is high we do character.

    Finally we have the glue (no PLDs!) that works out what video address to generate for the various possibilities of CRTC screen, CRTC character RAM (big and little fonts), CRTC PCG, and CPU access for each of these. A couple of gates control read and write (essentially it's all read unless the CPU wants to do a write), and the last bit delays CPU access when the CPU tries to barge in while the CRTC is actively writing the screen, to ensure the CPU doesn't put garbage on the screen. This is reasonably straightforward - CPU accesses are latched in IC18, causing the CPU to be put into a wait state. Once we're in retrace, the wait is cleared and the CPU finishes doing its thing.

    As I said at the beginning, the video circuitry is most of this computer, so you'll be glad to know there's not a lot more to describe.

    Here's the CPU and memory. By using a CMOS CPU there's no need for bus transceivers and buffers everywhere (saving a bunch of chips). The EPROM here started life as simply something to load up the fonts into video RAM on boot, but I ended up making it big enough to include Microbee Basic as well, plus added 32K of RAM, so as one board it does everything.

    The flip flop is there to enable the ROM at address 0 on reset. Once the boot sequence is complete the ROM is able to page itself out, presenting RAM from 0000h to 7FFFh, with the upper ROM from 8000h to EFFFh.

    Next we have some ports. A Z-80 PIO gives us some GPIO, plus bit-banged serial, a speaker, and cassette I/O. Port decoding is simply done with a 74HCT138 and 74HCT139, giving the following port map:

    • 00 to 03: PIO
    • 0B: Boot ROM and Character RAM enable port
    • 0C to 0F: CRTC
    • Plus a few others that aren't used on the board but may be linked to a coreboard, if that's plugged in.

    Lastly we have clock and reset. The clock is derived from a 13.5MHz crystal. This provides the dot clock for the video shift register. It's divided by 4 (3.375 MHz) for the CPU clock, and by 8 (1.6875MHz) for character clock. The last eighth of each character clock is used to load the shift register, so this is derived by just anding CLK/2, CLK/4 and CLK/8.
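
The divider chain and the shift register load pulse can be modelled in a few lines of Python. This sketch assumes the dividers behave like a three bit binary up-counter clocked by the dot clock, so that CLK/2, CLK/4 and CLK/8 are simply its three bits. The real flip-flop arrangement may differ in phase, and the function name is made up.

```python
DOT_CLOCK_MHZ = 13.5


def divider_states(n_dots):
    """Yield (dot, cpu_clk, char_clk, load) for each dot clock period.

    Assumption: CLK/2, CLK/4 and CLK/8 are bits 0, 1 and 2 of a
    binary counter incremented once per dot clock.
    """
    for dot in range(n_dots):
        div2 = (dot >> 0) & 1
        div4 = (dot >> 1) & 1      # CPU clock
        div8 = (dot >> 2) & 1      # character clock
        load = div2 & div4 & div8  # high for one dot in every eight
        yield dot, div4, div8, load


print(f"CPU clock:       {DOT_CLOCK_MHZ / 4} MHz")
print(f"character clock: {DOT_CLOCK_MHZ / 8} MHz")
for dot, cpu, cclk, load in divider_states(16):
    print(dot, cpu, cclk, load)
```

With that counting order the load pulse is high only in the eighth dot of each character, just before the character clock wraps around.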

    Reset can come from two places - power up or reset switch. In each case it's just done by charging a capacitor and using a 74HC14 as a comparator.

    Layout is done in KiCad. 12 thou track and space with beautiful curved traces. It really looks the part.

    And after a bit of debugging, we finally have:

    As always, here's the design files:

    Friday 6 October 2023

    A SuperPAK Coreboard

    A bit of a discussion on the Microbee forum got me thinking it'd be nice to do a simple ROM coreboard. No flash, no CF, no SD, just lots of straightforward biggish EPROMs. Then load games in, write a shell, and play.

    I had a look at what was available around the time the Bee was still selling, and you could get 27512 EPROMs (64K x 8 in a 28 pin package) and skinny 6264s (which the Microbee premium baseboard uses). As it happens I have a whole pile of both.

    So a few days' work in KiCad yields plunder:

    One has to be careful when rounding the tracks in KiCad. Each time you run it, it seems to take about twice as long as the last time, so if you do it too much the board becomes impossible to edit.

    Edit: Yay, boards have arrived!

    And assembled using parts I had to hand. Note no battery backup as yet, plus I didn't have a 74C or 74HC14, so have substituted a 74HC04 (not Schmitt trigger). As a result reset is really hit and miss.

    In any case, it boots, both on non-premium and premium baseboards, and I can run Emu Joust and access PAK. I shall order the remaining parts through the week.

    An adaptor board to simplify software development - this allows us to plug in a single 32 pin EEPROM in place of four 28 pin EPROMs, and can be assembled with a ZIF socket.