Monday 30 October 2023

FreeBee - A Microbee Compatible Single Board Computer

This one is perhaps a little more ambitious than most of my vintage computing shenanigans. It's been rattling around my head for a good number of years, as a vague idea about how I could simplify the video hardware on a Microbee. It's had a few false starts - mainly because as soon as the design process starts I start adding things - Z180 processors, per-pixel colour, blah blah. Once I start designing PLDs in, I know it's not going to see the light of day, as I simply don't enjoy the PLD design process.

So this time around I set myself some really strict rules. I wanted to design the least possible computer I could that's capable of running Emu Joust. Rules are:

  • All through-hole DIP.
  • Must be able to play Emu Joust.
  • Absolutely no PLDs. Really. Designing a PLD in is just shorthand for saying "I couldn't be bothered doing that bit, I'll leave it for later". And the bloody things just raise the bar for everyone. So many products use a PLD simply to make them hard to copy.
  • Which brings me to the next bit. Open source.
  • A machine for validating my video memory scheme.
  • As much as possible built from easy to obtain, current production chips.

The last point is a bit difficult, as the key component for running Microbee software is the R6545 CRT controller. This is the granddaddy of modern graphics processors. It doesn't actually touch the graphics information, but it contains a pile of highly configurable counters that generate all the addresses for working through video memory and pushing the data out to a CRT.

It's the "highly configurable" bit that's problematic. Other machines of a similar era, even those that used ASICs for video (like the Sinclair Spectrum), had counters to clock through video memory, but weren't nearly as configurable as the 6545. This is why the Harlequin is able to do Spectrum video in TTL. No configurability. Anyway, yes, 6545s aren't in current production, but we'll just have to deal with that.

So let's get this video memory scheme bedded down first. It's the core part of the design. The majority of the computer is the video display circuitry.

The R6545, like its very close relative the Motorola 6845, comes from the bad old days of computing when people were impressed by seeing characters on a screen. Nowadays computers have pixel-addressable screens, but back in the late seventies that was super-high-end. Think of the amount of data that even the modest Microbee 512 x 256 screen resolution entails (16kB in glorious monochrome), and then imagine moving the whole screen up one line (like you would when you scroll) with an LDIR block copy instruction. This takes 21 clocks per byte, or a whopping 6.2µs at 3.375 MHz. Our screen takes about 102ms to move (much longer if we have to wait for retrace times to do our moving), and the CPU isn't doing _anything_ else during that time.

So the whole character thing makes our lives simpler. If we base our display on ASCII, and split the display up into (say) 16 rows of 64 columns (like the Bee does) then we've only got 1kB of data to store (and move!) for the whole screen.
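As a rough check of those numbers, here's a small Python sketch comparing a full bitmap move with a character screen move. The figures (21 clocks per byte, 3.375 MHz, 512 x 256 pixels, 64 x 16 characters) come from the text; the names are just illustrative, and it ignores details like the slightly shorter final LDIR iteration and any waiting for retrace.

```python
# Rough check of the scroll cost. Figures are from the text; names are illustrative.
CPU_CLOCK_HZ = 3_375_000          # Microbee CPU clock
LDIR_CLOCKS_PER_BYTE = 21         # Z80 LDIR cost per byte while it repeats

def ldir_ms(n_bytes):
    """Time in milliseconds for one LDIR block copy of n_bytes."""
    return n_bytes * LDIR_CLOCKS_PER_BYTE / CPU_CLOCK_HZ * 1000

bitmap_bytes = 512 * 256 // 8     # 1 bit per pixel -> 16384 bytes
text_bytes = 64 * 16              # 1 byte per character -> 1024 bytes

print(f"bitmap: {bitmap_bytes} bytes, {ldir_ms(bitmap_bytes):.1f} ms")
print(f"text:   {text_bytes} bytes, {ldir_ms(text_bytes):.1f} ms")
```

The character screen is a sixteenth of the data, so it moves in a sixteenth of the time.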

The key to character based screens is a double memory access. The CRTC outputs a counter that points to the character position on the screen. Each character is made up of a number of rows, and the CRTC outputs a row address that indexes into a character ROM. The same people who made the CRT Controllers also sold ROMs with ASCII character data.

Here's a diagram from the SY6545 application note showing the scheme:

And here's what's in the character ROM. This little guy is the Motorola MCM66740, from the late seventies. If you've spent as long looking at Microbee screens as I have, this will be instantly recognisable. It's the 'bee font!

When generating the display, the CRTC scans across the screen, outputting the relevant screen address for the characters being displayed. At the end of the line a horizontal sync pulse is output, and then the same thing happens for the next row of the same characters. Once all the rows for the characters in the first row are done, the screen address is updated for the second row, and the next lot of characters are clocked out.
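The scan order described above can be sketched in Python as a pair of nested lookups. This is only a model of the idea, not the hardware: the 16 scan lines per character, the `code * 16 + row` ROM addressing, and the glyph data are all assumptions made up for the example.

```python
COLS, ROWS = 64, 16        # character grid used in the text
ROWS_PER_CHAR = 16         # assumption: scan lines per character cell

def scan_frame(screen_ram, char_rom):
    """Yield one scan line at a time, as a list of 8-pixel bytes."""
    for char_row in range(ROWS):
        for ra in range(ROWS_PER_CHAR):            # row address within the cell
            line = []
            for col in range(COLS):
                ma = char_row * COLS + col         # screen address for this cell
                code = screen_ram[ma]              # first lookup: character code
                line.append(char_rom[code * ROWS_PER_CHAR + ra])  # second lookup
            yield line                             # horizontal sync goes here

# Made-up test data: a blank font except for one invented glyph at code 0x41.
char_rom = bytearray(256 * ROWS_PER_CHAR)
glyph = bytes([0x00, 0x18, 0x24, 0x42, 0x42, 0x7E, 0x42, 0x42,
               0x42, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00])
char_rom[0x41 * ROWS_PER_CHAR:0x42 * ROWS_PER_CHAR] = glyph
screen_ram = bytearray(COLS * ROWS)
screen_ram[0] = 0x41                               # top-left cell shows the glyph

lines = list(scan_frame(screen_ram, char_rom))
```

Note that the same 64 screen addresses are visited once per scan line of the character row, and only the row address changes between passes.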

Now how do we get graphics capability without the massive increase in CPU load that comes from pixel addressable graphics? The Exidy Sorcerer (and Microbee) made use of programmable characters. Essentially a RAM was added alongside the character ROM. The first 128 characters (standard ASCII) go to the ROM, and the upper 128 go to the RAM. Loading data into the RAM is a bit laborious - you need another set of address multiplexers, and another data transceiver. In the classic Microbee there are six multiplexer chips, and two transceivers, plus the shift register for the video output.

It gets a lot more complicated in later bees. They have an extra RAM for colour for each character, plus yet another for "attribute" which really just increases the number of characters we can have (from 256 to conceivably 64K), meaning enough memory to have an individual bit per pixel, at the expense of more memory and more data bus transceivers.

So this is where we enter the scene. We're not interested in colour or attributes (yet), but we are interested in running Emu Joust, which requires monochrome with 128 "Programmable Character Graphics" characters.

It’s possible to simplify this a bit, if you don’t mind all characters being in RAM (which means you have to pre-load them at power up). The key to understanding how to do this is to examine the timing. In order to display 80 characters in the standard PAL horizontal rate of 15.625 kHz, we need to use a dot clock of 13.5 MHz, which gives us a character rate (each character being 8 pixels wide) of 1.6875 MHz (592ns). This is a long time. Even when the Bee was very first built in ’82, you could buy 6116 static RAMs with an access time of 200ns or better. So it’s entirely reasonable to look up the screen data, then use the result a second time to look up the character data using the same physical RAM.

Our budget ends up looking something like:

  • 9ns input mux + 200ns RAM + 22ns screen latch + 200ns RAM = 431ns.

Whereas a normal Microbee does:

  • 9ns screen mux + 200ns screen RAM = 209ns, at the same time as
  • 9ns character mux + 200ns PCG RAM = 209ns, or
  • 9ns character mux + 450ns Character ROM = 459ns.
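To put those budgets next to the time actually available, here's a short Python sketch that compares each path with one character period at the 13.5 MHz dot clock. The delays are the ones listed above; the dictionary keys are just labels invented for the example.

```python
DOT_CLOCK_HZ = 13_500_000
PIXELS_PER_CHAR = 8
char_period_ns = PIXELS_PER_CHAR / DOT_CLOCK_HZ * 1e9   # about 592.6 ns

# Delay budgets in nanoseconds, copied from the lists above.
budgets_ns = {
    "single RAM, two lookups": 9 + 200 + 22 + 200,
    "classic, screen RAM": 9 + 200,
    "classic, PCG RAM": 9 + 200,
    "classic, character ROM": 9 + 450,
}

for name, ns in budgets_ns.items():
    slack = char_period_ns - ns
    print(f"{name:24s} {ns:4d} ns used, {slack:6.1f} ns spare")
```

Two back-to-back 200ns RAM reads still finish sooner than a single read of the 450ns ROM.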

It’s pretty clear the incredibly slow 2532 is letting the show down for everyone. If we can just get rid of it, we can make some real changes.

Let's use just one 8K RAM for everything. At the start of a character cycle the screen address is presented to the RAM (gated using tri-state buffers). The RAM outputs the screen data. At the half-way point through the character, this data is latched, and wrapped around back to the RAM address lines (using a second tri-state buffer) along with the row address. The data from the RAM is now our video which may be latched into the shift register. CPU accesses to either screen or character use a third set of tri-state buffers for the address, and a transceiver for the data.
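A minimal Python model of that character cycle might look like the following. The two reads follow the description above; where the screen and character data sit inside the RAM (`SCREEN_BASE`, `CHAR_BASE`) and the 16 rows per character are assumptions picked for the example, not taken from the schematic.

```python
ROWS_PER_CHAR = 16          # assumption: 16 scan lines per character
SCREEN_BASE = 0x0000        # hypothetical placement of screen data
CHAR_BASE = 0x0800          # hypothetical placement of character data

def character_cycle(ram, ma, ra):
    """One character period: two reads of the same RAM, returns the pixel byte."""
    # First half: the CRTC screen address drives the RAM.
    latch = ram[SCREEN_BASE + ma]
    # Second half: the latched code plus the row address drive the RAM.
    return ram[CHAR_BASE + latch * ROWS_PER_CHAR + ra]

ram = bytearray(8 * 1024)
ram[SCREEN_BASE + 5] = 0x41                       # character code at screen position 5
ram[CHAR_BASE + 0x41 * ROWS_PER_CHAR + 3] = 0x7E  # row 3 of that character's bitmap
pixels = character_cycle(ram, ma=5, ra=3)         # byte that goes to the shift register
```

The point of the model is that `ram` appears on both lines of the function: one physical chip serves both lookups.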

Note that 8K is more than we ostensibly need for screen RAM (2K), Character RAM (2K), and PCG RAM (2K). Let's use the last 2K for a second "small" font, which will be useful for an 80 x 24 screen. Turns out the Motorola MCM6674 character ROM has just the font we need, which (surprising nobody) is exactly the same as the Microbee 80 column font. To keep our Microbee compatibility, we'll enable this little guy using the MA13 output from the 6545, so by changing screen start address from 0000h to 2000h, we select the small font.
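The font select can be sketched as a simple bit test on the CRTC memory address. This assumes MA13 is bit 13 of that address (which is what makes 2000h the magic start value) and that the low eleven bits, MA0 to MA10, address the 2K of screen data; the function name is made up for the example.

```python
MA13 = 1 << 13              # 2000h
SCREEN_MASK = 0x07FF        # MA0-MA10 address the 2K of screen data

def decode_crtc_address(ma):
    """Split a CRTC memory address into (screen offset, small-font flag)."""
    return ma & SCREEN_MASK, bool(ma & MA13)

big = decode_crtc_address(0x0000)      # start address 0000h: normal font
small = decode_crtc_address(0x2000)    # start address 2000h: small font
```

Both start addresses point at the same screen data; only the font flag changes.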

So here's our schematics for CRT controller, video memory, and video memory access control. There's some complexity, sure, but we can break it down a bit. The first sheet is the CRT controller, along with the video output circuitry and keyboard.

Here IC9 is our CRT Controller. It generates Video Memory addresses (MA0-10 and RA0-3) for the video memory array. The keyboard makes use of a "light pen strobe" input on the CRTC. Some of the video memory addresses are decoded and passed through a keyboard array. If you press a key, as the addresses scan to the key column and row, this match is passed through to the CRTC, which dutifully latches the address and raises a status bit to tell the CPU a key has been pressed.

The DE (Display Enable), HS (Horizontal Sync), VS (Vertical Sync) and Cursor outputs from the CRTC, along with the video bitstream from the memory page, are used to build the actual output to the CRT. HS and VS tell the CRT when to start a new row (HS) or screen (VS). Display enable is used to gate the video output, so we don't put rubbish in the margins, and cursor is a neat signal that can be used to invert the video for a programmable number of rows in a programmable character position. IC16 delays the cursor and display enable signals to line up nicely with the video stream, and IC26 and 37 do the gating and cursor inversion.

Moving on to the video memory page, we have a few things to do. IC8 and 10 gate the CPU address to the video RAM (IC14) and IC21 gates the CPU data bus. This allows us to read and write to the video memory. IC13 and 13 gates the CRTC address to the memory for a screen access, and IC11 does similarly for the row addresses that are used during character access. IC17 allows us to feed the results of the screen lookup back to the RAM for characters, and IC15 serialises the final output to send to the screen. Gating for the CRTC is incredibly easy - it's just done on alternate phases of the character clock. When CCLK is low we do screen address, and when CCLK is high we do character.

Finally we have the glue (no PLDs!) that works out what video address to generate for the various possibilities of CRTC screen, CRTC character RAM (big and little fonts), CRTC PCG, and CPU access for each of these. A couple of gates control read and write (essentially it's all read unless the CPU wants to do a write), and the last bit delays CPU access when the CPU tries to barge in while the CRTC is actively drawing the screen, to ensure the CPU doesn't put garbage on the screen. This is reasonably straightforward - CPU accesses are latched in IC18, causing the CPU to be put into a wait state. Once we're in retrace, the wait is cleared and the CPU finishes doing its thing.

As I said at the beginning, the video circuitry is most of this computer, so you'll be glad to know there's not a lot more to describe.

Here's the CPU and memory. By using a CMOS CPU there's no need for bus transceivers and buffers everywhere (saving a bunch of chips). The EPROM here started life as simply something to load up the fonts into video RAM on boot, but I ended up making it big enough to include Microbee Basic as well, plus added 32K of RAM, so as one board it does everything.

The flip flop is there to enable the ROM at address 0 on reset. Once the boot sequence is complete the ROM is able to page itself out, presenting RAM from 0000h to 7fffh, with the upper ROM from 8000h to efffh.

Next we have some ports. A Z-80 PIO gives us some GPIO, plus bit-banged serial, a speaker, and cassette I/O. Port decoding is simply done with a 74HCT138 and 74HCT139, giving the following port map:

  • 00 to 03: PIO
  • 0B: Boot ROM and Character RAM enable port
  • 0C to 0F: CRTC
  • Plus a few others that aren't used on the board but may be linked to a coreboard, if that's plugged in.
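Seen from software, the port map above boils down to a few range checks. Here's a hedged Python sketch of it; this is the programmer's view only, not a model of how the 74HCT138 and 74HCT139 are actually wired, and the returned labels are invented for the example.

```python
def decode_port(port):
    """Return which on-board device answers an 8-bit I/O port, or None."""
    port &= 0xFF
    if 0x00 <= port <= 0x03:
        return "PIO"
    if port == 0x0B:
        return "BOOT_ROM_CHAR_RAM_ENABLE"
    if 0x0C <= port <= 0x0F:
        return "CRTC"
    return None   # unused on this board, may belong to a coreboard

devices = {p: decode_port(p) for p in range(0x10)}
```

Anything that returns `None` here is either unused or one of the ports left for a coreboard.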

Lastly we have clock and reset. The clock is derived from a 13.5 MHz crystal. This provides the dot clock for the video shift register. It's divided by 4 (3.375 MHz) for the CPU clock, and by 8 (1.6875 MHz) for the character clock. The last eighth of each character clock is used to load the shift register, so this is derived by just ANDing CLK/2, CLK/4 and CLK/8.
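The load pulse is easy to sanity-check in Python by treating the divider as a 3-bit binary counter clocked by the dot clock. That counter model (CLK/2 as bit 0, CLK/4 as bit 1, CLK/8 as bit 2, counting up from zero) is an assumption; the real divider's phase relationships may differ.

```python
DOT_CLOCK_HZ = 13_500_000

def divider_taps(tick):
    """(CLK/2, CLK/4, CLK/8) levels at a given dot-clock tick, as counter bits."""
    return tick & 1, (tick >> 1) & 1, (tick >> 2) & 1

def load_pulse(tick):
    """Shift-register load: AND of the three divider outputs."""
    clk2, clk4, clk8 = divider_taps(tick)
    return bool(clk2 & clk4 & clk8)

load_ticks = [t for t in range(16) if load_pulse(t)]   # two character periods
cpu_clock_hz = DOT_CLOCK_HZ / 4
char_clock_hz = DOT_CLOCK_HZ / 8
```

In this model the AND is true for exactly one dot clock in every eight, and it is the last one of each character period.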

Reset can come from two places - power up or reset switch. In each case it's just done by charging a capacitor and using a 74HC14 as a comparator.

Layout is done in KiCad. 12 thou track and space with beautiful curved traces. It really looks the part.

And after a bit of debugging, we finally have:

As always, here's the design files:
