TLDR version: I am reverse engineering Microbee Basic. My disassembled code is on my Google Drive. It's getting pretty complete, to the point where I can move code around, reassemble, and things still work.
Over the years, I've occasionally delved into the inner workings of the various Microbee Basics, generally for a specific purpose, like wanting to figure out how the cassette data is structured for restoring tapes, or more recently figuring out how the bee reads its keyboard. With the creation of the original SuperPAK coreboard, I had a brief look to see how the PAK command worked, with the aim of seeing if we could work out which PAK ROM we were in from the PAK call (spoiler, it's in HL).
During a recent holiday, I made the decision to upgrade Basic for the Freebee 4MB Pak Cart. I had a few goals, in increasing order of complexity:
- Extend PAK beyond 256 * 8K = 2MB - Pak Cart has 4MB of Flash available, so I wanted to be able to type "PAK 320" and have that work.
- Add commands to erase and copy PAKs. "DELETE D" should erase the whole 512K chip in the PAK D slot. "DELETE 63" should erase PAK 63. "COPYPAK 275 TO 63" should copy the contents of PAK 275 to PAK 63. "COPYPAK D TO A" should copy the whole 512K PAK Cart from D to A.
- Create a CP/M like directory structure on PAKs, so that I could have a PAK block with the directory on that PAK Cart, and be able to list it with "DIR A", run a program (potentially spanning multiple PAKs) with "RUN BLAH.M" where BLAH.M is the name of a program that can be viewed on the currently selected PAK Cart, and of course "LOAD BLAH.B" would check if BLAH.B exists on the currently selected PAK before trying to load from tape.
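The numbers behind the first two goals can be sketched in a few lines of Python. This is only arithmetic on the sizes quoted above (8K PAKs, 512K chips, 4MB of Flash). The function name locate_pak is made up for illustration, and the idea that slot letters run A, B, C... in order of Flash address is an assumption, not something taken from the Pak Cart design.

```python
PAK_SIZE = 8 * 1024             # one PAK is an 8K block
CHIP_SIZE = 512 * 1024          # one slot / chip is 512K
FLASH_SIZE = 4 * 1024 * 1024    # total Flash on the Pak Cart

PAKS_PER_CHIP = CHIP_SIZE // PAK_SIZE   # 64 PAKs in each 512K chip
TOTAL_PAKS = FLASH_SIZE // PAK_SIZE     # 512 PAKs, so a PAK number no longer fits in one byte


def locate_pak(pak):
    """Return (slot_letter, pak_within_slot, flash_offset) for a PAK number.

    ASSUMPTION: slots are lettered A, B, C... in order of Flash address.
    """
    if not 0 <= pak < TOTAL_PAKS:
        raise ValueError("PAK number out of range: %d" % pak)
    chip, within = divmod(pak, PAKS_PER_CHIP)
    return chr(ord("A") + chip), within, pak * PAK_SIZE


if __name__ == "__main__":
    for pak in (63, 275, 320):
        print(pak, locate_pak(pak))
```

Under that lettering assumption, PAK 63 is the last PAK in slot A and PAK 320 is the first PAK in slot F.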
So as you can see it's getting increasingly ambitious, and to get things working we need to both find space in the basic ROMs to store our code, plus figure out enough of how the basic works that we can add our routines.
Finding the space is a doddle. Basic 5.29e, the premium BASIC, uses a bank select scheme to add a whole 8K ROM to the usual 16K basic. Examining this extended ROM shows that only 2K is used for the premium graphics routines, leaving us with 6K to play with. In fact the way that the ROM is implemented on the FreeBee Pak Cart coreboard, I can have an arbitrary dividing line between code that stays put (ROM B) and code that banks in and out (ROM A and C), as I just use a 32K EPROM with a lot of duplication. So I could squeeze all the staying put code into a 2K chunk, for example, and have 2 * 14K banks, for a total of 30K (2 * 14K + 2K) of available code space.
So now onto figuring out how it works so that I can make modifications. This is done through disassembly, using every possible hint to first figure out what's code and what's data, then what routines do what in the code, and then working our way through all the routines to nut them out. Along the way we correct the disassembly stuff-ups where it's interpreted data as code, and give meaningful names to routines and meaningful labels for jumps, with comments. At each step we make sure our increasingly commented disassembly is still correct by assembling it and performing a binary diff against the original ROM.
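Any binary compare tool will do for that last check. As a minimal sketch of the idea, here is a Python version that reports where the reassembled ROM first diverges from the original. The file names basic.bin and rebuilt.bin are placeholders, not the names used in my build.

```python
def first_mismatch(original, rebuilt):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return offset
    if len(original) != len(rebuilt):
        # One image is a prefix of the other: they differ where the shorter ends.
        return min(len(original), len(rebuilt))
    return None


def compare_files(original_path, rebuilt_path):
    """Compare two ROM images on disk and print a one-line verdict."""
    with open(original_path, "rb") as f:
        original = f.read()
    with open(rebuilt_path, "rb") as f:
        rebuilt = f.read()
    offset = first_mismatch(original, rebuilt)
    if offset is None:
        print("Identical, %d bytes" % len(original))
    else:
        print("First difference at file offset %04Xh" % offset)
    return offset


if __name__ == "__main__":
    import os
    # Placeholder file names; only runs if both files are present.
    if os.path.exists("basic.bin") and os.path.exists("rebuilt.bin"):
        compare_files("basic.bin", "rebuilt.bin")
```

Knowing the first differing offset is usually enough, since a botched edit tends to shift everything after it.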
So let's start with the binary code for Basic 5.29e. It's a 24K file, stored as BASIC A, in memory from 8000h to 9FFFh when LV5 is 0 (i.e. at boot), followed by BASIC B, always in memory from A000h to BFFFh, followed by BASIC C, in memory from 8000h to 9FFFh when LV5 is 1. So with our disassembler (I use Z80DASM, which is part of Z80ASM), we type:
z80dasm --origin=0x8000 --labels --output=basic.asm basic.bin
This gives us a very long file that looks like this (very first section):
; z80dasm 1.2.0
; command line: z80dasm --origin=0x08000 --labels --output=basic.asm basic.bin
org 08000h
l8000h:
jp l84c6h
jp l84c6h
jp la3e3h
jp la3cbh
jp la626h
jp lacafh
l8012h:
jp lab6dh
sub_8015h:
jp laae6h
sub_8018h:
jp lab26h
jp lab17h
sub_801eh:
jp l83d7h
l8021h:
jp l8517h
sub_8024h:
jp lad98h
l8027h:
jp laf9eh
jp la801h
jp la7ceh
jp lb035h
jp lb040h
jp lb04ch
jp lb057h
jp lb0a8h
jp l80ebh
jp l80bch
jp l809bh
jp l845fh
jp l8433h
jp l83c1h
The first section above is a "jump table". This is common in code of the era. You want to be able to advertise functions within your code, but you know that when you edit and reassemble the code things will move, so you start with a list of jumps to important functions. That way people can call the jump, which in turn goes to the function, which can then return to their code. While the function itself might move around, the jump never does.
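A jump table is also easy to pick out of the raw binary, because each entry is the same three bytes: the Z80 opcode C3h for "jp nn", then the target address low byte first. As a rough sketch, this Python helper decodes such a table. The function name is made up, and the sample bytes are just the encoding of the first three jumps in the listing above rather than a dump taken from the ROM file.

```python
JP_OPCODE = 0xC3  # Z80 "jp nn": opcode, then target low byte, then high byte


def decode_jump_table(data, origin):
    """Decode consecutive 3-byte 'jp nn' entries, stopping at the first non-JP byte.

    Returns a list of (entry_address, target_address) pairs.
    """
    entries = []
    for i in range(0, len(data) - 2, 3):
        if data[i] != JP_OPCODE:
            break
        target = data[i + 1] | (data[i + 2] << 8)   # little-endian address
        entries.append((origin + i, target))
    return entries


if __name__ == "__main__":
    # jp 84C6h, jp 84C6h, jp A3E3h as they would be encoded at 8000h
    sample = bytes.fromhex("C3C684 C3C684 C3E3A3")
    for entry, target in decode_jump_table(sample, 0x8000):
        print("%04Xh: jp %04Xh" % (entry, target))
```

The entry addresses (8000h, 8003h, 8006h...) are the ones that stay fixed between versions, while the targets are free to move.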
So now we become detectives. There are some memory maps available that give us a bit of a hint about what's where in the basic. There's a reasonably good one in "Wildcards" by Ash, Burt, and Nallawalla. Let's use their insights to name the stuff in the jump table and add comments to everything:
;#############################################################################################
; Start of BASIC ROM A
;#############################################################################################
org 08000h
jp RESETTOHERE ; Start of BASIC
jp RESETTOHERE ; BASIC warm start
jp WAITMBEEKEY ; DGOS Wait for keyboard input - A register
jp GETKEYIFANY ; DGOS Scan keyboard
jp MBEEVDUFROMB ; DGOS Display character in B register
jp GIVEPIOARM ; DGOS Give PIO an arm
l8012h: jp CASSBYTEIN ; DGOS Get byte from cassette in A
sub_8015h: jp CASSBLOCKIN ; DGOS Get block from cassette
sub_8018h: jp CASSBYTEOUT ; DGOS Cassette byte out A
jp CASSBLOCKOUT ; DGOS Cassette block out
sub_801eh: jp RUNPROG ; Auto-execute address for saving BASIC program
l8021h: jp BASWARMSTART ; Warm start for restoring Reset jump
sub_8024h: jp HIRESINIT ; HIRES initialisation
l8027h: jp LORESINIT ; LORES initialisation
jp SETINVERSE ; INVERSE initialisation
jp SETUNDERLIN ; UNDERLINE initialisation
jp SETDOT ; SET dot: X = HL, Y = DE
jp RESETDOT ; RESET dot returns Z if OK
jp INVERTDOT ; INVERT dot
jp TESTDOT ; Test for dot - NZ if set/error
jp PLOTLINE ; PLOT a line
jp GETCHAR ; Redirected input A
jp PUTCHAR ; Redirected output A
jp LPUTCHAR ; Redirected print output A
jp WRMSTRCLRVAR ; Jump to BASIC with CLEAR
jp READYMODE ; Jump to BASIC command level
jp JMPBASFRPAK ; Jump to BASIC after NET or PAK
Every time we change a label, we do a global find and replace on the label, that way the routine gets labelled, as does every single call to the routine in the code. We use a method for labels that makes them obvious. I like to use all caps (shouty much), and I'm not afraid to use labels up to 16 characters. Yes, it means I have to hit TAB a lot, but so be it. Readability is _everything_.
Talking about readability, let's put lots of super-obvious breaks in the code. I like the ;############ sequence going all the way across the line. It makes a very obvious divider.
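Most editors will do the global find and replace on labels directly. If you want to script it, the one thing to watch is that you only replace whole labels, not fragments of longer ones. Here is a small sketch in Python using word boundaries. The function name rename_label is hypothetical and not part of any tool mentioned here.

```python
import re


def rename_label(source, old, new):
    """Replace every whole-word occurrence of a label in assembly source text."""
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    # A function is used as the replacement so that 'new' is inserted literally.
    return pattern.sub(lambda match: new, source)


if __name__ == "__main__":
    listing = "l84d0h:\n\tld a,(hl)\n\tjr l84d0h\n"
    print(rename_label(listing, "l84d0h", "PORTINITLOOP"))
```

Because the label definition and every reference are renamed in one pass, the listing stays consistent and still assembles.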
So the obvious next place to look is the first jump, where BASIC goes on power-up:
RESETTOHERE:
di
l84c7h:
ld sp,00080h
call sub_a3c9h
ld hl,lba6ah
l84d0h:
ld a,(hl)
or a
jr z,l84dch
ld b,a
inc hl
ld c,(hl)
inc hl
otir
jr l84d0h
Cool, our first learning is already there from the find and replace. First thing the routine does is disable interrupts, then initialise the stack pointer to a very low address in memory. This kinda makes sense - we don't know how much memory the machine has on power up, and we know (from the memory map) that memory from 0000h to 0080h is a scratch pad, so we don't mind trashing that with our stack. Let's follow the first call to see what that does:
sub_a3c9h:
reti
It's just a return from interrupt. This ensures that if there's a device that's triggered an interrupt prior to the reset, it's cleared so it's in a known state. As before, every time we learn something, we comment and label:
;#############################################################################################
; RESETTOHERE
; input: None.
; output: Performs a complete initialisation of the system.
; affects: Everything.
;#############################################################################################
RESETTOHERE: di ; Disable interrupts
ld sp,00080h ; We don't know how much RAM we have yet, so
; Initialise stack pointer to low memory
call RETURNINT ; Perform a reti
ld hl,lba6ah
l84d0h: ld a,(hl)
or a
jr z,l84dch
ld b,a
inc hl
ld c,(hl)
inc hl
otir
jr l84d0h
Now the next bit is a loop, with an exit in the jr z,l84dch bit. Looks like we get a byte from a table at lba6ah, and if that's not zero, we use it as a counter in B, then we get the next byte from the table into C. The OTIR then outputs B bytes, starting at the address in HL, to the port addressed by C, so this routine is clearly used for initialising devices from a table. These insights go into the code. First the table at lba6ah:
lba6ah:
dec b
ld bc,00f80h
rla
add a,e
add a,e
dec b
inc bc
adc a,d
rst 38h
sbc a,c
or a
ld a,a
nop
nop
As a disassembly, this looks really nonsensical, as it's not instructions, it's data. So let's go back to the binary ROM and open this bit up in a HEX editor to see what it is, noting that the HEX editor sees our code as starting at 0000h not 8000h, so a bit of address math is needed to find the section:
3A60: 08 FF 7F 00 00 00 00 00 00 C9 05 01 80 0F 17 83
3A70: 83 05 03 8A FF 99 B7 7F 00 00 08 B6 C8 A3 C8 A3
Our first byte, which is used as a counter, is at BA6Ah. This is just 5. Next byte is a port address, Port 01. So we're sending five bytes (80 0F 17 83 83) to port 01. Going to the bee port map, Port 01 is the control register for PIO port A. So this code looks like it's initialising PIO port A. If we download the instruction manual for the PIO, we can follow it through. The next bit does much the same for Port 03, which is PIO port B. So let's remove the chunk of meaningless code at BA6Ah and substitute our data:
;#############################################################################################
; PORTINITDATA - data used to initialise PIO
;#############################################################################################
PORTINITDATA: db 5, PIOACONTROL ; Initialise PIO A with five bytes
db 080h ; Set interrupt vector 080h
db 00Fh ; Set port to output mode
db 017h, 083h, 083h ; Configure interrupt mask
db 5, PIOBCONTROL ; Initialise PIO B with five bytes
db 08Ah ; Set interrupt vector 08Ah
db 0FFh ; Set port to control mode
db 10011001b ; Set port b bit 0 (TAPEIN) to input
; Set port b bit 1 (TAPEOUT) to output
; Set port b bit 2 (RS232 CLK) to output
; Set port b bit 3 (RS232 CTS) to input
; Set port b bit 4 (RS232 RXD) to input
; Set port b bit 5 (RS232 TXD) to output
; Set port b bit 6 (SPEAKER) to output
; Set port b bit 7 (VSYNC) to input
db 0B7h, 01111111b ; Set interrupt mask & enable interrupts
; for bit transitions on bit 7 (VSYNC)
db 0, 0 ; Signify end of PORTINITDATA
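To double check the reading of the table, here is a Python sketch that walks it the same way the init loop does: take a count, take a port, send that many bytes, and stop when the count is zero. The function name walk_port_init is hypothetical, and the bytes are copied from the hex dump above starting at 3A6Ah.

```python
def walk_port_init(table):
    """Mimic the port init loop over a table of count, port, data... records.

    Returns a list of (port, [data bytes]) in the order they would be sent.
    A zero count ends the table, as it does for the jr z exit in the ROM.
    """
    writes = []
    i = 0
    while table[i] != 0:
        count = table[i]                      # goes into B for the otir
        port = table[i + 1]                   # goes into C for the otir
        data = table[i + 2:i + 2 + count]     # the bytes otir sends to the port
        writes.append((port, list(data)))
        i += 2 + count
    return writes


# Bytes from the hex dump, file offset 3A6Ah onwards
TABLE = bytes.fromhex("05 01 80 0F 17 83 83 05 03 8A FF 99 B7 7F 00 00")

if __name__ == "__main__":
    for port, data in walk_port_init(TABLE):
        print("port %02Xh <- %s" % (port, " ".join("%02X" % b for b in data)))
```

Run on those bytes, it produces exactly two records, five bytes to port 01h and five bytes to port 03h, which matches the two db groups in PORTINITDATA.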
Note I've started labelling the ports as something meaningful - PIOACONTROL rather than 001h. So somewhere up the top of our code we need to equate our label for PIOACONTROL to 001h. Let's put everything we know about the bee hardware ports into a file HARDWARE.asm, and include that:
;#############################################################################################
; Hardware Constants - memory organisation
;#############################################################################################
SCRATCH: equ 00000h ; Basic Scratch Area
BASIC: equ 08000h ; Start of Basic ROM
;#############################################################################################
; Hardware Constants - ports
;#############################################################################################
PIOADATA: equ 000h ; PIO Port A data
PIOACONTROL: equ 001h ; PIO Port A control
PIOBDATA: equ 002h ; PIO Port B data
PIOBCONTROL: equ 003h ; PIO Port B control
We'll add to this file as we learn more. Now that we've nutted out our table, we comment the code that uses it:
;#############################################################################################
; RESETTOHERE
; input: None.
; output: Performs a complete initialisation of the system.
; affects: Everything.
;#############################################################################################
RESETTOHERE: di ; Disable interrupts
ld sp,CTCV1 ; We don't know how much RAM we have yet, so
; Initialise stack pointer to low memory
call RETURNINT ; Perform a reti
ld hl,PORTINITDATA ; Point to port initialisation data
PORTINITLOOP: ld a,(hl) ; Get counter for otir
or a ; Set flags
jr z,PORTINITFIN ; Zero - finished initialising ports
ld b,a ; Load byte counter
inc hl
ld c,(hl) ; Get address of IO port from table
inc hl ; point to data that gets sent to IO port
otir ; send it
jr PORTINITLOOP ; go get data for next device
Yay! We've worked our first bit out. This is essentially the process we follow for the whole ROM. Following calls and jumps, figuring out what's data and what's code, and then commenting and labelling so that it makes sense to us, not just to the CPU.
I've been doing this for maybe two months now pretty intensively. I've used a lot of great resources to figure stuff out: Wildcards, the Microbee Technical manual, and the Microbee Basic Software Hacker's Handbook, by Nigel Cottrill. I have commented maybe 70% of the code, and made some amazing insights along the way. For example, I've learned that most of the code, the core routines for Basic, are common to the Super-80, another machine that was developed at around the same time. That's because they both use "BASIC ETC" as a base, and they both just added their own IO routines to BASIC ETC and went from there.
It's also pretty clear that later writers of code did not have source code from earlier versions, as they became increasingly afraid to move code. The messiest bit was when they went from version 5.00 to 5.11, adding code to do colour. This code is an absolute shambles. The author didn't understand at all how the basic was structured. Rather than adding a keyword for colour alongside all the other keywords, they instead mangled the routines that tokenise the input line, searching for "COLO" in the line and substituting POKE commands inline. So then they needed to also mangle the code that detokenises for list so it does the opposite. Again, never moving code, instead having sudden jumps or calls to patches.
So as I mentioned at the top, I have a really good disassembly going, and I'm sharing it. You can assemble this code and it will give an exact byte-for-byte duplicate of Basic 5.29e. I've also got a highly modified version that I'm adding lots of stuff to for my Pak Cart Freebee. 9600 baud serial, 2400 baud cassette, more PAK, cleaned up colour commands, and much of the spaghetti untangled to free up masses of space. Here it is.