Wednesday, May 27, 2015

Using SDCC with the CPUville system

As much fun as assembly coding is, sometimes (most times) I want to be lazy and use C.

SDCC knows how to emit Z80 code, so that looked like a good place to start. It is easy to download and install, and until you start trying to tell the linker how to lay out memory it is pretty easy to use too.
There are two main things you need to configure to make C code work on a particular computer, assuming the compiler can generate code for the processor. You need some start up code to do any necessary set up for the C runtime and to call your main function, and you need to get the linker to put things in the correct places in memory.

Because I'm still using the CPUville monitor, a simple runtime environment is already up and running - the stack is initialised and the UART is configured and working. So all I want the C runtime start up code to do is initialise global variables, call main, then go back to the monitor when main returns.

SDCC comes with the source code for the default Z80 start up code in $SDCC_HOME/lib/src/z80/crt0.s. The good news is that there isn't much of it and the code is mostly obvious. What is not obvious is what all the .area directives do and how the global variable initialisation works.

Here is a rundown on the default crt0.s code.


;--------------------------------------------------------------------------
; crt0.s - Generic crt0.s for a Z80
;
; Copyright (C) 2000, Michael Hope
;
; This library is free software; you can redistribute it and/or modify it
; under the terms of the GNU General Public License as published by the
; Free Software Foundation; either version 2, or (at your option) any
; later version.
;
; This library is distributed in the hope that it will be useful,
; but WITHOUT ANY WARRANTY; without even the implied warranty of
; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
; GNU General Public License for more details.
;
; You should have received a copy of the GNU General Public License
; along with this library; see the file COPYING. If not, write to the
; Free Software Foundation, 51 Franklin Street, Fifth Floor, Boston,
; MA 02110-1301, USA.
;
; As a special exception, if you link this library with other files,
; some of which are compiled with SDCC, to produce an executable,
; this library does not by itself cause the resulting executable to
; be covered by the GNU General Public License. This exception does
; not however invalidate any other reasons why the executable file
; might be covered by the GNU General Public License.
;--------------------------------------------------------------------------

.module crt0
.globl _main

.area _HEADER (ABS)
;; Reset vector
.org 0
jp init

.org 0x08
reti
.org 0x10
reti
.org 0x18
reti
.org 0x20
reti
.org 0x28
reti
.org 0x30
reti
.org 0x38
reti

.org 0x100
init:
;; Set stack pointer directly above top of memory.
ld sp,#0x0000

;; Initialise global variables
call gsinit
call _main
jp _exit

;; Ordering of segments for the linker.
.area _HOME
.area _CODE
.area _INITIALIZER
.area _GSINIT
.area _GSFINAL

.area _DATA
.area _INITIALIZED
.area _BSEG
.area _BSS
.area _HEAP

.area _CODE
__clock::
ld a,#2
rst 0x08
ret

_exit::
;; Exit - special code to the emulator
ld a,#0
rst 0x08
1$:
halt
jr 1$

.area _GSINIT
gsinit::
ld bc, #l__INITIALIZER
ld a, b
or a, c
jr Z, gsinit_next
ld de, #s__INITIALIZED
ld hl, #s__INITIALIZER
ldir
gsinit_next:

.area _GSFINAL
ret



I can't explain much about the .area directives because I don't understand them very well. But I've learned the _HEADER area can be fixed in a specific memory location by virtue of the ABS flag. _CODE areas, of which there will be multiple in a set of object code files, are by default relocatable and are all appended into one chunk. Importantly, the _INITIALIZER and _INITIALIZED areas behave like _CODE areas.

The first thing of interest is the _HEADER (ABS) area. It starts at 0x0000, the power on value for the program counter. A jump instruction to location 0x0100 takes execution beyond the RST/IRQ handlers to the real start up code. The RST/IRQ handlers are next. The Z80 defines where RST instructions will jump and where IRQ handlers for some interrupt modes will be, and stub handlers are defined in these locations. Moving ahead to 0x0100 there is code to initialise the stack and global variables, call main, then jump to an exit routine when main returns. This is the end of the fixed memory location start up code.

My best guess for the next batch of .area directives is that they show the linker what order to put the named areas in memory. The comment is a clue, obviously, but disassembling the final binary file was more illuminating. It still isn't entirely clear to me though, because I don't understand what effect defining _CODE multiple times has.

Moving beyond these placeholder area definitions there is an actual _CODE area which has two functions, __clock and _exit, followed by gsinit in the _GSINIT area. These three are global labels by virtue of having two colons after them. __clock is obviously system dependent and needs no further consideration. _exit is also system dependent, but it is called once the C program stops so it is important in concept, even if the current implementation is not useful.

gsinit initialises the global variables and was a real puzzle. l__INITIALIZER, s__INITIALIZER, and s__INITIALIZED were obviously related to the _INITIALIZED and _INITIALIZER areas, but I could not see where they were defined or what they represented. The answer is in the assembler and linker documentation (which is not installed with SDCC), and in some messages at http://permalink.gmane.org/gmane.comp.compilers.sdcc.user/4441 showing that these symbols need to be declared before assembly will work.

It turns out that areas get l_ and s_ symbols generated for them. l_<AREA NAME> is the length of the area, and s_<AREA NAME> is its location in memory. These must be finalised by the linker because that info isn't known during assembly.

The gsinit code becomes quite simple once these symbols are understood. It uses a Z80 block copy to copy the contents of the _INITIALIZER area to the _INITIALIZED area.
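For anyone who reads C more easily than Z80 assembly, the routine can be sketched roughly as below. This is only an illustration: gsinit_sketch and its parameter names are hypothetical stand-ins for the linker symbols, and the real start up code remains the assembly shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C equivalent of gsinit. The parameters stand in for the
 * linker-generated symbols:
 *   initialized     ~ s__INITIALIZED (where the variables live at run time)
 *   initializer     ~ s__INITIALIZER (where the initial values are stored)
 *   initializer_len ~ l__INITIALIZER (number of bytes to copy)
 */
void gsinit_sketch(unsigned char *initialized,
                   const unsigned char *initializer,
                   size_t initializer_len)
{
    /* ld a,b / or a,c / jr Z: skip the copy when the length is zero.
     * LDIR with BC = 0 would copy 65536 bytes, so the check matters. */
    if (initializer_len == 0) {
        return;
    }
    assert(initialized != NULL && initializer != NULL);

    /* ldir: copy BC bytes from (HL) to (DE). */
    memcpy(initialized, initializer, initializer_len);
}
```

The zero-length test is the only subtle part; everything else is a plain block copy.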

The remaining puzzles were how this works when many object files with their own _INITIALIZER and _INITIALIZED areas are linked together, and why you need to copy a contiguous chunk of memory holding the initial values to another chunk where they are actually used.

The answer to the first question is that areas with the same name are concatenated in memory if they are not marked as absolute. So when the linking is finished the _INITIALIZER and _INITIALIZED areas from all the object files are merged together into single _INITIALIZER and _INITIALIZED areas, and the l_ and s_ symbols are defined based upon these merged areas.
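The concatenation can be modelled in a few lines of C. This is a toy model, not the linker's actual algorithm: struct fragment and place_area are hypothetical names, and the only behaviour it assumes is the back to back placement described above.

```c
#include <assert.h>
#include <stddef.h>

/* One object file's contribution to a named area (hypothetical type). */
struct fragment {
    unsigned length;  /* bytes this object file places in the area */
    unsigned offset;  /* filled in: address assigned to this fragment */
};

/* Place the fragments back to back starting at `base`.
 * `base` plays the role of s_<AREA NAME>; the return value is the
 * total length, which plays the role of l_<AREA NAME>. */
unsigned place_area(unsigned base, struct fragment *frags, size_t count)
{
    unsigned next = base;

    assert(frags != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        frags[i].offset = next;    /* this fragment starts where the last ended */
        next += frags[i].length;
    }
    return next - base;
}
```

With fragments of 2, 5, 0 and 3 bytes placed at 0x0900, the merged area is 10 bytes long, which is why a single block copy in gsinit covers every object file's initial values.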

I don't know the answer to why the values have to be copied from one place to another.
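Whatever the reason, it helps to see which C declarations feed the copy. The fragment below is a hypothetical example with made-up names, and the area assignments in the comments are my understanding of SDCC's Z80 output rather than something taken from its documentation; the linker's .map file is the place to check them.

```c
/* Has an initial value: as I understand it, the 42 is stored in
 * _INITIALIZER and copied by gsinit to the variable's home in
 * _INITIALIZED before main runs. */
int boot_count = 42;

/* No initial value: expected to land in _DATA, which the gsinit
 * block copy does not touch. */
int scratch;

/* Uses both variables so that each one is actually referenced. */
int bump_boot_count(void)
{
    scratch = boot_count;      /* remember the old value */
    boot_count = scratch + 1;  /* and store the new one */
    return boot_count;
}
```

If gsinit were skipped, boot_count would start with whatever happened to be in RAM instead of 42.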


When it comes to the CPUville system and monitor, a few of the features of the default start up code are either unnecessary, not possible, or should be changed:
  • The stack is already working.

  • There is 2K of ROM starting at 0x0000 so any loaded code must run from at least 0x0800, the start of RAM.

  • Due to the ROM, RST/IRQ handlers can't be defined.

  • _exit should return to the CPUville monitor.


I won't go through all the experiments, but instead show and explain the final code.

;--------------------------------------------------------------------------
; crt0.s - crt0.s for a CPUville Z80 kit
;
; Copyright (C) 2015, David Taylor
;
; Based upon the sdcc distribution version (C)2000 Michael Hope.
;
; This library is free software; you can redistribute it and/or modify it
; under the terms of the GNU General Public License as published by the
; Free Software Foundation; either version 2, or (at your option) any
; later version.
;
; This library is distributed in the hope that it will be useful,
; but WITHOUT ANY WARRANTY; without even the implied warranty of
; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
; GNU General Public License for more details.
;
; You should have received a copy of the GNU General Public License
; along with this library; see the file COPYING. If not, write to the
; Free Software Foundation, 51 Franklin Street, Fifth Floor, Boston,
; MA 02110-1301, USA.
;
; As a special exception, if you link this library with other files,
; some of which are compiled with SDCC, to produce an executable,
; this library does not by itself cause the resulting executable to
; be covered by the GNU General Public License. This exception does
; not however invalidate any other reasons why the executable file
; might be covered by the GNU General Public License.
;--------------------------------------------------------------------------

monitor_warm_start = 0x04c9

.module crt0
.globl _main

.globl l__INITIALIZER
.globl s__INITIALIZED
.globl s__INITIALIZER

.area _HEADER (ABS)

; The CPUVille kit has 2K ROM, with RAM starting at 0x0800. That is
; a good place for the start-up code to go. The _CODE area can go
; right after it at 0x0809.
.org 0x800
init:

; Initialise global variables
call gsinit
; Call main
call _main
; Jump to the CPUville monitor warm start.
jp monitor_warm_start

;; Ordering of segments for the linker.
.area _HOME
.area _CODE
.area _INITIALIZER
.area _GSINIT
.area _GSFINAL

.area _DATA
.area _INITIALIZED
.area _BSEG
.area _BSS
.area _HEAP

.area _CODE

.area _GSINIT
gsinit::
ld bc, #l__INITIALIZER
ld a, b
or a, c
jr Z, gsinit_next
ld de, #s__INITIALIZED
ld hl, #s__INITIALIZER
ldir
gsinit_next:

.area _GSFINAL
ret

First up, those l_ and s_ symbols are declared. It won't assemble without these declarations.

Now the initial location of the absolute _HEADER area is given as 0x0800, the start of RAM in the CPUville system. This is where the initial instructions should go.

The next few lines simply call the global variable initialisation code, call main, and then jump to the monitor warm start routine, which allows you to enter monitor commands again. It is important to jump back to the monitor rather than return, because the monitor jumps to user code rather than calling it. The only real difference from the standard start up code is that the CPUville start up code exits back to the monitor from code in the _HEADER area rather than jumping out to the _CODE area and exiting from there.

The area directives are all the same because I didn't dare change them. It is working so far, but I've not tried any memory allocation etc so I'm not sure what will happen then.

Finally the global variable initialisation code is the same as in the default start up code.

So that is the start up code. Assemble it with the command sdasz80 -o crt0.s and you end up with crt0.rel which is ready to link onto C programs.

To build a program I have to compile the .c files into .rel files, then link them together with crt0.rel using some arguments so the linker knows where the code will live in memory. Like this:

sdcc -mz80 -c test.c
sdcc -mz80 --code-loc 0x0809 --data-loc 0 --no-std-crt0 crt0.rel test.rel
objcopy -Iihex -Obinary crt0.ihx test.bin


The start up code is 9 bytes long, and the --code-loc 0x0809 argument tells the linker the _CODE area should start there, immediately after the 9 byte _HEADER area. As shown above the _HEADER area is absolute and located at 0x0800 by the .org directive in the start up source code.
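The 9 byte figure comes straight from the Z80 encodings: call nn and jp nn are each one opcode byte plus a 16-bit address. The arithmetic is written out below as a small C sketch; the enum and function names are only for illustration.

```c
#include <assert.h>

enum {
    HEADER_ORG   = 0x0800, /* .org in the CPUville crt0.s */
    CALL_NN_SIZE = 3,      /* call gsinit / call _main: opcode + 2 address bytes */
    JP_NN_SIZE   = 3       /* jp monitor_warm_start: opcode + 2 address bytes */
};

/* Size of the _HEADER area: two calls followed by one jump. */
unsigned header_size(void)
{
    return 2 * CALL_NN_SIZE + JP_NN_SIZE;
}

/* First free address after the start up code, i.e. the --code-loc value. */
unsigned code_loc(void)
{
    unsigned loc = HEADER_ORG + header_size();
    assert(loc > HEADER_ORG);
    return loc;
}
```

If instructions are added to or removed from the _HEADER area, the --code-loc value has to be recalculated the same way.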

It also seems to be important that crt0.rel is listed first on the linking command line, as it defines the order of the areas in memory with all those .area directives. An unfortunate side effect of this is that the output hex file is named crt0.ihx, but that is only annoying and does not overwrite anything.

Finally, use objcopy or a tool such as hex2bin to create a binary file from the hex file. This binary file can be loaded onto the CPUville system via the monitor bload command and the program is ready to run!
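To show what the conversion step involves, here is a rough C sketch that decodes a single Intel HEX data record into a memory image. It is not how objcopy is implemented; ihx_load_record and its parameters are hypothetical, and it assumes the image starts at the lowest load address (0x0800 here).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Two hex digits at s as one byte, or -1 on error (including end of string). */
static int hex_byte(const char *s)
{
    int hi = hex_digit(s[0]);
    if (hi < 0) return -1;
    int lo = hex_digit(s[1]);
    if (lo < 0) return -1;
    return hi * 16 + lo;
}

/* Decode one record of the form :LLAAAATT<data>CC into `image`, where
 * image[0] corresponds to address `base`. Returns the number of data
 * bytes stored, 0 for non-data record types, or -1 if the record is
 * malformed, fails its checksum, or falls outside the image. */
int ihx_load_record(const char *line, unsigned base,
                    unsigned char *image, size_t image_size)
{
    unsigned char data[255];
    int field[4];                /* length, address high, address low, type */
    unsigned sum = 0;

    assert(line != NULL && image != NULL);
    if (line[0] != ':') return -1;

    for (int i = 0; i < 4; i++) {
        field[i] = hex_byte(line + 1 + 2 * i);
        if (field[i] < 0) return -1;
        sum += (unsigned)field[i];
    }
    int len = field[0];
    unsigned addr = (unsigned)field[1] * 256u + (unsigned)field[2];

    for (int i = 0; i < len; i++) {
        int b = hex_byte(line + 9 + 2 * i);
        if (b < 0) return -1;
        data[i] = (unsigned char)b;
        sum += (unsigned)b;
    }

    /* All bytes of a valid record, checksum included, sum to 0 mod 256. */
    int checksum = hex_byte(line + 9 + 2 * len);
    if (checksum < 0) return -1;
    if (((sum + (unsigned)checksum) & 0xFFu) != 0) return -1;

    if (field[3] != 0) return 0;                 /* not a data record */
    if (addr < base) return -1;
    if ((size_t)(addr - base) + (size_t)len > image_size) return -1;

    memcpy(image + (addr - base), data, (size_t)len);
    return len;
}
```

For example, the record :03080000C3C90465 places the three bytes C3 C9 04 (a jp to 0x04c9) at 0x0800, which would become the first three bytes of the binary file.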

Next time, wrapping some of the monitor routines so they can be called from C and the stdio functions.

1 comment:

bill rowe said...

David: Thanks for this. It was materially useful in my getting SDCC started for another Z80 computer. I have a problem though with the data segment. My system is all RAM starting at location 0x8000.

I specify code location as 8007 to allow for a 7 byte init segment. I didn't specify data location hoping it would follow my code but it defaults to 8000 so initialized data overwrites my code! Ideally I'd like to end up with
init
global data
code

but I don't know what address the code would end up at. Any thoughts?