By Pierre Sarrazin (sarrazip@sarrazip.com)
Date of this manual: 2025-01-03
Copyright © 2003-2025
http://sarrazip.com/dev/cmoc.html
Distributed under the GNU General Public License, version 3 or later (see the License section).
Version of CMOC covered by this manual: 0.1.90
CMOC is a Unix-based program that compiles a C-like language, generates Motorola 6809 assembly language programs and assembles them into executables for the Color Computer's Disk Basic environment. It can also target the Dragon computer, the Vectrex video game console, the Thomson MO and TO computers, and the OS-9 and FLEX operating systems.
The efficiency of the generated machine code is modest, but that code will be faster than the equivalent interpreted Color Basic program. This was the initial goal of the CMOC project.
CMOC itself is written in C++ for a Unix-like system. The source code is somewhat complicated because of the anarchic development process that gave rise to CMOC (one of the classes is named Contraption). Years ago, development was able to take off because having a working product had higher priority than having a clean but theoretical design.
float
and double
arithmetic for a target other than the CoCo Disk Basic
and DragonDOS environments.
(CMOC generates calls to the floating point routines of those Basic interpreters.)
Double-precision floating-point
arithmetic (the double
keyword is accepted but is an alias for float
).
64-bit arithmetic (no long long
type).
The volatile
keyword. (It is accepted but ignored,
and using it causes a warning to be issued.)
Bit fields. The TYPE NAME : BITS notation is actually accepted by the compiler, since version 0.1.52, but the field is allocated with the given TYPE, regardless of the number of bits specified after the colon.
Typedefs local to a function (global typedefs are supported).
Redefining a typedef name, even with an identical definition (e.g.,
typedef int INT; typedef int INT;
).
Structs local to a function (global structs are supported).
Indirection of a pointer to a struct used as an r-value (e.g., *ptr
alone).
The l-value case is supported, e.g., (*ptr).field
.
The register
keyword is accepted but ignored.
A continue
statement in a switch() body.
An expression of type float or double as an argument of a switch().
Implementing Duff's device in a switch().
Single-precision floating-point
arithmetic under the CoCo Disk Basic and (since version 0.1.79) DragonDOS
environments (the double
keyword is accepted but is an alias for float
).
8-, 16- and 32-bit arithmetic. Type char
is 8 bits, short
and int
are 16 bits and long
is 32 bits. Each of these types can be signed
or unsigned
.
Pointers, including pointers to pointers, etc.
Structs, including a struct in a struct. Struct initializers. A struct must be declared at the global level. A reference to an undefined struct is accepted if the declaration or statement only refers to a pointer to such a struct.
Anonymous structs.
Assigning a struct value to another (e.g., struct S a, b; a = b;
).
Initializing a struct from another one is also supported (e.g.,
struct S a; struct S b = a;
).
Passing a struct or union by value to a function (since version 0.1.40). (If the size of the struct or union is 1 byte, then it is passed as a 16-bit word whose least significant byte is the struct or union.)
Returning a struct or union by value from a function (since version 0.1.40).
Declaring a variable after the function's code has started, as in C99.
while, do, for, switch, continue, break. The switch statement supports long integers in the switch expression and in case values, as of CMOC 0.1.89.
Declaring a for() loop's control variable in the for() itself as in C99, e.g., for (int i = 0; ...) {}.
Declaring more than one variable on the same line, e.g., int a = 0, b = 1;
Variadic functions, e.g., void foo(char *format, ...). There must be at least one named argument before the ellipsis (...), as in Standard C.
K&R function definitions, e.g., f() int a; { ... }
Ending an initializer list with a comma.
Use of the C preprocessor (the system's cpp is invoked): #include, #define, #ifdef, etc.
Unions.
Enumerations.
The static
and extern
keywords. However, an extern declaration must be
at the global level, not inside a function. Local static variables are
supported since version 0.1.76. As in C, the initializer, if any, must be constant.
The const
keyword. (Const-correctness issues are diagnosed as warnings,
not as errors, to avoid breaking code written before version 0.1.50.)
Comma expressions (e.g., x = 1, y = 2, z = 3;).
Binary literals
(e.g., 0b101010
for 42). Note that this feature is not part of
Standard C.
Function-local function prototypes, since version 0.1.85.
Arrays of function pointers (and typedefs thereof).
Zero-element arrays.
Goto and non-case labeled statements.
Modular compilation and linking (see the section on this).
The following instructions on compiling the source code assume a Unix-like system.
Generic installation instructions are available in the INSTALL file. The short version is to give the following commands, where X.Y.Z is the CMOC version number (e.g., 0.1.72):
tar -xzf cmoc-X.Y.Z.tar.gz cd cmoc-X.Y.Z ./configure --prefix=/some/installation/dir make make install
On FreeBSD (and possibly other BSD systems), the gmake
command should be
given instead of make
. This ensures that GNU Make is used.
The argument to --prefix is often something like /usr/local or /usr/local/cmoc-X.Y.Z.
The sudo prefix is typically needed to do a make install
to a system directory like /usr/local.
The compiler executable is called "cmoc". After installation, the directory where it was installed must be in the shell's search path to allow the shell to run it when typing cmoc.
The make check
command runs several self tests.
To generate the HTML documentation (this document), do make html
, which
will create the file doc/cmoc-manual.html
.
The following must be in the shell's search path:
The cmoc executable.
A C preprocessor callable by the name "cpp".
The LWTOOLS lwasm assembler, and lwlink linker. If libraries are to be created, then the lwar archiver must also be in the search path.
To compile a program written in the file foo.c, run this command:
cmoc foo.c
By default, the resulting .bin file will load at address $2800 on a CoCo
(see the #pragma org
directive elsewhere in this manual).
The generated machine language is position-independent, so the .bin file
can be loaded with an offset, by passing a second argument to Disk Basic's
LOADM command.
Pass --help
to cmoc to see the available options.
CMOC comes with a writecocofile
command that can copy a .bin file
to a 35-track Disk Basic diskette image file. For example:
writecocofile testing.dsk foo.bin
Pass --help
to writecocofile for the available options.
The -r
option reads a file from the image and the -d
option lists
the files in the image.
For more information on running a CoCo emulator on a GNU/Linux system, see http://sarrazip.com/emul-coco.html.
By default, all peephole optimizations are enabled.
The peephole optimizer
can be disabled by passing the -O0
option to the compiler.
This can be useful when one suspects that the compiler may have generated
invalid code.
Several peephole optimizations were contributed by Jamie Cho in early 2016
for version 0.1.20. They can be disabled by passing -O1
.
Option -O2
is equivalent to using the default (full) optimization level.
By default, compiling a C file gives a .o object file when option -c is passed, or a .bin executable if -c is not passed and the C file is a complete program.
When the --intermediate (or -i) option is passed, the following intermediate files are also generated:
.s: The assembler file written by CMOC.
.lst: The listing generated by the 6809 assembler (lwasm) from a .s file.
When linking, these intermediate files are also generated:
.map: The linking map generated by lwlink. It lists the sections and symbols that appear in the final executable.
.link: The linking script for lwlink. It lists the sections that must be gathered by the linker to produce the final executable.
Sine CMOC 0.1.62, the --intdir=
D option can be used to have these
intermediate files generated in directory D.
The executable in Disk Basic .bin format can be
transfered to a CoCo or emulator and loaded with the LOADM command.
(If compiling for OS-9—with the --os9
command-line switch—the
executable has the name of the C file without the extension.)
When distributing the .bin file to CoCo users, it is not necessary or useful to accompany it with the other generated files.
Pass the --srec
option to cmoc to make
it generate an executable in
Motorola S-record format.
The executable file will have the .srec extension.
Since version 0.1.43 of the compiler, this format is the default one
when targetting the USim 6809 simulator (with the --usim
option).
The doc subdirectory of the source distribution contains
cmoc-vectrex.markdown, which gives Vectrex-specific instructions.
Giving the make html
command will generate cmoc-vectrex.html,
which can be viewed in a browser.
Note that questions regarding Vectrex-specific issues should be
addressed to Johan Van den Brande at johan@vandenbrande.com
.
The Dragon was a clone of the Color Computer. It ran a Basic interpreter and a disk operating system that were simitar to the CoCo's. The .bin file format was different however.
CMOC will generate a Dragon .bin file when it is given the
--dragon
command-line switch.
If a Dragon program is separated in several C files, each of these
files must be compiled with the --dragon
switch, and the linking
invocation must also be given --dragon
.
As of CMOC 0.1.79, floating point types are supported when targeting the DragonDOS environment.
The <disk.h>
library is also not usable on the Dragon.
Since version 0.1.43, CMOC has supported modular (separate) compilation as well as linking several modules (object files) together.
In other words, the user can now separate a large program into several .c files, compile each of these files to a .o object file, then have CMOC link them together into a final executable.
LWTOOLS' lwasm and lwlink commands must be in the search path, otherwise,
the user can specify the --lwasm=
and --lwlink=
command-line options
to specify the full path of these commands.
When a program is separated in multiple C files, each of these files
must be compiled with the same target command-line switch (e.g.,
--coco
, --dragon
, etc.).
The linking invocation
— the one that generates the final executable — must also be given the
same switch.
This ensures that all parts of the program are compiled consistently,
i.e., they all assume the same underlying environment.
Object files (.o) can be grouped into a library by using LWTOOLS' lwar command:
lwar -c libstuff.a foo.o bar.o baz.o
Library filenames should have the lib
prefix and be given the .a
extension, so that they will be compatible with the -l
option that
is used to specify a library to link with, e.g.:
cmoc -o prog.bin prog.c -lstuff
In this case, only the .o files of libstuff.a that are actually needed will be copied to the executable.
It is possible to specify libstuff.a
on the CMOC command line, but
then all of the library's object files get copied to the executable.
The -l
option must be specified after the source or object file(s).
The -L
option can be used to specify a directory in which to search
for libraries specified by -l
. The -L
option can be specified
before or after the source or object file(s).
An object file is indivisible as far as the linker is concerned, and therefore, so is a C file. This means that either all of the functions and globals in a C file will end up in the executable, or none of them will. When designing a library, it may be desirable to put each function in its own C file, so that only the functions used by the parent program will be copied into the executable.
A library that does not come with CMOC might need to run some code before and/or after the main() function. Some of the uses for this include initializing some global variables or releasing some resources when the program quits.
To have code executed before main(), the library author must create
a .asm file that defines a constructors
section. For example:
IMPORT _toolkit_constructor
SECTION constructors
lbsr _toolkit_constructor
ENDSECTION
(Each line in the previous file must be indented.)
The code above calls a C function called toolkit_constructor() (without an initial underscore). Some .c file in the library must define that function, which must execute whatever initialization is needed by the library.
In the example above, the LBSR instruction is used instead of JSR to maintain the relocatability of the library.
To have code executed after main(), the same .asm file can be used
to create a destructors
section.
IMPORT _toolkit_destructor
SECTION destructors
lbsr _toolkit_destructor
ENDSECTION
Here, a C function called toolkit_destructor() is called.
The .asm file may define only one of these two sections. It does not have to define both.
If the .asm file is called prepostmain.asm for example, then it must be assembled into an object file with a command like the following:
lwasm -fobj -o prepostmain.o prepostmain.asm
This object file must not be included in the .a file that comprises
the other object files of the library.
(A .a file can be created with a lwar -c
command.
See the "Creating libraries" section elsewhere in this manual.)
When the library is installed somewhere, the .o file that contains the constructor and destructor (e.g., prepostmain.o) must be installed along with the library's .a file.
Once the library has been installed in a directory, a program that uses this library must add some arguments to the CMOC invocation that links the program. If for example the library is named "toolkit" and it was installed in /usr/local/tools, then this directory contains both prepostmain.o and libtoolkit.a, and the following arguments must be passed to CMOC when linking the program:
/usr/local/tools/prepostmain.o -L /usr/local/tools -ltoolkit
The prepostmain.o file must be specified explicitly to force the linker to include all of its contents in the executable. If this .o file were only included in the .a file, then it would not be used by the linker because no code refers explicitly to the contents of prepostmain.asm.
In a modular program, the address at which code and data must be
positioned must not be specify with the #pragma org
and #pragma data
directives. Those addresses must be specified
with the --org
and --data
command-line options when invoking
CMOC to perform linking:
cmoc -c mod1.c cmoc -c mod2.asm cmoc -o prog.bin --org=C000 --data=3000 mod1.o mod2.o
The same goes for #pragma limit
: the --limit
option must be
used instead.
Since version 0.1.43, these pragmas can only be used when
compiling directly from a C file to an executable,
i.e., the -c
option is not used.
An object file typically comes from a C file, but an assembly language
file can also be passed to CMOC, which will invoke the assembler on it.
That file's extension must be .s
or .asm
.
There are conventions to be observed however:
code
:SECTION code _func LDD #42 RTS ENDSECTION
SECTION initgl LDD #1000 STD _someGlobal,PCR ENDSECTION
Functions and global variables that are to be made available to
other modules must be exported with an EXPORT directive,
e.g., _func EXPORT
.
Functions and global variables that are expected to be provided
by other modules must be imported with an IMPORT directive,
e.g., _printf IMPORT
.
The code should preferrably be position-independent, but that is not a requirement if the executable will always be loaded at the address it was compiled for.
See the Calling convention section elsewhere in this manual for the rules to observe.
Read-only global variables and values must be in a section named
rodata
. This includes string literals.
String literals must end with a null byte.
If the \n
sequence is used in the literal, it must be encoded
as byte 10 ($0A).
Writable global variables that have static initializers (typically
arrays) must be in a section named rwdata
.
Writable global variables that do not have initializers,
or that are initialized by executing code in an initgl
section,
must be in a section named bss
. This section must only give RMB
directives, and no FCC
, FDB
or FCB
directives. The bss
section follows that rule so that it does not take any space in
the executable file (at least with the Disk Basic BIN format).
Function names and global variable names must start with
an underscore. In other words, if the C name is foo
, then the
assembly language name must be _foo
.
To include an arbitrary file in the final executable, LWASM's
INCLUDEBIN
directive can be used.
As an example, assume the binary file is called blob.dat
on the
filesystem of the computer the program is being developed on.
Create a blob.asm
file that contains this:
SECTION rodata
EXPORT _blob
EXPORT _blob_end
_blob
INCLUDEBIN blob.dat
_blob_end
ENDSECTION
Specifying rodata
will put the data in the read-only section
of the program. To put it in the read-write section, specify
rwdata
instead.
Prefixing the symbols with an underscore is necessary to comply with CMOC's naming convention. This avoids conflicts with symbols generated by CMOC.
Invoke LWASM to have it generate an object file:
lwasm -fobj --output=blob.o blob.asm
In the C program, access the data this way:
unsigned char *start, *end;
asm
{
_blob IMPORT
_blob_end IMPORT
leax _blob,pcr
stx :start
leax _blob_end,pcr
stx :end
}
After this code, the start
and end
pointers delimit the data
as loaded in memory on the target machine. (Note that there is no
way to force the data to be loaded at a specific address. The position
of the data is decided by the linker, the origin address set by --org
and the offset passed to LOADM
on a CoCo.)
When invoking CMOC to perform the linking phase, add blob.o
to
the command line.
See the LWTOOLS site for details on assembler directives.
If an inline assembly block uses a global variable that is provided by another module, an import directive must be included:
void f() { asm { _foo IMPORT ldd #42 std _foo } }
A multi-module project typically uses a makefile to manage the build process.
CMOC can automatically generate a dependencies file, with a .d
extension, that lists the source files that were encountered during
the compilation of a C file. The dependency file will be compatible
with GNU Make.
This is done by passing either the --deps-only
or --deps
option
to the compiler. A .d
file will be generated with a makefile rule
like this:
main.o main.d : main.c foo.h bar.h baz.h
Option --deps-only
stops right after generating this dependency file.
Option --deps
can be used with -c
to have the compiler both compile
the C file and generate the dependency file.
The Generating Prerequisites Automatically
section of the GNU Make
manual should be consulted for details on how to write a makefile that
uses this mechanism. GNU Make's -M
is similar to --deps-only
, while
-MMD
is similar to --deps
. Note that the sed
command mentioned in
that manual is not necessary with CMOC, which automatically includes the
.d
filename in the generated rule.
The name of the .d
file is formed from the name of the .o
file to
be generated.
This option has no effect
when also specifying the -E
option (which prints the preprocessor
output and stops).
Binary operations on bytes give a byte under CMOC, whereas they give
an integer under Standard C. To get a warning for such operations,
pass the -Wgives-byte
command-line option. This can be useful when
porting an existing C program to CMOC.
This warning is not given if either side of the operation has a cast,
e.g., (char) x + (char) y
, or x + (char) y
.
The compiler generally produces more efficient code for unsigned
arithmetic. Therefore, types unsigned
and unsigned char
should
be preferred to int
and char
where possible.
CMOC considers that the signedness of an additive or multiplicative operation is the signedness of the left operand.
CMOC can issue a warning when comparison operators <
, <=
, >
or >=
are used with operands of different signedness. This is obtained by
passing the -Wsign-compare
command-line option.
The pre-increment (++i) typically generates one instruction less than the post-increment (i++). This is because the post-increment must yield the initial value of the variable. When this initial value is not needed, it is advisable to use the pre-increment.
The same situation holds for the decrement operators.
To specify the origin address of the program, use the --org option when invoking CMOC to link your executable. For example:
cmoc --org=0x5C00 prog.bin prog.o foo.o bar.o
By default, the origin address is 0x2800. (Under Disk Basic, the Basic program normally starts at 0x2601, because four PMODE pages are reserved by default.)
To determine the effective start and end address of the executable
(assuming no relocation by LOADM), one can look up the program_start
and program_end
symbols in the .lst listing file generated by the
assembler.
program_end
is useful to check that the executable fits in the
available RAM. On a 32k CoCo, the RAM ends at $8000. By defaut,
Color Basic reserves 200 bytes for Basic strings, and before that,
the system stack should be given 512 bytes for a typical C program.
These assumptions mean that an executable should not go beyond $7D38.
If a program crashes the CoCo just by getting loaded, it could
be because it is too long and overwrites the system stack. The
parse-coco-bin
Perl script, available on the CMOC home page,
can be useful to confirm that.
The following example program prints the start and end of the memory it uses initially:
#include <cmoc.h>
int main()
{
char *s, *e;
asm
{
leax program_start,pcr
stx :s
leax program_end,pcr
stx :e
}
printf("START AT %p, END AT %p.\n", s, e);
return 0;
}
Note that the label addresses are taken relatively to the
program counter (,pcr
), so that this program will report
the correct addresses even if the program is loaded at an
address other than the one specified in the .bin file.
(This can be done by passing a second argument to Disk Basic's
LOADM command.)
This section assumes that the code and data sections form a single
block. When using the #pragma data
directive (see elsewhere
in this manual), the writable globals will not be between
program_start
and program_end
.
CMOC accepts the --limit=X
option. When it
is passed, CMOC will fail if the end of the program, as indicated
by the program_end
listing symbol, exceeds address X (specified
in hexadecimal).
For example, --limit=7800
will keep the program from finishing
too close to the system stack under Disk Basic, which is not far
below $8000. A limit at $7800 leaves two kilobytes for the stack
and for Basic's string space.
All 6809 code generated by CMOC is position-independent, i.e., it can be loaded at any address and will still work.
Since version 0.1.48, CMOC automatically defines _CMOC_VERSION_
to be a 32-bit number of the form XXYYZZZ when the version
is X.Y.Z. For example, version 0.1.48 defines _CMOC_VERSION_
as 1048. This can be useful when some C code needs to do something
under CMOC but something else under another compiler.
By default, the compiler defines the _COCO_BASIC_
preprocessor
identifier. This identifier can be used to adapt a program to
make it use alternative code depending on whether it will run under
Disk Extended Color Basic or not.
To target OS-9, pass the --os9
option. The compiler will define OS9
.
To target the Vectrex, pass --vectrex
. The compiler will define VECTREX
.
To target the Dragon, pass --dragon
. The compiler will define DRAGON
.
To target the Thomson MO, pass --thommo
. The compiler will define THOMMO
.
To target the Thomson TO, pass --thomto
. The compiler will define THOMTO
.
To target the FLEX operating system, pass --flex
. The compiler will define FLEX
.
In the Thomson cases, the executable will have the .bin
extension and be
in the CoCo Disk Basic BIN format.
This format is documented in the Disk Basic Unravelled book.
CMOC does not provide the tools needed to convert this format to the Thomson
computers' native format.
When passing --usim
, the compiler targets the USim 6809 simulator,
which CMOC is shipped with.
The USIM
identifier will be defined.
No .bin file is produced in this case.
The .srec file can be executed by passing its path to
src/usim-0.91-cmoc/usim
.
CMOC's standard library is small.
The program must #include <cmoc.h>
to use functions like printf().
See that file for a list of implemented C functions.
Many are C functions while others are CMOC extensions.
("Standard" here means that those functions come with CMOC, not that
CMOC aims to provide a complete C standard library.)
The following header files are provided with CMOC.
They should be used by a directive of the form #include <_____.h>
.
<assert.h>
: Defines assert(), a macro that aborts the program if the assertion it is given is false.<cmoc.h>
: Declares printf() and several other functions from C.<coco.h>
: Declares several CoCo-specific functions, including initCoCoSupport(),
which should be called first by a program that targets the CoCo.<dskcon-standalone.h>
: Declares dskcon_init(), dskcon_processSector(), etc. to read and write
Disk Extended Color Basic floppy sectors without the presence of DECB.<setjmp.h>
: Declares C functions setjmp() and longjmp(), which can be used to perform
a nonlocal goto.<stdarg.h>
: Declares C macros va_start(), va_arg() and va_end(), which facilitate
the implementation of functions that take a variable argument list, like printf().<vectrex.h>
, <vectrex/bios.h>
, <vectrex/stdlib.h>
: Various definitions for the Vectrex console.CMOC's printf()
function supports %u, %d, %x, %X, %p, %s, %c, %f and %%.
Specifying a field width is allowed, except for %f.
A left justification is only supported for strings, e.g.,
%-15s
will work, but %-6u
will not.
Zero padding for an integer is supported (e.g., %04x
).
The l
modifier is supported for 32-bit longs (e.g., "%012ld");
%p always precedes the hexadecimal digits with $
, as per
the CoCo assembly language notation. %x and %X do not generate
such a prefix. %p, %x and %X always print letter digits as
capital letters (A
to F
, not a
to f
).
printf(), like putchar() and putstr(), sends its output one character at a time to Color Basic's PUTCHR routine, whose address is taken from $A002.
The standard library's printf() writes the characters of the formatted string by calling the routine whose address is stored in the library's CHROUT global variable (not to be confused with Color Basic's CHROUT vector at $A002). The same applies to functions sprintf(), putchar() and putstr().
By default, under Color Basic, that routine is the one found in that $A002 vector. To designate a C function as the new character output routine, first define the new routine:
void newOutputRoutine(void)
{
char ch;
asm
{
pshs x,b // preserve registers used by this routine, except A
sta :ch
}
// Insert C statements that print or otherwise process 'ch'.
asm
{
puls b,x
}
}
This routine will receive the character to be printed in register A. It must preserve registers B, X and U. It does not have to preserve A. A normal CMOC function will automatically preserve U, but an assembly-only function will not, so it must preserve U itself.
When targeting for the Color Basic environment, the routine will receive character code 13 to represent a new line, rather than code 10.
Install the routine at the appropriate time with this call:
ConsoleOutHook oldCHROOT;
oldCHROOT = setConsoleOutHook(newOutputRoutine).
To restore the original output routine, do this:
setConsoleOutHook(oldCHROOT);
This function is like printf(), but it writes into a memory buffer whose address is passed as the first argument, before the format string. For example:
char greeting[32];
void f(char *name)
{
sprintf(greeting, "Hello, %s.", name);
}
Calling f("Lonnie") will write "Hello, Lonnie." in the greeting[] array, including a terminating '\0' character. A total of 15 bytes get written to the start of that array.
The caller is responsible for providing a buffer long enough to receive all the text resulting from the format string and its arguments, including the terminating '\0' character.
In this example, the longest "name" that can be safely passed to f() would be a 23-character name.
The standard C language offers snprintf(), which is safer because it requires passing the length of the destination buffer. But checking for this limit would have a performance hit that is not necessarily acceptable on a CoCo. If such a function is needed, it can be implemented using the technique described in the previous section.
readline(), declared by #include <cmoc.h>
, acts like Basic's LINE INPUT
command and returns the address of the (NUL-terminated) string entered.
This address is a global buffer.
The next call to readline() will overwrite that buffer.
Defining a function using the name of a standard library function
is allowed. For example, a program could redefine rand()
and
srand()
to implement an alternative random number generator.
In the final assembly program, such functions replace the ones
provided by the standard library.
sbrk() is a function that dynamically allocates a region of memory
of the size (in bytes) passed to it as an argument. It returns a
void pointer. If the quantity of memory requested is not available,
sbrk() returns (void *) -1
. For example:
void *p = sbrk(100);
if (p != (void *) -1)
memset(p, 'X', 100);
In the case of the CoCo and the Dragon, the assumption is that the program is
loaded after the Basic program and variables. This means the space that
sbrk()
can allocate from goes from there to the top of the stack,
which is around $3F00 on a 16K CoCo and $7F00 on a CoCo with 32K or more.
Do not use sbrk()
if these assumptions do not apply, e.g., when
using --data
to position the writable globals elsewhere than
right after the code and read-only data.
On the CoCo and the Dragon, use the --stack-space
option or the
#pragma stack_space
directive (documented elsewhere in this manual)
when the program needs more than 1024 bytes of stack space.
To determine how much of that memory is available for sbrk(), call sbrkmax(), which returns the number of bytes as a size_t (unsigned).
sbrk() and sbrkmax() are declared by cmoc.h.
Inline assembly text can be specified by surrouding it with asm {
and }
.
In the text, one can refer to C variables (global, local and parameters) as well as functions.
Labels can be used for branch instructions, but a label must either be unique to the whole program or comply with what lwasm considers a "local" label. A label must appear at the very beginning of the line, without spaces or tabs in front of it.
One way of using lwasm local labels is to prefix the label name with
the @ character. Such a label will be local to the current block, which
will begin at the previous blank line (or start of the asm
block) and
end at the next blank line (or end of the asm
block). Refer to the
LWASM manual about its symbols
for details on using local labels.
The assembler may also support $
as a local label marker, but it is not
recommended to use it that way in inline assembly because it may hinder
portability to OS-9, where $
is used in labels but not as a local label marker.
The following example fills array out
with n
copies of character ch
,
then returns the address that follows the region written to.
(References to C variables should be preceded by a colon to avoid ambiguities,
especially when a C variable has the same name as a processor register.)
#include <cmoc.h>
char *f(char *out, char ch, unsigned char n)
{
char *end;
asm
{
ldx :out /* comments must be C style */
lda :ch // or C++ style
ldb :n ; or semi-colon comment (passed to assembler)
@loop:
sta ,x+
decb
bne @loop
stx :end
}
return end;
}
int main()
{
char a[10];
a[9] = '\0';
char *p = f(a, 'X', (unsigned char) sizeof(a) - 1);
printf("a='%s', %p, %p\n", a, a, p);
return 0;
}
Since version 0.1.21, when referring to a C function, the function
name is replaced with its assembly label, possibly followed by the
,pcr
suffix. This suffix is omitted if the instruction is BSR,
LBSR or JSR, because these instructions do not support the ,pcr
suffix.
The following example calls the same C function three different ways:
asm
{
jsr someCFunction
lbsr someCFunction
leax someCFunction
jsr ,x
}
Note that CMOC always converts C code to position independent code by default. Inline assembly statements should be written the same way if the resulting program is to be relocatable.
The BSR instruction is not recommended because it is a short branch and if the called function is too far from the call, the assembly step will fail.
Since 0.1.39, semi-colon comments are supported. They are passed verbatim to the assembler.
Note that using inline assembly is likely to make the program non portable to other C compilers.
See the Calling convention section elsewhere in this manual for the rules to observe. Namely, inline assembly must not modify U or Y. It is allowed to modify D, X and CC.
Since 0.1.79, referring to a struct field is supported:
asm
{
inc :someObject.someField
}
When using an index into a C array, the index is in array elements, not in bytes:
int v[5];
asm
{
ldd :v[4] ; refers to last int in v[],
; i.e., bytes 8 and 9 of the 10-byte array
}
When using an index into a non-array, the index is in bytes:
long n;
asm
{
clr n[3] ; clears the least significant byte of n
}
The GNU C preprocessor can add spaces in surprising ways, which makes its use slightly problematic in inline assembly. For example:
#define PIA0 0xFF00
asm
{
stb PIA0+2
}
One would expect this code to generate an stb 0xFF02
instruction,
but cpp will actually expand this to stb 0xFF00 +2
, because it
apparently adds a space after the expansion of the PIA0 identifier.
The assembler takes this space as the start of the comment, so it ignores
the +2 and assembles stb $FF00
.
A workaround appears to be to reverse the addition and write stb 2+PIA0
.
No space gets added before the identifier.
Therefore, preprocessor identifiers should be used with caution in inline assembly.
Before version 0.1.31, bad assembly language text could be emitted if inline assembly referred to a C variable that has the name of a register. To help resolve this ambiguity, version 0.1.31 introduced a C variable escape character, which is a colon that is put in front of the C variable name.
For example:
char b;
asm
{
inc :b
ldb :b
leax b,x
}
Here, :b
refers to variable b of the C program, while the b
in b,x refers to the register.
This change may break inline assembly code in programs that were written for versions of CMOC preceding 0.1.31. Adding a colon at the right places will resolve the issue.
Note that the escape character is not necessary on variable names that are not also register names.
When a function is written entirely in assembly language for
performance reasons, the stack frame may not be necessary.
CMOC will not generate a stack frame for a function defined
with the asm
modifier, as in this example:
asm int f(int m, int n)
{
// U not pushed, so 1st argument is at 2,s
asm
{
ldd 2,s // load m
addd 4,s // add n, leave return value in D
}
}
Only asm { ... }
statements are allowed in such a function.
Typically, only one is needed. Local variables cannot be defined
in that function and the function's parameters cannot be accessed
by name. The assembly code is allowed to refer to global
variables however.
By default, the compiler ends the function with the RTS instruction
(or RTI if the function is declared with the interrupt
modifier).
To keep the compiler from emitting that instruction (RTI as well as RTS),
add the __norts__
keyword:
asm __norts__ int f(int m, int n) { ... }
See the Calling convention section elsewhere in this manual for the rules to observe. In particular, note that byte arguments are promoted to words, which are pushed onto the stack in the big endian byte ordering.
Inline assembly can use Hitachi 6309 instructions like TFM, LDQ, etc. The underlying assembler (lwasm) accepts them by default.
The code generated by CMOC does not use the 6309-specific registers or instructions.
Since CMOC 0.1.84, verbatim assembly block of the form asm { ... }
are allowed at global scope. For example:
asm
{
foo:
}
void bar() {}
CMOC guarantees that this assembly block will be emitted
just before bar()
, with no other intervening instructions or
assembly labels or directives, except for the EXPORT directive
that exports bar()
, since it has external linkage.
The above example defines a foo
label whose value will be equal
to the address of bar()
.
If a sequence of more than one assembly blocks precedes a certain C function, those blocks are all emitted, in order, just before that function's code.
Inline assembly blocks that are not followed by a C function are emitted, in order, after the functions have been emitted.
CMOC supports the interrupt
function modifier, which tells the
compiler to end the function with an RTI instruction instead of
an RTS. For example, the following function handles the VSYNC
60 hertz interrupt:
interrupt void newCoCoIRQRoutine(void)
{
asm
{
lda $FF03 // check for 60 Hz interrupt
lbpl irqISR_end // return if 63.5 us interrupt
lda $FF02 // reset PIA0, port B interrupt flag
}
// Do something in C.
asm
{
irqISR_end:
}
// Nothing here, so that next instruction is RTI.
}
This routine could be hooked to the IRQ vector this way on a CoCo:
disableInterrupts();
unsigned char *irqVector = * (unsigned char **) 0xFFF8;
*irqVector = 0x7E; // extended JMP extension
* (void **) (irqVector + 1) = (void *) newCoCoIRQRoutine;
enableInterrupts();
Header <coco.h>
defines macros disableInterrupts() and enableInterrupts().
They set and reset the F and I condition codes.
The FIRQ vector is at $FFF6.
Note that using the interrupt
keyword is likely to make
the program non portable to other C compilers.
The address of a function can be taken and stored in order to be called through that pointer later.
The following example shows that the two syntaxes used in C to call a function through a pointer are supported by CMOC:
unsigned char f(int a, char b) { ... }
int main()
{
unsigned char (*pf)(int a, char b) = f;
unsigned char c0 = (*pf)(1000, 'x');
unsigned char c1 = pf(1001, 'y');
return 0;
}
A member of a struct can point to a function. For example:
struct S
{
void (*fp)();
};
void g() { ... }
int main()
{
struct S s = { g }; // member 'fp' points to g()
s.fp(); // call g()
(*s.fp)(); // call g()
return 0;
}
When only a single function needs to use a read-only array of integers, this array could be defined locally to that function, but it is more efficient, as of CMOC 0.1.10 to define the array as global. This is because the global array will be emitted as a sequence of FCB or FDB directives, while a local array will be initialized with a series of load and store instructions.
In the following example, array g
will be initialized more
efficiently than array l
:
int g[] = { 55, 66, 77 };
void f()
{
int l[] = { 22, 33, 44 };
/* ... */
}
A global array initializer containing only integer values (not string literals) is treated specially. It is initialized at compile-time, not at run-time. This means that if the program modifies the array, quits, and then is reexecuted, the modified contents will still appear as is. The array will not be reinitialized to the values in the initializer.
For example:
#include <cmoc.h>
int a[2] = { 13, 17 };
int main() { a[0]++; printf("%d\n", a[0]); return 0; }
The first time this program is executed, a[0] starts at 13, then is incremented to 14, which is the number that is printed.
The second time this program is executed, a[0] starts at 14
because array a
is not reinitialized upon entry.
This is meant to save memory by not including a second copy of the initializer just for run-time initialization purposes.
CMOC considers a global variable's initializer to be constant if it is made of:
An initializer used to initialize a union variable must match the type of the first member of the union:
union Word
{
unsigned u;
char b[2];
};
union Word w0 = { 1000 }; // ok: 1000 matches unsigned u
union Word w1 = { { 'x','y' } }; // error: does not match unsigned u
One must be careful when specifying an array size using an arithmetic expression, like this:
char a[256 * 192 / 8];
This will generate an "invalid dimensions" error message. This is
because the three numerical constants are of type int
, which means
they are signed integers. In 16-bit signed arithmetic, 256 * 192 is
-16384, which gets divided by 8, which is -2048. This size is rejected
by the compiler, which only accepts array sizes between 1 and 32767
bytes inclusively.
The fix is to force the expression to be evaluated with unsigned integers:
char a[256U * 192U / 8U];
This will be 6144 bytes, as intended.
To help support the ROM cartridge format, CMOC supports directives that allow telling it to assemble the code, the string, long and real literals, and the read-only global variables at the typical CoCo cartridge ROM address of $C000, while mapping the writable global variables at a RAM address like $3800.
This is achieved by using the --org
and --data
command-line options.
If the following program is in prog.c:
int f() { return 42; }
const int g = 100;
const unsigned char byteArray[3] = { 11, 22, 33 };
const char text[] = "hello";
int anotherWritableGlobal;
int main()
{
anotherWritableGlobal = 99;
return 0;
}
Then it can be compiled for a cartridge with this command, which specifies hexadecimal addresses:
cmoc --raw --org=C000 --data=3800 prog.c
The --raw
option tells CMOC to generate only the machine code,
without any of the headers that appears in the .bin format.
The execute will be named prog.raw
.
When developing a program made of multiple C files, --org
and
--data
should only be passed to the compiler invocation that links
the executable from the object files.
In the code above, variable g
is read-only because it is of a
constant type. byteArray
and text
are read-only because they are
arrays whose elements are of a constant type.
These three variables are thus automatically put in the read-only section, next to the code. This means they will be part of the cartridge ROM.
In the case of text
, the use of the empty brackets is necessary for
that variable to be seen as read-only.
Declaring this variable as const char *text
would lead the compiler
to see it as writable: the text
pointer itself can be modified,
although the characters it points to cannot.
In the case of a cartridge program, no writable global variable can have an initializer. That is because the ROM file format has no support for mapping a value into RAM, like the .bin format has.
After compiling the program, it can be tested with the XRoar emulator, but first it must be padded at the end with enough bytes so that the total file length is a multiple of 256. The following Perl script does this:
#!/usr/bin/perl
sysread(STDIN, $rom, 0xFFFF) > 0 or die;
my $romLen = length($rom);
binmode STDOUT or die;
print $rom;
my $extra = $romLen % 256;
print chr(0) x (256 - $extra) if $extra;
This script, in a file called bin2cart.pl, can be used this way:
perl bin2cart.pl < foo.raw > foo.rom
This ROM image can be tested in the XRoar emulator this way:
xroar -machine cocous -cart-autorun -cart foo.rom
Note that XRoar requires the image file to have the .rom extension.
In a cartridge-based program written as above, the CoCo's 60 Hz IRQ
interrupt is not enabled. To enable the IRQ in such a program, put
this at the beginning of the main()
function:
asm
{
// We come here from a JMP $C000 in Color Basic (normally at
// $A10A in v1.2). At this point, the 60 Hz interrupt has
// not been enabled yet, so enable it.
lda $FF03 // get control register of PIA0, port B
ora #1
sta $FF03 // enable 60 Hz interrupt
// Unmask interrupts to allow the timer IRQ to be processed.
andcc #$AF
}
Function sbrk()
, which allocates memory dynamically, should be avoided
because it expects to run in the default Disk Basic environment.
Enumerations are supported, with the restriction that an enum must
be declared at the global scope. In particular, an enum with an
enumerator list (e.g., enum { A, B }
) cannot be used in the formal
parameter of a function, nor as its return type.
An enum can be anonymous (e.g., enum { A, B }
) or be given a name
(e.g., enum E { A, B }
).
Each enumerated name can have a specified value, e.g.,
enum { SCREEN = 0x0400, BIT5 = 1 << 5 }
. Such a value must be
a constant expression.
Floating-point arithmetic is supported in some situations:
With --mc6839:
main()
function must first call enable_printf_float()
if the program wants to use the %f
placeholder of CMOC's printf()
function.
(If this is not done, printf()
will print an exclamation point (!
)
instead of the value of the number.)setMC6839Trap()
(declared by cmoc.h
).
This function will be called when an MC6839 operation fails.In all cases,
a warning is issued when the double
keyword is encountered, so the
user knows not to expect double-precision. There is also a warning
when a numeric literal does not use the f
suffix, which specifies
that the literal is single-precision. There too, double-precision
must not be expected. It is recommended to code programs using the
float
type and the f
suffix.
The compiler will fail to compile a program that uses floating-point
arithmetic when a platform other than the ones supported is targeted.
Using --mc6839
may then be an option.
CMOC's printf()
function supports the %f
placeholder, but
it does not support the width or precision parameteres of %f
(e.g., "%7.3f"
).
The <cmoc.h>
header file provides functions strtof()
and atoff()
to convert an ASCII decimal representation of a floating-pointer number
into a float value, as well as ftoa()
, to convert a float value into
an ASCII decimal representation.
Note that CMOC's float support must not be assumed to be thread-safe. This means that if a multitasking system is in place, other than OS-9, then only one thread at a time can make use of float operations. If an interrupt service routine is getting executed, it too must not do float operations if the main program is making use of them. (The reason for this restriction is that some floating point libraries use software floating point accumulators, e.g., FPA0 and FPA1 in the Extended Color Basic case.)
Under OS-9, multitasking is achieved by running more than one process, and each process will have its own set of software floating point accumulators.
The constants __FUNCTION__
and __func__
can be used to refer to
the current function's name:
printf("Now executing %s().\n", __func__);
In the global namespace, these identifiers give the empty string.
Accessing a value through a null pointer is a common bug. To help detect such accesses, CMOC has the --check-null command-line option, which adds run-time checks before every use of a pointer and every array element access.
By default, the handler that is invoked when a null pointer access
is detected is a no-op. The program should start by calling the
set_null_ptr_handler()
function to set up a handler that will
receive the address of the failed check as an argument. For example:
#include <cmoc.h>
struct S { int n; };
void nullPointerHandler(void *addressOfFailedCheck)
{
printf("[FAIL: %p]\n", addressOfFailedCheck);
exit(1);
}
int main()
{
set_null_ptr_handler(nullPointerHandler);
struct S *p = 0;
p->n = 42;
return 0;
}
This program will fail and display an address. One can then look up this address in the .lst listing file generated by CMOC to determine in which function that null pointer was detected.
Using this option incurs a performance cost and significantly adds
to the code size (about 9%), so it is only recommended
during debugging. An alternative is to define an assert()
macro,
as in Standard C, that expands to nothing when NDEBUG
is defined.
Runaway recursion or excessively long local arrays can cause corruption that is difficult to investigate. To help detect stack overflows, CMOC has the --check-stack command-line option, which adds run-time checks at the beginning of each function body.
When passing --check-stack to CMOC, the program should start with
a call to set_stack_overflow_handler()
that installs a handler.
This handler receives two arguments: the address of the failed check
and the out-of-range stack pointer. The handler must not return.
For example:
#ifdef _CMOC_CHECK_STACK_OVERFLOW_
void stackOverflowHandler(void *addressOfFailedCheck, void *stackRegister)
{
printf("[FAIL: %p, %p]\n", addressOfFailedCheck, stackRegister);
exit(1);
}
#endif
void recurse() { recurse(); }
int main()
{
#ifdef _CMOC_CHECK_STACK_OVERFLOW_
set_stack_overflow_handler(stackOverflowHandler);
#endif
recurse();
return 0;
}
This program will fail and display two addresses. One can look up the first one in the .lst listing file generated by CMOC to determine in which function that stack overflow was detected.
Using this option incurs a performance cost, so it is only recommended during debugging.
The preprocessor identifier _CMOC_CHECK_STACK_OVERFLOW_
is defined
by CMOC when --check-stack is used. This identifier can be used to
exclude stack check code when not needed.
By default, CMOC allocates 1024 bytes for the stack on a CoCo (256 bytes on a Vectrex). In a program that is compiled with the default layout, the stack is preceded by the memory managed by sbrk() (see elsewhere in this manual).
Note that this feature is not usable under OS-9, where stack checking is automatic and uses a different mechanism.
Option --stack-space=
N (with N > 0) can be used to specify
the number of bytes (in decimal) to be reserved to the stack. This
affects the quantity of memory available to sbrk() and the stack
checking done by --check-stack
. Use this option when the program
uses lots of stack space. Note that this option is not permitted when
targeting the Vectrex.
The stack space can also be specified with #pragma stack_space
N.
The command-line option has precedence over this pragma.
Specifying the stack space must be done when compiling the C file that defines main(). It has no effect when compiling a file that does not define main(), nor when calling CMOC to perform linking on a list of object files.
Note that this feature is not usable under OS-9, where stack checking is automatic and uses a different mechanism.
When targeting OS-9, by default, a stack check is performed upon entering each function, unless that function is assembly-only.
The stack check verifies that there will be at least 64 bytes of free stack space once the function will have allocated stack space for its local variables. If this check fails, a "stack overflow" error message is printed in the standard output and the program exits.
To change this 64 to another (non-zero) value, use the command-line
--function-stack
option. Pass 0 to this option to disable the
stack check.
CMOC follows the C convention of passing the parameters in the stack in the reverse order.
The caller pops them off the stack after the call.
An argument of type char
, which is signed, is promoted to int
.
An argument of type unsigned char
is promoted to unsigned int
.
When a struct passed by value, it is pushed exactly as it is, without
any padding, except if its size is 1: then an additional dummy byte
is pushed afterwards. (The same thing is done when a byte-sized argument
is promoted to 16 bits. This is for the benefit of va_arg(), defined
in <stdarg.h>
).
The return value must be left in B if it is byte-sized or in D if it is 16 bits. If the return value is a struct, a long, a float or a double, then the return value must be stored at a location whose address is received by the function as its first (hidden) parameter.
The body of a CMOC-generated function preserves registers U, Y, S and DP. It is allowed to modify A, B, X and CC.
Under OS-9, CMOC uses Y to refer to the data section of the current process. Any code that needs to use Y must preserve its value and restore it when finished. For portability, this rule should also be observed on platforms other than OS-9.
The compiler's low-level optimizer may emit code that uses Y when targeting a platform other than OS-9.
The called function does not have to return any particular condition codes.
Register DP is not used or modified by CMOC-generated code.
A CMOC-generated function uses a stack frame
if the function receives
one or more parameters or has local variables, and is not defined
with the asm
keyword (see the Assembly-only functions section
elsewhere in this manual).
Register U is used as the stack frame
pointer. Such a function starts with these instructions:
PSHS U
LEAU ,S
an ends with these:
LEAS ,U
PULS U,PC
(An interrupt
function will end with PULS U
and RTI
.)
Typically, such a program gets loaded at the end of Basic's RAM area,
e.g., &H7A00
.
Pass --org to the compiler to position the program at such an address.
In the Basic program, reserve some high RAM to your CMOC program,
e.g., CLEAR 200,&H79FF
.
Then use LOADM
, then DEF USR
to define a user routine that points
to your CMOC program, e.g., DEF USR5=&H7A00
.
(The 5 in this example can be a number from 0 to 9.)
Finally, call the routine using the routine number and an argument,
e.g., R=USR5(1000)
.
The argument passed to USR*() is stored by Color Basic in its FPA0 floating-point accumulator. If this argument is a signed 16-bit integer, this argument can be obtained in the CMOC program with this inline assembly code:
int arg;
asm
{
pshs u,y ; protect registers against Basic
jsr $B3ED ; convert FPA0 to integer in D
puls y,u
std :arg
}
Then the C variable arg
will contain the argument value.
To return a signed 16-bit integer value to the caller of USR*(), use code like this:
int ret = -1000; /* value to return */
asm
{
ldd :ret
pshs u,y ; protect registers against Basic
jsr $B4F4 ; GIVABF routine stores D in FPA0
puls y,u
}
Since CMOC 0.1.58, the main() return value, and the exit() argument, are guaranteed to be returned in D upon exiting the program, when compiled for the CoCo Basic environment or for the Dragon.
This can be useful to return a 16-bit address from one program for use by another program. Note that registers U and Y must be preserved (typically with PSHS and PULS) when calling another program.
Disk Basic 1.1 has a DOS command that loads track 34 at $2600 and if the contents start with the bytes 'O' and 'S', the interpreter jumps at $2602.
CMOC can accommodate this with its --dos option. It implies --coco, it uses $2600 as the code origin, and it generates the "OS" marker at that address.
The output is still a .bin file. CMOC comes with a command-line tool
called install-coco-boot-loader
that takes this file and copies
it to track 34 of a DECB disk image. For example:
cmoc --dos prog.c
install-coco-boot-loader foo.dsk prog.bin
This tool will fail if track 34 is already occupied. It has a --force
option that will install the boot leader despite that.
To uninstall the boot loader, use the --uninstall
option:
install-coco-boot-loader --uninstall foo.dsk
For maximum portability, a CMOC program should not use the C Preprocessor's "stringification" operator (#), e.g.:
#define f(x) #x
When this feature is supported by cpp, the expression f(foobar)
yields
"foobar"
, including the double quotes.
When this feature is not supported, the pound sign is left as is, so CMOC
will see #foobar
, which will typically give a syntax error. This has been
observed on the Macintosh (Darwin) as of May 2017.
Also to be avoided is the use of C++-style comments on a line to be processed by cpp, e.g.:
#define foo 42 // blargh
Some preprocessors may leave the comment instead of stripping it away, so the following should be used instead:
#define foo 42 /* blargh */
Versions 0.1.66 and later have a -Wlocal-var-hiding
command-line option
that makes the compiler issue a warning if a local variable hides another one
declared in the same function. In the following example, the second declaration
for v
hides the first one, so the option will cause a warning to be issued
for the second declaration:
void f(int c)
{
int v = 0;
g(&v);
if (c)
{
int v = 1;
g(&v);
}
}
By default, the code generated by CMOC is relocatable. The --no-relocate
command-line option can be passed to the compiler to tell it that
relocatability does not have to be supported. This allows the compiler
to do some optimizations.
When --no-relocate
is passed, the C preprocessor identifier _CMOC_NO_RELOCATE_
is automatically defined.
The -nodefaultlibs
option tells CMOC to avoid linking the program with libcmoc-std-*.a
,
which contains the small C library that comes with the compiler.
This option can be useful when targeting a platform that is not supported
by CMOC's library.
Note that assigning a struct variable to another will cause the compiler to generate
code that calls memcpy() to copy the struct.
If -nodefaultlibs is given, the program will not link,
because memcpy() is provided by libcmoc-std-*.a
.
The program must thus provide its own implementation of memcpy().
As of CMOC 0.1.83, the way to make systems calls to OS-9 is to use
inline assembly. This section shows an example program, called os9load.c
,
that loads the first few bytes of a file into a memory buffer.
#include <cmoc.h>
// filePath: Must end with '\r'.
// Returns the number of bytes read from the file (may be lower than bufferLength).
//
size_t loadFileToBuffer(const char *filePath,
void *buffer,
size_t bufferLength)
{
// Open the file in read mode.
unsigned char pathNumber, errorCode;
asm
{
lda #$01 ; read mode
ldx :filePath
os9 $84 ; I$Open
bcs @error
clrb
sta :pathNumber
@error
stb :errorCode
}
if (errorCode != 0)
{
printf("I$Open failed: error #%u\n", errorCode);
return 0;
}
// Load bytes from the file to the buffer.
size_t numBytesRead;
asm
{
lda :pathNumber
ldx :buffer ; address where to store the bytes
pshs y ; save global data ptr; cannot refer to C globals for now
ldy :bufferLength ; number of bytes to read
os9 $89 ; I$Read
bcs @error
clrb
sty :numBytesRead
@error
stb :errorCode
puls y ; restore global data ptr; can refer to C globals again
}
if (errorCode != 0)
{
printf("I$Read failed: error #%u\n", errorCode);
// Do not return because we need to close the file.
}
// Close the file.
unsigned char closeErrorCode;
asm
{
lda :pathNumber
os9 $8F ; I$Close
bcs @error
clrb
@error
stb :closeErrorCode
}
if (errorCode != 0) // if read error
return 0; // do not report close error if any; read error already reported
if (closeErrorCode != 0)
{
printf("I$Close failed: error #%u\n", errorCode);
return 0;
}
return numBytesRead; // success
}
int main()
{
char contents[10];
size_t numBytesRead = loadFileToBuffer("foo.txt\r", contents, sizeof(contents));
printf("First %u bytes: [", numBytesRead);
putstr(contents, numBytesRead);
printf("]\n");
return 0;
}
It can be compiled by giving this command: cmoc --os9 os9load.c
Once the os9load
executable has been transferred to NitrOS-9, it can
be used by first creating a foo.txt
file (e.g., with build foo.txt
).
If for example this file contains "This is the foo.txt file.
",
then when os9load
is run, it will print this:
First 10 bytes: [This is th]
Details on the systems calls can be found in the OS-9 System Programmer's Manual.
Whenever the Y register is used in an OS-9 system call, it must be preserved, typically with PSHS and PULS, as done above, because a CMOC program uses that register to point to the process's data segment.
That segment contains the program's global variables. As long as the Y register is used for something else, the CMOC program must refrain from referring to global variables. (Local variables are accessed through the U or S registers.)
Under OS-9, the arguments passed to the command can be accessed by reading
the argv
array passed to main() as in Standard C. No #include
directive
is needed to access that array. CMOC does not come with a command-line
processing library like Unix's getopt.
Under the other targets supported by CMOC, there is no way to pass arguments to the program.
Compilation failures
It is not recommended to run cmoc
from the directory tree where
it was build. It is recommended to run make install
and to put
cmoc
in the executable PATH.
Option --verbose
can be passed to cmoc
to make it show exactly
which options and arguments it passes to lwasm
and lwlink
.
Option -E
can be passed to cmoc
to make it output the result
of the C preprocessor pass (to standard output).
This can be used to see exactly how the #defines
and #includes
have been resolved.
Crash on startup
The program may be too long and loading it in memory may have
overwritten the memory used by the system stack. Under Color Basic,
the stack is typically at $7Fxx on a CoCo with at least 32k of RAM.
To detect this at build time, pass --limit=7800
(in hex)
to cmoc
to have issue an error message with the program end
exceeds $7800. No executable will be generated, which will avoid
running an excessively long one.
The program may have tuned out memory that is supposed to contain the code of interrupt service routines (ISRs). This can happen if the program switches to a CoCo's all-RAM mode, or it is changes the MMU blocks on a CoCo 3.
The program should disable the interrupts before making such changes.
Header <coco.h>
provides disableInterrupts()
and enableInterrupts()
.
Any redirection of interrupts should be made while interrupts are masked. In such a case, it is recommended to use self-sufficient ISRs, i.e., they do not rely on Color Basic's existing ISRs.
Frozen execution
Other bugs
Inconsistent #ifdef
directives may generate invalid programs.
If conditional compilation is used in a header file to determine
the contents of a function or macro, care must be taken to ensure that
the same condition is used for all C files involves in the program,
so that they all agree.
The stack space may have been exceeded by the depth of the
C function calls. Excessive use of sbrk()
may have over-allocated
memory, especially if --stack-space=N
was not called with the
correct value (to inform sbrk()
of the limit).
Option --check-stack
can be used to insert systematic checks.
See section Detecting stack overflows at run time elsewhere
in this manual.
Option -O0
can be passed to cmoc
to make it avoid the use
of its low-level optimizer, in order to rule out the possibility
that the observed malfunction may be due to a CMOC optimizer bug.
Note that this option is likely to increase program size to a
point where it might exceed the desired limits.
EXCEPT FOR THE FILES UNDER src/usim-0.91-cmoc in the CMOC source archive, CMOC is distributed under the following license:
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
For more details on USim, see https://github.com/raybellis/usim.