
6 Oldstyle Syntax Module

This chapter describes the oldstyle syntax module suitable for some 8-bit CPUs (6502, 65816, 680x, 68HC1x, Z80, etc.), which is available with the extension oldstyle.


6.1 Legal

This module was written in 2002-2024 by Frank Wille and is covered by the vasm copyright without modifications.


6.2 Additional options for this module

This syntax module provides the following additional options:

-autoexp

Automatically export all non-local symbols, making them visible to other modules during linking.

-ast

Allow the asterisk (*) for starting comments in the first column. This disables the possibility to set the code origin with *=addr in the first column.

-dotdir

Directives have to be preceded by a dot (.).

-i

Ignore everything after a blank in the operand field and treat it as a comment. This option is only available when the backend does not separate its operands with blanks as well.

-ldots

Allow dots (.) within all identifiers.

-noc

Disable C-style constant prefixes.

-noi

Disable intel-style constant suffixes.

-sect

Enables the additional section directives text, data and bss, which switch to their respective section type. The original text directive for creating string constants and the data directive for creating byte constants are then no longer available, but other directives still exist for the same purpose.


6.3 General Syntax

Labels always start in the first column and may optionally be terminated by a colon (:). Without a colon, the mnemonic or directive has to be separated from the label by whitespace (which is not required in every case, e.g. not before =).

Local labels are introduced by ’.’ or terminated by ’$’. For the rest, any alphanumeric character, including ’_’, is allowed. Local labels are valid between two global label definitions.

It is allowed, but not recommended, to refer to any local symbol starting with ’.’ in the source, by prefixing it with the name of the last previously defined global symbol: global_name.local_name.

The option ‘-ldots’ allows dots (.) within labels and other identifiers, but disables the above mentioned feature.
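As a minimal sketch of local label scoping, the following fragment reuses the same local label name after two different global labels. The 6502 mnemonics and all label names are assumptions chosen for illustration only:

first   ldx     #8
.loop   dex
        bne     .loop       ;refers to the .loop defined after first
        rts
second  ldy     #8
.loop   dey
        bne     .loop       ;refers to the .loop defined after second
        rts

Without the option ‘-ldots’, the first of these local labels could also be written as first.loop.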

Anonymous labels are supported by defining them with a single ’:’ at the beginning of a line. They may be referenced by ’:’ followed directly by one or more ’+’ or ’-’ signs. A + selects the first anonymous label following the point of reference. A ++ selects the second anonymous label in that direction, and so on. A - selects the first anonymous label before the point of reference. Example:

:       jmp     :-    ;infinite loop
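A further sketch, again assuming 6502 mnemonics and an arbitrary address for illustration only, combines a forward and a backward reference:

:       lda     $d012
        bmi     :+      ;forward to the next anonymous label
        jmp     :-      ;back to the anonymous label above
:       rts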

The operands are separated from the mnemonic by whitespace. Multiple operands are separated by a comma (,), or in some backends by whitespace.

Make sure that you don’t define a label on the same line as a directive for conditional assembly (if, else, endif)! This is not supported and leads to undefined behaviour.

Some CPU backends may support multiple statements (directives or mnemonics) per line, separated by a special character (e.g. : for Z80).

Comments are introduced by the comment character (;), or the first blank following the operand field when option ‘-i’ was given. The rest of the line will be ignored.

Example:

mylabel instr op1,op2  ;comment

In expressions, numbers starting with $ are hexadecimal (e.g. $fb2c). For Z80, & may also be used as a hexadecimal prefix, but make sure to avoid conflicts with the and-operator (either by using parentheses or blanks). % introduces binary numbers (e.g. %1100101). Numbers starting with @ are assumed to be octal, e.g. @237 (except for Z80, where @ means binary). As a special case, a digit followed by # defines an arbitrary base between 2 and 9 (e.g. 4#3012).

Intel-style constant suffixes are supported: h for hexadecimal, d for decimal, o or q for octal and b for binary. Hexadecimal intel-style constants must start with a digit (prepend 0 when required). C-style prefixes are also supported for hexadecimal (0x) and binary (0b). All other numbers starting with a digit are decimal, e.g. 1239.

The single character following a ' or " is converted into its ASCII code. In expressions the closing quote after that character is optional; in strings it is required.
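To illustrate the notations, the following sketch writes the value 65 in several ways. It assumes the default options (neither ‘-noc’ nor ‘-noi’) and a backend other than Z80:

        byt     $41         ;hexadecimal prefix
        byt     41h         ;intel-style hexadecimal suffix
        byt     0x41        ;C-style hexadecimal prefix
        byt     %01000001   ;binary prefix
        byt     01000001b   ;intel-style binary suffix
        byt     0b01000001  ;C-style binary prefix
        byt     @101        ;octal prefix
        byt     101o        ;intel-style octal suffix
        byt     4#1001      ;base 4
        byt     65          ;decimal
        byt     'A'         ;ASCII code of a character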

The current-PC symbol is *, unless redefined by a CPU backend (e.g. Z80 sets $).


6.4 Directives

Note that data directives, like byt, dfb, db, word, dfw, dw, etc. may optionally be written without any operand. In this case they are treated like space directives, which just increment the program counter by 1 or 2 bytes.

The following directives are supported by this syntax module (if the CPU- and output-module allow it):

<symbol> = <expression>

Equivalent to <symbol> equ <expression>.

abyte <modifier>,<exp1>[,<exp2>,"<string1>"...]

Write the integer or string constants into successive 8-bit memory cells in the current section while modifying each expression (and string-character) by the modifier expression. When the modifier contains the special ._ symbol, then it is a placeholder for any expression from the line. Otherwise the modifier will be just added to each element. Any combination of integer and character string constants is permitted.
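A minimal sketch of both modifier forms follows. The values are arbitrary, and the second line assumes that ._ can be combined with the usual multiplication operator:

        abyte   $80,"HI",13     ;adds $80 to each element: $c8,$c9,$8d
        abyte   ._*2,1,2,3      ;._ stands for each element: 2,4,6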

addr <exp1>[,<exp2>...]

Assign the values of the operands into successive words of memory in the current section, using the target’s endianness and address pointer size. Note that addr is not available for 6809. You may use the alternative directive da instead.

align <bitcount>

Insert as many zero bytes as required to reach an address where <bitcount> low order bits are zero. For example align 2 would make an alignment to the next 32-bit boundary on a target with 8-bit addressable memory.
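As a sketch with arbitrarily chosen addresses, assuming a target with 8-bit addressable memory:

        org     $1001
        byt     1       ;stored at $1001, program counter is now $1002
        align   2       ;inserts two zero bytes to reach $1004
table   word    1,2     ;table starts at $1004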

asc <exp1>[,<exp2>,"<string1>"...]

Equivalent to byte <exp1>[,<exp2>,"<string1>"...].

ascii <exp1>[,<exp2>,"<string1>"...]

See defm.

asciiz "<string1>"[,"<string2>"...]

See string.

assert <expression>[,<message>]

Display an error with the optional <message> when the expression is false.

binary <file>

Inserts the binary contents of <file> into the object code at this position.

blk <exp>[,<fill>]

Insert <exp> bytes into the current section, each set to <fill> when given and to zero otherwise.

blkl <exp>[,<fill>]

Insert <exp> 32-bit long words into the current section, each set to <fill> when given and to zero otherwise, using the endianness of the target CPU.

blkw <exp>[,<fill>]

Insert <exp> 16-bit words into the current section, each set to <fill> when given and to zero otherwise, using the endianness of the target CPU.

bss <exp>

Equivalent to blk <exp>,0. (Not available with option ‘-sect’.)

bss

With option ‘-sect’: switches to a bss section with attributes "aurw".

bsz <exp>[,<fill>]

Equivalent to blk <exp>[,<fill>].

byt <exp1>[,<exp2>,"<string1>"...]

Assign the integer or string constant operands into successive 8-bit memory cells in the current section. Any combination of integer and character string constant operands is permitted. Without any operands the program counter is just incremented by one.

byte <exp1>[,<exp2>,"<string1>"...]

Equivalent to byt <exp1>[,<exp2>,"<string1>"...].

da <exp1>[,<exp2>...]

Equivalent to addr <exp1>[,<exp2>...].

data <exp1>[,<exp2>,"<string1>"...]

Equivalent to byt <exp1>[,<exp2>,"<string1>"...]. (Not available with option ‘-sect’.)

data

With option ‘-sect’: switches to a data section with attributes "adrw".

db <exp1>[,<exp2>,"<string1>"...]

Equivalent to byt <exp1>[,<exp2>,"<string1>"...].

dc <exp>[,<fill>]

Equivalent to blk <exp>[,<fill>].

defb <exp1>[,<exp2>,"<string1>"...]

Equivalent to byte <exp1>[,<exp2>,"<string1>"...].

defc <symbol> = <expression>

Define a new program symbol with the name <symbol> and assign to it the value of <expression>. Defining <symbol> twice will cause an error.

defl <exp1>[,<exp2>...]

Assign the values of the operands into successive 32-bit integers of memory in the current section, using the endianness of the target CPU.

defp <exp1>[,<exp2>...]

Assign the values of the operands into successive 24-bit integers of memory in the current section, using the endianness of the target CPU.

defm "string"

Equivalent to text "string".

defw <exp1>[,<exp2>...]

Equivalent to word <exp1>[,<exp2>...].

dfb <exp1>[,<exp2>,"<string1>"...]

Equivalent to byte <exp1>[,<exp2>,"<string1>"...].

dfw <exp1>[,<exp2>...]

Equivalent to word <exp1>[,<exp2>...].

defs <exp>[,<fill>]

Equivalent to blk <exp>[,<fill>].

dend

Ends an offset-section started by dsect and restores the previously active section.

dephase

Equivalent to rend.

ds <exp>[,<fill>]

Equivalent to blk <exp>[,<fill>].

dsb <exp>[,<fill>]

Equivalent to blk <exp>[,<fill>].

dsect

Starts an ’offset-section’ (the original directive in ADE was called ’dummy-section’), which does not generate any code in the output file. Its only purpose is to define absolute labels. Within a dsect block you may use org directives to set a new offset. Otherwise the offset defaults to zero for the first dsect, and each following dsect continues with the last offset of the previous one. Such an offset-section block is closed by the dend directive, which restores the previous ’real’ section.
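The following sketch lays out a few variables with dsect, using data directives without operands to reserve space. The label names and the offset $80 are assumptions for illustration only:

        dsect
        org     $80
ptr     word            ;ptr = $80, reserves two bytes
count   byt             ;count = $82, reserves one byte
flags   byt             ;flags = $83
        dend

No bytes are written to the output file for this block. It only defines the three absolute labels.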

dsl <exp>[,<fill>]

Equivalent to blkl <exp>[,<fill>].

dsw <exp>[,<fill>]

Equivalent to blkw <exp>[,<fill>].

dw <exp1>[,<exp2>...]

Equivalent to word <exp1>[,<exp2>...].

end

Assembly will terminate after this line.

endif

Ends a section of conditional assembly.

el

Equivalent to else.

else

Assemble the following lines when the previous if-condition was false.

ei

Equivalent to endif. (Not available for Z80 CPU)

einline

End a block of isolated local labels, started by inline.

endm

Ends a macro definition.

endmac

Ends a macro definition.

endmacro

Ends a macro definition.

endr

Ends a repetition block.

endrep

Ends a repetition block.

endrepeat

Ends a repetition block.

endstruct

Ends a structure definition.

endstructure

Ends a structure definition.

<symbol> eq <expression>

Equivalent to <symbol> equ <expression>.

<symbol> equ <expression>

Define a new program symbol with the name <symbol> and assign to it the value of <expression>. Defining <symbol> twice will cause an error.

exitmacro

Exit the current macro (proceed to endm) at this point and continue assembling the parent context. Note that this directive also resets the level of conditional assembly to the state before the macro was invoked (which means that it works as a ’break’ command on all new if directives).

extern <symbol>[,<symbol>...]

See global.

even

Aligns to an even address. Equivalent to align 1.

fail <message>

Show an error message including the <message> string. Do not generate an output file.

fi

Equivalent to endif.

fill <exp>

Equivalent to blk <exp>,0.

fcb <exp1>[,<exp2>,"<string1>"...]

Equivalent to byte <exp1>[,<exp2>,"<string1>"...].

fcc "<string>"

Equivalent to text.

fcs "<string>"

Works like text and fcc, but additionally sets the most significant bit of the last byte. This can be used as a string terminator on some systems.
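For example, with an arbitrarily chosen string:

        fcs     "OK"    ;stores $4f,$cb ('O', then 'K' with bit 7 set)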

fdb <exp1>[,<exp2>...]

Equivalent to word <exp1>[,<exp2>...].

global <symbol>[,<symbol>...]

Flag <symbol> as an external symbol, which means that <symbol> is visible to all modules in the linking process. It may be either defined or undefined.

if <expression>

Conditionally assemble the following lines if <expression> is non-zero.

ifblank <something>

Conditionally assemble the following lines if there are no non-blank characters in the operand other than a comment.

ifnblank <something>

Conditionally assemble the following lines if there are any non-blank, non-comment characters in the operand.

ifdef <symbol>

Conditionally assemble the following lines if <symbol> is defined.

ifndef <symbol>

Conditionally assemble the following lines if <symbol> is undefined.

ifd <symbol>

Conditionally assemble the following lines if <symbol> is defined.

ifnd <symbol>

Conditionally assemble the following lines if <symbol> is undefined.

ifeq <expression>

Conditionally assemble the following lines if <expression> is zero.

ifne <expression>

Conditionally assemble the following lines if <expression> is non-zero.

ifgt <expression>

Conditionally assemble the following lines if <expression> is greater than zero.

ifge <expression>

Conditionally assemble the following lines if <expression> is greater than or equal to zero.

iflt <expression>

Conditionally assemble the following lines if <expression> is less than zero.

ifle <expression>

Conditionally assemble the following lines if <expression> is less than or equal to zero.

ifused <symbol>

Conditionally assemble the following lines if <symbol> has been previously referenced in an expression or in a parameter of an opcode. Issues a warning when <symbol> is already defined. Note that ifused does not work when the symbol is only used in later lines of the source.
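As a sketch of how the conditional directives can be combined, the fragment below selects between two strings and provides a default for a symbol that may not be defined yet. The symbol names DEBUG and BUFSIZE are hypothetical. No label is placed on the same line as a conditional directive:

DEBUG   =       1

        if      DEBUG
        byt     "debug build"
        else
        byt     "release build"
        endif

        ifndef  BUFSIZE
BUFSIZE equ     256
        endif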

incbin <file>[,<offset>[,<nbytes>]]

Inserts the binary contents of <file> into the object code at this position. When <offset> is specified, then the given number of 8-bit bytes will be skipped at the beginning of the file. The optional <nbytes> argument specifies the maximum number of 8-bit bytes to be read from that file. When the file size (in 8-bit bytes) is not aligned with the size of a target-byte the missing bits are automatically appended and assumed to be zero. As vasm’s internal target-byte endianness for more than 8 bits per byte is big-endian, included binary files are assumed to have the same endianness. Otherwise you have to specify ‘-ile’ to tell vasm that they use little-endian target-bytes (on your 8-bit bytes host file system).
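The three forms might be used as follows. The file name sprites.bin is a hypothetical placeholder:

        incbin  "sprites.bin"       ;the whole file
        incbin  "sprites.bin",2     ;skip the first 2 bytes
        incbin  "sprites.bin",2,64  ;skip 2 bytes, read at most 64 bytes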

incdir <path>

Add another path to search for include files to the list of known paths. Paths defined with ‘-I’ on the command line are searched first.

include <file>

Include source text of <file> at this position.

inline

Local labels in the following block are isolated from previous local labels and those after einline.

mac <name>

Equivalent to macro <name>. (Not available for unSP CPU)

list

The following lines will appear in the listing file, if it was requested.

local <symbol>[,<symbol>...]

Flag <symbol> as a local symbol, which means that <symbol> is local for the current file and invisible to other modules in the linking process.

macro <name>[,<argname>...]

Defines a macro which can be referenced by <name>. The <name> may also appear at the left side of the macro directive, starting on the first column. The macro definition is closed by an endm directive. When calling a macro you may pass up to 9 arguments, separated by comma. These arguments are referenced within the macro context as \1 to \9, or optionally by named arguments, which you have to specify in the operand. Argument \0 is set to the macro’s first qualifier (mnemonic extension), when given. The special argument \@ inserts an underscore followed by a six-digit unique id, useful for defining labels. \() may be used as a separator between the name of a macro argument and the subsequent text. \<symbolname> inserts the current decimal value of the absolute symbol symbolname.
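A minimal sketch of a macro with one positional argument and a unique label follows. The macro name delay and the 6502 mnemonics are assumptions for illustration only:

        macro   delay
        ldx     #\1
loop\@  dex
        bne     loop\@
        endm

        delay   10
        delay   200

Because \@ is replaced by a different id in each invocation, the two expansions define two distinct labels and do not collide.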

mdat <file>

Equivalent to incbin <file>.

needs <expression>

Equivalent to symdepend <expression>.

nolist

This line and the following lines will not be visible in a listing file.

org [#]<expression>

Sets the base address for the subsequent code. This is equivalent to *=<expression>. An optional # is supported for compatibility reasons.

phase <expression>

Equivalent to rorg <expression>.

repeat <expression>

Equivalent to rept <expression>.

rept <expression>

Repeats the assembly of the block between rept and endr <expression> number of times. <expression> has to be positive.
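For example, the following block, with an arbitrarily chosen value, has the same effect as writing the byt line three times:

        rept    3
        byt     $ea
        endr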

reserve <exp>

Equivalent to blk <exp>,0.

rend

Ends a rorg block of label relocation. Following labels will be based on org again.

rmb <exp>[,<fill>]

Equivalent to blk <exp>[,<fill>]. (Not available for 6502 CPU.)

roffs <expression>

Sets the program counter to <expression> bytes after the start of the current section. The new program counter must not be smaller than the current one. The space will be padded with zeros.

rorg <expression>

Relocate all labels between rorg and rend based on the new origin from <expression>.
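The sketch below assembles a routine that is stored behind the code at $0801, but whose labels are based on $c000, for instance because the program copies it there before calling it. The addresses, the label name and the 6502 mnemonics are assumptions for illustration only:

        org     $0801
        rorg    $c000
routine lda     #0
        jmp     routine     ;assembled as a jump to $c000
        rend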

section <name>[,"<attributes>"]

Starts a new section named <name> or reactivates an existing one. If attributes are given for an already existing section, they must match exactly. The section’s name will also be defined as a new symbol, which represents the section’s start address. The "<attributes>" string may consist of the following characters:

Section Contents:

c

section has code

d

section has initialized data

u

section has uninitialized data

i

section has directives (info section)

n

section can be discarded

R

remove section at link time

a

section is allocated in memory

Section Protection:

r

section is readable

w

section is writable

x

section is executable

s

section is shareable

Additional Attributes:

f

mark section for far-addressing

z

mark section for near-addressing (e.g. direct/zero-page for 6502/65816)

When attributes are missing they are automatically set for the section names text, data, rodata, bss, .text, .data, .rodata and .bss. Otherwise they default to "acrwx".
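As a sketch, the following fragment declares a code section, an uninitialized data section and a near-addressed section. The section names, the label names and the 6502 mnemonic are hypothetical:

        section code,"acrx"
start   rts

        section vars,"aurw"
buffer  blk     16

        section zpvars,"aurwz"
temp    blk     2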

<symbol> set <expression>

Create a new symbol with the name <symbol> and assign the value of <expression>. If <symbol> was already assigned by set before, it will hold the new value from now on.
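The difference from equ can be sketched as follows, using hypothetical symbol names:

counter set     0
counter set     counter+1   ;allowed, counter is now 1
limit   equ     10          ;a second definition of limit would be an error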

spc <exp>

Equivalent to blk <exp>,0.

str "<string1>"[,"<string2>"...]

Like text, but adds a terminating carriage return (ASCII code 13).

string "<string1>"[,"<string2>"...]

Like text, but adds a terminating zero-byte.

struct <name>

Defines a structure which can be referenced by <name>. Labels within a structure definition can be used as field offsets. They will be defined as local labels of <name> and can be referenced through <name>.<label>. All directives are allowed, but instructions will be ignored when such a structure is used. Data definitions can be used as default values when the structure is used as initializer. The structure name, <name>, is defined as a global symbol with the structure’s size. A structure definition is ended by endstruct.

structure <name>

Equivalent to struct <name>.

symdepend <expression>

Declare the current section being dependent on an externally defined symbol from <expression>. In object file formats which support it, this will generate an external symbol reference without any actual relocation being performed (R_NONE in ELF).

text "<string>"

Puts a single string constant into successive 8-bit memory cells of the current section. The string delimiters may be any printable ASCII character. (Not available with option ‘-sect’.)

text

With option ‘-sect’: switches to a code section with attributes "acrx".

weak <symbol>[,<symbol>...]

Flag <symbol> as a weak symbol, which means that <symbol> is visible to all modules in the linking process and may be replaced by any global symbol with the same name. When a weak symbol remains undefined its value defaults to 0.

wor <exp1>[,<exp2>...]

Equivalent to word <exp1>[,<exp2>...].

wrd <exp1>[,<exp2>...]

Equivalent to word <exp1>[,<exp2>...].

word <exp1>[,<exp2>...]

Assign the values of the operands into successive 16-bit words of memory in the current section, using the endianness of the target CPU. Without any operand just the program counter is incremented by two.
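For example, the byte order of the same directive depends on the target CPU:

        word    $1234   ;6502 (little-endian): $34,$12
                        ;6809 (big-endian):    $12,$34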

xdef <symbol>[,<symbol>...]

See global.

xlib <symbol>[,<symbol>...]

See global.

xref <symbol>[,<symbol>...]

See global.

zmb <exp>[,<fill>]

Equivalent to blk <exp>[,<fill>].


6.5 Structures

The oldstyle syntax is able to manage structures. Structures can be defined in two ways:

mylabel struct[ure]
        <fields>
        endstruct[ure]

or:

        struct[ure] mylabel
        <fields>
        endstruct[ure]

Any directive is allowed to define the structure fields. Labels can be used to define offsets into the structure. The initialized data is used as default value, whenever no value is given for a field when the structure is referenced.

Some examples of structure declarations:

  struct point
x    db 4
y    db 5
z    db 6
  endstruct

This will create the following labels:

point.x  ; 0   offsets
point.y  ; 1
point.z  ; 2
point    ; 3   size of the structure

The structure can be used by optionally redefining the field values:

point1 point
point2 point 1, 2, 3
point3 point ,,4

is equivalent to

point1  
               db 4
               db 5
               db 6
point2
               db 1
               db 2
               db 3
point3
               db 4
               db 5
               db 4

6.6 Known Problems

Some known problems of this module at the moment:


6.7 Error Messages

This module has the following error messages:

