3 Standard Syntax Module

This chapter describes the standard syntax module which is available with the extension std.

3.1 Legal

This module was written in 2002-2024 by Volker Barthelmann and is covered by the vasm copyright without modifications.

3.2 Additional options for this module

This syntax module provides the following additional options:

‘-ac’: Immediately allocate common symbols in the .bss/.sbss section and define them as externally visible.
‘-align’: Enforces the backend’s natural alignment for all data directives (.word, .long, .float, etc.).
‘-gas’: Enable GNU-as compatibility mode. Currently this only prevents labels prefixed by a dot from being recognized as local labels and turns .org into a pure section offset.
‘-nodotneeded’: Recognize assembly directives without a leading dot (.).
‘-sdlimit=<n>’: Put data up to a maximum size of n bytes into the small-data sections. The default is n=0, which disables this feature.

3.3 General Syntax

Labels always have to be terminated by a colon (:), therefore they don’t necessarily have to start at the first column of a line.

Local labels may either be preceded by a ’.’ (unless option ‘-gas’ was given) or terminated by ’$’, and consist of digits only. These labels exist and keep their value between two global label definitions.

A special form of reusable "local" labels, independent of global labels, may be defined by using a single digit from 0 to 9. You can reference the nearest previous digit-label with Nb and the nearest following digit-label with Nf, where N is such a digit.
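The label forms described above might be combined as in the following sketch. It assumes a backend whose comment character is the semicolon and that the ‘-gas’ option is not given. All label names are made up for illustration, and only data directives are used, so no CPU-specific mnemonics are involved.

```asm
        .data
table:                      ; global label, starts a new local scope
.10:    .int    1           ; local label preceded by a dot
20$:    .int    2           ; local label terminated by '$'
        .int    .10         ; both local labels are still valid here
        .int    20$

other:                      ; next global label, .10 and 20$ end here
1:      .int    3           ; reusable digit label
        .int    1b          ; nearest previous definition of 1
        .int    1f          ; nearest following definition of 1
1:      .int    4
```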

Make sure that you don’t define a label on the same line as a directive for conditional assembly (if, else, endif)! This is not supported.

The operands are separated from the mnemonic by whitespace. Multiple operands are separated by commas (,).

The character used to introduce comments is usually the semicolon (;). The following backends change it to a hash (#) character: ppc, vidcore, x86.

Example:

mylabel:  inst.q1.q2 op1,op2,op3  # comment

In expressions, numbers starting with 0x or 0X are hexadecimal (e.g. 0xfb2c). 0b or 0B introduce binary numbers (e.g. 0b1100101). Other numbers starting with 0 are assumed to be octal numbers, e.g. 0237. All numbers starting with a non-zero digit are decimal, e.g. 1239.
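As a short illustration of the number prefixes, the following lines each store one value with .byte. The sketch assumes a backend that uses the semicolon for comments. The decimal equivalents in the comments are computed by hand.

```asm
        .byte   0xfb        ; hexadecimal, decimal 251
        .byte   0b1100101   ; binary, decimal 101
        .byte   0237        ; octal, decimal 159
        .byte   123         ; decimal
```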

C-like escape characters in string constants are allowed by default, unless disabled by ‘-noesc’.

3.4 Directives

All directives are case-insensitive. The following directives are supported by this syntax module (if the CPU- and output-module allow it):

.2byte <exp1>[,<exp2>...]

See .uahalf.

.4byte <exp1>[,<exp2>...]

See .ualong.

.8byte <exp1>[,<exp2>...]

See .uaquad.

.ascii <exp1>[,<exp2>,"<string1>"...]

See .byte.

.abort <message>

Print an error and stop assembly immediately.

.asciiz "<string1>"[,"<string2>"...]

See .string.

.align <bitorbyte_count>[,<fill>][,<maxpad>]

Depending on the current CPU backend .align either behaves like .balign (x86) or like .p2align (PPC).

.balign <byte_count>[,<fill>][,<maxpad>]

Insert as many fill bytes as required to reach an address which is divisible by <byte_count>. For example, .balign 2 aligns to the next 16-bit boundary on a target with 8-bit addressable memory. The padding bytes are initialized with <fill>, when given. The optional third argument defines the maximum number of padding bytes to use. When more are needed, the alignment is not done at all.
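The effect of the three arguments can be sketched as follows. The example assumes 8-bit addressable memory, a section that starts at offset 0, and a backend with the semicolon as comment character. The offsets in the comments follow from the description above.

```asm
        .data
        .byte   1           ; offset 0
        .balign 4           ; 3 padding bytes, next offset is 4
        .byte   2           ; offset 4
        .balign 8,0xff      ; 3 padding bytes of 0xff, next offset is 8
        .byte   3           ; offset 8
        .balign 16,0,4      ; 7 bytes needed, more than maxpad 4: no padding
        .byte   4           ; offset 9
```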

.balignl <byte_count>[,<fill>][,<maxpad>]

Works like .balign, with the only difference that the optional fill value can be specified as a 32-bit word. Padding locations which are not already 32-bit aligned cause a warning and are padded with zero bytes.

.balignw <byte_count>[,<fill>][,<maxpad>]

Works like .balign, with the only difference that the optional fill value can be specified as a 16-bit word. Padding locations which are not already 16-bit aligned cause a warning and are padded with zero bytes.

.byte <exp1>[,<exp2>,"<string1>"...]

Assign the integer or string constant operands into successive 8-bit memory cells in the current section. Any combination of integer and character string constant operands is permitted.
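A minimal sketch of mixing integer and string operands, assuming ASCII source text, 8-bit bytes and a backend with the semicolon as comment character:

```asm
        .byte   0x41,"BC",0 ; stores 0x41 0x42 0x43 0x00
        .string "ABC"       ; the same four bytes, zero appended by .string
```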

.comm <symbol>,<size>[,<align>]

Defines a common symbol which has a size of <size> bytes. The final size and alignment are assigned by the linker, which uses the highest size and alignment values of all common symbols with the same name. A common symbol is usually allocated in the .bss section of the final executable. When the optional <align> argument is not specified, the backend’s default alignment for the given size is used.

.double <exp1>[,<exp2>...]

Parse one or more IEEE double precision floating point expressions and write them into successive blocks of 64 bits into memory using the backend’s endianness.

.else

Assemble the following block only if the previous .if condition was false.

.elseif <exp>

Same as .else followed by .if, but without the need for an .endif. Avoids nesting.

.endif

Ends a block of conditional assembly.

.endm

Ends a macro definition.

.endr

Ends a repetition block.

.equ <symbol>,<expression>

See .set.

.equiv <symbol>,<expression>

Assign the <expression> to <symbol> similar to .equ and .set, but signals an error when <symbol> has already been defined.

.err <message>

Print a user error message. Do not create an output file.

.extern <symbol>[,<symbol>...]

See .global.

.fail <expression>

Cause a warning when <expression> is greater than or equal to 500. Otherwise cause an error.

.file "string"

Set the filename of the input source. This may be used by some output modules. By default, the input filename passed on the command line is used.

.float <exp1>[,<exp2>...]

Parse one or more IEEE single precision floating point expressions and write them into successive blocks of 32 bits into memory using the backend’s endianness.

.global <symbol>[,<symbol>...]

Flag <symbol> as an external symbol, which means that <symbol> is visible to all modules in the linking process. It may be either defined or undefined.

.globl <symbol>[,<symbol>...]

See .global.

.half <exp1>[,<exp2>...]

Assign the values of the operands into successive 16-bit words of memory in the current section using the backend’s endianness.

.if <expression>

Conditionally assemble the following lines if <expression> is non-zero.

.ifeq <expression>

Conditionally assemble the following lines if <expression> is zero.

.ifne <expression>

Conditionally assemble the following lines if <expression> is non-zero.

.ifgt <expression>

Conditionally assemble the following lines if <expression> is greater than zero.

.ifge <expression>

Conditionally assemble the following lines if <expression> is greater than or equal to zero.

.iflt <expression>

Conditionally assemble the following lines if <expression> is less than zero.

.ifle <expression>

Conditionally assemble the following lines if <expression> is less than or equal to zero.

.ifb <operand>

Conditionally assemble the following lines when <operand> is completely blank, except for an optional comment.

.ifnb <operand>

Conditionally assemble the following lines when <operand> is non-blank.

.ifc <string1>,<string2>

Conditionally assemble the following lines when <string1> matches <string2>. Empty strings are allowed. Quotes are optional.

.ifnc <string1>,<string2>

Conditionally assemble the following lines when <string1> differs from <string2>. Empty strings are allowed. Quotes are optional.

.ifdef <symbol>

Conditionally assemble the following lines if <symbol> is defined.

.ifndef <symbol>

Conditionally assemble the following lines if <symbol> is undefined.
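The conditional directives might be combined as in this sketch. LEVEL and TRACE are made-up symbol names, and a backend with the semicolon as comment character is assumed. Note that no label shares a line with a conditional directive.

```asm
        .set    LEVEL,2

        .if     LEVEL==1
        .byte   0x10
        .elseif LEVEL==2
        .byte   0x20        ; only this branch is assembled
        .else
        .byte   0x30
        .endif

        .ifndef TRACE       ; TRACE was never defined in this source
        .byte   0
        .endif
```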

.incbin <file>

Inserts the binary contents of <file> into the object code at this position. When the file size (in 8-bit bytes) is not aligned with the size of a target-byte the missing bits are automatically appended and assumed to be zero. As vasm’s internal target-byte endianness for more than 8 bits per byte is big-endian, included binary files are assumed to have the same endianness. Otherwise you have to specify ‘-ile’ to tell vasm that they use little-endian target-bytes (on your 8-bit bytes host file system).

.incdir <path>

Add another path to search for include files to the list of known paths. Paths defined with ‘-I’ on the command line are searched first.

.include <file>

Include source text of <file> at this position.

.int <exp1>[,<exp2>...]

Assign the values of the operands into successive words of memory in the current section using the target’s endianness and address size.

.irp <symbol>[,<val>...]

Iterates the block between .irp and .endr for each <val>. The current <val>, which may be embedded in quotes, is assigned to \symbol. If no value is given, then the block is assembled once, with \symbol set to an empty string.

.irpc <symbol>[,<val>...]

Iterates the block between .irpc and .endr for each character in each <val>, assigning it to \symbol. If no value is given, then the block is assembled once, with \symbol set to an empty string.
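Both iteration directives can be sketched with plain data, assuming a backend with the semicolon as comment character. The symbol names val and digit are arbitrary.

```asm
        .irp    val,10,20,30
        .byte   \val        ; assembled three times: 10, 20, 30
        .endr

        .irpc   digit,123
        .byte   \digit      ; assembled once per character: 1, 2, 3
        .endr
```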

.lcomm <symbol>,<size>[,<alignment>]

Allocate <size> bytes of space in the .bss section and assign the address of that location to <symbol>. If <alignment> is given, then the space will be aligned to an address having <alignment> low zero bits or 2, whichever is greater. <symbol> may be made globally visible by the .globl directive.

.list

The following lines will appear in the listing file, when enabled.

.local <symbol>[,<symbol>...]

Flag <symbol> as a local symbol, which means that <symbol> is local for the current file and invisible to other modules in the linking process.

.long <exp1>[,<exp2>...]

Assign the values of the operands into successive 32-bit words of memory in the current section using the backend’s endianness.

.macro <name> [<argname1>[=<default>][,<argname2>...]]

Defines a macro, which can be referenced by <name>. The macro definition is closed by an .endm directive. The argument names, which may be passed to this macro, must be declared directly following the macro name, separated by white-space. You can define an optional default value in the case an argument is left out. Note that macro names are case-insensitive while the argument names are case-sensitive. Within the macro context arguments are referenced by \argname. The special argument \@ inserts a unique id, useful for defining labels. \() may be used as a separator between the name of a macro argument and the subsequent text.
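A small sketch of a macro with a default value, the \() separator and the \@ unique id. The macro name item and its argument names are invented for this example, and a backend with the semicolon as comment character is assumed.

```asm
        .macro  item name,value=0
\name\()_val:               ; \() separates the argument from "_val"
        .int    \value
done\@:                     ; \@ makes this label unique per invocation
        .endm

        item    first,0x80  ; defines first_val holding 0x80
        item    second      ; defines second_val, value defaults to 0
```

Without \() the assembler would look for an argument called name_val instead of name.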

.nolist

This line and the following lines will not be visible in a listing file.

.org <exp>[,<fill>]

Before any section directive, and in absence of the ‘-gas’ option, <exp> defines the absolute start address of the following code and <fill> has no meaning. Within a relocatable section <exp> defines the relative offset from the start of this section for the subsequent code. The optional <fill> value is only valid within a section and is used to fill the space to the new program counter (defaults to zero). When <exp> starts with a current-pc symbol followed by a plus (+) operator, then the directive just reserves space (filled with zero).

.p2align <bit_count>[,<fill>][,<maxpad>]

Insert as many fill bytes as required to reach an address where <bit_count> low order bits are zero. For example, .p2align 2 aligns to the next 32-bit boundary when the target has 8-bit addressable memory. The padding bytes are initialized with <fill>, when given. The optional third argument defines the maximum number of padding bytes to use. When more are needed, the alignment is not done at all.
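Since the argument is a power of two, .p2align 2 reaches the same boundary as .balign 4. A brief sketch, assuming 8-bit addressable memory, a section starting at offset 0 and the semicolon as comment character:

```asm
        .data
        .byte   1           ; offset 0
        .p2align 2          ; 2 low-order zero bits, next offset is 4
        .byte   2           ; offset 4
        .p2align 3,0xff     ; 3 low-order zero bits, padded with 0xff to 8
        .byte   3           ; offset 8
```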

.p2alignl <bit_count>[,<fill>][,<maxpad>]

Works like .p2align, with the only difference that the optional fill value can be specified as a 32-bit word. Padding locations which are not already 32-bit aligned cause a warning and are padded with zero bytes.

.p2alignw <bit_count>[,<fill>][,<maxpad>]

Works like .p2align, with the only difference that the optional fill value can be specified as a 16-bit word. Padding locations which are not already 16-bit aligned cause a warning and are padded with zero bytes.

.popsection

Restore the top section from the internal section-stack. Also refer to .pushsection.

.pushsection <name>[,"<attributes>"][[,@<type>]|[,%<type>]|[,<mem_flags>]]

Works exactly like .section, but additionally pushes the previously active section onto an internal stack, from which it may be restored by the .popsection directive.

.quad <exp1>[,<exp2>...]

Assign the values of the operands into successive quadwords (64-bit) of memory in the current section using the backend’s endianness.

.rept <expression>

Repeats the assembly of the block between .rept and .endr <expression> number of times. <expression> has to be positive.
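For instance, the following sketch emits four identical bytes. It assumes a backend with the semicolon as comment character.

```asm
        .rept   4
        .byte   0xaa        ; assembled four times
        .endr
```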

.section <name>[,"<attributes>"][[,@<type>]|[,%<type>]|[,<mem_flags>]]

Starts a new section named <name> or reactivates an old one. When attributes are given for an already existing section, they must match exactly. The "<attributes>" string may consist of the following characters:

Section Contents:

a: section is allocated in memory
c: section has code
d: section has initialized data
u: section has uninitialized data
i: section has directives (info or offsets section)
n: section can be discarded
R: remove section at link time

Section Protection:

r: section is readable
w: section is writable
x: section is executable
s: section is shareable

Section Alignment: A digit, which is ignored. The assembler will automatically align the section to the highest alignment restriction used within.

Memory attributes:

C: load section to Chip RAM (AmigaOS hunk format)
F: load section to Fast RAM (AmigaOS hunk format)
z: load section to zero/direct-page (6502, 65816, 680x, etc.)

Any other attribute will still be accepted by vasm and passed to the output driver (which might ignore it).

The optional <type> argument is mainly used for ELF output and may be introduced either by a ’%’ or a ’@’ character. Allowed are:

progbits: This is the default value, which means the section data occupies space in the file and may have initialized data.
nobits: These sections do not occupy any space in the file and will be allocated and filled with zero bytes by the OS loader.

When the optional, non-standard, <mem_flags> argument is given it defines a 32-bit memory attribute, which defines where to load the section (platform specific). The memory attributes are currently only used in the hunk-format output module.
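The section directives might be used together as in the following sketch. The section name .scratch and the labels msg and buf are made up. It assumes an output module that supports named sections and a backend with the semicolon as comment character.

```asm
        .text                           ; same as .section ".text","acrx"
        .pushsection ".rodata","adr"    ; switch sections, remember .text
msg:    .string "hello"
        .popsection                     ; .text is the active section again

        .section ".scratch","aurw",@nobits
buf:    .space  64                      ; uninitialized, no space in the file
```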

.set <symbol>,<expression>

Create a new program symbol with the name <symbol> and assign to it the value of <expression>. If <symbol> is already assigned, it will contain a new value from now on.

.size <symbol>,<size>

Defines the size in bytes associated with the given <symbol>. This information is only used by some object file formats (for example ELF) and typically sets the size of function symbols.

.short <exp1>[,<exp2>...]

Assign the values of the operands into successive 16-bit words of memory in the current section using the backend’s endianness.

.single <exp1>[,<exp2>...]

Parse one or more IEEE single precision floating point expressions and write them into successive blocks of 32 bits into memory using the backend’s endianness.

.skip <exp>[,<fill>]

Insert <exp> bytes into the current section, each set to zero or, when given, to <fill>.

.space <exp>[,<fill>]

Insert <exp> bytes into the current section, each set to zero or, when given, to <fill>.

.stabs "<name>",<type>,<other>,<desc>,<exp>

Add a stab-entry for debugging, including a symbol-string and an expression.

.stabn <type>,<other>,<desc>,<exp>

Add a stab-entry for debugging, without a symbol-string.

.stabd <type>,<other>,<desc>

Add a stab-entry for debugging, without symbol-string and value.

.string "<string1>"[,"<string2>"...]

Like .byte, but adds a terminating zero-byte.

.swbeg <op>

Just for compatibility. Do nothing.

.type <symbol>,<type>

Set type of symbol named <symbol> to <type>, which must be one of:

1: Object
2: Function
3: Section
4: File

The predefined symbols @object and @function are available for this purpose. Only used by some object file formats (for example ELF).
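A typical use together with .size might look like this sketch for a data object. The symbol name counter is invented, 8-bit bytes are assumed, and the comments use the semicolon, which depends on the backend.

```asm
        .data
        .globl  counter
        .type   counter,@object
counter:
        .long   0
        .size   counter,4   ; .long stored 32 bits, which is 4 bytes
```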

.uahalf <exp1>[,<exp2>...]

Assign the values of the operands into successive 16-bit areas of memory in the current section regardless of current alignment.

.ualong <exp1>[,<exp2>...]

Assign the values of the operands into successive 32-bit areas of memory in the current section regardless of current alignment.

.uaquad <exp1>[,<exp2>...]

Assign the values of the operands into successive 64-bit areas of memory in the current section regardless of current alignment.

.uashort <exp1>[,<exp2>...]

Assign the values of the operands into successive 16-bit areas of memory in the current section regardless of current alignment.

.uaword <exp1>[,<exp2>...]

Assign the values of the operands into successive 16-bit areas of memory in the current section regardless of current alignment.

.weak <symbol>[,<symbol>...]

Flag <symbol> as a weak symbol, which means that <symbol> is visible to all modules in the linking process and may be replaced by any global symbol with the same name. When a weak symbol remains undefined its value defaults to 0.

.word <exp1>[,<exp2>...]

Assign the values of the operands into successive 16-bit words of memory in the current section using the backend’s endianness.

.zero <exp>[,<fill>]

Insert <exp> bytes into the current section, each set to zero or, when given, to <fill>.

Predefined section directives:

.bss: .section ".bss","aurw"
.data: .section ".data","adrw"
.dpage: .section ".dpage","adrwz"
.rodata: .section ".rodata","adr"
.sbss: .section ".sbss","aurw"
.sdata: .section ".sdata","adrw"
.sdata2: .section ".sdata2","adr"
.stab: .section ".stab","dr"
.stabstr: .section ".stabstr","dr"
.text: .section ".text","acrx"
.tocd: .section ".tocd","adrw"

3.5 Known Problems

Some known problems of this module at the moment:

- None.

3.6 Error Messages

This module has the following error messages:

- 1001: mnemonic expected
- 1002: invalid extension
- 1003: no space before operands
- 1004: too many closing parentheses
- 1005: missing closing parentheses
- 1006: missing operand
- 1007: scratch at end of line
- 1008: section flags expected
- 1009: invalid data operand
- 1010: memory flags expected
- 1011: identifier expected
- 1012: assembly aborted
- 1013: unexpected "%s" without "%s"
- 1014: pointless default value for required parameter <%s>
- 1015: invalid section type ignored, assuming progbits
- 1019: syntax error
- 1021: section name expected
- 1022: .fail %lld encountered
- 1023: .fail %lld encountered
- 1024: alignment too big