Assembler syntax

LLVM for DPU can assemble source files into object files or directly into executables. By convention, assembly files use the ‘.S’ extension.

A concrete coding example is given in section Examples of an assembly program.

Overview of an assembler program structure

Source code in assembler consists of the following types of lines:

Comments: a line of comment starts with //
Pre-processor directives, such as #define
Assembly instructions
Linker directives, such as .global

Pre-processor directives

The assembler is fully compliant with the C Preprocessor standards.

In particular, it supports (not exhaustive):

#include: the search path for included files can be specified with the clang option -I
#define and #undef: to set and unset constants and macros
#if, #else, #ifdef, #endif: to assemble code conditionally, depending on constant values

The preprocessor macros can be defined internally or be provided upon the invocation of clang, using the -D option.

Assembly instructions

An assembly instruction is defined on a single line of source code, following the standard format:

keyword arguments...

Where keyword identifies the instruction and arguments is the list of instruction arguments, separated by commas.
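
The line format described above can be illustrated with a short Python sketch. It is not part of the DPU toolchain: the function name parse_instruction is hypothetical, and the sketch assumes that a // comment never appears inside a character immediate.

```python
def parse_instruction(line):
    """Split one assembly line into (keyword, [arguments]).

    Hypothetical helper for illustration only; the real assembler
    performs far more validation than this.
    """
    # Drop a trailing '//' comment, then the surrounding whitespace.
    code = line.split("//", 1)[0].strip()
    if not code:
        return None  # blank line or comment-only line
    # The keyword is the first whitespace-delimited token.
    parts = code.split(None, 1)
    keyword = parts[0]
    # Arguments, when present, are separated by commas.
    arguments = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
    return keyword, arguments


print(parse_instruction("lsl r1, r0, 31"))
print(parse_instruction("stop"))
```

The sketch only separates the keyword from its arguments; deciding whether each argument is a register, an immediate value, a label, a condition or an endianness specification is left out.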

The list of keywords, along with a description of their arguments, can be found:

Either by invoking dpuasmdoc from the command line
- dpuasmdoc key searches for all the instructions with a keyword containing key
- dpuasmdoc -match key searches for all the instructions with a keyword exactly equal to key
- dpuasmdoc -help gives a list of additional options
Or in the instruction set reference section of this documentation

Each assembly instruction argument can be:

A register

An immediate value

A label

A condition

An endianness specification

Registers

The next table gives the list of DPU registers, as understood by the assembler:

Name	Description
`rX`	32-bit read/write register, X being in the range [0..23]
`sX`	32-bit read/write safe register, X being in the range [0..23]
`dX`	64-bit read/write register, X being an even value in the range [0..22]. Register dX is the combination of rX (most significant bits) and r(X+1) (least significant bits)
`id`	Special read-only register equal to the invoking thread’s number
`idX`	Special read-only register equal to X times the invoking thread’s number. X is equal to 2, 4 or 8
`zero` `one` `lneg` `mneg`	Special read-only registers returning system constants, respectively equal to 0, 1, 0xffffffff and 0x80000000
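
As an illustration of the special read-only registers, the following Python sketch returns the value each of them would hold for a given thread number. The function name special_register and its thread_id parameter are hypothetical; only the register names and constants come from the table above.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def special_register(name, thread_id):
    """Model the value read from a special read-only register.

    Hypothetical helper: 'thread_id' stands for the invoking
    thread's number.
    """
    constants = {"zero": 0, "one": 1, "lneg": 0xFFFFFFFF, "mneg": 0x80000000}
    if name in constants:
        return constants[name]
    if name == "id":
        return thread_id & MASK32
    if name in ("id2", "id4", "id8"):
        # idX is X times the invoking thread's number.
        return (int(name[2:]) * thread_id) & MASK32
    raise ValueError("not a special register: " + name)


print(special_register("id4", 3))
print(hex(special_register("mneg", 3)))
```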

Immediate values

Immediate values are expressed as numerical values:

In hexadecimal unsigned (prefixed with 0x)

In binary unsigned (prefixed with 0b)

With ASCII value (enclosed by ', example: 'a' for the numerical value 97)

Or as signed decimal values

Allowed values depend on the type of expected immediate (signedness and number of encoding bits) and the expression.
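
The notations listed above can be summarized with a small Python sketch that converts an immediate written in source form into its numerical value. The function name parse_immediate is hypothetical, and the sketch assumes single-character ASCII literals without escape sequences.

```python
def parse_immediate(text):
    """Convert an immediate literal to an integer (illustrative only)."""
    text = text.strip()
    # ASCII notation: a single character enclosed by single quotes.
    if len(text) == 3 and text[0] == "'" and text[2] == "'":
        return ord(text[1])
    lowered = text.lower()
    if lowered.startswith("0x"):   # unsigned hexadecimal
        return int(lowered[2:], 16)
    if lowered.startswith("0b"):   # unsigned binary
        return int(lowered[2:], 2)
    return int(text, 10)           # signed decimal


for literal in ("0x1f", "0b101", "'a'", "-3"):
    print(literal, parse_immediate(literal))
```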

Expected operand is unsigned

Allowed values must be unsigned and fit in the number of encoding bits. For example, the logical shift left instruction, defined as:

// Shift register 'ra' by 'shift' positions to the left
// Store the result into register 'rc'
// 'shift' is a 5-bit unsigned immediate value
lsl rc ra shift:u5

Allows the following expressions:

// Shift by 31 positions to the left
lsl r1, r0, 31
// Ditto
lsl r1, r0, 0x1f

But not:

// Shift by 32 positions to the left => not allowed
lsl r1, r0, 32
// Shift by a negative number of positions to the left => not allowed
lsl r1, r0, -3
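
The rule for unsigned operands amounts to a simple range check, sketched here in Python. The function name fits_unsigned is hypothetical; the checked values are the ones used in the lsl examples above.

```python
def fits_unsigned(value, bits):
    """Return True if 'value' is a legal unsigned immediate on 'bits' bits."""
    return 0 <= value < (1 << bits)


# The 'shift' operand of lsl is encoded on 5 bits (u5).
for shift in (31, 0x1F, 32, -3):
    print(shift, fits_unsigned(shift, 5))
```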

Expected operand is signed

When the expected immediate operand is encoded on N bits, the immediate value must (2^X meaning 2 to the power X):

Either be unsigned (i.e. hexadecimal expressions) in the range [0x0, (2^(N-1))-1]

Or be signed (i.e. decimal expressions) in the range [-(2^(N-1)), (2^(N-1))-1]

Consider, for example, the subtract instruction, defined as:

// register 'rc' is equal to 'imm' minus 'ra'
// condition 'cc' must be equal to 'false'
sub rc imm:s24 ra false_cc:cc

Then, the following expressions are legal:

sub r1, 8388607,  r0, false   // r1 = (2^(24-1) - 1) - r0
sub r1, 0x7fffff, r0, false   // Same with hexadecimal expression
sub r1, -8388608, r0, false   // r1 = -2^(24-1) - r0

But not:

sub r1, 8388608,  r0, false    // 8388608 = 2^23 > 2^(24-1) - 1
sub r1, 0x800000, r0, false    // 0x800000 = 2^23
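
The two ranges for signed operands can be checked with the following Python sketch. It is a model of the rule as stated in this section, not the assembler's actual code, and the function name fits_signed is hypothetical.

```python
def fits_signed(literal, bits):
    """Check a literal against a signed immediate encoded on 'bits' bits."""
    text = literal.strip().lower()
    limit = 1 << (bits - 1)  # 2^(N-1)
    if text.startswith("0x"):
        # Hexadecimal expressions are unsigned: [0x0, 2^(N-1) - 1]
        return 0 <= int(text, 16) <= limit - 1
    # Decimal expressions are signed: [-(2^(N-1)), 2^(N-1) - 1]
    return -limit <= int(text, 10) <= limit - 1


# The 'imm' operand of sub is encoded on 24 bits (s24).
for literal in ("8388607", "0x7fffff", "-8388608", "8388608", "0x800000"):
    print(literal, fits_signed(literal, 24))
```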

Labels

Labels keep track of specific positions in the source code, which can then be referred to by any instruction as immediate values. Typically, labels are used by the instructions jump and call to change the program counter value, moving to the labeled location.

A label is written on its own line and is followed by a colon.

For example, the following program iterates 10000000 times on the same loop (in C: for (i=10000000-1; i>=0; i--)):

#define ITERATIONS 9999999
move r0, ITERATIONS
loop:
        // DO SOMETHING...
        add r0, r0, -1, pl, loop     // Decrement r0 and iterate if positive.
stop
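
To see why starting at ITERATIONS = 9999999 yields 10000000 iterations, the control flow of the loop can be modeled in Python. The function name count_iterations is hypothetical, and a small start value is used so the sketch runs instantly.

```python
def count_iterations(start):
    """Count how many times the loop body runs for a given initial r0."""
    r0 = start                # move r0, ITERATIONS
    iterations = 0
    while True:
        iterations += 1       # DO SOMETHING...
        r0 -= 1               # add r0, r0, -1
        if r0 < 0:            # condition 'pl' not met: fall through to stop
            break
    return iterations


# Starting at N - 1 gives N iterations: here 9 gives 10.
print(count_iterations(9))
```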

A label may also reference a forward position from the current PC value, using the syntax . + offset. The DPU assembler restricts the offset value to a positive integer lower than 8. Other values may lead to unpredictable behavior.

Conditions

Conditions are used in specific DPU instructions:

Either to perform combo operations, i.e. compute, test, and jump in a row

Or to perform test operations, i.e. compute and check result

Combo operations

The general syntax of a combo operation is:

INSTRUCTION TARGET_REGISTER, OPERANDS, CONDITION, PC

Which is interpreted as:

Set TARGET_REGISTER value to the result of INSTRUCTION applied to OPERANDS

If the operation result satisfies CONDITION, jump to the instruction specified by PC

We already saw an example of a combo operation in the Labels section, where the condition pl (i.e. positive or null) was used to create a loop.

Each instruction has its own set of allowed conditions, identified by condition classes.
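
The compute, test and jump sequence of a combo operation can be modeled as follows. This Python sketch covers an add with only a few result-based conditions; the names combo_add, target_pc and next_pc are hypothetical, and the condition meanings are taken from the table of available conditions in this section.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


# A few result-based conditions (subset, for illustration).
CONDITIONS = {
    "true": lambda r: True,
    "false": lambda r: False,
    "z": lambda r: r == 0,
    "nz": lambda r: r != 0,
    "e": lambda r: r % 2 == 0,
    "o": lambda r: r % 2 == 1,
    "pl": lambda r: to_signed(r) >= 0,
    "mi": lambda r: to_signed(r) < 0,
}


def combo_add(ra, imm, condition, target_pc, next_pc):
    """Model 'add rc, ra, imm, condition, target': return (rc, new PC)."""
    result = (ra + imm) & MASK32                    # compute
    taken = CONDITIONS[condition](result)           # test
    return result, target_pc if taken else next_pc  # jump or fall through


print(combo_add(5, -1, "pl", 10, 20))
print(combo_add(0, -1, "pl", 10, 20))
```

In the second call the result wraps to 0xffffffff, which is negative when read as a signed value, so pl is not met and execution continues at next_pc.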

Test operations

The general syntax of a test operation is:

INSTRUCTION TARGET_REGISTER, OPERANDS, CONDITION

Which is interpreted as:

Set TARGET_REGISTER to 1 if the result of INSTRUCTION applied to OPERANDS satisfies CONDITION

Otherwise set TARGET_REGISTER to 0

For example

move r0, 255
move r1, 512
sub r2, r0, r1, ltu

Will set r2 to 1, as r0 (which contains 255) is lower than r1 (which contains 512).
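
The outcome of this test operation can be reproduced with a Python model of the ltu condition, shown next to its signed counterpart lts for contrast. The function names sub_ltu and sub_lts are hypothetical; they return the value written to the target register.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def sub_ltu(first, second):
    """Model 'sub rc, ra, rb, ltu': 1 if ra < rb as unsigned values."""
    return 1 if (first & MASK32) < (second & MASK32) else 0


def sub_lts(first, second):
    """Model 'sub rc, ra, rb, lts': 1 if ra < rb as signed values."""
    return 1 if to_signed(first) < to_signed(second) else 0


print(sub_ltu(255, 512))
print(sub_ltu(0xFFFFFFFF, 1), sub_lts(0xFFFFFFFF, 1))
```

The second line shows why the choice of condition matters: 0xffffffff is the largest unsigned value but equals -1 when interpreted as signed.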

Available conditions and condition classes

The table below describes the available conditions:

`true`	Condition is always true, whatever the operation result is
`false`	Condition is always false, whatever the operation result is
`z`	The operation result equals zero
`nz`	The operation result does not equal zero
`e`	The operation result is even
`o`	The operation result is odd
`pl`	The operation result is greater or equal to zero
`mi`	The operation result is strictly lower than zero
`ov`	The operation result has overflow set
`nov`	The operation result does not have overflow set
`c`	The operation result has carry set
`nc`	The operation result does not have carry set
`sz`	The source register operand is equal to zero
`snz`	The source register operand is not equal to zero
`spl`	The source register operand is positive or null
`smi`	The source register operand is strictly negative
`so`	The source register operand is odd
`se`	The source register operand is even
`nc5` `nc6` `nc7` `nc8` `nc9` `nc10` `nc11` `nc12` `nc13` `nc14`	Operation result set the carry flag number 5, 6, 7, 8, 9, 10, 11, 12, 13 or 14 respectively. These conditions may come in handy to quickly detect buffer overflows.
`max`	Operation is a bit count and the result is the maximum count value
`nmax`	Operation is a bit count and the result is not the maximum count value
`sh32`	The second operand is a register with a value having bit number 5 equal to one
`nsh32`	The second operand is a register with a value having bit number 5 equal to zero
`eq`	The first operand is equal to the second operand
`neq`	The first operand is different from the second operand
`ltu` `leu` `gtu` `geu`	The first operand is respectively lower than, lower or equal to, greater than, greater or equal to the second operand when performing an unsigned comparison
`lts` `les` `gts` `ges`	The first operand is respectively lower than, lower or equal to, greater than, greater or equal to the second operand when performing a signed comparison
`xz`	Operation result is null and ZeroFlag is set
`xnz`	Operation result is not null or ZeroFlag is not set
`xleu`	Either operation result holds carry flag or ZeroFlag is set
`xgtu`	Operation result holds carry flag and ZeroFlag is not set
`xles`	ZeroFlag is set and either operation result is negative or overflows
`xgts`	ZeroFlag is not set and either operation result is positive or null or overflows
`small`	Operation is an 8x8 multiplication and the result is less than 256
`large`	Operation is an 8x8 multiplication and the result is greater than 255

The list of condition classes can be found in DPU condition classes.

Endianness specification

Load and store operations can be performed either with the DPU native endianness (little-endian) or in big-endian. The endianness is selected by an instruction argument, either !little or !big.
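
The difference between the two byte orders can be illustrated in Python for a 32-bit word. The functions store_word and load_word are hypothetical models, and the strings "little" and "big" merely mirror the !little and !big arguments.

```python
def store_word(value, endianness="little"):
    """Return the 4 bytes written to memory for a 32-bit store."""
    return (value & 0xFFFFFFFF).to_bytes(4, endianness)


def load_word(data, endianness="little"):
    """Return the 32-bit value read back from 4 bytes of memory."""
    return int.from_bytes(data, endianness)


word = 0x11223344
print(store_word(word, "little").hex())  # least significant byte first
print(store_word(word, "big").hex())     # most significant byte first
```

Loading with the same endianness that was used for the store gives back the original value; mixing the two reverses the byte order.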

Implicit operands

Implicit operands are used by instructions and conditions to behave according to the result of a previous operation. They are particularly useful to emulate 64-bit operations.

The results of previous operations are maintained by flags, presented below.

Zero flag

This flag is set when an operation result is equal to zero.

Carry flag

This flag is set when an operation result holds a carry.
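
As an illustration of how the carry flag helps emulate 64-bit operations, the following Python sketch adds two 64-bit values using two 32-bit additions. The functions add32 and add64 are hypothetical models of the principle and do not correspond to specific DPU instructions.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in=0):
    """32-bit addition returning (result, carry flag)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def add64(a_hi, a_lo, b_hi, b_lo):
    """64-bit addition built from two 32-bit additions."""
    lo, carry = add32(a_lo, b_lo)     # low words first, sets the carry
    hi, _ = add32(a_hi, b_hi, carry)  # high words consume the carry
    return hi, lo


# 0x00000000ffffffff + 1 = 0x0000000100000000
print(add64(0, 0xFFFFFFFF, 0, 1))
```

The second addition depends on the carry produced by the first one, which is exactly the role of an implicit operand.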

Linker directives

Linker directives are forwarded to the linker process, so that it properly manages various aspects of the program, such as:

Fetching the label names when referenced from outer part of the assembly file
Initializing the static memory at boot time

Directives are represented by a dot, followed by the directive name, followed by specific arguments depending on the directive itself.

For example, to align the next data item on a 32-bit boundary in memory, one may use the align directive as follows:

.align 4

LLVM understands most of the GNU Assembler directives. The following is a non-exhaustive list of the most relevant directives.

text

The text directive declares a segment of text. All the subsequent information must be in assembly instructions, so that the linker puts those instructions into the program.

Syntax: .text

data

Declares segments of memory. All the subsequent information must define memory contents, using the directives described later in this section.

Syntax: .data

bss

Declares segments of memory. Notice that the DPU does not make any distinction between bss segment and common data: the linker puts all the data contents into an initial WRAM, enclosed in the program along with the code itself.

Syntax: .bss

align

The align directive inserts padding values into the current memory segment to ensure that the next declared variable will be aligned on the required boundary.

Syntax: .align x, fill, max

Where:

x is the alignment boundary, expressed in number of bytes (e.g. 2 for a 16-bit boundary, 4 for a 32-bit boundary…)
fill is an optional argument specifying the padding value. By default, the padding value is 0
max is an optional argument specifying a maximum padding length. If the padding requires more bytes than max to reach the requested boundary, then the linker stops after having written max bytes
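
The amount of padding inserted for a given boundary can be computed as in the Python sketch below. The function names are hypothetical, the segment is modeled as a bytes object, and the optional max argument is deliberately left out of the model.

```python
def padding_for_alignment(offset, boundary):
    """Number of padding bytes needed to reach the next boundary."""
    return (-offset) % boundary


def align(segment, boundary, fill=0):
    """Return the segment extended with 'fill' bytes up to the boundary."""
    padding = padding_for_alignment(len(segment), boundary)
    return segment + bytes([fill]) * padding


# A 3-byte segment followed by '.align 4' receives one padding byte.
print(align(b"abc", 4))
print(padding_for_alignment(8, 4))
```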

ascii

The ascii directive puts a string into the current memory segment.

Syntax: .ascii string...

Where:

string is one string, complying with the directive string format
there may be one or more strings

Notice that a synonym for this directive is .string.

asciz

The asciz directive declares a null-terminated string into the current memory segment. It is equivalent to .ascii, but the assembler automatically adds a 0 at the end of each declared string.

Syntax: .asciz string...
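
The difference between the two directives is easiest to see on the bytes they produce. The Python sketch below models that behavior with two hypothetical functions, ascii_directive and asciz_directive, operating on strings that contain no escape sequences.

```python
def ascii_directive(*strings):
    """Bytes produced by '.ascii': the strings, back to back."""
    return b"".join(s.encode("ascii") for s in strings)


def asciz_directive(*strings):
    """Bytes produced by '.asciz': each string followed by a 0 byte."""
    return b"".join(s.encode("ascii") + b"\x00" for s in strings)


print(ascii_directive("ab", "c"))
print(asciz_directive("ab", "c"))
```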

byte

Puts one byte into the current memory segment.

Syntax: .byte value

Where value is an immediate 8-bit value.

short

Puts a 16-bit word into the current memory segment.

Syntax: .half value, .hword value or .short value

Where value is an immediate 16-bit value.

long

Puts a 32-bit word into the current memory segment.

Syntax: .word value or .long value

Where value is a 32-bit value.

quad

Puts a 64-bit word into the current memory segment.

Syntax: .dword value or .quad value

Where value is a 64-bit value.
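
The sizes of the byte, short, long and quad directives can be illustrated with Python's struct module. This sketch assumes that values are laid out in the DPU native little-endian order and that negative values are stored in two's complement; the emit function and the FORMATS table are hypothetical.

```python
import struct

# Little-endian, fixed-size formats: 1, 2, 4 and 8 bytes.
FORMATS = {"byte": "<B", "short": "<H", "long": "<I", "quad": "<Q"}


def emit(directive, value):
    """Bytes placed in the current segment by a data directive (model)."""
    fmt = FORMATS[directive]
    bits = struct.calcsize(fmt) * 8
    # Wrap the value to the directive width (two's complement).
    return struct.pack(fmt, value & ((1 << bits) - 1))


print(emit("byte", 0x12).hex())
print(emit("short", 0x1234).hex())
print(emit("long", 0x11223344).hex())
print(emit("quad", 1).hex())
```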

zero

Puts a run of identical bytes into the current memory segment.

Syntax: .zero n, value

Where:

n is the number of bytes set to the specified value
value is an optional byte value (by default, the directive creates a series of 0s).

global

The global (also called globl) directive “exports” the specified symbol, so that it can be referenced from another source.

From a C perspective, this means that non-global symbols are seen as static items.

Syntax: .global symbol

Where symbol is the exported symbol.

String format

Strings for directives such as ascii are written between double quotes. They may contain ASCII characters, which are interpreted as is, and special characters, which are written as an escape sequence: a backslash followed by the definition of the special character.

Possible definitions for special characters are:

b: backspace (ASCII octal code 010)
f: form feed (ASCII octal code 014)
n: new line (ASCII octal code 012)
r: carriage return (ASCII octal code 015)
t: tabulation (ASCII octal code 011)
\: backslash, so that \\ stands for the backslash character
": double quote, so that \" stands for the double-quote character
A suite of digits representing the character value in octal (for example \010 stands for backspace)
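
The escape sequences listed above can be decoded as in the following Python sketch. It is an illustrative model rather than the assembler's implementation: the function name decode_string is hypothetical, the input is the text between the double quotes, and limiting an octal sequence to three digits is an assumption borrowed from common assembler conventions.

```python
SIMPLE_ESCAPES = {
    "b": 0o10, "f": 0o14, "n": 0o12, "r": 0o15, "t": 0o11,
    "\\": ord("\\"), '"': ord('"'),
}
OCTAL_DIGITS = "01234567"


def decode_string(body):
    """Decode the contents of a directive string into bytes."""
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i] != "\\":
            out.append(ord(body[i]))   # plain ASCII character
            i += 1
            continue
        i += 1                         # skip the backslash
        if i >= len(body):
            raise ValueError("dangling backslash")
        esc = body[i]
        if esc in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[esc])
            i += 1
        elif esc in OCTAL_DIGITS:
            j = i
            # Assumption: at most three octal digits per sequence.
            while j < len(body) and j - i < 3 and body[j] in OCTAL_DIGITS:
                j += 1
            out.append(int(body[i:j], 8) & 0xFF)
            i = j
        else:
            raise ValueError("unknown escape: \\" + esc)
    return bytes(out)


print(decode_string(r"a\tb"))
print(decode_string(r"\010"))
```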