Examples of an assembly program
This section gives concrete examples of how to develop, build, and test pure assembly programs for the DPU.
Saying hello to the world
This first illustration is similar to the “hello world” introductory program:
Declare a string equal to “hello world”
Compute the checksum of these characters and store the result into r0
The source code
The string is declared with the .string directive. The code below defines
a “global variable” hello equal to this string. This variable must reside in the data section of the
program, which automatically places it in WRAM:
.data
.global hello
hello:
.string "hello world"
The main function loops over the characters of the string until it finds the terminating zero byte. This main routine is in the text section
of the program (which automatically places it in IRAM) and is marked as global, so that the RTE can bootstrap it.
The hello world program (helloworld.S) is:
// Hello world, written in assembly: computes the checksum of "hello world"
.text
.globl __bootstrap
__bootstrap:
#define stringPointer r1
#define checksum r0
move checksum, 0
move stringPointer, hello
#define currentCharacter r2
checksum_loop:
// Load the current character
lbu currentCharacter , stringPointer , 0
// And exit if this character is 0
jz currentCharacter, end_of_loop
add checksum, checksum, currentCharacter
// Move to next character
add stringPointer, stringPointer, 1, true, checksum_loop
end_of_loop:
stop
.data
.globl hello
hello:
.string "hello world"
Building the program
Let’s assemble the program as usual using dpu-upmem-dpurte-clang:
dpu-upmem-dpurte-clang -nostartfiles -o helloworld helloworld.S
Notice that we compiled with -nostartfiles to define the entry point ourselves (__bootstrap).
Running the program
Let’s verify that the code above is correct, using dpu-lldb:
file helloworld
process launch --stop-at-entry
breakpoint set --source-pattern-regexp "stop"
process continue
register read r0
exit
When the program has terminated, verify that the return register r0 is equal to the checksum of “hello world” (i.e. 45c in hexadecimal):
r0 = 0x0000045c
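The expected value can also be cross-checked on the host, independently of the DPU toolchain. The sketch below is plain Python, not part of the UPMEM SDK, and the function name checksum_until_nul is hypothetical. It mirrors the assembly loop: read one byte at a time, stop at the terminating zero, and accumulate the sum in a 32-bit value.

```python
def checksum_until_nul(memory: bytes, start: int = 0) -> int:
    """Sum the bytes from `start` up to (not including) the first zero byte."""
    checksum = 0
    pointer = start
    while memory[pointer] != 0:
        # Keep the accumulator within 32 bits, like register r0
        checksum = (checksum + memory[pointer]) & 0xFFFFFFFF
        pointer += 1
    return checksum


# The .string directive appends a terminating zero byte after the characters
hello = b"hello world" + b"\x00"
result = checksum_until_nul(hello)
print(f"r0 = 0x{result:08x}")  # prints "r0 = 0x0000045c"
```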
Placing numerical values in memory
Many programs need static variables in memory to operate.
The basic directives to declare them are .byte, .short, .long and .quad. Such variables must be
declared in a .data section of the code.
The next program uses two variables ‘a’ and ‘b’, fetched from the WRAM, and stores the sum of ‘a’ and ‘b’ into memory:
.text
.globl __bootstrap
__bootstrap:
move r16, values
lw r0,r16,0
lw r1,r16,4
add r0, r0,r1
sw r16,8, r0
stop
.data
.globl values
values:
.long 0x12345678 //a
.long 0x9abcdef0 //b
.long 0 // s=a+b
Once the program is built:
dpu-upmem-dpurte-clang -nostartfiles -o trivial_add trivial_add.S
The debugger makes it easy to verify the result:
file trivial_add
process launch --stop-at-entry
breakpoint set --source-pattern-regexp "stop"
process continue
parray 3 &values
exit
The third word holds the stored result, which is the sum of ‘a’ and ‘b’:
(void *) [0] = 0x12345678
(void *) [1] = 0x9abcdef0
(void *) [2] = 0xacf13568
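The value of the third word can be predicted on the host before running the program. The sketch below is plain Python, outside the UPMEM toolchain, and the helper name add32 is hypothetical. It assumes that add works on 32-bit registers and wraps modulo 2^32; this particular pair of operands does not overflow, so the mask only matters for larger values.

```python
MASK32 = 0xFFFFFFFF


def add32(a: int, b: int) -> int:
    """Add two values the way a 32-bit register would (assumed wrap-around)."""
    return (a + b) & MASK32


a = 0x12345678
b = 0x9ABCDEF0
values = [a, b, add32(a, b)]

# Print the three words in the same order as the 'values' array in WRAM
for index, word in enumerate(values):
    print(f"[{index}] = 0x{word:08x}")
# prints:
# [0] = 0x12345678
# [1] = 0x9abcdef0
# [2] = 0xacf13568
```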
Useful common assembler directives
The list of assembler directives, along with a comprehensive description, can be found in Assembler syntax. The most commonly used ones are described below.
Creating a static buffer of data
The “zero” directives allow creating a static buffer of data with an initial value. The DPU assembler repeats the specified value (zero by default) a certain number of times.
The DPU assembler defines:
.zero to create a buffer of bytes
.fill to create a buffer of words
The example below creates a static buffer of 7 bytes, each equal to 42 in hexadecimal and followed by one zero byte, and
loads the two words of this buffer into r0 and r1:
// Illustrates the creation and reference of a static
// buffer of memory in assembler.
.data
.globl static_buffer
.align 4
static_buffer:
.fill 7, 1, 0x42
.zero 1
.text
.globl __bootstrap
__bootstrap:
lw r0 , zero, static_buffer
move r1, 4
lw r1 , r1, static_buffer
stop
Build the program:
dpu-upmem-dpurte-clang -nostartfiles static_buffer.S -o static_buffer
Now, execute the program and verify that the registers and the memory match the
expectations. Notice that the host is a little-endian machine in this test, implying
that the 8th null byte in the buffer goes to the most significant bits of r1:
file static_buffer
process launch --stop-at-entry
breakpoint set --source-pattern-regexp "stop"
process continue
register read
parray 2 &static_buffer
exit
As expected, dpu-upmem-dpurte-clang places exactly 7 bytes in memory:
r0 = 0x42424242
r1 = 0x00424242
(void *) [0] = 0x42424242
(void *) [1] = 0x00424242
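The register values above follow directly from the little-endian layout. As a minimal host-side sketch (plain Python using the standard struct module, not part of the UPMEM SDK), the eight bytes emitted by .fill 7, 1, 0x42 and .zero 1 can be decoded as two little-endian 32-bit words:

```python
import struct

# Seven bytes equal to 0x42, followed by the single zero byte from ".zero 1"
buffer = bytes([0x42] * 7) + bytes(1)

# "<2I" decodes two unsigned 32-bit words in little-endian byte order
r0, r1 = struct.unpack("<2I", buffer)

print(f"r0 = 0x{r0:08x}")  # prints "r0 = 0x42424242"
print(f"r1 = 0x{r1:08x}")  # prints "r1 = 0x00424242"
```

The zero byte is the last byte of the second word, so in little-endian order it becomes the most significant byte of r1.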
Notice that the buffer is aligned on 4 bytes, which is necessary to perform a load word from it. Had the buffer been used for DMA purposes, it would have needed to be aligned on 8 bytes.
Useful tips and tricks
dpuasmdoc
This utility is the fastest way to look up the assembly syntax. By typing, for example:
dpuasmdoc ld
one can get the syntax of all the DPU instructions containing the keyword ld:
ld endian:e dc ra off:s24
let @a = (ra + off)
dc = (Load 8 bytes from WRAM at address @a with endianness endian)
ld endian:e dc sa off:s24
cc = (ra & 0xffff) + off + 8 - (ra >> 16)
if (const_cc_ge0 cc) then
let @a = (ra & 0xffff) + off & 0xfff8
dc = (Load 8 bytes from WRAM at address @a with endianness endian)
raise exception(_memory_fault)
else
let @a = (ra & 0xffff) + off
dc = (Load 8 bytes from WRAM at address @a with endianness endian)
ld dc ra off:s24
let @a = (ra + off)
dc = (Load 8 bytes from WRAM at address @a with endianness endian)
ldma ra rb immDma:u8
let @w = (ra & 0xfffff8)
let @m = (rb & 0xfffffff8)
let N = (1 + (immDma:U32 + (ra >> 24) & 0xff) & 0xff) << 3
Load N bytes from MRAM at address @m into WRAM at address @w
ldmai ra rb immDma:u8
let @i = (ra & 0xfffff8)
let @m = (rb & 0xfffffff8)
let N = (1 + (immDma:U32 + (ra >> 24) & 0xff) & 0xff) << 3
Load N bytes from MRAM at address @m into IRAM at address @w
lds dc ra off:s24
cc = (ra & 0xffff) + off + 8 - (ra >> 16)
if (const_cc_ge0 cc) then
let @a = (ra & 0xffff) + off & 0xfff8
dc = (Load 8 bytes from WRAM at address @a with endianness endian)
raise exception(_memory_fault)
else
let @a = (ra & 0xffff) + off
dc = (Load 8 bytes from WRAM at address @a with endianness endian)
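The ldma and ldmai entries in the listing above express the transfer length in 8-byte units. The sketch below is one host-side reading of that formula in plain Python, not SDK code. The function names dma_length_bytes and wram_address are hypothetical, and the grouping of the operators is an assumption, since the dpuasmdoc notation does not spell out precedence.

```python
def dma_length_bytes(ra: int, imm_dma: int) -> int:
    """Transfer length N of ldma/ldmai under an ASSUMED operator grouping.

    Assumed reading of the listing:
        N = (1 + ((immDma + ((ra >> 24) & 0xff)) & 0xff)) << 3
    """
    units = (imm_dma + ((ra >> 24) & 0xFF)) & 0xFF
    return (1 + units) << 3


def wram_address(ra: int) -> int:
    """WRAM address @w = (ra & 0xfffff8), aligned down to 8 bytes."""
    return ra & 0xFFFFF8


print(dma_length_bytes(ra=0x00000100, imm_dma=0))    # prints 8
print(dma_length_bytes(ra=0x00000100, imm_dma=255))  # prints 2048
print(hex(wram_address(0x00000107)))                 # prints 0x100
```

Under this reading, a transfer length is always a multiple of 8 bytes, between 8 and 2048 bytes.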
To get help on a specific keyword (e.g. the ld instruction specifically), use:
dpuasmdoc -match ld
ld endian:e dc ra off:s24
let @a = (ra + off)
dc = (Load 8 bytes from WRAM at address @a with endianness endian)
ld endian:e dc sa off:s24
cc = (ra & 0xffff) + off + 8 - (ra >> 16)
if (const_cc_ge0 cc) then
let @a = (ra & 0xffff) + off & 0xfff8
dc = (Load 8 bytes from WRAM at address @a with endianness endian)
raise exception(_memory_fault)
else
let @a = (ra & 0xffff) + off
dc = (Load 8 bytes from WRAM at address @a with endianness endian)
ld dc ra off:s24
let @a = (ra + off)
dc = (Load 8 bytes from WRAM at address @a with endianness endian)