Hello World! Example
Purpose
This tutorial demonstrates how to write and test a “hello world” program for a DPU, including:
Building a program that executes the “hello world” function
Simulating this program and checking the results
We assume that the UPMEM DPU toolchain is properly installed on your computer (if not, see Installing the UPMEM DPU toolchain).
Writing and building the program
The program prints “Hello World!”:
#include <stdio.h>

int main() {
    printf("Hello World!\n");
    return 0;
}
Let’s save this code into helloworld.c.
To compile and build the program executing this routine, invoke dpu-upmem-dpurte-clang as follows:
dpu-upmem-dpurte-clang -o helloworld helloworld.c
To ease the use of debugging tools, dpu-upmem-dpurte-clang enables debug symbols by default.
They can be disabled by adding -g0 to the compiler command line.
For more information about dpu-upmem-dpurte-clang arguments, refer to the CLANG COMPILER USER’S MANUAL.
Running and testing hello world
To execute the program, we will use dpu-lldb.
Once it is launched, the help command lists the available commands.
In our example, we will simply load the “hello world” program and execute it with the following commands:
file helloworld
process launch
exit
You will see “Hello World!”, followed by a message on the console indicating that the program ended successfully:
Hello World!
exited with status = 0 (0x00000000)
The exit status is the 8 least significant bits (LSB) of the value returned by thread 0 (the first thread used to execute the DPU program).
In our case, we returned 0x0.
A more robust way for the DPU to signal that the execution was not successful is to trigger a fault or to write some information in memory.
Note: dpu-lldb can be used to run a program, but it is first and foremost a debugger.
For more information on dpu-lldb, see the section on Debugging.
Creating a host application to drive the program
Running a DPU program with dpu-lldb is mainly intended to facilitate the development of programs running on DPUs.
Your final product, however, will consist of a host application able to load the “hello world” program onto a DPU and execute it.
The host APIs are available for the C, C++, Java and Python languages. This tutorial focuses on the C language, but equivalent code for C++, Java and Python is provided where applicable.
Let’s see how to write such a host application to get a fully operational environment.
First, you must write the host application itself (in helloworld_host.c, for example):
#include <assert.h>
#include <dpu.h>
#include <dpu_log.h>
#include <stdio.h>

#ifndef DPU_BINARY
#define DPU_BINARY "./helloworld"
#endif

int main(void) {
    struct dpu_set_t set, dpu;

    DPU_ASSERT(dpu_alloc(1, NULL, &set));
    DPU_ASSERT(dpu_load(set, DPU_BINARY, NULL));
    DPU_ASSERT(dpu_launch(set, DPU_SYNCHRONOUS));

    DPU_FOREACH(set, dpu) {
        DPU_ASSERT(dpu_log_read(dpu, stdout));
    }

    DPU_ASSERT(dpu_free(set));
    return 0;
}
C++ (helloworld_host.cpp):
#include <dpu>
#include <iostream>

using namespace dpu;

int main(void) {
    try {
        auto dpu = DpuSet::allocate(1);
        dpu.load("helloworld");
        dpu.exec();
        dpu.log(std::cout);
    }
    catch (const DpuError & e) {
        std::cerr << e.what() << std::endl;
    }
    return 0;
}
Java (HelloWorldHost.java):
import com.upmem.dpu.Dpu;
import com.upmem.dpu.DpuException;
import com.upmem.dpu.DpuSystem;

public class HelloWorldHost {
    public static void main(String[] args) throws DpuException {
        try (DpuSystem dpu = DpuSystem.allocate(1, "")) {
            dpu.load("helloworld");
            dpu.exec(System.out);
        }
    }
}
Python (helloworld.py):
#!/bin/env python3
from dpu import DpuSet
from sys import stdout

with DpuSet(nr_dpus=1, binary="helloworld", log=stdout) as dpu:
    dpu.exec()
Briefly:
DPU_ASSERT handles errors in the DPU API and exits in case of an error.
dpu_alloc allocates a set of UPMEM DPU ranks. One set contains several DPU ranks and each rank contains several DPUs, the number depending on the target:
with the simulator, a rank contains 1 DPU;
with other targets, it can vary, even between 2 ranks of the same target.
dpu_load reads the binary executable and loads it into the allocated DPU set.
dpu_launch starts the execution of the program. With DPU_SYNCHRONOUS, the host application remains suspended until the program is finished.
DPU_FOREACH iterates over the individual DPUs of the allocated set.
dpu_log_read fetches the DPU stdout buffer and displays it on the host stdout.
When the execution completes, the allocated DPU set must be freed using dpu_free.
Note: As shown in the corresponding code, APIs are also available in C++, Java and Python to allocate DPUs, load the binary executable, launch the program, and so on. More information can be found in the host API documentation: C++ Host API, Java Library and Python Library.
This is a simple example using only 1 DPU. In most use cases, the host application will use far more than 1 DPU at a time, but the API functions stay generally the same: the DPU set parameter determines the scope of the action, and thus the overall performance (see section Controlling the execution of DPUs from host applications for more details).
This program does not check the execution result, but different methods exist to gather such results, including:
Sharing small data through the WRAM
Sharing buffers through the MRAM
These techniques will be described later in this documentation.
To compile and link this application, you can use any standard compiler installed on your machine (gcc, for example) together with dpu-pkg-config:
C:
gcc --std=c99 helloworld_host.c -o helloworld_host `dpu-pkg-config --cflags --libs dpu`
C++:
g++ --std=c++11 helloworld_host.cpp -o helloworld_host_cpp `dpu-pkg-config --cflags --libs dpu` -g
Java:
javac -cp $(dpu-pkg-config --variable=java dpu) HelloWorldHost.java
Python:
N/A (no compilation step is needed)
And then run the application:
C:
./helloworld_host
C++:
./helloworld_host_cpp
Java:
java -cp .:$(dpu-pkg-config --variable=java dpu) HelloWorldHost
Python:
python3 helloworld.py
About dpu-pkg-config
dpu-pkg-config is a tool based on pkg-config. With --cflags, it adds the path to the DPU include directory (-I<path_to_DPU_include_directory>); with --libs, it adds the path to the DPU libraries and the link directive (-L<path_to_DPU_libraries> -ldpu).
While paths can change from one release to another, dpu-pkg-config ensures that the compilation directives are always correct.
Conclusion
With this introduction, you should now be familiar with the main components of the UPMEM DPU toolchain. The rest of the documentation will introduce you to the details of each of them.