C++ Host API

The UPMEM DPU toolchain includes a C++ header file that wraps the C DPU library to provide a more idiomatic interface.

Contents

The detailed documentation of the API can be found in the Reference to the C++ API.

Compiler prerequisite

This API relies on C++11 features, so the application must be compiled with C++11 or a later standard.

Compiler options

As with the C API, the command dpu-pkg-config --libs --cflags dpu prints all the options needed to compile and link an application against the C++ DPU library.

Overview

The following example pairs a C++ host application with a trivial DPU program. It has no practical use; its only purpose is to present some of the main features of the C++ API.

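#include <dpu>
#include <iomanip>
#include <iostream>
#include <vector>

using namespace dpu;

int main(int argc, char **argv) {
    // Allocate one DPU; it is freed when `system` is destroyed.
    auto system = DpuSet::allocate(1);
    auto dpu = system.dpus()[0];
    // Value written to my_var, and the buffer that receives it back.
    std::vector<long> data { 0x0706050403020100L };
    std::vector<std::vector<long>> results { std::vector<long>(1) };

    dpu->load("cpp_example.dpu");  // load the DPU program
    dpu->copy("my_var", data);     // host to DPU
    dpu->exec();                   // run the DPU program
    dpu->log(std::cout);           // print the DPU logs
    dpu->copy(results, "my_var");  // DPU to host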

    long value = results[0][0];

    std::cout << "My_Var after = 0x" << std::setfill('0') << std::setw(16) << std::hex << value << std::endl;

    return 0;
}

Here is the DPU program, written in C:

#include <mram.h>
#include <stdint.h>
#include <stdio.h>

__mram uint64_t my_var; // Initialized by the host application

int main() {
    uint64_t data = my_var;
    printf("My_Var before = 0x%016lx\n", data);

    my_var = data + 1;

    return 0;
}

Calling the static method DpuSet::allocate allocates a number of DPUs. The example allocates a single DPU with the default profile. The underlying DPU remains available until the DpuSet destructor is called, at which point it is automatically freed.

The DPU program is loaded with the load method and executed with the exec method. The DPU logs are then printed to the standard output with the log method.

The copy methods can be used to read and write the DPU memory, using the symbols defined with the attributes __mram or __host in the DPU program.

In a real application, the load, copy and exec methods would typically be called on the whole system, or at least on each rank of the system (the list of ranks can be fetched with the ranks method).

The program can be run with:

./cpp_example.host

It produces the following output:

=== DPU#0x0 ===
My_Var before = 0x0706050403020100
My_Var after = 0x0706050403020101