RISC-V: A Baremetal Introduction using C++. Startup.

Phil Mulholland
May 29, 2021

In the last post, we set up the development environment. This post is about how the RISC-V core executes our program.

How do we go from reset to entering the main() function in C++ on RISC-V? Startup code is generally not something you need to worry about, but it is of interest when bringing up a device from scratch.

Can we write our own startup code in pure C++?

Booting a RISC-V Core

Image source: https://commons.wikimedia.org/wiki/File:Linux_boot_screen_compact.png
Booting an OS is much more complex, but the entry point relies on the same principles.

Let’s start by looking at the RISC-V specifics needed to boot. This is where the software starts executing.

Where does the execution start at reset?

RISC-V does not define a standard entry point or reset vector. For SiFive’s core we can locate the entry code in the linker section .text.metal.init.enter, but this will change for other RISC-V cores.

How is the stack created?

The stack pointer sp must be initialized manually by loading an address into the register. The linker script defines a global symbol _sp for the top of the stack, and reserves __stack_size bytes below it.

Do any other registers need to be initialized?

The linker relaxation optimization for RISC-V requires the global pointer gp to be set to a location, __global_pointer$, defined in the linker script. This is used to optimize access to global variables (see this post for more details).
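As a rough illustration of why gp matters: with relaxation enabled, the linker can replace a two-instruction address computation with a single gp-relative access when the target lies within the signed 12-bit immediate range of gp. The helper below only checks that range on plain integers. Its name and the example addresses are made up for illustration, and it is a sketch of the arithmetic rather than of what the linker actually does internally.

```cpp
#include <cstdint>

// Hypothetical helper: true if `addr` can be reached from `gp` with a single
// signed 12-bit immediate, i.e. the offset lies in [-2048, 2047].
constexpr bool gp_reachable(std::uint64_t gp, std::uint64_t addr) {
    const std::int64_t offset =
        static_cast<std::int64_t>(addr) - static_cast<std::int64_t>(gp);
    return offset >= -2048 && offset <= 2047;
}

// Example addresses are invented; gp sits in the middle of a 4 KiB window.
static_assert(gp_reachable(0x80000800, 0x80000000), "2048 bytes below gp");
static_assert(gp_reachable(0x80000800, 0x80000FFF), "2047 bytes above gp");
static_assert(!gp_reachable(0x80000800, 0x80001000), "one byte too far");
```

Globals outside this window are still reachable, just with the longer instruction sequence.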

What about multi-core startup?

RISC-V uses the concept of a hart to represent each execution context (for example, a hardware thread or a core). This example will only work with one hart.
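For readers curious what a multi-hart check might look like, here is a minimal sketch. It assumes the code runs in machine mode, where the mhartid CSR is readable, and it assumes a policy where hart 0 does the booting. Both function names are hypothetical, and none of this is part of the startup code in this post. On a non-RISC-V host the function simply returns 0 so the sketch still compiles.

```cpp
// Returns the ID of the hart executing this code.
// On a RISC-V target this reads the machine-mode mhartid CSR;
// elsewhere it returns 0 so the sketch can be built on a host machine.
inline unsigned long current_hart_id() {
#if defined(__riscv)
    unsigned long id;
    __asm__ volatile("csrr %0, mhartid" : "=r"(id));
    return id;
#else
    return 0;
#endif
}

// Assumed policy: only hart 0 runs the startup code; others would be parked.
constexpr bool is_boot_hart(unsigned long hart_id) {
    return hart_id == 0;
}
```

A multi-hart startup could test is_boot_hart(current_hart_id()) early on and send every other hart into a wait loop.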

You can see these specifics handled in the small function below using inline assembly.

extern "C" void _enter(void) __attribute__((naked,
    section(".text.metal.init.enter")));
extern "C" void _start(void) __attribute__((noreturn));

void _enter(void) {
    // Setup SP and GP
    // The locations are defined in the linker script
    __asm__ volatile (
        ".option push;"
        // Relaxation must be off while gp itself is being loaded
        ".option norelax;"
        "la gp, __global_pointer$;"
        ".option pop;"
        "la sp, _sp;"
        // Jump without saving a return address
        "jal zero, _start;"
        : /* output: none */
        : /* input: none */
        : /* clobbers: none */);
    // This point will not be executed,
    // _start() will be called with no return.
}

A few details:

  • sp and gp are aliases for the general-purpose registers x2 and x3; these roles are defined by the ABI and code generation conventions.
  • The zero register is an alias for x0. Unlike sp and gp, it is defined as constant 0 by the hardware specification, not just by convention.
  • The la instruction is a pseudo-instruction for loading addresses. RISC-V defines many such pseudo-instructions.
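To make the pseudo-instruction point concrete: for non-position-independent code, the assembler typically expands la into an auipc/addi pair, splitting the PC-relative offset into an upper part and a signed 12-bit lower part. The sketch below reproduces that split with ordinary integer arithmetic. The struct and function names are hypothetical and the offsets are made up.

```cpp
#include <cstdint>

// Hypothetical pair of immediates: `hi` is the value auipc would add
// (after shifting left by 12), `lo` is the signed 12-bit immediate for addi.
struct HiLo {
    std::int64_t hi;
    std::int64_t lo;
};

// Split an offset so that hi * 4096 + lo == offset and lo is in [-2048, 2047].
// Adding 0x800 before shifting compensates for addi sign-extending `lo`.
constexpr HiLo split_offset(std::int64_t offset) {
    const std::int64_t hi = (offset + 0x800) >> 12;
    const std::int64_t lo = offset - hi * 4096;
    return {hi, lo};
}

static_assert(split_offset(0x1234).hi == 1, "upper part");
static_assert(split_offset(0x1234).lo == 0x234, "lower part");
// A lower part of 0x800 or more rounds `hi` up and makes `lo` negative.
static_assert(split_offset(0x1800).hi == 2, "rounded up");
static_assert(split_offset(0x1800).lo == -2048, "negative lower part");
```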

Initializing the C++ World

The _enter() function was pure assembly, so when do we get to use C++ as promised?

Once we reach the _start() function we can use SOME C++: we have a stack, so we can use local variables. The heap, globals, and anything that relies on static initialization are not available yet. In this case we can use the std::fill, std::copy and std::for_each functions from the <algorithm> header.

What needs to be initialized? The SRAM at reset has no defined value, so any globals will be in an undefined state. Each global is initialized in one of three ways: it is cleared to zero (.bss), it is copied from a default value stored in the program image (.data), or it is set up by an initialization function (constructors).
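As a small illustration of those three cases, the globals below would typically end up in .bss, in .data, and in the constructor table respectively. All names and values here are hypothetical, and the exact placement depends on the compiler and its optimization settings.

```cpp
#include <cstdint>

// No initial value: typically placed in .bss and cleared to zero at startup.
std::uint32_t tick_count;

// Constant initial value: typically placed in .data and copied from the program image.
std::uint32_t baud_rate = 115200;

// Non-trivial constructor that reads another global at run time:
// typically constructed through a function pointer in the init array.
struct Uart {
    std::uint32_t divisor;
    explicit Uart(std::uint32_t clock_hz) : divisor(clock_hz / baud_rate) {}
};

// 16 MHz is an invented clock frequency for this sketch.
Uart console(16000000);
```

On a hosted system the C++ runtime does this work before main(); on bare metal none of these values can be trusted until the startup code has run.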

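#include <algorithm>  // std::fill, std::copy, std::for_each
#include <cstdint>    // std::uint8_t
#include <cstdlib>    // _Exit

// Assumed definition: init/fini array entries are plain void function pointers.
using function_t = void (*)();

// Assumed declaration of the application entry point.
int main();

// Region boundaries defined in the linker script
extern "C" std::uint8_t metal_segment_bss_target_start, metal_segment_bss_target_end;
extern "C" std::uint8_t metal_segment_data_source_start;
extern "C" std::uint8_t metal_segment_data_target_start, metal_segment_data_target_end;
extern "C" std::uint8_t metal_segment_itim_source_start;
extern "C" std::uint8_t metal_segment_itim_target_start, metal_segment_itim_target_end;

// Constructor and destructor tables defined in the linker script
extern "C" function_t __init_array_start, __init_array_end;
extern "C" function_t __fini_array_start, __fini_array_end;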

// Define the symbols with "C" naming as they are used by the assembler
extern "C" void _start(void) __attribute__ ((noreturn));

// At this point we have a stack and global pointer, but no access to global variables.
void _start(void) {
    // Init memory regions
    // Clear the .bss section (global variables with no initial values)
    std::fill(&metal_segment_bss_target_start,
              &metal_segment_bss_target_end,
              0U);
    // Initialize the .data section (global variables with initial values)
    std::copy(&metal_segment_data_source_start,
              &metal_segment_data_source_start + (&metal_segment_data_target_end - &metal_segment_data_target_start),
              &metal_segment_data_target_start);
    // Initialize the .itim section (code moved from flash to SRAM to improve performance)
    std::copy(&metal_segment_itim_source_start,
              &metal_segment_itim_source_start + (&metal_segment_itim_target_end - &metal_segment_itim_target_start),
              &metal_segment_itim_target_start);
    // Call constructors
    std::for_each(&__init_array_start,
                  &__init_array_end,
                  [](function_t pf) { (pf)(); });
    // Jump to main
    auto rc = main();
    // Call destructors
    std::for_each(&__fini_array_start,
                  &__fini_array_end,
                  [](function_t pf) { (pf)(); });
    // Don't expect to return, if so busy loop in the exit function.
    _Exit(rc);
}

What is happening above? It’s initializing the regions of memory for the C++ program. The linker script has defined the locations, but we need to initialize the SRAM, which is in an undefined state at startup.

  • The linker script used here is from the Freedom E-SDK. Any variable named metal_*, __init_* or __fini_* is defined in the linker script.
  • The bss region contains global variables with no initial value. The SRAM allocated to these variables is cleared to 0.
  • The data section contains global variables with initial values. These values are copied from the program image in read-only memory (FLASH/ROM) to SRAM.
  • The itim section is a code section that is to be moved to SRAM to improve performance.
  • The init array is a table of constructor function pointers to construct global variables.
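The steps in this list can be tried out on a desktop machine with a small simulation. The sketch below replaces the linker-defined regions with ordinary arrays and runs the same three algorithms over them. Every name in it is a hypothetical stand-in and nothing here touches real linker symbols.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

using function_t = void (*)();

// Stand-ins for the regions a linker script would define (all hypothetical).
std::array<std::uint8_t, 4> fake_bss = {0xAA, 0xBB, 0xCC, 0xDD};    // "undefined" SRAM contents
const std::array<std::uint8_t, 4> fake_data_image = {1, 2, 3, 4};  // initial values in flash
std::array<std::uint8_t, 4> fake_data = {0xEE, 0xEE, 0xEE, 0xEE};  // .data region in SRAM

// Two dummy "constructors" that just count how often they were called.
int constructed = 0;
void ctor_a() { ++constructed; }
void ctor_b() { ++constructed; }
const std::array<function_t, 2> fake_init_array = {ctor_a, ctor_b};

// The same steps as _start(), applied to the stand-in regions.
void simulate_startup() {
    // Clear .bss
    std::fill(fake_bss.begin(), fake_bss.end(), std::uint8_t{0});
    // Copy the .data image from "flash" to "SRAM"
    std::copy(fake_data_image.begin(), fake_data_image.end(), fake_data.begin());
    // Call each constructor in table order
    std::for_each(fake_init_array.begin(), fake_init_array.end(),
                  [](function_t pf) { pf(); });
}
```

After simulate_startup() returns, fake_bss holds only zeros, fake_data matches fake_data_image, and constructed is 2.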

Conclusion

This is the absolute bare minimum needed to run a program, but from here we can call main() and start the program.

Could we implement the startup routines in pure C++? Not entirely: the entry point still needs a few lines of assembly. Even so, we have benefited from the abstraction of C++.

The complete source code is here

The next post will look at RISC-V system registers and how to use C++ to abstract special instructions.

This C++ start-up code is based on examples in chapter 8 of Christopher Kormanyos’s Real Time C++. His code examples for AVR, ARM, Renesas, etc. are on GitHub. He shows we can implement most of the code in C++.

Phil Mulholland

Experienced in Distributed Systems, Event-Driven Systems, Firmware for SoC/MCU, Systems Simulation, Network Monitoring and Analysis, Automated Testing and RTL.