
Volatile is better for embedded developers [updated]

Posted by Magnus Unemyr on Oct 18, 2016 11:17:00 AM

Note: This blog post was originally published on 26th of July 2016, but has now been republished with additional information on critical regions.

Usually stability is a good thing. Sometimes, however, volatile is better. In fact, volatile equals reliability in some cases - in particular for embedded developers. One of the most classic embedded development bugs is NOT using the "volatile" keyword in some variable declarations. If you are thoroughly confused by now, and have no clue what I am writing about, you really need to read this blog post right now.


The short story:

You sometimes introduce a very difficult-to-find bug if you don't declare a variable using the "volatile" keyword.

The long story:

We need to start with some compiler theory. Assume you have a variable declaration like this:

int Value;

On most modern microcontrollers, this becomes a 32-bit variable stored in RAM. The exact location of the variable is determined by the toolchain during the build process, but in this case we can assume it is stored at memory location 0x00201000.

Writing to the variable might be done in C source code using a statement like this:

Value = 7;

The compiler generates corresponding assembler code, for example this (pseudo code only for clarity, not actual assembler code for any particular CPU architecture):

LOAD REG1, 7            ; Load the value 7 into a register (REG1)

MOV [0x00201000], REG1  ; Store the value in the register in the memory location of the variable

 

Reading a variable value is the opposite:

Value = Value + 1;

This could generate assembler code similar to this:

MOV REG1, [0x00201000]  ; Retrieve the value from memory

ADD REG1, 1             ; Add one to the value

MOV [0x00201000], REG1  ; Store the updated value in the register in the memory location of the variable

 

In a real-life scenario, the same variable is often accessed multiple times in the same code area, for example like this:

Value = Value + 4;

Value = Value - 2;

This would generate the following assembler code:

MOV REG1, [0x00201000]    ; Retrieve the value from memory

ADD REG1, 4               ; Add four to the value

MOV [0x00201000], REG1    ; Store the value in the register in the memory location of the variable

MOV REG1, [0x00201000]    ; Retrieve the value from memory

SUB REG1, 2               ; Subtract two from the value

MOV [0x00201000], REG1    ; Store the value in the register in the memory location of the variable

There is clearly unnecessary duplication going on here.

 

Therefore, the optimization phase of the compiler removes the unnecessary assembler instructions, reducing the sequence to something like this:

MOV REG1, [0x00201000]     ; Retrieve the value from memory

ADD REG1, 4                ; Add four to the value

SUB REG1, 2                ; Subtract two from the value

MOV [0x00201000], REG1     ; Store the value in the register in the memory location of the variable

 

Compilers optimize code like this all the time. In this case, the compiler removes the unnecessary reads and writes to and from memory, as it knows the value is already in a CPU register (accessing CPU registers is much faster than accessing memory, and is thus the preferred method for any optimizing compiler). Some of the reads and writes to and from memory are thus redundant and can be removed.

In fact, a good compiler will also detect that the two arithmetic statements can be reduced to a single one that just adds 2, instead of first adding 4 and then subtracting 2.
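With that additional optimization, the generated code could shrink further, into something like this (again pseudo code only, not actual assembler for any particular CPU architecture):

MOV REG1, [0x00201000]     ; Retrieve the value from memory
ADD REG1, 2                ; Add two (the +4 and -2 folded into a single operation)
MOV [0x00201000], REG1     ; Store the value in the register in the memory location of the variable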

 

So what does all this have to do with volatility?

As it turns out, a lot.

This is because a variable's value can sometimes be updated without the compiler knowing about it.

 

How could that happen? If we have this C source code...

Value = Value + 4;

Value = Value - 2;

...generating this machine code...

MOV REG1, [0x00201000]    ; Retrieve the value from memory

ADD REG1, 4               ; Add four to the value

SUB REG1, 2               ; Subtract two from the value

MOV [0x00201000], REG1    ; Store the value in the register in the memory location of the variable

Then nothing else can update the value of the variable in memory, right?

Untrue.

There are at least two situations where a variable can change its value without the compiler being aware of it:

  • Hardware events can change the value of Special Function Register (SFR) variables
  • Interrupt handlers can break into the execution flow at any time and overwrite the current state

Let's look into the problem with SFR register variables first. A special function register (SFR) is a memory location that acts as a programming interface between the hardware and the software. Software can write bits into the bitfields of an SFR register to configure the hardware, or to trigger some event (setting the baud rate of a UART, and sending one byte, are good examples). Vice versa, the hardware can set bits in the SFR register bitfields to notify the software of its status, for example that a new byte has been received on a UART channel.

How does this relate to the volatile keyword? To allow easy access to the SFR hardware register from software, a C or C++ "variable" is normally mapped onto the same memory location using a symbolic #define. Something like this:

#define sfrVariable (*((unsigned char*) 0x000000F0))

This C line declares a symbol (sfrVariable) that refers to the contents of the memory location 0x000000F0 (where a particular 8-bit SFR hardware register is assumed to be located). With the above definition at hand, we can read and write the SFR register from our C code, like this:

if( sfrVariable == 0x04 )
  ;    // Do something

The only problem is that the above if-statement will not work. At least not always. The compiler will sometimes optimize the assembler code by caching the value in a CPU register, and then operate on that cached copy because it thinks it still holds the accurate "variable" value.

Since the hardware may change the value of the SFR register memory location (with the "variable" mapped over the same memory location) at any point in time, the compiler will generate logic that operates on inaccurate cached values at runtime. What we need is a way to prevent the compiler from caching variable values in CPU registers, and instead use somewhat less efficient code that reads and writes the actual memory location every time. That would prevent the problems caused by caching.
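A classic way this bites in real code is a busy-wait loop that polls an SFR. As a sketch (the register address and bit mask are assumptions for illustration only), without volatile the compiler is free to read the memory location once, keep the value in a CPU register, and then spin forever on that stale copy:

#define sfrStatus (*((unsigned char*) 0x000000F0))   // Hypothetical status register, no volatile yet

void wait_for_data(void)
{
  // The compiler may read 0x000000F0 once, cache the value in a CPU register,
  // and never notice that the hardware later sets the "data ready" bit.
  while( (sfrStatus & 0x01) == 0 )
    ;    // Busy-wait forever on the cached value
}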

Preventing such caching is exactly what the volatile keyword does, so our SFR "variable" definition can now be rewritten as:

#define sfrVariable (*((volatile unsigned char*) 0x000000F0))

With the above definition of an SFR register "variable" mapped over the same memory location as the SFR hardware register, the compiler is now prevented from caching its value in CPU registers, and all reads and writes are done using slightly less efficient accesses to the actual memory location. As a result, code like this now checks the actual value of the SFR register memory location, rather than an inaccurate cached value:

if( sfrVariable == 0x04 )
  ;    // Do something

This works, and solves hard-to-find problems caused by the compiler optimization phase.
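Applied to the busy-wait sketch from earlier, the volatile qualifier forces a fresh read of the hardware register on every iteration of the loop:

#define sfrStatus (*((volatile unsigned char*) 0x000000F0))   // Same hypothetical status register, now volatile

void wait_for_data(void)
{
  // Every iteration re-reads the memory location 0x000000F0, so the loop
  // exits as soon as the hardware sets the "data ready" bit.
  while( (sfrStatus & 0x01) == 0 )
    ;    // Busy-wait on the actual register contents
}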

Another problem related to CPU register caching and the volatile keyword can be caused by interrupt handlers and parallel RTOS tasks. An interrupt handler can update variable values asynchronously at any time, defeating the compiler optimization (which caches values in CPU registers) and thus creating buggy code. For example, an interrupt handler like this could be executed at any time, without the compiler being aware of it:

main()

{

  ...

  Value = Value + 4;

  Value = Value - 2;

  ...

  Value = Value + 3;

  ...

}

 

void IntHandler_TimerInterrupt()

{

  ...

  Value = 15;

  ...

}

 

What happens if the variable value magically gets updated (by the interrupt handler) between the -2 and +3 operations in main()?  I mean, something like this:

MOV REG1, [0x00201000]  ; Retrieve the value from memory

ADD REG1, 4             ; Add four to the value

SUB REG1, 2             ; Subtract two from the value

LOAD REG2, 15           ; INTERRUPT HANDLER CODE: Load the value 15 into a register (REG2)

MOV [0x00201000], REG2  ; INTERRUPT HANDLER CODE: Store the new value in the memory location of the variable

...

ADD REG1, 3             ; Add three to the value

MOV [0x00201000], REG1  ; Store the value in the register in the memory location of the variable

Now, the interrupt handler breaks into the variable update in main() and writes 15 into it, and later the normal code overwrites this value by writing back a (now incorrect) value into the variable in memory. This may happen perhaps once in every million executions of the interrupt handler, or once every week. A very hard-to-find bug, indeed.

Declaring any variable that is accessed by interrupt handlers with the volatile keyword solves this problem, as a variable that is accessed from both the main() execution thread and the interrupt handler(s) is then never cached in CPU registers.
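In practice, this just means adding the qualifier to the shared variable's declaration, for example:

volatile int Value;   // Shared between main() and the interrupt handler(s), so never cached in a CPU register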

However, declaring such variables using the volatile keyword does not solve all problems. In addition to declaring variables accessed from interrupt handlers as volatile, you also need to use critical sections (also known as critical regions).

Consider another execution history for the code example above, where the volatile keyword has been used, thus removing the problem of CPU register caching:

MOV REG1, [0x00201000]  ; Retrieve the value from memory

ADD REG1, 4             ; Add four to the value

MOV [0x00201000], REG1  ; Store the value in the register in the memory location of the variable

MOV REG1, [0x00201000]  ; Retrieve the value from memory

LOAD REG2, 15           ; INTERRUPT HANDLER CODE: Load the value 15 into a register (REG2)

MOV [0x00201000], REG2  ; INTERRUPT HANDLER CODE: Store the new value in the memory location of the variable

SUB REG1, 2             ; Subtract two from the value

MOV [0x00201000], REG1  ; Store the value in the register in the memory location of the variable

Despite using the volatile keyword, and thus switching off CPU register caching, the interrupt handler still breaks into the machine code at unsuitable times, creating buggy code with an inaccurately calculated variable value. So while the volatile keyword solves part of the problem, it is not enough when you have parallel software execution threads. What you need to do is prevent parallel software threads from accessing the same variable at the same time.

You do this using "critical regions", or "critical sections", that prevent other software execution threads from interrupting a variable update. Something like this:

  ...

  _enter_critical_region();

  Value = Value + 4;

  Value = Value - 2;

  _exit_critical_region();

  ...

  _enter_critical_region();

  Value = Value + 3;

  _exit_critical_region();

  ...

The _enter_critical_region() call disables any parallel execution threads, and the _exit_critical_region() call enables them again. This execution thread is now guaranteed to perform the variable update as an "atomic" operation, without the risk of other parallel execution threads corrupting it.

It is important to point out that this problem happens for any type of parallel software execution thread; it doesn't matter whether the parallel execution thread is an interrupt handler or an RTOS thread/task. In both cases, you need to prevent parallel access to the same variable using critical sections.

For interrupt handlers, a critical region is created by disabling and re-enabling the interrupt, and for RTOS tasks, the RTOS typically has API functions for this purpose. It is, however, important not to switch off the parallel execution threads for longer than absolutely necessary, as poor responsiveness or even malfunction (for example, missed interrupts) can result if you switch off software parallelism for too long.
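As a rough sketch of how the _enter_critical_region()/_exit_critical_region() calls from the example above might be implemented on a Cortex-M class device, the CMSIS __disable_irq()/__enable_irq() intrinsics can be used (an RTOS such as FreeRTOS instead offers calls like taskENTER_CRITICAL()/taskEXIT_CRITICAL()). Note that a production version should also save and restore the previous interrupt state so that critical regions can nest:

// Minimal sketch for a Cortex-M target. Include your device's CMSIS header
// to get the __disable_irq()/__enable_irq() intrinsics.
#define _enter_critical_region()   __disable_irq()   // Globally mask interrupts
#define _exit_critical_region()    __enable_irq()    // Unmask interrupts again

volatile int Value;

void UpdateValue(void)
{
  _enter_critical_region();   // No interrupt handler can break in inside this region
  Value = Value + 4;
  Value = Value - 2;
  _exit_critical_region();    // Interrupts are serviced again from here on
}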

Topics: Embedded Software Development