Introduction
This book contains a collection of Rust exercises written by Ferrous Systems. See ferrous-systems.com/training for more details or a custom quote. You can view this material online at https://rust-exercises.ferrous-systems.com.
We use these exercises as part of our Rust Training, but you are welcome to try them for yourself as well.
Source Code
The source code for this book can be found at https://github.com/ferrous-systems/rust-exercises. It is open sourced as a contribution to the growth of the Rust language.
If you wish to fund further development of the course, why not book a training with us!
Icons and Formatting we use
We use icons to mark different kinds of information in the book:
- ✅ Call for action
- ❗️ Warnings, details that require special attention
- 🔎 Knowledge that gets you deeper into the subject, but you do not have to understand it completely to proceed
- 💬 Descriptions for accessibility
Note: Notes like this one contain helpful information.
Course Material
We have attempted to make our material as inclusive as possible. This means that some information is available in several forms, for example as a picture and as a text description. We also use icons so that different kinds of information are visually distinguishable at first glance. If you are on a course and have accessibility needs that are not covered, please let us know.
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
We encourage the use of this material, under the terms of the above license, in the production and/or delivery of commercial or open-source Rust training programmes.
Copyright (c) Ferrous Systems, 2024
Fizzbuzz
In this exercise, you will implement your first tiny program in Rust: FizzBuzz. FizzBuzz is easy to implement, but allows for the application of Rust patterns in a very clean fashion. If you have never written Rust before, use the cheat sheet for help with the syntax.
After completing this exercise you are able to:
- write a simple Rust program
- create and return owned String values
- use conditionals
- format strings with and without printing them to the system console
- write a function with a parameter and return type
Prerequisites
For completing this exercise you need to have
- basic programming skills in other languages
- the Rust Syntax Cheat Sheet
Task
- Create a new project called fizzbuzz
- Define a function fn fizzbuzz that implements the following rules:
  - If i is divisible by 3, return the String "Fizz"
  - If i is divisible by 5, return the String "Buzz"
  - If i is divisible by both 3 and 5, return the String "FizzBuzz"
  - If neither is true, return the number as a String
- Write a main function that implements the following:
  - Iterate from 1 to 100 inclusive.
  - On each iteration, pass the integer to fn fizzbuzz and print the returned value.
If you need it, we have provided a complete solution for this exercise.
Knowledge
Printing to console
The recommended way to print to the console in this exercise is println!. println! always needs a format string - it uses {} as a placeholder meaning "print the next argument", like Python 3 or C#.
#![allow(unused)] fn main() { let s = "Fizz"; println!("The value is s is {}. That's nice.", s); }
Creating Strings
The two recommended ways to get a String type for this exercise are:
// 1.
let s = "Fizz".to_string();
let i = 4;
let s = i.to_string();

// 2.
let s = format!("Fizz");
let i = 4;
let s = format!("{}", i);
We'll cover these in more detail later, but either can be used to convert string literals ("hello") and integers (123) into values of type String, which is all you'll need here. We'll use format! in our examples. It's flexible and just as efficient as the other methods.
Returning data
If you have issues returning data from multiple branches of your solution, liberally use return.
fn returner() -> String {
    let x = 10;
    if x % 5 == 0 {
        return format!("Buzz");
    }
    format!("Fizz")
}
Step-by-Step-Solution
In general, we also recommend using the Rust documentation to figure out anything you are missing and to familiarize yourself with it. If you ever feel completely stuck, or if you haven't understood something, please hail the trainers quickly.
Step 1: New Project
Create a new binary Cargo project, check the build and see if it runs.
Solution
cargo new fizzbuzz
cd fizzbuzz
cargo run
Step 2: Counting from 1 to 100 in fn main()
Print the numbers from 1 to 100 (inclusive) to the console. Use a for loop.
Running this code should print the numbers from 1 to 100.
Solution
fn main() {
    for i in 1..=100 {
        println!("{}", i);
    }
}
Step 3: The function fn fizzbuzz
✅ Function Signature
Create the function with the name fizzbuzz. It takes an unsigned 32-bit integer as an argument and returns a String.
Solution
fn fizzbuzz(i: u32) -> String {
    unimplemented!()
}
✅ Function Body
Use if statements with math operators to implement the following rules:
- If i is divisible by 3, return the String "Fizz"
- If i is divisible by 5, return the String "Buzz"
- If i is divisible by both 3 and 5, return the String "FizzBuzz"
- If neither is true, return the number as a String
Running this code should still only print the numbers from 1 to 100.
Solution
fn fizzbuzz(i: u32) -> String {
    if i % 3 == 0 && i % 5 == 0 {
        format!("FizzBuzz")
    } else if i % 3 == 0 {
        format!("Fizz")
    } else if i % 5 == 0 {
        format!("Buzz")
    } else {
        format!("{}", i)
    }
}
Step 4: Call the function
Add the function call to fn fizzbuzz() to the formatted string in the println!() statement.
Running this code should print the numbers, interlaced with Fizz, Buzz and FizzBuzz according to the rules mentioned above.
Solution
fn fizzbuzz(i: u32) -> String {
    if i % 3 == 0 && i % 5 == 0 {
        format!("FizzBuzz")
    } else if i % 3 == 0 {
        format!("Fizz")
    } else if i % 5 == 0 {
        format!("Buzz")
    } else {
        format!("{}", i)
    }
}

fn main() {
    for i in 1..=100 {
        println!("{}", fizzbuzz(i));
    }
}
Fizzbuzz Cheat Sheet
This is a syntax cheat sheet to be used with the Fizzbuzz exercise.
Variables
let thing = 42;     // an immutable variable
let mut thing = 43; // a mutable variable
Functions
// a function with one argument, no return value
fn number_crunch(input: u32) {
    // function body
}

// a function with two arguments and a return type
fn division_machine(dividend: f32, divisor: f32) -> f32 {
    // function body
    let quotient = dividend / divisor;
    // return line does not have a semi-colon!
    quotient
}

fn main() {
    let cookies = 1000.0_f32;
    let cookie_monsters = 1.0_f32;
    // calling a function
    let number = division_machine(cookies, cookie_monsters);
}
for loops and ranges
// for loop with end-exclusive range
for i in 0..10 {
    // do this
}

// for loop with end-inclusive range
for j in 0..=10 {
    // do that
}
if statements
let number = 4;
if number == 4 {
    println!("This happens");
} else if number == 5 {
    println!("Something else happens");
} else {
    println!("Or this happens");
}
// the condition can be anything that evaluates to a bool
Operators (Selection)
Operator | Example | Explanation |
---|---|---|
!= | expr != expr | Nonequality comparison |
== | expr == expr | Equality comparison |
&& | expr && expr | Short-circuiting logical AND |
|| | expr || expr | Short-circuiting logical OR |
% | expr % expr | Arithmetic remainder |
/ | expr / expr | Arithmetic division |
Fizzbuzz with match
In this exercise you will modify your previously written FizzBuzz to use match statements instead of if statements.
After completing this exercise you are able to:
- use match statements
- define a tuple
Prerequisites
For completing this exercise you need to have
- a working fizzbuzz
Task
Rewrite the body of fn fizzbuzz() so that the different cases are not distinguished with if statements, but with pattern matching on a tuple containing the remainders.
If you need it, we have provided a complete solution for this exercise.
Knowledge
Tuple
A tuple is a collection of values of different types. Tuples are constructed using parentheses (), and each tuple itself is a value with type signature (T1, T2, ...), where T1, T2 are the types of its members. Functions can use tuples to return multiple values, as tuples can hold any number of values. When pattern matching on a tuple, the _ placeholder can stand for any element.
// A tuple with a bunch of different types.
let long_tuple = (1u8, 2u16, 3u32, 4u64, -1i8, -2i16, -3i32, -4i64, 0.1f32, 0.2f64, 'a', true);
Step-by-Step-Solution
We assume you have deleted the entire function body of fn fizzbuzz() before you get started.
Step 1: The Tuple
Define a tuple that consists of the remainder of the integer i divided by 3 and the remainder of i divided by 5.
Solution
let i = 10;
let remainders = (i % 3, i % 5);
Step 2: Add the match statement with its arms
The patterns of the tuple that are relevant for us are combinations of 0 and the placeholder _ (underscore), where _ stands for any value. Think about which combinations of 0 and _ represent which rules, and add the match arms accordingly.
Solution
fn fizzbuzz(i: i32) -> String {
    let remainders = (i % 3, i % 5);
    match remainders {
        (0, 0) => format!("FizzBuzz"),
        (0, _) => format!("Fizz"),
        (_, 0) => format!("Buzz"),
        (_, _) => format!("{}", i),
    }
}
Rustlatin
In this exercise we will implement a Rust-y, simpler variant of Pig Latin: depending on whether or not a word starts with a vowel, either a suffix or a prefix is added to the word.
Learning Goals
You will learn how to:
- create a Rust library
- split a &str at a specified char
- get a single char out of a &str
- iterate over a &str
- define globals
- compare a value to the content of an array
- use the Rust compiler's type inference to your advantage
- concatenate &str
- return the content of a Vec<String> as a String
Prerequisites
You must be able to:
- define variables as mutable
- use a for loop
- use an if/else construction
- read Rust documentation
- define a function with a signature and return type
- define arrays and vectors
- distinguish between String and &str
For this exercise we define:
- the vowels of the English alphabet: ['a', 'e', 'i', 'o', 'u']
- a sentence: a collection of Unicode characters in which words are separated by a space character (U+0020)
Task
✅ Implement a function that splits a sentence into its words and adds a suffix or prefix to them according to the following rules:
- If the word begins with a vowel, add the prefix "sr" to the word.
- If the word does not begin with a vowel, add the suffix "rs" to the word.
For example, the sentence Implement a function that splits a sentence into its words becomes srImplement sra functionrs thatrs splitsrs sra sentencers srinto srits wordsrs.
The function returns a String containing the modified words.
Getting started
Find the exercise template in ../../exercise-templates/rustlatin
The folder contains each step as its own numbered project, each with a lib.rs file. Each lib.rs contains starter code and a test that needs to pass in order for the step to be considered complete.
Complete solutions are available in ../../exercise-solutions/rustlatin
Knowledge
Rust Analyzer
Part of this exercise is seeing type inference in action and using it to help determine the type the function is going to return. To make sure the file can be indexed by Rust Analyzer, open the relevant step by itself - e.g. exercise-templates/rustlatin/step1. You can close each step when complete and open the next one.
Step-by-step-Solution
Step 1: Splitting a sentence and pushing its words into a vector.
✅ Iterate over the sentence to split it into words, using the white space as separator. This can be done with the .split() method, where the separator character ' ' goes into the parentheses. This method returns an iterator over substrings of the string slice. In Rust, iterators are lazy: just calling .split() on a &str doesn't do anything by itself. It needs to be combined with something that advances the iteration, such as a for loop, or a manual advancement such as the .next() method. These yield the actual items you want to use. Push each word into the vector collection_of_words. Add the correct return type to the function signature.
✅ Run the test to see if it passes.
Solution
fn rustlatin(sentence: &str) -> Vec<String> {
    let mut collection_of_words = Vec::new();
    for word in sentence.split(' ') {
        collection_of_words.push(word.to_string())
    }
    collection_of_words
}
Step 2: Concatenating String types.
✅ After iterating over the sentence to split it into words, add the suffix "rs" to each word before pushing it to the vector.
✅ To concatenate two &str, the first needs to be turned into the owned type with .to_owned(). A String and a &str can then be added using +.
✅ Add the correct return type to the function signature.
✅ Run the test to see if it passes.
Solution
fn rustlatin(sentence: &str) -> Vec<String> {
    let mut collection_of_words = Vec::new();
    for word in sentence.split(' ') {
        collection_of_words.push(word.to_owned() + "rs")
    }
    collection_of_words
}
Step 3: Iterating over a word to return the first character.
✅ After iterating over the sentence to split it into words, add the first character of each word to the vector.
✅ Check the Rust documentation on the primitive str type for a method that returns an iterator over the chars of a &str. The char type holds a Unicode Scalar Value that represents a single character (although be aware that the definition of "character" gets complex when talking about emoji and other non-English text).
Since iterators don't do anything by themselves, the iterator needs to be advanced first, with the .next() method. This method returns an Option<Self::Item>, where Self::Item is char in this case. You don't need to handle it with pattern matching here; a simple unwrap() will do, as a None is not expected to happen.
✅ Add the correct return type to the function signature. Run the test to see if it passes.
Solution
fn rustlatin(sentence: &str) -> Vec<char> {
    let mut collection_of_chars = Vec::new();
    for word in sentence.split(' ') {
        let first_char = word.chars().next().unwrap();
        collection_of_chars.push(first_char);
    }
    collection_of_chars
}
Step 4: Putting everything together: comparing values and returning the content of the vector as a String.
✅ Add another function that checks whether the first character of each word is a vowel (the contains() method can help you with this) and adds the prefix or suffix to the word according to the rules above.
Call that function in each iteration.
In fn rustlatin, return the content of the vector as a String. Run the tests to see if they pass.
Solution
const VOWELS: [char; 5] = ['a', 'e', 'i', 'o', 'u'];

fn latinize(word: &str) -> String {
    let first_char_of_word = word.chars().next().unwrap();
    if VOWELS.contains(&first_char_of_word) {
        "sr".to_string() + word
    } else {
        word.to_string() + "rs"
    }
}
Step 5 (optional)
If not already done, use functional techniques (i.e. methods on iterators) to write the same function. Test this new function as well.
Solution
const VOWELS: [char; 5] = ['a', 'e', 'i', 'o', 'u'];

fn rustlatin_match(sentence: &str) -> String {
    // transform the incoming words into their rustlatined versions
    let new_words: Vec<_> = sentence
        .split(' ')
        .map(|word| {
            let first_char_of_word = word.chars().next().unwrap();
            if VOWELS.contains(&first_char_of_word) {
                "sr".to_string() + word
            } else {
                word.to_string() + "rs"
            }
        })
        .collect();
    new_words.join(" ")
}
URLs, Match and Result
In this exercise you will complete a number of mini exercises to learn about error handling. The final result will be a URL parser that reads lines from a text file and distinguishes between URLs and non-URLs.
In this exercise, you will learn how to:
- handle Result types with match for basic error handling
- decide when to use the .unwrap() method
- propagate an error with the ? operator
- return the Option type
- do some elementary file processing (opening, reading to a buffer, counting, reading line by line)
- navigate the Rust stdlib documentation
- add external dependencies to your project
Task
Find the exercise template here: ../../exercise-templates/urls-match-result
Find the solution to the exercise here: ../../exercise-solutions/urls-match-result
You can run the solutions with the following command: cargo run --example step_x, where x is the number of the step.
- Fix the runtime error in the template code by correcting the file path. Then handle the Result type that is returned from std::fs::read_to_string() with a match block, instead of using .unwrap().
- Take the code from Step 1 and, instead of using a match, propagate the error with ? out of fn main(). Note that your main function will now need to return something when it reaches the end.
- Take the code from Step 2 and split the String into lines using the lines() method. Use this to count how many lines there are.
- Change the code from Step 3 to filter out empty lines using is_empty and print the non-empty ones.
- Take your code from Step 4 and write a function like fn parse_url(input: &str) -> Option<url::Url> which checks whether the given input: &str is a URL or not. The function should return Some(url), where url is of type Url from the url crate. Use this function to convert each line and use the returned value to print either Is a URL: <url> or Not a URL.
The url crate has already been added as a dependency, so you can just use url::Url::parse.
Knowledge
Option and Result
Both Option and Result are similar in a way. Both have two variants, and depending on which variant was received, the program may continue in a different way.
The Option type can have the variant Some(T) or None. T is a type parameter that means "some type goes here, we'll decide which one later". The Option type is used when you have to handle optional values. For example, if you want to be able to leave a field of a struct empty, you use the Option type for that field. If the field has a value, it is Some(<value>); if it is empty, it is None.
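As a small illustration (the Person struct here is hypothetical, not part of the exercise):
struct Person {
    nickname: Option<String>,
}

fn main() {
    let person = Person { nickname: None };
    // match forces us to handle both the Some and the None case
    match person.nickname {
        Some(name) => println!("Nickname: {}", name),
        None => println!("No nickname set"),
    }
}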
The variants of the Result type are Ok(t) and Err(e). It is used to handle errors. If an operation was successful, Ok(t) is returned. In Ok(t), t can be the empty tuple or some other value. In Err(e), e contains an error message that can usually be printed with println!("Err: {:?}", e);.
Both types can be used with the match keyword. The received value is matched against patterns; each pattern leads to the execution of a different expression, depending on which arm matches first.
How to use match
match is a way of controlling flow based on pattern matching. A pattern on the left results in the expression on the right.
let value = true;
match value {
    true => println!("This is true!"),
    false => println!("This is false!"),
}
Unlike with if/else, every case has to be handled explicitly, although you can use a catch-all pattern to cover 'everything else':
let value = 50_u32;
match value {
    1 => println!("This is one."),
    50 => println!("This is fifty!"),
    _ => println!("This is any other number from 0 to 4,294,967,295."),
}
There are different ways to use match. The return value of the expression can be bound to a variable:
enum Season {
    Spring,
    Summer,
    Fall,
    Winter,
}

fn which_season_is_now(season: Season) -> String {
    let return_value = match season {
        Season::Spring => String::from("It's spring!"),
        Season::Summer => String::from("It's summer!"),
        Season::Fall => String::from("It's Fall!"),
        Season::Winter => String::from("Brrr. It's Winter."),
    };
    return_value
}
In the case of a Result<T, E>, match statements can be used to get to the inner value:
use std::fs::File;
fn main() {
let file_result = File::open("hello.txt");
let _file_itself = match file_result {
Ok(file) => file,
Err(error) => panic!("Error opening the file: {:?}", error),
};
}
All arms of the match tree have to either result in the same type, or they have to diverge (that is, panic the program or return early from the function)!
Template
Start your VSCode in the proper root folder to have Rust-Analyzer working properly:
../../exercise-templates/urls-match-result/
The template builds, but has a runtime error, as the location of the file is wrong. This is intentional.
Your code will use the example data found in
../../exercise-templates/urls-match-result/src/data
Step-by-Step Solution
Step 1: Handle the Result instead of unwrapping it
std::fs::read_to_string returns a Result<T, E> kind of type; a quick way to get to the inner type T is to use the .unwrap() method on the Result<T, E>. The cost is that the program panics if the Error variant occurs, and the error cannot be propagated. It should only be used when the error does not need to be propagated and would result in a panic anyway. It's often used as a quick fix before implementing proper error handling.
✅ Check the documentation for the exact type std::fs::read_to_string returns.
✅ Handle the Result using match to get to the inner type. Link the two possible patterns, Ok(some_string) and Err(e), to an appropriate code block, for example: println!("File opened and read") and println!("Problem opening the file: {:?}", e).
✅ Fix the path of the file so that the program no longer prints an error.
Click to see the code using the fixed path
fn main() {
let read_result = std::fs::read_to_string("src/data/content.txt");
match read_result {
Ok(_str) => println!("File opened and read"),
Err(e) => panic!("Problem opening and reading the file: {:?}", e),
};
}
TIP: IDEs often provide a "quick fix" to roll out all match arms quickly
Step 2: Returning a Result from main
✅ Add Result<(), Error> as the return type of fn main() and Ok(()) as the last line of fn main().
✅ Delete the existing match block and add a ? after the call to std::fs::read_to_string(...).
✅ Print something after the std::fs::read_to_string but before the Ok(()) so you can see that your program did run. Try changing the file path back to the wrong value to see what happens if there is an error.
Click to see the solution
fn main() -> Result<(), std::io::Error> {
let _file_contents = std::fs::read_to_string("src/data/content.txt")?;
println!("File opened and read");
Ok(())
}
Step 3: Count the number of lines
✅ Take a look at the documentation of str::lines. It returns a struct Lines, which is an iterator.
✅ Add a block like for line in my_contents.lines() { }
✅ Declare a mutable integer, initialized to zero. Increment that integer inside the for loop.
✅ Print the number of lines the file contains.
Click to see the solution
fn main() -> Result<(), std::io::Error> {
let file_contents = std::fs::read_to_string("src/data/content.txt")?;
println!("File opened and read");
let mut number = 0;
for _line in file_contents.lines() {
number += 1;
}
println!("{}", number);
Ok(())
}
Step 4: Filter out empty lines
✅ Filter out the empty lines and only print the others. The is_empty method can help you here.
Click to see the solution
fn main() -> Result<(), std::io::Error> {
let file_contents = std::fs::read_to_string("src/data/content.txt")?;
println!("File opened and read");
for line in file_contents.lines() {
if !line.is_empty() {
println!("{}", line)
}
}
Ok(())
}
Step 5: Check if a string is a URL, and return with Option<T>
✅ Write a function that takes (input: &str), parses each line and returns Option<url::Url> (using url::Url). Search the docs for a method for this!
✅ If a line can be parsed successfully, return Some(url), and return None otherwise.
✅ In the main function, use your new function to only print valid URLs.
✅ Test fn parse_url().
Click me
fn parse_url(line: &str) -> Option<url::Url> {
match url::Url::parse(&line) {
Ok(u) => Some(u),
Err(_e) => None,
}
}
fn main() -> Result<(), std::io::Error> {
let file_contents = std::fs::read_to_string("src/data/content.txt")?;
println!("File opened and read");
for line in file_contents.lines() {
match parse_url(line) {
Some(url) => {
println!("Is a URL: {}", url);
}
None => {
println!("Not a URL");
}
}
}
Ok(())
}
#[test]
fn correct_url() {
assert!(parse_url("https://example.com").is_some())
}
#[test]
fn no_url() {
assert!(parse_url("abcdf").is_none())
}
Help
Typing variables
Variables can be typed by using : and a type.
let my_value: String = String::from("test");
SimpleDB Exercise
In this exercise, we will implement a toy protocol parser for a simple protocol for database queries. We call it SimpleDB. The protocol has two commands; one of them can be sent with a payload of additional data. Your parser parses the incoming data strings, makes sure the commands are formatted correctly, and returns errors for the different ways the formatting can go wrong.
After completing this exercise you will be able to
- write a simple Rust library from scratch
- interact with borrowed and owned memory, especially how to take ownership
- handle complex cases using the match and if let syntax
- create a safe protocol parser in Rust manually
Prerequisites
- basic pattern matching with match
- control flow with if/else
- familiarity with Result<T, E> and Option<T>
Tasks
- Create a library project called simple-db.
- Implement appropriate data structures for Command and Error.
- Read the documentation for str, especially strip_prefix() and strip_suffix(). Pay attention to their return types.
- Implement the following function so that it implements the protocol specification to parse the messages. Use the provided tests to help you with the case handling.
pub fn parse(input: &str) -> Result<Command, Error> {
todo!()
}
The Step-by-Step-Solution contains steps 4a-c that explain a possible way to handle the cases in detail.
Optional Tasks:
- Run clippy on your codebase.
- Run rustfmt on your codebase.
If you need it, we have provided solutions for every step for this exercise.
Protocol Specification
The protocol has two commands that are sent as messages in the following form:
- PUBLISH <payload>\n
- RETRIEVE\n
With the additional properties:
- The payload cannot contain newlines.
- A missing newline at the end of the command is an error.
- A newline other than at the end of the command is an error.
- An empty payload is allowed. The command to publish an empty payload is PUBLISH \n (note the trailing space before the newline).
Issues with the format (or other properties) of the messages are handled with the following error codes:
- UnexpectedNewline (a newline not at the end of the line)
- IncompleteMessage (no newline at the end)
- EmptyMessage (empty string instead of a command)
- UnknownCommand (string is not empty, but is neither PUBLISH nor RETRIEVE)
- UnexpectedPayload (message contains a payload when it should not)
- MissingPayload (message is missing a payload)
Testing
Below are the tests your protocol parser needs to pass. You can copy them to the bottom of your lib.rs.
#[cfg(test)]
mod tests {
use super::*;
// Tests placement of \n
#[test]
fn test_missing_nl() {
let line = "RETRIEVE";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::IncompleteMessage);
assert_eq!(result, expected);
}
#[test]
fn test_trailing_data() {
let line = "PUBLISH The message\n is wrong \n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::UnexpectedNewline);
assert_eq!(result, expected);
}
#[test]
fn test_empty_string() {
let line = "";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::IncompleteMessage);
assert_eq!(result, expected);
}
// Tests for empty messages and unknown commands
#[test]
fn test_only_nl() {
let line = "\n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::EmptyMessage);
assert_eq!(result, expected);
}
#[test]
fn test_unknown_command() {
let line = "SERVE \n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::UnknownCommand);
assert_eq!(result, expected);
}
// Tests correct formatting of RETRIEVE command
#[test]
fn test_retrieve_w_whitespace() {
let line = "RETRIEVE \n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::UnexpectedPayload);
assert_eq!(result, expected);
}
#[test]
fn test_retrieve_payload() {
let line = "RETRIEVE this has a payload\n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::UnexpectedPayload);
assert_eq!(result, expected);
}
#[test]
fn test_retrieve() {
let line = "RETRIEVE\n";
let result: Result<Command, Error> = parse(line);
let expected = Ok(Command::Retrieve);
assert_eq!(result, expected);
}
// Tests correct formatting of PUBLISH command
#[test]
fn test_publish() {
let line = "PUBLISH TestMessage\n";
let result: Result<Command, Error> = parse(line);
let expected = Ok(Command::Publish("TestMessage".into()));
assert_eq!(result, expected);
}
#[test]
fn test_empty_publish() {
let line = "PUBLISH \n";
let result: Result<Command, Error> = parse(line);
let expected = Ok(Command::Publish("".into()));
assert_eq!(result, expected);
}
#[test]
fn test_missing_payload() {
let line = "PUBLISH\n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::MissingPayload);
assert_eq!(result, expected);
}
}
Knowledge
This section explains concepts necessary to solve the simpleDB exercise.
We also recommend using the official Rust documentation to figure out unfamiliar concepts. If you ever feel completely stuck, or if you haven't understood something specific, please hail the trainers quickly.
Derives
#[derive(PartialEq, Eq)]
This enables comparison between two instances of the type, by comparing every field/variant. It is needed by the assert_eq! macro, which relies on equality being defined. Eq, for total equality, isn't strictly necessary for this example, but it is good practice to derive it if it applies.
#[derive(Debug)]
This enables the automatic generation of a debug formatting function for the type. The assert_eq! macro requires this for testing.
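As a minimal sketch of why both derives matter (mirroring the Command enum you will write for this exercise):
#[derive(Debug, PartialEq, Eq)]
enum Command {
    Retrieve,
}

fn main() {
    // PartialEq/Eq provide ==; Debug lets assert_eq! print both values if the comparison fails
    assert_eq!(Command::Retrieve, Command::Retrieve);
}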
Control flow and pattern matching, returning values
This exercise involves handling a number of cases. You are already familiar with if/else and a basic form of match. Here, we'll introduce you to if let and let else:
if let Some(message) = message.strip_prefix("PREFIX:") {
    // Executes if the above pattern is a match.
}
// The stripped `message` is NOT available here.

let Some(message) = message.strip_prefix("PREFIX:") else {
    // Executes if the above pattern is NOT a match.
    // Must have an early return in this block.
};
// The variable `message` is still available here.
When to use what?
if let is like a pattern-matching match block with only one arm. So, if your match only has one arm of interest, consider an if let or let else instead (depending on whether the pattern match means success, or the pattern match means there's an error).
match can be used to handle more fine-grained and complex pattern matching, especially when there are several, equally ranked possibilities. The match arms may have to include a catch-all _ => arm for every possible case that is not explicitly spelled out. The order of the match arms matters: the catch-all branch needs to be last, otherwise it catches everything.
Returning Values from branches and match arms
All match arms need to produce a value of the same type, or they must diverge (for example with a return statement).
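For example (a hypothetical helper, not taken from the exercise):
fn describe(n: i32) -> String {
    match n {
        0 => String::from("zero"),
        // this arm diverges: it returns from the whole function early
        n if n < 0 => return String::from("negative"),
        _ => String::from("positive"),
    }
}

fn main() {
    println!("{}", describe(-3));
}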
Step-by-Step Solution
Step 1: Creating a library project with cargo
Create a new Cargo project, check the build and the test setup:
Solution
cargo new --lib simple-db
cd simple-db
cargo build
cargo test
Step 2: Define appropriate data structures
Define two enums, one called Command and one called Error. Command has two variants for the two possible commands: Publish carries data (the message), Retrieve does not. Error is just a list of error kinds. Use #[derive(Eq, PartialEq, Debug)] for both enums.
Solution
#[derive(Eq, PartialEq, Debug)]
pub enum Command {
Publish(String),
Retrieve,
}
#[derive(Eq, PartialEq, Debug)]
pub enum Error {
UnexpectedNewline,
IncompleteMessage,
EmptyMessage,
UnknownCommand,
UnexpectedPayload,
MissingPayload,
}
// Tests go here!
Step 3: Read the documentation for str, especially strip_prefix() and strip_suffix()
tl;dr:
- message.strip_prefix("FOO ") returns Some(remainder) if the string slice message starts with "FOO ", otherwise you get None.
- message.strip_suffix('\n') returns Some(remainder) if the string slice message ends with '\n', otherwise you get None.
Note that both functions will take either a string slice, or a character, or will actually even take a function that returns a boolean to tell you whether a character matches or not (we won't use that though).
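For instance (values chosen purely for illustration):
fn main() {
    // strip the trailing newline first, then look for the command prefix
    let message = "PUBLISH hello\n";
    assert_eq!(message.strip_suffix('\n'), Some("PUBLISH hello"));
    assert_eq!("PUBLISH hello".strip_prefix("PUBLISH "), Some("hello"));
    // if the prefix/suffix is not there, you get None
    assert_eq!("RETRIEVE".strip_prefix("PUBLISH "), None);
}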
The proposed logic
- Check if the string ends with the char '\n' - if so, keep the rest of it; otherwise return an error.
- Check if the remainder still contains a '\n' - if so, return an error.
- Check if the remainder is empty - if so, return an error.
- Check if the remainder begins with "PUBLISH " - if so, return Ok(Command::Publish(...)) with the payload upconverted to a String.
- Check if the remainder is "PUBLISH" - if so, return an error because the mandatory payload is missing.
- Check if the remainder begins with "RETRIEVE " - if so, return an error because that command should not have anything after it.
- Check if the remainder is "RETRIEVE" - if so, return Ok(Command::Retrieve).
- Otherwise, return an unknown command error.
Step 4: Implement fn parse()
Step 4a: Sorting out wrongly placed and absent newlines
A missing newline, a wrongly placed newline, and more than one \n are errors that occur independently of other errors, so it makes sense to handle these cases first. Check that the string has a newline at the end with strip_suffix. If not, that's an Error::IncompleteMessage. We can assume the pattern will match (that strip_suffix will return Some(...), our so-called sunny day scenario), so a let-else makes the most sense here - although a match will also work.
Now look for newlines within the remainder using the contains() method; if you find any, that's an error.
Tip: Introduce a generic variant Command::Command that temporarily stands for a valid command.
Solution
pub fn parse(input: &str) -> Result<Command, Error> {
    let Some(message) = input.strip_suffix('\n') else {
        return Err(Error::IncompleteMessage);
    };
    if message.contains('\n') {
        return Err(Error::UnexpectedNewline);
    }
    Ok(Command::Command)
}
Step 4b: Looking for "RETRIEVE"
In 4a, we produce an Ok(Command::Command) if the newlines all check out. Now we want to look for a RETRIEVE command.
If the string is empty, that's an error. If the string is exactly "RETRIEVE", that's our command. Otherwise, if the string starts with "RETRIEVE ", that's an UnexpectedPayload error.
Solution
let Some(message) = input.strip_suffix('\n') else {
return Err(Error::IncompleteMessage);
};
if message.contains('\n') {
return Err(Error::UnexpectedNewline);
}
if message == "RETRIEVE" {
Ok(Command::Retrieve)
} else if let Some(_payload) = message.strip_prefix("RETRIEVE ") {
Err(Error::UnexpectedPayload)
} else if message == "" {
Err(Error::EmptyMessage)
} else {
Err(Error::UnknownCommand)
}
Step 4c: Looking for "PUBLISH"
Now we want to see if the message starts with "PUBLISH ", and if so, return a Command::Publish containing the payload, converted to a heap-allocated String so that ownership is passed back to the caller. If not, and the message is equal to "PUBLISH", then that's a MissingPayload error.
Solution
let Some(message) = input.strip_suffix('\n') else {
return Err(Error::IncompleteMessage);
};
if message.contains('\n') {
return Err(Error::UnexpectedNewline);
}
if let Some(payload) = message.strip_prefix("PUBLISH ") {
Ok(Command::Publish(String::from(payload)))
} else if message == "PUBLISH" {
Err(Error::MissingPayload)
} else if message == "RETRIEVE" {
Ok(Command::Retrieve)
} else if let Some(_payload) = message.strip_prefix("RETRIEVE ") {
Err(Error::UnexpectedPayload)
} else if message == "" {
Err(Error::EmptyMessage)
} else {
Err(Error::UnknownCommand)
}
Full source code
If all else fails, feel free to copy this solution to play around with it.
Solution
#[derive(Eq, PartialEq, Debug)]
pub enum Command {
    Publish(String),
    Retrieve,
}

#[derive(Eq, PartialEq, Debug)]
pub enum Error {
    UnexpectedNewline,
    IncompleteMessage,
    EmptyMessage,
    UnknownCommand,
    UnexpectedPayload,
    MissingPayload,
}

pub fn parse(input: &str) -> Result<Command, Error> {
    let Some(message) = input.strip_suffix('\n') else {
        return Err(Error::IncompleteMessage);
    };
    if message.contains('\n') {
        return Err(Error::UnexpectedNewline);
    }
    if let Some(payload) = message.strip_prefix("PUBLISH ") {
        Ok(Command::Publish(String::from(payload)))
    } else if message == "PUBLISH" {
        Err(Error::MissingPayload)
    } else if message == "RETRIEVE" {
        Ok(Command::Retrieve)
    } else if let Some(_payload) = message.strip_prefix("RETRIEVE ") {
        Err(Error::UnexpectedPayload)
    } else if message == "" {
        Err(Error::EmptyMessage)
    } else {
        Err(Error::UnknownCommand)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Tests placement of \n
    #[test]
    fn test_missing_nl() {
        let line = "RETRIEVE";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::IncompleteMessage);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_trailing_data() {
        let line = "PUBLISH The message\n is wrong \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnexpectedNewline);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_empty_string() {
        let line = "";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::IncompleteMessage);
        assert_eq!(result, expected);
    }

    // Tests for empty messages and unknown commands
    #[test]
    fn test_only_nl() {
        let line = "\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::EmptyMessage);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_unknown_command() {
        let line = "SERVE \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnknownCommand);
        assert_eq!(result, expected);
    }

    // Tests correct formatting of RETRIEVE command
    #[test]
    fn test_retrieve_w_whitespace() {
        let line = "RETRIEVE \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnexpectedPayload);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_retrieve_payload() {
        let line = "RETRIEVE this has a payload\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnexpectedPayload);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_retrieve() {
        let line = "RETRIEVE\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Retrieve);
        assert_eq!(result, expected);
    }

    // Tests correct formatting of PUBLISH command
    #[test]
    fn test_publish() {
        let line = "PUBLISH TestMessage\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Publish("TestMessage".into()));
        assert_eq!(result, expected);
    }

    #[test]
    fn test_empty_publish() {
        let line = "PUBLISH \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Publish("".into()));
        assert_eq!(result, expected);
    }

    #[test]
    fn test_missing_payload() {
        let line = "PUBLISH\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::MissingPayload);
        assert_eq!(result, expected);
    }
}
Green and Yellow Game
In this assignment we will implement the game "Green and Yellow". It's like Wordle, but with numerical digits instead of letters. But for legal reasons it's also entirely unlike Wordle, and entirely unlike the 1970s board game "Mastermind".
After completing this exercise you will be able to
- Work with Rust slices and vectors
- Accept input from stdin
- Iterate through arrays and slices
- Generate random numbers
Prerequisites
To complete this exercise you need to have:
- basic Rust programming skills
- the Rust Syntax Cheat Sheet
Task
- Create a new binary crate called green-yellow
- Copy all the test cases into your main.rs
- Define a function fn calc_green_and_yellow(guess: &[u8; 4], secret: &[u8; 4]) -> String that implements the following rules:
  - Return a string containing four Unicode characters
  - For every item in guess, if guess[i] == secret[i], then position i in the output String should be a green block (🟩)
  - Then, for every item in guess, if guess[i] is in secret somewhere, and hasn't already been matched, then position i in the output String should be a yellow block (🟨)
  - If any of the guesses do not appear in the secret, then that position in the output String should be a grey block (⬜)
- Ensure all the test cases pass!
- Write a main function that implements the following:
- Generate 4 random digits - our 'secret'
- Go into a loop
- Read a string from Standard In and trim the whitespace off it
- Parse that string into a guess, containing four digits (give an error if the user makes a mistake)
- Run the calculation routine above and print the coloured blocks
- Exit if all the blocks are green
- Play the game
If you need it, we have provided a complete solution for this exercise.
Your test cases are:
#[test]
fn all_wrong() {
    assert_eq!(&calc_green_and_yellow(&[5, 6, 7, 8], &[1, 2, 3, 4]), "⬜⬜⬜⬜");
}

#[test]
fn all_green() {
    assert_eq!(&calc_green_and_yellow(&[1, 2, 3, 4], &[1, 2, 3, 4]), "🟩🟩🟩🟩");
}

#[test]
fn one_wrong() {
    assert_eq!(&calc_green_and_yellow(&[1, 2, 3, 5], &[1, 2, 3, 4]), "🟩🟩🟩⬜");
}

#[test]
fn all_yellow() {
    assert_eq!(&calc_green_and_yellow(&[4, 3, 2, 1], &[1, 2, 3, 4]), "🟨🟨🟨🟨");
}

#[test]
fn one_wrong_but_duplicate() {
    assert_eq!(&calc_green_and_yellow(&[1, 2, 3, 1], &[1, 2, 3, 4]), "🟩🟩🟩⬜");
}

#[test]
fn one_right_others_duplicate() {
    assert_eq!(&calc_green_and_yellow(&[1, 1, 1, 1], &[1, 2, 3, 4]), "🟩⬜⬜⬜");
}

#[test]
fn two_right_two_swapped() {
    assert_eq!(&calc_green_and_yellow(&[1, 2, 2, 2], &[2, 2, 2, 1]), "🟨🟩🟩🟨");
}

#[test]
fn two_wrong_two_swapped() {
    assert_eq!(&calc_green_and_yellow(&[1, 3, 3, 2], &[2, 2, 2, 1]), "🟨⬜⬜🟨");
}

#[test]
fn a_bit_of_everything() {
    assert_eq!(&calc_green_and_yellow(&[1, 9, 4, 3], &[1, 2, 3, 4]), "🟩⬜🟨🟨");
}

#[test]
fn two_in_guess_one_in_secret() {
    assert_eq!(&calc_green_and_yellow(&[1, 2, 3, 3], &[3, 9, 9, 9]), "⬜⬜🟨⬜");
}

#[test]
fn two_in_secret_one_in_guess() {
    assert_eq!(&calc_green_and_yellow(&[1, 2, 3, 4], &[3, 3, 9, 9]), "⬜⬜🟨⬜");
}
Knowledge
Generating Random Numbers
There are no random number generators in the standard library - you have to use the rand crate.
You will need to change Cargo.toml to depend on the rand crate - we suggest version 0.8.
You need a random number generator (call rand::thread_rng()), and using that you can generate a number in a given range with gen_range. See https://docs.rs/rand for more details.
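A minimal sketch of what that looks like with rand 0.8 (assuming rand = "0.8" is already in your Cargo.toml):
use rand::Rng;

fn main() {
    let mut rng = rand::thread_rng();
    // generate a digit from 1 to 9, inclusive
    let digit: u8 = rng.gen_range(1..=9);
    println!("{}", digit);
}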
Reading from the Console
You need to grab a standard input handle with std::io::stdin(). This handle implements the std::io::Read trait; you can call its read_line(&mut some_string) method to read a line of text into your some_string: String variable.
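A minimal sketch of reading one line:
fn main() {
    let mut some_string = String::new();
    std::io::stdin().read_line(&mut some_string).unwrap();
    // trim the trailing newline (and any surrounding whitespace)
    let trimmed = some_string.trim();
    println!("You typed: {}", trimmed);
}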
Parsing Strings into Integers
Strings have a parse() method, which returns a Result, because of course the user may not have typed in a proper digit. The parse() function works out what you are trying to create based on context - so if you want a u8, try let x: u8 = my_str.parse().unwrap(). Or you can say let x = my_str.parse::<u8>().unwrap(). Of course, try to do something better than unwrap!
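For example, handling the Result instead of unwrapping it (the input string is just an illustration):
fn main() {
    let my_str = "7";
    match my_str.parse::<u8>() {
        Ok(digit) => println!("Got the digit {}", digit),
        Err(e) => println!("That wasn't a number: {}", e),
    }
}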
Step-by-Step-Solution
We also recommend using the official Rust documentation to figure out unfamiliar concepts. If you ever feel completely stuck, or if you haven't understood something specific, please hail the trainers quickly.
Step 1: New Project
Create a new binary Cargo project, check the build and see if it runs.
Solution
cargo new green-yellow
cd green-yellow
cargo run
Step 2: Generate some squares
Get calc_green_and_yellow to just generate grey blocks. We put them in an array first, as that's easier to index than a string.
Call the function from main() to avoid the warning about it being unused.
Solution
fn calc_green_and_yellow(_guess: &[u8; 4], _secret: &[u8; 4]) -> String {
    let result = ["⬜"; 4];
    result.join("")
}
Step 3: Check for green squares
You need to go through every pair of items in the input arrays and check if they are the same. If so, set the output square to be green.
Solution
fn calc_green_and_yellow(guess: &[u8; 4], secret: &[u8; 4]) -> String {
    let mut result = ["⬜"; 4];
    for i in 0..guess.len() {
        if guess[i] == secret[i] {
            result[i] = "🟩";
        }
    }
    result.join("")
}
Step 4: Check for yellow squares
This gets a little more tricky.
We need to loop through every item in the guess array and compare it to every item in the secret array. But! We must make sure we ignore any values we already 'used up' when we produced the green squares.
Let's do this by keeping a parallel array of flags, and marking off any secret digits that were used in the green-square loop so they are never matched again.
Solution
fn calc_green_and_yellow(guess: &[u8; 4], secret: &[u8; 4]) -> String {
    let mut result = ["⬜"; 4];
    let mut secret_handled = [false; 4];
    for i in 0..guess.len() {
        if guess[i] == secret[i] {
            // that's a match
            result[i] = "🟩";
            // don't match this secret digit again
            secret_handled[i] = true;
        }
    }
    'guess: for g_idx in 0..guess.len() {
        // only process guess digits we haven't already dealt with
        if result[g_idx] == "🟩" {
            continue;
        }
        for s_idx in 0..secret.len() {
            // only process secret digits we haven't already dealt with
            if secret_handled[s_idx] {
                continue;
            }
            if guess[g_idx] == secret[s_idx] {
                // put a yellow block in for this guess
                result[g_idx] = "🟨";
                // never match this secret digit again
                secret_handled[s_idx] = true;
                // stop comparing this guessed digit to any other secret digits
                continue 'guess;
            }
        }
    }
    result.join("")
}
Step 5: Get some random numbers
Add rand = "0.8" to your Cargo.toml, and make a random number generator (RNG) with rand::thread_rng(). You will also have to use rand::Rng; to bring the trait into scope.
(A built-in random number generator has been proposed for the Standard Library, but it is still nightly-only as of October 2024.)
Call your_rng.gen_range() in a loop.
Solution
let mut rng = rand::thread_rng();
let mut secret = [0u8; 4];
for digit in secret.iter_mut() {
    *digit = rng.gen_range(1..=9);
}
Step 6: Make the game loop
We need a loop to handle each guess the user makes.
For each guess we need to read from Standard Input (using std::io::stdin() and its read_line() method).
You will need to trim and then split the input, then parse each piece into a digit.
- If a digit doesn't parse, continue the loop.
- If a digit parses but is out of range, continue the loop.
- If you get the wrong number of digits, continue the loop.
- If the guess matches the secret, then break out of the loop and congratulate the winner.
- Otherwise run the guess through our calculation function and print the squares.
Solution
fn main() {
    let mut rng = rand::thread_rng();
    let stdin = std::io::stdin();
    let mut secret = [0u8; 4];
    for digit in secret.iter_mut() {
        *digit = rng.gen_range(1..=9);
    }
    println!("New game (secret is {:?})!", secret);
    loop {
        let mut line = String::new();
        println!("Enter guess:");
        stdin.read_line(&mut line).unwrap();
        let mut guess = [0u8; 4];
        let mut idx = 0;
        for piece in line.trim().split(' ') {
            let Ok(digit) = piece.parse::<u8>() else {
                println!("{:?} wasn't a number", piece);
                continue;
            };
            if digit < 1 || digit > 9 {
                println!("{} is out of range", digit);
                continue;
            }
            if idx >= guess.len() {
                println!("Too many numbers, I only want {}", guess.len());
                continue;
            }
            guess[idx] = digit;
            idx += 1;
        }
        if idx != guess.len() {
            println!("Wrong number of digits, I want {}", guess.len());
            continue;
        }
        if guess == secret {
            println!("Correct! You win!");
            break;
        }
        println!("{}", calc_green_and_yellow(&guess, &secret));
    }
}
Shapes
In this exercise we're going to define methods for a struct, define and implement a trait, and look into how to make structs and traits generic.
Learning Goals
You will learn how to:
- implement methods for a struct
- when to use Self, self, &self and &mut self in methods
- define a trait with required methods
- make a type generic over T
- how to constrain T
Tasks
Part 1: Defining Methods for Types
You can find a complete solution
- Make a new library project called shapes.
- Make two structs, Circle with field radius and Square with field side, to use as types. Decide on appropriate types for radius and side.
- Make an impl block and implement the following methods for each type. Consider when to use self, &self, &mut self and Self.
  - fn new(...) -> ... - creates an instance of the shape with a certain size (radius or side length).
  - fn area(...) -> ... - calculates the area of the shape.
  - fn scale(...) - changes the size of an instance of the shape.
  - fn destroy(...) -> ... - destroys the instance of a shape and returns the value of its field.
Part 2: Defining and Implementing a Trait
You can find a complete solution
- Define a trait HasArea with a mandatory method: fn area(&self) -> f32.
- Implement HasArea for Square and Circle. You can defer to the existing method but may need to cast the return type.
- Abstract over Circle and Square by defining an enum Shape that contains both as variants.
- Implement HasArea for Shape.
Part 3: Making Square generic over T
You can find a complete solution
We want to make Square and Circle generic over T, so we can use other numeric types and not just u32 and f32. A sketch of where this can end up follows after this list.
- Add the generic type parameter <T> to Square. You can temporarily remove enum Shape to make this easier.
- Import the num crate, version 0.4.0, to use the num::Num trait as a bound for the generic type <T>. This ensures that T is a numeric type, and also makes some guarantees about the operations that can be performed on <T>.
- Add a where clause on the methods of Square, as required, e.g.: where T: num::Num
- Depending on the operations performed in that function, you may need to add further trait bounds, such as Copy and std::ops::MulAssign. You can add them to the where clause with a + sign, like T: num::Num + Copy.
- Add the generic type parameter <T> to Circle and then appropriate where clauses.
- Re-introduce Shape, but with the generic type parameter <T>, and then add appropriate where clauses.
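A minimal sketch of the generic Square (assuming the num crate is a dependency; names and exact bounds may differ from the provided solution):
pub struct Square<T> {
    side: T,
}

impl<T> Square<T>
where
    T: num::Num + Copy,
{
    pub fn new(side: T) -> Self {
        Square { side }
    }

    pub fn area(&self) -> T {
        // Copy lets us use `self.side` twice; num::Num guarantees multiplication is available
        self.side * self.side
    }
}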
Help
This section gives partial solutions to look at or refer to.
We also recommend using the official Rust documentation to figure out unfamiliar concepts. If you ever feel completely stuck, or if you haven't understood something specific, please hail the trainers quickly.
Getting Started
Create a new library Cargo project and check that it builds:
$ cargo new --lib shapes
$ cd shapes
$ cargo build
Creating a Type
Each of your shape types (Square, Circle, etc.) will need some fields (or properties) to identify its geometry. Use /// to add documentation to each field.
/// Describes a human individual
struct Person {
/// How old this person is
age: u8
}
Functions that take arguments: self, &self, &mut self
Does your function need to take ownership of the shape in order to calculate its area? Or is it sufficient to merely take a read-only look at the shape for a short period of time?
You can pass arguments by reference in Rust by making your function take x: &MyShape
, and passing them with &my_shape
.
You can also associate your function with a specific type by placing it inside a block like impl MyShape { ... }
impl Pentagon {
fn area(&self) -> u32 {
// calculate the area of the pentagon here...
}
}
A Shape of many geometries
You can use an enum
to provide a single type that can be any of your supported shapes. If we were working with fruit, we might say:
struct Banana { ... }
struct Apple { ... }
enum Fruit {
Banana(Banana),
Apple(Apple),
}
If you wanted to count the pips in a piece of Fruit, you might just call to the num_pips()
method on the appropriate constituent fruit. This might look like:
impl Fruit {
fn num_pips(&self) -> u8 {
match self {
Fruit::Apple(apple) => apple.num_pips(),
Fruit::Banana(banana) => banana.num_pips(),
}
}
}
I need a π
The f32 type also has its own module in the standard library called std::f32. If you look at the docs, you will find a defined constant for π: std::f32::consts::PI.
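For example (a small sketch; the function name is just for illustration):
fn area_of_circle(radius: f32) -> f32 {
    std::f32::consts::PI * radius * radius
}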
I need a π, of type T
If you want to convert a pi constant to some type T, you need a where bound like:
where T: num::Num + From<f32>
This restricts T to types that can be created from an f32 (or, put another way, types you can convert an f32 into). You can then call let my_pi: T = my_f32_pi.into(); to convert your f32 value into a T value.
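Putting that together, a hedged sketch (assuming the num crate and the From<f32> bound described above):
fn circle_area<T>(radius: T) -> T
where
    T: num::Num + Copy + From<f32>,
{
    // convert the f32 constant into whatever numeric type T is
    let pi: T = std::f32::consts::PI.into();
    pi * radius * radius
}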
Defining a Trait
A trait has a name, and lists function definitions that make guarantees about the name of a method, its arguments and return types.
pub trait Color {
    fn red() -> u8;
}
Adding generic Type parameters
pub struct Square<T> {
    /// The length of one side of the square
    side: T,
}

impl<T> Square<T> {
    // ...
}
Connected Mailbox Exercise
In this exercise, we will take our "SimpleDB" protocol parser and turn it into a network-connected data storage service. When a user sends a "PUBLISH" we will push the data into a queue, and when the user sends a "RETRIEVE" we will pop some data off the queue (if any is available). The user will connect via TCP to port 7878.
After completing this exercise you are able to
- write a Rust binary that uses a Rust library
- combine two Rust packages into a Cargo Workspace
- open a TCP port and perform an action when each user connects
- use I/O traits to read/write from a TCP socket
Prerequisites
- creating and running binary crates with cargo
- using match to pattern-match on an enum, capturing any inner values
- using Rust's Read and Write I/O traits
- familiarity with TCP socket listening and accepting
Tasks
- Create an empty folder called connected-mailbox. Copy in the simple-db project from before and create a new binary crate called tcp-server, and put them both into a Cargo Workspace:
  connected-mailbox
  ├─ Cargo.toml
  ├─ simple-db
  │  ├─ Cargo.toml
  │  └─ ...
  └─ tcp-server
     ├─ Cargo.toml
     └─ ...
- Write a basic TCP Server which can listen for TCP connections on 127.0.0.1:7878. For each incoming connection, read all of the input as a string, and send it back to the client.
- Change the TCP Server to depend upon the simple-db crate, using a relative path.
- Change your TCP Server to use your simple-db crate to parse the input, and provide an appropriate canned response.
- Set up a VecDeque and either push or pop from that queue, depending on the command you have received (a minimal sketch of the queue operations follows below).
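A minimal sketch of the queue operations you will need (independent of any TCP code):
use std::collections::VecDeque;

fn main() {
    let mut queue: VecDeque<String> = VecDeque::new();
    // PUBLISH pushes onto the back of the queue
    queue.push_back(String::from("hello"));
    // RETRIEVE pops from the front; None means the queue is empty
    match queue.pop_front() {
        Some(message) => println!("{}", message),
        None => println!("queue is empty"),
    }
}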
At every step, try out your program using a command-line TCP Client: you can either use nc, or netcat, or our supplied tools/tcp-client program.
Optional Tasks:
- Run cargo clippy on your codebase.
- Run cargo fmt on your codebase.
- Wrap your VecDeque into a struct Application with a method that takes a simple_db::Command and returns an Option<String>. Write some tests for it.
Help
Connecting over TCP/IP
Using nc, netcat or ncat
The nc, netcat, or ncat tools may be available on your macOS or Linux machine. They all work in a similar fashion.
$ echo "PUBLISH 1234" | nc 127.0.0.1 7878
The echo command adds a new-line character automatically. Use echo -n if you don't want it to add a new-line character.
Using our TCP Client
We have written a basic TCP Client which should work on any platform.
$ cd tools/tcp-client
$ cargo run -- "PUBLISH hello"
$ cargo run -- "RETRIEVE"
It automatically adds a newline character on to the end of every message you send. It is hard-coded to connect to a server at 127.0.0.1:7878.
Writing to a stream
If you want to write to an object that implements std::io::Write, you could use writeln!.
Solution
use std::io::prelude::*;
use std::net::TcpStream;

fn handle_client(mut stream: TcpStream) -> Result<(), std::io::Error> {
    let mut buffer = String::new();
    stream.read_to_string(&mut buffer)?;
    println!("Received: {:?}", buffer);
    writeln!(stream, "Thank you for {buffer:?}!")?;
    Ok(())
}
Writing a TCP Server
If you need a working example of a basic TCP Echo server, you can start with our template.
Solution
use std::io::prelude::*;
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

const DEFAULT_TIMEOUT: Option<Duration> = Some(Duration::from_millis(1000));

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:7878")?;
    // accept connections and process them one at a time
    for stream in listener.incoming() {
        match stream {
            Ok(stream) => {
                println!("Got client {:?}", stream.peer_addr());
                if let Err(e) = handle_client(stream) {
                    println!("Error handling client: {:?}", e);
                }
            }
            Err(e) => {
                println!("Error connecting: {:?}", e);
            }
        }
    }
    Ok(())
}

/// Process a single connection from a single client.
///
/// Drops the stream when it has finished.
fn handle_client(mut stream: TcpStream) -> Result<(), std::io::Error> {
    stream.set_read_timeout(DEFAULT_TIMEOUT)?;
    stream.set_write_timeout(DEFAULT_TIMEOUT)?;
    let mut buffer = String::new();
    stream.read_to_string(&mut buffer)?;
    println!("Received: {:?}", buffer);
    writeln!(stream, "Thank you for {buffer:?}!")?;
    Ok(())
}
Making a Workspace
Solution
A workspace file looks like:
[workspace]
resolver = "2"
members = ["simple-db", "tcp-server"]
Each member is a folder containing a Cargo package (i.e. that contains a Cargo.toml
file).
Handling Errors
Solution
In a binary program anyhow
is a good way to handle top-level errors.
use std::io::Read;

fn handle_client(stream: &mut std::net::TcpStream) -> Result<(), anyhow::Error> {
    let mut buffer = String::new();
    // This returns a `Result<(), std::io::Error>`, and the `std::io::Error` will auto-convert into an `anyhow::Error`.
    stream.read_to_string(&mut buffer)?;
    // ... etc
    Ok(())
}
You could also write an enum Error
which has a variant for std::io::Error
and a variant for simple_db::Error
, and suitable impl From<...> for Error
blocks.
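A minimal sketch of what such an error type could look like, assuming it lives in your tcp-server crate next to the simple-db dependency (the names are illustrative):

```rust
// Illustrative error type for the server binary. `simple_db::Error`
// stands in for whatever error type your parser crate exports.
#[derive(Debug)]
enum ServerError {
    Io(std::io::Error),
    Parse(simple_db::Error),
}

// With these impls in place, the `?` operator converts both error
// types into `ServerError` for you.
impl From<std::io::Error> for ServerError {
    fn from(e: std::io::Error) -> Self {
        ServerError::Io(e)
    }
}

impl From<simple_db::Error> for ServerError {
    fn from(e: simple_db::Error) -> Self {
        ServerError::Parse(e)
    }
}
```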
When handling a client, you could .unwrap()
the result of the function which handles the client, but do you want to quit the server if a client sends a malformed message? Perhaps you should catch the result with a match
, and print an error to the console before moving on to the next client.
Solution
If you need it, we have provided a complete solution for this exercise.
Multi-Threaded Mailbox Exercise
In this exercise, we will take our "Connected Mailbox" and make it multi-threaded. A new thread should be spawned for every incoming connection, and that thread should take ownership of the TcpStream
and drive it to completion.
After completing this exercise you are able to
-
spawn threads
-
convert a non-thread-safe type into a thread-safe type
-
lock a Mutex to access the data within
Prerequisites
- A completed "Connected Mailbox" solution
Tasks
-
Use the
std::thread::spawn
API to start a new thread when your main loop produces a new connection to a client. The handle_client
function should be executed within that spawned thread. Note how Rust doesn't let you pass&mut VecDeque<String>
into the spawned thread, both because you have multiple&mut
references (not allowed) and because the thread might live longer than theVecDeque
(which only lives whilst themain()
function is running, andmain()
might quit at any time with an early return or a break out of the connection loop). -
Convert the
VecDeque
into an Arc<Mutex<VecDeque>>
(usestd::sync::Mutex
). Change thehandle_client
function to take a&Mutex<VecDeque>
. Clone the Arc handle with.clone()
andmove
that cloned handle into the new thread. Change thehandle_client
function to calllet mut queue = your_mutex.lock().unwrap();
whenever you want to access the queue inside the Mutex. -
Convert the
Arc<Mutex<VecDeque>>
into aMutex<VecDeque>
and introduce scoped threads withstd::thread::scope
. TheMutex<VecDeque>
should be created outside of the scope (ensure it lives longer than any of the scoped threads), but the connection loop should be inside the scope. Changestd::thread::spawn
to bes.spawn
, wheres
is the name of the argument to the scope closure.
At every step (noting that Step 1 won't actually work...), try out your program using a command-line TCP Client: you can either use nc
, or netcat
, or our supplied tools/tcp-client
program.
Optional Tasks:
- Run
cargo clippy
on your codebase. - Run
cargo fmt
on your codebase.
Help
Making an Arc, containing a Mutex, containing a VecDeque
You can just nest the calls to SomeType::new()
...
Solution
use std::collections::VecDeque; use std::sync::{Arc, Mutex}; fn main() { // This type annotation isn't required if you actually push something into the queue... let queue_handle: Arc<Mutex<VecDeque<String>>> = Arc::new(Mutex::new(VecDeque::new())); }
Spawning Threads
The std::thread::spawn
function takes a closure. Rust will automatically try and borrow any local variables that the closure refers to but that were declared outside the closure. You can put move
in front of the closure bars (e.g. move ||
) to make Rust try and take ownership of variables instead of borrowing them.
You will want to clone the Arc
and move the clone into the thread.
Solution
use std::collections::VecDeque; use std::sync::{Arc, Mutex}; fn main() { let queue_handle = Arc::new(Mutex::new(VecDeque::new())); for _ in 0..10 { // Clone the handle and move it into a new thread let thread_queue_handle = queue_handle.clone(); std::thread::spawn(move || { handle_client(&thread_queue_handle); }); // This is the same, but fancier. It stops you passing the wrong Arc handle // into the thread. std::thread::spawn({ // this is a block expression // This is declared inside the block, so it shadows the one from the // outer scope. let queue_handle = queue_handle.clone(); // this is the closure produced by the block expression move || { handle_client(&queue_handle); } }); } // This doesn't need to know it's in an Arc, just that it's in a Mutex. fn handle_client(locked_queue: &Mutex<VecDeque<String>>) { todo!(); } }
Locking a Mutex
A value of type Mutex<T>
has a lock()
method, but this method can fail if the Mutex has been poisoned (i.e. a thread panicked whilst holding the lock). We generally don't worry about handling the poisoned case (because one of your threads has already panicked, so the program is in a fairly bad state already), so we just use unwrap()
to make this thread panic as well.
Solution
use std::collections::VecDeque; use std::sync::{Arc, Mutex}; fn main() { let queue_handle = Arc::new(Mutex::new(VecDeque::new())); let mut inner_q = queue_handle.lock().unwrap(); inner_q.push_back("Hello".to_string()); println!("{:?}", inner_q.pop_front()); println!("{:?}", inner_q.pop_front()); }
Creating a thread scope
Recall that the purpose of a thread scope is to satisfy the compiler that it is safe for a thread to borrow an item that is on the current function's stack. It does this by ensuring that all threads created within the scope terminate before the thread scope ends (after which, the remainder of the function is executed, including perhaps destruction or transfer of the variables that were borrowed).
Use std::thread::scope
to create a scope, and pass it a closure containing the
bulk of your main function. Any variables you want to borrow should be created
before the thread scope is created, but you should wait for incoming connections
inside the thread scope (think about what happens to any spawned threads that
are still executing at the point you try and leave the thread scope).
Solution
use std::collections::VecDeque; use std::sync::Mutex; fn main() { let locked_queue = Mutex::new(VecDeque::new()); std::thread::scope(|s| { for i in 0..10 { let locked_queue = &locked_queue; s.spawn(move || { let mut inner_q = locked_queue.lock().unwrap(); inner_q.push_back(i.to_string()); println!("Pop {:?}", inner_q.pop_front()); }); } }); }
Solution
If you need it, we have provided a complete solution for this exercise.
Self-Check Project
This exercise is intended for you to check your Rust knowledge. It is based on our other exercises, so you can follow those one by one instead of attempting to do everything in one go if you prefer.
In this exercise you will create a small in-memory message queue that is accessible over a TCP connection and uses a plain-text format for its protocol. The protocol has two commands: one to put a message into the back of the queue and one to read a message from the front of the queue. When a user sends a "PUBLISH" you will push the data into the queue, and when the user sends a "RETRIEVE" you will pop some data off the queue (if any is available). The user will connect via TCP to port 7878. You should handle multiple clients adding or removing messages from the queue at the same time.
Goals
After completing this exercise you will have demonstrated that you can:
-
write a Rust binary that uses a Rust library
-
combine two Rust packages into a Cargo workspace
-
open a TCP port and perform an action when each user connects
-
use I/O traits to read/write from a TCP socket
-
create a safe protocol parser in Rust manually
-
interact with borrowed and owned memory, especially how to take ownership
-
handle complex cases using the
match
andif let
syntax -
handle errors using
Result
and custom error types -
spawn threads
-
convert a non-thread-safe type into a thread-safe type
-
lock a Mutex to access the data within
Tasks
-
Create a Cargo workspace for your project.
-
Create a binary package inside your workspace for your TCP server
-
Implement a simple TCP Server that listens for connections on
127.0.0.1:7878
. For each incoming connection, read all of the input as a string, and send it back to the client. Disconnect the client if they send input that is not valid UTF-8. -
Create a package with a library crate inside your workspace for the message protocol parser. Make your TCP server depend on that library using a relative path.
-
Inside your library implement the following function so that it implements the protocol specifications to parse the messages. Use the provided tests to help you with the case handling.
pub fn parse(input: &str) -> Result<Command, Error> { todo!() }
-
Change your TCP Server to use your parser crate to parse the input, and provide an appropriate canned response.
-
Set up a
VecDeque
queue and either push or pop from that queue, depending on the command you have received. -
Add support for multiple simultaneous client connections using threads. Make sure all clients read and write to the same shared queue.
Optional Tasks
- Allow each connection to read input line by line as a sequence of commands and execute them in the same order as they come in. This way you should be able to open several connections in separate terminals and type commands into them one by one (see the sketch after this list).
- Handle slow clients by disconnecting them if the input isn't received within some timeout.
- Run
cargo fmt
on your codebase. - Run
cargo clippy
on your codebase.
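For the first optional task, here is a minimal sketch of reading line by line with a BufReader. The handle_command function is a hypothetical placeholder for your own parsing and queue logic:

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::TcpStream;

// Hypothetical placeholder: in your project this would parse the line
// and push to / pop from the shared queue.
fn handle_command(line: &str) -> String {
    format!("You sent {line:?}\n")
}

fn handle_client(stream: TcpStream) -> std::io::Result<()> {
    // Keep a second handle for writing while the reader owns the stream.
    let mut writer = stream.try_clone()?;
    let reader = BufReader::new(stream);
    // `lines()` yields one command at a time; it returns an error on
    // invalid UTF-8, which ends the loop and disconnects the client.
    for line in reader.lines() {
        let line = line?;
        let reply = handle_command(&line);
        writer.write_all(reply.as_bytes())?;
    }
    Ok(())
}
```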
Protocol Specification
The protocol has two commands that are sent as messages in the following form:
-
PUBLISH <payload>\n
-
RETRIEVE\n
With the additional properties:
-
The payload cannot contain newlines.
-
A missing newline at the end of the command is an error.
-
Data after the first newline is an error.
-
Empty payloads are allowed. In this case, the command is
PUBLISH \n
.
Violations against the form of the messages and the properties are handled with the following error codes:
-
UnexpectedNewline
(a newline not at the end of the line) -
IncompleteMessage
(no newline at the end) -
EmptyMessage
(empty string instead of a command) -
UnknownCommand
(string is not empty, but neitherPUBLISH
nor RETRIEVE
) -
UnexpectedPayload
(message contains a payload, when it should not) -
MissingPayload
(message is missing a payload)
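If you want a starting point, one possible shape for the types implied by this specification is sketched below; the variant names match the tests in the next section, but the exact definitions are up to you.

```rust
/// The two commands a client can send.
#[derive(Debug, PartialEq, Eq)]
pub enum Command {
    Publish(String),
    Retrieve,
}

/// Everything that can go wrong while parsing a message.
/// `Debug` and `PartialEq` are needed so the tests can use `assert_eq!`.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    UnexpectedNewline,
    IncompleteMessage,
    EmptyMessage,
    UnknownCommand,
    UnexpectedPayload,
    MissingPayload,
}
```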
Testing
Below are the tests your protocol parser needs to pass. You can copy them to the bottom of your lib.rs
.
#[cfg(test)]
mod tests {
use super::*;
// Tests placement of \n
#[test]
fn test_missing_nl() {
let line = "RETRIEVE";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::IncompleteMessage);
assert_eq!(result, expected);
}
#[test]
fn test_trailing_data() {
let line = "PUBLISH The message\n is wrong \n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::UnexpectedNewline);
assert_eq!(result, expected);
}
#[test]
fn test_empty_string() {
let line = "";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::IncompleteMessage);
assert_eq!(result, expected);
}
// Tests for empty messages and unknown commands
#[test]
fn test_only_nl() {
let line = "\n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::EmptyMessage);
assert_eq!(result, expected);
}
#[test]
fn test_unknown_command() {
let line = "SERVE \n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::UnknownCommand);
assert_eq!(result, expected);
}
// Tests correct formatting of RETRIEVE command
#[test]
fn test_retrieve_w_whitespace() {
let line = "RETRIEVE \n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::UnexpectedPayload);
assert_eq!(result, expected);
}
#[test]
fn test_retrieve_payload() {
let line = "RETRIEVE this has a payload\n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::UnexpectedPayload);
assert_eq!(result, expected);
}
#[test]
fn test_retrieve() {
let line = "RETRIEVE\n";
let result: Result<Command, Error> = parse(line);
let expected = Ok(Command::Retrieve);
assert_eq!(result, expected);
}
// Tests correct formatting of PUBLISH command
#[test]
fn test_publish() {
let line = "PUBLISH TestMessage\n";
let result: Result<Command, Error> = parse(line);
let expected = Ok(Command::Publish("TestMessage".into()));
assert_eq!(result, expected);
}
#[test]
fn test_empty_publish() {
let line = "PUBLISH \n";
let result: Result<Command, Error> = parse(line);
let expected = Ok(Command::Publish("".into()));
assert_eq!(result, expected);
}
#[test]
fn test_missing_payload() {
let line = "PUBLISH\n";
let result: Result<Command, Error> = parse(line);
let expected = Err(Error::MissingPayload);
assert_eq!(result, expected);
}
}
Help
Connecting over TCP/IP
Using nc
, netcat
or ncat
The nc
, netcat
, or ncat
tools may be available on your macOS or Linux machine, or via WSL on Windows. They all work in a similar fashion.
$ echo "PUBLISH 1234" | nc 127.0.0.1 7878
The echo
command adds a new-line character automatically. Use echo -n
if you don't want it to add a new-line character.
Using our TCP Client
We have written a basic TCP Client which should work on any platform.
$ cd tools/tcp-client
$ cargo run -- "PUBLISH hello"
$ cargo run -- "RETRIEVE"
It automatically adds a newline character on to the end of every message you send. It is hard-coded to connect to a server at 127.0.0.1:7878
.
Solution
This exercise is based on three other exercises. Check their solutions below:
nRF52 Preparation
This chapter contains information about the nRF52-based exercises, the required hardware and an installation guide.
Required Hardware
- nRF52840 Development Kit (DK)
- nRF52840 Dongle
- 2 micro-USB cables
- βοΈ make sure you're using micro-USB cables which can transmit data (some are charging-only; these are not suitable for these exercises)
- 2 corresponding available USB ports on your laptop / PC (you can use a USB hub if you don't have enough ports)
In our nRF52-focussed exercises we will use both the nRF52840 Development Kit (DK) and the nRF52840 Dongle. We'll mainly develop programs for the DK and use the Dongle to assist with some exercises.
For the span of these exercises keep the nRF52840 DK connected to your PC using a micro-USB cable. Connect the USB cable to the J2 port on the nRF52840 DK.
Starter code
Project templates and starter code for this workshop can be found in this repo.
Required tools
Please install the required tools before the lesson starts.
nRF52 Code Organization
Workshop Materials
You will need a local copy of the workshop materials. We recommend the Github release as it contains pre-compiled HTML docs and pre-compiled dongle firmware, but you can clone the repo with git
and check out the appropriate tag as well if you prefer.
Ask your trainer which release/tag you should be using.
Github Release
Download the latest release from the rust-exercises Github release area. Unpack the zip file somewhere you can work on the contents.
Git checkout
Clone and change into the rust-exercises git repository:
git clone https://github.com/ferrous-systems/rust-exercises.git
cd rust-exercises
The git repository contains all workshop materials, i.e. code snippets, custom tools and the source for this handbook, but not the pre-compiled dongle firmware.
Firmware
The target firmware for the nRF52 for this exercise lives in ./nrf52-code
:
$ tree -L 2
.
├── boards
│   ├── dk
│   ├── dk-solution
│   ├── dongle
│   └── dongle-fw
├── consts
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
├── hal-app
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
├── loopback-fw
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
├── puzzle-fw
│   ├── Cargo.lock
│   ├── Cargo.toml
│   ├── build.rs
│   └── src
├── radio-app
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
├── usb-app
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
├── usb-app-solutions
│   ├── Cargo.lock
│   ├── Cargo.toml
│   ├── src
│   └── traces
├── usb-lib
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
└── usb-lib-solutions
    ├── get-descriptor-config
    ├── get-device
    └── set-config

27 directories, 17 files
boards/dk
Contains a Board Support Package for the nRF52840 Developer Kit.
boards/dk-solution
Contains a Board Support Package for the nRF52840 Developer Kit, with a solution to the BSP exercise.
boards/dongle
Contains a Board Support Package for the nRF52840 USB Dongle. You won't be using this.
boards/dongle-fw
In the release zip file, this contains pre-compiled firmware for the nRF52 USB Dongle, which you use in the nRF52 Radio Exercise. In the Git repository, it's empty.
consts
Contains constants (e.g. USB Vendor IDs) shared by multiple crates.
hal-app
Contains template and solution binary crates for the nRF BSP exercise.
loopback-fw
Source code for the USB Dongle firmware to implement loopback mode.
puzzle-fw
Source code for the USB Dongle firmware to implement puzzle mode. No, you won't find the solution to the puzzle in this source directory - nice try!
radio-app
Contains template and solution binary crates for the nRF Radio exercise.
usb-app
Contains template binary crates for the nRF USB exercise.
usb-app-solutions
Contains solution binary crates for the nRF USB exercise.
usb-lib
Contains a template library crate for the nRF USB exercise. This library can parse USB descriptor information.
usb-lib-solutions/get-descriptor-config
Contains a solution library crate for the nRF USB exercise.
usb-lib-solutions/get-device
Contains a solution library crate for the nRF USB exercise.
usb-lib-solutions/set-config
Contains a solution library crate for the nRF USB exercise.
nRF52 Hardware
In our nRF52-focussed exercises we will use both the nRF52840 Development Kit (DK) and the nRF52840 Dongle. We'll mainly develop programs for the DK and use the Dongle to assist with some exercises.
nRF52840 Development Kit (DK)
Connect one end of one of the supplied micro USB cable to the USB connector J2 of the board and the other end to your PC.
π¬ These directions assume you are holding the board "horizontally" with components (switches, buttons and pins) facing up. In this position, rotate the board so that its convex-shaped short side faces right. You'll find one USB connector (J2) on the left edge, another USB connector (J3) on the bottom edge, and 4 buttons in the bottom right corner.
The board has several switches to configure its behavior. The out of the box configuration is the one we want. If the above instructions didn't work for you, check the position of the following switches:
- SW6 is set to the DEFAULT position (to the right - nRF = DEFAULT).
- SW7 (protected by Kapton tape) is set to the Def. position (to the right - TRACE = Def.).
- SW8 is set to the ON (to the left) position (Power = ON)
- SW9 is set to the VDD position (center - nRF power source = VDD)
- SW10 (protected by Kapton tape) is set to the OFF position (to the left - VEXT -> nRF = OFF).
Windows
When the nRF52-DK is connected to your PC it shows up as a removable USB Flash Drive (named JLINK) and also as a USB Serial Device (COM port) in the Device Manager under the Ports section.
Linux
When the nRF52-DK is connected to your PC it shows up as a USB device under lsusb
. The device will have a VID of 1366
and a PID of 10xx
or 01xx
, where x
can vary:
$ lsusb
(..)
Bus 001 Device 014: ID 1366:1051 SEGGER 4-Port USB 2.0 Hub
The device will also show up in the /dev
directory as a ttyACM
device:
$ ls /dev/ttyACM*
/dev/ttyACM0
macOS
When the nRF52-DK is connected to your Mac it shows up as a removable USB flash drive (named JLINK) on the Desktop, and also a USB device named "J-Link" when executing ioreg -p IOUSB -b -n "J-Link"
.
$ ioreg -p IOUSB -b -n "J-Link"
(...)
| +-o J-Link@14300000 <class AppleUSBDevice, id 0x10000606a, registered, matched, active, busy 0 $
| {
| (...)
| "idProduct" = 4117
| (...)
| "USB Product Name" = "J-Link"
| (...)
| "USB Vendor Name" = "SEGGER"
| "idVendor" = 4966
| (...)
| "USB Serial Number" = "000683420803"
| (...)
| }
|
The device will also show up in the /dev
directory as tty.usbmodem<USB Serial Number>
:
$ ls /dev/tty.usbmodem*
/dev/tty.usbmodem0006834208031
nRF52840 Dongle
Connect the Dongle to your PC/laptop. Its red LED should start oscillating in intensity.
Windows
The device shows up as a USB Serial Device (COM port) in the Device Manager under the Ports section
Linux
The dongle shows up as a USB device under lsusb
. The device will have a VID of 0x1915
and a PID of 0x521f
-- the 0x
prefix will be omitted in the output of lsusb
:
$ lsusb
(..)
Bus 001 Device 023: ID 1915:521f Nordic Semiconductor ASA 4-Port USB 2.0 Hub
The device will also show up in the /dev
directory as a ttyACM
device:
$ ls /dev/ttyACM*
/dev/ttyACM0
macOS
The device shows up as a usb device when executing ioreg -p IOUSB -b -n "Open DFU Bootloader"
. The device will have a vendor ID ("idVendor"
) of 6421
and a product ID ("idProduct"
) of 21023
:
$ ioreg -p IOUSB -b -n "Open DFU Bootloader"
(...)
| +-o Open DFU Bootloader@14300000 <class AppleUSBDevice, id 0x100005d5b, registered, matched, ac$
| {
| (...)
| "idProduct" = 21023
| (...)
| "USB Product Name" = "Open DFU Bootloader"
| (...)
| "USB Vendor Name" = "Nordic Semiconductor"
| "idVendor" = 6421
| (...)
| "USB Serial Number" = "CA1781C8A1EE"
| (...)
| }
|
The device will also show up in the /dev
directory as tty.usbmodem<USB Serial Number>
:
$ ls /dev/tty.usbmodem*
/dev/tty.usbmodemCA1781C8A1EE1
nRF52 Tools
Follow the entire section for the operating system that you're using, then go to Setup check.
Linux
Install VS Code
Follow the instructions for your distribution on https://code.visualstudio.com/docs/setup/linux.
Install dependencies
Some of our tools depend on pkg-config
and libudev.pc
.
Ensure you have the proper packages installed. On Debian based distributions you can use:
sudo apt-get install libudev-dev libusb-1.0-0-dev
Configure USB Device access for non-root users
Connect the dongle and check its permissions with these commands:
$ lsusb -d 1915:521f
Bus 001 Device 016: ID 1915:521f Nordic Semiconductor ASA USB Billboard
$ # ^ ^^
$ # take note of the bus and device numbers that appear for you when you run the next command
$ ls -l /dev/bus/usb/001/016
crw-rw-r-- 1 root root 189, 15 May 20 12:00 /dev/bus/usb/001/016
The root root
part in crw-rw-r-- 1 root root
indicates the device can only be accessed by the root
user.
To access the USB devices as a non-root user, follow these steps:
-
As root, create
/etc/udev/rules.d/50-ferrous-training.rules
with the following contents:# udev rules to allow access to USB devices as a non-root user # nRF52840 Dongle in bootloader mode ATTRS{idVendor}=="1915", ATTRS{idProduct}=="521f", TAG+="uaccess" # nRF52840 Dongle applications ATTRS{idVendor}=="1209", TAG+="uaccess" # nRF52840 Development Kit ATTRS{idVendor}=="1366", ENV{ID_MM_DEVICE_IGNORE}="1", TAG+="uaccess"
-
Run the following command to put the new udev rules into effect
sudo udevadm control --reload-rules
To check the permissions again, first disconnect and reconnect the dongle. Then run lsusb
.
$ lsusb
Bus 001 Device 017: ID 1915:521f Nordic Semiconductor ASA 4-Port USB 2.0 Hub
$ ls -l /dev/bus/usb/001/017
crw-rw-r--+ 1 root root 189, 16 May 20 12:11 /dev/bus/usb/001/017
The +
part in crw-rw-r--+
indicates the device can be accessed without root
permissions. If you have permission to access the dongle, then the nRF52-DK should also work because both were listed in the udev rules file.
Install base rust tooling
Go to https://rustup.rs and follow the instructions.
Install rust analyzer
Open VS Code, find Rust Analyzer in the marketplace (bottom icon in the left panel), then install it.
Configure Rust Cross compilation support
Run this command in a terminal:
rustup +stable target add thumbv7em-none-eabihf
Install ELF analysis tools
Run these commands in a terminal:
cargo install cargo-binutils
rustup +stable component add llvm-tools
Third-party tools written in Rust
Install the flip-link
, nrfdfu
and cyme
tools from source using the following Cargo commands:
cargo install flip-link
cargo install nrfdfu
cargo install cyme
Install probe-rs
0.24 pre-compiled binaries on Linux with:
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/probe-rs/probe-rs/releases/download/v0.24.0/probe-rs-tools-installer.sh | sh
Windows
Install VS Code
Go to https://code.visualstudio.com and run the installer.
Associate the device with the WinUSB drivers
On Windows you'll need to associate the nRF52840 Development Kit's USB device to the WinUSB driver.
To do that connect the nRF52840 DK to your PC using micro-USB port J2, then download and run the Zadig tool.
In Zadig's graphical user interface,
-
Select the 'List all devices' option from the Options menu at the top.
-
From the device (top) drop down menu select "BULK interface (Interface nnn)"
-
Once that device is selected,
1366 1051
should be displayed in the USB ID field. That's the Vendor ID - Product ID pair. -
Select 'WinUSB' as the target driver (right side)
-
Click "Install Driver". The process may take a few minutes to complete and might not appear to do anything right away. Click it once and wait.
You do not need to do anything for the nRF52840 Dongle device.
Install base rust tooling
Go to https://rustup.rs and follow the instructions.
You will need a C compiler to use Rust on Windows. The rustup installer will suggest you install either Visual Studio, or the Build Tools for Visual Studio - either is fine. When that is installing, be sure to select the optional "Desktop development with C++" part of the C++ build tools package. The installation may take up to 5.7 GB of disk space. Please also be aware of the license conditions attached to these products, especially in an enterprise environment.
Install rust analyzer
Open VS Code, find Rust Analyzer in the marketplace (bottom icon in the left panel), then install it.
If you get a message about git
not being installed, ignore it!
Configure Rust Cross compilation support
Run this command in a terminal:
rustup +stable target add thumbv7em-none-eabihf
Install ELF analysis tools
Run these commands in a terminal:
cargo install cargo-binutils
rustup +stable component add llvm-tools
Third-party tools written in Rust
Install the flip-link
, nrfdfu
and cyme
tools from source using the following Cargo commands:
cargo install flip-link
cargo install nrfdfu
cargo install cyme
Install probe-rs
0.24 pre-compiled binaries on Windows with:
powershell -c "irm https://github.com/probe-rs/probe-rs/releases/download/v0.24.0/probe-rs-tools-installer.ps1 | iex"
macOS
Install VS Code
Go to https://code.visualstudio.com and click on "Download for Mac".
Install base rust tooling
Go to https://rustup.rs and follow the instructions.
Install rust analyzer
Open VS Code, find Rust Analyzer in the marketplace (bottom icon in the left panel), then install it.
Configure Rust Cross compilation support
Run this command in a terminal:
rustup +stable target add thumbv7em-none-eabihf
Install ELF analysis tools
Run these commands in a terminal:
cargo install cargo-binutils
rustup +stable component add llvm-tools
Third-party tools written in Rust
Install the flip-link
, nrfdfu
and cyme
tools from source using the following Cargo commands:
cargo install flip-link
cargo install nrfdfu
cargo install cyme
Install probe-rs
0.24 pre-compiled binaries on macOS with:
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/probe-rs/probe-rs/releases/download/v0.24.0/probe-rs-tools-installer.sh | sh
Setup check
β Let's check that you have installed all the tools listed in the previous section.
$ cargo size --version
cargo-size 0.3.6
β Connect the nRF52840-DK to your computer by plugging the USB cable into the J2 connector on the DK (the USB connector on the short side of the board).
β
Use cyme
to list the USB devices on your computer.
$ cyme
(..)
2 15 ο 0x1366 0x1051 J-Link 001050255503 12.0 Mb/s
(..)
Your nRF52840-DK should appear as "J-Link" with USB Vendor ID (VID) of 0x1366 and a USB Product ID (PID) 0x1051.
π If cyme
doesn't work for any reason, you can use cargo xtask usb-list
, which does the same thing but is much more basic. Run it from the root of the extracted tarball / git checkout:
$ cargo xtask usb-list
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.04s
Running `xtask/target/debug/xtask usb-list`
Bus 002 Device 015: ID 1366:1051 <- J-Link on the nRF52840 Development Kit
(...) random other USB devices will be listed
β
In the terminal run cargo run --bin hello
from the nrf52-code/radio-app
folder, to build and run a simple program on the DK to test the set-up.
β― cargo run --bin hello
Finished `dev` profile [optimized + debuginfo] target(s) in 0.06s
Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/hello --allow-erase-all`
Erasing β [00:00:00] [################################################] 8.00 KiB/8.00 KiB @ 31.22 KiB/s (eta 0s )
Programming β [00:00:00] [################################################] 8.00 KiB/8.00 KiB @ 36.25 KiB/s (eta 0s ) Finished in 0.496s
<lvl> Hello, world!
ββ hello::__cortex_m_rt_main @ src/bin/hello.rs:21
<lvl> `dk::exit()` called; exiting ...
ββ dk::exit @ /home/samuel/src/ferrous/rust-exercises/nrf52-code/boards/dk/src/lib.rs:415
References and Resources
Radio Project
- nRF52840 Product Specification 1.1
- The Embedded Rust Book is a great learning resource, especially the Concurrency chapter.
- If you are looking to write an interrupt handler, look at the
#[interrupt]
attribute. All interrupts implemented by the nrf52840 hal are listed innrf52840-pac/src/lib.rs
. It is also recommended that you work through the USB workshop to learn about RTIC.
USB Project
Tooltips
Besides the ones covered in this workshop, there are many more tools that make embedded development easier. Here, we'd like to introduce you to some of these tools and encourage you to play around with them and adopt them if you find them helpful!
cargo-bloat
cargo-bloat
is a useful tool to analyze the binary size of a program. You can install it through cargo:
$ cargo install cargo-bloat
(..)
Installed package `cargo-bloat v0.10.0` (..)
Let's inspect our radio workshop's hello
program with it:
$ cd nrf52-code/radio-app
$ cargo bloat --bin hello
File .text Size Crate Name
0.7% 13.5% 1.3KiB std <char as core::fmt::Debug>::fmt
0.5% 9.6% 928B hello hello::__cortex_m_rt_main
0.4% 8.4% 804B std core::str::slice_error_fail
0.4% 8.0% 768B std core::fmt::Formatter::pad
0.3% 6.4% 614B std core::fmt::num::<impl core::fmt::Debug for usize>::fmt
(..)
5.1% 100.0% 9.4KiB .text section size, the file size is 184.5KiB
This breaks down the size of the .text
section by function. This breakdown can be used to identify the largest functions in the program; those could then be modified to make them smaller.
Using probe-rs
VS Code plugin
The probe-rs team have produced a VS Code plugin. It uses the probe-rs
library to talk directly to your supported Debug Probe (J-Link, ST-Link, CMSIS-DAP, or whatever) and supports both single-stepping and defmt
logging.
Install the probe-rs.probe-rs-debugger
extension in VS Code, and when you open the nrf52-code/radio-app
folder in VS Code, the .vscode/launch.json
file we supply should give you a Run with probe-rs entry in the Run and Debug panel. Press the green triangle and it will build the code, flash the device, set up defmt and then start the chip running. You can set breakpoints in the usual way (by clicking to the left of your source code to place a red dot).
Using gdb
and probe-rs
The CLI probe-rs
command has an option for opening a GDB server. We have found the command-line version of GDB to be a little buggy though, so the VS Code plugin above is preferred.
$ probe-rs gdb --chip nRF52840_xxAA
# In another window
$ arm-none-eabi-gdb ./target/thumbv7em-none-eabihf/debug/blinky
gdb> target extended-remote :1337
gdb> monitor reset halt
gdb> break main
gdb> continue
Breakpoint 1, blinky::__cortex_m_rt_main_trampoline () at src/bin/blinky.rs:10
Using gdb
and openocd
You can also debug a Rust program using gdb
and openocd
. However, this isn't recommended because it requires significant extra set-up, especially to get the RTT data piped out of a socket and into defmt-print
(this functionality is built into probe-rs).
If you are familiar with OpenOCD and GDB, and want to try this anyway, then do pretty much what you would do with a C program.
The only change is that if you want defmt output, you need these OpenOCD commands to enable RTT:
rtt setup 0x20000000 0x40000 "SEGGER RTT"
rtt start
rtt server start 9090 0
You can then use nc
to connect to localhost:9090
, and pipe the output into defmt-print
:
nc localhost:9090 | defmt-print ./target/thumbv7em-none-eabihf/debug/blinky
Troubleshooting
If you have issues with any of the tools used in this workshop check out the sections in this chapter.
cargo-size
is not working
$ cargo size --bin hello
Failed to execute tool: size
No such file or directory (os error 2)
llvm-tools
is not installed. Install it with rustup component add llvm-tools
βΆ Run button, type annotations and syntax highlighting missing / Rust-Analyzer is not working
If you get no type annotations, no "Run" button and no syntax highlighting this means Rust-Analyzer isn't at work yet.
Try the following:
-
add something to the file you're currently looking at, delete it again and save. This triggers a re-run. (you can also
touch
the file in question) -
check that you have a single folder open in VS code; this is different from a single-folder VS code workspace. First close all the currently open folders then open a single folder using the 'File > Open Folder' menu. The open folder should be the
nrf52-code/radio-app
folder for the Radio workshop, thenrf52-code/hal-app
folder for the HAL workshop, or thenrf52-code/usb-app
folder for the USB workshop.
use the latest version of the Rust-Analyzer plugin. If you get a prompt to update the Rust-Analyzer extension when you start VS Code, accept it. You may also get a prompt about updating the Rust-Analyzer binary; accept that one too. The extension should restart automatically after the update. If it doesn't, then close and re-open VS Code.
-
You may need to wait a little while Rust-Analyzer analyzes all the crates in the dependency graph. Then you may need to modify and save the currently open file to force Rust-Analyzer to analyze it.
cargo build
fails to link
If you have configured Cargo to use sccache then you'll need to disable sccache support. Unset the RUSTC_WRAPPER
variable in your environment before opening VS code. Run cargo clean
from the Cargo workspace you are working from (nrf52-code/radio-app
or nrf52-code/usb-app
). Then open VS code.
If you are on Windows and get linking errors like LNK1201: error writing to program database
, then something in your target folder has become corrupt. A cargo clean
should fix it.
Dongle USB functionality is not working
NOTE: this section only applies to the Beginner workshop
If you don't get any output from cargo xtask serial-term
it could just have been that the first line got lost when re-enumerating the device from bootloader mode to the loopback application.
Run cargo xtask serial-term
in one console window. Leave this window open.
In another window, run these two commands:
$ cargo xtask change-channel 20
requested channel change to channel 20
$ cargo xtask change-channel 20
requested channel change to channel 20
If you get two lines of output in cargo xtask serial-term
like this, you are good to go:
$ cargo xtask serial-term
now listening on channel 20
now listening on channel 20
Return to the "Interference" section.
π cargo xtask serial-term
shows you the log output that the Dongle is sending to your computer via the serial interface (not over the wireless network!). After you've run cargo xtask change-channel
, it tells you that it is now listening for network traffic on channel 20. This is helpful for debugging, but not mission-critical.
If you only get one line of output then your OS may be losing some serial data -- we have seen this behavior on some macOS machines. You will still be able to work through the exercises but will miss log data every now and then. Return to the "Interference" section.
If you don't get any output from cargo xtask serial-term
and/or the cargo xtask change-channel
command fails then the Dongle's USB functionality is not working correctly. In this case you should ask your trainer for a custom firmware which has a fixed radio channel allocated. This means that when you program the Development Kit to send data to the Dongle, you need to ensure they are communicating on the same channel by setting
// make sure to pass the channel number of the loopback-nousb* program you
// received from the trainer
radio.set_channel(Channel::_21);
Note that the loopback-nousb*
programs do not send you any logs via cargo xtask serial-term
for debugging, but you will be able to do the exercises nonetheless. For your debugging convenience, the Dongle will toggle the state of its green LED when it receives a packet. When you're done, return to the "Interference" section.
cargo run
errors
You may get one of these errors:
- "Access denied (insufficient permissions)" (seen on macOS)
- "USB error while taking control over USB device: Resource busy" (seen on Linux)
$ cargo run --bin usb-4
Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/usb-4`
Error: An error specific to a probe type occured: USB error while taking control over USB device: Access denied (insufficient permissions)
Caused by:
USB error while taking control over USB device: Access denied (insufficient permissions)
$ cargo run --bin usb-4
Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/usb-4`
Error: An error specific to a probe type occured: USB error while taking control over USB device: Resource busy
Caused by:
USB error while taking control over USB device: Resource busy
All of them have the same root issue: You have another instance of the cargo run
process running.
It is not possible to have two or more instances of cargo run
running. Terminate the old instance before executing cargo run
. If you are using VS Code click the garbage icon ("Kill Terminal") on the top right corner of the terminal output window (located on the bottom of the screen).
no probe was found
error
You may encounter this error:
Running probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/hello
Error: no probe was found
- It may be caused by the micro-USB cable plugged on the long side of the board, instead of the short side.
- Check that the board is powered on.
- Check that your cable is a data cable and not power-only.
location info is incomplete
error
Problem: Using cargo run --bin hello from within the nrf52-code/radio-app
folder finishes compiling and starts up probe-rs. But then the following error is returned:
Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/hello`
(HOST) WARN (BUG) location info is incomplete; it will be omitted from the output
Error: AP ApAddress { dp: Default, ap: 0 } is not a memory AP
The LED5 next to the FTDI chip on the DK goes off for a split second but no program is flashed.
Solution: Your nRF52840-DK may have been shipped with the MCU in some kind of protected state. Using nrfjprog from the nRF Command Line Tools, you can run nrfjprog --recover, which makes the MCU exit this state; programming with probe-rs then works again.
Untested: using nrf-recover may also work.
nRF52 Radio Workbook
In this workshop you'll get familiar with:
- the structure of embedded Rust programs,
- the existing embedded Rust tooling, and
- embedded application development using a Board Support Package (BSP).
To put these concepts in practice you'll write applications that use the radio functionality of the nRF52840 microcontroller.
You have received two development boards for this workshop. We'll use both in this radio workshop.
The nRF52840 Development Kit
This is the larger development board.
The board has two USB ports, J2 and J3, and an on-board J-Link programmer / debugger -- there are instructions to identify the ports in a previous section. USB port J2 is the J-Link's USB port. USB port J3 is the nRF52840's USB port. Connect the Development Kit to your computer using the J2 port.
The nRF52840 Dongle
This is the smaller development board.
The board has the form factor of a USB stick and can be directly connected to one of the USB ports of your PC / laptop. Do not connect it just yet.
The nRF52840
Both development boards have an nRF52840 microcontroller. Here are some details about it that are relevant to this workshop.
- single core ARM Cortex-M4 processor clocked at 64 MHz
- 1 MB of Flash (at address
0x0000_0000
) - 256 KB of RAM (at address
0x2000_0000
) - IEEE 802.15.4 and BLE (Bluetooth Low Energy) compatible radio
- USB controller (device function)
Parts of an Embedded Program
We will look at the elements that distinguish an embedded Rust program from a desktop program.
β
Open the nrf52-code/radio-app
folder in VS Code.
# or use "File > Open Folder" in VS Code
code nrf52-code/radio-app
β
Then open the nrf52-code/radio-app/src/bin/hello.rs
file.
Attributes
In the file, you will find the following attributes:
#![no_std]
The #![no_std]
language attribute indicates that the program will not make use of the standard library, the std
crate. Instead it will use the core
library, a subset of the standard library that does not depend on an underlying operating system (OS).
#![no_main]
The #![no_main]
language attribute indicates that the program will use a custom entry point instead of the default fn main() { .. }
one.
#[entry]
The #[entry]
macro attribute marks the custom entry point of the program. The entry point must be a divergent function whose return type is the never type !
. The function is not allowed to return; therefore the program is not allowed to terminate. The macro comes from the cortex-m-rt crate and is not part of the Rust language.
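Putting these three attributes together, the overall shape of such a program looks roughly like the sketch below. This is not the contents of hello.rs; it assumes the cortex-m-rt crate used by this project and supplies a trivial panic handler (the real programs get theirs from the panic-probe crate, as discussed later).

```rust
#![no_std]
#![no_main]

use cortex_m_rt::entry;

// The custom entry point: a divergent function that never returns.
#[entry]
fn main() -> ! {
    loop {}
}

// Every `no_std` binary needs exactly one panic handler somewhere in
// its crate graph; this minimal one just spins forever.
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
```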
Building an Embedded Program
The default in a Cargo project is to compile for the host (native compilation). The nrf52-code/radio-app
project has been configured for cross compilation to the ARM Cortex-M4 architecture. This configuration can be seen in the Cargo configuration file (.cargo/config
):
# .cargo/config
[build]
target = "thumbv7em-none-eabihf" # = ARM Cortex-M4
The target thumbv7em-none-eabihf
can be broken down as:
thumbv7em
- we generate instructions for the Armv7E-M architecture running in Thumb-2 mode (actually the only supported mode on this architecture)none
- there is no Operating Systemeabihf
- use the ARM Embedded Application Binary Interface, with Hard Float supportf32
andf64
can be passed to functions in FPU registers (likeS0
), instead of in integer registers (likeR0
)
β
Inside the folder nrf52-code/radio-app
, use the following command to cross compile the program to the ARM Cortex-M4 architecture.
cargo build --bin hello
The output of the compilation process will be an ELF (Executable and Linkable Format) file. The file will be placed in the target/thumbv7em-none-eabihf
directory.
β
Run $ file target/thumbv7em-none-eabihf/debug/hello
and compare if your output is as expected.
Expected output:
$ file target/thumbv7em-none-eabihf/debug/hello
hello: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped
Binary Size
ELF files contain metadata like debug information, so their size on disk is not a good indication of the amount of Flash the program will use once it's loaded into the target device's memory.
To display the amount of Flash the program will occupy on the target device use the cargo-size
tool, which is part of the cargo-binutils
package.
β Use the following command to print the binary's size in system V format.
cargo size --bin hello -- -A
Expected output: The breakdown of the program's static memory usage per linker section.
$ cargo size --bin hello -- -A
Compiling radio v0.0.0 (/Users/jonathan/Documents/rust-exercises/nrf52-code/radio-app)
Finished dev [optimized + debuginfo] target(s) in 0.92s
hello :
section size addr
.vector_table 256 0x0
.text 4992 0x100
.rodata 1108 0x1480
.data 48 0x2003fbc0
.gnu.sgstubs 0 0x1920
.bss 12 0x2003fbf0
.uninit 1024 0x2003fbfc
.defmt 6 0x0
.debug_loc 3822 0x0
.debug_abbrev 3184 0x0
.debug_info 109677 0x0
.debug_aranges 2896 0x0
.debug_ranges 4480 0x0
.debug_str 108868 0x0
.debug_pubnames 40295 0x0
.debug_pubtypes 33582 0x0
.ARM.attributes 56 0x0
.debug_frame 2688 0x0
.debug_line 18098 0x0
.comment 19 0x0
Total 335111
π More details about each linker section:
The first three sections are contiguously located in Flash memory -- on the nRF52840, flash memory spans from address 0x0000_0000
to 0x0010_0000
(i.e. 1 MiB of flash).
- The
.vector_table
section contains the vector table, a data structure required by the Armv7E-M specification - The
.text
section contains the instructions the program will execute - The
.rodata
section contains constants like string literals
Skipping .gnu.sgstubs
(which is empty), the next few sections - .data
, .bss
and .uninit
- are located in RAM. Our RAM spans the address range 0x2000_0000
- 0x2004_0000
(256 KB). These sections contain statically allocated variables (static
variables), which are either initialised with a value kept in flash, with zero, or with nothing at all.
The remaining sections are debug information, which we ignore for now. But your debugger might refer to them when debugging!
Running the Program
Setting the log level
Enter the appropriate command into the terminal you're using. This will set the log level for this session.
MacOS & Linux
export DEFMT_LOG=warn
PowerShell
$Env:DEFMT_LOG = "warn"
Windows Command Prompt
set DEFMT_LOG=warn
Inside VS Code
To get VS Code to pick up the environment variable, you can either:
-
set it as above and then open VS Code from inside the terminal (ensuring it wasn't already open and hence just getting you a new window on the existing process), or
-
add it to your rust-analyzer configuration, by placing this in your
settings.json
file:"rust-analyzer.runnables.extraEnv": { "DEFMT_LOG": "warn" }
This will ensure the variable is set whenever rust-analyzer executes
cargo run
for you.
Running from VS Code
β
Open the nrf52-code/radio-app/src/bin/hello.rs
file, go to the "Run and Debug" button on the left, and then click the "Run" triangle next to Debug Microcontroller.
Note: you will get the "Run" button if the Rust analyzer's workspace is set to the
nrf52-code/radio-app
folder. This will be the case if the current folder in VS code (left side panel) is set tonrf52-code/radio-app
.
Running from the console
If you are not using VS code, you can run the program out of your console. Enter the command cargo run --bin hello
from within the nrf52-code/radio-app
folder. Rust Analyzer's "Run" button is a short-cut for that command.
Expected output
NOTE: Recent versions of the nRF52840-DK have flash read-out protection to stop people dumping the contents of flash on an nRF52 they received pre-programmed, so if you have problems immediately after first plugging your board in, see this page.
If you run into an error along the lines of "Debug power request failed" retry the operation and the error should disappear.
$ cargo run --bin hello
Compiling radio_app v0.0.0 (/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app)
Finished dev [optimized + debuginfo] target(s) in 0.28s
Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/hello`
Erasing sectors β [00:00:00] [######################################################] 8.00 KiB/8.00 KiB @ 26.71 KiB/s (eta 0s )
Programming pages β [00:00:00] [######################################################] 8.00 KiB/8.00 KiB @ 29.70 KiB/s (eta 0s ) Finished in 0.59s
Hello, world!
`dk::exit()` called; exiting ...
What just happened?
cargo run
will compile the application and then invoke the probe-rs
tool with its final argument set to the path of the output ELF file.
The probe-rs
tool will
- flash (load) the program on the microcontroller
- reset the microcontroller to make it execute the new program
- collect logs from the microcontroller and print them to the console
- print a backtrace of the program if the halt was due to an error.
Should you need to configure the probe-rs
invocation to e.g. flash a different microcontroller you can do that in the .cargo/config.toml
file.
[target.thumbv7em-none-eabihf]
runner = [
"probe-rs",
"run",
"--chip",
"nRF52840_xxAA"
]
# ..
π How does flashing work?
The flashing process consists of the PC communicating with a second microcontroller on the nRF52840 DK over USB (J2 port). This second microcontroller, which is a J-Link Arm Debug Probe, is connected to the nRF52840 through an electrical interface known as SWD (Serial Wire Debug). The SWD protocol specifies procedures for reading memory, writing to memory, halting the target processor, reading the target processor registers, etc.
π How does logging work?
Logging is implemented using the Real Time Transfer (RTT) protocol. Under this protocol the target device writes log messages to a ring buffer stored in RAM; the PC communicates with the J-Link to read out log messages from this ring buffer. This logging approach is non-blocking in the sense that the target device does not have to wait for physical IO (USB comm, serial interface, etc.) to complete while logging messages since they are written to memory. It is possible, however, for the target device to run out of space in its logging ring buffer; this causes old log messages to be overwritten or the microcontroller to pause whilst waiting for the PC to catch up with reading messages (depending on configuration).
Panicking
β
Open the nrf52-code/radio-app/src/bin/panic.rs
file and click the "Run" button (or run with cargo run --bin panic
).
This program attempts to index an array beyond its length and this results in a panic.
$ cargo run --bin panic
Compiling defmt-macros v0.3.6
Compiling defmt v0.3.5
Compiling defmt-rtt v0.4.0
Compiling panic-probe v0.3.1
Compiling dk v0.0.0 (/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/boards/dk)
Compiling radio_app v0.0.0 (/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app)
Finished dev [optimized + debuginfo] target(s) in 1.27s
Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/panic`
Erasing β [00:00:00] [#######################################################################################################################################] 16.00 KiB/16.00 KiB @ 32.26 KiB/s (eta 0s )
Programming β [00:00:00] [#######################################################################################################################################] 16.00 KiB/16.00 KiB @ 41.48 KiB/s (eta 0s ) Finished in 0.904s
ERROR panicked at src/bin/panic.rs:30:13:
index out of bounds: the len is 3 but the index is 3
ββ panic_probe::print_defmt::print @ /Users/jonathan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/panic-probe-0.3.1/src/lib.rs:104
`dk::fail()` called; exiting ...
Frame 0: fail @ 0x00001308
/Users/jonathan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cortex-m-semihosting-0.5.0/src/lib.rs:201:13
Frame 1: __cortex_m_rt_HardFault @ 0x000016a6 inline
/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/lib.rs:12:5
Frame 2: __cortex_m_rt_HardFault_trampoline @ 0x00000000000016a2
/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/lib.rs:10:1
Frame 3: "HardFault handler. Cause: Escalated UsageFault (Undefined instruction)." @ 0x000016a6
Frame 4: __udf @ 0x00001530 inline
./asm/lib.rs:48:1
Frame 5: __udf @ 0x0000000000001530
./asm/lib.rs:51:17
Frame 6: udf @ 0x0000151c
/Users/jonathan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cortex-m-0.7.7/src/asm.rs:43:5
Frame 7: hard_fault @ 0x0000150e
/Users/jonathan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/panic-probe-0.3.1/src/lib.rs:86:5
Frame 8: panic @ 0x000014dc
/Users/jonathan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/panic-probe-0.3.1/src/lib.rs:54:9
Frame 9: panic_fmt @ 0x0000034a
/rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
Frame 10: panic_bounds_check @ 0x000003fe
/rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:190:5
Frame 11: bar @ 0x00000180
/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/bin/panic.rs:30:13
Frame 12: foo @ 0x00000176
/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/bin/panic.rs:24:2
Frame 13: __cortex_m_rt_main @ 0x000002de
/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/bin/panic.rs:13:5
Frame 14: __cortex_m_rt_main_trampoline @ 0x0000018a
/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/bin/panic.rs:9:1
Frame 15: memmove @ 0x0000013c
Frame 16: memmove @ 0x0000013c
Error: Semihosting indicates exit with failure code: 0x020023 (131107)
In no_std
programs the behavior of panic is defined using the #[panic_handler]
attribute. In the example, the panic handler is defined in the panic-probe
crate but we can also implement a custom one in our binary:
β
Change radio-app/lib.rs
and remove the use panic_probe as _;
line and add a custom panic handler, like:
#[panic_handler]
fn panic(info: &core::panic::PanicInfo) -> ! {
defmt::error!("Oops!! {}", defmt::Debug2Format(info));
dk::fail();
}
Now run the program again. Try again, but without printing the info
variable. Can you print info
without defmt::Debug2Format(..)
wrapped around it? Why not?
Using a Hardware Abstraction Layer
β
Open the nrf52-code/radio-app/src/bin/led.rs
file.
You'll see that it initializes your board using the dk
crate:
let board = dk::init().unwrap();
This grants you access to the board's peripherals, like its LEDs.
The dk
crate / library is a Board Support Package (BSP) tailored to this workshop to make accessing the peripherals used in this workshop extra seamless. You can find its source code at nrf52-code/boards/dk/src/
.
dk
is based on the nrf52840-hal
crate, which is a Hardware Abstraction Layer (HAL) over the nRF52840 System on Chip. The purpose of a HAL is to abstract away the device-specific details of the hardware, for example registers, and instead expose a higher level API more suitable for application development.
The dk::init
function we have been calling in all programs initializes a few of the nRF52840 peripherals and returns a Board
structure that provides access to those peripherals. We'll first look at the Leds
API.
β
Run the led
program. Two of the green LEDs on the board should turn on; the other two should stay off.
NOTE this program will not terminate itself. Within VS code you need to click "Kill terminal" (garbage bin icon) in the bottom panel to terminate it.
β
Open the documentation for the dk
crate by running the following command from the nrf52-code/radio-app
folder:
cargo doc -p dk --open
β
Check the API docs of the Led
abstraction. Change the led
program, so that the bottom two LEDs are turned on, and the top two are turned off.
π If you want to see logs from Led API of the dk
Board Support Package, flash the dk with the following environment variable:
DEFMT_LOG=trace cargo run --bin led
The logs will appear on your console, as the output of cargo run
. Among the logs you'll find the line "I/O pins have been configured for digital output". At this point the electrical pins of the nRF52840 microcontroller have been configured to drive the 4 LEDs on the board.
After the dk::init
logs you'll find logs about the Led
API. As the logs indicate, an LED becomes active when the output of the pin is a logical zero, which is also referred to as the "low" state. This "active low" configuration does not apply to all boards: it depends on how the pins have been wired to the LEDs. You should refer to the board documentation to find out which pins are connected to LEDs and whether "active low" or "active high" applies to it.
π When writing your own embedded project, you can implement your own BSP similar to dk
, or use the matching HAL crate for your chip directly. Check out awesome-embedded-rust if there's a BSP for the board you want to use, or a HAL crate for the chip you'd like to use.
Timers and Time
Next we'll look into the time related APIs exposed by the dk
HAL.
β
Open the nrf52-code/radio-app/src/bin/blinky.rs
file.
This program will blink (turn on and off) one of the LEDs on the board. The time interval between each toggle operation is one second. This wait time between consecutive operations is generated by the blocking timer.wait
operation. This function call will block the program execution for the specified Duration
argument.
The other time related API exposed by the dk
HAL is the dk::uptime
function. This function returns the time that has elapsed since the call to the dk::init
function. It is used in the program to log the time of each LED toggle operation.
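π To make the relationship between these APIs concrete, here is a minimal sketch in the spirit of blinky.rs. The field and method names (leds._1, toggle, wait, uptime) are taken from that example and the generated docs, and may differ slightly in your checkout, so treat this as an illustration rather than copy-paste code.

// A hedged sketch of the time-related APIs described above.
let board = dk::init().unwrap();
let mut led = board.leds._1;
let mut timer = board.timer;

loop {
    led.toggle();
    // block for one second between toggles
    timer.wait(core::time::Duration::from_secs(1));
    // `dk::uptime()` returns the time elapsed since `dk::init()` was called
    defmt::println!("uptime: {}", defmt::Debug2Format(&dk::uptime()));
}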
β
Try changing the Duration
value passed to timer.wait
. Try values larger than one second and smaller than one second. What values of Duration
make the blinking imperceptible?
β If you set the duration to below 2ms, try removing the defmt::println!
call in the loop. Too much logging will fill the logging buffer and slow the loop down, causing the blink frequency to drop after a while.
nRF52840 Dongle
Next, we'll look into the radio API exposed by the dk
HAL. But before that we'll need to set up the nRF52840 Dongle.
From this section on, we'll use the nRF52840 Dongle in addition to the nRF52840 DK. We'll run some pre-compiled programs on the Dongle and write programs for the DK that will interact with the Dongle over a radio link.
π¬ How to find the buttons on the Dongle: Put the Dongle in front of you, so that the side with the components mounted on it faces up. Rotate it so that the narrower end of the board, with the surface-mount USB connector, faces away from you. The Dongle has two buttons. They are next to each other in the lower left corner of the Dongle. The reset button (RESET) is mounted sideways; its square-shaped face points towards you. Further away from you is the round-ish user button (SW1), which faces up.
The Dongle does not contain an on-board debugger, like the DK, so we cannot use probe-rs
tools to write programs into it. Instead, the Dongle's stock firmware comes with a bootloader.
When put in bootloader mode the Dongle will run a bootloader program instead of the last application that was flashed into it. This bootloader program will make the Dongle show up as a USB CDC ACM device (AKA Serial over USB device) that accepts new application images over this interface. We'll use the nrfdfu
tool to communicate with the bootloader-mode Dongle and flash new images into it.
β Connect the Dongle to your computer. Put the Dongle in bootloader mode by pressing its reset button.
When the Dongle is in bootloader mode its red LED will pulsate. The Dongle will also appear as a USB CDC ACM device with vendor ID 0x1915
and product ID 0x521f
.
You can also use cyme
, a cross-platform version of the lsusb
tool, to check out the status of the Dongle.
β
Run cyme
to list all USB devices.
Output should look like this:
$ cyme
(..)
2 16 ο 0x1915 0x521f Open DFU Bootloader E1550CA275E7 12.0 Mb/s
(..)
The first two values depend on your host computer and which USB port you used, so they will be different for you. The hex-string is the device's unique ID and that will also be different.
Now that the device is in bootloader mode, you need to get the Dongle Firmware.
βοΈ This firmware will not be found in the git checkout - you need to get it from https://github.com/ferrous-systems/rust-exercises/releases.
- If you have downloaded and unpacked the complete rust-exercises release zip file, the firmware will be in the nrf52-code/boards/dongle-fw directory.
- If not, you can download the individual firmware files from the releases page. You need puzzle-fw and loopback-fw.
For the next section you'll need to flash the loopback-fw
file onto the Dongle.
β
Change to the directory where the loopback-fw
file is located and run:
nrfdfu ./loopback-fw
Expected output:
[INFO nrfdfu] Sending init packet...
[INFO nrfdfu] Sending firmware image of size 37328...
[INFO nrfdfu] Done.
After the device has been programmed it will automatically reset and start running the new application.
π Alternatively, you can also use Nordic's own nrfutil tool to convert a .hex file and flash it for you, among many other things. nrfutil is a very powerful tool, but it is also unstable at times, which is why we replaced the parts we needed from it with nrfdfu.
π The loopback
application will make the Dongle enumerate itself as a CDC ACM device.
β
Run cyme
to see the newly enumerated Dongle in the output:
$ cyme
(..)
2 16 ο 0x1209 0x0309 Dongle Loopback - 12.0 Mb/s
The loopback
app will log messages over the USB interface. To display these messages on the host we have provided a cross-platform tool: cargo xtask serial-term
.
β Do not use serial terminal emulators like minicom
or screen
. They use the USB TTY ACM interface in a slightly different manner and may result in data loss.
β
Run cargo xtask serial-term
from the root of the extracted tarball / git checkout. It shows you the logging output the Dongle is sending on its serial interface to your computer. This helps you monitor what's going on at the Dongle and debug connection issues. Start with the Dongle unplugged and you should see the following output:
$ cargo xtask serial-term
Finished dev [unoptimized + debuginfo] target(s) in 0.02s
Running `xtask/target/debug/xtask serial-term`
(waiting for the Dongle to be connected)
deviceid=588c06af0877c8f2 channel=20 TxPower=+8dBm app=loopback-fw
This line is printed by the loopback
app on boot. It contains the device ID of the dongle, a 64-bit unique identifier (so everyone will see a different number); the radio channel that the device will use to communicate; and the transmission power of the radio in dBm.
If you don't get any output from cargo xtask serial-term
check the USB dongle troubleshooting section.
Interference
At this point you should not get more output from cargo xtask serial-term
.
β If you get "received N bytes" lines in output like this:
$ cargo xtask serial-term
deviceid=588c06af0877c8f2 channel=20 TxPower=+8dBm app=loopback-fw
received 7 bytes (CRC=Ok(0x2459), LQI=0)
received 5 bytes (CRC=Ok(0xdad9), LQI=0)
received 6 bytes (CRC=Ok(0x72bb), LQI=0)
That means the device is observing interference traffic, likely from 2.4 GHz Zigbee, WiFi or Bluetooth. In this scenario you should switch the listening channel to one where you don't observe interference. Use the cargo xtask change-channel
tool to do this in a second window. The tool takes a single argument: the new listening channel which must be in the range 11-26.
$ cargo xtask change-channel 11
requested channel change to channel 11
Then you should see new output from cargo xtask serial-term
:
deviceid=588c06af0877c8f2 channel=20 TxPower=+8dBm app=loopback-fw
(..)
now listening on channel 11
Leave the Dongle connected and cargo xtask serial-term
running. Now we'll switch back to the Development Kit. Note that if you remove and re-insert the dongle, it goes back to its default channel of 20.
Radio Out
In this section you'll send radio packets from the DK to the Dongle and get familiar with the different settings of the radio API.
Radio Setup
β
Open the nrf52-code/radio-app/src/bin/radio-send.rs
file.
β
First run the program radio-send.rs
as it is. You should see new output in the output of cargo xtask serial-term
, if you left your Dongle on channel 20. If you changed your Dongle's channel to avoid interference, change the channel to match in radio-send.rs
before you run it.
$ cargo xtask serial-term
deviceid=588c06af0877c8f2 channel=20 TxPower=+8dBm app=loopback-fw
received 5 bytes (CRC=Ok(0xdad9), LQI=53)
The program broadcasts a radio packet that contains the 5-byte string Hello
over channel 20 (which has a center frequency of 2450 MHz). The loopback
program running on the Dongle is listening to all packets sent over channel 20; every time it receives a new packet it reports its length and the Link Quality Indicator (LQI) metric of the transmission over the USB/serial interface. As the name implies the LQI metric indicates how good the connection between the sender and the receiver is (a higher number means better quality).
Because of how our firmware generates a semihosting exception to tell our flashing tool (probe-rs
) when the firmware has finished running, if you load the radio-send
firmware and then power-cycle the nRF52840-DK, the firmware will enter a reboot loop and repeatedly send a packet. This is because nothing catches the semihosting exception and so the CPU reboots, sends a packet, and then tries another semihosting exception.
Messages
In radio-send.rs
we propose three different ways to define the bytes we want to send to the radio:
#![allow(unused)] fn main() { let msg: &[u8; 5] = &[72, 101, 108, 108, 111]; let msg: &[u8; 5] = &[b'H', b'e', b'l', b'l', b'o']; let msg: &[u8; 5] = b"Hello"; }
Here, we explain the different types.
Slices
The send
method takes a reference -- in Rust, a reference (&
) is a non-null pointer that's compile-time known to point into valid (e.g. non-freed) memory -- to a Packet
as argument. A Packet
is a stack-allocated, fixed-size buffer. You can fill the Packet
(buffer) with data using the copy_from_slice
method -- this will overwrite previously stored data.
This copy_from_slice
method takes a slice of bytes (&[u8]
). A slice is a reference into a list of elements stored in contiguous memory. One way to create a slice is to take a reference to an array, a fixed-size list of elements stored in contiguous memory.
#![allow(unused)] fn main() { // stack allocated array let array: [u8; 3] = [0, 1, 2]; let ref_to_array: &[u8; 3] = &array; let slice: &[u8] = &array; }
slice
and ref_to_array
are constructed in the same way but have different types. ref_to_array
is represented in memory as a single pointer (1 word / 4 bytes); slice
is represented as a pointer + length (2 words, or 8 bytes).
Because slices track length at runtime rather than in their type they can point to chunks of memory of any length.
let array1: [u8; 3] = [0, 1, 2];
let array2: [u8; 4] = [0, 1, 2, 3];
let mut slice: &[u8] = &array1;
defmt::println!("{:?}", slice); // length = 3
// now point to the other array
slice = &array2;
defmt::println!("{:?}", slice); // length = 4
Byte literals
In the example we sent the list of bytes: [72, 101, 108, 108, 111]
, which can be interpreted as the string "Hello"
. To see why this is the case check this list of printable ASCII characters. You'll see that letter H
is represented by the (single-byte) value 72
, e
by 101
, etc.
Rust provides a more convenient way to write ASCII characters: byte literals. b'H'
is syntactic sugar for the literal 72u8
, b'e'
is equivalent to 101u8
, etc. So we can rewrite [72, 101, 108, 108, 111]
as [b'H', b'e', b'l', b'l', b'o']
. Note that byte literals can also represent u8
values that are not printable ASCII characters: those values are written using escape sequences like b'\x7F'
, which is equivalent to 0x7F
.
Byte string literals
[b'H', b'e', b'l', b'l', b'o']
can be further rewritten as b"Hello"
. This is called a byte string literal (note that unlike a string literal like "Hello"
this one has a b
before the opening double quote). A byte string literal is a series of byte literals (u8
values); these literals have type &[u8; N]
where N
is the number of byte literals in the string.
Because byte string literals are references you need to dereference them to get an array type.
#![allow(unused)] fn main() { let reftoarray: &[u8; 2] = b"Hi"; // these two are equivalent let array1: [u8; 2] = [b'H', b'i']; let array2: [u8; 2] = *b"Hi"; // ^ ^ dereference }
Or if you want to go the other way around: you need to take a reference to an array to get the same type as a byte string literal.
#![allow(unused)] fn main() { // these two are equivalent let reftoarray1: &[u8; 2] = b"Hi"; let reftoarray2: &[u8; 2] = &[b'H', b'i']; // ^ ^ }
Character constraints in byte string vs. string literals
You can encode text as b"Hello"
or as "Hello"
.
b"Hello"
is by definition a string (series) of byte literals so each character has to be a byte literal like b'A'
or b'\x7f'
. You cannot use "Unicode characters" (char
type) like emoji or CJK (Chinese Japanese Korean) in byte string literals.
On the other hand, "Hello"
is a string literal with type &str
. str
strings in Rust contain UTF-8 data so these string literals can contain CJK characters, emoji, Greek letters, Cyrillic script, etc.
Printing strings and characters
In this workshop we'll work with ASCII strings so byte string literals that contain no escaped characters are OK to use as packet payloads.
You'll note that defmt::println!("{:?}", b"Hello")
will print [72, 101, 108, 108, 111]
rather than "Hello"
and that the {}
format specifier (Display
) does not work. This is because the type of the literal is &[u8; N]
and in Rust this type means "bytes"; those bytes could be ASCII data, UTF-8 data or something else.
To print this you'll need to convert the slice &[u8]
into a string (&str
) using the core::str::from_utf8
function. This function will verify that the slice contains well formed UTF-8 data and interpret it as a UTF-8 string (&str
). We were careful to ensure that our three example messages were the same, and were all valid UTF-8, so we expect the conversion to always succeed. Why not try and see which bytes cause this conversion to fail?
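π As a quick illustration (a sketch, not code taken from the exercise files), converting received bytes to a string could look like this; the {=str} display hint is the defmt way of printing a &str value.

// Turn a byte slice into a &str, checking that it is valid UTF-8 first.
let bytes: &[u8] = b"Hello";
match core::str::from_utf8(bytes) {
    Ok(text) => defmt::println!("as text: {=str}", text), // prints "as text: Hello"
    Err(_) => defmt::println!("payload was not valid UTF-8"),
}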
Something similar will happen with byte literals: defmt::println!("{}", b'A')
will print 65
rather than A
. To get the A
output you can cast the byte literal (u8
value) to the char
type: defmt::println!("{}", b'A' as char)
.
Link Quality Indicator (LQI)
received 7 bytes (CRC=Ok(0x2459), LQI=60)
β
Now run the radio-send
program several times, with different variations, to explore how LQI can be influenced:
- change the distance between the Dongle and the DK -- move the DK closer to or further away from the Dongle.
- change the transmit power
- change the channel
- change the length of the packet
- different combinations of all of the above
Take note of how LQI changes with these changes. Does packet loss occur in any of these configurations?
NOTE if you decide to send many packets in a single program then you should use the
Timer
API to insert a delay of at least five milliseconds between the transmissions. This is required because the Dongle will use the radio medium right after it receives a packet. Not including the delay will result in the Dongle missing packets.
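π A sketch of what such a send loop could look like, reusing the Packet, radio and timer values that radio-send.rs already sets up (treat the exact types and method signatures as assumptions and check the generated docs):

// Send several packets, waiting between transmissions so the Dongle's
// own reply traffic does not collide with ours.
let mut packet = Packet::new();
for _ in 0..10 {
    packet.copy_from_slice(b"Hello");
    radio.send(&mut packet);
    // give the Dongle time to use the radio medium after it receives our packet
    timer.wait(core::time::Duration::from_millis(5));
}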
802.15.4 radios are often used in mesh networks like Wireless Sensor Networks (WSN). The devices, or nodes, in these networks can be mobile, so the distance between nodes can change over time. To prevent the link between two nodes breaking due to this mobility, the LQI metric is used to adjust the transmission power: if the metric degrades, the power should be increased. At the same time, the nodes in these networks often need to be power efficient (e.g. battery powered), so the transmission power is usually set as low as possible -- again the LQI metric is used to pick an adequate transmission power.
π 802.15.4 compatibility
The radio API we are using follows the PHY layer of the IEEE 802.15.4 specification, but it's missing MAC level features like addressing (each device gets its own address), opt-in acknowledgment (a transmitted packet must be acknowledged with a response acknowledgment packet; the packet is re-transmitted if the packet is not acknowledged in time). These MAC level features are not implemented in hardware (in the nRF52840 Radio peripheral) so they would need to be implemented in software to be fully IEEE 802.15.4 compliant.
This is not an issue for the workshop exercises but it's something to consider if you would like to continue from here and build an 802.15.4-compliant network API.
Radio In
In this section we'll explore the recv_timeout
method of the Radio API. As the name implies, this is used to listen for packets. The method will block the program execution until a packet is received or the specified timeout has expired. We'll continue to use the Dongle in this section; it should be running the loopback
application; and cargo xtask serial-term
should also be running in the background.
The loopback
application running on the Dongle will broadcast a radio packet after receiving one over channel 20. The contents of this outgoing packet will be the contents of the received one but reversed.
β
Open the nrf52-code/radio-app/src/bin/radio-recv.rs
file. Make sure that the Dongle and the Radio are set to the same channel. Click the "Run" button.
The Dongle does not inspect the contents of your packet and does not require them to be ASCII or UTF-8. It will simply send a packet back containing the same bytes it received, except the bytes will be in reverse order to how you sent them.
That is, if you send b"olleh"
, it will send back b"hello"
.
The Dongle will respond as soon as it receives a packet. If you insert a delay between the send
operation and the recv
operation in the radio-recv
program this will result in the DK not seeing the Dongle's response. So try this:
β
Add a timer.wait(x)
call before the recv_timeout
call, where x
is core::time::Duration
; try different lengths of time for x
and observe what happens.
Having log statements between send
and recv_timeout
can also cause packets to be missed so try to keep those two calls as close to each other as possible and with as little code in between as possible.
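π For example (a sketch only; the recv_timeout arguments are elided because they should stay exactly as they are in radio-recv.rs):

radio.send(&mut packet);

// the extra wait inserted for this experiment -- try different values of `x`
let x = core::time::Duration::from_millis(5);
timer.wait(x);

// then receive exactly as radio-recv.rs already does:
// radio.recv_timeout(/* unchanged arguments */)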
NOTE Packet loss can always occur in wireless networks, even if the radios are close to each other. The
Radio
API we are using will not detect lost packets because it does not implement IEEE 802.15.4 Acknowledgement Requests. For the next step in the workshop, we will use a new function to handle this for us. For the sake of other radio users, please do ensure you never callsend()
in a tight loop!
Radio Puzzle
Your task in this section is to decrypt an ASCII string stored in the Dongle, using one of the stack-allocated maps in the heapless
crate. The string has been encrypted with a simple substitution cipher.
Preparing the Dongle
β
Flash the puzzle-fw
program on the Dongle. Follow the instructions from the nRF52840 Dongle section but flash the puzzle-fw
program instead of the loopback-fw
one -- don't forget to put the Dongle in bootloader mode (pressing the reset button) before invoking nrfdfu
.
Like in the previous sections the Dongle will listen for radio packets -- this time over channel 25 -- while also logging messages over a USB/serial interface. It also prints a .
periodically so you know it's still alive.
Sending Messages and Receiving the Dongle's Responses
β
Open the nrf52-code/radio-app
folder in VS Code; then open the src/bin/radio-puzzle.rs
file. Run the program.
This will send a zero-sized packet (let msg = b"") to the dongle. It does this using a special function called dk::send_recv
. This function will:
1. Determine a unique address for your nRF52840 (Nordic helpfully bakes a different random address into every nRF52 chip they make)
2. Construct a packet where the first six bytes are the unique address, and the remainder are the ones you passed to the send_recv() function
3. Use the Radio::send() method to wait for the channel to be clear (using a Clear Channel Assessment) before actually sending the packet
4. Use the Radio::recv_timeout() method to wait for a reply, up to the given number of microseconds specified
5. Check that the first six bytes in the reply match our six-byte address:
   a. If so, the remainder of the reply is returned as the Ok variant
   b. Otherwise, increment a retry counter and, if we have run out of retry attempts, we return the Err variant
   c. Otherwise, we go back to step 2 and try again.
This function allows communication with the USB dongle to be relatively robust, even in the presence of other devices on the same channel. However, it's not perfect and sometimes you will run out of retry attempts and your program will need to be restarted.
β The Dongle responds to the DK's requests wirelessly (i.e. by sending back radio packets) as well. You'll see the dongle responses printed by the DK. This means you don't have to worry if serial-term doesn't work on your machine.
β Try sending one-byte sized packets. β Try sending longer packets.
What happens?
Answer
The Dongle will respond differently depending on the length of the payload in the incoming packet:
- On zero-sized payloads (i.e. packets that only contain the device address and nothing else) it will respond with the encrypted string.
- On one-byte sized payloads it will respond with the direct mapping from the given plaintext letter (single
u8
value) to the corresponding ciphertext letter (anotheru8
value). - On payloads of any other length the Dongle will respond with the string
correct
if it received the correct secret string, otherwise it will respond with the stringincorrect
.
The Dongle will always respond with payloads that are valid UTF-8 so you can use str::from_utf8
on the response packets. However, do not attempt to look inside the raw packet, as it will contain six random address bytes at the start, and they will not be valid UTF-8. Only look at the &[u8]
that the send_recv()
function returns, and treat the Packet
as just a storage area that you don't look inside.
This step is illustrated in src/bin/radio-puzzle-1.rs
From here on, the exercise can be solved in multiple ways. If you have an idea on how to go from here and what tools to use, you can work on your own. If you don't have an idea what to do next or what tools to use, we'll provide a guide on the next page.
Help
Use a dictionary
Our suggestion is to use a dictionary / map. std::collections::HashMap
is not available in no_std
code (it requires a secure random number generator to prevent collision attacks) but you can use one of the stack-allocated maps in the heapless
crate. It supplies a stack-allocated, fixed-capacity version of the std::Vec
type which will come in handy to store byte arrays. To store character mappings we recommend using a heapless::LinearMap
.
heapless
is already declared as a dependency in the Cargo.toml of the project so you can directly import it into the application code using a use
statement.
use heapless::Vec; // like `std::Vec` but stack-allocated
use heapless::LinearMap; // a dictionary / map
fn main() {
// A linear map (dictionary) with a capacity of 16 `(u8, u8)` key-value pairs allocated on the stack
let mut my_map = LinearMap::<u8, u8, 16>::new();
my_map.insert(b'A', b'~').unwrap();
// A vector with a fixed capacity of 8 `u8` elements allocated on the stack
let mut my_vec = Vec::<u8, 8>::new();
my_vec.push(b'A').unwrap();
}
If you haven't used a stack-allocated collection before note that you'll need to
specify the capacity of the collection as a const-generic parameter. The larger
the value, the more memory the collection takes up on the stack. The
heapless::LinearMap
documentation of the heapless
crate has some
usage examples, as does the heapless::Vec
documentation.
Note the difference between character literals and byte literals!
Something you will likely run into while solving this exercise are character literals ('c'
) and byte literals (b'c'
). The former has type char
and represents a single Unicode "scalar value". The latter has type u8
(1-byte integer) and it's mainly a convenience for getting the value of ASCII characters, for instance b'A'
is the same as the 65u8
literal.
IMPORTANT you do not need to use the str
or char
API to solve this problem, other than for printing purposes. Work directly with slices of bytes ([u8]
) and bytes (u8
); and only convert those to str
or char
when you are about to print them.
Note: The plaintext secret string is not stored in
puzzle-fw
so runningstrings
on it will not give you the answer. Nice try.
Make sure not to flood the log buffer
When you log more messages than can be moved from the target to the probe, the log buffer will get overwritten, resulting in data loss. This can easily happen when you repeatedly poll the dongle and log the result. The quickest solution is to wait a short while before you send the next packet so that the logs can be processed in the meantime.
use core::time::Duration;
use cortex_m_rt::entry;

#[entry]
fn main() -> ! {
    // `dk::init()` gives us access to the board's peripherals, including the timer
    let board = dk::init().unwrap();
    let mut timer = board.timer;

    for plainletter in 0..=127 {
        /* ... send letter to dongle ... */
        defmt::println!("got response");
        /* ... store output ... */

        // give the host time to drain the log buffer before the next iteration
        timer.wait(Duration::from_millis(20));
    }

    dk::exit()
}
Recommended Steps
Each step is demonstrated in a separate example, so if, for example, you only need a quick reference for the map API, you can look at step/example number 2 and ignore the others.
1. Send a one letter packet (e.g. A) to the radio to get a feel for how the mapping works. Then do a few more letters. See src/bin/radio-puzzle-1.rs.
2. Get familiar with the dictionary API. Do some insertions and look ups. What happens if the dictionary gets full? See src/bin/radio-puzzle-2.rs.
3. Next, get mappings from the radio and insert them into the dictionary. See src/bin/radio-puzzle-3.rs.
4. You'll probably want a buffer to place the plaintext in. We suggest using heapless::Vec for this. See src/bin/radio-puzzle-4.rs for a starting-point (NB it is also possible to decrypt the packet in place).
5. Simulate decryption: fetch the encrypted string and "process" each of its bytes. See src/bin/radio-puzzle-5.rs.
6. Now merge steps 3 and 5: build a dictionary, retrieve the secret string and do the reverse mapping to decrypt the message. See src/bin/radio-puzzle-6.rs.
7. As a final step, send the decrypted string to the Dongle and check if it was correct or not. See src/bin/radio-puzzle-7.rs.
For your reference, we have provided a complete solution in the src/bin/radio-puzzle-solution.rs
file. That solution is based on the seven steps outlined above. Did you solve the puzzle in a different way?
All finished? See the next steps.
Next Steps
If you've already completed the main workshop tasks or would like to explore more on your own this section has some suggestions.
Alternative containers
Modify-in-place
If you solved the puzzle using a Vec
buffer you can try solving it without the buffer as a stretch goal. You may find the slice methods that let you mutate a Packet
's data useful, but remember that the first six bytes of your Packet
will be the random device address - you can't decrypt those! A solution that does not use a heapless::Vec
buffer can be found in the src/bin/radio-puzzle-solution-2.rs
file.
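π As a generic illustration of in-place mutation (not tied to the Packet API; decrypt_byte is a hypothetical helper standing in for your dictionary lookup):

// Decrypt a buffer of ciphertext bytes in place. In the real exercise you
// would only do this to the payload part of the packet, not the address bytes.
fn decrypt_in_place(data: &mut [u8], decrypt_byte: impl Fn(u8) -> u8) {
    for byte in data.iter_mut() {
        *byte = decrypt_byte(*byte);
    }
}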
Using liballoc::BTreeMap
If you solved the puzzle using a heapless::Vec
buffer and a heapless::LinearMap
and you still need something else to try, you could look at the Vec
and BTreeMap
types contained within liballoc
. This will require you to set up a global memory allocator, like embedded-alloc
.
Collision avoidance
In this section you'll test the collision avoidance feature of the IEEE 802.15.4 radio used by the Dongle and DK.
If you check the API documentation of the Radio
abstraction we have been using you'll notice that we haven't used these methods: energy_detection_scan()
, set_cca()
and try_send()
.
The first method scans the currently selected channel (see set_channel()
), measures the energy level of ongoing radio communication in this channel and returns the maximum energy observed over a span of time. This method can be used to determine what the idle energy level of a channel is. If there's non-IEEE 802.15.4 traffic on this channel the method will return a high value.
Under the 802.15.4 specification, before sending a data packet devices must first check if there's communication going on in the channel. This process is known as Clear Channel Assessment (CCA). The send
method we have been using performs CCA in a loop and sends the packet only when the channel appears to be idle. The try_send
method performs CCA once and returns the Err
variant if the channel appears to be busy. In this failure scenario the device does not send any packet.
The Radio
abstraction supports 2 CCA modes: CarrierSense
and EnergyDetection
. CarrierSense
is the default CCA mode and what we have been using in this workshop. CarrierSense
will only look for ongoing 802.15.4 traffic in the channel but ignore other traffic like 2.4 GHz WiFi and Bluetooth. The EnergyDetection
mode is able to detect ongoing non-802.15.4 traffic.
Here are some things for you to try out:
- First, read section 6.20.12.4 of the nRF52840 Product Specification, which covers the nRF52840's implementation of CCA.
- Disconnect the Dongle. Write a program for the DK that scans and reports the energy levels of all valid 802.15.4 channels. In your location, which channels have high energy levels when there's no ongoing 802.15.4 traffic? If you can, use an application like WiFi Analyzer to see which WiFi channels are in use in your location. Compare the output of WiFi Analyzer to the values you got from energy_detection_scan. Is there a correspondence? Note that WiFi channels don't match in frequency with 802.15.4 channels; some mapping is required to convert between them -- check this illustration for more details about co-existence of 802.15.4 and WiFi.
- Choose the channel with the highest idle energy. Now write a program on the DK that sets the CCA mode to EnergyDetection and then sends a packet over this channel using try_send. The EnergyDetection CCA mode requires an Energy Detection (ED) "threshold" value. Try different threshold values. What threshold value makes the try_send succeed?
- Repeat the previous experiment but use the channel with the lowest idle energy.
- Pick the channel with the lowest idle energy. Run the loopback app on the Dongle and set its listening channel to the chosen channel. Modify the DK program to perform a send operation immediately followed by a try_send operation. The try_send operation will collide with the response of the Dongle (remember: the Dongle responds to all incoming packets after a 5ms delay - see the loopback-fw program for details). Find an ED threshold that detects this collision and makes try_send return the Err variant.
Interrupt handling
We haven't covered interrupt handling in the workshop but the cortex-m-rt
crate provides attributes to declare exception and interrupt handlers: #[exception]
and #[interrupt]
. You can find documentation about these attributes and how to safely share data with interrupt handlers using Mutexes in the "Concurrency" chapter of the Embedded Rust book.
Another way to deal with interrupts is to use a framework like Real-Time Interrupt-driven Concurrency (RTIC); this framework has a book that explains how you can build reactive applications using interrupts. We use this framework in the "USB" workshop.
Starting a Project from Scratch
So far we have been using a pre-made Cargo project to work with the nRF52840 DK. In this section we'll see how to create a new embedded project for any microcontroller.
Identify the microcontroller
The first step is to identify the microcontroller you'll be working with. The information about the microcontroller you'll need is:
1. Its processor architecture and sub-architecture
This information should be in the device's data sheet or manual. In the case of the nRF52840, the processor is an ARM Cortex-M4 core. With this information you'll need to select a compatible compilation target. rustup target list
will show all the supported compilation targets.
$ rustup target list
(..)
thumbv6m-none-eabi
thumbv7em-none-eabi
thumbv7em-none-eabihf
thumbv7m-none-eabi
thumbv8m.base-none-eabi
thumbv8m.main-none-eabi
thumbv8m.main-none-eabihf
The compilation targets will usually be named using the following format: $ARCHITECTURE-$VENDOR-$OS-$ABI
, where the $VENDOR
field is sometimes omitted. Bare metal and no_std
targets, like microcontrollers, will often use none
for the $OS
field. When the $ABI
field ends in hf
it indicates that the output ELF uses the hardfloat Application Binary Interface (ABI).
The thumb
targets listed above are all the currently supported ARM Cortex-M targets. The table below shows the mapping between compilation targets and ARM Cortex-M processors.
Compilation target | Processor |
---|---|
thumbv6m-none-eabi | ARM Cortex-M0, ARM Cortex-M0+ |
thumbv7m-none-eabi | ARM Cortex-M3 |
thumbv7em-none-eabi | ARM Cortex-M4, ARM Cortex-M7 |
thumbv7em-none-eabihf | ARM Cortex-M4F, ARM Cortex-M7F |
thumbv8m.base-none-eabi | ARM Cortex-M23 |
thumbv8m.main-none-eabi | ARM Cortex-M33, ARM Cortex-M35P |
thumbv8m.main-none-eabihf | ARM Cortex-M33F, ARM Cortex-M35PF |
The ARM Cortex-M ISA is backwards compatible so for example you could compile a program using the thumbv6m-none-eabi
target and run it on an ARM Cortex-M4 microcontroller. This will work but using the thumbv7em-none-eabi
target results in better performance (ARMv7-M instructions will be emitted by the compiler), so it should be preferred. The older ISAs may also be limited in terms of the maximum number of interrupts you can define, which may be fewer than your newer microcontroller actually has.
2. Its memory layout
In particular, you need to identify how much Flash and RAM memory the device has and at which address the memory is exposed. You'll find this information in the device's data sheet or reference manual.
In the case of the nRF52840, this information is in section 4.2 (Figure 2) of its Product Specification. It has:
- 1 MB of Flash that spans the address range:
0x0000_0000
-0x0010_0000
. - 256 KB of RAM that spans the address range:
0x2000_0000
-0x2004_0000
.
The cortex-m-quickstart
project template
With all this information you'll be able to build programs for the target device. The cortex-m-quickstart
project template provides the most frictionless way to start a new project for the ARM Cortex-M architecture -- for other architectures check out other project templates by the rust-embedded organization.
The recommended way to use the quickstart template is through the cargo-generate
tool:
cargo generate --git https://github.com/rust-embedded/cortex-m-quickstart
But it may be difficult to install the cargo-generate
tool on Windows due to its libgit2
(C library) dependency. Another option is to download a snapshot of the quickstart template from GitHub and then fill in the placeholders in Cargo.toml
of the snapshot.
Once you have instantiated a project using the template you'll need to fill in the device-specific information you collected in the two previous steps:
1. Change the default compilation target in .cargo/config
[build]
target = "thumbv7em-none-eabi"
For the nRF52840 you can choose either thumbv7em-none-eabi
or thumbv7em-none-eabihf
. If you are going to use the FPU then select the hf
variant.
2. Enter the memory layout of the chip in memory.x
MEMORY
{
/* NOTE 1 K = 1 KiBi = 1024 bytes */
FLASH : ORIGIN = 0x00000000, LENGTH = 1M
RAM : ORIGIN = 0x20000000, LENGTH = 256K
}
3. cargo build
will now cross-compile programs for your target device
If there's no template or signs of support for a particular architecture under the rust-embedded organization then you can follow the embedonomicon to bootstrap support for the new architecture by yourself.
Flashing the program
To flash the program on the target device you'll need to identify the on-board debugger, if the development board has one. Or choose an external debugger, if the development board exposes a JTAG or SWD interface via some connector.
If the hardware debugger is supported by the probe-rs
project -- for example J-Link, ST-Link or CMSIS-DAP -- then you'll be able to use probe-rs
-based tools like probe-rs
and cargo-embed
. This is the case for the nRF52840 DK: it has an on-board J-Link probe.
If the debugger is not supported by probe-rs
then you'll need to use OpenOCD or vendor provided software to flash programs on the board.
If the board does not expose a JTAG, SWD or similar interface then the microcontroller probably comes with a bootloader as part of its stock firmware. In that case you'll need to use dfu-util
or a vendor specific tool like nrfdfu
or nrfutil
to flash programs onto the chip. This is the case for the nRF52840 Dongle.
Getting output
If you are using one of the probes supported by probe-rs
then you can use the rtt-target
library to get text output on cargo-embed
. The logging functionality we used in the examples is implemented using the rtt-target
crate.
If that's not the case or there's no debugger on board then you'll need to add a HAL before you can get text output from the board.
Adding a Hardware Abstraction Layer (HAL)
Now you can hopefully run programs and get output from them. To use the hardware features of the device you'll need to add a HAL to your list of dependencies. crates.io, lib.rs and awesome embedded Rust are good places to search for HALs.
After you find a HAL you'll want to get familiar with its API through its API docs and examples. HALs do not always expose the exact same API, especially when it comes to initialization and configuration of peripherals. However, most HALs will implement the embedded-hal
traits. These traits allow inter-operation between the HAL and driver crates. These driver crates provide functionality to interface external devices like sensors, actuators and radios over interfaces like I2C and SPI.
If no HAL is available for your device then you'll need to build one yourself. This is usually done by first generating a Peripheral Access Crate (PAC) from a System View Description (SVD) file using the svd2rust
tool. The PAC exposes a low level, but type safe, API to modify the registers on the device. Once you have a PAC you can use one of the many HALs on crates.io as a reference; most of them are implemented on top of svd2rust
-generated PACs.
Hello, π‘
Now that you've set up your own project from scratch, you could start playing around with it by turning on one of the DK's on-board LEDs using only the HAL. Some hints that might be helpful there:
- The Nordic Infocenter tells you which LED is connected to which pin.
nRF52 HAL Workbook
In this workshop you'll learn to:
- use a HAL to provide features in a BSP
- configure GPIO pins using the nRF52 HAL
To test your BSP changes, you will modify a small example: hal-app/src/bin/blinky.rs
You will need an nRF52840 Development Kit for this exercise, but not the nRF USB dongle.
If you haven't completed the Radio Workbook, you should start there, and go at least as far as completing the "Timers and Time" section.
The nRF52840 Development Kit
This is the larger development board.
The board has two USB ports: J2 and J3 and an on-board J-Link programmer / debugger -- there are instructions to identify the ports in a previous section. USB port J2 is the J-Link's USB port. USB port J3 is the nRF52840's USB port. Connect the Development Kit to your computer using the J2 port.
Adding Buttons
To practice using a HAL to provide functionality through a Board Support Package, you will now modify the dk
crate to add support for Buttons.
Change the demo app
β
Change the hal-app/src/bin/buttons.rs
file as described within, so it looks for button presses.
It should now fail to compile, because the dk
crate doesn't have support for buttons. You will now fix that!
Define a Button
β
Open up the dk
crate in VS Code (nrf52-code/boards/dk
) and open src/lib.rs
.
β
Add a struct Button
which represents a single button.
It should be similar to struct Led
, except the inner type must be Pin<Input<PullUp>>
. You will need to import those types - look where Output
and PushPull
types were imported from for clues! Think about where it makes sense to add this new type. At the top? At the bottom? Maybe just after the LED-related types?
π The pins must be set as pull-ups because each button connects a GPIO pin to ground, but the pins float when the button is not pressed. Enabling the pull-ups inside the SoC ensures that the GPIO pin is weakly connected to 3.3V through a resistor, giving it a 'default' value of 'high'. Pressing the button then makes the pin go 'low'.
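π A hedged sketch of what the new type could look like. The field name is an assumption (mirror how struct Led stores its pin in the real lib.rs), and the import path may need to match however lib.rs already brings in the nrf52840-hal types:

use nrf52840_hal::gpio::{Input, Pin, PullUp};

/// One of the 4 user buttons on the board
pub struct Button {
    inner: Pin<Input<PullUp>>,
}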
Define all the Buttons
β
Add a struct Buttons
which contains four buttons.
Use struct Leds
for guidance. Add a buttons
field to struct Board
which is of type Buttons
. Again, think about where it makes sense to insert this new field.
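π And a matching sketch for the new struct and field (the field names here are placeholders; follow the naming that struct Leds and struct Board already use):

/// The four user buttons on the board
pub struct Buttons {
    /// Button 1: P0.11
    pub b1: Button,
    /// Button 2: P0.12
    pub b2: Button,
    /// Button 3: P0.24
    pub b3: Button,
    /// Button 4: P0.25
    pub b4: Button,
}

// ...and add one new field to the existing `struct Board`:
//
//     /// The board's four user buttons
//     pub buttons: Buttons,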
Set up the buttons
Now the Board
struct initialiser is complaining that you didn't initialise the new buttons
field.
β Take pins from the HAL, configure them as inputs with pull-ups, and install them into the Buttons structure.
The mapping is:
- Button 1: P0.11
- Button 2: P0.12
- Button 3: P0.24
- Button 4: P0.25
You can verify this in the User Guide.
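π Inside dk::init, the pin setup could then look roughly like this (a sketch: the name of the port-0 Parts value and the exact builder methods should be taken from how the LEDs are already configured in lib.rs):

// Configure the four button pins as pull-up inputs and degrade them to the
// generic `Pin` type that `Button` stores.
let buttons = Buttons {
    b1: Button { inner: pins.p0_11.into_pullup_input().degrade() },
    b2: Button { inner: pins.p0_12.into_pullup_input().degrade() },
    b3: Button { inner: pins.p0_24.into_pullup_input().degrade() },
    b4: Button { inner: pins.p0_25.into_pullup_input().degrade() },
};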
Run your program
β
Run the buttons
demo:
cd nrf52-code/hal-app
cargo run --bin buttons
Now when you press the button, the LED should illuminate. If it does the opposite, check your working!
Write a more interesting demo program for the BSP
β You've got four buttons and four LEDs. Make up a demo!
If you're stuck for ideas, you could have the LEDs do some kind of animation. The buttons might then stop or start the animation, or make it go faster or slower. Try setting up a loop with a 20ms delay inside it, to give yourself a basic 50 Hz "game tick". You can look at the blinky
demo for help with the timer.
Troubleshooting
π If you get totally stuck, ask for help! If all else fails, you could peek in nrf52-code/boards/dk-solution
, which has a complete set of the required BSP changes.
nRF52 USB Workbook
In this workshop you'll learn to:
- work with registers and peripherals from Rust
- handle external events in embedded Rust applications using RTIC
- debug event driven applications
- test
no_std
code
To put these concepts and techniques in practice you'll write a toy USB device application that gets enumerated and configured by the host. This embedded application will run in a fully event driven fashion: only doing work when the host asks for it.
You will need an nRF52840 Development Kit for this exercise, but not the nRF USB dongle.
The nRF52840 Development Kit
The board has two USB ports: J2 and J3 and an on-board J-Link programmer / debugger -- there are instructions to identify the ports in a previous section. USB port J2 is the J-Link's USB port. USB port J3 is the nRF52840's USB port. Connect the Development Kit to your computer using both ports.
Workshop Steps
You will need to complete the workshop steps in order. It's OK if you don't get them all finished, but you must complete one before starting the next one. You can look at the solution for each step if you get stuck.
If you are reading the book view, the steps are listed on the left in the sidebar (use the hamburger if that is hidden). If you are reading the source on Github, go back to the SUMMARY.md file to see the steps.
Listing USB Devices
As we showed in Preparation/Software Tools, we can use cyme
to list USB devices on our system.
β
To list all USB devices, run cyme
from the top-level checkout.
$ cyme
(...) random other USB devices will be listed
2 15 ο 0x1366 0x1051 J-Link 001050255503 12.0 Mb/s
The goal of this workshop is to get the nRF52840 SoC to show in this list. The embedded application will use the USB Vendor ID (VID) 0x1209 and USB Product ID (PID) 0x0717, as defined in nrf52-code/consts
:
$ cyme
(...) random other USB devices will be listed
2 15 ο 0x1366 0x1051 J-Link 001050255503 12.0 Mb/s
2 16 ο 0x1209 0x0717 composite_device - 12.0 Mb/s
Hello, world!
In this section, we'll set up the integration in VS Code and run the first program.
β
Open the nrf52-code/usb-app
folder in VS Code and open the src/bin/hello.rs
file.
Note: To ensure full rust-analyzer support, do not open the whole
rust-exercises
folder.
Give rust-analyzer some time to analyze the file and its dependency graph. When it's done, a "Run" button will appear over the main
function. If it doesn't appear on its own, type something in the file, delete it again, and save. This should trigger a re-load.
β Click the "Run" button to run the application on the microcontroller.
If you are not using VS Code, run the cargo run --bin hello
command from the nrf52-code/usb-app
folder.
NOTE: Recent versions of the nRF52840-DK have flash read-out protection to stop people dumping the contents of flash on an nRF52 they received pre-programmed, so if you have problems immediately after first plugging your board in, see this page.
If you run into an error along the lines of "Debug power request failed" retry the operation and the error should disappear.
The usb-app
package has been configured to cross-compile applications to the ARM Cortex-M architecture and then run them using the probe-rs
custom Cargo runner. The probe-rs
tool will load and run the embedded application on the microcontroller and collect the logs it emits.
The probe-rs
process will terminate when the microcontroller enters the "halted" state. From the embedded application, one can enter the "halted" state by performing a CPU breakpoint with a special argument that indicates 'success'. For convenience, an exit
function is provided in the dk
Board Support Package (BSP). This function is divergent like std::process::exit
(fn() -> !
) and can be used to halt the device and terminate the probe-rs
process.
Checking the API documentation
We'll be using the dk
Board Support Package. It's good to have its API documentation handy. You can generate the documentation for that crate from the command line:
β
Run the following command from within the nrf52-code/usb-app
folder. It will open the generated documentation in your default web browser. Note that if you run it from inside the nrf52-code/boards/dk
folder, you will find a bunch of USB-related documentation missing, because we disable that particular feature by default.
cargo doc --open
NOTE: If you are using Safari and the documentation is hard to read due to missing CSS, try opening it in a different browser.
β
Browse to the documentation for the dk
crate, and look at what is available within the usbd
module. Some of these functions will be useful later.
RTIC hello
RTIC, Real-Time Interrupt-driven Concurrency, is a framework for building event-driven, time-sensitive applications.
β
Open the nrf52-code/usb-app/src/bin/rtic-hello.rs
file.
RTIC applications are written in RTIC's Domain Specific Language (DSL). The DSL extends Rust syntax with custom attributes like #[init]
and #[idle]
.
RTIC makes a clearer distinction between the application's initialization phase, the #[init]
function, and the application's main loop or main logic, the #[idle]
function. The initialization phase runs with interrupts disabled and interrupts are re-enabled before the idle
function is executed.
rtic::app
is a procedural macro that generates extra Rust code, in addition to the user's functions. The fully expanded version of the macro can be found in the file target/rtic-expansion.rs
. This file will contain the expansion of the procedural macro for the last compiled RTIC application.
β
Build the rtic-hello
example and look at the generated rtic-expansion.rs
file.
You can use rustfmt
on target/rtic-expansion.rs
to make the generated code easier to read. Among other things, the file should contain the following lines. Note that interrupts are disabled during the execution of the init
function:
#[doc(hidden)]
#[no_mangle]
unsafe extern "C" fn main() -> ! {
rtic::export::interrupt::disable();
let mut core: rtic::export::Peripherals = rtic::export::Peripherals::steal().into();
#[inline(never)]
fn __rtic_init_resources<F>(f: F)
where
F: FnOnce(),
{
f();
}
let mut executors_size = 0;
extern "C" {
pub static _stack_start: u32;
pub static __ebss: u32;
}
let stack_start = &_stack_start as *const _ as u32;
let ebss = &__ebss as *const _ as u32;
if stack_start > ebss {
if rtic::export::msp::read() <= ebss {
panic!("Stack overflow after allocating executors");
}
}
__rtic_init_resources(|| {
let (shared_resources, local_resources) =
init(init::Context::new(core.into(), executors_size));
rtic::export::interrupt::enable();
});
idle(idle::Context::new())
}
Dealing with Registers
In this and the next section we'll look into RTIC's event handling features. To explore these features we'll use the action of connecting a USB cable to the DK's port J3 as the event we'd like to handle.
β
Open the nrf52-code/usb-app/src/bin/events.rs
file.
We'll read the code and explain what it does.
The example application enables the signaling of this "USB power" event in the init
function. This is done using the low level register API generated by the svd2rust
tool. The register API was generated from a SVD (System View Description) file, a file that describes all the peripherals and registers, and their memory layout, on a device. In our case the device was the nRF52840; a sample SVD file for this microcontroller can be found here.
In the svd2rust
API, peripherals are represented as structs. The fields of each peripheral struct are the registers associated with that peripheral. Each register field exposes methods to read
and write
to the register in a single memory operation.
The read
and write
methods take closure arguments. These closures in turn grant access to a "constructor" value, usually named r
or w
, which provides methods to modify the bitfields of a register. At the same time the API of these "constructors" prevents you from modifying the reserved parts of the register: you cannot write arbitrary values into registers; you can only write valid values into registers.
Apart from the read
and write
methods there's a modify
method that performs a read-modify-write operation on the register; this API is also closure-based. The svd2rust
-generated API is documented in detail in the svd2rust
crate starting at the Peripheral API section.
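π For example, enabling an interrupt bit might look like this. This is an illustrative fragment rather than a line from events.rs; the register and field names follow the svd2rust conventions for the nRF52840 POWER peripheral and may differ slightly in the generated PAC you are using:

// `power` is the svd2rust-generated POWER peripheral struct.
// Write: set only the USBDETECTED bit; all other writable bits take their reset value.
power.intenset.write(|w| w.usbdetected().set_bit());

// Read-modify-write: read the register, change one bitfield, write it back.
power.intenset.modify(|_r, w| w.usbdetected().set_bit());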
In Cortex-M devices interrupt handling needs to be enabled on two sides: on the peripheral side and on the core side. The register operations done in init
take care of the peripheral side. The core side of the operation involves writing to the registers of the Nested Vector Interrupt Controller (NVIC) peripheral. This second part doesn't need to be done by the user in RTIC applications because the framework takes care of it.
Event Handling
Below the idle
function you'll see a #[task]
handler, a function. This task is bound to the POWER_CLOCK interrupt signal and will be executed, function-call style, every time the interrupt signal is raised by the hardware.
β
Run the events
application. Then connect a micro-USB cable to your PC/laptop then connect the other end to the DK (port J3). You'll see the "POWER event occurred" message after the cable is connected.
Note that all tasks will be prioritized over the idle
function so the execution of idle
will be interrupted (paused) by the on_power_event
task. When the on_power_event
task finishes (returns) the execution of the idle
will be resumed. This will become more obvious in the next section.
Try this: add an infinite loop to the end of init
so that it never returns. Now run the program and connect the USB cable. What behavior do you observe? How would you explain this behavior? (hint: look at the rtic-expansion.rs
file: under what conditions is the init
function executed?)
Task State
Now let's say we want to change the previous program to count how many times the USB cable (port J3) has been connected and disconnected.
β
Open the nrf52-code/usb-app/src/bin/task-state.rs
file.
Tasks run from start to finish, like functions, in response to events. To preserve some state between the different executions of a task we can add a resource to the task. In RTIC, resources are the mechanism used to share data between different tasks in a memory safe manner but they can also be used to hold task state.
To get the desired behavior we'll want to store some counter in the state of the on_power_event
task.
The starter code shows the syntax to declare a resource, the Resources
struct, and the syntax to associate a resource to a task, the resources
list in the #[task]
attribute.
In the starter code a resource is used to move (by value) the POWER peripheral from init
to the on_power_event
task. The POWER peripheral then becomes part of the state of the on_power_event
task and can be persistently accessed throughout calls to on_power_event()
through a reference. The resources of a task are available via the Context
argument of the task.
To elaborate more on this move action: in the svd2rust
API, peripheral types like POWER
are singletons (only a single value of that type can ever exist). The consequence of this design is that holding a peripheral instance, like POWER
, by value means that the function (or task) has exclusive access, or ownership, over the peripheral. This is the case of the init
function: it owns the POWER
peripheral but then transfers ownership over it to a task using the resource initialization mechanism.
We have moved the POWER peripheral into the task because we want to clear the USBDETECTED
interrupt flag after it has been set by the hardware. If we miss this step the on_power_event
task (function) will be called again once it returns and then again and again and again (ad infinitum).
Also note that in the starter code the idle
function has been modified. Pay attention to the logs when you run the starter code.
β Modify the program so that it prints the number of times the USB cable has been connected to the DK every time the cable is connected, as shown below.
USBDETECTED interrupt enabled
idle: going to sleep
on_power_event: cable connected 1 time
idle: woke up
idle: going to sleep
on_power_event: cable connected 2 times
idle: woke up
idle: going to sleep
on_power_event: cable connected 3 times
You can find a solution to this exercise in the nrf52-code/usb-app-solutions/src/bin/task-state.rs
file.
USB Enumeration
A USB device, like the nRF52840, can be in one of these three states:
- Default
- Address
- Configured
After being powered the device will start in the Default state. The enumeration process will take the device from the Default state to the Address state. As a result of the enumeration process the device will be assigned an address, in the range 1..=127
, by the host.
The USB protocol is complex so we'll leave out many details and focus only on the concepts required to get enumeration and configuration working. There are also several USB specific terms so we recommend checking chapter 2, "Terms and Abbreviations", of the USB specification (linked at the bottom of this document) every now and then.
Each OS may perform the enumeration process slightly differently but the process will always involve these host actions:
- A USB reset, to put the device in the Default state, regardless of what state it was in.
- A
GET_DESCRIPTOR
request, to get the device descriptor. - A
SET_ADDRESS
request, to assign an address to the device.
These host actions will be perceived as events by the nRF52840 and these events will cause some bits to be set in the relevant register, and then an interrupt to be fired. During this workshop, we will gradually parse and handle these events and learn more about Embedded Rust along the way.
There are more USB concepts involved that we'll need to cover, like descriptors, configurations, interfaces and endpoints but for now let's see how to handle USB events.
For each step of the course, we've prepared a usb-<n>.rs
file that gives you a base structure and hints on how to proceed. The matching usb-<n>.rs
in usb-app-solutions
contains a sample solution should you need it. Switch from usb-<n>.rs
to usb-<n+1>.rs
when instructed and continue working from there. Please keep the USB cable plugged into J3 through all these exercises.
USB-1: Dealing with USB Events
The USBD
peripheral on the nRF52840 contains a series of registers, called EVENTS
registers, that indicate the reason for entering the USBD interrupt handler. These events must be handled by the application to complete the enumeration process.
β
Open the nrf52-code/usb-app/src/bin/usb-1.rs
file.
In this starter code the USBD
peripheral is initialized in init
and a task, named handle_usb_interrupt
, is bound to the interrupt signal called USBD
. This task will be called every time a new USBD
event needs to be handled. The handle_usb_interrupt
task uses usbd::next_event()
to check all the event registers; if any event is set (i.e. that event just occurred) then the function returns the event, represented by the Event
enum, wrapped in the Some
variant. This Event
is then passed to the on_event
function for further processing.
β Connect the USB cable to the port J3 then run the starter code.
βοΈ Keep the cable connected to the J3 port for the rest of the workshop
This code will panic because Event::UsbReset
is not handled yet - it has a todo!()
on the relevant match
arm.
β
Go to fn on_event(...)
, line 48. You'll need to handle the Event::UsbReset
case - for now, just print the log message returning to the Default state.
β
Now handle the Event::UsbEp0Setup
case - for now, just print the log message usb-1 exercise complete and then execute dk::exit()
to shut down the microcontroller.
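π If you want to check your work, the two arms end up looking something like this (a sketch of the shape only; the surrounding on_event function, the event variable name and the Event enum come from the starter code):

match event {
    Event::UsbReset => {
        defmt::println!("returning to the Default state");
    }
    Event::UsbEp0Setup => {
        defmt::println!("usb-1 exercise complete");
        dk::exit();
    }
    // not reached in this exercise; leave the starter code's arm as-is
    Event::UsbEp0DataDone => todo!(),
}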
Your logs should look like:
USBD initialized
USB: UsbReset
returning to the Default state
USB: UsbEp0Setup
usb-1 exercise complete
You can ignore the Event::UsbEp0DataDone
event for now because we don't yet get far enough when talking to the host computer for this event to come up.
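If it helps to see the shape of the dispatch logic outside the firmware, here is a small host-runnable sketch using a mock Event enum. The real enum and the dk::exit() call come from the workshop crates, so everything below is illustrative only.

// Host-side sketch only: a mock Event enum standing in for the one used by
// the usb-app crate. It mirrors the match you need to fill in for usb-1.rs.
enum Event {
    UsbReset,
    UsbEp0DataDone,
    UsbEp0Setup,
}

fn on_event(event: Event) {
    match event {
        // host issued a USB reset: just log it for now
        Event::UsbReset => println!("returning to the Default state"),
        // not expected yet, so nothing to do
        Event::UsbEp0DataDone => {}
        // end of a SETUP stage: in the real firmware you would also call dk::exit()
        Event::UsbEp0Setup => println!("usb-1 exercise complete"),
    }
}

fn main() {
    on_event(Event::UsbReset);
    on_event(Event::UsbEp0DataDone);
    on_event(Event::UsbEp0Setup);
}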
USB Knowledge
USBRESET
(indicated by Events::UsbReset
)
This event indicates that the host issued a USB reset signal - the first step in the enumeration process. According to the USB specification this will move the device from any state to the Default state. Since we are currently not dealing with any other state, for now we just log that we received this event and move on.
EP0SETUP
(indicated by Events::UsbEp0Setup
)
The USBD
peripheral has detected the SETUP stage of a control transfer. For now, we just print a log message and exit the application.
EP0DATADONE
(indicated by Events::UsbEp0DataDone
)
The USBD
peripheral is signaling the end of the DATA stage of a control transfer. Since you won't encounter this event just yet, you can leave it as it is.
Help
You can find the solution in the nrf52-code/usb-app-solutions/src/bin/usb-1.rs
file.
USB Endpoints
Under the USB protocol data transfers occur over endpoints.
Endpoints are similar to UDP or TCP ports in that they allow logical multiplexing of data over a single physical USB bus. USB endpoints, however, have directions: an endpoint can either be an IN endpoint or an OUT endpoint. The direction is always from the perspective of the host so at an IN endpoint data travels from the device to the host and at an OUT endpoint data travels from the host to the device.
Endpoints are identified by their address, a zero-based index, and direction. There are four types of endpoints: control endpoints, bulk endpoints, interrupt endpoints and isochronous endpoints. Each endpoint type has different properties: reliability, latency, etc. In this workshop we'll only need to deal with control endpoints.
All USB devices must use "endpoint 0" as the default control endpoint. "Endpoint 0" actually refers to two endpoints: endpoint 0 IN and endpoint 0 OUT. This endpoint pair is used to establish a control pipe, a bidirectional communication channel between the host and device where data is exchanged using a predefined format. The default control pipe over endpoint 0 is mandatory: it must always be present and must always be active.
Going back to our enumeration steps, we are expecting the host to request our Device Descriptor using a GET_DESCRIPTOR
request sent over the control pipe. Later, we will expect the host to send us a SET_ADDRESS
request, giving us our new USB address - again, over the control pipe.
For detailed information about endpoints check Section 5.3.1 Device Endpoints, in the USB 2.0 specification. Or you can look at Chapter 3 of USB In a Nutshell.
USB Control Transfers
Before we continue we need to discuss how data transfers work under the USB protocol.
The control pipe handles control transfers, a special kind of data transfer used by the host to issue requests. A control transfer is a data transfer that occurs in three stages: a SETUP stage, an optional DATA stage and a STATUS stage. The device must handle these requests by either supplying the requested data, or performing the requested action.
During the SETUP stage the host sends 8 bytes of data that identify the control request. Depending on the issued request there may be a DATA stage or not; during the DATA stage data is transferred either from the device to the host or the other way around. During the STATUS stage the device acknowledges, or not, the whole control request.
For detailed information about control transfers see Chapter 4 of USB In a Nutshell.
In this workshop, we expect the host to perform a control transfer to find out what kind of device we are.
USB-2: SETUP Stage
At the end of program usb-1
we received a EP0SETUP
event. This event signals the end of the SETUP
stage of a control transfer. The nRF52840 USBD peripheral will automatically receive the SETUP
data and store it in the registers BMREQUESTTYPE
, BREQUEST
, WVALUE{L,H}
, WINDEX{L,H}
and WLENGTH{L,H}
.
In nrf52-code/usb-app/src/bin/usb-2.rs
, you will find a short description of each register above the variable into which it should be read. But before we read those registers, we need to write some parsing code and get it unit tested.
For in-depth register documentation, refer to Sections 6.35.13.31 to 6.35.13.38 of the nRF52840 Product Specification.
Writing a parser for the data of this SETUP stage
We could parse the SETUP data inside our application, but it makes more sense to put the code in a library where we can test it, and where we can share it with other applications.
We have provided just such a library in nrf52-code/usb-lib
. But it's missing some important parts that you need to complete. The definition of Descriptor::Configuration
as well as the associated test has been "commented out" using an #[cfg(TODO)]
attribute because it is not handled by the firmware yet - leave those disabled for the time being.
β
Run cargo test
in the nrf52-code/usb-lib
directory.
When you need to write some no_std
code that does not involve device-specific I/O you should consider writing it as a separate crate. This way, you can test it on your development machine (e.g. x86_64
) using the standard cargo test
functionality.
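As a tiny illustration of that pattern (a hypothetical example, not the exercise code itself), the sketch below is a complete lib.rs for a made-up crate: the logic is no_std, but the unit test runs on the host with cargo test.

// Sketch of a host-testable no_std library (hypothetical example, not usb-lib).
#![cfg_attr(not(test), no_std)]

/// Returns true if this bmRequestType describes a device-to-host transfer
/// (bit 7 set), as in the GET_DESCRIPTOR requests parsed in this exercise.
pub fn is_device_to_host(bmrequesttype: u8) -> bool {
    bmrequesttype & 0b1000_0000 != 0
}

#[cfg(test)]
mod tests {
    use super::is_device_to_host;

    #[test]
    fn direction_bit() {
        assert!(is_device_to_host(0b1000_0000));
        assert!(!is_device_to_host(0b0000_0000));
    }
}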
So that's what we'll do here. In nrf52-code/usb-lib/src/lib.rs
you'll find starter code for writing a no_std
SETUP data parser. The starter code contains some unit tests; you can run them with cargo test
(from within the usb-lib
folder) or you can use Rust Analyzer's "Test" button in VS code.
You should see:
running 2 tests
test tests::set_address ... ok
test tests::get_descriptor_device ... FAILED
failures:
---- tests::get_descriptor_device stdout ----
thread 'tests::get_descriptor_device' panicked at src/lib.rs:119:9:
assertion `left == right` failed
left: Err(UnknownRequest)
right: Ok(GetDescriptor { descriptor: Device, length: 18 })
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
tests::get_descriptor_device
test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
error: test failed, to rerun pass `--lib`
β
Fix the tests by parsing GET_DESCRIPTOR
requests for DEVICE
descriptors.
Modify Request::parse()
in nrf52-code/usb-lib/src/lib.rs
to recognize a GET_DESCRIPTOR
request of type DEVICE
so that the get_descriptor_device
test passes. Note that the parser already handles SET_ADDRESS
requests.
Description of GET_DESCRIPTOR request
We can recognize a GET_DESCRIPTOR request by the following properties:
- bmRequestType is 0b10000000
- bRequest is 6 (i.e. the GET_DESCRIPTOR Request Code, defined in table 9-4 in the USB spec)
Description of GET_DESCRIPTOR requests for DEVICE descriptors
In this step of the exercise, we only need to parse DEVICE descriptor requests. They have the following properties:
- the descriptor type is 1 (i.e. DEVICE, defined in table 9-5 of the USB spec)
- the descriptor index is 0
- the wIndex is 0 for our purposes
- βοΈ you need to fetch the descriptor type from the high byte of wValue, and the descriptor index from the low byte of wValue
Check Section 9.4.3 of the USB specification for a very detailed description of the requests. All the constants we'll be using are also described in Tables 9-3, 9-4 and 9-5 of the same document. Or, you can refer to Chapter 6 of USB In a Nutshell.
You should return Err(Error::xxx)
if the properties aren't met.
π Remember that you can:
- define binary literals by prefixing them with 0b
- use bit shifts (>>) and casts (as u8) to get the high/low bytes of wValue
You will also find this information in the // TODO implement ...
comment in the Request::parse()
function of lib.rs
file.
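If you want to see the rules above in runnable form, here is a minimal host-side sketch with hypothetical Request and Error types; the real types and the exact signature of Request::parse() live in usb-lib, so treat this purely as an illustration of the checks.

// Host-side sketch of the GET_DESCRIPTOR Device checks (hypothetical types).
#[derive(Debug, PartialEq)]
enum Request {
    GetDescriptorDevice { length: u16 },
}

#[derive(Debug, PartialEq)]
enum Error {
    UnknownRequest,
}

fn parse(bmrequesttype: u8, brequest: u8, wvalue: u16, windex: u16, wlength: u16) -> Result<Request, Error> {
    let descriptor_type = (wvalue >> 8) as u8; // high byte of wValue
    let descriptor_index = wvalue as u8; // low byte of wValue

    if bmrequesttype == 0b1000_0000 // device-to-host standard request
        && brequest == 6 // GET_DESCRIPTOR
        && descriptor_type == 1 // DEVICE descriptor
        && descriptor_index == 0
        && windex == 0
    {
        Ok(Request::GetDescriptorDevice { length: wlength })
    } else {
        Err(Error::UnknownRequest)
    }
}

fn main() {
    // The values seen in the failing unit test: a GET_DESCRIPTOR Device request for 18 bytes.
    assert_eq!(parse(0x80, 6, 0x0100, 0, 18), Ok(Request::GetDescriptorDevice { length: 18 }));
    assert_eq!(parse(0x80, 6, 0x0200, 0, 9), Err(Error::UnknownRequest));
    println!("parser sketch ok");
}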
See nrf52-code/usb-lib-solutions/get-device/src/lib.rs
for a solution.
Using our new parser
β Read incoming request information and pass it to the parser:
Modify nrf52-code/usb-app/src/bin/usb-2.rs
to read the appropriate USBD
registers and parse them when an EP0SETUP
event is received.
Getting Started:
- for a mapping of register names to the USBD API, check the entry for nrf52840_hal::target::usbd in the documentation you created using cargo doc
- Try let value = usbd.register_name.read().bits() as u8; if you just want the bottom eight bits of a register.
- remember that we've learned how to read registers in events.rs.
- you will need to put together the higher and lower bits of wlength, windex and wvalue to get the whole field, or use a library function to do it for you. Can the dk crate help? (A small sketch of combining the bytes follows this list.)
- Note: If you're using a Mac, you need to catch SET_ADDRESS requests returned by the parser as these are sent before the first GET_DESCRIPTOR request. We added an empty handler for you already so there's nothing further to do (we're just explaining why it's there).
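As a pure-logic illustration of the "higher and lower bits" hint (with made-up register values, since the real ones come from the USBD registers), combining the low and high halves looks like this:

fn main() {
    // Hypothetical values as read from the WVALUEL and WVALUEH registers.
    let wvaluel: u8 = 0x00;
    let wvalueh: u8 = 0x01;

    // Combine the low and high bytes into the full 16-bit wValue field.
    let wvalue = u16::from(wvaluel) | (u16::from(wvalueh) << 8);

    assert_eq!(wvalue, 0x0100); // i.e. descriptor type 1 (DEVICE), index 0
    println!("wvalue = {wvalue:#06x}");
}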
Expected Result:
When you have successfully received a GET_DESCRIPTOR
request for a Device descriptor you are done. You should see an output like this:
USB: UsbReset @ Duration { secs: 0, nanos: 361145018 }
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 402465820 }
SETUP: bmrequesttype: 0, brequest: 5, wlength: 0, windex: 0, wvalue: 10
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 404754637 }
SETUP: bmrequesttype: 128, brequest: 6, wlength: 8, windex: 0, wvalue: 256
GET_DESCRIPTOR Device [length=8]
Goal reached; move to the next section
`dk::exit()` called; exiting ...
Note: wlength/length can vary depending on the OS, USB port (USB 2.0 vs USB 3.0) or the presence of a USB hub, so you may see a different value.
You can find a solution to this step in nrf52-code/usb-app-solutions/src/bin/usb-2.rs
.
USB Device Descriptors
After receiving a GET_DESCRIPTOR
request during the SETUP stage, the device needs to respond with the actual descriptor data during the DATA stage. In our Rust application, this descriptor will be generated using some library code and serialised into an array of bytes which we can give to the USBD peripheral.
A descriptor is a binary encoded data structure sent by the device to the host. The device descriptor, in particular, contains information about the device, like its product and vendor identifiers and how many configurations it has. The format of the device descriptor is specified in Section 9.6.1 of the USB specification.
As far as the enumeration process goes, the most relevant fields of the device descriptor are the number of configurations and bcdUSB
, the version of the USB specification the device adheres to. In bcdUSB
you should report compatibility with USB 2.0.
What about (the number of) configurations?
A configuration is akin to an operation mode. USB devices usually have a single configuration that will be the only mode in which they'll operate, for example a USB mouse will always act as a USB mouse. Some devices, though, may provide a second configuration for the purpose of firmware upgrades. For example a printer may enter DFU (Device Firmware Upgrade) mode, a second configuration, so that a user can update its firmware; while in DFU mode the printer will not provide printing functionality.
The specification mandates that a device must have at least one available configuration so we can report a single configuration in the device descriptor.
You can read more about Device Descriptors in Chapter 5 of USB In a Nutshell.
USB-3: DATA Stage
The next step is to respond to the GET_DESCRIPTOR
request for our device descriptor, with an actual device descriptor that describes our USB Device.
Handle the request
β
Open the nrf52-code/usb-app/src/bin/usb-3.rs
file
Part of this response is already implemented. We'll go through this.
We'll use the dk::usb::Ep0In
abstraction. An instance of it is available in the board
value (inside the #[init]
function). The first step is to make this Ep0In
instance available to the on_event
function.
The Ep0In
API has two methods: start
and end
. start
is used to start a DATA stage; this method takes a slice of bytes ([u8]
) as argument; this argument is the response data. The end
method needs to be called after start
, when the EP0DATADONE event is raised, to complete the control transfer. Ep0In
will automatically issue the STATUS stage that must follow the DATA stage.
β
Handle the EP0DATADONE
event
Do this by calling the end
method on the Ep0In
instance.
β
Implement the response to the GET_DESCRIPTOR
request for device descriptors.
Extend nrf52-code/usb-app/src/bin/usb-3.rs
so that it uses Ep0In
to respond to the GET_DESCRIPTOR
request (but only for device descriptors - no other kind of descriptor).
Values of the device descriptor
The raw values you need to pack into the descriptor are as follows. Note, we won't be doing this by hand, so read on before you start typing!
- bLength = 18, the size of the descriptor (must always be this value)
- bDescriptorType = 1, device descriptor type (must always be this value)
- bDeviceClass = bDeviceSubClass = bDeviceProtocol = 0, these are unimportant for enumeration
- bMaxPacketSize0 = 64, this is the most performant option (minimizes exchanges between the device and the host) and it's assumed by the Ep0In abstraction
- idVendor = consts::VID, our example's USB Vendor ID (*)
- idProduct = consts::PID, our example's USB Product ID (*)
- bcdDevice = 0x0100, this means version 1.0 but any value should do
- iManufacturer = iProduct = iSerialNumber = None, string descriptors not supported
- bNumConfigurations = 1, must be at least 1 so this is the minimum value

(*) the consts crate refers to the crate in the nrf52-code/consts folder. It is already part of the usb-app crate dependencies.
Use the usb2::device::Descriptor
abstraction
Although you can create the device descriptor by hand as an array filled with magic values we strongly recommend you use the usb2::device::Descriptor
abstraction. The crate is already in the dependency list of the project; browse to the usb2
crate in the cargo doc
output you opened earlier.
The length of the device descriptor
The usb2::device::Descriptor
struct does not have bLength
and bDescriptorType
fields. Those fields have fixed values according to the USB spec so you cannot modify or set them. When bytes()
is called on the Descriptor
value the returned array, the binary representation of the descriptor, will contain those fields set to their correct value.
The device descriptor is 18 bytes long but the host may ask for fewer bytes (see wlength
field in the SETUP data). In that case you must respond with the amount of bytes the host asked for. The opposite may also happen: wlength
may be larger than the size of the device descriptor; in this case your answer must be 18 bytes long (do not pad the response with zeroes).
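A small host-side sketch of that length rule (the 18-byte array here is only a placeholder for the bytes produced by the usb2 descriptor):

fn main() {
    // Placeholder for the 18 bytes produced by the device descriptor's bytes() method.
    let descriptor = [0u8; 18];

    // Respond with at most wlength bytes, but never pad beyond the real length.
    let wlength: u16 = 8; // hosts often ask for just 8 bytes first
    let response_len = usize::from(wlength).min(descriptor.len());
    let response = &descriptor[..response_len];
    assert_eq!(response.len(), 8);

    // If the host asks for more than 18 bytes, we still send only 18.
    let response_len = usize::from(64u16).min(descriptor.len());
    assert_eq!(response_len, 18);
    println!("length handling ok");
}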
Expected log output
Once you have successfully responded to the GET_DESCRIPTOR Device request you should get logs like these (if you are logging like our solution does):
USB: UsbReset @ Duration { secs: 0, nanos: 211334227 }
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 252380370 }
SETUP: bmrequesttype: 0, brequest: 5, wlength: 0, windex: 0, wvalue: 52
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 254577635 }
SETUP: bmrequesttype: 128, brequest: 6, wlength: 8, windex: 0, wvalue: 256
GET_DESCRIPTOR Device [length=8]
EP0IN: start 8B transfer
USB: UsbEp0DataDone @ Duration { secs: 0, nanos: 254852293 }
EP0IN: transfer done
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 257568358 }
SETUP: bmrequesttype: 128, brequest: 6, wlength: 18, windex: 0, wvalue: 256
GET_DESCRIPTOR Device [length=18]
EP0IN: start 18B transfer
USB: UsbEp0DataDone @ Duration { secs: 0, nanos: 257843016 }
EP0IN: transfer done
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 259674071 }
SETUP: bmrequesttype: 128, brequest: 6, wlength: 9, windex: 0, wvalue: 512
ERROR unknown request (goal achieved if GET_DESCRIPTOR Device was handled before)
`dk::exit()` called; exiting ...
A solution to this exercise can be found in nrf52-code/usb-app-solutions/src/bin/usb-3.rs
.
Configuration descriptor
The configuration descriptor describes one of the device configurations to the host. The descriptor contains the following information about a particular configuration:
- the total length of the configuration: this is the number of bytes required to transfer this configuration descriptor and the interface and endpoint descriptors associated to it
- its number of interfaces -- must be >= 1
- its configuration value -- this is not an index and can be any non-zero value
- whether the configuration is self-powered
- whether the configuration supports remote wakeup
- its maximum power consumption
The full format of the configuration descriptor is specified in section 9.6.3, Configuration, of the USB specification.
USB-4: Supporting more Standard Requests
After responding to the GET_DESCRIPTOR Device
request the host will start sending different requests. Let's identify those, and then handle them.
Update the parser
The starter nrf52-code/usb-lib
package contains unit tests for everything we need. Some of them have been commented out using a #[cfg(TODO)]
attribute.
β
Remove all #[cfg(TODO)]
attributes so that everything is enabled.
β
Update the parser in nrf52-code/usb-lib
to handle GET_DESCRIPTOR
requests for Configuration Descriptors.
When the host issues a GET_DESCRIPTOR Configuration request the device needs to respond with the requested configuration descriptor plus all the interface and endpoint descriptors associated to that configuration descriptor during the DATA stage.
As a reminder, all GET_DESCRIPTOR request types share the following properties:
- bmRequestType is 0b10000000
- bRequest is 6 (i.e. the GET_DESCRIPTOR Request Code, defined in Table 9-4 of the USB specification)
A GET_DESCRIPTOR Configuration request is determined by the high byte of its wValue
field:
- The high byte of
wValue
is 2 (i.e. theCONFIGURATION
descriptor type, defined in Table 9-5 of the USB specification)
β
Update the parser in nrf52-code/usb-lib
to handle SET_CONFIGURATION
requests.
See the section on SET_CONFIGURATION for details on how to do this.
Once you've completed this, all your test cases should pass. If not, fix the code until they do!
Help
If you need a reference, you can find solutions to parsing GET_DESCRIPTOR Configuration
and SET_CONFIGURATION
requests in the following files:
Each file contains just enough code to parse the request in its name and the GET_DESCRIPTOR Device
and SET_ADDRESS
requests. So you can refer to nrf52-code/usb-lib-solutions/get-descriptor-config
without getting "spoiled" about how to parse the SET_CONFIGURATION
request.
Update the application
We're now going to be using nrf52-code/usb-app/src/bin/usb-4.rs
.
Since the logic of the EP0SETUP
event handling is getting more complex with each added event, you can see that usb-4.rs
was refactored to add error handling: the event handling now happens in a separate function that returns a Result
. When it encounters an invalid host request, it returns the Err
variant which can be handled by stalling the endpoint:
fn on_event(/* parameters */) {
    match event {
        /* ... */
        Event::UsbEp0Setup => {
            if ep0setup(/* arguments */).is_err() {
                // unsupported or invalid request:
                // TODO add code to stall the endpoint
                defmt::warn!("EP0IN: unexpected request; stalling the endpoint");
            }
        }
    }
}

fn ep0setup(/* parameters */) -> Result<(), ()> {
    let req = Request::parse(/* arguments */)?;
    //                                       ^ early returns an `Err` if parsing fails
    // TODO respond to the `req`; return `Err` if the request was invalid in this state
    Ok(())
}
Note that there's a difference between the error handling done here and the error handling commonly done in std
programs. std
programs usually bubble up errors to the top main
function (using the ?
operator), report the error (or chain of errors) and then exit the application with a non-zero exit code. This approach is usually not appropriate for embedded programs as
- main cannot return,
- there may not be a console to print the error to, and/or
- stopping the program, and e.g. requiring the user to reset it to make it work again, may not be desirable behavior.
For these reasons in embedded software errors tend to be handled as early as possible rather than propagated all the way up.
This does not preclude error reporting. The above snippet includes error reporting in the form of a defmt::warn!
statement. This log statement may not be included in the final release of the program as it may not be useful, or even visible, to an end user but it is useful during development.
β
For each green test, extend usb-4.rs
to handle the new requests your parser is now able to recognize.
If that's all the information you need - go ahead! If you'd like some more detail, read on.
Dealing with unknown requests: Stalling the endpoint
You may come across host requests other than the ones listed in previous sections.
For this situation, the USB specification defines a device-side procedure for "stalling an endpoint", which amounts to the device telling the host that it doesn't support some request.
This procedure should be used to deal with invalid requests, requests whose SETUP stage doesn't match any USB 2.0 standard request, and requests not supported by the device; for instance, the SET_DESCRIPTOR request is not mandatory.
β
Use the dk::usbd::ep0stall()
helper function to stall endpoint 0 in nrf52-code/usb-app/src/bin/usb-4.rs
if an invalid request is received.
Updating Device State
At some point during the initialization you'll receive a SET_ADDRESS
request that will move the device from the Default
state to the Address
state. If you are working on Linux, you'll also receive a SET_CONFIGURATION
request that will move the device from the Address
state to the Configured
state. Additionally, some requests are only valid in certain states; for example SET_CONFIGURATION
is only valid if the device is in the Address
state. For this reason usb-4.rs
will need to keep track of the device's current state.
The device state should be tracked using a resource so that it's preserved across multiple executions of the USBD
event handler. The usb2
crate has a State
enum with the 3 possible USB states: Default
, Address
and Configured
. You can use that enum or roll your own.
β Start tracking and updating the device state to move your request handling forward.
Update the handling of the USBRESET
event
Instead of ignoring it, we now want it to change the state of the USB device. See section 9.1 USB Device States of the USB specification for details on what to do. Note that fn on_event()
was given state: &mut State
.
Update the handling of SET_ADDRESS
requests
This request should come right after the
GET_DESCRIPTOR Device
request if you're using Linux, or be the first request sent to the device by macOS.
A SET_ADDRESS request has the following fields as defined by Section 9.4.6 Set Address of the USB spec:
- bmrequesttype is 0b00000000
- brequest is 5 (i.e. the SET_ADDRESS Request Code, see table 9-4 in the USB spec)
- wValue contains the address to be used for all subsequent accesses
- wIndex and wLength are 0; there is no wData
It should be handled as follows:
- If the device is in the Default state, then
  - if the requested address stored in wValue was 0 (None in the usb API) then the device should stay in the Default state
  - otherwise the device should move to the Address state
- If the device is in the Address state, then
  - if the requested address stored in wValue was 0 (None in the usb API) then the device should return to the Default state
  - otherwise the device should remain in the Address state but start using the new address
- If the device is in the Configured state this request results in "unspecified" behavior according to the USB specification. You should stall the endpoint in this case.
Note: According to the USB specification the device needs to respond to this request with a STATUS stage -- the DATA stage is omitted. The nRF52840 USBD peripheral will automatically issue the STATUS stage and switch to listening to the requested address (see the USBADDR register) so no interaction with the USBD peripheral is required for this request.
For more details, read the introduction of section 6.35.9 of the nRF52840 Product Specification 1.0.
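To make the state transitions above concrete, here is a minimal host-runnable sketch using a hypothetical State enum (the usb2 crate's State enum plays this role in the firmware); Err(()) stands for "stall the endpoint".

// Host-side sketch of the SET_ADDRESS handling rules (hypothetical types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Default,
    Address(u8),
    Configured,
}

/// Returns the new state, or Err(()) if the endpoint should be stalled.
fn on_set_address(state: State, address: Option<u8>) -> Result<State, ()> {
    match (state, address) {
        // address 0 (None): stay in, or return to, the Default state
        (State::Default, None) | (State::Address(_), None) => Ok(State::Default),
        // a non-zero address moves us to (or keeps us in) the Address state
        (State::Default, Some(addr)) | (State::Address(_), Some(addr)) => Ok(State::Address(addr)),
        // "unspecified" behaviour in the Configured state: stall
        (State::Configured, _) => Err(()),
    }
}

fn main() {
    assert_eq!(on_set_address(State::Default, Some(11)), Ok(State::Address(11)));
    assert_eq!(on_set_address(State::Address(11), None), Ok(State::Default));
    assert_eq!(on_set_address(State::Configured, Some(11)), Err(()));
    println!("set_address sketch ok");
}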
Implement the handling of GET_DESCRIPTOR Configuration
requests
So how should we respond to the host when it wants our Configuration Descriptor? As our only goal is to be enumerated we'll respond with the minimum amount of information possible.
β First, check the request
Configuration descriptors are requested by index, not by their configuration value. Since we reported a single configuration in our device descriptor the index in the request must be zero. Any other value should be rejected by stalling the endpoint (see section Dealing with unknown requests: Stalling the endpoint for more information).
β Next, create and send a response
The response should consist of the configuration descriptor, followed by interface descriptors and then by (optional) endpoint descriptors. We'll include a minimal single interface descriptor in the response. Since endpoints are optional we will include none.
The configuration descriptor and one interface descriptor will be concatenated in a single packet so this response should be completed in a single DATA stage.
The configuration descriptor in the response should contain these fields:
- bLength = 9, the size of this descriptor (must always be this value)
- bDescriptorType = 2, configuration descriptor type (must always be this value)
- wTotalLength = 18 = one configuration descriptor (9 bytes) and one interface descriptor (9 bytes)
- bNumInterfaces = 1, a single interface (the minimum value)
- bConfigurationValue = 42, any non-zero value will do
- iConfiguration = 0, string descriptors are not supported
- bmAttributes { self_powered: true, remote_wakeup: false }, self-powered due to the debugger connection
- bMaxPower = 250 (500 mA), this is the maximum allowed value but any (non-zero?) value should do
The interface descriptor in the response should contain these fields:
- bLength = 9, the size of this descriptor (must always be this value)
- bDescriptorType = 4, interface descriptor type (must always be this value)
- bInterfaceNumber = 0, this is the first, and only, interface
- bAlternateSetting = 0, alternate settings are not supported
- bNumEndpoints = 0, no endpoint associated to this interface (other than the control endpoint)
- bInterfaceClass = bInterfaceSubClass = bInterfaceProtocol = 0, does not adhere to any specified USB interface
- iInterface = 0, string descriptors are not supported
Again, we strongly recommend that you use the usb2::configuration::Descriptor
and usb2::interface::Descriptor
abstractions here. Each descriptor instance can be transformed into its byte representation using the bytes
method -- the method returns an array. To concatenate both arrays you can use a stack-allocated heapless::Vec
buffer. If you haven't used the heapless
crate before you can find example usage in the src/bin/vec.rs
file.
NOTE: the
usb2::configuration::Descriptor
andusb2::interface::Descriptor
structs do not havebLength
andbDescriptorType
fields. Those fields have fixed values according to the USB spec so you cannot modify or set them. Whenbytes()
is called on theDescriptor
value, the returned array (which contains a binary representation of the descriptor, packed according to the USB 2.0 standard) will contain those fields set to their correct value.
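Here is a tiny host-side sketch of the concatenation step; the 9-byte arrays are placeholders for the output of the two bytes() calls, and it assumes heapless as a dependency, as in the workshop crates.

// Host-side sketch: concatenating two descriptors in a stack-allocated buffer.
fn main() {
    let config_bytes = [0u8; 9]; // placeholder for usb2::configuration::Descriptor::bytes()
    let iface_bytes = [0u8; 9]; // placeholder for usb2::interface::Descriptor::bytes()

    // 64 bytes is plenty for one configuration plus one interface descriptor.
    let mut response = heapless::Vec::<u8, 64>::new();
    response.extend_from_slice(&config_bytes).expect("buffer too small");
    response.extend_from_slice(&iface_bytes).expect("buffer too small");

    assert_eq!(response.len(), 18); // matches wTotalLength
    println!("response is {} bytes long", response.len());
}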
Getting it Configured
At this stage the device will be in the Address
state. It has been identified and enumerated by the host but cannot yet be used by host applications. The device must first move to the Configured
state before the host can start, for example, HID communication or send non-standard requests over the control endpoint.
There is no template for this step - start with your solution to USB-4.
Windows will enumerate the device but not automatically configure it after enumeration. Here's what you should do to force the host to configure the device.
Linux and macOS
Nothing extra needs to be done if you're working on a Linux or macOS host. The host will automatically send a SET_CONFIGURATION
request so proceed to the SET_CONFIGURATION
section to see how to handle the request.
Windows
After getting the device enumerated and into the idle state, open the Zadig tool (covered in the setup instructions; see the top README) and use it to associate the nRF52840 USB device to the WinUSB driver. The nRF52840 will appear as an "unknown device" with a VID and PID that match the ones defined in the consts
crate.
Now modify the usb-descriptors
command within the xtask
package to "open" the device -- this operation is commented out in the source code. With this modification usb-descriptors
will cause Windows to send a SET_CONFIGURATION
request to configure the device. You'll need to run cargo xtask usb-descriptors
to test out the correct handling of the SET_CONFIGURATION
request.
SET_CONFIGURATION
The SET_CONFIGURATION request is sent by the host to configure the device. Its configuration according to Section 9.4.7 of the USB specification is:
- bmrequesttype is 0b00000000
- brequest is 9 (i.e. the SET_CONFIGURATION Request Code, see table 9-4 in the USB spec)
- wValue contains the requested configuration value
- wIndex and wLength are 0; there is no wData
β To handle a SET_CONFIGURATION, do the following:
- If the device is in the Default state, you should stall the endpoint because the operation is not permitted in that state.
- If the device is in the Address state, then
  - if wValue is 0 (None in the usb API) then stay in the Address state
  - if wValue is non-zero and valid (was previously reported in a configuration descriptor) then move to the Configured state
  - if wValue is not valid then stall the endpoint
- If the device is in the Configured state, then read the requested configuration value from wValue
  - if wValue is 0 (None in the usb API) then return to the Address state
  - if wValue is non-zero and valid (was previously reported in a configuration descriptor) then move to the Configured state with the new configuration value
  - if wValue is not valid then stall the endpoint
In all the cases where you did not stall the endpoint (by returning Err
) you'll need to acknowledge the request by starting a STATUS stage.
β This is done by writing 1 to the TASKS_EP0STATUS register.
NOTE: On Windows, you may get a GET_STATUS
request before the SET_CONFIGURATION
request and although you should respond to it, stalling the GET_STATUS
request seems sufficient to get the device to the Configured
state.
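As with SET_ADDRESS, a small host-side sketch can capture the rules above; the types are hypothetical, 42 is the bConfigurationValue suggested earlier, and Err(()) again means "stall the endpoint".

// Host-side sketch of the SET_CONFIGURATION handling rules (hypothetical types).
const CONFIG_VALUE: u8 = 42; // the bConfigurationValue we reported

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Default,
    Address,
    Configured { value: u8 },
}

/// Returns the new state, or Err(()) if the endpoint should be stalled.
fn on_set_configuration(state: State, wvalue: u8) -> Result<State, ()> {
    match (state, wvalue) {
        // not permitted in the Default state
        (State::Default, _) => Err(()),
        // configuration 0 keeps us in, or returns us to, the Address state
        (State::Address, 0) | (State::Configured { .. }, 0) => Ok(State::Address),
        // a valid, previously-reported value moves us to the Configured state
        (State::Address, v) | (State::Configured { .. }, v) if v == CONFIG_VALUE => {
            Ok(State::Configured { value: v })
        }
        // anything else is invalid
        _ => Err(()),
    }
}

fn main() {
    assert_eq!(on_set_configuration(State::Address, 42), Ok(State::Configured { value: 42 }));
    assert_eq!(on_set_configuration(State::Configured { value: 42 }, 0), Ok(State::Address));
    assert_eq!(on_set_configuration(State::Default, 42), Err(()));
    println!("set_configuration sketch ok");
}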
Expected output
β Run the program and check the log output.
Once you are correctly handling the SET_CONFIGURATION
request you should get logs like these:
[DEBUG] Initializing the board (dk dk/src/lib.rs:312)
[DEBUG] Clocks configured (dk dk/src/lib.rs:330)
[DEBUG] RTC started (dk dk/src/lib.rs:349)
[DEBUG] I/O pins have been configured for digital output (dk dk/src/lib.rs:359)
[DEBUG] USB: UsbReset @ 00:00:00.324523 (usb_5 src/bin/usb-5.rs:56)
[WARN ] USB reset condition detected (usb_5 src/bin/usb-5.rs:60)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.367462 (usb_5 src/bin/usb-5.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b00000000, brequest: 5, wlength: 0, windex: 0x0000, wvalue: 0x000b (usb_5 src/bin/usb-5.rs:88)
[INFO ] EP0: SetAddress { address: Some(11) } (usb_5 src/bin/usb-5.rs:99)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.370758 (usb_5 src/bin/usb-5.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b10000000, brequest: 6, wlength: 8, windex: 0x0000, wvalue: 0x0100 (usb_5 src/bin/usb-5.rs:88)
[INFO ] EP0: GetDescriptor { descriptor: Device, length: 8 } (usb_5 src/bin/usb-5.rs:99)
[DEBUG] EP0IN: start 8B transfer (dk dk/src/usbd.rs:59)
[DEBUG] USB: UsbEp0DataDone @ 00:00:00.371337 (usb_5 src/bin/usb-5.rs:56)
[INFO ] EP0IN: transfer complete (usb_5 src/bin/usb-5.rs:65)
[INFO ] EP0IN: transfer done (dk dk/src/usbd.rs:83)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.371917 (usb_5 src/bin/usb-5.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b10000000, brequest: 6, wlength: 18, windex: 0x0000, wvalue: 0x0100 (usb_5 src/bin/usb-5.rs:88)
[INFO ] EP0: GetDescriptor { descriptor: Device, length: 18 } (usb_5 src/bin/usb-5.rs:99)
[DEBUG] EP0IN: start 18B transfer (dk dk/src/usbd.rs:59)
[DEBUG] USB: UsbEp0DataDone @ 00:00:00.372497 (usb_5 src/bin/usb-5.rs:56)
[INFO ] EP0IN: transfer complete (usb_5 src/bin/usb-5.rs:65)
[INFO ] EP0IN: transfer done (dk dk/src/usbd.rs:83)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.373046 (usb_5 src/bin/usb-5.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b10000000, brequest: 6, wlength: 9, windex: 0x0000, wvalue: 0x0200 (usb_5 src/bin/usb-5.rs:88)
[INFO ] EP0: GetDescriptor { descriptor: Configuration { index: 0 }, length: 9 } (usb_5 src/bin/usb-5.rs:99)
[DEBUG] EP0IN: start 9B transfer (dk dk/src/usbd.rs:59)
[DEBUG] USB: UsbEp0DataDone @ 00:00:00.373748 (usb_5 src/bin/usb-5.rs:56)
[INFO ] EP0IN: transfer complete (usb_5 src/bin/usb-5.rs:65)
[INFO ] EP0IN: transfer done (dk dk/src/usbd.rs:83)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.373901 (usb_5 src/bin/usb-5.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b10000000, brequest: 6, wlength: 18, windex: 0x0000, wvalue: 0x0200 (usb_5 src/bin/usb-5.rs:88)
[INFO ] EP0: GetDescriptor { descriptor: Configuration { index: 0 }, length: 18 } (usb_5 src/bin/usb-5.rs:99)
[DEBUG] EP0IN: start 18B transfer (dk dk/src/usbd.rs:59)
[DEBUG] USB: UsbEp0DataDone @ 00:00:00.374603 (usb_5 src/bin/usb-5.rs:56)
[INFO ] EP0IN: transfer complete (usb_5 src/bin/usb-5.rs:65)
[INFO ] EP0IN: transfer done (dk dk/src/usbd.rs:83)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.379211 (usb_5 src/bin/usb-5.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b00000000, brequest: 9, wlength: 0, windex: 0x0000, wvalue: 0x002a (usb_5 src/bin/usb-5.rs:88)
[INFO ] EP0: SetConfiguration { value: Some(42) } (usb_5 src/bin/usb-5.rs:99)
[INFO ] entering the configured state (usb_5 src/bin/usb-5.rs:198)
These logs are from a Linux host. You can find traces for other OSes in these files (they are in the nrf52-code/usb-app-solutions/traces
folder):
- linux-configured.txt
- win-configured.txt, this file only contains the logs produced by running cargo xtask usb-descriptors
- macos-configured.txt (same logs as the ones shown above)
You can find a solution to this part of the exercise in nrf52-code/usb-app-solutions/src/bin/usb-5.rs
.
Idle State
Once you have handled all the previously covered requests the device should be enumerated and remain idle, awaiting a new host request. Your logs may look like this:
[DEBUG] USB: UsbReset @ 00:00:00.347259 (usb_4 src/bin/usb-4.rs:56)
[WARN ] USB reset condition detected (usb_4 src/bin/usb-4.rs:60)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.389770 (usb_4 src/bin/usb-4.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b00000000, brequest: 5, wlength: 0, windex: 0x0000, wvalue: 0x000a (usb_4 src/bin/usb-4.rs:88)
[INFO ] EP0: SetAddress { address: Some(10) } (usb_4 src/bin/usb-4.rs:99)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.393066 (usb_4 src/bin/usb-4.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b10000000, brequest: 6, wlength: 8, windex: 0x0000, wvalue: 0x0100 (usb_4 src/bin/usb-4.rs:88)
[INFO ] EP0: GetDescriptor { descriptor: Device, length: 8 } (usb_4 src/bin/usb-4.rs:99)
[DEBUG] EP0IN: start 8B transfer (dk dk/src/usbd.rs:59)
[DEBUG] USB: UsbEp0DataDone @ 00:00:00.393585 (usb_4 src/bin/usb-4.rs:56)
[INFO ] EP0IN: transfer complete (usb_4 src/bin/usb-4.rs:65)
[INFO ] EP0IN: transfer done (dk dk/src/usbd.rs:83)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.394409 (usb_4 src/bin/usb-4.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b10000000, brequest: 6, wlength: 18, windex: 0x0000, wvalue: 0x0100 (usb_4 src/bin/usb-4.rs:88)
[INFO ] EP0: GetDescriptor { descriptor: Device, length: 18 } (usb_4 src/bin/usb-4.rs:99)
[DEBUG] EP0IN: start 18B transfer (dk dk/src/usbd.rs:59)
[DEBUG] USB: UsbEp0DataDone @ 00:00:00.394958 (usb_4 src/bin/usb-4.rs:56)
[INFO ] EP0IN: transfer complete (usb_4 src/bin/usb-4.rs:65)
[INFO ] EP0IN: transfer done (dk dk/src/usbd.rs:83)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.395385 (usb_4 src/bin/usb-4.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b10000000, brequest: 6, wlength: 9, windex: 0x0000, wvalue: 0x0200 (usb_4 src/bin/usb-4.rs:88)
[INFO ] EP0: GetDescriptor { descriptor: Configuration { index: 0 }, length: 9 } (usb_4 src/bin/usb-4.rs:99)
[DEBUG] EP0IN: start 9B transfer (dk dk/src/usbd.rs:59)
[DEBUG] USB: UsbEp0DataDone @ 00:00:00.396057 (usb_4 src/bin/usb-4.rs:56)
[INFO ] EP0IN: transfer complete (usb_4 src/bin/usb-4.rs:65)
[INFO ] EP0IN: transfer done (dk dk/src/usbd.rs:83)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.396270 (usb_4 src/bin/usb-4.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b10000000, brequest: 6, wlength: 18, windex: 0x0000, wvalue: 0x0200 (usb_4 src/bin/usb-4.rs:88)
[INFO ] EP0: GetDescriptor { descriptor: Configuration { index: 0 }, length: 18 } (usb_4 src/bin/usb-4.rs:99)
[DEBUG] EP0IN: start 18B transfer (dk dk/src/usbd.rs:59)
[DEBUG] USB: UsbEp0DataDone @ 00:00:00.396942 (usb_4 src/bin/usb-4.rs:56)
[INFO ] EP0IN: transfer complete (usb_4 src/bin/usb-4.rs:65)
[INFO ] EP0IN: transfer done (dk dk/src/usbd.rs:83)
[DEBUG] USB: UsbEp0Setup @ 00:00:00.401824 (usb_4 src/bin/usb-4.rs:56)
[DEBUG] SETUP: bmrequesttype: 0b00000000, brequest: 9, wlength: 0, windex: 0x0000, wvalue: 0x002a (usb_4 src/bin/usb-4.rs:88)
[INFO ] EP0: SetConfiguration { value: Some(42) } (usb_4 src/bin/usb-4.rs:99)
[WARN ] EP0IN: unexpected request; stalling the endpoint (usb_4 src/bin/usb-4.rs:71)
Note that these logs are from a macOS host where a SET_ADDRESS
request is sent first, and then a GET_DESCRIPTOR
request. On other OSes the messages may be in a different order. Also note that there are some GET_DESCRIPTOR DeviceQualifier
requests in this case; you do not need to parse them in the usb
crate as they'll be rejected (stalled) anyways.
You can find traces for other OSes in these files (they are in the nrf52-code/usb-app-solutions/traces
folder):
- linux-enumeration.txt
- macos-enumeration.txt (same logs as the ones shown above)
- win-enumeration.txt
β
Double check that the enumeration works by running cargo xtask usb-list
while usb-4.rs
is running.
$ cargo xtask usb-list
(...) random other USB devices will be listed
Bus 004 Device 001: ID 1209:0717 <-- nRF52840 on the nRF52840 Development Kit
You can also try cyme
, but we've found that on Windows, the device may not appear in the tool's output. Possibly this is because it's only showing devices which have accepted a Configuration.
You can find a working solution up to this point in nrf52-code/usb-app-solutions/src/bin/usb-4.rs
. Note that the solution uses the usb2
crate to parse SETUP packets and that crate supports parsing all standard requests.
Next Steps
String descriptors
If you'd like to continue working on your workshop project, we recommend adding String Descriptors support to the USB firmware. To do this, follow these steps:
β Read through section 9.6.7 of the USB spec, which covers string descriptors.
β
Change your configuration descriptor to use string descriptors. You'll want to change the iConfiguration
field to a non-zero value. Note that this change will likely break enumeration.
β Re-run the program to see what new control requests you get from the host.
β
Update the usb
parser to handle the new requests.
β
Extend the logic of ep0setup
to handle these new requests.
Eventually, you'll need to send a string descriptor to the host. Note here that Rust string literals are UTF-8 encoded but the USB protocol uses UTF-16 strings. You'll need to convert between these formats.
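For example, on the host the conversion itself can be done with str::encode_utf16 (in the no_std firmware you would collect into a fixed-size buffer instead of a Vec); the string below is just a placeholder:

fn main() {
    let product = "my product name"; // placeholder string
    // USB string descriptors carry UTF-16LE code units.
    let utf16le: Vec<u8> = product
        .encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect();

    // Every ASCII character becomes two bytes in UTF-16.
    assert_eq!(utf16le.len(), product.len() * 2);
    println!("{} bytes of UTF-16 data", utf16le.len());
}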
β
If this works, add strings to other descriptors like the device descriptor e.g. its iProduct
field.
β
To verify that string descriptors are working in a cross-platform way, extend the cargo xtask usb-descriptors
program to also print the device's string descriptors. See the read_string_descriptor
method but note that this must be called on a "device handle", which is what the commented out open
operation does.
Explore more RTIC features
We have covered only a few of the core features of the RTIC framework but the framework has many more features like software tasks, tasks that can be spawned by the software; message passing between tasks; and task scheduling, which allows the creation of periodic tasks. We encourage you to check the RTIC book which describes the features we haven't covered here.
usb-device
usb-device
is a library for building USB devices. It has been built using traits (the pillar of Rust's generics) such that USB interfaces like HID and TTY ACM can be implemented in a device-agnostic manner. The device-specific details are then limited to a trait implementation. There's an implementation of the usb-device
trait for the nRF52840 device in the nrf-hal
and there are many usb-device
"classes" like HID and TTY ACM that can be used with that trait implementation. We encourage you to check out that implementation, test it on different OSes and report issues, or contribute fixes, to the usb-device
ecosystem.
Extra Info
The following chapters contain extra detail about DMA on the nRF52, the USB stack, and how we protect against stack overflows. You do not require them to complete the exercises, but you may find them interesting reading.
The USB Specification
The USB 2.0 specification is available free of charge from https://www.usb.org/document-library/usb-20-specification. On the right, you will see a link like usb_20_yyyymmdd.zip
. Download and unpack the zip file, and the core specification can be found within as a file called usb_20.pdf
(alongside a bunch of errata and additional specifications). Note that the date on the cover page is April 27, 2000 - and actually, the portions of the specification we are implementing are unchanged from the earlier USB 1.1 specification.
Direct Memory Access
π this section covers the implementation of the Ep0In
abstraction; it's not necessary to fully understand this section to continue working on the workshop.
Let's zoom into the Ep0In
abstraction we used in usb-3.rs
.
β
Open the file. Use VSCode's "Go to Definition" to see the implementation of the Ep0In.start()
method.
This is how data transfers over USB work on the nRF52840: for each endpoint there's a buffer in the USBD peripheral. Data sent by the host over USB to a particular endpoint will be stored in the corresponding endpoint buffer. Likewise, data stored in one of these endpoint buffers can be sent to the host over USB from that particular endpoint. These buffers are not directly accessible by the CPU but data stored in RAM can be copied into these buffers; likewise, the contents of an endpoint buffer can be copied into RAM. A second peripheral, the Direct Memory Access (DMA) peripheral, can copy data between these endpoint buffers and RAM. The process of copying data in either direction is referred to as "a DMA transfer".
What the start
method does is start a DMA transfer to copy bytes
into endpoint buffer IN 0; this makes the USBD peripheral send data to the host from endpoint IN 0. The data (bytes
), which may be located in Flash or RAM, is first copied into an internal buffer, allocated in RAM, and then the DMA is configured to move the data from this internal buffer to endpoint buffer 0 IN, which is part of the USBD peripheral.
The signature of the start()
method does not ensure that:
- (a) bytes won't be deallocated before the DMA transfer is over (e.g. bytes could be pointing into the stack), or
- (b) bytes won't be modified right after the DMA transfer starts (this would be a data race in the general case).
For these two safety reasons the API is implemented using an internal buffer called buffer
. The internal buffer has a 'static
lifetime so it's guaranteed to never be deallocated -- this prevents issue (a). The busy
flag prevents any further modification to the internal buffer -- from the public API -- while the DMA transfer is in progress.
Apart from thinking about lifetimes and explicit data races in the surface API one must internally use memory fences to prevent reordering of memory operations (e.g. by the compiler), which can also cause data races. DMA transfers run in parallel to the instructions performed by the processor and are "invisible" to the compiler.
In the implementation of the start
method, data is copied from bytes
to the internal buffer (a memcpy()
operation) and then the DMA transfer is started with a write to the TASKS_STARTEPIN0
register. The compiler sees the start of the DMA transfer (register write) as an unrelated memory operation so it may move the memcpy()
to after the DMA transfer has started. This reordering results in a data race: the processor modifies the internal buffer while the DMA is reading data out from it.
To avoid this reordering a memory fence, dma_start()
, is used. The fence pairs with the store operation (register write) that starts the DMA transfer and prevents the previous memcpy()
, and any other memory operation, from being moved to after the store operation.
Another memory fence, dma_end()
, is needed at the end of the DMA transfer. In the general case, this prevents instruction reordering that would result in the processor accessing the internal buffer before the DMA transfer has finished. This is particularly problematic with DMA transfers that modify a region of memory which the processor intends to read after the transfer.
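A simplified, host-runnable sketch of that ordering pattern is shown below; the volatile write to a local variable merely stands in for the TASKS_STARTEPIN0 register write, and no real DMA is involved.

use core::sync::atomic::{compiler_fence, Ordering};

fn main() {
    let mut internal_buffer = [0u8; 4];
    let bytes = [1u8, 2, 3, 4];

    // memcpy: fill the internal buffer before the transfer starts
    internal_buffer.copy_from_slice(&bytes);

    // dma_start(): the copy above may not be reordered after the store below
    compiler_fence(Ordering::Release);

    // stand-in for writing 1 to TASKS_STARTEPIN0
    let mut task_start = 0u32;
    unsafe { core::ptr::write_volatile(&mut task_start, 1) };

    // ... the DMA transfer would run here, in parallel with the CPU ...

    // dma_end(): later accesses to the buffer may not be reordered before this point
    compiler_fence(Ordering::Acquire);

    println!("buffer after 'transfer': {:?}", internal_buffer);
}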
Note: Not relevant to the DMA operation but relevant to the USB specification, the
start()
method sets a shortcut in the USBD peripheral to issue a STATUS stage right after the DATA stage is finished. Thanks to this it is not necessary to manually start a STATUS stage after calling theend
method.
SET_CONFIGURATION (Linux & macOS)
On Linux and macOS, the host will likely send a SET_CONFIGURATION request right after enumeration to put the device in the Configured
state. For now you can stall the request. It is not necessary at this stage because the device has already been enumerated.
Interface
We have covered configurations and endpoints but what is an interface?
An interface is closest to a USB device's function. For example, a USB mouse may expose a single HID (Human Interface Device) interface to report user input to the host. USB devices can expose multiple interfaces within a configuration. For example, the nRF52840 Dongle could expose both a CDC ACM interface (AKA virtual serial port) and a HID interface; the first interface could be used for (defmt::println!
-style) logs; and the second one could provide an RPC (Remote Procedure Call) interface to the host for controlling the nRF52840's radio.
An interface is made up of one or more endpoints. To give an example, a HID interface can use two (interrupt) endpoints, one IN and one OUT, for bidirectional communication with the host. A single endpoint cannot be used by more than one interface with the exception of the special "endpoint 0", which can be (and usually is) shared by all interfaces.
For detailed information about interfaces check section 9.6.5, Interface, of the USB specification.
Interface descriptor
The interface descriptor describes one of the device interfaces to the host. The descriptor contains the following information about a particular interface:
- its interface number -- this is a zero-based index
- its alternate setting -- this allows configuring the interface
- its number of endpoints
- class, subclass and protocol -- these define the interface (HID, or TTY ACM, or DFU, etc.) according to the USB specification
The number of endpoints can be zero, and endpoint zero must not be counted when counting endpoints.
The full format of the interface descriptor is specified in section 9.6.5, Interface, of the USB specification.
Endpoint descriptor
We will not need to deal with endpoint descriptors in this workshop but they are specified in section 9.6.6, Endpoint, of the USB specification.
Inspecting the Descriptors
There's a tool built into our cargo xtask
called usb-descriptors
; it prints all the descriptors reported by your application.
β Run this tool
Your output should look like this:
$ cargo xtask usb-descriptors
DeviceDescriptor {
bLength: 18,
bDescriptorType: 1,
bcdUSB: 512,
bDeviceClass: 0,
bDeviceSubClass: 0,
bDeviceProtocol: 0,
bMaxPacketSize: 64,
idVendor: 8224,
idProduct: 1815,
bcdDevice: 256,
iManufacturer: 0,
iProduct: 0,
iSerialNumber: 0,
bNumConfigurations: 1,
}
address: 22
config0: ConfigDescriptor {
bLength: 9,
bDescriptorType: 2,
wTotalLength: 18,
bNumInterfaces: 1,
bConfigurationValue: 42,
iConfiguration: 0,
bmAttributes: 192,
bMaxPower: 250,
extra: None,
}
iface0: [
InterfaceDescriptor {
bLength: 9,
bDescriptorType: 4,
bInterfaceNumber: 0,
bAlternateSetting: 0,
bNumEndpoints: 0,
bInterfaceClass: 0,
bInterfaceSubClass: 0,
bInterfaceProtocol: 0,
iInterface: 0,
},
]
The output above corresponds to the descriptor values we suggested. If you used different values, e.g. for bMaxPower
, you'll see a slightly different output.
Stack Overflow Protection
The usb-app
crate in which we developed our advanced workshop solutions (i.e. nrf52-code/usb-app
) uses our open-source flip-link
tool for zero-cost stack overflow protection.
This means that your application will warn you by crashing if you accidentally overreach the boundaries of your application's stack instead of running into undefined behavior and behaving erratically in irreproducible ways. This memory protection mechanism comes at no additional computational or memory-usage cost.
π For a detailed description of how flip-link
and Stack Overflows in bare metal Rust in general work, please refer to the flip-link
README.
You can see this in action in the stack_overflow.rs
file that can be found in nrf52-code/usb-app/src/bin/stack_overflow.rs
:
#![no_main]
#![no_std]

use cortex_m::asm;
use cortex_m_rt::entry;

// this imports `src/lib.rs` to retrieve our global logger + panicking-behavior
use usb_app as _;

#[entry]
fn main() -> ! {
    // board initialization
    dk::init().unwrap();

    fib(100);

    loop {
        asm::bkpt();
    }
}

#[inline(never)]
fn fib(n: u32) -> u32 {
    // allocate and initialize one kilobyte of stack memory to provoke stack overflow
    let use_stack = [0xAA; 1024];
    defmt::println!("allocating [{}; 1024]; round #{}", use_stack[1023], n);

    if n < 2 {
        1
    } else {
        fib(n - 1) + fib(n - 2) // recursion
    }
}
The fib() function allocates data on the stack until the stack boundaries are reached.
β
Run stack_overflow.rs
You should see output similar to this (the program output between the horizontal bars might be missing):
(HOST) INFO flashing program (35.25 KiB)
(HOST) INFO success!
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
INFO:stack_overflow -- provoking stack overflow...
INFO:stack_overflow -- address of current `use_stack` at recursion depth 0: 0x2003aec0
INFO:stack_overflow -- address of current `use_stack` at recursion depth 1: 0x20039e50
(...)
INFO:stack_overflow -- address of current `use_stack` at recursion depth 10: 0x20030a60
INFO:stack_overflow -- address of current `use_stack` at recursion
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
stack backtrace:
0: HardFaultTrampoline
<exception entry>
(HOST) WARN call stack was corrupted; unwinding could not be completed
(HOST) ERROR the program has overflowed its stack
βοΈ flip-link
is a third-party tool, so make sure you've installed it through cargo install flip-link
To see how we've activated flip-link
, take a look at nrf52-code/usb-app/.cargo/config.toml
:
rustflags = [
"-C", "linker=flip-link", # adds stack overflow protection
#
]
There, we've configured flip-link
as the linker to be used for all ARM targets. If you'd like to use flip-link
in your own projects, this is all you need to add!
π Note: if you try to run stack_overflow.rs
without flip-link
enabled, you might see varying behavior depending on the rustc
version you're using, timing and pure chance. This is because undefined behavior triggered by the program may change between rustc
releases.
Working without the Standard Library
This section has some exercises which introduce ways to move away from libstd
and write applications which only use libcore
(or liballoc
). This is important when writing safety-critical systems.
Replacing println!
In this exercise, we will write a basic "Hello, World!" application, but without using println!
. This will introduce some of the concepts we will need for writing safety-critical Rust code that runs on certified OSes like QNX, where the Rust Standard Library is not available.
However, to keep things easy to deploy, you can use your normal Windows, macOS or Linux system to complete this exercise.
Task 1 - Make a program
Use cargo new
to make a package containing the default binary crate - a Hello, World example that uses println!
Solution
$ cargo new testbin
Created binary (application) `testbin` package
$ cd testbin
$ cargo run
Compiling testbin v0.1.0 (/Users/jonathan/Documents/clients/training/oxidze-2024/testbin)
Finished dev [unoptimized + debuginfo] target(s) in 0.32s
Running `target/debug/testbin`
Hello, world!
Task 2 - Lock the Standard Out
The println!
macro expands to some code which:
- Grabs a lock on standard out
- Formats the arguments into the locked standard out
We can do these two steps manually, using std::io::stdout()
, and the writeln!
macro (which actually comes from libcore).
Replace the call to println!
with a call to writeln!
that uses a locked standard out. Work out how best to handle the fact that writeln!
can return an error. Think about why println!
doesn't return an error. How does it handle a possible failure?
If you get an error about the write_fmt
method not being available, make sure you have brought the std::io::Write
trait into scope. Recall that trait methods are not available on types unless the trait is in scope - otherwise how would the compiler know which traits to look for the method in? If we were on a no-std system, the same method is available in the core::fmt::Write
trait - the writeln!
macro is happy with either as long as the method exists.
Solution
use std::io::Write; fn main() { let mut stdout = std::io::stdout(); writeln!(stdout, "Hello, World!").expect("writing to stdout"); }
The writeln
call can fail because it can get an error from the object it is writing to. What if you are writing to a file on disk, and the disk is full? Or the USB Thumb Drive it is on is unplugged? The println!
macro knows it only writes to Standard Out, and if that is broken, there isn't much you can do about it (you probably can't even print an error), so it just panics.
Task 3 - Call write_fmt
The writeln!
call expands to some code which:
- Generates a value of type std::fmt::Arguments, using a macro called format_args!.
- Passes that to the write_fmt method on whatever we're writing into.
You can do these two steps manually - but that's as far as we can go! The format_args!
macro is special, and we are unable to replicate its functionality by writing regular Rust code.
Replace the call to writeln!
with a call to format_args!
, passing the result to the write_fmt
method on the locked standard output. Note that Rust won't let you store the result of format_args!
in a variable - you need to call it inside the call to write_fmt
. Try it for yourself!
Solution
use std::io::Write; fn main() { let mut stdout = std::io::stdout(); stdout.write_fmt(format_args!("Hello, World!\n")).expect("writing to stdout"); }
Task 4 - Ditch the standard output object
Rather than throw bytes into this mysterious Standard Out object, let's try and talk to our Operating System directly. We're going to do this using the libc
crate, which provides raw access to the APIs typically found in most C Standard Libraries.
-
Step 1 - Run
cargo add libc
to add it as a dependency -
Step 2 - Store your message in a local variable, as a string slice
#![allow(unused)] fn main() { let message = "Hello, World!"; }
-
Step 3 - Unsafely call the
libc::write
method, passing:
- 1 as the file descriptor (the standard output has this value, by default)
- A pointer to the start of your string slice
- The length of the string in bytes
You can make a pointer from a slice using the as_ptr()
method, but this will give you *const u8
and libc::write
might want *const c_void
. You can use message.as_ptr() as _
to get Rust to cast the pointer into an automatically determined type (the _
means 'work this out for me').
You might also find the length of the string needs casting from the default usize
to whatever libc wants on your platform.
Solution
fn main() {
    let message = "Hello, world";
    unsafe {
        libc::write(1, message.as_ptr() as _, message.len() as _);
    }
}
Bare-Metal Firmware on Cortex-R52 - Preparation
This chapter contains information about the QEMU-based exercises, the required software and an installation guide.
Required Software
QEMU, version 9
Available for Windows, macOS or Linux from https://www.qemu.org/download/
Note that version 8 or lower will not work. It must be version 9 or higher to support the Cortex-R52.
Ensure that once installed you have qemu-system-arm
on your path.
Ferrocene or Rust
If you use Ferrocene, you will need pre-rolling-2024-05-21
or newer. A criticalup.toml file is provided; you can just run criticalup install in the example directory and an appropriate toolchain will be installed.
If you use Rust, you will need a version that supports armv8r-none-eabihf
.
This should be included in Rust 1.78 or newer, or a nightly from around March
2024 or newer. You will also need to compile the standard library from source -
see the README for more details.
Bare-Metal Firmware on Cortex-R52 - Writing a UART Driver
We have supplied a small Rust no-std application, which is designed to run
inside a QEMU emulation of an Armv8-R Cortex-R52 system. We build the code using
the armv8r-none-eabihf
target.
The application lives in
./qemu-code/uart-driver
.
The application talks to the outside world through a UART driver. We have provided two - a working one, and a template that doesn't work, which you need to fix.
Task 1 - Get UART TX working
Modify the template driver and complete the missing code sections as commented. You can peek at the complete driver if you really need to!
This will involve reading and writing to the given registers. You have been
given the base-address of the UART peripheral as a const generic, and you have
been given constants for the offset of each register from the base address
(assuming you are working with a *mut u32
).
You'll want to write a private method to read/write each register, and use
write_volatile
and
read_volatile
to access them.
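As a rough illustration, a sketch of such helpers is shown below. The type name, the const generic, and the use of word offsets are assumptions made for this sketch - follow the names, offset constants, and comments in the template you were given.

struct Uart<const BASE_ADDR: usize>;

impl<const BASE_ADDR: usize> Uart<BASE_ADDR> {
    /// Read a UART register, given its offset (in u32 words) from the base address.
    fn read_register(&self, word_offset: usize) -> u32 {
        let ptr = BASE_ADDR as *const u32;
        // volatile: the hardware can change this value behind our back,
        // so the compiler must not cache or elide the read
        unsafe { ptr.add(word_offset).read_volatile() }
    }

    /// Write a UART register, given its offset (in u32 words) from the base address.
    fn write_register(&mut self, word_offset: usize, value: u32) {
        let ptr = BASE_ADDR as *mut u32;
        // volatile: this write has a side effect on the peripheral
        unsafe { ptr.add(word_offset).write_volatile(value) }
    }
}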
Task 2 - Get UART RX working
Continue modifying the UART driver so that you can read data. You'll need to
enable the RX bit in the configuration register, and add an appropriate method
to read a single byte, returning Option<u8>
. Now modify the main loop to
echo back received characters.
You'll need to look in the Cortex-M SDK UART
documentation
to see which bit in the status
register indicates that the 1-byte long RX FIFO
has data in it.
Running the code
You will need QEMU 9 installed and in your $PATH
for cargo run
to work. This
was the first version with Arm Cortex-R52 emulation.
With the template unfinished:
$ cargo run
Compiling uart-exercise v0.1.0 (/Users/jonathan/Documents/ferrous-systems/rust-exercises/qemu-code/uart-driver)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.14s
Running `qemu-system-arm -machine mps3-an536 -cpu cortex-r52 -semihosting -nographic -kernel target/armv8r-none-eabihf/debug/uart-exercise`
PANIC: PanicInfo { payload: Any { .. }, message: Some(I am a panic), location: Location { file: "src/main.rs", line: 43, col: 5 }, can_unwind: true, force_no_backtrace: false }
With Task 1 completed:
$ cargo run
Compiling uart-exercise v0.1.0 (/Users/jonathan/Documents/ferrous-systems/rust-exercises/qemu-code/uart-driver)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.14s
Running `qemu-system-arm -machine mps3-an536 -cpu cortex-r52 -semihosting -nographic -kernel target/armv8r-none-eabihf/debug/uart-exercise`
Hello, this is Rust!
1.00 2.00 3.00 4.00 5.00 6.00 7.00 8.00 9.00 10.00
2.00 4.00 6.00 8.00 10.00 12.00 14.00 16.00 18.00 20.00
3.00 6.00 9.00 12.00 15.00 18.00 21.00 24.00 27.00 30.00
4.00 8.00 12.00 16.00 20.00 24.00 28.00 32.00 36.00 40.00
5.00 10.00 15.00 20.00 25.00 30.00 35.00 40.00 45.00 50.00
6.00 12.00 18.00 24.00 30.00 36.00 42.00 48.00 54.00 60.00
7.00 14.00 21.00 28.00 35.00 42.00 49.00 56.00 63.00 70.00
8.00 16.00 24.00 32.00 40.00 48.00 56.00 64.00 72.00 80.00
9.00 18.00 27.00 36.00 45.00 54.00 63.00 72.00 81.00 90.00
10.00 20.00 30.00 40.00 50.00 60.00 70.00 80.00 90.00 100.00
PANIC: PanicInfo { payload: Any { .. }, message: Some(I am a panic), location: Location { file: "src/main.rs", line: 43, col: 5 }, can_unwind: true, force_no_backtrace: false }
Interactive TCP Echo Server
In this exercise, we will make a simple TCP "echo" server using APIs in Rust's Standard Library.
Here's what an interaction with it looks like from a client's point of view. You connect to it using nc
, for example:
nc localhost 7878
and type in one line of text. As soon as you hit enter, the server sends the line back but keeps the connection open. You can type another line and get it back, and so on.
Here's an example interaction with the server. Notice that after typing a single line the connection is not closed and we receive the line back. All inputs and outputs should be separated by new line characters (\n
).
$ nc localhost 7878
hello
> hello
world
> world
(>
denotes the text that is sent back to you)
After completing this exercise you are able to
-
open a TCP port and react to TCP clients connecting
-
use I/O traits to read/write from a TCP socket
-
use threads to support multiple connections
Tasks
- Create a new binary project tcp-server.
- Implement a basic TCP server that listens for connections on a given port (you can use 127.0.0.1:7878 or any other port that you like).
- Implement a loop that reads data from a TcpStream one line at a time. We assume that lines are separated by a \n character.
- Write the received line back to the stream. Resolve potential borrow checker issues using standard library APIs.
- Use Rust's thread API to add support for multiple connections.
Here's a bit of code to get you started:
use std::{io, net::{TcpListener, TcpStream}};
fn handle_client(mut stream: TcpStream) -> Result<(), io::Error> {
todo!("read stream line by line, write lines back to the stream");
// for line in stream {
// write the line back to the stream
// }
Ok(())
}
fn main() -> Result<(), io::Error> {
let listener = todo!("bind a listener to 127.0.0.1:7878");
for stream in todo!("accept incoming connections") {
// todo!("support multiple connections in parallel");
handle_client(stream)?;
}
Ok(())
}
Help
Reading line by line
Rust by Example has a chapter showing examples of reading files line by line that can be adapted to TcpStream
, too.
Solving borrow checker issues
At some point you may run into borrow checker issues because you are essentially trying to write into a stream as you read from it.
The solution is to end up with two separate owned variables that perform reading and writing respectively.
There are two general approaches to do so:
- Simply clone the stream. TcpStream has a try_clone() method. This will not clone the stream itself: on the Operating System level there will still be a single connection. But from Rust's perspective this underlying OS resource will now be represented by two distinct variables.
- Use the fact that the Read and Write traits are implemented not only for TcpStream but also for &TcpStream. For example, you can create a pair of BufReader and BufWriter by passing &stream as an argument, as in the sketch below.
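If you take the second approach, a minimal sketch might look like this (the flush() call is explained in the next section):

use std::{
    io::{self, BufRead, BufReader, BufWriter, Write},
    net::TcpStream,
};

fn handle_client(stream: TcpStream) -> Result<(), io::Error> {
    // Read and Write are implemented for &TcpStream, so two borrows of the
    // same stream can act as independent reader and writer handles.
    let reader = BufReader::new(&stream);
    let mut writer = BufWriter::new(&stream);
    for line in reader.lines() {
        let line = line?;
        writer.write_all(line.as_bytes())?;
        writer.write_all(b"\n")?;
        // BufWriter buffers output, so flush to actually send it
        writer.flush()?;
    }
    Ok(())
}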
Troubleshooting I/O operations
If you decide to use BufWriter
to handle writes you may not see any text echoed back in the terminal when using nc
. As the name implies, the output is buffered, and you need to explicitly call the flush() method for text to be sent out over the TCP socket.
Running nc
on Windows
Windows doesn't come with a TCP client out of the box. You have a number of options:
- Git-for-Windows comes with Git-Bash - a minimal Unix emulation layer. It has Windows ports of many popular UNIX command-line utilities, including nc.
- If you have WSL set up, your Linux environment has nc (or it is available as a package). You may either run the exercise in your Linux environment, too, or connect from the Linux guest to your host.
- There's a Windows-native version of ncat from the Nmap project that is available as a separate portable download.
- If you have access to a remote Linux server you can use SSH tunnelling to connect a remote nc to a TCP server running on your local machine. ssh -R 8888:localhost:7878 <user>@<remote_host> -p <ssh_port> will let you run nc localhost 8888 on that Linux box and talk to the TCP Echo server running locally.
- If you have friends that can run nc you can let them connect to your developer machine and play the role of your client. This is often possible if you share the same local network with them, but you can always rely on ngrok or cloudflared to expose a specific TCP port to anyone on the internet.
Share data between connections
In this exercise we will take our interactive server and add a common log for lengths of messages that each client sends us. We will explore synchronization primitives that Rust offers in its Standard Library.
After completing this exercise you are able to
-
share data between threads using
Mutexes
use reference-counting to ensure data stays available across multiple threads
-
use scoped threads to avoid runtime reference counting
-
use channels and message passing to share data among threads by communicating
Tasks
Part 1
- Add a log to store the lengths of messages:
let mut log: Vec<usize> = vec![];
- Pass it to the handle_client function and record the length of each incoming line of text: log.push(line.len());
- Resolve lifetime issues by using a reference-counting pointer.
- Resolve mutability issues by using a mutex (a sketch of the combined pattern follows below).
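Here is a minimal sketch of the Arc/Mutex pattern on its own, with string literals standing in for the per-connection lines (not a full solution to the exercise):

use std::{
    sync::{Arc, Mutex},
    thread,
};

fn main() {
    // the shared log: Arc gives shared ownership, Mutex gives shared mutability
    let log: Arc<Mutex<Vec<usize>>> = Arc::new(Mutex::new(vec![]));

    let handles: Vec<_> = ["hello", "world"]
        .into_iter()
        .map(|line| {
            let log = Arc::clone(&log); // one handle per "connection"
            thread::spawn(move || {
                log.lock().unwrap().push(line.len());
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    println!("{:?}", log.lock().unwrap()); // e.g. [5, 5]
}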
Part 2
- Use the thread::scope function to get rid of reference counting for the log vector (a sketch follows below).
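A minimal sketch of the same idea with scoped threads:

use std::{sync::Mutex, thread};

fn main() {
    // with scoped threads the log can live on this stack frame: the scope
    // guarantees all spawned threads finish before `log` is dropped,
    // so no reference counting is needed
    let log: Mutex<Vec<usize>> = Mutex::new(vec![]);

    thread::scope(|s| {
        let log = &log; // a plain shared reference is enough inside the scope
        for line in ["hello", "world"] {
            s.spawn(move || {
                log.lock().unwrap().push(line.len());
            });
        }
    });

    println!("{:?}", log.into_inner().unwrap());
}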
Part 3
- Instead of sharing the log vector, use an mpsc::channel to send the lengths of lines from worker threads.
- Create a separate thread that listens for new channel messages and updates the vector accordingly (a sketch follows below).
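And a minimal sketch of the channel-based variant, where a dedicated collector thread owns the log:

use std::{sync::mpsc, thread};

fn main() {
    let (tx, rx) = mpsc::channel::<usize>();

    // worker threads (one per connection in the exercise) only need a Sender clone
    for line in ["hello", "world"] {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(line.len()).unwrap();
        });
    }
    // drop the original sender so the receiver loop below can finish
    drop(tx);

    // a dedicated thread owns the log and updates it from channel messages
    let collector = thread::spawn(move || {
        let mut log: Vec<usize> = vec![];
        while let Ok(len) = rx.recv() {
            log.push(len);
        }
        log
    });

    println!("{:?}", collector.join().unwrap());
}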
Writing an async chat
Nothing is simpler than creating a chat server, right? Not quite: chat servers expose you to all the fun of asynchronous programming:
How will the server handle clients connecting concurrently?
How will it handle them disconnecting?
How will it distribute the messages?
This tutorial explains how to write a chat server in tokio
.
Specification and Getting Started
Specification
The chat uses a simple text protocol over TCP.
The protocol consists of utf-8 messages, separated by \n
.
The client connects to the server and sends login as a first line. After that, the client can send messages to other clients using the following syntax:
login1, login2, ... loginN: message
Each of the specified clients then receives a from login: message
message.
A possible session might look like this
On Alice's computer: | On Bob's computer:
> alice | > bob
> bob: hello < from alice: hello
| > alice, bob: hi!
< from bob: hi!
< from bob: hi! |
The main challenge for the chat server is keeping track of many concurrent connections. The main challenge for the chat client is managing concurrent outgoing messages, incoming messages, and the user's typing.
Getting Started
Let's create a new Cargo project:
$ cargo new a-chat
$ cd a-chat
Add the following lines to Cargo.toml
:
[dependencies]
tokio = { version = "1", features = ["full"] }
Writing an Accept Loop
Let's implement the scaffold of the server: a loop that binds a TCP socket to an address and starts accepting connections.
First of all, let's add required import boilerplate:
extern crate tokio;
use std::future::Future; // 1
use tokio::{
io::{AsyncBufReadExt, AsyncWriteExt, BufReader}, // 2
net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs}, // 3
sync::{mpsc, oneshot},
task, // 4
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>; // 5
- Import traits required to work with futures.
- Import traits required to work with streams.
- For the socket type, we use TcpListener from tokio, which is similar to the sync std::net::TcpListener, but is non-blocking and uses an async API.
- The task module roughly corresponds to the std::thread module, but tasks are much lighter weight. A single thread can run many tasks.
- We will skip implementing detailed error handling in this example. To propagate the errors, we will use a boxed error trait object.
Do you know that there's a From<&'_ str> for Box<dyn Error> implementation in the stdlib, which allows you to use strings with the ? operator?
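For example, a small self-contained sketch (the function and its argument are made up for illustration):

type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

fn first_word(input: Option<&str>) -> Result<String> {
    let word = match input {
        Some(s) => s.to_string(),
        // the &str is converted into the boxed error type by the ? operator
        None => Err("no input provided")?,
    };
    Ok(word)
}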
Now we can write the server's accept loop:
extern crate tokio;
use tokio::net::{TcpListener, ToSocketAddrs};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> { // 1
let listener = TcpListener::bind(addr).await?; // 2
loop { // 3
let (stream, _) = listener.accept().await?;
// TODO
}
Ok(())
}
- We mark the accept_loop function as async, which allows us to use .await syntax inside.
- The TcpListener::bind call returns a future, which we .await to extract the Result, and then ? to get a TcpListener. Note how .await and ? work nicely together. This is exactly how std::net::TcpListener works, but with .await added.
- We generally use loop and break for looping in futures; that makes things easier down the line.
Finally, let's add main:
extern crate tokio;
use tokio::net::{ToSocketAddrs};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
Ok(())
}
#[tokio::main]
pub(crate) async fn main() -> Result<()> {
accept_loop("127.0.0.1:8080").await
}
The crucial thing to realise is that in Rust, unlike in other languages, calling an async function does not run any code.
Async functions only construct futures, which are inert state machines.
To start stepping through the future state-machine in an async function, you should use .await
.
In a non-async function, a way to execute a future is to hand it to an executor.
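A minimal sketch of that laziness (only the println! ordering matters here):

async fn say_hello() {
    println!("hello from the future");
}

#[tokio::main]
async fn main() {
    // nothing is printed yet: this only constructs the state machine
    let fut = say_hello();
    println!("future created, not yet run");
    // only now does say_hello's body execute
    fut.await;
}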
Receiving messages
Let's implement the receiving part of the protocol. We need to:
- split incoming TcpStream on \n and decode bytes as utf-8
- interpret the first line as a login
- parse the rest of the lines as a login: message
We recommend that you speed through this part quickly; it is mostly a lot of uninteresting protocol minutiae.
extern crate tokio;
use std::{
collections::hash_map::{Entry, HashMap},
future::Future,
};
use tokio::{
io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
sync::{mpsc, oneshot},
task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
let listener = TcpListener::bind(addr).await?;
loop {
let (stream, _socket_addr) = listener.accept().await?;
println!("Accepting from: {}", stream.peer_addr()?);
let _handle = task::spawn(connection_loop(stream)); // 1
}
Ok(())
}
async fn connection_loop(stream: TcpStream) -> Result<()> {
let reader = BufReader::new(stream);
let mut lines = reader.lines(); // 2
// 3
let name = match lines.next_line().await? {
None => Err("peer disconnected immediately")?,
Some(line) => line,
};
println!("name = {}", name);
// 4
loop {
if let Some(line) = lines.next_line().await? {
// 5
let (dest, msg) = match line.find(':') {
None => continue,
Some(idx) => (&line[..idx], line[idx + 1..].trim()),
};
let dest = dest
.split(',')
.map(|name| name.trim().to_string())
.collect::<Vec<_>>();
let msg = msg.to_string();
// TODO: this is temporary
println!("Received message: {}", msg);
} else {
break
}
}
Ok(())
}
-
We use the task::spawn function to spawn an independent task for working with each client. That is, after accepting the client, the accept_loop immediately starts waiting for the next one. This is the core benefit of event-driven architecture: we serve many clients concurrently, without spending many hardware threads.
Luckily, the "split byte stream into lines" functionality is already implemented.
The .lines() call returns a stream of Strings.
We get the first line -- login
-
And, once again, we implement a manual async loop.
-
Finally, we parse each line into a list of destination logins and the message itself.
Managing Errors
One issue with the previous solution is that while we correctly propagate errors in the connection_loop
, we just drop the error afterwards!
That is, task::spawn
does not return an error immediately (it can't, it needs to run the future to completion first), only after it is joined.
We can "fix" it by waiting for the task to be joined, like this:
extern crate tokio;
use tokio::{
net::TcpStream,
task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
async fn connection_loop(stream: TcpStream) -> Result<()> {
Ok(())
}
async fn accept_loop(stream: TcpStream) -> Result<()> {
let handle = task::spawn(connection_loop(stream));
handle.await?
}
The .await
waits until the client finishes, and ?
propagates the result.
There are two problems with this solution however!
- First, because we immediately await the client, we can only handle one client at a time, and that completely defeats the purpose of async!
- Second, if a client encounters an IO error, the whole server immediately exits. That is, a flaky internet connection of one peer brings down the whole chat room!
The correct way to handle client errors in this case is to log them and continue serving other clients. So let's use a helper function for this:
extern crate tokio;
use std::future::Future;
use tokio::task;
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
F: Future<Output = Result<()>> + Send + 'static,
{
task::spawn(async move {
if let Err(e) = fut.await {
eprintln!("{}", e)
}
})
}
Sending Messages
Now it's time to implement the other half -- sending messages.
As a rule of thumb, only a single task should write to each TcpStream
.
This way, we also have compartmentalised that activity and automatically serialize all outgoing messages.
So let's create a connection_writer_loop
task which receives messages over a channel and writes them to the socket.
If Alice and Charley send two messages to Bob at the same time, Bob will see the messages in the same order as they arrive in the channel.
extern crate tokio;
use std::{
collections::hash_map::{Entry, HashMap},
future::Future,
};
use tokio::{
io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
sync::oneshot,
task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
use tokio::sync::mpsc; // 1
type Sender<T> = mpsc::UnboundedSender<T>; // 2
type Receiver<T> = mpsc::UnboundedReceiver<T>;
async fn connection_writer_loop(
messages: &mut Receiver<String>,
stream: &mut OwnedWriteHalf // 3
) -> Result<()> {
loop {
let msg = messages.recv().await;
match msg {
Some(msg) => stream.write_all(msg.as_bytes()).await?,
None => break,
}
}
Ok(())
}
-
We will use
mpsc
channels from tokio
. -
For simplicity, we will use
unbounded
channels, and won't be discussing backpressure in this tutorial. -
As
connection_loop
andconnection_writer_loop
share the sameTcpStream
, we use splitting. We'll glue this together later.extern crate tokio; use tokio::net::TcpStream; async fn connection_loop(stream: TcpStream) { use tokio::net::tcp; let (reader, writer): (tcp::OwnedReadHalf, tcp::OwnedWriteHalf) = stream.into_split(); }
A broker as a connection point
So how do we make sure that messages read in connection_loop
flow into the relevant connection_writer_loop
?
We should somehow maintain a peers: HashMap<String, Sender<String>>
map which allows a client to find destination channels.
However, this map would be a bit of shared mutable state, so we'll have to wrap it in an RwLock and answer tough questions about what should happen if a client joins at the same moment as it receives a message.
One trick to make reasoning about state simpler is to take inspiration from the actor model.
We can create a dedicated broker task which owns the peers
map and communicates with other tasks using channels.
The broker reacts to events and appropriately informs the peers.
By hiding peer handling inside such an "actor" task, we remove the need for mutexes and also make the serialization point explicit.
The order of events "Bob sends message to Alice" and "Alice joins" is determined by the order of the corresponding events in the broker's event queue.
extern crate tokio;
use std::future::Future;
use tokio::{
io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
sync::{mpsc, oneshot},
task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;
async fn connection_writer_loop(
messages: &mut Receiver<String>,
stream: &mut OwnedWriteHalf,
) -> Result<()> {
Ok(())
}
fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
F: Future<Output = Result<()>> + Send + 'static,
{
unimplemented!()
}
use std::collections::hash_map::{Entry, HashMap};
#[derive(Debug)]
enum Event { // 1
NewPeer {
name: String,
stream: OwnedWriteHalf,
},
Message {
from: String,
to: Vec<String>,
msg: String,
},
}
async fn broker_loop(mut events: Receiver<Event>) {
let mut peers: HashMap<String, Sender<String>> = HashMap::new(); // 2
loop {
let event = match events.recv().await {
Some(event) => event,
None => break,
};
match event {
Event::Message { from, to, msg } => { // 3
for addr in to {
if let Some(peer) = peers.get_mut(&addr) {
let msg = format!("from {from}: {msg}\n");
peer.send(msg).unwrap();
}
}
}
Event::NewPeer { name, mut stream } => match peers.entry(name.clone()) {
Entry::Occupied(..) => (),
Entry::Vacant(entry) => {
let (client_sender, mut client_receiver) = mpsc::unbounded_channel();
entry.insert(client_sender); // 4
spawn_and_log_error(async move {
connection_writer_loop(&mut client_receiver, &mut stream).await
}); // 5
}
},
}
}
}
- The broker task should handle two types of events: a message or an arrival of a new peer.
- The internal state of the broker is a HashMap. Note how we don't need a Mutex here and can confidently say, at each iteration of the broker's loop, what is the current set of peers.
- To handle a message, we send it over a channel to each destination.
- To handle a new peer, we first register it in the peer's map ...
- ... and then spawn a dedicated task to actually write the messages to the socket.
Gluing it all together
At this point, we only need to start the broker to get a fully-functioning (in the happy case!) chat.
Scroll past the example to find a list of all changes.
extern crate tokio;
use std::{
collections::hash_map::{Entry, HashMap},
future::Future,
};
use tokio::{
io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
sync::mpsc,
task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;
#[tokio::main]
pub(crate) async fn main() -> Result<()> {
accept_loop("127.0.0.1:8080").await
}
async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
let listener = TcpListener::bind(addr).await?;
let (broker_sender, broker_receiver) = mpsc::unbounded_channel(); // 1
let _broker = task::spawn(broker_loop(broker_receiver));
while let Ok((stream, _socket_addr)) = listener.accept().await {
println!("Accepting from: {}", stream.peer_addr()?);
spawn_and_log_error(connection_loop(broker_sender.clone(), stream));
}
Ok(())
}
async fn connection_loop(broker: Sender<Event>, stream: TcpStream) -> Result<()> { // 2
let (reader, writer) = stream.into_split(); // 3
let reader = BufReader::new(reader);
let mut lines = reader.lines();
let name = match lines.next_line().await {
Ok(Some(line)) => line,
Ok(None) => return Err("peer disconnected immediately".into()),
Err(e) => return Err(Box::new(e)),
};
println!("user {} connected", name);
broker
.send(Event::NewPeer {
name: name.clone(),
stream: writer,
})
        .unwrap(); // 4
loop {
if let Some(line) = lines.next_line().await? {
let (dest, msg) = match line.find(':') {
None => continue,
Some(idx) => (&line[..idx], line[idx + 1..].trim()),
};
let dest: Vec<String> = dest
.split(',')
.map(|name| name.trim().to_string())
.collect();
let msg: String = msg.trim().to_string();
broker
                .send(Event::Message { // 5
from: name.clone(),
to: dest,
msg,
})
.unwrap();
} else {
break;
}
}
Ok(())
}
async fn connection_writer_loop(
messages: &mut Receiver<String>,
stream: &mut OwnedWriteHalf // 3
) -> Result<()> {
loop {
let msg = messages.recv().await;
match msg {
Some(msg) => stream.write_all(msg.as_bytes()).await?,
None => break,
}
}
Ok(())
}
#[derive(Debug)]
enum Event {
NewPeer {
name: String,
stream: OwnedWriteHalf,
},
Message {
from: String,
to: Vec<String>,
msg: String,
},
}
async fn broker_loop(mut events: Receiver<Event>) {
let mut peers: HashMap<String, Sender<String>> = HashMap::new();
loop {
let event = match events.recv().await {
Some(event) => event,
None => break,
};
match event {
Event::Message { from, to, msg } => {
for addr in to {
if let Some(peer) = peers.get_mut(&addr) {
let msg = format!("from {from}: {msg}\n");
peer.send(msg).unwrap();
}
}
}
Event::NewPeer { name, mut stream } => match peers.entry(name.clone()) {
Entry::Occupied(..) => (),
Entry::Vacant(entry) => {
let (client_sender, mut client_receiver) = mpsc::unbounded_channel();
entry.insert(client_sender);
spawn_and_log_error(async move {
connection_writer_loop(&mut client_receiver, &mut stream).await
});
}
},
}
}
}
fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
F: Future<Output = Result<()>> + Send + 'static,
{
task::spawn(async move {
if let Err(e) = fut.await {
eprintln!("{}", e)
}
})
}
- Inside the accept_loop, we create the broker's channel and task.
- We need the connection_loop to accept a handle to the broker.
- Inside connection_loop, we need to split the TcpStream, to be able to share it with the connection_writer_loop.
- On login, we notify the broker. Note that we .unwrap on send: the broker should outlive all the clients, and if that's not the case the broker probably panicked, so we can escalate the panic as well.
- Similarly, we forward parsed messages to the broker, assuming that it is alive.
Clean Shutdown
One of the problems with the current implementation is that it doesn't handle graceful shutdown. If we break from the accept loop for some reason, all in-flight tasks are just dropped.
Instead, let's intercept Ctrl-C
and implement a more correct shutdown sequence:
- Stop accepting new clients
- Notify the readers we're not accepting new messages
- Deliver all pending messages
- Exit the process
A clean shutdown in a channel-based architecture is easy, although it can seem like a magic trick at first.
In Rust, the receiver side of a channel is closed as soon as all senders are dropped.
That is, as soon as producers exit and drop their senders, the rest of the system shuts down naturally.
In tokio
this translates to two rules:
- Make sure that channels form an acyclic graph.
- Take care to wait, in the correct order, until intermediate layers of the system process pending messages.
In a-chat
, we already have a unidirectional flow of messages: reader -> broker -> writer
.
However, we never wait for broker and writers, which might cause some messages to get dropped.
We also need to notify all readers that we are going to stop accepting messages. Here, we use tokio::sync::Notify
.
Let's first add the notification feature to the readers.
We have to start using select! here, so that each reader can wait for either a new line from its socket or the shutdown notification.
async fn connection_loop(broker: Sender<Event>, stream: TcpStream, shutdown: Arc<Notify>) -> Result<()> {
// ...
loop {
tokio::select! {
Ok(Some(line)) = lines.next_line() => {
let (dest, msg) = match line.split_once(':') {
None => continue,
Some((dest, msg)) => (dest, msg.trim()),
};
let dest: Vec<String> = dest
.split(',')
.map(|name| name.trim().to_string())
.collect();
let msg: String = msg.trim().to_string();
broker
.send(Event::Message {
from: name.clone(),
to: dest,
msg,
})
.unwrap();
},
_ = shutdown.notified() => break,
}
}
}
Let's add Ctrl-C handling and waiting to the server.
extern crate tokio;
use std::{
    collections::hash_map::{Entry, HashMap},
    future::Future,
    sync::Arc,
};
use tokio::{
io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
sync::{mpsc, oneshot, Notify},
task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;
enum Event {
NewPeer {
name: String,
stream: OwnedWriteHalf,
shutdown: oneshot::Receiver<()>,
},
Message {
from: String,
to: Vec<String>,
msg: String,
},
}
async fn broker_loop(mut events: Receiver<Event>) {}
async fn connection_loop(broker: Sender<Event>, stream: TcpStream, shutdown: Arc<Notify>) -> Result<()> {
Ok(())
}
fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
F: Future<Output = Result<()>> + Send + 'static,
{
unimplemented!()
}
async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
let listener = TcpListener::bind(addr).await?;
let (broker_sender, broker_receiver) = mpsc::unbounded_channel();
let broker = task::spawn(broker_loop(broker_receiver));
let shutdown_notification = Arc::new(Notify::new());
loop {
tokio::select!{
Ok((stream, _socket_addr)) = listener.accept() => {
println!("Accepting from: {}", stream.peer_addr()?);
spawn_and_log_error(connection_loop(broker_sender.clone(), stream, shutdown_notification.clone()));
},
_ = tokio::signal::ctrl_c() => break,
}
}
println!("Shutting down server!");
shutdown_notification.notify_waiters(); // 1
drop(broker_sender); // 2
broker.await?; // 5
Ok(())
}
And to the broker:
extern crate tokio;
use std::{
collections::hash_map::{Entry, HashMap},
future::Future,
};
use tokio::{
io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
sync::{mpsc, oneshot},
task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;
enum Event {
NewPeer {
name: String,
stream: OwnedWriteHalf,
shutdown: oneshot::Receiver<()>,
},
Message {
from: String,
to: Vec<String>,
msg: String,
},
}
async fn connection_loop(broker: Sender<Event>, stream: TcpStream) -> Result<()> {
Ok(())
}
fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
F: Future<Output = Result<()>> + Send + 'static,
{
unimplemented!()
}
async fn connection_writer_loop(
messages: &mut Receiver<String>,
stream: &mut OwnedWriteHalf,
mut shutdown: oneshot::Receiver<()>,
) -> Result<()> {
Ok(())
}
async fn broker_loop(mut events: Receiver<Event>) {
let mut peers: HashMap<String, Sender<String>> = HashMap::new();
loop {
let event = match events.recv().await {
Some(event) => event,
None => break,
};
match event {
Event::Message { from, to, msg } => {
// ...
}
Event::NewPeer {
name,
                mut stream,
                shutdown,
            } => match peers.entry(name.clone()) {
Entry::Occupied(..) => (),
Entry::Vacant(entry) => {
let (client_sender, mut client_receiver) = mpsc::unbounded_channel();
entry.insert(client_sender);
spawn_and_log_error(async move {
                        connection_writer_loop(&mut client_receiver, &mut stream, shutdown).await
});
}
},
}
}
    drop(peers); // 4
}
Notice what happens with all of the channels once we exit the accept loop:
- We notify all readers to stop accepting messages.
- We drop the main broker's sender. That way when the readers are done, there's no sender for the broker's channel, and the channel closes.
- Next, the broker exits its loop once events.recv().await returns None.
- It's crucial that, at this stage, we drop the peers map. This drops the writers' senders.
- Tokio will automatically wait for all finishing futures.
- Finally, we join the broker, which also guarantees that all the writes have terminated.
Handling Disconnections
Currently, we only ever add new peers to the map. This is clearly wrong: if a peer closes its connection to the chat, we should not try to send any more messages to it.
One subtlety with handling disconnection is that we can detect it either in the reader's task, or in the writer's task.
The most obvious solution here is to just remove the peer from the peers
map in both cases, but this would be wrong.
If both read and write fail, we'll remove the peer twice, but it can be the case that the peer reconnected between the two failures!
To fix this, we will only remove the peer when the write side finishes.
If the read side finishes we will notify the write side that it should stop as well.
That is, we need to add an ability to signal shutdown for the writer task.
One way to approach this is a shutdown: Receiver<()>
channel.
There's a more minimal solution however, which makes clever use of RAII (Resource Acquisition Is Initialization).
Closing a channel is a synchronization event, so we don't need to send a shutdown message, we can just drop the sender.
This way, we statically guarantee that we issue shutdown exactly once, even if we early return via ?
or panic.
First, let's add a shutdown channel to the connection_loop
:
extern crate tokio;
use std::future::Future;
use tokio::{
io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
sync::{mpsc, oneshot},
task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;
async fn connection_writer_loop(
messages: &mut Receiver<String>,
stream: &mut OwnedWriteHalf,
) -> Result<()> {
Ok(())
}
fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
F: Future<Output = Result<()>> + Send + 'static,
{
unimplemented!()
}
#[derive(Debug)]
enum Event {
NewPeer {
name: String,
stream: OwnedWriteHalf,
shutdown: oneshot::Receiver<()>, // 1
},
Message {
from: String,
to: Vec<String>,
msg: String,
},
}
async fn connection_loop(broker: Sender<Event>, stream: TcpStream) -> Result<()> {
let (reader, writer) = stream.into_split();
let reader = BufReader::new(reader);
let mut lines = reader.lines();
let name: String = String::new();
// ...
let (_shutdown_sender, shutdown_receiver) = oneshot::channel::<()>(); // 3
broker
.send(Event::NewPeer {
name: name.clone(),
stream: writer,
shutdown: shutdown_receiver, // 2
})
.unwrap();
// ...
unimplemented!()
}
- To enforce that no messages are sent along the shutdown channel, we use a oneshot channel.
- We pass the shutdown channel to the writer task.
- In the reader, we create a
_shutdown_sender
whose only purpose is to get dropped.
In the connection_writer_loop
, we now need to choose between shutdown and message channels.
We use the select
macro for this purpose:
async fn connection_writer_loop(
messages: &mut Receiver<String>,
stream: &mut OwnedWriteHalf,
mut shutdown: oneshot::Receiver<()>, // 1
) -> Result<()> {
loop { // 2
tokio::select! {
msg = messages.recv() => match msg {
Some(msg) => stream.write_all(msg.as_bytes()).await?,
None => break,
},
_ = &mut shutdown => break // 3
}
}
println!("Closing connection_writer loop!");
Ok(())
}
- We add the shutdown channel as an argument.
- Because of select, we can't use a while let loop, so we desugar it further into a loop.
- In the shutdown case we break the loop.
Another problem is that between the moment we detect disconnection in connection_writer_loop
and the moment when we actually remove the peer from the peers
map, new messages might be pushed into the peer's channel.
The final thing to handle is to actually clean up our peers map. Here, we need to establish a communication channel back to the broker. However, we can handle that completely within the broker's scope, so as not to infect the writer loop with this concern.
To not lose these messages completely, we'll return the writer's message receiver back to the broker. This also allows us to establish a useful invariant: the message channel strictly outlives the peer in the peers map, and it makes the broker itself infallible.
async fn broker_loop(mut events: Receiver<Event>) {
let (disconnect_sender, mut disconnect_receiver) =
mpsc::unbounded_channel::<(String, Receiver<String>)>(); // 1
let mut peers: HashMap<String, Sender<String>> = HashMap::new();
loop {
let event = tokio::select! {
event = events.recv() => match event {
None => break,
Some(event) => event,
},
disconnect = disconnect_receiver.recv() => {
let (name, _pending_messages) = disconnect.unwrap();
assert!(peers.remove(&name).is_some());
println!("user {} disconnected", name);
continue;
},
};
match event {
Event::Message { from, to, msg } => {
// ...
}
Event::NewPeer {
name,
mut stream,
shutdown,
} => match peers.entry(name.clone()) {
Entry::Occupied(..) => (),
Entry::Vacant(entry) => {
// ...
                    let disconnect_sender = disconnect_sender.clone();
                    spawn_and_log_error(async move {
let res =
connection_writer_loop(&mut client_receiver, &mut stream, shutdown)
.await;
println!("user {} disconnected", name);
disconnect_sender.send((name, client_receiver)).unwrap(); // 2
res
});
}
},
}
}
drop(peers);
drop(disconnect_sender);
while let Some((_name, _pending_messages)) = disconnect_receiver.recv().await {}
}
Final Server Code
The final code looks like this:
use std::{
collections::hash_map::{Entry, HashMap},
future::Future,
sync::Arc,
};
use tokio::{
io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
sync::{mpsc, oneshot, Notify},
task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;
#[tokio::main]
pub(crate) async fn main() -> Result<()> {
accept_loop("127.0.0.1:8080").await
}
async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
let listener = TcpListener::bind(addr).await?;
let (broker_sender, broker_receiver) = mpsc::unbounded_channel();
let broker = task::spawn(broker_loop(broker_receiver));
let shutdown_notification = Arc::new(Notify::new());
loop {
tokio::select!{
Ok((stream, _socket_addr)) = listener.accept() => {
println!("Accepting from: {}", stream.peer_addr()?);
spawn_and_log_error(connection_loop(broker_sender.clone(), stream, shutdown_notification.clone()));
},
_ = tokio::signal::ctrl_c() => break,
}
}
println!("Shutting down!");
shutdown_notification.notify_waiters();
drop(broker_sender);
broker.await?;
Ok(())
}
async fn connection_loop(broker: Sender<Event>, stream: TcpStream, shutdown: Arc<Notify>) -> Result<()> {
let (reader, writer) = stream.into_split();
let reader = BufReader::new(reader);
let mut lines = reader.lines();
let (shutdown_sender, shutdown_receiver) = oneshot::channel::<()>();
let name = match lines.next_line().await {
Ok(Some(line)) => line,
Ok(None) => return Err("peer disconnected immediately".into()),
Err(e) => return Err(Box::new(e)),
};
println!("user {} connected", name);
broker
.send(Event::NewPeer {
name: name.clone(),
stream: writer,
shutdown: shutdown_receiver,
})
.unwrap();
loop {
tokio::select! {
Ok(Some(line)) = lines.next_line() => {
let (dest, msg) = match line.split_once(':') {
None => continue,
Some((dest, msg)) => (dest, msg.trim()),
};
let dest: Vec<String> = dest
.split(',')
.map(|name| name.trim().to_string())
.collect();
let msg: String = msg.trim().to_string();
broker
.send(Event::Message {
from: name.clone(),
to: dest,
msg,
})
.unwrap();
},
_ = shutdown.notified() => break,
}
}
println!("Closing connection loop!");
drop(shutdown_sender);
Ok(())
}
async fn connection_writer_loop(
messages: &mut Receiver<String>,
stream: &mut OwnedWriteHalf,
mut shutdown: oneshot::Receiver<()>,
) -> Result<()> {
loop {
tokio::select! {
msg = messages.recv() => match msg {
Some(msg) => stream.write_all(msg.as_bytes()).await?,
None => break,
},
_ = &mut shutdown => break
}
}
println!("Closing connection_writer loop!");
Ok(())
}
#[derive(Debug)]
enum Event {
NewPeer {
name: String,
stream: OwnedWriteHalf,
shutdown: oneshot::Receiver<()>,
},
Message {
from: String,
to: Vec<String>,
msg: String,
},
}
async fn broker_loop(mut events: Receiver<Event>) {
let (disconnect_sender, mut disconnect_receiver) =
mpsc::unbounded_channel::<(String, Receiver<String>)>();
let mut peers: HashMap<String, Sender<String>> = HashMap::new();
loop {
let event = tokio::select! {
event = events.recv() => match event {
None => break,
Some(event) => event,
},
disconnect = disconnect_receiver.recv() => {
let (name, _pending_messages) = disconnect.unwrap();
assert!(peers.remove(&name).is_some());
println!("user {} disconnected", name);
continue;
},
};
match event {
Event::Message { from, to, msg } => {
for addr in to {
if let Some(peer) = peers.get_mut(&addr) {
let msg = format!("from {}: {}\n", from, msg);
peer.send(msg).unwrap();
}
}
}
Event::NewPeer {
name,
mut stream,
shutdown,
} => match peers.entry(name.clone()) {
Entry::Occupied(..) => (),
Entry::Vacant(entry) => {
let (client_sender, mut client_receiver) = mpsc::unbounded_channel();
entry.insert(client_sender);
let disconnect_sender = disconnect_sender.clone();
spawn_and_log_error(async move {
let res =
connection_writer_loop(&mut client_receiver, &mut stream, shutdown)
.await;
println!("user {} disconnected", name);
disconnect_sender.send((name, client_receiver)).unwrap();
res
});
}
},
}
}
drop(peers);
drop(disconnect_sender);
while let Some((_name, _pending_messages)) = disconnect_receiver.recv().await {}
}
fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
F: Future<Output = Result<()>> + Send + 'static,
{
task::spawn(async move {
if let Err(e) = fut.await {
eprintln!("{}", e)
}
})
}
Implementing a client
Since the protocol is line-based, implementing a client for the chat is straightforward:
- Lines read from stdin should be sent over the socket.
- Lines read from the socket should be echoed to stdout.
Although async does not significantly affect client performance (as unlike the server, the client interacts solely with one user and only needs limited concurrency), async is still useful for managing concurrency!
The client has to read from stdin and the socket simultaneously.
Programming this with threads is cumbersome, especially when implementing a clean shutdown.
With async, the select!
macro is all that is needed.
extern crate tokio;
use tokio::{
io::{stdin, AsyncBufReadExt, AsyncWriteExt, BufReader},
net::{TcpStream, ToSocketAddrs},
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
// main
async fn run() -> Result<()> {
try_main("127.0.0.1:8080").await
}
async fn try_main(addr: impl ToSocketAddrs) -> Result<()> {
let stream = TcpStream::connect(addr).await?;
let (reader, mut writer) = stream.into_split();
let mut lines_from_server = BufReader::new(reader).lines(); // 2
let mut lines_from_stdin = BufReader::new(stdin()).lines(); // 3
loop {
tokio::select! { // 4
line = lines_from_server.next_line() => match line {
Ok(Some(line)) => {
println!("{}", line);
},
Ok(None) => break,
Err(e) => eprintln!("Error {:?}:", e),
},
line = lines_from_stdin.next_line() => match line {
Ok(Some(line)) => {
writer.write_all(line.as_bytes()).await?;
writer.write_all(b"\n").await?;
},
Ok(None) => break,
Err(e) => eprintln!("Error {:?}:", e),
}
}
}
Ok(())
}
- Here we split TcpStream into read and write halves.
- We create a stream of lines for the socket.
- We create a stream of lines for stdin.
- In the main select loop, we print the lines we receive from the server and send the lines we read from the console.
Verifying Data Structures with Kani
In this exercise we implement a remove_at
method for a linked list data structure and write a Kani harness to prove the correctness of the method.
Custom data structures (trees, graphs, etc.) in Rust often require the use of unsafe
code for efficient implementation.
Working with raw pointers means that Rust's borrow checker cannot guarantee the correctness of memory access, and Rust cannot reliably clean memory for us.
Getting code like this right is difficult, and without the Rust compiler helping us along the way we would have to rely on testing instead.
Today we will look at a linked list - the simplest data structure that relies on pointer manipulation.
The example below is very limited.
The unofficial Rust tutorial - Learn Rust With Entirely Too Many Linked Lists - explores the production-ready design of a linked list from the Rust Standard Library, while the Implementing Vec section of The Rustonomicon explores aspects of unsafe
Rust use for designing data structures in more detail.
The same principles apply to other, more complex data structures like trees, graphs, queues, etc.
After completing this exercise you will be able to
- set up Kani support for a project
- write Kani harnesses
- produce Kani playback tests
Tasks
- Create a new library project kani-linked-list, and copy the code from below.
- Set up Kani support for the project.
- Add a Kani proof for the remove_at method.
- If Kani discovers bugs in the method, then generate playback tests.
- Fix bugs in the code.
Starting code
This is a doubly-linked list with a relatively limited API.
You can push and pop elements from both ends of the list (push_front
, pop_front
, and push_back
, pop_back
respectively), you can access the ends of the list (front[_mut]
and back[_mut]
), read the list's length, and iterate over it.
The test at the bottom of the snippet demonstrates most of the methods available.
#![allow(unused)] fn main() { use std::{fmt::Debug, ptr::NonNull, vec::IntoIter}; type Link<T> = Option<NonNull<Node<T>>>; pub struct DoublyLinkedList<T> { first: Link<T>, last: Link<T>, len: usize, } struct Node<T> { prev: Link<T>, next: Link<T>, elem: T, } impl<T> Default for DoublyLinkedList<T> { fn default() -> Self { Self::new() } } impl<T> DoublyLinkedList<T> { pub fn new() -> Self { Self { first: None, last: None, len: 0, } } pub fn push_front(&mut self, elem: T) { unsafe { let mut new_first = NonNull::new_unchecked(Box::into_raw(Box::new(Node { prev: None, next: None, elem, }))); match self.first { Some(mut old_first) => { // rewire pointers old_first.as_mut().prev = Some(new_first); new_first.as_mut().next = Some(old_first); } None => { // make a list with a single element self.last = Some(new_first) } } self.first = Some(new_first); self.len += 1; } } pub fn push_back(&mut self, elem: T) { unsafe { let mut new_last = NonNull::new_unchecked(Box::into_raw(Box::new(Node { prev: None, next: None, elem, }))); match self.last { Some(mut old_last) => { // Put the new back before the old one old_last.as_mut().next = Some(new_last); new_last.as_mut().prev = Some(old_last); } None => { // make a list with a single element self.first = Some(new_last); } } self.last = Some(new_last); self.len += 1; } } pub fn pop_front(&mut self) -> Option<T> { let node = self.first?; unsafe { let node = Box::from_raw(node.as_ptr()); let elem = node.elem; self.first = node.next; match self.first { Some(mut new_first) => { new_first.as_mut().prev = None; } None => { self.last = None; } } self.len -= 1; Some(elem) } } pub fn pop_back(&mut self) -> Option<T> { let node = self.last?; unsafe { let node = Box::from_raw(node.as_ptr()); let elem = node.elem; self.last = node.prev; match self.last { Some(mut new_last) => { new_last.as_mut().prev = None; } None => { self.last = None; } } self.len -= 1; Some(elem) } } pub fn front(&self) -> Option<&T> { Some(unsafe { &self.first?.as_ref().elem }) } pub fn front_mut(&mut self) -> Option<&mut T> { Some(unsafe { &mut self.first?.as_mut().elem }) } pub fn back(&self) -> Option<&T> { Some(unsafe { &self.last?.as_ref().elem }) } pub fn back_mut(&mut self) -> Option<&mut T> { Some(unsafe { &mut self.last?.as_mut().elem }) } pub fn len(&self) -> usize { self.len } pub fn is_empty(&self) -> bool { self.len == 0 } pub fn iter(&self) -> IntoIter<&T> { self.into_iter() } pub fn remove_at(&mut self, index: usize) -> Option<T> { unsafe { // find an element to remove. // if `index` is too large this will return `None` early let to_remove = { let mut to_remove = self.first; for _ in 0..index { to_remove = to_remove?.as_mut().next; } Box::from_raw(to_remove?.as_ptr()) }; // connect previous and next elements together let mut prev = to_remove.prev?; let mut next = to_remove.next?; prev.as_mut().next = Some(next); next.as_mut().prev = Some(prev); Some(to_remove.elem) } } } impl<T> Drop for DoublyLinkedList<T> { fn drop(&mut self) { while self.pop_front().is_some() {} } } impl<T> IntoIterator for DoublyLinkedList<T> { type Item = T; type IntoIter = IntoIter<Self::Item>; fn into_iter(mut self) -> Self::IntoIter { // lazy implementation: we take all items from a linked list, put them // into a vector, and return Vec's iterator. 
let mut iter = vec![]; while let Some(elem) = self.pop_front() { iter.push(elem); } iter.into_iter() } } impl<'a, T> IntoIterator for &'a DoublyLinkedList<T> { type Item = &'a T; type IntoIter = IntoIter<Self::Item>; fn into_iter(self) -> Self::IntoIter { // lazy implementation: we take all items from a linked list, put them // into a vector, and return Vec's iterator. let mut iter: Vec<&'a T> = vec![]; let mut current = self.first; while let Some(node) = current.map(|n| unsafe { n.as_ref() }) { current = node.next; iter.push(&node.elem); } iter.into_iter() } } impl<T> FromIterator<T> for DoublyLinkedList<T> { fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self { let mut list = Self::new(); for item in iter { list.push_back(item); } list } } // Needed for testing impl<T> Debug for DoublyLinkedList<T> where T: Debug, { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { f.debug_list().entries(self).finish() } } // Needed for testing impl<T> PartialEq for DoublyLinkedList<T> where T: PartialEq, { fn eq(&self, other: &Self) -> bool { self.iter().eq(other.iter()) } } #[cfg(test)] mod tests { use crate::DoublyLinkedList; #[test] fn test_list_apis() { let mut list = DoublyLinkedList::new(); list.push_back(2); list.push_front(1); list.push_back(3); assert_eq!(format!("{list:?}"), "[1, 2, 3]"); let front = list.front(); assert_eq!(front, Some(&1)); let removed = list.remove_at(1); assert_eq!(removed, Some(2)); assert_eq!(format!("{list:?}"), "[1, 3]"); let non_existing = list.remove_at(100); assert_eq!(non_existing, None); assert_eq!(format!("{list:?}"), "[1, 3]"); } } mod proofs { use super::*; // TODO: write a proof for `DoublyLinkedList::remove_at` } }
Help
Setting up Kani
Setting up Cargo
# put this into Cargo.toml
[dev-dependencies]
kani-verifier = "0.56.0"
[dependencies]
# enables autocomplete and code inspections for `kani::*` api
kani = { version = "0.56", git = "https://github.com/model-checking/kani", tag = "kani-0.56.0", optional = true }
# removes warnings about unknown `cfg` attributes
[lints.rust]
unexpected_cfgs = { level = "warn", check-cfg = ['cfg(rust_analyzer)', 'cfg(kani)'] }
For VSCode this should be in the Rust Analyzer project settings (.vscode/settings.json):
{
"rust-analyzer.cargo.features": ["kani"]
}
Optional: You may decide to use nightly Rust if Rust Analyzer doesn't work as expected. Create rust-toolchain.toml in the project's root:
[toolchain]
channel = "nightly"
Writing Kani proofs
Code snippet to keep Rust Analyzer from showing macro errors for Kani
#[cfg_attr(not(rust_analyzer), cfg(kani))]
mod proofs {
use super::*;
#[cfg_attr(not(rust_analyzer), kani::proof)]
fn kani_harness() {
todo!();
}
#[test]
fn kani_concrete_playback_xxx() {
// playback code here
}
}
Other Kani help
Generating a linked list of random elements
You can make a list by making an array first using kani::any(). Then you can pass the array to a from_iter method.
const TOTAL: usize = 10;
let items: [u32; TOTAL] = kani::any();
let mut list = DoublyLinkedList::from_iter(items.iter().copied());
assert_eq!(list.len(), TOTAL);
You can use
kani::any_where()
to generate a value within a specific range.
let x: i32 = kani::any_where(|n| (1..=10).contains(n));
The exact limit is not important. But by making the list under test shorter, you can in turn safely lower the unwind limit (introduced below) while still letting the solver observe all possible states of the program and complete the verification.
If the proof takes a lot of time to run you can introduce an upper unwind limit for loops:
#[cfg_attr(not(rust_analyzer), kani::proof)]
#[cfg_attr(not(rust_analyzer), kani::unwind(20))]
fn long_running_proof() {
}
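Putting these pieces together, one possible shape of a harness is sketched below. It is a sketch only, assuming it sits next to the DoublyLinkedList definition; the constants, the harness name, and the exact properties asserted (returned element, length bookkeeping) are illustrative choices, not the only correct ones.

#[cfg_attr(not(rust_analyzer), cfg(kani))]
mod proofs {
    use super::*;

    #[cfg_attr(not(rust_analyzer), kani::proof)]
    #[cfg_attr(not(rust_analyzer), kani::unwind(10))]
    fn remove_at_keeps_list_consistent() {
        const TOTAL: usize = 4;
        // build a list from arbitrary elements
        let items: [u32; TOTAL] = kani::any();
        let mut list = DoublyLinkedList::from_iter(items.iter().copied());

        // pick an arbitrary index, including one just past the end
        let index: usize = kani::any_where(|i| *i <= TOTAL);

        let removed = list.remove_at(index);
        if index < TOTAL {
            // a successful removal returns the element and shrinks the list by one
            assert_eq!(removed, Some(items[index]));
            assert_eq!(list.len(), TOTAL - 1);
        } else {
            // out-of-range indices must leave the list untouched
            assert_eq!(removed, None);
            assert_eq!(list.len(), TOTAL);
        }
    }
}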