Course 1: Rust Language

Rust Introduction: Cargo, Crates, Rust project, Hello World (2h on computers)

Pierre Cochard, Tanguy Risset

This course has been set up for students of the Telecommunication Department at INSA-Lyon (5th year). It is largely inspired by the Rust book and many other resources on the web. It assumes that students have no programming experience in Rust but have strong programming experience in other languages (C/C++ and object-oriented languages in particular).

In addition to this document, you'll find other documents on Moodle presenting the concepts covered in this course. Don't forget to check them before you start. Much of the information listed here comes from https://www.rust-lang.org/learn/.

The course is organized in sections that contain questions. In addition, you will find boxed text labeled Course:, which presents important concepts.

Course: What is Rust and Why Rust

Why should computer science engineers always learn new languages?

Nowadays, Rust usage is growing quickly, and the number of libraries and projects using Rust, or useful to Rust developers, is already huge. The reason is that Rust provides safe memory management without a garbage collector, ensuring both performance and security. In safe code, its ownership system rules out data races and the memory errors that typically cause segmentation faults. It enables efficient concurrent programming while guaranteeing memory safety. It comes with a powerful modern ecosystem and is suited for embedded systems, system programming, and high-performance applications. Adopted by major industry players, Rust is emerging as a reliable alternative for secure, system-level development.

Setting up the environment for using Rust

We're going to start by setting up the environment that will enable you to program in Rust. We recommend that you use Rust on your own machine, but the environment is already installed on the department computers.

This environment simply consists of having:

  1. An editor for programming. We strongly recommend Visual Studio Code, which is available on all operating systems, for instance here: https://visualstudio.microsoft.com/fr/downloads/. See the Appendix below for an introduction to Visual Studio Code.

  2. The cargo command, installed with rustup.

cargo is the package manager and build system for Rust; it calls the Rust compiler, rustc, for you. The installation and updating of cargo and rustc are themselves managed by rustup.

Below is a summary for installing cargo on your laptop. The complete original instructions can be found here: https://doc.rust-lang.org/book/ch01-01-installation.html.

  • On Linux or macOS, use the following command:

    curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh

  • On Windows, go to https://www.rust-lang.org/tools/install and follow the instructions for installing Rust.

Course: What is cargo used for?

`cargo` is more than a front end to the compiler. It manages many aspects of a project that you need to understand to use Rust correctly:
  • Dependency Management: Cargo manages the dependencies of a Rust project. It automatically downloads and builds the required libraries and dependencies, making it easier for developers to include external code in their projects.

  • Project Configuration: Cargo uses a file called Cargo.toml to configure a Rust project. This file includes information about the project, its dependencies, and various settings.

  • Building and Compilation: Cargo handles the compilation process of Rust code. It can build the project, manage dependencies, and generate executable binaries. Developers can use Cargo commands like cargo build to compile the project or cargo run to build and run it in one step.

  • Testing: Cargo provides built-in support for testing Rust code. Developers can use the cargo test command to run tests defined in the project.

  • Documentation: Cargo can generate and serve documentation for the project using the cargo doc command. This is useful for both internal and external documentation (the HTML file that you are reading has been generated by cargo doc)

  • Publishing Packages: Cargo facilitates the process of publishing Rust packages to the official package registry, called "Crates.io." This makes it easy for others to discover and use Rust libraries and projects.
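To make the testing support listed above more concrete, here is a minimal sketch of what a unit test could look like in src/main.rs. The function add and the module tests are hypothetical names chosen only for illustration; with such a file, cargo run would execute main and cargo test would execute the function marked #[test].

```rust
// Hypothetical helper, used only to illustrate `cargo test`.
fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    println!("2 + 3 = {}", add(2, 3));
}

// This module is only compiled when running `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn add_works() {
        assert_eq!(add(2, 3), 5);
    }
}
```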

Rust Hello World

A Rust project is contained in a directory which has the name of the project. From now on, we suggest making a projects directory in your home directory and keeping all your Rust projects there.

We will use the cargo command to build our first hello_world project. Note that this is not mandatory: everything can be built by hand, and the Rust compiler can be invoked without cargo by using the rustc command.

Execute this command:
cargo new hello_world

Check the generated files. Cargo.toml is the project configuration file, written in the TOML (Tom's Obvious, Minimal Language) format (see footnote 1), and src/main.rs is the Rust "main" file. If anything is unclear, ask the teacher.

Build your project with the command:
cargo build
Where is the generated executable file? What is the bang (!) after println?

Course: What about println!

As you can check in the Rust Standard Library documentation, println! is not a function; it is a macro.

Macros are called with a trailing bang (such as println!). They are a way of writing code that writes other code, which is known as metaprogramming. Understanding and declaring macros is quite complex and will be seen later, but using them (such as using println!) is usually very easy.
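As an illustration of how easy macros are to use, the sketch below calls println! with different numbers of arguments, which a regular Rust function could not accept. It also uses format!, a standard macro with the same syntax that returns a String instead of printing. The function describe is a hypothetical helper introduced only for this example.

```rust
// Hypothetical helper: builds the same text that println! would print.
fn describe(name: &str, age: u32) -> String {
    // `{name}` captures the variable directly; `{}` takes the next argument.
    format!("{name} is {} years old", age)
}

fn main() {
    // The same macro accepts zero, one or several arguments after the format string.
    println!("no argument");
    println!("{} + {} = {}", 1, 2, 1 + 2);
    println!("{}", describe("Ada", 36));
}
```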

Variables, Types and Mutability

What is the problem with the program below? Does it compile?

let x = 5;
println!("The value of x is: {x}");
x = 6;
println!("The value of x is: {x}");

What are the values printed by the program below? (Of course, no one should write this kind of code.)

#![allow(unused)]
fn main() {
let x = 5;
{
    let x = x + 1;
    {
        let x = x * 2;
        println!("The value of x in the inner scope is: {x}");
    }
}
println!("The value of x is: {x}");
}

The notion of scope is quite important in Rust: a "scope" can be manipulated in the language as an object. We will see it in more detail in TD4.
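The two mechanisms used in this section can be contrasted in a short sketch. The helper functions with_mut and with_shadowing are hypothetical and only serve the illustration: the first one modifies a single variable declared with let mut, the second one declares a new variable with the same name inside a block.

```rust
// `let mut`: the same variable is modified in place.
fn with_mut() -> i32 {
    let mut total = 1;
    total += 10; // allowed because `total` is declared `mut`
    total
}

// Shadowing: each `let` creates a new variable that hides the previous one.
fn with_shadowing() -> (i32, i32) {
    let n = 3;
    let inner = {
        let n = n * 10; // a new `n`, visible only inside this block
        n
    };
    (inner, n) // the outer `n` was never modified
}

fn main() {
    println!("{}", with_mut());
    println!("{:?}", with_shadowing());
}
```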

Functions in Rust

Functions in Rust are similar to functions in other languages. They are declared with the keyword fn, and parameters are passed by value. An important specificity (borrowing values) will be studied in the next course.

Write a function fibonacci:
fibonacci(n: i32) -> i32
which computes element n of the Fibonacci sequence.

We will not use a recursive solution, but rather mutable variables and a for loop whose syntax will be: for i in 2..n+1 (a half-open range: the upper bound n+1 is excluded).

We recall the definition of the Fibonacci function fib:

fib(0)=1
fib(1)=1
fib(i)=fib(i-1)+fib(i-2)  for i >= 2
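To show the syntax of a for loop over a half-open range combined with a mutable variable, here is a small sketch on a different problem (it is deliberately not the fibonacci solution). The function sum_of_squares is a hypothetical example.

```rust
// Hypothetical example: computes 1*1 + 2*2 + ... + n*n.
fn sum_of_squares(n: i32) -> i32 {
    let mut sum = 0; // must be `mut` because it is updated in the loop
    for i in 1..n + 1 {
        // `1..n + 1` is half-open: i takes the values 1, 2, ..., n
        sum += i * i;
    }
    sum // the last expression, without `;`, is the returned value
}

fn main() {
    println!("{}", sum_of_squares(3)); // prints 14
}
```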

Generic types, traits and #derive directive

As with classes in C++, types can be defined and can implement different methods. For instance, the following code defines the type Complex as a struct of two floats (as with C++ classes, type names begin with an uppercase letter by convention).

#![allow(unused)]
fn main() {
struct Complex {
    re: f32,
    im: f32,
}

fn build_complex(re: f32, im: f32) -> Complex {
    Complex { re, im }
}

let mut a = build_complex(2.3, 4.0);
println!("a=({},{})", a.re, a.im);
a.re = a.re + 1.;
println!("a=({},{})", a.re, a.im);
}
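The code above builds a Complex with a free function. As a hedged sketch of how methods can be attached to the type itself, the example below uses an impl block. The names new and norm_sqr are choices made for this illustration, not part of the original program.

```rust
struct Complex {
    re: f32,
    im: f32,
}

impl Complex {
    // Associated function (no `self`): called as Complex::new(...).
    fn new(re: f32, im: f32) -> Complex {
        Complex { re, im }
    }

    // Method: called on a value, as in a.norm_sqr().
    fn norm_sqr(&self) -> f32 {
        self.re * self.re + self.im * self.im
    }
}

fn main() {
    let a = Complex::new(3.0, 4.0);
    println!("|a|^2 = {}", a.norm_sqr()); // prints |a|^2 = 25
}
```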

Course: Generic Types

As with templates in C++, Rust enables the use of generic types in the definitions of functions, structs, enums, and methods. Here is a simple definition of a Point structure that can use either integer or float coordinates:

#![allow(unused)]
fn main() {
struct Point<T> {
    x: T,
    y: T,
}

let integer = Point { x: 5, y: 10 };
let float = Point { x: 1.0, y: 4.0 };
}
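Methods can also be written once for every possible T. The sketch below assumes a hypothetical method swapped, added only for illustration; note the impl<T> syntax, which declares the generic parameter before it is used.

```rust
struct Point<T> {
    x: T,
    y: T,
}

// `impl<T>` declares the generic parameter used by the methods below.
impl<T> Point<T> {
    // Hypothetical method: exchanges the two coordinates, whatever their type.
    fn swapped(self) -> Point<T> {
        Point { x: self.y, y: self.x }
    }
}

fn main() {
    let p = Point { x: 5, y: 10 }.swapped();
    let q = Point { x: 1.0, y: 4.0 }.swapped();
    println!("({}, {}) ({}, {})", p.x, p.y, q.x, q.y);
}
```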

Modify the program creating Complex above so that it uses a struct Complex<T> based on a generic type.

Course: Traits: Defining common behaviour

A trait defines a functionality that a particular type has and can share with other types. Traits are similar to a feature often called interfaces in other languages.

Many natural methods can be defined for any, or at least many, types. For instance, the clone method (of the trait Clone) or the fmt method (of the trait std::fmt::Display from the Rust standard library), which println! relies on to print a value with {}. These methods are not defined by default when a new type is defined.

Traits are defined with the trait keyword. By convention, their names start with an uppercase letter, e.g. the trait Clone. A trait often defines a method with the same name in lowercase (here: clone()).
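Before looking at the standard Clone trait, here is a minimal sketch of a user-defined trait. The trait Area and the types Square and Rectangle are hypothetical, introduced only to show the trait and impl ... for ... syntax and how a generic function can require a trait.

```rust
// Hypothetical trait: any type that has an area.
trait Area {
    fn area(&self) -> f32;
}

struct Square {
    side: f32,
}

struct Rectangle {
    width: f32,
    height: f32,
}

impl Area for Square {
    fn area(&self) -> f32 {
        self.side * self.side
    }
}

impl Area for Rectangle {
    fn area(&self) -> f32 {
        self.width * self.height
    }
}

// Generic function accepting any two types that implement Area.
fn total_area<A: Area, B: Area>(a: &A, b: &B) -> f32 {
    a.area() + b.area()
}

fn main() {
    let s = Square { side: 2.0 };
    let r = Rectangle { width: 2.0, height: 3.0 };
    println!("{}", total_area(&s, &r)); // prints 10
}
```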

If you want to clone a Complex, you just have to write the implementation of the clone method:

impl Clone for Complex {
    fn clone(&self) -> Self{
        Complex{re: self.re, im: self.im}
    }
     // Now a.clone() can be used on Complex variables
}

Implement the Clone trait for the struct Complex<T> type defined before. You will have to use impl<T: Clone> to ensure that the generic type T implements the Clone trait.

By the way, do you know the difference between Copy and Clone?

Select your preferred answer:

  • Clone is a supertrait of Copy.

  • Copy is implicit, inexpensive, and cannot be re-implemented (memcpy). Clone is explicit, may be expensive, and may be re-implemented arbitrarily.

  • The main difference is that cloning is explicit. Implicit notation means move for a non-Copy type.
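The vocabulary used in these answers can be observed on two standard types: i32, which implements Copy, and String, which only implements Clone. The function duplicate below is a hypothetical helper written for this sketch.

```rust
// Returns the original and the duplicate of an i32 and of a String.
fn duplicate() -> (i32, i32, String, String) {
    let a = 5;
    let b = a; // implicit: i32 implements Copy, `a` stays usable

    let s = String::from("hello");
    let t = s.clone(); // explicit: String does not implement Copy

    (a, b, s, t) // all four variables are still valid here
}

fn main() {
    println!("{:?}", duplicate());
}
```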

Course: Deriving traits

For certain traits (Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default, etc.), the compiler is capable of providing basic implementations via the #[derive] attribute. In Rust, an attribute is metadata applied to code elements like functions, structs, modules, or crates; attributes are prefixed with # and enclosed in square brackets [].

These traits can still be manually implemented if a more complex behavior is required.

For instance, the Clone trait can be automatically derived for the Complex type:

#[derive(Clone)]
struct Complex {
    re: f32,
    im: f32,
}

[... no need to implement Clone ...]

let a = build_complex(2.3, 4.0);
let _c = a.clone();

[...]
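Several traits can be derived at once. The following sketch uses a hypothetical type Rgb (not part of the course programs) and shows what three derived traits provide: Default, Clone and PartialEq.

```rust
// Hypothetical type used only for illustration.
#[derive(Debug, Clone, PartialEq, Default)]
struct Rgb {
    r: u8,
    g: u8,
    b: u8,
}

fn main() {
    let black = Rgb::default(); // Default: every field gets its default value (0 for u8)
    let red = Rgb { r: 255, g: 0, b: 0 };
    let copy = red.clone(); // Clone: field-by-field duplicate

    println!("{}", red == copy); // PartialEq: prints true
    println!("{}", red == black); // prints false
}
```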

How could we use println! to display Complex variables? There are two methods; test both:

  1. Implement the std::fmt::Display trait for the type Complex. This will require you to:
  • use std::fmt
  • search for the prototype of the Display trait
  • use the macro write! to print the fields
  2. Derive the Debug trait (a formatting trait distinct from fmt::Display) and use the "{:?}" format.

First example of "Move" semantics: the cube

In this first example we define a very simple data structure, a 'Cube' with a single field c that indicates the size of the cube.

Write a Rust program that defines a structure Cube and prints its size.

Can you print the cube itself, i.e. can you write: println!("My cube: {}", Cube{c:0.5});? How can you make the cube printable?

Hint: You can implement the std::fmt::Display trait or derive the Debug trait (Display cannot be derived).

Define a variable x assigned to a given cube, print it, and then define a second variable y with let y = x;. Then print x again. What is the problem?

Course: Move semantics

In Rust, move semantics refers to the ownership transfer of data from one variable to another. Rust enforces a strict ownership model where each piece of data has a single owner at a time, and ownership can be transferred (or "moved") when a value is assigned to another variable or passed as an argument to a function.

By default, an assignment such as let y = x implies a transfer of ownership of the content of x to y. This ownership concept will be studied further in the next course. This limits side effects: modifying y does not modify x.

In order to duplicate the cube (as would be done in any other language), one has to clone it or to implement the Copy trait. Deriving the Copy trait for Cube changes the semantics of the assignment: the assignment is now a copy, not a move.
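A move also happens when a value is passed to a function. The sketch below illustrates this on a String rather than on the cube; the function consume is a hypothetical example. The commented-out line is the one the compiler would reject.

```rust
// Hypothetical function that takes ownership of its argument.
fn consume(s: String) -> usize {
    s.len()
} // `s` goes out of scope here and the string is freed

fn main() {
    let x = String::from("cube");
    let y = x.clone(); // explicit duplicate: `x` and `y` own separate data
    let n = consume(x); // `x` is moved into the function
    // println!("{x}"); // would not compile: `x` was moved
    println!("{n} {y}"); // `y` is unaffected by the move of `x`
}
```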

Modify your program by cloning x into y.

Introduction to Visual Studio Code

Visual Studio Code (often abbreviated as VS Code) is a cross-platform source code editor developed by Microsoft. It is compatible with Windows, macOS, and Linux, offering great flexibility to developers working in diverse environments. This lightweight yet powerful editor is designed to meet the needs of modern developers, providing a wide range of features.

Key Features of Visual Studio Code

Visual Studio Code stands out due to the following features:

  • Built-in support for multiple programming languages: VS Code supports a wide array of languages such as Python, JavaScript, C++, Java, and more, thanks to its extension system.

  • Extensions and customization: A vast library of extensions is available to add functionalities like debugging, version control, and language-specific tools.

  • Integrated debugger: VS Code provides an interactive debugging environment to simplify error correction in the code.

  • Version control integration: Seamless integration with Git and other version control systems allows developers to track code changes directly within the editor.

  • Integrated terminal: A terminal is available inside the editor, enabling command execution without leaving the application.

  • IntelliSense: This feature offers intelligent code completion and contextual suggestions based on syntax and variable types.

  • Cross-platform compatibility: VS Code works consistently on Windows, macOS, and Linux, ensuring a uniform user experience regardless of the operating system.

Thanks to its intuitive interface and powerful tools, Visual Studio Code has become one of the most popular editors among developers, whether they are beginners or experienced professionals. Its active community and frequent updates make it a reliable choice for addressing the evolving needs of software development.

How to use VS Code efficiently for Rust (TODO)

  • install it with apt on Linux, or go to https://code.visualstudio.com/

  • launch it on the project directory

  • add the Rust extension (left bar, small squares icon): search rust -> install rust-analyzer

  • go to the explorer

  • ctrl-shift-P for the list of commands

1

You can find more documentation about the TOML format here: https://toml.io/en/ or here in French: https://toml.io/fr/. However, it is probably not necessary: TOML is quite simple to understand.

Course 2: Rust Language

Ownership, borrowing, mutability, heap and stack in Rust (2h on computers)

Pierre Cochard, Tanguy Risset

course: Ownership (from https://doc.rust-lang.org/book/)

Ownership is a set of rules that govern how a Rust program manages memory. All programs have to manage the way they use a computer's memory while running. Some languages have garbage collection that regularly looks for no-longer-used memory as the program runs; in other languages, the programmer must explicitly allocate and free the memory.

Rust uses a third approach: memory is managed through a system of ownership with a set of rules that the compiler checks.

If any of the rules are violated, the program won't compile. None of the features of ownership will slow down your program while it's running.

Because ownership is a new concept for many programmers, it does take some time to get used to. When you understand ownership, you'll have a solid foundation for understanding the features that make Rust unique.

Here are the Ownership Rules:

  • Each value in Rust has an owner.

  • There can only be one owner at a time.

  • When the owner goes out of scope, the value will be dropped.

The scope notion is (for the moment) the same as in traditional languages such as C.
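The third rule can be observed by implementing the standard Drop trait, whose drop method is called automatically when the owner goes out of scope. This is only a sketch: the type Tracker, the counter DROPPED and the function drops_seen are hypothetical names, and the atomic counter is just a simple way to make the drop visible.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many Tracker values have been dropped (for illustration only).
static DROPPED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical type whose only purpose is to make drops observable.
struct Tracker;

impl Drop for Tracker {
    // Called automatically when the owner goes out of scope.
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

// Returns the counter value before and after an inner scope.
fn drops_seen() -> (usize, usize) {
    let before = DROPPED.load(Ordering::SeqCst);
    {
        let _t = Tracker; // `_t` owns the value
    } // end of scope: the value is dropped here
    let after = DROPPED.load(Ordering::SeqCst);
    (before, after)
}

fn main() {
    let (before, after) = drops_seen();
    println!("drops before: {before}, after: {after}");
}
```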

Move semantics and Copy semantics

As we have seen in the previous course, the following program compiles because:

  1. The default semantics of assignment for type Cube is move.

  2. But the derivation of the Copy trait turns it into copy semantics, hence x and y represent two different values.

    #[derive(Debug, Clone, Copy)]
    struct Cube {
        c: f32,
    }

    fn main() {
        let x = [Cube{c:0.5},Cube{c:0.75},Cube{c:1.0}];
        let y = x;
        println!("x is: {:?}", x);
        println!("y is: {:?}", y);
    }

Try this program:

    fn main() {
        let x = [(10,20),(30,40),(50,60)];
        let y = x;
        println!("x is: {:?}", x);
        println!("y is: {:?}", y);
    }

Why does it work?

References and borrowing

There is an alternative to moving or duplicating (i.e. cloning) a value: you can borrow it. Borrowing in Rust is done with the reference operator: ’&’.

In the original Cube program, which does not derive the Copy trait, create a reference to x by using let y = &x. Can you print x after that?

course: Reference in Rust

References in Rust are similar to references in other languages: a pointer to the same content. The difference is that, because of the strong static verifications performed by the compiler, a reference is always guaranteed to point to a valid value of a particular type for the life of that reference (see footnote 1).

References are created with the ’&’ operator. As in C, the opposite of referencing is dereferencing, which is accomplished with the dereference operator: ’*’. However, in practice, the ’*’ operator can often be omitted; this is called deref coercion or autoderef (it relies on the Deref trait, which is implemented for all references).

This autoderef applies in many common situations (method calls and field accesses in particular), but not when you assign a value through a mutable reference:

    let mut x = 10;
    let y = &mut x;

    *y = 20; //explicit dereferencing is required here

Borrowing is extremely useful in function calls. Each time you call a function with a parameter that does not have copy semantics, the ownership of the object passed as a parameter is transferred to the function (actually, it is transferred to the formal parameter of the function). If, instead, you pass a reference to the object, the ownership does not change, so you can call many functions that only read an object by passing them references.
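As a minimal sketch of borrowing in a function call, the hypothetical function byte_len below receives a reference, so the caller keeps ownership and can lend the same value several times.

```rust
// Hypothetical function: borrows the string, does not take ownership.
fn byte_len(s: &String) -> usize {
    s.len() // no explicit `*s` is needed for a method call
}

fn main() {
    let text = String::from("hello");
    let n1 = byte_len(&text);
    let n2 = byte_len(&text); // `text` can be lent again: ownership never moved
    println!("{text}: {n1} {n2}"); // prints hello: 5 5
}
```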

Mutable Reference

Sometimes, you wish to have a function call that modifies an object. For that, you can use a mutable reference with the syntax: let y = &mut x. Mutable references in Rust do not change ownership. They only provide exclusive access to a value for mutation while ensuring that the ownership of the value remains unchanged.

By using a mutable reference to x (let y = &mut x), write a function called double that doubles the size of your cube x.

course: Mutable reference

Ownership in Rust means having full control over a value (here a value is to be understood as an L-value, i.e. a value that is stored in a memory location). The owner is responsible for managing the value's lifetime (we will talk later about lifetimes) and cleaning up its resources when it goes out of scope. Ownership can be transferred (moved) but is unique at any given time (except in very special cases that we will see).

Borrowing (via references, either &T or &mut T) allows you to access a value without transferring ownership:

  • Immutable borrow (&T): grants read-only access to a value.

  • Mutable borrow (&mut T): grants exclusive read and write access to a value.

Rules of Mutable References:

  • You can only have one mutable reference to a value at a time.

  • While a mutable reference exists, no other references (mutable or immutable) to the same value are allowed.

It is important to understand that the Rust compiler evaluates very precisely the scope of variables and references.
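One simple way to respect these rules is to confine the mutable reference to an inner block, as in the sketch below. The functions increment and borrow_then_read are hypothetical examples working on an i32 instead of a cube.

```rust
// Hypothetical function modifying its argument through a mutable reference.
fn increment(n: &mut i32) {
    *n += 1; // explicit dereference is needed to assign through the reference
}

fn borrow_then_read() -> i32 {
    let mut counter = 0;
    {
        let r = &mut counter; // exclusive access starts here
        increment(r);
        increment(r);
    } // `r` goes out of scope: the exclusive borrow has ended
    counter // the owner can be used again
}

fn main() {
    println!("{}", borrow_then_read()); // prints 2
}
```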

Only one of the two programs below is correct. Which one, and why?

    #[derive(Debug)]
    struct Cube {
        c: f32,
    }

    fn double(y: &mut Cube) {
        y.c = 2. * y.c;
    }

    fn main() {
        let mut x = Cube { c: 0.75 };
        let y = &mut x;
        double(y);
        println!("My cube is: {:?}", x);
        println!("My cube is: {:?}", y);
    }

    #[derive(Debug)]
    struct Cube {
        c: f32,
    }

    fn double(y: &mut Cube) {
        y.c = 2. * y.c;
    }

    fn main() {
        let mut x = Cube { c: 0.75 };
        let y = &mut x;
        double(y);
        println!("My cube is: {:?}", y);
        println!("My cube is: {:?}", x);
    }

Heap and Stack: the String example

Many programming languages don't require you to think about the stack and the heap very often. But in a systems programming language like Rust, whether a value is on the stack or the heap affects how the language behaves and why you have to make certain decisions.

Section 6 recalls the basics that everyone should know about the heap and the stack; please read it if you are not very familiar with these concepts.

The following code manipulates a string that contains hello:

       let s1 = String::from("hello"); 
       let s2 = s1;

As you know, if s1 were set to an integer (say 5), then s2 would have been set to a copy of 5, because i32 has copy semantics by default. But here, s1 is bound to a String. We will study strings in more detail later, but this is a good example to understand the difference between the heap and the stack.

A String is made up of three parts, shown on the left of part (a) of the figure below (taken from the Rust book): a pointer to the memory that holds the contents of the string, a length, and a capacity. This group of data is stored on the stack. On the right is the memory on the heap that holds the contents. The reason for this split is that a string might contain an arbitrarily long sequence of characters, whereas the size used to store the structural information (i.e., pointer, length, and capacity) does not change from one string to another; it is known statically.

Figure: (a) Representation in memory of a String holding the value "hello" bound to `s1`. (b) Representation in memory of the variable `s2` that has a copy of the pointer, length, and capacity of `s1`.

When we assign s1 to s2, the String data is copied, meaning we copy the pointer, the length, and the capacity that are on the stack. We do not copy the data on the heap that the pointer refers to. In other words, the data representation in memory looks like part (b) of the figure above.

Note that the actual content of the string (i.e. the 'hello' characters) is not duplicated. Moreover, it can no longer be reached through s1: String has move semantics, so s1 is moved to s2 (the data is now owned by s2) (see footnote 2).
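The three parts of a String can be inspected with the standard methods as_ptr, len and capacity. The sketch below compares the heap address after a clone and after a move; same_buffer is a hypothetical helper, and the addresses printed will differ from one run to another.

```rust
// Returns true when the two strings point to the same heap buffer.
fn same_buffer(a: &String, b: &String) -> bool {
    a.as_ptr() == b.as_ptr()
}

fn main() {
    let s1 = String::from("hello");
    // The three parts stored on the stack: pointer, length, capacity.
    println!("ptr={:?} len={} cap={}", s1.as_ptr(), s1.len(), s1.capacity());

    let copy = s1.clone(); // deep copy: a new heap buffer is allocated
    println!("same buffer after clone? {}", same_buffer(&s1, &copy));

    let heap_before = s1.as_ptr();
    let s2 = s1; // move: only pointer, length and capacity are copied
    println!("same buffer after move? {}", heap_before == s2.as_ptr());
}
```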

Write a function fn append_word(s: &mut String) and call it with a mutable reference to s2. You can use the method pub fn push_str(&mut self, string: &str).

Smart Pointers

Smart pointers are inherited from other languages such as C++. Smart pointers are data structures that act like a pointer but also have additional metadata and capabilities. Rust has a variety of smart pointers defined in the standard library that provide functionality beyond that provided by references. To explore the general concept, we'll look at a couple of different examples of smart pointers, including a reference counting smart pointer type (Rc) and a unique pointer on the heap (Box).

The most straightforward smart pointer is a Box, whose type is written Box<T>. Boxes allow you to store data on the heap rather than on the stack, through a unique pointer. What remains on the stack is the pointer to the heap data. This is useful, for instance, to create recursive types.
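As a first contact with Box, the sketch below stores two integers on the heap and checks that a Box occupies only the size of a pointer on the stack, whatever the size of the boxed data. The function boxed_sum is a hypothetical example.

```rust
use std::mem::size_of;

fn boxed_sum() -> i32 {
    let a = Box::new(40); // the integer 40 is stored on the heap
    let b = Box::new(2);
    *a + *b // `*` reads the value behind the pointer
} // both boxes go out of scope here: their heap memory is freed

fn main() {
    println!("{}", boxed_sum()); // prints 42
    // A Box is only a pointer on the stack, however large the boxed data is.
    println!("{} bytes on the stack", size_of::<Box<[u8; 1024]>>());
    println!("{} bytes for a pointer-sized integer", size_of::<usize>());
}
```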

Create a type List based on the following structure: a list is either (the "either" corresponds to an enum) the constant Nil or the concatenation of an integer and a List: Cons(i32, List). Try first without using Box, then using Box.

You will have to declare the use of the created symbols after the definition of List by writing: use crate::List::{Cons,Nil};

In the majority of cases, ownership is clear: you know exactly which variable owns a given value. However, there are applications where a single value might have multiple "owners". For example, in graph data structures, multiple edges might point to the same node, and that node is conceptually owned by all of the edges that point to it. A node shouldn't be cleaned up unless it doesn't have any edges pointing to it and so has no owners.

You have to enable multiple ownership explicitly by using the Rust type Rc<T>, which is an abbreviation for reference counting. We use the Rc<T> type when we want to allocate some data on the heap for multiple parts of our program to read and we can't determine at compile time which part will finish using the data last. Note that Rc<T> is only for use in single-threaded scenarios; other constructs are used in multithreaded programs.
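The reference count can be observed with the standard function Rc::strong_count. The sketch below shares a String instead of a list; owners_count is a hypothetical helper returning the count at three points of the program.

```rust
use std::rc::Rc;

fn owners_count() -> (usize, usize, usize) {
    let first = Rc::new(String::from("shared"));
    let c1 = Rc::strong_count(&first); // one owner

    let second = Rc::clone(&first); // new owner of the same heap data (no deep copy)
    let c2 = Rc::strong_count(&first); // two owners

    drop(second); // one owner disappears
    let c3 = Rc::strong_count(&first); // the data is freed only when the count reaches 0

    (c1, c2, c3)
}

fn main() {
    println!("{:?}", owners_count()); // prints (1, 2, 1)
}
```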

Consider the scheme below where a list (a) is shared by two other lists (b and c). Write a program that creates this object by using Rc<T> instead of Box<T> in the List definition. (As Rc is not in the prelude, you have to write use std::rc::Rc;)

Example of "shared" List structure

Reminders on Heap and Stack

Although knowing the exact details of memory management is generally not necessary for a programmer, in many cases (system programming or embedded programming for instance, often done in Rust), it is crucial to understand how memory is handled by the compiler and the OS. From the programmer's point of view, and thanks to the virtual memory system, everything happens as if the program had all the memory available.

Memory management is more or less the same for every language and system; what differs is what is visible to the programmer: explicit memory management (malloc/free), garbage collection, etc. This memory is organized in different sections, almost always in the following way:

Figure: Representation of the memory as seen by the programmer

The "code" section contains the assembled (machine) code of the program. The "static" section contains all the "static" variables (i.e. variables that are available during the whole execution of the program). The two other sections are managed dynamically during execution:

  • The heap is used for dynamic memory allocation: malloc (in C) or new (in object languages). The objects stored in the heap have a lifetime that is independent of function execution: they can survive after the function that created them has finished. The heap can be managed explicitly (as in C with malloc and free) or implicitly (using a garbage collector, as in Python for instance).

  • The stack is used to manage the execution of functions (or procedures in general), which includes in particular the allocation and management of the functions' local variables.

The stack starts from high addresses and grows downward, although it is often represented upside-down as below: small addresses at the top, big addresses at the bottom. The heap grows upward; when the two bounds meet, the system is out of memory.

The stack execution principle is important to know. When a function is called, a space is allocated on the stack to store its local variables: this space is called the function frame. When the function ends, its frame is freed and the stack goes back to the frame of the calling function.

Below is an illustration of the evolution of the stack during a function call. Two registers of the processor are indicated: the stack pointer (SP), which indicates the top of the stack, and the frame pointer (FP), which indicates the beginning of the frame of the current function. The frame contains all the information needed for the execution of the function, including room for local variables.

         
Figure: Evolution of the stack during a function call: (a) before the call, (b) during the call, (c) after the call.

  1. Before the call, the frame pointer FP points to the frame of the calling function.

  2. During the call, the stack grows (i.e. SP is decreased, as the stack is upside-down) to make room for the frame of the called function. This includes room for the local variables of the function, the parameters given to the function, the information needed to return from the function (the return address in the code, because a given function can be called from many places in the code), room for the function result, as well as some bookkeeping information such as saved values of the processor registers.

  3. After the call, the frame of the called function has disappeared. Actually, its content is still there but cannot be accessed anymore, because the stack pointer SP has been put back to its location before the call.

Important to remember: The function variables whose size is known at compile time are usually stored on the stack. The variables whose size is only known during execution, such as a String or objects created by new, are usually stored on the heap.

1

This is a major difference between Rust and other languages: there are no "null pointers". Interestingly enough, the decision to allow null pointers was taken by Tony Hoare during the 60's; it is known as his "billion dollar mistake": https://news.ycombinator.com/item?id=12427069

2

It is important to know that the pointer used in a String has the "Unique<T>" type, which forbids the object pointed to by this pointer from having two owners at the same time. Hence the String type cannot have copy semantics.

Course 2: Rust Language

Advanced types: compound and collection types

Pierre Cochard, Tanguy Risset

Compound types

Compound types can group multiple values into one type. Rust has two primitive compound types: tuples and arrays.

Course: Tuples

'Tuples' are fixed-size collections of values of arbitrary types. They are defined with the (Type, Type, ...) syntax:

#![allow(unused)]
fn main() {
// Explicit type:
let mut tup: (i32, i32, f32, &str) = (31, 16, 47.27, "hello!"); 

// Inferred type:
let mut tup = (31, 16, 47.27, "hello!"); 
}

Accessing individual values within a tuple can be done by either:

  • referring to its index
  • destructuring the tuple and binding its values to individually-named variables:
#![allow(unused)]
fn main() {
let mut tup = (31, 16, 47.27, "hello!"); 

// Access by index:
tup.0 = 22;
tup.3 = "world!";

// 'Destructuring' a tuple:
let (t0, t1, t2, t3) = tup;
println!("y = ({t1}, {t2})");
}

Take a mutable reference to the third element (f32) of the tuple tup and pass it to a function that multiplies it by 2:

let mut tup = (31, 16, 47.27, "hello!"); 

// The function's prototype (to be implemented):
fn mul2(t: &mut f32);

mul2(...);

assert!(tup == (31, 16, 94.54, "hello!"));

Tuples can be conveniently used in a function in order to return multiple values, which can then be assigned to distinct variables in a single expression:

#![allow(unused)]
fn main() {
// A function returning a pair of signed integers:
fn return_tuple(x: i32) -> (i32, i32) {
    return (x+1, x+2);
}

// Calling a function, and storing its result:
let y: (i32, i32) = return_tuple(8);
println!("y = {:?}", y);
println!("y = ({:?}, {:?})", y.0, y.1);

// Passing a tuple as an argument to a function:
fn print_tuple(x: &(i32, i32)) {
    println!("x = ({:?}, {:?})", x.0, x.1);
}

print_tuple(&y);
}

Write a function that transforms a (i32, i32) tuple by swapping its two values:

#![allow(unused)]
fn main() {
// The function prototype to be implemented:
fn swap(tup: &mut(i32, i32));

let mut x = (31i32, 27i32);
swap(&mut x);

assert!(x == (27i32, 31i32));
}

Course: Arrays (primitive type)

Arrays are fixed-size groups of values of the same type, and can be defined in Rust with the syntax:

  • [Subtype; Length], for instance [i32; 10]
#![allow(unused)]
fn main() {
// Explicit type:
let a: [i32; 3] = [31, 16, 47];

// Inferred type:
let b = [0, 1, 2, 3, 4]; // [i32; 5]

// Create and zero-initialize an array:
let mut a: [usize; 10] = [0; 10];
// same as:
let mut a = [0 as usize; 10];
// same as:
let mut a = [0usize; 10];

// Writing at a specific index:
// Note: in Rust, as in C, array indices start at 0
a[0] = 2;
println!("a[0] = {}", a[0]);
}

As in most programming languages, multidimensional/nested arrays are also supported in Rust, and can be declared as follows:

#![allow(unused)]
fn main() {
// 2-dimensional array, 2 arrays of `i32` with a length of 10 each:
let mut multi_array = [[0 as i32; 10]; 2];
}

What would be the type of the following arrays?

#![allow(unused)]
fn main() {
let a1 = [(1, 2), (3, 4), (5, 6)];
let a2 = [(1, 2), (3, 4), (5, (6, 7))];
}

Course: Ranges & Iterators

Arrays are convenient for storing and processing a set of contiguous data on the stack, for instance through the use of loops, ranges and iterators.

A range represents an interval of values between a start and an end point. In Rust, ranges can be conveniently written with the start..end construct (which excludes the end value), or with start..=end (which includes the end value).
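
One way to see the difference between the two constructs is to collect a range into a vector. This is only a sketch: the exclusive and inclusive helpers are hypothetical names introduced for the illustration.

```rust
// Hypothetical helpers: collect a range into a vector to inspect its values.
fn exclusive(start: i32, end: i32) -> Vec<i32> {
    (start..end).collect()
}

fn inclusive(start: i32, end: i32) -> Vec<i32> {
    (start..=end).collect()
}

fn main() {
    println!("{:?}", exclusive(2, 5)); // the end value 5 is excluded
    println!("{:?}", inclusive(2, 5)); // the end value 5 is included
}
```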

Examine the following assert! statements, will this program compile?

let a = 0..10;
let b = 1..=10;

// 'a' range:
assert!(a.contains(&0));
assert!(!a.contains(&10));

// 'b' range:
assert!(!b.contains(&0));
assert!(b.contains(&10)); 

// The 'a' and 'b' ranges have the same number of elements:
assert_eq!(a.count(), b.count());

Course: Iterators

Iterators allow you to go through an array, a range or a collection, and access each element one by one.

#![allow(unused)]
fn main() {
let r = 0..10;
// Iterate over a range:
for n in r.into_iter() {
    print!("{n} ");
    // -> 0 1 2 3 4 5 6 7 8 9
}
println!();

// Iterate over an array:
let mut a = [0; 10];

// Basic for-loop iteration:
for x in a {
    println!("{x}");
}
// From a 'range':
for n in 0..a.len() {
    println!("{}", a[n]);
}
// As mutable, changing the values of the array:
for x in &mut a {
    *x += 1;
}
// Equivalent to (using a 'closure'):
a.iter_mut().for_each(|x| *x += 1 );

// Iterate with both element and index:
for (i, x) in a.iter_mut().enumerate() {
    *x += i;
}
}

Using ranges and/or iterators, fill the first sub-array of the following multidimensional array with values increasing from 1 to 10, and the second sub-array with values decreasing from 10 to 1, as shown below:

#![allow(unused)]
fn main() {
let mut multi_array = [[0 as i32; 10]; 2];

// The following must be true:
assert_eq!(multi_array, [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
]);
}

Collections

In addition to primitive compound types, the Rust standard library includes a number of very useful data structures called collections. Unlike the built-in array and tuple types, the data these collections point to is stored on the heap, which means the amount of data does not need to be known at compile time and can grow or shrink as the program runs.

Course: Vectors

'Vectors' are a collection of multiple values of a same type stored on the heap. Unlike arrays, they have a dynamic size: they can grow, or shrink.

A Vec object has ownership over the data located in its underlying heap-allocated buffer, which means that the buffer will be deallocated whenever the owning object goes out of scope.

#![allow(unused)]
fn main() {
// The easiest way to create a vector is to use the 'vec!()' macro:
let mut v = vec![0, 1, 2, 3, 4, 5]; // Vec<i32>
println!("Value: {:?}", v);

// 'Pushing' (appending) a new value at the end:
v.push(6);
println!("Value: {:?}", v);

// 'Popping' (removing) its last value:
let last = v.pop();
println!("Last value: {:?}, Vector: {:?}", last, v);
}
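
Indexing a vector with v[i] panics when i is out of bounds. As a minimal sketch (the element_or_default helper is hypothetical), the get method returns an Option instead, which lets the caller handle the missing case:

```rust
// Hypothetical helper: read an element without risking a panic.
fn element_or_default(v: &[i32], index: usize) -> i32 {
    // 'get' returns Some(&value) when the index is in bounds, None otherwise:
    match v.get(index) {
        Some(value) => *value,
        None => 0,
    }
}

fn main() {
    let mut v = vec![10, 20, 30];
    v.push(40);
    println!("len = {}", v.len());
    println!("{}", element_or_default(&v, 1));
    println!("{}", element_or_default(&v, 99));
}
```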

Examine the following v0, v1 and v2 vectors and their underlying heap buffer pointers.

#![allow(unused)]
fn main() {
let mut v0: Vec<i32> = vec![0, 1, 2, 3, 4];
// get a pointer to the underlying heap memory buffer:
let v0_ptr: *const i32 = v0.as_ptr();

// Create another vec 'v1' from 'v0', and get its heap pointer again:
let mut v1: Vec<i32> = v0;
let v1_ptr = v1.as_ptr();

// Create another vec 'v2' from 'v1':
let mut v2: Vec<i32> = v1.clone();
let v2_ptr = v2.as_ptr();
}

Which of the following assertions are true:

#![allow(unused)]
fn main() {
// Assertion A: the address of pointer 'v0' is the same as pointer 'v1'
assert_eq!(v0_ptr.addr(), v1_ptr.addr());

// Assertion B: the address of pointer 'v1' is the same as pointer 'v2'
assert_eq!(v1_ptr.addr(), v2_ptr.addr());

// Assertion C: the address of pointer 'v0' is the same as pointer 'v2'
assert_eq!(v0_ptr.addr(), v2_ptr.addr());
}

Iterating over a vector is the exact same process as for an array (most operations are inter-compatible!).

#![allow(unused)]
fn main() {
// Initializing from a range and iterator:
let mut v = Vec::from_iter((0..6).map(|i| i+1 ));
println!("Value: {:?}", v);

// Iterate/increment:
for x in &mut v {
    *x += 1;
}
println!("Value: {:?}", v);
// General operations:
v.rotate_left(1);
println!("Value: {:?}", v);
// etc.
}

Using a single loop, move the contents of vector v into array a, so that v ends up equal to vec![] (an empty vector) and a is equal to [5, 4, 3, 2, 1, 0]:

#![allow(unused)]
fn main() {
let mut v = vec![0, 1, 2, 3, 4, 5];
let mut a = [0; 6];

(...)

assert_eq!(v, vec![]);
assert_eq!(a, [5, 4, 3, 2, 1, 0]);
}

Course: Hash-maps

HashMaps are heap-allocated collections of same-type values indexed by a unique key. Like vectors, they can grow or shrink. They make a convenient choice for representing indexes, dictionaries, or any other type of database-like objects:

#![allow(unused)]
fn main() {
// Unlike Vec, the HashMap data structure needs to be explicitly imported!
use std::collections::HashMap;

// Inferred type:
let mut departments = HashMap::new(); // HashMap<i32, &str>
departments.insert(85, "Vendée");
departments.insert(31, "Haute-Garonne");
departments.insert(44, "Loire-Atlantique");

// Indexing takes a reference to the key (here '&31') and gives access to the
// stored value. It panics if the key is absent: use '.get(&31)' to get an 'Option' instead.
let d31 = departments[&31];
assert_eq!(d31, "Haute-Garonne");

// Removing a key:
departments.remove(&85);

// Iterating over all entries (this consumes the map):
for department in departments {
    // We get a tuple!
    println!("Key: {}, Value: {}", department.0, department.1);
}
}
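
Since indexing with [..] panics on a missing key, lookups are often written with get, which returns an Option. The sketch below assumes a hypothetical department_name helper; it also iterates over a reference to the map, which leaves the map usable afterwards:

```rust
use std::collections::HashMap;

// Hypothetical helper: look up a department name without panicking on a missing key.
fn department_name(map: &HashMap<i32, &'static str>, code: i32) -> &'static str {
    // 'get' returns an Option<&V>; 'copied' turns Option<&&str> into Option<&str>:
    map.get(&code).copied().unwrap_or("unknown")
}

fn main() {
    let mut departments = HashMap::new();
    departments.insert(31, "Haute-Garonne");
    departments.insert(44, "Loire-Atlantique");

    println!("{}", department_name(&departments, 31));
    println!("{}", department_name(&departments, 99));

    // Iterating over a reference does not consume the map:
    for (code, name) in &departments {
        println!("{code}: {name}");
    }
    println!("{} entries", departments.len());
}
```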

Move the contents of the following Vec object into a BTreeMap (which behaves the same as a HashMap, but will sort its contents by key) in order to get these athlete names sorted by their score in points.

Note: some of them have the same score; their names should appear under the same key.

#![allow(unused)]
fn main() {
use std::collections::BTreeMap;

let vec = vec![
    ("Y. Horigome", 281),
    ("N. Huston", 279),
    ("M. Dell", 153),
    ("J. Eaton", 281),
    ("S. Shirai", 278),
    ("K. Hoefler", 270),
    ("C. Russell", 211),
    ("R. Tury", 273),
];

let mut map = BTreeMap::new();

[...]

for score in map {
    println!("{:?}", score);
}
}

The last for loop should print:

(153, ["M. Dell"])
(211, ["C. Russell"])
(270, ["K. Hoefler"])
(273, ["R. Tury"])
(278, ["S. Shirai"])
(279, ["N. Huston"])
(281, ["Y. Horigome", "J. Eaton"])

'string' types (str and String)

Course: str primitive

The str primitive type can be used to represent a string literal:

#![allow(unused)]
fn main() {
// String literal:
let s = "Hello, World!";
}

As a literal, a str has a static lifetime which can be also explicitly stated in its type declaration. A static lifetime means that the object is valid throughout the entire duration of the program.

#![allow(unused)]
fn main() {
// Here, the three syntaxes are equivalent:
let s = "Hello, World!"; // Inferred type & lifetime
let s: &str = "Hello, World!"; // Explicit type, inferred lifetime
let s: &'static str = "Hello, World!"; // Explicit type & lifetime
}

Unlike const char* in the C programming language, &str in Rust is not null-terminated, but relies on a slice, which is composed of a pointer and a size in bytes:

#![allow(unused)]
fn main() {
let s = "Hello, World!";
println!("Pointer: {:?}, Length: {} bytes", s.as_ptr(), s.len());

for (n, char) in s.chars().enumerate() {
    println!("Char {n}: {char}");
}
}
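
Because the length of a &str is a size in bytes, it can differ from the number of characters when the text is not pure ASCII (Rust strings are UTF-8 encoded). A small sketch, using a hypothetical byte_and_char_len helper:

```rust
// Hypothetical helper: returns (size in bytes, number of characters).
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // ASCII only: one byte per character.
    println!("{:?}", byte_and_char_len("Hello"));
    // '\u{e9}' is the letter 'é', which is encoded on two bytes in UTF-8:
    println!("{:?}", byte_and_char_len("Vend\u{e9}e"));
}
```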

For safety reasons, Rust doesn't allow modifying the actual contents (the characters) of a &str, thus the following does not compile:

#![allow(unused)]
fn main() {
let s: &mut str = "Hello, World!";
}

Run the following code:

#![allow(unused)]
fn main() {
let s1 = "It's not about the bunny      \t"; 

// Remove leading/trailing whitespace, tabs and newlines from 's1': 
let s2 = s1.trim();
println!("{s1}");
println!("Address: {:?}, Length: {}", s1.as_ptr(), s1.len());
println!();
println!("{s2}");
println!("Address: {:?}, Length: {}", s2.as_ptr(), s2.len());
}
  • Since Rust forbids modifying the contents of a str literal, why are we in this case allowed to use the .trim() function? What is truly happening in this code?
  • What would happen if we modified s1 as follows?
#![allow(unused)]
fn main() {
let s1 = "     It's not about the bunny      \t"; 
let s2 = s1.trim();
}

Course: String

A String, on the other hand, is a standard library collection type that can basically be seen as a growable, heap-allocated buffer of UTF-8 text (internally a Vec<u8>, rather than a vector of char). Just like a Vec, it can grow, shrink, and has ownership over its own underlying buffer, which makes it an easier object to manipulate. All of the str methods are available on a String (it dereferences to str), but it does not have a static lifetime.

#![allow(unused)]
fn main() {
// Create from a string literal:
let mut s = String::from("Owls are not what they seem");

// Append a 'char':
s.push('!');
println!("Value: {:?}", s);

// Append another string:
s.push_str(" Really?");
println!("Value: {:?}", s);

// Other ways of appending to the String:
s = s + " Yes, ";
s += "really!";

// Iterate over every 'char'
for c in s.chars() {
    print!("{c} ");
}
println!("");

// Example of transformation:
s = s.chars().rev().collect();

println!("Value: {:?}", s);
}

Examine the following code:

#![allow(unused)]
fn main() {
let s0: &str = "That gum you like is going to come back in style";

// Build a 'String' object from the previous '&str':
let mut s1: String = String::from(s0);

// Now modify 's1':
s1 = s1.to_ascii_uppercase();
println!("{s1}");

}

Is s0 still accessible? If yes, what is now its value? Is it the same as s1 and why?

We now call the as_str() method on s1, and store the resulting &str value in a new variable called s2. Can you guess what the lifetime of s2 is?

#![allow(unused)]
fn main() {
let s0: &str = "That gum you like is going to come back in style";

// Build a 'String' object from the previous '&str':
let mut s1: String = String::from(s0);
let s2: &str = s1.as_str();
}

Slices

A slice in Rust can be considered as a bounded pointer or reference to a contiguous sequence of elements in an array, a collection, or a string of characters, as we saw earlier. Its type is written &[T]. Since it works like a reference, it does not have ownership over its contents.

#![allow(unused)]
fn main() {
// Create a byte buffer:
let mut buffer = [0 as u8; 16];

// Get a slice on half the buffer:
// (notice how the slice itself is not mutable, 
// but instead points to a mutable sequence in the buffer)
let slice: &mut[u8] = &mut buffer[0..8]; // we use the 'range syntax' here to capture the slice

// Iterate on the slice to change values:
for (i, n) in slice.iter_mut().enumerate() {
    *n = i as u8;
}
println!("{:?}", buffer);
}
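
Since a slice only borrows a contiguous sequence, a function taking a &[T] parameter accepts arrays, vectors and sub-ranges of both. A minimal sketch, with a hypothetical count_even helper:

```rust
// Hypothetical helper: works on any contiguous sequence of i32 values.
fn count_even(values: &[i32]) -> usize {
    let mut count = 0;
    for v in values {
        if *v % 2 == 0 {
            count += 1;
        }
    }
    count
}

fn main() {
    let array = [1, 2, 3, 4];
    let vector = vec![10, 20, 30];

    println!("{}", count_even(&array));       // the whole array
    println!("{}", count_even(&array[1..3])); // elements at indices 1 and 2
    println!("{}", count_even(&vector));      // a &Vec<i32> coerces to &[i32]
    println!("{}", count_even(&vector[..2])); // the first two elements
}
```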

Take a slice out of the string object, starting from character 25 until the end, and use the .make_ascii_lowercase() method on the captured slice.

#![allow(unused)]
fn main() {
let mut string = String::from("YOU REMIND ME TODAY OF A SMALL MEXICAN CHIHUAHUA");

let slice = ...;

slice.make_ascii_lowercase();

println!("{string}");
}

What is the inferred type of the slice variable?

Course 1: Rust Language

Struct, enums, Traits and Object definitions

Pierre Cochard, Tanguy Risset

Struct in Rust

Course: Structs (from https://doc.rust-lang.org/book/)

We have already seen the Struct concepts: Structs are similar to tuples in that both hold multiple related values. Unlike with tuples, in a struct you’ll name each piece of data so it’s clear what the values mean. Adding these names means that structs are more flexible than tuples: you don’t have to rely on the order of the data to specify or access the values of an instance.

Here is an example of Struct definition:

struct User {
    active: bool,
    username: String,
    email: String,
    sign_in_count: u64,
}

We create an instance by stating the name of the struct and then adding curly brackets containing key: value pairs, for example:

    let user1 = User {
        active: true,
        username: String::from("someusername123"),
        email: String::from("someone@example.com"),
        sign_in_count: 1,
    };

To get a specific value from a struct, we use dot notation. For example, to access this user’s email address, we use user1.email. When creating an instance from another instance, the .. syntax specifies that the remaining fields not explicitly set should have the same values as the fields in the given instance:

let user2 = User {
    email: String::from("another@example.com"),
    ..user1
};

Unit structs (i.e. structs without fields) and tuple structs (i.e. structs whose fields have no names) are special Rust features that might be useful in some cases (see https://practice.course.rs/compound-types/struct.html).
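
As a brief illustration of these two forms (the Rgb and Marker types and the brightness helper are hypothetical names, chosen only for this sketch):

```rust
// A tuple struct: fields are accessed by position, like a tuple.
struct Rgb(u8, u8, u8);

// A unit struct: no fields at all; mostly useful as a marker type.
struct Marker;

// Hypothetical helper: average of the three color channels.
fn brightness(color: &Rgb) -> u8 {
    ((color.0 as u16 + color.1 as u16 + color.2 as u16) / 3) as u8
}

fn main() {
    let orange = Rgb(255, 165, 0);
    let _marker = Marker;
    println!("brightness = {}", brightness(&orange));
    // A unit struct occupies no memory:
    println!("size of Marker = {}", std::mem::size_of::<Marker>());
}
```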

Consider the following struct definition (replacing String with the string slice type &str):

struct User {
    active: bool,
    username: &str,
    email: &str,
    sign_in_count: u64,
}

Try to create an instance of this structure in a main program. This will not compile: the struct definition itself is rejected, because the lifetime of the &str references is not specified (it depends on the lifetime of the string data they point to). Every reference stored in a struct needs an explicit lifetime parameter.

Try to solve the problem by adding a lifetime to your definition, using the compiler message as a guide.

Methods in Rust

As in any object-oriented language, methods are defined within the context of a struct, making a struct a regular object as defined by the object-oriented programming paradigm. (In Rust, methods can also be defined in the context of an enum or a trait.) Methods are defined with the fn keyword, like functions, but inside an impl block (e.g. impl MyStruct { fn [...] }), which specifies that they are only defined in the context of a particular type.

For a method, the first parameter is always self in one of its forms: &self (read-only borrow), &mut self (mutable borrow), or self (if the method needs to take ownership of the instance). &self is a shortcut for self: &Self, where Self is the type the impl block applies to.

For example, if we want to define a method name() for the previous structure User (the first one, with String fields) in order to obtain the username of a User, we will use the following syntax:

impl User {
    fn name(&self)->&String{
        &self.username
    }
}

When calling this method, the self argument is never written, it is implicit (e.g. user1.name()). The above method is often called a getter. Getters are not implemented by default for Rust structs; it is a good habit to name a getter after the field it returns (hence we should have called this method username()).
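
The different forms of the self parameter can be compared on a small example. This is only a sketch, and the Counter type is a hypothetical one introduced for the illustration:

```rust
struct Counter {
    count: u32,
}

impl Counter {
    // Associated function (no 'self'): called as 'Counter::new()'.
    fn new() -> Self {
        Counter { count: 0 }
    }

    // '&self': read-only access to the instance.
    fn count(&self) -> u32 {
        self.count
    }

    // '&mut self': the method may modify the instance.
    fn increment(&mut self) {
        self.count += 1;
    }

    // 'self': the method takes ownership; the instance cannot be used afterwards.
    fn into_count(self) -> u32 {
        self.count
    }
}

fn main() {
    let mut c = Counter::new();
    c.increment();
    c.increment();
    println!("count = {}", c.count());
    let final_count = c.into_count();
    println!("final = {final_count}");
}
```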

Define a struct Rectangle with two integer fields width and height. Then implement two methods for Rectangle: area(&self) (which computes the surface of the rectangle) and fits_in(&self, other_rect: &Rectangle), which indicates whether self fits completely inside the other_rect rectangle.

Enum and pattern matching

Course: Enum and option Enum (from https://doc.rust-lang.org/book/)

As in C, an enum gives you a way of saying that a value is one of a possible set of values. For instance, an IP address can be either an IPv4 address or an IPv6 address, but not both at the same time:

enum IpAddrKind {
    V4,
    V6,
}

This allows the two possible kinds to be stored in "the same" memory location, given that they will never be active together in one instance.

A new feature compared to C is that we can put data directly into each enum variant:

enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

let home = IpAddr::V4(127, 0, 0, 1);

let loopback = IpAddr::V6(String::from("::1"));

Rust has an extremely powerful control flow construct called match that allows you to compare a value against a series of patterns and then execute code based on which pattern matches. Pattern matching is an important area of computer science and compilation; we just show a very simple example here:

enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter,
}

fn value_in_cents(coin: Coin) -> u8 {
    match coin {
        Coin::Penny => 1,
        Coin::Nickel => 5,
        Coin::Dime => 10,
        Coin::Quarter => 25,
    }
}
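
When a variant carries data, a match arm can also bind that data to variables. The following sketch reuses the IpAddr enum shown earlier; the describe function is a hypothetical helper:

```rust
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

// Hypothetical helper: formats an address by matching on its variant.
fn describe(addr: &IpAddr) -> String {
    match addr {
        // The four bytes are bound to 'a', 'b', 'c' and 'd':
        IpAddr::V4(a, b, c, d) => format!("IPv4 {a}.{b}.{c}.{d}"),
        // The inner String is bound to 's':
        IpAddr::V6(s) => format!("IPv6 {s}"),
    }
}

fn main() {
    let home = IpAddr::V4(127, 0, 0, 1);
    let loopback = IpAddr::V6(String::from("::1"));
    println!("{}", describe(&home));
    println!("{}", describe(&loopback));
}
```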

The Option Enum

As explained in TD1, the null value was invented to represent the absence of a value, and it turned out to be a bad invention. With the Option enum, Rust provides a mechanism that explicitly distinguishes the cases where a variable has a value from those where it has none. Rust defines (in the standard library, in the prelude) a particular Option<T> enum:

enum Option<T> {
    None,
    Some(T),
}

The <T> syntax is a feature of Rust called a generic type parameter (similar to template in C++) that will be explained hereafter. Write a function that computes the square root of a f32 number and return None if the number is negative.

Of course, a Some(T) value cannot be "cast" or "simplified" into a T: the T value has to be extracted from the option, for instance by unwrapping it. unwrap will be explained in the next course (for the Result type); here, it returns the T value, or panics in the case of None. The Option<T> enum has a large number of methods that are useful in a variety of situations (see https://doc.rust-lang.org/std/option/enum.Option.html).
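
As a sketch of how an Option can be handled without risking a panic (the first_even helper is hypothetical), both cases can be treated explicitly with match, or a fallback value can be supplied with unwrap_or:

```rust
// Hypothetical helper: returns the first even value of a slice, if any.
fn first_even(values: &[i32]) -> Option<i32> {
    for v in values {
        if *v % 2 == 0 {
            return Some(*v);
        }
    }
    None
}

fn main() {
    // Handle both cases explicitly with 'match':
    match first_even(&[1, 3, 4, 5]) {
        Some(v) => println!("Found {v}"),
        None => println!("No even value"),
    }

    // Provide a fallback value instead of panicking:
    let v = first_even(&[1, 3, 5]).unwrap_or(-1);
    println!("{v}");
}
```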

Try to unwrap (i.e. apply .unwrap() to) the result of your square root function, for positive and negative numbers.

Generic Types

Every programming language has tools for effectively handling the duplication of concepts. In Rust, one such tool is generics. Generic types, traits and lifetimes are three kinds of generics.

Generic data types can be used in the definition of functions, structs or enums. They are very close to the template concept in C++.
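
A minimal sketch of a generic struct and a generic function follows; the Labeled type and the make_labeled function are hypothetical names used only for this illustration:

```rust
// A generic struct: the same definition works for any type T.
struct Labeled<T> {
    label: String,
    value: T,
}

// A generic function: T is chosen by the caller (or inferred by the compiler).
fn make_labeled<T>(label: &str, value: T) -> Labeled<T> {
    Labeled { label: String::from(label), value }
}

fn main() {
    let a = make_labeled("answer", 42);      // Labeled<i32>
    let b = make_labeled("ratio", 2.5_f32);  // Labeled<f32>
    println!("{} = {}", a.label, a.value);
    println!("{} = {}", b.label, b.value);
}
```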

Write a function that finds the minimum of a vector, whether it is a vector of i32, f32, characters, or any other type whose elements can be compared. Use the following function signature: fn smallest<T: PartialOrd>(v: &[T]) -> &T

Define a structure Point with two fields x and y that can be integers or floating-point numbers. Is Point{x:2,y:2.3} a valid instance of the structure Point?

Traits: Defining Shared Behavior

A trait defines the functionality a particular type has and can share with other types. We have already seen some very common traits: Clone or Debug. The trait Clone, for instance, represents the fact that a value of a type can be duplicated (i.e. cloned) into a second instance of the type, identical to the original but located at a different memory location. Traits are similar to a feature often called interfaces in other languages, although with some differences.

Defining a trait consists in defining the methods we can call on that type, using the keyword trait. In general, trait names should be written in UpperCamelCase (e.g. IsEven) and trait methods should be given snake_case names (e.g. is_even(&self)). Implementing a trait for a particular type is done using impl name_of_trait for name_of_type {[...]}.

Define a trait IsEven that is composed of the method is_even(&self). Implement the trait for the Rectangle type defined previously (a Rectangle is even if both height and width are even).

You can specify a default implementation of each method in the definition of the trait (an explicit implementation for a given type will then replace the default implementation).
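
The interplay between a default implementation and an explicit one can be sketched as follows; the Describe trait and the Cat and Robot types are hypothetical names chosen for the illustration:

```rust
trait Describe {
    // Required method: each implementor must provide it.
    fn name(&self) -> String;

    // Default method: used unless the implementor overrides it.
    fn describe(&self) -> String {
        format!("This is {}", self.name())
    }
}

struct Cat;
struct Robot;

impl Describe for Cat {
    fn name(&self) -> String {
        String::from("a cat")
    }
    // 'describe' is not redefined: the default implementation applies.
}

impl Describe for Robot {
    fn name(&self) -> String {
        String::from("a robot")
    }
    // An explicit implementation replaces the default one:
    fn describe(&self) -> String {
        String::from("BEEP")
    }
}

fn main() {
    println!("{}", Cat.describe());
    println!("{}", Robot.describe());
}
```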

Trait as parameter: the Trait Bound Syntax

Traits can be used to constrain function parameters: the function is then valid for any type implementing the trait. The usual syntax for that is called the trait bound syntax:

fn my_function<T: TheTrait>(a_variable: &T) { [..] }

Define a function notify_even that takes as parameter a reference to a value whose type implements the trait IsEven, and notifies (i.e. prints) the fact that the object is even. Note that you can require both the IsEven and Debug traits together by specifying IsEven + Debug.

Sometimes, this syntax can be heavy, and you can use a where clause. Instead of writing this:

fn some_function<T: Display + Clone, U: Clone + Debug>(t: &T, u: &U) -> i32 {

we can use a where clause, like this:

fn some_function<T, U>(t: &T, u: &U) -> i32
where
    T: Display + Clone,
    U: Clone + Debug,
{

The impl Trait syntax can be used to specify that the result of a function must implement a trait.
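
A short sketch of impl Trait in return position, and in argument position where it acts as a shorthand for a trait bound (the even_numbers and show functions are hypothetical helpers):

```rust
// Hypothetical helper: the concrete iterator type is hidden behind 'impl Iterator'.
fn even_numbers(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

// 'impl Trait' can also be used in argument position,
// as a shorthand for a trait bound:
fn show(item: &impl std::fmt::Debug) -> String {
    format!("{:?}", item)
}

fn main() {
    let evens: Vec<u32> = even_numbers(10).collect();
    println!("{}", show(&evens));
}
```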

Using Trait Bounds to Conditionally Implement Methods

The following example (from https://doc.rust-lang.org/book/ch10-02-traits.html) illustrates the fact that trait bounds can be used to conditionally implement methods:

use std::fmt::Display;

struct Pair<T> {
    x: T,
    y: T,
}

impl<T> Pair<T> {
    fn new(x: T, y: T) -> Self {
        Self { x, y }
    }
}

impl<T: Display + PartialOrd> Pair<T> {
    fn cmp_display(&self) {
        if self.x >= self.y {
            println!("The largest member is x = {}", self.x);
        } else {
            println!("The largest member is y = {}", self.y);
        }
    }
}

Programming paradigms in Rust

This section is largely inspired by https://corrode.dev/blog/paradigms/.

Rust is a multi-paradigm programming language, accommodating imperative, object-oriented, and functional programming styles. It is important to be aware that the programming paradigm is an important design choice when you start a new Rust project. The choice of style often depends on a developer’s background and the specific problem they’re addressing, but there are also many "known habits" of Rust developers.

One original aspect of Rust, compared to other recent languages, is the strong influence of functional programming.

A simple example: integer Vector sum.

Consider the problem of summing all the components of a vector of i32 values. Write a program that does it in an iterative/imperative way (i.e. a loop accumulating into a temporary variable).

Do the same thing in a more functional way: use the iter() method of the vector type and the sum() method of iterators.

The second formulation is, of course, much more concise, as is often the case in functional programming, but it is less suited to certain types of processing (matrix calculations, for example).

A More Complete Example

Consider the following Rust code that defines a list of several languages along with the paradigms they are associated with. You will start from this code. The task will be to find the top five languages that support functional programming and have the most users.

#[derive(PartialEq,Clone,Debug)]
enum Paradigm {
    Functional,
    ObjectOriented,
}


#[derive(Clone,Debug)]
struct Language {
    name: &'static str,
    paradigms: Vec<Paradigm>,
    nb_users: i32,
}

impl Language {
    fn new(name: &'static str, paradigms: Vec<Paradigm>, nb_users: i32) -> Self {
        Language { name, paradigms, nb_users }
    }
}

let languages = vec![
    Language::new("Rust", vec![Paradigm::Functional,Paradigm::ObjectOriented], 100_000),
    Language::new("Go", vec![Paradigm::ObjectOriented], 200_000),
    Language::new("Haskell", vec![Paradigm::Functional], 5_000),
    Language::new("Java", vec![Paradigm::ObjectOriented], 1_000_000),
    Language::new("C++", vec![Paradigm::ObjectOriented], 1_000_000),
    Language::new("Python", vec![Paradigm::ObjectOriented, Paradigm::Functional], 1_000_000),
];

Give an imperative solution to the task using nested for loops.

Give a more functional implementation by transforming the vector into an iterator and using the following methods (see https://docs.rs/itertools/latest/itertools/trait.Itertools.html):

  • into_iter() transforms a vector into an iterator

  • filter() keeps only the elements satisfying a property (use a closure as the argument to filter)

  • sorted_by_key() (from the itertools crate) sorts all iterator elements into a new iterator in ascending order (use Reverse() to get a descending order)

  • collect() transforms an iterator into a collection.
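
To illustrate how such methods chain together, here is a sketch on an unrelated problem, using only the standard library: it relies on the sort_by_key method of Vec rather than on sorted_by_key from itertools, and the longest_words helper is a hypothetical name.

```rust
use std::cmp::Reverse;

// Hypothetical helper: keep the words longer than 'min_len',
// and return the 'n' longest ones.
fn longest_words(words: Vec<&str>, min_len: usize, n: usize) -> Vec<&str> {
    let mut kept: Vec<&str> = words
        .into_iter()
        .filter(|w| w.len() > min_len)
        .collect();
    // 'Reverse' flips the ordering, giving a descending sort:
    kept.sort_by_key(|w| Reverse(w.len()));
    kept.truncate(n);
    kept
}

fn main() {
    let words = vec!["a", "abcd", "abc", "abcdef", "ab"];
    println!("{:?}", longest_words(words, 2, 2));
}
```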

Which implementation is more efficient in terms of computation time?

Course 2: Rust Language

Rust as a safe language: Pattern Matching, Handling Results/Errors, Options and Simple macros

Pierre Cochard, Tanguy Risset

Introduction

TODO

Pattern matching in Rust

Pattern matching is the act of checking a given sequence of tokens or expressions for the presence of one or more specific patterns. The concept is implemented in many programming languages (Rust, Haskell, Swift, etc.) and tools, for various purposes, such as: regular expressions, search and replace features, etc.

In Rust, patterns and pattern matching constitute a specific syntax, that is used in different places in the language (match statements, if let expressions, function parameters, simple macros, etc.)
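
Before looking at match in detail, here is a brief sketch of patterns in some of these other places: a let statement, a function parameter and an if let expression. The add_pair and describe functions are hypothetical helpers.

```rust
// A pattern in a function parameter: the tuple is destructured directly.
fn add_pair((a, b): (i32, i32)) -> i32 {
    a + b
}

// 'if let' runs its first block only when the pattern matches.
fn describe(value: Option<i32>) -> String {
    if let Some(n) = value {
        format!("got {n}")
    } else {
        String::from("nothing")
    }
}

fn main() {
    // A pattern in a 'let' statement:
    let (x, y) = (1, 2);
    println!("{}", add_pair((x, y)));
    println!("{}", describe(Some(5)));
    println!("{}", describe(None));
}
```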

Course: match statements

The primary, and most explicit, use of pattern matching in Rust is done through the match statement, which can be perceived as the Rust-equivalent of a C switch, but with additional features. Its syntax is also a little bit different. For instance, let's take a look at this simple C program:

// C basic switch case:
#include <stdbool.h>
#include <stdio.h>

enum Colors {
    Red, Blue, Green, Yellow, Orange
};

bool match_color_orange(enum Colors color) {
    switch (color) {
        case Orange: {
            printf("Orange!\n");
            return true;
        }
        case Red:
        case Blue: {
            printf("Not orange :(\n");
            return false;
        }
        default: {
            printf("Still not orange\n");
            return false;
        }
    }
}

In Rust, we would have the following equivalent:

#![allow(unused)]
fn main() {
enum Colors {
    Red, Blue, Green, Yellow, Orange
}

fn match_color_orange(color: Colors) -> bool {
    match color {
        // 'case' statements are replaced by the
        // 'PATTERN => EXPRESSION' syntax:
        Colors::Orange => {
            println!("Orange!");
            true
        }
        // We use '|' operators here, instead of having 
        // multiple 'case' statements: 
        Colors::Red | Colors::Blue => {
            println!("Not orange :(");
            false
        }
        // Anything else (equivalent to 'default'):
        _ => {
            println!("Still not orange...");
            false
        }
    }
}
}

Once a matching pattern is found, the corresponding instructions are executed and the match statement terminates (it does not check for other matching patterns: the first matching pattern is chosen). The result of a match statement can be directly bound to a variable:

#![allow(unused)]
fn main() {
enum Colors { Red, Blue, Green, Yellow, Orange }

let color = Colors::Red;

let is_color_warm = match color {
    Colors::Orange => true,
       Colors::Red => true,
    Colors::Yellow => true,
                 _ => false
};
}

Matching ranges is also supported, for instance:

#![allow(unused)]
fn main() {
fn match_number(number: i32) {
    match number {
          50..=99 => println!("Between 50 and 99"),
       100..=1000 => println!("Between 100 and 1000"),
                _ => println!("Other value")
    }
}
}

And, as a matter of fact, many other kinds of values can be matched, from string types:

#![allow(unused)]
fn main() {
fn match_str(s: &'static str) {
    match s {
        "Orange" => println!("Orange!"),
        "Yellow" => println!("Not orange"),
        _ => println!("Something else...")
    }
}
}

to compound types and slices:

#![allow(unused)]
fn main() {
fn match_tup(tup: (i32, i32)) {
    match tup {
        (0, 0) => println!("Zeroes!"),
        (1, 1) => println!("Ones"),
        _ => println!("Something else...")
    }
}

match_tup((1, 1));
match_tup((0, 1));

fn match_array(arr: [u8; 3]) {
    match arr {
        [0, 1, 2] => println!("Array match!"),
                _ => println!("No match")
    }
}

match_array([0, 1, 2]);
match_array([4, 5, 6]);

fn match_slice(sl: &[i32]) {
    match sl {
        &[0, 1, 2] => println!("Slice matches!"),
                 _ => println!("No match")
    }
}

match_slice(&[0, 1, 2]);
}

Write a match statement which applies to any i32 number. It should only have the two following patterns:

  1. The value is below 100 (including negative numbers);
  2. The value is equal or higher than 100.

Course: match statements: "flexible" patterns

As we saw earlier, the match statement can test many kinds of values, and it also extends to custom and composite types, including struct instances. A custom struct can indeed be matched either by its contents, in a very precise manner:

#![allow(unused)]
fn main() {
struct Point {
    x: isize,
    y: isize
}

let point = Point {x: 0, y: 100};

match point {
    // Only match Point if its 'x' member is equal to 0
    // and 'y' is equal to 100:
    Point {x: 0, y: 100} => println!("Match!"),
    _ => println!("No match!")
}
}

Or, in a more flexible way, using, for instance, ranges for its member values:

#![allow(unused)]
fn main() {
struct Point {
    x: isize,
    y: isize
}

let point = Point {x: 25, y: 100};
    
match point {
    // Only match if 'x' is between 0 and 100,
    // and 'y' is between 50 and 100
    Point {x: 0..=100, y: 50..=100} => println!("Match!"),
    _ => println!("No match!")
}
}

Finally, the _ pattern can be used for any value (or field value) that we want to ignore. This can also be done using the .. syntax, which ignores all the remaining values or field values:

#![allow(unused)]
fn main() {
struct Point {
    x: isize,
    y: isize
}

let p = Point {x: 0, y: 100};
    
match p {
    // Only match if 'x' is between 0 and 100,
    // and ignore the 'y' field:
    Point {x: 0..=100, y: _} => println!("Match!"),
    // Only match if 'x' is between 101 and 1000;
    // the '..' syntax ignores all the remaining struct fields:
    Point {x: 101..=1000, ..} => println!("Match!"),
    _ => println!("No match!")
}
}

Note: for compound/collection types, the .. syntax may be followed by other patterns:

#![allow(unused)]
fn main() {
let tup = (0, 1, 2, 3, 4);

match tup {
    // The first element of the tuple should be '0', and the last should be '4',
    // we ignore the values in-between:
    (0, .., 4) => println!("Match!"),
    _ => println!("No match")
}
}

Implement a match statement on a &[i32] slice. It should match only the slices that satisfy all of the following conditions:

  1. The slice's first value should be 0;
  2. The slice's second value should be either 10 or 20;
  3. The slice's final value should be 100;
  4. The slice can have an arbitrary size.

You can use the following assert! statements to test your code:

#![allow(unused)]
fn main() {
fn match_slice(s: &[i32]) -> bool;

assert_eq!(match_slice(&[1, 20, 20, 30, 100]), false);
assert_eq!(match_slice(&[0, 5, 20, 30, 100]), false);
assert_eq!(match_slice(&[0, 10, 20, 30, 99]), false);
assert!(match_slice(&[0, 10, 20, 30, 40, 100]));
assert!(match_slice(&[0, 20, 20, 30, 50, 60, 70, 80, 90, 100]));
}

The Result enum type

Course: Returning a Result from a function

To handle and propagate runtime errors, Rust relies on a simple but efficient mechanism based on an enum: the Result<T, E> enum, which is defined as:

#![allow(unused)]
fn main() {
enum Result<T, E> {
    Ok(T),
    Err(E)
}
}

The generic T and E type parameters have no trait bounds whatsoever: they could be anything, for instance:

#![allow(unused)]
fn main() {
// This function returns a u8 slice if there is no error, a Vec<f32> if there is.
// This is probably not very useful, but it's still perfectly valid code:
fn my_function(i: i32) -> Result<&'static [u8], Vec<f32>> {...} 

// This too (empty tuple type for both):
fn my_function(i: i32) -> Result<(),()> {
    Ok(())
} 
}

The unwrap() method can be used to extract the Ok argument from a Result<...>. This will be explained in more detail later, but you will need it to test your function:

let u = my_function(3).unwrap(); 
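
As a minimal sketch of a function returning a Result and of a caller handling both variants (the safe_div function is a hypothetical example, unrelated to the exercise below):

```rust
// Hypothetical helper: integer division that reports a division by zero as an error.
fn safe_div(a: i32, b: i32) -> Result<i32, String> {
    if b == 0 {
        Err(String::from("division by zero"))
    } else {
        Ok(a / b)
    }
}

fn main() {
    // Handle both variants explicitly:
    match safe_div(10, 2) {
        Ok(q) => println!("quotient = {q}"),
        Err(e) => println!("error: {e}"),
    }
    match safe_div(1, 0) {
        Ok(q) => println!("quotient = {q}"),
        Err(e) => println!("error: {e}"),
    }
    // 'unwrap' returns the 'Ok' value, and would panic on an 'Err':
    let q = safe_div(9, 3).unwrap();
    println!("{q}");
}
```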

Write a function called positive() that takes an i32 as argument and checks whether it is (strictly) positive. It should return the same argument value if it is positive, or a String with an error message if the argument is negative or equal to zero.

Course: propagating Errors

An error in Rust can be propagated up the call stack (i.e. back to the caller) by using the ? syntax:

#![allow(unused)]
fn main() {
fn my_function(i: i32) -> Result<i32, ()> {...}

fn my_other_function() -> Result<i32, ()> {
    // Append the '?' operator right after the function call:
    let mut i = my_function(1)?;
    // Do something with 'i': 
    i += 1;
    // Return an 'Ok' result with the modified 'i' value:
    Ok(i)
}
}

Here, the my_function(1)? function call means:

  • if the result enum value is Err (an error), then propagate the error now, by returning the same Err from my_other_function(); otherwise, unwrap the Ok value and continue with the rest of the code.

This code could also be implemented with an equivalent match statement, but it is a bit more verbose:

#![allow(unused)]
fn main() {
fn my_other_function() -> Result<i32, ()> {
    let mut i = match my_function(1) {
        Ok(i) => i,
        Err(e) => return Err(e),
    };
    i += 1;
    Ok(i)
}
}
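
The ? operator is most useful when several fallible calls are chained. The sketch below uses the standard str::parse method, which returns a Result; the add_strs() helper is a hypothetical name chosen for illustration. Each ? returns early from add_strs() with the ParseIntError if the corresponding parse fails:

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses two strings as integers and adds them.
// Each '?' returns early with the ParseIntError if parsing fails.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    // Both strings are valid integers: the result is an Ok value.
    println!("{:?}", add_strs("20", "22"));
    // The second string is not an integer: the result is an Err value.
    println!("{:?}", add_strs("20", "abc"));
}
```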

Implement the same mechanism for the previous positive(i: i32) example, using the ? syntax, and test it with both positive and negative values in a new function with the same return type, in order to see what happens.

// Your 'positive' function:
fn positive(i: i32) -> Result<i32, String> {...}

// Define a new function, and call 'positive(...)' from here:
fn check_positive() -> Result<i32, String> {
    ... // <- test positive & negative values here using the '?' syntax
}

// Check the return type from main:
fn main() {
    println!("{:?}", check_positive());
}

Course: returning a Result from the main() function

In general, returning with or without error from a main function, such as in the C or C++ programming languages, is done by returning an integer exit code (0 for success, non-zero for an error):

int main(void) {
    // No error, return 0:
    return 0;
}

In Rust, the main() function has no return type by default, and returning an i32 is not accepted by the compiler. For instance, the following code is invalid:

fn main() -> i32 {
    0
}

In this case, the compiler prints the following:

error[E0277]: `main` has invalid return type `i32`
 --> src/main.rs:3:14
  |
3 | fn main() -> i32 {
  |              ^^^ `main` can only return types that implement `Termination`
  |
  = help: consider using `()`, or a `Result`

The Termination trait documentation indeed indicates that only the following types are valid:

#![allow(unused)]
fn main() {
impl Termination for Infallible;
impl Termination for !;
impl Termination for ();
impl Termination for ExitCode;
impl<T: Termination, E: Debug> Termination for Result<T, E>;
}

Therefore, we can see that propagating a Result up to main() is possible, but it is still a somewhat specific case. The type T held by the Ok enum value must implement the Termination trait, and the type E held by the Err enum value must implement the Debug trait. For instance, the following still does not work because i32 does not implement the Termination trait:

fn main() -> Result<i32, String> {
    Ok(0)
}

But the following works:

// Using an empty tuple as the 'Ok' result:
fn main() -> Result<(), String> {
    Ok(())
}

// Or using the ExitCode type:
use std::process::ExitCode;

fn main() -> Result<ExitCode, String> {
    Ok(ExitCode::from(0))
}
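
To see what an error returned from main() looks like in practice, here is a small sketch. The check_config() function and its rule are hypothetical and only serve as a fallible call. When main() returns an Err, the error is printed to the standard error stream using its Debug formatting, and the process exits with a failure status:

```rust
// Hypothetical check, used for illustration only.
fn check_config(verbose_level: u8) -> Result<(), String> {
    if verbose_level > 3 {
        return Err(format!("verbose level {verbose_level} is out of range (0-3)"));
    }
    Ok(())
}

fn main() -> Result<(), String> {
    // On error, '?' returns the Err from main(): it is then printed
    // with its Debug formatting and the process exits with a failure status.
    check_config(2)?;
    println!("configuration is valid");
    Ok(())
}
```

Replacing check_config(2) with check_config(9) makes the program stop before the println! call.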

Using a valid Result type in the main() function, propagate the Result of our positive() function up to main().

fn positive(i: i32) -> Result<i32, String> {...}

fn check_positive() -> Result<i32, String> {...}

// Use a valid Result type here:
fn main() -> Result<?, ?>{
    // Call the 'check_positive' function here, and propagate its Result as the main() return type:
}

Course: Handling Result types immediately

.unwrap(), .expect()

In some cases, when propagating an error is not possible or is inconvenient, dealing immediately with a Result type is preferable. This is why certain methods, such as .unwrap() or .expect(), are provided by the standard library and quite commonly used:

  • The .unwrap() method, for instance, will induce a panic! call and will exit the program immediately when encountering an error. Otherwise it will return the Ok value safely:
#![allow(unused)]
fn main() {
// Get the current working directory:
let dir: std::path::PathBuf = std::env::current_dir().unwrap();
println!("{:?}", dir);
}
  • The .expect() method is really similar, but will allow the user to print a custom &str message on error, which will be prepended to the actual display of the Err contents:
#![allow(unused)]
fn main() {
fn my_function() -> Result<(), i32> {
    Err(1)
}
my_function().expect("Error! Now exiting program with error code");
}
  • Other similar helper methods also exist, with different behaviors:
    • .unwrap_or(other: T) returns the value other in case of an Err
    • .unwrap_or_default() returns the default value of the Ok type in case of an Err
    • .unwrap_or_else(func) calls a custom function or closure with the Err value, and returns its result
    • etc.
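
The fallback helpers listed above can be sketched as follows, using the standard str::parse method as the fallible call. The port_or_default() helper and the 8080 fallback are hypothetical choices made for this example:

```rust
// Hypothetical helper: parses a port number, falling back to 8080.
fn port_or_default(input: &str) -> u16 {
    input.parse::<u16>().unwrap_or(8080)
}

fn main() {
    // unwrap_or: use the given fallback value on Err.
    println!("{}", port_or_default("3000"));
    println!("{}", port_or_default("not a port"));

    // unwrap_or_default: use the type's Default value (0 for integers).
    let n: i32 = "oops".parse().unwrap_or_default();
    println!("{n}");

    // unwrap_or_else: compute the fallback from the error itself.
    let len = "oops"
        .parse::<usize>()
        .unwrap_or_else(|err| err.to_string().len());
    println!("{len}");
}
```

None of these calls can panic, which makes them a good fit when a sensible default exists.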

panic!, assert! and other macros

In addition to Result types, other simple tools, in the form of macros, are provided:

  • The panic!(msg) macro interrupts the program immediately and prints a custom error message:
fn main() {
    use std::io;
    // Read input from stdin:
    let mut buffer = String::new();
    println!("Please enter password:");
    io::stdin().read_line(&mut buffer).unwrap();
    // Panic if password is not long enough
    // (trim() removes the trailing newline kept by read_line):
    if buffer.trim().len() < 8 {
        panic!("Password should be at least 8 characters");
    } else {
        println!("{buffer}");
    }
}
  • The assert!(bool), assert_eq!(lhs, rhs), and assert_ne!(lhs, rhs) macros verify a boolean condition or check equality (or inequality) between two values, and panic if the check fails:
fn main() {
    use std::io;
    // Read input from stdin:
    let mut buffer = String::new();
    println!("Please enter password:");
    io::stdin().read_line(&mut buffer).unwrap();

    // We assert that the password is at least 8 characters
    // (ignoring the trailing newline kept by read_line).
    // This will cause a 'panic' if the assertion is false:
    assert!(buffer.trim().len() >= 8, "Password should be at least 8 characters");

    // Here, we forbid choosing 'password' as a password:
    assert_ne!(buffer.trim(), "password", "Choosing 'password' as password is unsafe.");
}

Using Result, and all the previous examples, create a program which parses a password, with the following rules:

  • The password must:
    • be at least 8 characters long;
    • contain at least one of these special characters: !, ? or _;
    • contain at least one digit;
    • not contain any whitespace.
  • A specific error message should be displayed for each rule.

Note: in order to verify your program, you can implement a #[test] function, such as:

#![allow(unused)]
fn main() {
#[test]
fn password_test() {
    let pwd0 = String::from("pass");
    let pwd1 = String::from("password");
    let pwd2 = String::from("password!");
    let pwd3 = String::from("pass word!");
    let pwd4 = String::from("password!1");
    assert!(parse_password(&pwd0).is_err());
    assert!(parse_password(&pwd1).is_err());
    assert!(parse_password(&pwd2).is_err());
    assert!(parse_password(&pwd3).is_err());
    assert!(parse_password(&pwd4).is_ok());
}
}

and then run cargo test on your program.

The Option enum type

The Option enum in Rust is somewhat similar to the Result enum, but its main purpose is to indicate the presence or absence of a value, rather than an error. It has the following definition:

#![allow(unused)]
fn main() {
pub enum Option<T> {
    None, 
    Some(T)
}
}

As an example, let's suppose we want to build a list of people to contact. Each entry holds the contact's name and address, and optionally a phone number and/or an e-mail address. We could for instance define the following struct:

#![allow(unused)]
fn main() {
struct MyContact {
    name: &'static str,
 address: &'static str,
   email: Option<&'static str>,
   phone: Option<&'static str>,
}

let mut list: Vec<MyContact> = Vec::new();
list.push(MyContact {
       name: "Marlo Stanfield",
    address: "2601 E Baltimore St, Baltimore, MD 21224",
      email: None,
      phone: Some("+1 410-915-0909")
});
}

By doing this, we can then take advantage of the Option enum and pattern matching to decide on the best way to contact each person in the list:

#![allow(unused)]
fn main() {
for contact in list {
    match (contact.email, contact.phone) {
        // Both e-mail and phone are available:
        (Some(email), Some(phone)) => {
            if is_email_correct(email) {
                contact_by_email(email);
            } else {
                contact_by_phone(phone);
            }
        }
        // Only e-mail is available:
        (Some(email), None) => {
            contact_by_email(email);
        }
        // Only phone is available:
        (None, Some(phone)) => {
            contact_by_phone(phone);
        }
        // Neither phone nor email:
        (None, None) => {
            send_mail_to_address(contact.address);
        }
    }
}
}

As with the Result type, an Option can be checked and handled immediately using the same .unwrap() and .expect() methods.

#![allow(unused)]
fn main() {
fn money_left() -> Option<i32> {...}
money_left().expect("No money left :(");
}
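
Besides .unwrap() and .expect(), an Option can also be handled without any risk of panicking. The sketch below shows three common ways: if let, .unwrap_or(), and the ? operator (which also works on Option). The first_even() and first_even_doubled() helpers are hypothetical names introduced for this example:

```rust
// Hypothetical helper: returns the first even number of a slice, if any.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

// '?' also works on Option: it returns None early when there is no value.
fn first_even_doubled(values: &[i32]) -> Option<i32> {
    let v = first_even(values)?;
    Some(v * 2)
}

fn main() {
    // 'if let' handles only the Some case:
    if let Some(v) = first_even(&[3, 5, 8, 9]) {
        println!("first even: {v}");
    }
    // unwrap_or provides a fallback instead of panicking:
    println!("{}", first_even(&[1, 3]).unwrap_or(-1));
    // The None case is propagated by '?':
    println!("{:?}", first_even_doubled(&[1, 3]));
}
```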

Using the previous contact list example, implement a small database of books that would be used by a library. It should have a search_book(...) function which searches for a specific book using its name and/or the name of the author (we assume here that there's only one book per author). The function should return a reference to the Book object if it has been found, or None otherwise.

#![allow(unused)]
fn main() {
struct Book {
      name: String,
    author: String,
}

#[derive(Default)]
struct LibraryDatabase {
    books: Vec<Book>
}

impl LibraryDatabase {
    // The function to implement:
    fn search_book(&self, 
        name: Option<&'static str>, 
        author: Option<&'static str>
    ) -> Option<&Book> {...}
}
}

You can use the following #[test] function to verify your code:

#![allow(unused)]
fn main() {
#[test]
fn test() {
    let mut database = LibraryDatabase::default();
    database.books.push(Book { name: String::from("Peter Pan"), author: String::from("Barrie")});

    assert!(database.search_book(Some("Peter Pan"), None).is_some());
    assert!(database.search_book(None, Some("Barrie")).is_some());
    assert!(database.search_book(None, None).is_none());
    assert!(database.search_book(Some("Barrie"), None).is_none());
    assert!(database.search_book(None, Some("Peter Pan")).is_none());
    assert!(database.search_book(Some("Alice in Wonderland"), Some("Barrie")).is_none());
    assert!(database.search_book(Some("Peter Pan"), Some("Lewis Carroll")).is_none());
}
}

Simple macros with macro_rules!

Unlike the macros of other programming languages such as C or C++, Rust macros are expanded into abstract syntax trees (AST) instead of relying on string preprocessing, which makes them a bit more complex to use, but also more reliable and powerful. Macros are expanded before the compiler interprets the meaning of the code. The main difference between a macro and a function is that a macro definition is Rust code that writes other Rust code. Due to this indirection, macro definitions are generally more difficult to read, understand, and maintain than function definitions.

Throughout this course, we have already encountered a few of them, including vec!, panic!, assert!, and of course println!. These macros are invoked with the macro!(...) syntax (don't forget the trailing exclamation mark) and are called simple macros, as opposed to Rust's more complex macro systems, such as attribute macros (for instance the #[test] function attribute) and derive macros (the #[derive(Debug)] statement on top of a struct), both of which we have already seen as well.

Course: Basic macro_rules! usage

Unlike attribute or derive macros, which must be defined in a separate crate, simple macros can be defined anywhere in our code, using the macro_rules! syntax:

#![allow(unused)]
fn main() {
macro_rules! hello_world {
    () => {
        println!("Hello World!")
    };
}

hello_world!();
}

Here, we defined a macro that takes no argument, which is indicated by the empty () pattern. Under the hood, our hello_world!() macro call will be replaced by the contents that we defined within the => { ... } block.

Advantages of using macros

Our hello_world!() example is of course not very useful, and in fact adds unnecessary noise to a very simple piece of code, but think of the vec! macro for instance:

#![allow(unused)]
fn main() {
let v1 = vec![1, 2, 3, 4, 5];
// An empty vector (the element type must be given, as it cannot be inferred here):
let v2: Vec<i32> = vec![];
// Three copies of the value 1:
let v3 = vec![1; 3];
}

Defining these three vectors by hand would mean writing roughly the following code:

#![allow(unused)]
fn main() {
let v1 = <[_]>::into_vec(Box::new([1, 2, 3, 4, 5]));
let v2 = Vec::<i32>::new();
let v3 = std::vec::from_elem(1, 3);
}

Notice how each of these three vectors is created in a very different way. In this case, the vec! macro offers a more practical and unified way of instantiating a Vec object, without having to remember all the (sometimes complex) underlying code. Furthermore, as you can see with this example, a macro can also accept a variable number of arguments, which is not the case with a Rust function.

The different types of arguments (or fragment specifiers)

As you may already have guessed with our first basic macro example, which uses the => operator, macro_rules! relies on pattern matching to parse its arbitrary number of arguments.

macro_rules! can parse different kinds of patterns, including:

  • (): the empty pattern, which means no argument (our previous example);
  • block: a block expression, surrounded by { };
  • expr: any kind of Rust expression;
  • ident: an identifier (the name of a variable, function, etc.);
  • literal: a number, a string, or another kind of literal;
  • etc. (see the full list here).
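
As a short sketch of how these fragment specifiers look in practice, the two hypothetical macros below (square! and make_const! are names invented for this example) use expr, ident, ty and literal. Note that a captured expr is kept as a single unit, which is one practical benefit of not relying on textual substitution:

```rust
// '$e:expr' captures a whole expression as a single unit.
macro_rules! square {
    ($e:expr) => {
        $e * $e
    };
}

// '$name:ident', '$t:ty' and '$v:literal' capture an identifier,
// a type and a literal value, respectively.
macro_rules! make_const {
    ($name:ident, $t:ty, $v:literal) => {
        const $name: $t = $v;
    };
}

// Generates: const ANSWER: i32 = 42;
make_const!(ANSWER, i32, 42);

fn main() {
    // The captured expression '1 + 2' is kept as one unit,
    // so this computes (1 + 2) * (1 + 2):
    println!("{}", square!(1 + 2));
    println!("{}", ANSWER);
}
```

Be aware that square! evaluates its argument twice, which matters if the expression has side effects.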

Matching a specific pattern

Let's now try an example with an actual argument. Here, we will use the ident designator in order to create functions from a simple macro call:

#![allow(unused)]
fn main() {
macro_rules! define_fn {
    ($fn_name:ident) => {
        fn $fn_name() {
            println!(
                "This function's name (ident) is: '{}()'.", 
                stringify!($fn_name)
            );
        }
    }
}
define_fn!(foo);
define_fn!(bar);
foo();
bar();
}

Let's break this code down piece by piece:

#![allow(unused)]
fn main() {
($fn_name:ident) => {
}

   Instead of an empty pattern (), we use the pattern ($fn_name:ident), in which $fn_name would be the name of the argument, and ident its type. The dollar sign ($) is used to declare a variable in the macro system that will contain the Rust code matching the pattern.

#![allow(unused)]
fn main() {
fn $fn_name() {
}

   Within our generated code block, we declare a function with the name taken from our $fn_name ident argument.

#![allow(unused)]
fn main() {
println!(
    "This function's name (ident) is: '{}()'.", 
    stringify!($fn_name)
);
}

   We then define the function's body, with a println! call, in which we print the ident's name using a utility macro called stringify!. This very useful macro will transform our $fn_name identifier into a &'static str.

What would be the generated code for the define_fn!(foo) macro call?

Create a similar macro, but this time the generated code should define a struct and its impl block like the following:

#![allow(unused)]
fn main() {
// All of this code should be generated by our new macro,
// but the name 'Foo' should be made variable:
struct Foo {
    print: &'static str
}
impl Foo {
    fn new() -> Foo {
        Foo { 
            print: "Foo" 
        }
    }
}
}
#![allow(unused)]
fn main() {
// The macro to implement:
macro_rules! define_struct { 
    ... 
}

// Use the following to verify that the macro is correct:
define_struct!(Foo);

let bar = Foo::new();
assert_eq!(bar.print, "Foo");
}

Course: pattern overloading

macro_rules! definitions can accept an arbitrary number of patterns, in a very simple way. Let's try it out on our define_fn! macro. We will add another pattern that lets us append an arbitrary expression to the body of the created fn:

#![allow(unused)]
fn main() {
macro_rules! define_fn {
    ($fn_name:ident) => {
        fn $fn_name() {
            println!(
                "This function's name (ident) is: '{}()'.", 
                stringify!($fn_name)
            );
        }
    }; // <-- pattern blocks must end with a semicolon if they're followed by other blocks
    
    // Our new pattern:
    ($fn_name:ident, $additional_code:expr) => {
        fn $fn_name() {
            println!(
                "This function's name (ident) is: '{}()'.", 
                stringify!($fn_name)
            );
            // Append the additional code 'expr' at the end of the defined fn:
            $additional_code
        }
    }
}
define_fn!(foo);
define_fn!(bar, println!("Additional code"));
foo();
bar();
}

Here, we added the ($fn_name:ident, $additional_code:expr) pattern, which is composed of two arguments: the same ident argument, followed by an expr argument, which can be any valid Rust expression. The two arguments are separated by a comma, but this choice is completely arbitrary: it could be (almost) any symbol.

In our previous example, try replacing the , symbol between $fn_name:ident and $additional_code:expr with another one. Then, call the define_fn! macro with two arguments separated by the same new symbol, and see what it does.

Course: pattern matching for macro_rules!

Pattern matching for macro_rules! is quite different from the pattern matching used with the match keyword. Macros can also easily deal with pattern repetition by using a special syntax, which resembles the one used for regular expressions. In particular, the usual repetition operators of regular expressions can be used: '?', '+' or '*' (respectively: zero or one occurrence, one or more occurrences, zero or more occurrences). In the following example, we want to extend the std::cmp::max() function, which only accepts two arguments, so that it takes an arbitrary number of arguments:

#![allow(unused)]
fn main() {
let mut max = std::cmp::max(1, 2);
max = std::cmp::max(max, 3);
max = std::cmp::max(max, 4);
max = std::cmp::max(max, 5);
}

In this case, having a macro like the following could prove useful, and would lighten the code a lot:

#![allow(unused)]
fn main() {
max!(1, 2, 3, 4, 5, 6*7, 3*4);
}

The way to do this is to use the $(...),+ syntax, as follows:

#![allow(unused)]
fn main() {
macro_rules! max {
    // Only one argument 'x', return 'x':
    ($x:expr) => {$x};
    // At least two arguments, 
    // - 'x' being the first,
    // - 'y' being one or more additional argument(s),
    // which is defined by the '$(...),+' syntax:
    ($x:expr, $($y:expr),+) => {
        // We recursively call 'max!' on the tail 'y'
        std::cmp::max($x, max!($($y),+))
    }
}
}

The + in the $($y:expr),+ syntax means one or more instances of the $y expression, separated by commas.

Note: The * symbol also exists, and means zero or more instances of the pattern.

  • With a single argument, we would only match the first pattern ($x:expr) => {$x}:

#![allow(unused)]
fn main() {
max!(1);
}
  • With two arguments, we would match the second pattern ($x:expr, $($y:expr),+) with a single additional argument:
#![allow(unused)]
fn main() {
max!(1, 2); // expands to:
std::cmp::max(1, max!(2)); // expands to:
std::cmp::max(1, 2);
}
  • And with more arguments recursively:
#![allow(unused)]
fn main() {
max!(1, 2, 3); // expands to:
std::cmp::max(1, max!(2, 3)); // expands to:
std::cmp::max(1, std::cmp::max(2, max!(3))); // expands to:
std::cmp::max(1, std::cmp::max(2, 3));
}

As a summary:

  • $var captures a fragment of code in a pattern.
  • $var:ident specifies the kind of fragment to capture (could be ident, expr, ty, etc.).
  • $($var:expr),* or $($var:expr),+ captures repeated fragments separated by commas.
  • $var is replaced during macro expansion.
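
To complement the max! example, which relies on recursion, here is a minimal sketch of the * repetition used directly in the generated code. The sum! macro is a hypothetical example: it accepts zero or more expressions, plus an optional trailing comma thanks to the $(,)? pattern:

```rust
// Hypothetical 'sum!' macro: adds up zero or more expressions.
// - '$($x:expr),*' matches zero or more comma-separated expressions;
// - '$(,)?' accepts an optional trailing comma.
macro_rules! sum {
    ($($x:expr),* $(,)?) => {
        // The '$(+ $x)*' part is repeated once per captured expression:
        0 $(+ $x)*
    };
}

fn main() {
    // No argument: expands to '0'.
    println!("{}", sum!());
    // Expands to '0 + 1 + 2 + 3':
    println!("{}", sum!(1, 2, 3));
    // A trailing comma is accepted as well:
    println!("{}", sum!(1, 2, 3 * 4,));
}
```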

Bonus: Transform our previous define_struct! example into a more elaborate macro. It should now have the following interface:

#![allow(unused)]
fn main() {
// Define the struct 'Foo':
define_struct!(
    name: Foo,
    members: {
        bar: i32
    }
    methods: {
        fn hello() {
            println!("hello!")
        }
        fn bar(&self) {
            println!("{}", self.bar);
        }
    }
);
// Instantiate the struct 'Foo':
let f = Foo::default();
// Call its bar method:
f.bar();
// Or its static 'hello' method:
Foo::hello();
}

Bonus: Add a nested macro_rules! definition into define_struct! which allows copying the members and methods of the defined struct into a new, different one, such as:

#![allow(unused)]
fn main() {
// Define Foo struct:
define_struct!(
    name: Foo,
    members: {
        bar: i32
    }
    methods: {
        fn hello() {
            println!("hello!")
        }
        fn bar(&self) {
            println!("{}", self.bar);
        }
    }
);

// The define_struct! macro should define a new 'Foo!' macro,
// which allows copying the members and methods of 'Foo' into another new struct:
Foo!(FooCopy);

let c = FooCopy::default();
c.bar();
FooCopy::hello();
}
Course 6: Closures, Threads, Channels and Concurrency

Pierre Cochard, Tanguy Risset

Introduction

Using Threads and Closures to Run Code Simultaneously

Course: Closures

Closures in Rust are anonymous functions that can be stored in variables, or passed to other functions as arguments. They can be found in a lot of places in the language, where they allow functional-style programming, behavior customization, or concurrent/parallel code execution. They are defined by the following syntax |arguments| -> return_type { body }, for instance:

#![allow(unused)]
fn main() {
// Define a closure and store it into a variable:
let my_closure = |x: i32| -> i32 {
    println!("my_closure");
    x*2
};

// Execute the closure like you would normally do with a function:
let y = my_closure(2);
println!("{y}");
}
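
The type annotations and the braces are often optional, because the compiler can infer the types from the way the closure is used. A minimal sketch, wrapped in a hypothetical double_then_increment() function so that it can be called and tested:

```rust
// Hypothetical function, used only to exercise the two closures:
fn double_then_increment(x: i32) -> i32 {
    // Fully annotated closure:
    let double = |v: i32| -> i32 { v * 2 };
    // Types are inferred from usage, and the braces are optional
    // when the body is a single expression:
    let increment = |v| v + 1;
    increment(double(x))
}

fn main() {
    println!("{}", double_then_increment(20));
}
```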

Borrowed captures

Just like in C++, a closure can capture the environment it originates in and use external data in its own internal scope. By default, captured data is borrowed, either immutably or mutably depending on how the closure uses it:

#![allow(unused)]
fn main() {
let x = 2;

let my_closure = || -> i32 {
    // Use a borrowed (immutable) capture of 'x', 
    // and return two times its value:
    x*2
};

println!("{}", my_closure());
}
#![allow(unused)]
fn main() {
// Same, but this time, we modify 'x' directly in the closure:
let mut x = 2;

// The closure itself also has to be made 'mutable' 
// in the case of a mutable borrow:
let mut my_closure_mut = || x *= 2;

my_closure_mut();
println!("{x}");
}

Moved captures

Instead of being borrowed, data can also be captured by value into a closure scope, using the move keyword before the closure's parameter list:

#![allow(unused)]
fn main() {
let mut x = 2;

// Capturing 'x' by value. Since i32 is Copy, the closure works on its own copy:
let mut my_closure_mut = move || {
    x *= 2;
    println!("x (closure): {x}");
};

my_closure_mut();
println!("x: {x}");
}

Why is the following code invalid? How can we solve the issue?

#![allow(unused)]
fn main() {
let mut x = vec![31, 47, 27, 16];

let mut my_closure_mut = move || {
    x.push(32);
    println!("{:?}", x);
};

my_closure_mut();
println!("{:?}", x);
}

Course: Passing closures as objects or arguments

One notable characteristic of closures is that each of them has a unique, anonymous type that cannot be written out. This can for instance be demonstrated by running the following piece of code:

#![allow(unused)]
fn main() {
// Utility function that prints out the type of a variable:
fn print_type_of<T>(_: &T) {
    println!("Type of my closure: {}", std::any::type_name::<T>());
}

let my_closure = |x: i32| -> i32 {
    x*2
};

print_type_of(&my_closure);
}

Therefore, passing a closure as an argument to a function using a specific type is not possible in Rust. Instead, in order to do that, one would have to use a trait. Indeed, all closures implement one or several of the following traits, depending on their nature and properties:

  • FnOnce: applies to closures that can be called once. All closures implement this trait;
  • Fn: applies to closures that don't move captured values out of their body and that don't mutate captured values, as well as closures that capture nothing from their environment. These closures can be called more than once without mutating their environment, which is important in cases such as calling a closure multiple times concurrently.
  • FnMut: applies to closures that don't move captured values out of their body, but that might mutate the captured values. These closures can be called more than once.

A closure can then be passed as an argument the same way we pass trait-implementing objects, by using the Fn/FnOnce/FnMut(argument_type) -> return_type syntax:

#![allow(unused)]
fn main() {
// All of the 'Fn' traits have the format:
// 'Fn(argument-types) -> return types'
fn exec_closure(x: i32, closure: impl Fn(i32) -> i32) -> i32 {
    closure(x)
}

let c = |x: i32| x + 27;
let r = exec_closure(31, c);

println!("{r}");
}

The generic form also works (and is usually preferable):

#![allow(unused)]
fn main() {
fn exec_closure<T>(x: i32, closure: T) -> i32 
where 
    T: Fn(i32) -> i32 
{
    closure(x)
}
let c = |x: i32| x + 27;
let r = exec_closure(31, c);

println!("{r}");
}
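
The two examples above only use the Fn trait. As a sketch of the other two traits, the code below passes a closure that mutates its capture (FnMut) and a closure that gives away its capture (FnOnce). All function names here are hypothetical and chosen for illustration:

```rust
// Accepts any closure that may mutate its captures (FnMut).
// The parameter must be declared 'mut' in order to call it:
fn call_twice<F: FnMut()>(mut f: F) {
    f();
    f();
}

// Accepts a closure that may consume its captures (FnOnce):
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn count_calls() -> i32 {
    let mut counter = 0;
    // This closure mutably borrows 'counter', so it is FnMut:
    call_twice(|| counter += 1);
    counter
}

fn give_away_name() -> String {
    let name = String::from("Ferris");
    // This closure moves 'name' out of its body, so it is only FnOnce:
    call_once(move || name)
}

fn main() {
    println!("{}", count_calls());
    println!("{}", give_away_name());
}
```

Passing the second closure to call_twice() would be rejected by the compiler, since it can only be called once.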

Using the generic form, store the following chirp_fn closure as a member of the struct Bird, with a valid type signature:

let chirp_fn = |times: i32| {
    for _ in 0..times {
        println!("chirp!");
    }
};

// The struct to implement (use generics!):
struct Bird<...> { 
    chirp: ... 
}

// Create a new instance of 'Bird' with the chirp_fn closure:
let bird = Bird { chirp: chirp_fn };

// Call the 'chirp' closure from inside the struct:
// operator precedence priority can be found here: https://doc.rust-lang.org/reference/expressions.html
(bird.chirp)(10);

Course: Spawning Threads using Closures

A thread created with std::thread::spawn will execute a given closure in an independent (or parallel) context of execution. In the following example, the std::thread::spawn call returns as soon as the thread has been created, without waiting for it to finish executing:

#![allow(unused)]
fn main() {
// Spawn a new thread which will execute its given closure:
std::thread::spawn(|| {
    println!("Thread 1: chirp!");
});
// At the 'same time', print something from the main thread:
println!("Main thread: chirp chirp!");
}

In this case, the main function may return before the independent thread's println! call happens, and the program exits as soon as main returns. This is why we often only see the "main thread" print output. Usually, a thread handle is bound to a local variable, and the thread is waited upon before the parent context of execution finishes. This can be done by calling the .join() method on the thread handle:

#![allow(unused)]
fn main() {
// Bind the thread's "handle" to a variable:
let th = std::thread::spawn(|| {
    println!("Thread 1: chirp!");
});

println!("Main thread: chirp chirp!");

// Wait for 'th' to finish executing, and re-synchronise both threads:
th.join().unwrap();
}
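
The .join() method does more than waiting: it also hands back the value returned by the thread's closure. A minimal sketch, with a hypothetical sum_in_thread() helper:

```rust
use std::thread;

// Hypothetical helper: sums the integers from 1 to n in a separate
// thread, and returns the result to the caller.
fn sum_in_thread(n: u64) -> u64 {
    // 'move' gives the new thread its own copy of 'n':
    let handle = thread::spawn(move || (1..=n).sum::<u64>());
    // join() waits for the thread and yields the closure's return value.
    // It returns an Err only if the thread panicked.
    handle.join().unwrap()
}

fn main() {
    println!("{}", sum_in_thread(10));
}
```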

The following code prints a modified value of variable var from 3 different threads, running independently from one another. Is the code safe? Can you guess what will be the resulting output?

use std::thread;
let mut var = 32;

let t1 = thread::spawn(move || {
    var += 1;
    println!("Thread 1: reading value {}!", var);
});

let t2 = thread::spawn(move || {
    var += 2;
    println!("Thread 2: reading value {}!", var);
});

var += 3;
println!("Main thread: reading value {}", var);

// Re-synchronise both threads:
t1.join().unwrap();
t2.join().unwrap();

What would happen if we removed all the move keywords from the code?

Sharing data safely between threads

Using Shared State data sets

Course: Exclusive access with Mutexes

Mutual exclusion, or mutex, is a mechanism which prevents multiple threads from accessing the same data at the same time. It relies on a locking system in order to do so: a thread must first ask to acquire the mutex's lock before being able to access the underlying protected data. The lock is a data structure that keeps track of whichever thread has exclusive access to the data. If the mutex happens to be locked at the time a thread tries to access the data, that thread will block until the lock is eventually released and the data is free to acquire.

#![allow(unused)]
fn main() {
use std::sync::Mutex;
// Instantiate a new Mutex<i32> instance with the value '32':
let var: Mutex<i32> = Mutex::new(32);
{
    // Acquire the mutex's lock (and panic in case of failure):
    let mut v = var.lock().unwrap();
    // Modify the value safely:
    *v += 32;
}
// Print the result:
println!("var = {var:?}");
}
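
In the example above, the inner braces matter: the lock is held for as long as the guard returned by .lock() is alive, and it is released when the guard goes out of scope. The following sketch illustrates this with the standard try_lock() method, which fails immediately instead of blocking when the mutex is already locked. The add_locked() helper is a hypothetical name:

```rust
use std::sync::Mutex;

// Hypothetical helper: adds 'delta' to the protected value and
// returns the new value. The lock is released when 'guard'
// goes out of scope, at the end of the function.
fn add_locked(m: &Mutex<i32>, delta: i32) -> i32 {
    let mut guard = m.lock().unwrap();
    *guard += delta;
    *guard
}

fn main() {
    let m = Mutex::new(32);
    println!("{}", add_locked(&m, 10));

    // While a guard is alive, the mutex stays locked:
    let guard = m.lock().unwrap();
    println!("try_lock failed: {}", m.try_lock().is_err());

    // Dropping the guard explicitly releases the lock:
    drop(guard);
    println!("try_lock failed: {}", m.try_lock().is_err());
}
```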

In the following example, we want to try to use a Mutex to get both threads to use and modify var, but the compiler doesn't allow it, what is the underlying issue here?

#![allow(unused)]
fn main() {
use std::thread;
use std::sync::Mutex;

let var = Mutex::new(32);

let t1 = thread::spawn(move || {
    let mut v = var.lock().unwrap();
    *v += 1;
});

let t2 = thread::spawn(move || {
    let mut v = var.lock().unwrap();
    *v += 2;
});
// Re-synchronise both threads:
t1.join().unwrap();
t2.join().unwrap();

println!("Result: {var:?}");
}

Course: Reference Counted Mutexes

As we could see in our previous example, a Mutex in itself is not sufficient to implement viable thread-safe data sharing:

  • First is the issue of ownership, which could be solved using, for instance, a shared pointer.
  • Second would be the issue of concurrency in accessing this shared pointer from multiple threads simultaneously.

In the Rust programming language, Arc<T> (Atomically Reference Counted), which can be seen as an atomic shared pointer, is designed to remedy this very specific problem. It firstly solves the ownership issue by being "reference-counted", just like a standard Rc object, but it also solves the concurrency issue by being atomic, meaning that each update of its reference count is guaranteed to execute as a single indivisible operation. When an atomic operation is executed on an object by a specific thread, no other thread can read or modify the object while the atomic operation is in progress. In other words, other threads will only see the object before or after the operation: there is no intermediate state.

Our previous example can then be replaced by the following:

#![allow(unused)]
fn main() {
use std::thread;
use std::sync::{Arc, Mutex};

// We wrap the Mutex in an Atomically Reference-Counted object:
let arc = Arc::new(Mutex::new(32));

// We prepare two clones, for moving into the two distinct closures:
let rc1 = Arc::clone(&arc);
let rc2 = Arc::clone(&arc);

let t1 = thread::spawn(move || {
    let mut v = rc1.lock().unwrap();
    *v += 1;
});

let t2 = thread::spawn(move || {
    let mut v = rc2.lock().unwrap();
    *v += 2;
});

// Re-synchronise both threads:
t1.join().unwrap();
t2.join().unwrap();

println!("Result: {arc:?}");
}
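
The same pattern scales to any number of threads. As a sketch, the hypothetical parallel_count() function below spawns several threads in a loop, each holding its own Arc handle to the same Mutex-protected counter:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: 'threads' threads each increment a shared
// counter 'per_thread' times; returns the final count.
fn parallel_count(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..threads {
        // Each thread gets its own Arc handle to the same Mutex:
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                // The lock is taken and released at each iteration:
                *counter.lock().unwrap() += 1;
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    // Copy the value out, so that the lock guard is released
    // before 'counter' itself is dropped:
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```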

If an Arc object is sufficient to provide multiple ownership and thread-safe access to data, why do we still need a Mutex guarding our data?

Course: Lock-free data sharing with Atomics

Another way of making data thread-safe is by directly using atomic data types. The Rust standard library provides a few of them in its std::sync::atomic module. The main difference with Mutexes is that atomics are what we call lock-free: unlike locking a Mutex, an atomic operation never puts the calling thread to sleep. Atomics are also the building block of spinlocks, which wait for a resource by checking its availability in a continuous loop (spinning) instead of sleeping. In real-time contexts, where thread sleep is not an option, lock-free data structures are generally preferable.

Our previous example would for instance look like this with atomics:

#![allow(unused)]
fn main() {
use std::thread;
use std::sync::Arc;
use std::sync::atomic::AtomicI32;
use std::sync::atomic::Ordering;

// We replace 'Mutex::new()' with the following:
let arc = Arc::new(AtomicI32::new(32));
let rc1 = Arc::clone(&arc);
let rc2 = Arc::clone(&arc);

let t1 = thread::spawn(move || {
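    // Read-modify-write as one atomic step: the closure is retried
    // if another thread changed the value in the meantime.
    // (A separate load() followed by a store() would not be atomic
    // as a whole, and could overwrite the update made by 't2'.)
    rc1.fetch_update(Ordering::AcqRel, Ordering::Acquire, |v| Some(v + 1))
        .unwrap();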
});

let t2 = thread::spawn(move || {
    // Another (more compact) way of doing this:
    rc2.fetch_add(2, Ordering::AcqRel);
});
// Re-synchronise both threads:
t1.join().unwrap();
t2.join().unwrap();

println!("Result: {arc:?}");
}
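To illustrate why a single atomic read-modify-write matters, here is a minimal sketch in which several threads increment the same counter many times. The helper name `count_with_threads` and the thread and iteration counts are illustrative choices, not part of the course material.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: spawns `threads` threads which each increment
/// a shared counter `per_thread` times, then returns the final count.
fn count_with_threads(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let mut handles = Vec::new();

    for _ in 0..threads {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                // One indivisible read-modify-write: no increment can be lost.
                counter.fetch_add(1, Ordering::Relaxed);
            }
        }));
    }
    for handle in handles {
        handle.join().unwrap();
    }
    // Joining a thread synchronises with it, so all increments are visible here.
    counter.load(Ordering::Relaxed)
}

fn main() {
    let total = count_with_threads(4, 10_000);
    println!("Total: {total}"); // prints "Total: 40000"
}
```

Relaxed ordering is enough in this sketch because only the counter itself is shared; stronger orderings are needed when the atomic is used to publish other data.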

Using Message passing to transfer data between threads

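Another approach to sharing data safely between threads in Rust is message passing, using an mpsc::channel or an mpsc::sync_channel (mpsc stands for multiple producer, single consumer). Both are FIFO queues: instead of sharing a value, threads send each successive value through the queue. An mpsc::channel is asynchronous: its buffer is unbounded, so sending never blocks. An mpsc::sync_channel is synchronous: its buffer has a fixed capacity, and sending blocks while the buffer is full.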

An mpsc::channel() call will return a tuple containing a handle to a Sender object and a Receiver object, which are by convention respectively named tx and rx. These two objects are positioned at each end of a FIFO which tunnels data between the two:

#![allow(unused)]
fn main() {
// Create a 'data channel' between a Sender `tx`, and a Receiver `rx`:  
let (tx, rx) = std::sync::mpsc::channel::<i32>();

// Send the values `32` and then `16` through the channel:
tx.send(32).unwrap();
tx.send(16).unwrap();

// Block until a value is available, then read it:
println!("n = {}", rx.recv().unwrap());
println!("n = {}", rx.recv().unwrap());
}
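The bounded variant, mpsc::sync_channel, can be sketched in the same way. In this minimal example the helper name `bounded_roundtrip` and the capacity are illustrative assumptions; the producer blocks whenever the buffer is full, and the consumer stops once the Sender has been dropped.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: sends `count` values through a bounded channel of
/// the given `capacity` and returns them in the order they were received.
fn bounded_roundtrip(capacity: usize, count: i32) -> Vec<i32> {
    // `sync_channel` takes the buffer capacity as its argument:
    let (tx, rx) = mpsc::sync_channel::<i32>(capacity);

    let producer = thread::spawn(move || {
        for n in 0..count {
            // Blocks while the buffer already holds `capacity` values:
            tx.send(n).unwrap();
        }
        // `tx` is dropped here, which closes the channel.
    });

    // Iterating over the Receiver ends once the channel is closed and empty:
    let received: Vec<i32> = rx.iter().collect();
    producer.join().unwrap();
    received
}

fn main() {
    println!("{:?}", bounded_roundtrip(2, 5)); // prints "[0, 1, 2, 3, 4]"
}
```

A capacity of 0 is also allowed: each send then waits until the receiver is ready to take the value.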

Example of sending from a separate thread:

#![allow(unused)]
fn main() {
use std::thread;
use std::sync::mpsc;
let (tx, rx) = mpsc::channel();

// Do the same thing in a separate thread:
let th = thread::spawn(move || {
    tx.send(32).unwrap();
    tx.send(16).unwrap();
});

th.join().unwrap();

// Block until a value is available, then read it:
println!("n = {}", rx.recv().unwrap());
println!("n = {}", rx.recv().unwrap());
}

Or from multiple threads simultaneously:

#![allow(unused)]
fn main() {
use std::thread;
use std::sync::mpsc;

let (tx, rx) = mpsc::channel();
let mut vec = Vec::new();

for n in 0..8 {
    // Clone the Sender `tx` for each thread:
    let tx = tx.clone();
    vec.push(thread::spawn(move || {
        tx.send(n).unwrap();
    }));
}
for t in vec {
    t.join().unwrap();
}
for _ in 0..8 {
    // Consume the FIFO value-by-value:
    let value = rx.recv().unwrap();
    println!("Received value: {}", value);
}
}
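The examples above use recv, which blocks until a value arrives. When a thread must not wait, the Receiver can instead be polled with try_recv, which returns immediately. The following is a minimal sketch; the helper name `poll_once` is a hypothetical name chosen for illustration.

```rust
use std::sync::mpsc::{self, TryRecvError};

/// Hypothetical helper: describes the outcome of one non-blocking read.
fn poll_once(rx: &mpsc::Receiver<i32>) -> String {
    match rx.try_recv() {
        Ok(value) => format!("got {value}"),
        Err(TryRecvError::Empty) => String::from("empty"),
        Err(TryRecvError::Disconnected) => String::from("disconnected"),
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();

    println!("{}", poll_once(&rx)); // nothing has been sent yet: "empty"
    tx.send(32).unwrap();
    println!("{}", poll_once(&rx)); // one value is waiting: "got 32"
    drop(tx); // drop the only Sender
    println!("{}", poll_once(&rx)); // channel closed and empty: "disconnected"
}
```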

Using an mpsc::channel, an Arc<T> and a Mutex (or an atomic), implement a program which creates and runs two independent threads:

  • A producer thread which continuously counts from 0 to infinity.
  • A consumer thread which continuously reads and prints the count produced by the producer thread.
  • The two threads should run for 5 seconds and then stop.
  • Hint: you can use thread::sleep to pause a thread for a certain amount of time.
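As a starting point for the timing part only, here is a minimal sketch of thread::sleep combined with std::time::Instant. The helper name `tick_for` and the durations are illustrative assumptions, and the sketch deliberately leaves out the threads and the channel.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: pauses for `step` repeatedly until `total` has
/// elapsed, and returns the number of pauses performed.
fn tick_for(total: Duration, step: Duration) -> u32 {
    let start = Instant::now();
    let mut ticks = 0;
    while start.elapsed() < total {
        // Puts the current thread to sleep for at least `step`:
        thread::sleep(step);
        ticks += 1;
    }
    ticks
}

fn main() {
    let ticks = tick_for(Duration::from_millis(50), Duration::from_millis(10));
    println!("Ticked {ticks} times");
}
```

The exact number of ticks depends on the operating system scheduler, since thread::sleep only guarantees a minimum duration.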

Asynchronous programming

While threads are a perfectly valid way of implementing concurrent programs, they also have a few disadvantages, such as relying on operating system scheduling and sometimes making the code difficult to reuse or modify. To address these issues, programmers came up with another way of structuring a program, as a set of tasks (whether independent, concurrent, or sequential), called asynchronous programming.

Code within a thread is written in a sequential style, and the operating system runs the threads concurrently. With asynchronous programming, switching between tasks happens within the program itself: the operating system scheduler is not involved, which makes context switches faster and memory overhead lower. When it is natively integrated into the programming language, as is the case with Rust, it also makes control flow more flexible and expressive. For instance, a program using threads, like in the following example:

#![allow(unused)]
fn main() {
fn count_to(limit: i32) {
    for n in 1..=limit {
        println!("{n}");
        std::thread::sleep(std::time::Duration::from_secs(1));
    }
}
let t1 = std::thread::spawn(|| {
    count_to(10)
});
let t2 = std::thread::spawn(|| {
    count_to(20)
});
t1.join().unwrap();
t2.join().unwrap();
}

Could also be written like this using Rust's async and await features (here with the tokio runtime):

use tokio::{join, time::{sleep, Duration}};

async fn count_to(limit: i32) {
    for n in 1..=limit {
        println!("{n}");
        sleep(Duration::from_secs(1)).await;
    }
}

#[tokio::main]
async fn main() {
    // Run both futures concurrently, on the same task:
    join!(count_to(10), count_to(20));
}
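To make the idea of concurrency happening entirely within the program more concrete, here is a minimal sketch of a toy executor written with the standard library only. It is not how tokio is implemented: the names `NoopWaker`, `CountTo`, `run_all` and `interleaved_log` are hypothetical, and this executor simply polls every task in a loop, whereas a real runtime only polls a task again after it has been woken.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing: this toy executor polls in a loop anyway.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical future: counts from `next` to `limit`, yielding after each step.
struct CountTo {
    name: &'static str,
    next: u32,
    limit: u32,
    log: Rc<RefCell<Vec<String>>>,
}

impl Future for CountTo {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.next > self.limit {
            return Poll::Ready(());
        }
        let entry = format!("{}:{}", self.name, self.next);
        self.log.borrow_mut().push(entry);
        self.next += 1;
        // Hand control back to the executor instead of blocking the thread:
        Poll::Pending
    }
}

/// Polls every task in turn, on a single thread, until all are finished.
fn run_all(mut tasks: Vec<Pin<Box<dyn Future<Output = ()>>>>) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    while !tasks.is_empty() {
        // Keep only the tasks that are still pending after this round:
        tasks.retain_mut(|task| task.as_mut().poll(&mut cx).is_pending());
    }
}

/// Runs two counters concurrently and returns the order of their steps.
fn interleaved_log() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let a = CountTo { name: "a", next: 1, limit: 2, log: Rc::clone(&log) };
    let b = CountTo { name: "b", next: 1, limit: 3, log: Rc::clone(&log) };

    let mut tasks: Vec<Pin<Box<dyn Future<Output = ()>>>> = Vec::new();
    tasks.push(Box::pin(a));
    tasks.push(Box::pin(b));
    run_all(tasks);

    let result = log.borrow().clone();
    result
}

fn main() {
    // prints ["a:1", "b:1", "a:2", "b:2", "b:3"]
    println!("{:?}", interleaved_log());
}
```

The two counters make progress in alternation although only one thread exists, which is the behaviour that join! provides in the tokio example.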