Rust Introduction: Cargo, Crates, Rust project, Hello World (2h on computers)
Pierre Cochard, Tanguy Risset
This course has been set up for students of the Telecommunication Department at INSA-Lyon (5th year). It is largely inspired by the Rust book and many other resources on the web. It assumes that students do not have any programming experience in Rust but have strong programming experience in other languages (C/C++ and object-oriented languages in particular).
- Setting up the environment for using Rust
- Rust Hello World
- Variables, Types and Mutability
- Function in Rust
- Generic types, traits and #derive directive
- First example of "Move" semantic: the cube
- Introduction to Visual Studio Code
- How to use VS Code efficiently for Rust (TODO)
In addition to this document, you'll find other documents on Moodle presenting the concepts covered in this course. Don't forget to check them before you start. Much of the information listed here comes from https://www.rust-lang.org/learn/.
The course is organized in sections that contain questions. In addition, you will find boxed text labeled "Course:", which presents important concepts.
Course: What is Rust and Why Rust
Rust usage is growing quickly, and the number of libraries and projects written in Rust or useful to Rust developers is already large. The reason is that Rust provides safe memory management without a garbage collector, ensuring both performance and security. Its ownership system eliminates data races and segmentation faults in safe code. It enables efficient concurrent programming while guaranteeing memory safety. It comes with a powerful modern ecosystem and is suited for embedded systems, systems programming, and high-performance applications. Adopted by major industry players, Rust is emerging as a reliable alternative for secure and system-level development.
Setting up the environment for using Rust
We're going to start by setting up the environment that will enable you to program in Rust. We recommend that you use Rust on your own machine, but the environment is already installed on the department computers.
This environment simply consists of having:
- An editor for programming. We strongly recommend Visual Studio Code, which is available on all operating systems, for instance here: https://visualstudio.microsoft.com/fr/downloads/. See the appendix below for an introduction to Visual Studio Code.
- The cargo command, installed with rustup. cargo is the package manager and build system for Rust; it invokes the Rust compiler (rustc) for you. The installation and updating of cargo is itself managed by rustup.
Below is a summary for installing cargo on your laptop. The complete original instructions can be found here: https://doc.rust-lang.org/book/ch01-01-installation.html.
- On Linux or macOS, use the following command:
  curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh
- On Windows, go to https://www.rust-lang.org/tools/install and follow the instructions for installing Rust.
Course: What is cargo used for?
- Dependency Management: Cargo manages the dependencies of a Rust project. It automatically downloads and builds the required libraries and dependencies, making it easier for developers to include external code in their projects.
- Project Configuration: Cargo uses a file called Cargo.toml to configure a Rust project. This file includes information about the project, its dependencies, and various settings.
- Building and Compilation: Cargo handles the compilation process of Rust code. It can build the project, manage dependencies, and generate executable binaries. Developers can use Cargo commands like cargo build to compile the project or cargo run to build and run it in one step.
- Testing: Cargo provides built-in support for testing Rust code. Developers can use the cargo test command to run tests defined in the project.
- Documentation: Cargo can generate documentation for the project using the cargo doc command. This is useful for both internal and external documentation (the HTML file that you are reading has been generated by cargo doc).
- Publishing Packages: Cargo facilitates the process of publishing Rust packages to the official package registry, called "crates.io". This makes it easy for others to discover and use Rust libraries and projects.
Rust Hello World
A Rust project is contained in a directory which has the name of the project. From now on, we suggest making a projects directory in your home directory and keeping all your Rust projects there.
We will use the cargo command to build our first hello_world project. Note that this is not mandatory: everything can be built by hand, and the Rust compiler can be invoked without cargo by using the rustc command.
cargo new hello_world
Cargo.toml is the project configuration file, written in the TOML (Tom's Obvious, Minimal Language) format [1], and src/main.rs is the Rust "main" file. If anything is unclear, ask the teacher.
cargo build
Where is the generated executable file? What is the bang (!) after println?
Course: What about println!
As you can check in the Rust Standard Library documentation, println! is not a function, it is a macro.
Macros are called with a trailing bang (such as println!). They are a way of writing code that writes other code, which is known as metaprogramming. Understanding and declaring macros is quite complex and will be seen later, but using them (such as println!) is usually very easy.
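As a small illustration of how formatting macros are used, here is a minimal sketch. The helper `describe` and its arguments are hypothetical names chosen for this example; `format!` is the standard macro that uses the same syntax as `println!` but returns a `String` instead of printing.

```rust
// Hypothetical helper: builds a message with the same syntax println! accepts.
fn describe(name: &str, size: f32) -> String {
    // {name} captures the variable directly; {size:.2} prints two decimals.
    format!("{name} has size {size:.2}")
}

fn main() {
    let x = 5;
    // Positional placeholder:
    println!("x = {}", x);
    // Variable captured by name inside the braces:
    println!("x = {x}");
    println!("{}", describe("cube", 0.5));
}
```

The two `println!` calls print the same text; capturing the variable by name is simply shorter when the value is already bound to a variable.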
Variables, Types and Mutability
let x = 5;
println!("The value of x is: {x}");
x = 6;
println!("The value of x is: {x}");
fn main() {
    let x = 5;
    {
        let x = x + 1;
        {
            let x = x * 2;
            println!("The value of x in the inner scope is: {x}");
        }
    }
    println!("The value of x is: {x}");
}
The notion of scope is quite important in Rust: a "scope" can be manipulated in the language as an object. We will see this in more detail in TD4.
Function in Rust
Functions in Rust are like functions in other languages. They are declared with the keyword fn, and parameters are passed by value. An important specificity (borrowing values) will be studied in the next course.
Write a function fibonacci: fibonacci(n: i32) -> i32 which computes element n of the Fibonacci sequence. We will not use a recursive solution, but rather mutable variables and a for loop whose syntax will be: for i in 2..n+1 (a half-open range, which excludes its upper bound).
We recall the definition of the Fibonacci function fib:
fib(0)=1
fib(1)=1
fib(i)=fib(i-1)+fib(i-2) for i >= 2
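To make the loop syntax concrete without giving away the exercise, here is a minimal sketch on a different computation. The function `sum_from_two` is a hypothetical example: it adds the integers from 2 to n using a mutable accumulator and the half-open range `2..n + 1`.

```rust
// Hypothetical helper: sums the integers 2, 3, ..., n.
fn sum_from_two(n: i32) -> i32 {
    // `mut` is required because `total` is modified inside the loop.
    let mut total = 0;
    // Half-open range: i takes the values 2, 3, ..., n (n + 1 is excluded).
    for i in 2..n + 1 {
        total += i;
    }
    total
}

fn main() {
    println!("sum_from_two(5) = {}", sum_from_two(5));
    // 2..n + 1 and the inclusive range 2..=n describe the same values.
    assert_eq!((2..5 + 1).count(), (2..=5).count());
}
```

When n is smaller than 2 the range is empty, so the loop body never runs and the function returns 0.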
Generic types, traits and #derive directive
Like classes in C++, types can be defined and can implement different methods. For instance, the following code defines the type Complex as a struct of two floats (as with C++ classes, type names begin with an uppercase letter by convention).
fn main() {
    struct Complex {
        re: f32,
        im: f32,
    }
    fn build_complex(re: f32, im: f32) -> Complex {
        Complex { re, im }
    }
    let mut a = build_complex(2.3, 4.0);
    println!("a=({},{})", a.re, a.im);
    a.re = a.re + 1.;
    println!("a=({},{})", a.re, a.im);
}
Course: Generic Types
Like templates in C++, Rust enables the use of generic types in function, struct, enum or method definitions. Here is a simple definition of a Point structure using integer or float coordinates:
fn main() {
    struct Point<T> {
        x: T,
        y: T,
    }
    let integer = Point { x: 5, y: 10 };
    let float = Point { x: 1.0, y: 4.0 };
}
Rewrite the Complex type above as a struct Complex<T> that uses a generic type.
Course: Traits: Defining common behaviour
A trait defines a functionality that a particular type has and can share with other types. Traits are similar to a feature often called interfaces in other languages.
Many natural methods can be defined for any (or at least many) types. For instance, the copy or clone operations (traits Copy and Clone) or the fmt method (of the trait std::fmt::Display, from the Rust standard library), which makes it possible to use println!. These methods are not defined by default when a new type is defined.
Traits are defined with the trait keyword. By convention, their names start with an uppercase letter, e.g. the trait Clone. A trait usually defines a method with the same name in lower case (here: clone()).
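As an illustration of how a trait is declared and implemented, here is a minimal sketch. The trait `Area` and the types `Square` and `Rect` are hypothetical names invented for this example; they are not part of the standard library or of the exercises.

```rust
// Hypothetical trait: any shape that can report its area.
trait Area {
    fn area(&self) -> f32;
}

struct Square {
    side: f32,
}

struct Rect {
    w: f32,
    h: f32,
}

impl Area for Square {
    fn area(&self) -> f32 {
        self.side * self.side
    }
}

impl Area for Rect {
    fn area(&self) -> f32 {
        self.w * self.h
    }
}

// Generic function: accepts any two types that implement Area.
fn total_area<A: Area, B: Area>(a: &A, b: &B) -> f32 {
    a.area() + b.area()
}

fn main() {
    let s = Square { side: 2.0 };
    let r = Rect { w: 1.0, h: 3.0 };
    println!("total area = {}", total_area(&s, &r));
}
```

The bound `A: Area` plays the same role as the `T: Clone` bound mentioned in this section: it restricts the generic type to those that provide the required method.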
If you want to clone a Complex, you just have to write the implementation of the clone method:
impl Clone for Complex {
    fn clone(&self) -> Self {
        Complex { re: self.re, im: self.im }
    }
}
// Now a.clone() can be used on Complex variables
Implement the Clone trait for the struct Complex<T> type defined before. You will have to use impl<T: Clone> to ensure that the generic type T implements the Clone trait.
By the way, do you know the difference between Copy and Clone? Select your preferred answer:
- Clone is a supertrait of Copy [2]
- Copy is implicit, inexpensive, and cannot be re-implemented (memcpy). Clone is explicit, may be expensive, and may be re-implemented arbitrarily.
- The main difference is that cloning is explicit. Implicit notation means move for a non-Copy type.
Course: Deriving traits
For certain traits [3] (Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default, etc.), the compiler is capable of providing basic implementations via the #[derive] attribute. (In Rust, an attribute is metadata applied to code elements like functions, structs, modules, or crates; attributes are prefixed with # and enclosed in square brackets [].)
These traits can still be manually implemented if a more complex behavior is required.
For instance, the Clone trait can be automatically derived for the Complex type:
#[derive(Clone)]
struct Complex {
    re: f32,
    im: f32,
}
// [... no need to implement Clone ...]
let a = build_complex(2.3, 4.0);
let _c = a.clone();
// [...]
How can println! be used to display Complex variables? There are two methods; test both:
- Implement the std::fmt::Display trait for type Complex. This will require you to:
  - use std::fmt
  - search for the prototype of the Display trait
  - use the macro write! to print fields
- Derive the Debug trait, which provides its own fmt method (Debug and Display are two separate traits), and use the "{:?}" format.
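To show the two approaches side by side on a type other than Complex, here is a minimal sketch. The struct `Temperature` and its field are hypothetical; the signature of `fmt` is the one required by `std::fmt::Display`.

```rust
use std::fmt;

// Debug is derived automatically; Display is written by hand.
#[derive(Debug)]
struct Temperature {
    celsius: f32,
}

impl fmt::Display for Temperature {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // write! works like println! but targets the formatter `f`.
        write!(f, "{:.1} degrees", self.celsius)
    }
}

fn main() {
    let t = Temperature { celsius: 21.5 };
    println!("{}", t); // uses the Display implementation
    println!("{:?}", t); // uses the derived Debug implementation
}
```

Display is meant for user-facing output, so its layout is chosen by the programmer, while the derived Debug output follows the structure of the type.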
First example of "Move" semantic: the cube
In this first example we define a very simple data structure, a Cube with a single field c that indicates the size of the cube.
Write a program that defines a Cube and prints its size.
What happens with println!("My cube: {}", Cube{c:0.5});? How can the cube be made printable?
Hint: you can implement the std::fmt::Display trait or derive the Debug trait.
Define a variable x assigned to a given cube, print it, and then define a second variable y by let y = x;. Then print x again. What is the problem?
Course: Move semantic
In Rust, move semantics refers to the ownership transfer of data from one variable to another. Rust enforces a strict ownership model where each piece of data has a single owner at a time, and ownership can be transferred (or "moved") when a value is assigned to another variable or passed as an argument to a function.
By default, an assignment such as let y = x implies a transfer of ownership of the content of x to y. This ownership concept will be studied further in the next course. This limits side effects: modifying y does not modify x.
In order to duplicate the cube (as would be done in any language), one has to clone it or to implement the Copy trait. Deriving the Copy trait for Cube changes the semantics of the assignment: the assignment is now a copy, not a move.
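The difference between move, clone and copy can be sketched on two hypothetical types, `Label` (which only derives Clone) and `Meters` (which also derives Copy). These names and the helper `duplicate` are assumptions made for this illustration only.

```rust
// Only Clone is derived: a plain assignment moves the value.
#[derive(Debug, Clone)]
struct Label {
    text: String,
}

// Copy is derived as well: a plain assignment duplicates the value.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f32);

// Hypothetical helper: returns the original and its copy.
fn duplicate(m: Meters) -> (Meters, Meters) {
    let n = m; // copy, `m` is still usable
    (m, n)
}

fn main() {
    let a = Label { text: String::from("box") };
    let b = a.clone(); // explicit duplicate, `a` is still usable
    let c = a; // move: `a` can no longer be used after this line
    // println!("{:?}", a); // would not compile: value moved into `c`
    println!("{:?} {:?}", b, c);

    let (m, n) = duplicate(Meters(1.5));
    println!("{:?} {:?}", m, n);
}
```

A type can only derive Copy if all of its fields are Copy, which is why `Label`, with its String field, has to rely on an explicit `clone()`.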
x
into y
Introduction to Visual Studio Code
Visual Studio Code (often abbreviated as VS Code) is a cross-platform source code editor developed by Microsoft. It is compatible with Windows, macOS, and Linux, offering great flexibility to developers working in diverse environments. This lightweight yet powerful editor is designed to meet the needs of modern developers, providing a wide range of features.
Key Features of Visual Studio Code
Visual Studio Code stands out due to the following features:
- Built-in support for multiple programming languages: VS Code supports a wide array of languages such as Python, JavaScript, C++, Java, and more, thanks to its extension system.
- Extensions and customization: A vast library of extensions is available to add functionalities like debugging, version control, and language-specific tools.
- Integrated debugger: VS Code provides an interactive debugging environment to simplify error correction in the code.
- Version control integration: Seamless integration with Git and other version control systems allows developers to track code changes directly within the editor.
- Integrated terminal: A terminal is available inside the editor, enabling command execution without leaving the application.
- IntelliSense: This feature offers intelligent code completion and contextual suggestions based on syntax and variable types.
- Cross-platform compatibility: VS Code works consistently on Windows, macOS, and Linux, ensuring a uniform user experience regardless of the operating system.
Thanks to its intuitive interface and powerful tools, Visual Studio Code has become one of the most popular editors among developers, whether they are beginners or experienced professionals. Its active community and frequent updates make it a reliable choice for addressing the evolving needs of software development.
How to use VS Code efficiently for Rust (TODO)
- Install with apt on Linux, or go to https://code.visualstudio.com/
- Launch it on the project directory
- Add the Rust extension (left bar, small squares icon): search for "rust", then install rust-analyzer
- Go to the explorer
- Ctrl-Shift-P for the list of commands
[1] You can find more documentation about the TOML format here: https://toml.io/en/ or here in French: https://toml.io/fr/. However, it is probably not necessary: TOML is quite simple to understand.
[3] List of derivable traits: https://doc.rust-lang.org/rust-by-example/trait/derive.html?highlight=derive#derive
Ownership, borrowing, mutability, heap and stack in Rust (2h on computers)
Pierre Cochard, Tanguy Risset
- Move semantics and Copy semantics
- References and borrowing
- Mutable Reference
- Heap and Stack: the String example
- Smart Pointers
- Recalls on Heap and Stack
course: Ownership (from https://doc.rust-lang.org/book/)
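Some languages rely on garbage collection, and others require the programmer to allocate and free memory explicitly. Rust uses a third approach: memory is managed through a system of ownership with a set of rules that the compiler checks.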
If any of the rules are violated, the program won't compile. None of the features of ownership will slow down your program while it's running.
Because ownership is a new concept for many programmers, it does take some time to get used to. When you understand ownership, you'll have a solid foundation for understanding the features that make Rust unique.
Here are the Ownership Rules:
- Each value in Rust has an owner.
- There can only be one owner at a time.
- When the owner goes out of scope, the value will be dropped.
The scope notion is (for the moment) the same as in traditional languages such as C.
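The third rule can be observed with the standard `Drop` trait, whose `drop` method runs when a value is destroyed. The sketch below is only an illustration: the `Tracker` type and the `count_drops` function are hypothetical, and the `'a` lifetime annotation it needs is a notion covered later in the course.

```rust
use std::cell::Cell;

// Hypothetical type: increments a shared counter each time one is dropped.
struct Tracker<'a> {
    drops: &'a Cell<u32>,
}

impl<'a> Drop for Tracker<'a> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the counter after the inner scope ends, then after an explicit drop.
fn count_drops() -> (u32, u32) {
    let drops = Cell::new(0);
    let outer = Tracker { drops: &drops };
    {
        let _inner = Tracker { drops: &drops };
    } // `_inner` goes out of scope here and is dropped
    let after_inner_scope = drops.get();
    drop(outer); // ownership moves into `drop`, which ends the value's life
    (after_inner_scope, drops.get())
}

fn main() {
    let (after_inner, after_outer) = count_drops();
    println!("drops after inner scope: {after_inner}, after outer: {after_outer}");
}
```

No call to a `free`-like function appears anywhere: the compiler inserts the cleanup at the closing brace of the scope that owns the value.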
Move semantics and Copy semantics
As we have seen in the previous course, the following program compiles because:
- The default semantics of assignment for type Cube is move.
- But deriving the Copy trait turns it into copy semantics, hence x and y represent two different values.
#[derive(Debug, Clone, Copy)]
struct Cube {
    c: f32,
}
fn main() {
    let x = [Cube{c:0.5},Cube{c:0.75},Cube{c:1.0}];
    let y = x;
    println!("x is: {:?}", x);
    println!("y is: {:?}", y);
}

fn main() {
    let x = [(10,20),(30,40),(50,60)];
    let y = x;
    println!("x is: {:?}", x);
    println!("y is: {:?}", y);
}
Why does it work?
References and borrowing
There is an alternative to moving or duplicating (i.e. cloning) a value: you can borrow it. Borrowing in Rust is done with the reference operator: '&'.
Without deriving the Copy trait, create a reference to x by using let y = &x. Can you print x after that?
course: Reference in Rust
References in Rust are equivalent to references in any language: a pointer to the same content, except that, because of the strong static verifications performed by the compiler, a reference is always guaranteed to point to a valid value of a particular type for the life of that reference [1].
References are indicated by the '&' operator. As in C, the opposite of referencing is dereferencing, which is accomplished with the dereference operator: '*'. However, in practice, the '*' operator can often be omitted; this is called deref coercion or autoderef (it relies on the Deref trait, which is implemented for all references).
This autoderef is implemented in almost all cases, except when you assign a value to a dereferenced mutable reference:
let mut x = 10;
let y = &mut x;
*y = 20; //explicit dereferencing is required here
Borrowing is extremely useful in function calls. Each time you call a function with a parameter, the ownership of the object passed as a parameter is transferred to the function (actually, it is transferred to the formal parameter of the function). If, instead, you pass a reference to the object, the ownership does not change, so you can call many functions that only use an object without modifying it by using references.
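The difference between passing a value and passing a reference can be sketched as follows. The two functions are hypothetical helpers written for this illustration; they compute the same result but treat ownership differently.

```rust
// Borrows the String: the caller keeps ownership.
fn length_by_ref(s: &String) -> usize {
    s.len()
}

// Takes ownership of the String: the caller can no longer use it afterwards.
fn length_by_value(s: String) -> usize {
    s.len()
}

fn main() {
    let text = String::from("hello");
    let n1 = length_by_ref(&text);
    let n2 = length_by_ref(&text); // `text` is still usable after a borrow
    let n3 = length_by_value(text); // `text` is moved into the function
    // println!("{}", text); // would not compile: value moved
    println!("{n1} {n2} {n3}");
}
```

The borrowing version can be called as many times as needed, whereas the by-value version can be called only once with the same variable.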
Mutable Reference
Sometimes, you wish to have a function call that modifies an object. For that, you can use a mutable reference with the syntax: let y = &mut x. Mutable references in Rust do not change ownership. They only provide exclusive access to a value for mutation while ensuring that the ownership of the value remains unchanged.
Using a mutable reference (let y = &mut x), write a function called double that doubles the size of your cube x.
course: Mutable reference
Ownership in Rust means having full control over a value (here a value is to be understood as an L-value, i.e. a value which is stored in a memory location). The owner is responsible for managing the value's lifetime (we will talk later about lifetimes) and cleaning up its resources when it goes out of scope. Ownership can be transferred (moved) but is unique at any given time (except in very special cases that we will see).
Borrowing (via references, either &T or &mut T) allows you to access a value without transferring ownership. Immutable borrow (&T): grants read-only access to a value. Mutable borrow (&mut T): grants exclusive write access to a value.
Rules of Mutable References:
- You can only have one mutable reference to a value at a time.
- While a mutable reference exists, no other references (mutable or immutable) to the same value are allowed.
It is important to understand that the Rust compiler evaluates the scope of each variable very precisely.
#[derive(Debug)]
struct Cube {
    c: f32,
}
fn double(y : &mut Cube) {
    y.c = 2.*y.c;
}
fn main() {
    let mut x = Cube { c: 0.75 };
    let y = &mut x;
    double(y);
    println!("My cube is: {:?}", x);
    println!("My cube is: {:?}", y);
}

#[derive(Debug)]
struct Cube {
    c: f32,
}
fn double(y : &mut Cube) {
    y.c = 2.*y.c;
}
fn main() {
    let mut x = Cube { c: 0.75 };
    let y = &mut x;
    double(y);
    println!("My cube is: {:?}", y);
    println!("My cube is: {:?}", x);
}
Heap and Stack: the String example
Many programming languages don't require you to think about the stack and the heap very often. But in a systems programming language like Rust, whether a value is on the stack or the heap affects how the language behaves and why you have to make certain decisions.
Section 6 recalls the basics that everyone should know about the heap and the stack; please read it if you are not very familiar with these concepts.
The following code manipulates a string that contains hello:
let s1 = String::from("hello");
let s2 = s1;
As you know, if s1 were set to an integer (say 5), then s2 would have been set to a copy of 5, because i32 has copy semantics by default. But here, s1 is bound to a String. We will study strings in more detail later, but this is a good example to understand the difference between the heap and the stack.
A String is made up of three parts, shown on the left of the figure below (taken from the Rust book): a pointer to the memory that holds the contents of the string, a length, and a capacity. This group of data is stored on the stack. On the right is the memory on the heap that holds the contents. The reason for this is that a string might contain an arbitrarily long character string, but the size used to store the structural information (i.e., pointer, length, and capacity) does not change from one string to another; it is known statically.


(a) (b)
When we assign s1 to s2, the String data is copied, meaning we copy the pointer, the length, and the capacity that are on the stack. We do not copy the data on the heap that the pointer refers to. In other words, the data representation in memory looks like the right of the figure above.
Note that the actual content of the string (i.e. the 'hello' characters) is not duplicated. Moreover, it can no longer be reached through s1: String has move semantics, so s1 is moved to s2 (the data is now owned by s2) [2].
Write a function fn append_word(s: &mut String) and call it with a mutable reference to s2. You can use the method pub fn push_str(&mut self, string: &str).
Smart Pointers
Smart pointers are inherited from other languages such as C++. Smart pointers are data structures that act like a pointer but also have additional metadata and capabilities. Rust has a variety of smart pointers defined in the standard library that provide functionality beyond that provided by references. To explore the general concept, we'll look at a couple of different examples of smart pointers, including a reference counting smart pointer type (Rc) and a unique pointer on the heap (Box).
The most straightforward smart pointer is a Box, whose type is written Box<T>. Boxes allow you to store data on the heap rather than the stack, with a unique pointer. What remains on the stack is the pointer to the heap data. This is useful, for instance, to create recursive types.
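As an illustration of a recursive type built with Box, here is a minimal sketch of a binary tree. The `Tree` enum and the `depth` function are hypothetical examples, not taken from the course exercises.

```rust
// A recursive type: each Node owns its two subtrees through heap pointers.
// Without Box, the size of Tree could not be computed at compile time.
enum Tree {
    Leaf,
    Node(Box<Tree>, i32, Box<Tree>),
}

// Number of nodes on the longest path from the root to a leaf.
fn depth(t: &Tree) -> u32 {
    match t {
        Tree::Leaf => 0,
        // `l` and `r` are &Box<Tree>; they are dereferenced automatically.
        Tree::Node(l, _, r) => 1 + depth(l).max(depth(r)),
    }
}

fn main() {
    let right = Tree::Node(Box::new(Tree::Leaf), 2, Box::new(Tree::Leaf));
    let root = Tree::Node(Box::new(Tree::Leaf), 1, Box::new(right));
    println!("depth = {}", depth(&root));
}
```

Because a Box has a fixed size (one pointer), the compiler can compute the size of `Tree` even though a tree can be arbitrarily deep.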
Define a type List based on the following structure: a list is either (the "either" corresponds to an enum) the constant Nil or the concatenation of an integer and a List: Cons(i32, List). Try without using Box, then using Box.
You will have to declare the use of the created symbols after the definition of List by writing: use crate::List::{Cons,Nil};
In the majority of cases, ownership is clear: you know exactly which variable owns a given value. However, there are applications where a single value might have multiple "owners". For example, in graph data structures, multiple edges might point to the same node, and that node is conceptually owned by all of the edges that point to it. A node shouldn't be cleaned up unless it doesn't have any edges pointing to it and so has no owners.
You have to enable multiple ownership explicitly by using the Rust type Rc<T>, which is an abbreviation for reference counting. We use the Rc<T> type when we want to allocate some data on the heap for multiple parts of our program to read and we can't determine at compile time which part will finish using the data last. Note that Rc<T> is only for use in single-threaded scenarios; other constructs are used in multithreaded programs.
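The reference count maintained by Rc<T> can be observed with the standard function `Rc::strong_count`. The sketch below shares a String rather than a list; the function name `share_counts` is a hypothetical helper for this illustration.

```rust
use std::rc::Rc;

// Returns the reference count at three points in time.
fn share_counts() -> (usize, usize, usize) {
    let a = Rc::new(String::from("shared"));
    let initial = Rc::strong_count(&a);
    // Rc::clone does not copy the String, it only increments the counter.
    let b = Rc::clone(&a);
    let while_shared = Rc::strong_count(&a);
    drop(b); // one owner disappears, the counter is decremented
    let after_drop = Rc::strong_count(&a);
    (initial, while_shared, after_drop)
}

fn main() {
    let (initial, while_shared, after_drop) = share_counts();
    println!("{initial} -> {while_shared} -> {after_drop}");
}
```

The heap data is freed only when the counter reaches zero, that is, when the last owner goes out of scope.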
In the figure below, a list (a) is shared by two other lists (b and c). Write a program that creates this object by using Rc<T> instead of Box<T> in the List definition. (As Rc is not in the prelude, you have to write use std::rc::Rc;)

Recalls on Heap and Stack
Although knowing exactly how memory is managed is generally not necessary for a programmer, in many cases (systems programming or embedded programming for instance, often done in Rust) it is crucial to understand how memory is handled by the compiler and the OS. From the programmer's point of view, and thanks to the virtual memory system, everything happens as if the whole memory were available.
Memory management is more or less the same for every language and system; what differs is what is visible to the programmer: explicit memory management (malloc/free), garbage collection, etc. This memory is organized in different sections, almost always in the following way:

The "code" section contains the assembly code of the program. The "static" section contains all the "static" variables (i.e. variables that are available during the whole execution of the program). The two other sections are managed dynamically during execution:
- The heap is used for dynamic memory allocation: malloc (in C) or new (in object-oriented languages). The objects stored in the heap have a lifetime that is independent of function execution: they can survive after the function that created them has finished. The heap can be managed explicitly (as in C with malloc and free) or implicitly (using a garbage collector, as in Python for instance).
- The stack is used to manage the execution of functions (or procedures in general), which includes in particular the allocation and management of the functions' local variables.
The stack starts from high addresses and grows downward, although it is often represented upside-down as below: small addresses up, big addresses down. The heap grows upward; when the two bounds meet, the system is out of memory.
The stack execution principle is important to know. When a function is called, a space is allocated on the stack to store its local variables: this space is called the function frame. When the function ends, its frame is freed and the stack goes back to the frame of the calling function.
Below is an illustration of the evolution of the stack during a function call. Two registers of the processor are indicated: the stack pointer (SP), which indicates the top of the stack, and the frame pointer (FP), which indicates the beginning of the frame of the current function. The frame contains all the information needed for the execution of the function, including room for local variables.



(a) before call (b) during call (c) after call
- Before the call, the frame pointer FP points to the frame of the calling function.
- During the call, the stack is increased (i.e. SP is decreased, as the stack is upside-down) to make room for the frame of the called function. This includes room for the local variables of the function, the parameters given to the function, the information for returning from the function (the return address in the code, because a given function can be called from many places in the code), room for the function result, as well as some bookkeeping information such as saved values of the processor registers.
- After the call, the called function's frame has disappeared. Actually its content is still there but cannot be accessed anymore, because the stack pointer SP has been put back to its location before the call.
Important to remember: the function variables whose size is known at compile time are usually stored on the stack. The variables whose size is only known during execution, such as a String or an object created by new, are usually stored on the heap.
[1] This is a major difference between Rust and other languages: there are no "null pointers". Interestingly enough, the decision to allow null pointers was made by Tony Hoare during the 1960s; it is known as his "billion dollar mistake": https://news.ycombinator.com/item?id=12427069
[2] It is important to know that the pointer used in a String has the "Unique<T>" type, which forbids the object pointed to by this pointer from having two owners at the same time. Hence the String type cannot have copy semantics.
Advanced types: Compound and collection types
Pierre Cochard, Tanguy Risset
Compound types
Compound types can group multiple values into one type. Rust has two primitive compound types: tuples and arrays.
Course: Tuples
Tuples are fixed-size collections of arbitrarily-typed values. They are defined with the (Type, Type, ...) syntax:
fn main() {
    // Explicit type:
    let mut tup: (i32, i32, f32, &str) = (31, 16, 47.27, "hello!");
    // Inferred type:
    let mut tup = (31, 16, 47.27, "hello!");
}
Accessing individual values within a tuple can be done by either:
- referring to its index
- destructuring the tuple and binding its values to individually named variables:
fn main() {
    let mut tup = (31, 16, 47.27, "hello!");
    // Access by index:
    tup.0 = 22;
    tup.3 = "world!";
    // 'Destructuring' a tuple:
    let (t0, t1, t2, t3) = tup;
    println!("y = ({t1}, {t2})");
}
Take a mutable reference to the third element (f32) of the tuple tup and pass it to a function that multiplies it by 2:
let mut tup = (31, 16, 47.27, "hello!");
// The function's prototype (to be implemented):
fn mul2(t: &mut f32);
mul2(...);
assert!(tup == (31, 16, 94.54, "hello!"));
Tuples can be conveniently used in a function in order to return multiple values, which can then be assigned to distinct variables in a single expression:
fn main() {
    // A function returning a pair of signed integers:
    fn return_tuple(x: i32) -> (i32, i32) {
        return (x+1, x+2);
    }
    // Calling a function, and storing its result:
    let y: (i32, i32) = return_tuple(8);
    println!("y = {:?}", y);
    println!("y = ({:?}, {:?})", y.0, y.1);
    // Passing a tuple as an argument to a function:
    fn print_tuple(x: &(i32, i32)) {
        println!("x = ({:?}, {:?})", x.0, x.1);
    }
    print_tuple(&y);
}
Write a function that transforms a (i32, i32)
tuple by swapping its two values:
fn main() {
    // The function prototype to be implemented:
    fn swap(tup: &mut(i32, i32));
    let mut x = (31i32, 27i32);
    swap(&mut x);
    assert!(x == (27i32, 31i32));
}
Course: Arrays (primitive type)
Arrays are fixed-size groups of values of the same type, and can be defined in Rust with the syntax [Subtype; Length], for instance [i32; 10]:
fn main() {
    // Explicit type:
    let a: [i32; 3] = [31, 16, 47];
    // Inferred type:
    let b = [0, 1, 2, 3, 4]; // [i32; 5]
    // Create and zero-initialize an array:
    let mut a: [usize; 10] = [0; 10];
    // same as:
    let mut a = [0 as usize; 10];
    // same as:
    let mut a = [0usize; 10];
    // Writing at a specific index:
    // Note: in Rust, as in C, array indices start at 0
    a[0] = 2;
    println!("a[0] = {}", a[0]);
}
As in most programming languages, multidimensional/nested arrays are also supported in Rust, and can be declared as follows:
fn main() {
    // 2-dimensional array, 2 arrays of `i32` with a length of 10 each:
    let mut multi_array = [[0 as i32; 10]; 2];
}
What would be the type of the following arrays?
fn main() {
    let a1 = [(1, 2), (3, 4), (5, 6)];
    let a2 = [(1, 2), (3, 4), (5, (6, 7))];
}
Course: Ranges & Iterators
Arrays are convenient for storing and processing a set of contiguous data on the stack, for instance through the use of loops, ranges and iterators.
A range represents an interval of values between a start and an end point.
In Rust, they can be conveniently used with the start..end construct (here excluding the end value), or with start..=end (here including the end value).
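The two range constructs can be compared by collecting their values into a vector. The helper names `half_open` and `inclusive` below are hypothetical, chosen for this illustration.

```rust
// Values of start..end: the end value is excluded.
fn half_open(start: i32, end: i32) -> Vec<i32> {
    (start..end).collect()
}

// Values of start..=end: the end value is included.
fn inclusive(start: i32, end: i32) -> Vec<i32> {
    (start..=end).collect()
}

fn main() {
    println!("0..3  -> {:?}", half_open(0, 3));
    println!("0..=3 -> {:?}", inclusive(0, 3));
}
```

The half-open form matches array indexing: `0..a.len()` visits exactly the valid indices of an array `a`.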
Examine the following assert! statements. Will this program compile?
let a = 0..10;
let b = 1..=10;
// 'a' range:
assert!(a.contains(&0));
assert!(!a.contains(&10));
// 'b' range:
assert!(!b.contains(&0));
assert!(b.contains(&10));
// The 'a' and 'b' ranges have the same number of elements:
assert_eq!(a.count(), b.count());
Course: Iterators
Iterators make it possible to go through an array, a range or a collection, and access each element one by one.
fn main() {
    let r = 0..10;
    // Iterate over a range:
    for n in r.into_iter() {
        print!("{n} "); // -> 0 1 2 3 4 5 6 7 8 9
    }
    println!();
    // Iterate over an array:
    let mut a = [0; 10];
    // Basic for-loop iteration:
    for x in a {
        println!("{x}");
    }
    // From a 'range':
    for n in 0..a.len() {
        println!("{}", a[n]);
    }
    // As mutable, changing the values of the array:
    for x in &mut a {
        *x += 1;
    }
    // Equivalent to (using a 'closure'):
    a.iter_mut().for_each(|x| *x += 1 );
    // Iterate with both element and index:
    for (i, x) in a.iter_mut().enumerate() {
        *x += i;
    }
}
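Iterators can also be chained with adapters such as `map` and `enumerate`, and consumed with methods such as `sum` or `collect`. The sketch below is a small illustration; the function names `squares_sum` and `indexed` are hypothetical.

```rust
// Sum of the squares of all values, without an explicit loop.
fn squares_sum(values: &[i32]) -> i32 {
    values.iter().map(|x| x * x).sum()
}

// Pairs each value with its index.
fn indexed(values: &[i32]) -> Vec<(usize, i32)> {
    values.iter().copied().enumerate().collect()
}

fn main() {
    let a = [1, 2, 3];
    println!("sum of squares = {}", squares_sum(&a));
    println!("indexed = {:?}", indexed(&a));
}
```

Both functions take a slice (`&[i32]`), so they accept a borrowed array as well as a borrowed vector.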
Using ranges and/or iterators, fill the first sub-array of the following multidimensional array with values that go incrementally from 1 to 10, and the second sub-array with values that go decrementally from 10 to 1, as shown below:
fn main() {
    let mut multi_array = [[0 as i32; 10]; 2];
    // The following must be true:
    assert_eq!(multi_array, [
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
    ]);
}
Collections
In addition to primitive compound types, the Rust standard library includes a number of very useful data structures called collections. Unlike the built-in array and tuple types, the data these collections point to is stored on the heap, which means the amount of data does not need to be known at compile time and can grow or shrink as the program runs.
Course: Vectors
'Vectors' are collections of multiple values of the same type stored on the heap. Unlike arrays, they have a dynamic size: they can grow, or shrink.
A Vec object has ownership over the data located in its underlying heap-allocated buffer, which means that the buffer will be deallocated whenever the owning object goes out of scope.
#![allow(unused)] fn main() { // The easiest way to create a vector is to use the 'vec!()' macro: let mut v = vec![0, 1, 2, 3, 4, 5]; // Vec<i32> println!("Value: {:?}", v); // 'Pushing' (appending) a new value at the end: v.push(6); println!("Value: {:?}", v); // 'Popping' (removing) its last value: let last = v.pop(); println!("Last value: {:?}, Vector: {:?}", last, v); }
Examine the following v0, v1 and v2 vectors and their underlying heap buffer pointers.
#![allow(unused)] fn main() { let mut v0: Vec<i32> = vec![0, 1, 2, 3, 4]; // get a pointer to the underlying heap memory buffer: let v0_ptr: *const i32 = v0.as_ptr(); // Create another vec 'v1' from 'v0', and get its heap pointer again: let mut v1: Vec<i32> = v0; let v1_ptr = v1.as_ptr(); // Create another vec 'v2' from 'v1': let mut v2: Vec<i32> = v1.clone(); let v2_ptr = v2.as_ptr(); }
Which of the following assertions are true?
#![allow(unused)] fn main() { // Assertion A: the address of pointer 'v0' is the same as pointer 'v1' assert_eq!(v0_ptr.addr(), v1_ptr.addr()); // Assertion B: the address of pointer 'v1' is the same as pointer 'v2' assert_eq!(v1_ptr.addr(), v2_ptr.addr()); // Assertion C: the address of pointer 'v0' is the same as pointer 'v2' assert_eq!(v0_ptr.addr(), v2_ptr.addr()); }
Iterating over a vector is the exact same process as for an array (most operations are inter-compatible!).
#![allow(unused)] fn main() { // Initializing from a range and iterator: let mut v = Vec::from_iter((0..6).map(|i| i+1 )); println!("Value: {:?}", v); // Iterate/increment: for x in &mut v { *x += 1; } println!("Value: {:?}", v); // General operations: v.rotate_left(1); println!("Value: {:?}", v); // etc. }
Using a single loop, move the contents of vector v to array a, such that vector v is equal to vec![] (empty vector) and array a is equal to [5, 4, 3, 2, 1, 0]:
#![allow(unused)] fn main() { let mut v = vec![0, 1, 2, 3, 4, 5]; let mut a = [0; 6]; (...) assert_eq!(v, vec![]); assert_eq!(a, [5, 4, 3, 2, 1, 0]); }
Course: Hash-maps
A HashMap is a heap-allocated collection of same-type values indexed by a unique key.
Like vectors, hash maps can grow, or shrink. They make a convenient choice for representing indexes, dictionaries, or any other type of database-like objects:
fn main() {
    // Unlike Vec, the HashMap data structure needs to be explicitly imported!
    use std::collections::HashMap;
    // Inferred type:
    let mut departments = HashMap::new(); // HashMap<i32, &str>
    departments.insert(85, "Vendée");
    departments.insert(31, "Haute-Garonne");
    departments.insert(44, "Loire-Atlantique");
    // The index operator [..] takes a reference to the key (&31).
    // The stored value is a '&str', which is 'Copy': 'd31' is a copy of that
    // reference, and the entry itself stays in the HashMap.
    let d31 = departments[&31];
    assert_eq!(d31, "Haute-Garonne");
    // Removing a key:
    departments.remove(&85);
    // Iterating over all entries (in arbitrary order):
    for department in departments {
        // We get a (key, value) tuple!
        println!("Key: {}, Value: {}", department.0, department.1);
    }
}
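Indexing a HashMap with [..] panics when the key is absent. A possible alternative, sketched below, is the get() method, which returns an Option instead. The function name department_name and the "unknown" fallback are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Looks up a department name by number, falling back to "unknown".
/// (Hypothetical helper, for illustration only.)
fn department_name(map: &HashMap<i32, &'static str>, number: i32) -> &'static str {
    // 'get' returns an Option: None when the key is absent (no panic).
    // 'copied' turns the Option<&&str> into an Option<&str>.
    map.get(&number).copied().unwrap_or("unknown")
}

fn main() {
    let mut departments = HashMap::new();
    departments.insert(31, "Haute-Garonne");
    departments.insert(44, "Loire-Atlantique");

    println!("{}", department_name(&departments, 31)); // prints Haute-Garonne
    println!("{}", department_name(&departments, 99)); // prints unknown
    // 'contains_key' checks membership without reading the value:
    println!("{}", departments.contains_key(&44)); // prints true
}
```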
Move the contents of the following Vec object into a BTreeMap (which behaves the same as a HashMap, but will sort its contents by key) in order to get these athlete names sorted by their score in points. Note: some of them have the same score, and should then appear under the same key.
#![allow(unused)] fn main() { use std::collections::BTreeMap; let vec = vec![ ("Y. Horigome", 281), ("N. Huston", 279), ("M. Dell", 153), ("J. Eaton", 281), ("S. Shirai", 278), ("K. Hoefler", 270), ("C. Russell", 211), ("R. Tury", 273), ]; let mut map = BTreeMap::new(); [...] for score in map { println!("{:?}", score); } }
The last for loop should print:
(153, ["M. Dell"])
(211, ["C. Russell"])
(270, ["K. Hoefler"])
(273, ["R. Tury"])
(278, ["S. Shirai"])
(279, ["N. Huston"])
(281, ["Y. Horigome", "J. Eaton"])
'string' types (str and String)
Course: str primitive
The str primitive type can be used to represent a string literal:
#![allow(unused)] fn main() { // String literal: let s = "Hello, World!"; }
As a literal, a str has a static lifetime, which can also be stated explicitly in its type declaration.
A static lifetime means that the object is valid throughout the entire duration of the program.
#![allow(unused)] fn main() { // Here, the three syntaxes are equivalent: let s = "Hello, World!"; // Inferred type & lifetime let s: &str = "Hello, World!"; // Explicit type, inferred lifetime let s: &'static str = "Hello, World!"; // Explicit type & lifetime }
Unlike const char* in the C programming language, &str in Rust is not null-terminated, but relies on a slice, which is composed of a pointer and a size in bytes:
#![allow(unused)] fn main() { let s = "Hello, World!"; println!("Pointer: {:?}, Length: {} bytes", s.as_ptr(), s.len()); for (n, char) in s.chars().enumerate() { println!("Char {n}: {char}"); } }
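Because the size stored in a &str is counted in bytes, and Rust strings are UTF-8 encoded, len() and the number of characters can differ. The short sketch below illustrates this; the helper name byte_and_char_len is hypothetical.

```rust
/// Returns (length in bytes, number of 'char's) for a string slice.
/// (Hypothetical helper, for illustration only.)
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // ASCII only: one byte per character.
    println!("{:?}", byte_and_char_len("Lyon")); // prints (4, 4)
    // '\u{e9}' is the letter 'é', which is encoded on two bytes in UTF-8.
    println!("{:?}", byte_and_char_len("Vend\u{e9}e")); // prints (7, 6)
}
```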
For safety reasons, Rust doesn't allow modifying the actual contents (the characters) of a &str that points to a string literal, thus the following does not compile:
#![allow(unused)] fn main() { let s: &mut str = "Hello, World!"; }
Run the following code:
#![allow(unused)] fn main() { let s1 = "It's not about the bunny \t"; // Remove leading/trailing whitespace, tabs and newlines from 's1': let s2 = s1.trim(); println!("{s1}"); println!("Address: {:?}, Length: {}", s1.as_ptr(), s1.len()); println!(); println!("{s2}"); println!("Address: {:?}, Length: {}", s2.as_ptr(), s2.len()); }
- Since Rust forbids modifying the contents of a str literal, why are we in this case allowed to use the .trim() function? What is truly happening in this code?
- What would happen if we modified s1 as follows?
#![allow(unused)] fn main() { let s1 = " It's not about the bunny \t"; let s2 = s1.trim(); }
Course: String
A String, on the other hand, is a standard library collection type that can be seen as a growable buffer of UTF-8 encoded characters (internally a vector of bytes), dynamically stored on the heap. Just like a Vec, it can grow, shrink, and has ownership over its own underlying buffer, which makes it an easier object to manipulate. While it gives access to all of the str methods (a String can be dereferenced to a str), it does not have a static lifetime.
#![allow(unused)] fn main() { // Create from a string literal: let mut s = String::from("Owls are not what they seem"); // Append a 'char': s.push('!'); println!("Value: {:?}", s); // Append another string: s.push_str(" Really?"); println!("Value: {:?}", s); // Other ways of appending to the String: s = s + " Yes, "; s += "really!"; // Iterate over every 'char' for c in s.chars() { print!("{c} "); } println!(""); // Example of transformation: s = s.chars().rev().collect(); println!("Value: {:?}", s); }
Examine the following code:
fn main() {
    let s0: &str = "That gum you like is going to come back in style";
    // Build a 'String' object from the previous '&str':
    let mut s1: String = String::from(s0);
    // Now modify 's1':
    s1 = s1.to_ascii_uppercase();
    println!("{s1}");
}
Is s0 still accessible? If yes, what is now its value? Is it the same as s1 and why?
We now call the as_str() method on s1, and store the resulting &str value in a new variable called s2. Can you guess what the lifetime of s2 is?
#![allow(unused)] fn main() { let s0: &str = "That gum you like is going to come back in style"; // Build a 'String' object from the previous '&str': let mut s1: String = String::from(s0); let s2: &str = s1.as_str(); }
Slices
A slice in Rust can be considered as a bounded pointer or reference to a contiguous sequence of elements in an array, a collection, or a string of characters, as we saw earlier. It is declared with the &[T] syntax. Since it works like a reference, it does not have ownership over its contents.
#![allow(unused)] fn main() { // Create a byte buffer: let mut buffer = [0 as u8; 16]; // Get a slice on half the buffer: // (notice how the slice itself is not mutable, // but instead points to a mutable sequence in the buffer) let slice: &mut[u8] = &mut buffer[0..8]; // we use the 'range syntax' here to capture the slice // Iterate on the slice to change values: for (i, n) in slice.iter_mut().enumerate() { *n = i as u8; } println!("{:?}", buffer); }
Take a slice out of the string object, starting from character 25 until the end, and use the .make_ascii_lowercase() method on the captured slice.
#![allow(unused)] fn main() { let mut string = String::from("YOU REMIND ME TODAY OF A SMALL MEXICAN CHIHUAHUA"); let slice = ...; slice.make_ascii_lowercase(); println!("{string}"); }
What is the inferred type of the slice variable?
Struct, enums, Traits and Object definitions
Pierre Cochard, Tanguy Risset
Struct in Rust
Course: Structs (from https://doc.rust-lang.org/book/)
We have already seen the Struct concepts: structs are similar to tuples in that both hold multiple related values. Unlike with tuples, in a struct you’ll name each piece of data so it’s clear what the values mean. Adding these names means that structs are more flexible than tuples: you don’t have to rely on the order of the data to specify or access the values of an instance.
Here is an example of Struct definition:
struct User {
active: bool,
username: String,
email: String,
sign_in_count: u64,
}
We create an instance by stating the name of the struct and then adding curly brackets containing key: value pairs, for example:
let user1 = User {
active: true,
username: String::from("someusername123"),
email: String::from("someone@example.com"),
sign_in_count: 1,
};
To get a specific value from a struct, we use dot notation. For example, to access this user’s email address, we use user1.email. When creating an instance from another instance, the .. syntax specifies that the remaining fields not explicitly set should have the same value as the fields in the given instance:
let user2 = User {
email: String::from("another@example.com"),
..user1
};
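The fragments above can be assembled into a complete program, sketched below. The function with_email is a hypothetical helper added for the illustration. One point worth noticing: fields such as username are String values, so the .. syntax moves them out of the original instance rather than copying them.

```rust
#[derive(Debug)]
struct User {
    active: bool,
    username: String,
    email: String,
    sign_in_count: u64,
}

/// Builds a new user that reuses every field of 'base' except the email.
/// (Hypothetical helper, for illustration only.)
fn with_email(base: User, email: &str) -> User {
    User {
        email: String::from(email),
        // The remaining fields are taken from 'base'. 'username' is a String,
        // so it is moved (not copied) out of 'base'.
        ..base
    }
}

fn main() {
    let user1 = User {
        active: true,
        username: String::from("someusername123"),
        email: String::from("someone@example.com"),
        sign_in_count: 1,
    };
    let user2 = with_email(user1, "another@example.com");
    // 'user1' can no longer be used here: it was moved into 'with_email'.
    println!("{:?}", user2);
}
```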
Unit structs (i.e. structs without fields) and tuple structs (i.e. structs whose fields have no names) are special Rust features that might be useful in some cases (see https://practice.course.rs/compound-types/struct.html).
Consider the following struct definition (replacing String by the string slice &str):
struct User {
active: bool,
username: &str,
email: &str,
sign_in_count: u64,
}
Try to create an instance of this structure in a main program. You will not be able to, because the lifetime of the str slices is not known (it depends on the lifetime of the strings they point to). In a struct, every field that holds a reference must be annotated with a lifetime.
Try to solve this by adding a lifetime to your definition, using the compiler message.
Methods in Rust
As in any object-oriented language, methods are defined within the context of a struct, making a struct a regular object as defined by the object-oriented programming paradigm. (In Rust, methods can also be defined in the context of an enum or a trait object.) The definition of a method uses the keyword fn, as for a function, but it is placed inside a block introduced by the keyword impl (e.g. impl MyStruct { fn [...] }) to specify that this function is only defined in the context of a particular type.
For a method, the first parameter is always &self (or self if the method needs to take ownership of self, or &mut self if it needs to modify it). &self is a shortcut for self: &Self, where Self stands for the type named in the impl block.
For example, if we want to define a method name() for the previous structure User (the first one, with String) in order to obtain the username of a User, we will use the following syntax:
impl User {
fn name(&self)->&String{
&self.username
}
}
When calling this method, the self argument is never written, it is implicit (e.g. user1.name()). The above method is often called a getter. Getters are not implemented by default for Rust structs. It is good practice to name a getter after the field it returns (hence we should have called this method username()).
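The three possible forms of the first parameter can be compared on a small example. This is only a sketch: the Counter type and its methods are hypothetical and do not belong to the course code.

```rust
// Hypothetical type, for illustration only.
struct Counter {
    count: u32,
}

impl Counter {
    // '&self': read-only access, the caller keeps ownership.
    fn count(&self) -> u32 {
        self.count
    }
    // '&mut self': the method modifies the object in place.
    fn increment(&mut self) {
        self.count += 1;
    }
    // 'self': the method takes ownership and consumes the object.
    fn into_count(self) -> u32 {
        self.count
    }
}

fn main() {
    let mut c = Counter { count: 0 };
    c.increment();
    c.increment();
    println!("{}", c.count()); // prints 2
    let final_value = c.into_count();
    // 'c' has been moved by 'into_count' and cannot be used anymore.
    println!("{final_value}"); // prints 2
}
```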
Define a struct Rectangle with two integer fields width and height. Then implement two methods for Rectangle: area(&self) (which computes the surface of the rectangle) and fits_in(&self, other_rect: &Rectangle), which indicates whether self fits completely inside the other_rect rectangle.
Enum and pattern matching
Course: Enum and option
Enum (from https://doc.rust-lang.org/book/)
As in C, Enum gives you a way of saying a value is one of a possible set of values. For instance, an IP address can be either an IPv4 address or an IPv6 address, but not both at the same time:
enum IpAddrKind {
V4,
V6,
}
This makes it possible to store the two possible kinds in "the same" memory location, given that they will never be active together in one instance.
A new feature compared to C is that we can put data directly into each enum variant.
enum IpAddr {
V4(u8, u8, u8, u8),
V6(String),
}
let home = IpAddr::V4(127, 0, 0, 1);
let loopback = IpAddr::V6(String::from("::1"));
Rust has an extremely powerful control flow construct called match that allows you to compare a value against a series of patterns and then execute code based on which pattern matches. Pattern matching is an important area of computer science and compilation; we just show a very simple example here:
enum Coin {
Penny,
Nickel,
Dime,
Quarter,
}
fn value_in_cents(coin: Coin) -> u8 {
match coin {
Coin::Penny => 1,
Coin::Nickel => 5,
Coin::Dime => 10,
Coin::Quarter => 25,
}
}
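The two ideas above, data stored inside enum variants and match, combine naturally: a pattern can bind the data held by a variant to local names. The sketch below reuses the IpAddr enum; the describe_addr function is a hypothetical addition.

```rust
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

/// Formats an address by extracting the data stored in each variant.
/// (Hypothetical helper, for illustration only.)
fn describe_addr(addr: &IpAddr) -> String {
    match addr {
        // The four bytes are bound to 'a', 'b', 'c' and 'd':
        IpAddr::V4(a, b, c, d) => format!("IPv4 {a}.{b}.{c}.{d}"),
        // The inner String is bound to 'text':
        IpAddr::V6(text) => format!("IPv6 {text}"),
    }
}

fn main() {
    let home = IpAddr::V4(127, 0, 0, 1);
    let loopback = IpAddr::V6(String::from("::1"));
    println!("{}", describe_addr(&home));     // prints IPv4 127.0.0.1
    println!("{}", describe_addr(&loopback)); // prints IPv6 ::1
}
```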
The Option Enum
As explained in TD1, the Null value was invented to represent the "no value" value and it was a bad invention. With the Option enum, Rust proposes a mechanism that explicitly distinguishes the cases where a variable has a value or no value. Rust defines (in the standard library, in the prelude) a particular Option<T> enum:
enum Option<T> {
None,
Some(T),
}
The <T> syntax is a feature of Rust called a generic type parameter (similar to templates in C++) that will be explained hereafter.
Write a function that computes the square root of an f32 number and returns None if the number is negative.
Of course, a Some(T) object cannot be "cast" or "simplified" into a T object. The T object has to be extracted explicitly, for instance by unwrapping the option. unwrap will be explained in the next course (for the Result type); here it will return the T object, or panic in case of None. The Option<T> enum has a large number of methods that are useful in a variety of situations: https://doc.rust-lang.org/std/option/enum.Option.html.
Test (using .unwrap()) your square root function with positive and negative numbers.
Generic Types
Every programming language has tools for effectively handling the duplication of concepts. In Rust, one such tool is generics. Generic types, traits and lifetimes are three kinds of generics.
Generic data types can be used in the definition of functions, structs or enums. They are very close to the template concept in C++.
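As an illustration, the sketch below defines one generic struct and one generic function. Both Stack and last_of are hypothetical names chosen for the example; the same definitions work for any element type T.

```rust
/// A generic container: the same definition works for any element type T.
/// (Hypothetical type, for illustration only.)
struct Stack<T> {
    items: Vec<T>,
}

impl<T> Stack<T> {
    fn new() -> Self {
        Stack { items: Vec::new() }
    }
    fn push(&mut self, item: T) {
        self.items.push(item);
    }
    fn pop(&mut self) -> Option<T> {
        self.items.pop()
    }
}

/// A generic function: returns the last element of a slice, if any.
fn last_of<T>(v: &[T]) -> Option<&T> {
    v.last()
}

fn main() {
    let mut numbers = Stack::new(); // Stack<i32>, inferred from the pushes
    numbers.push(1);
    numbers.push(2);
    println!("{:?}", numbers.pop()); // prints Some(2)

    let words = ["owl", "log"];
    println!("{:?}", last_of(&words)); // prints Some("log")
}
```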
Write a function that finds the smallest element of a Vector, whether this vector is a vector of i32, f32 or characters, or of any type whose elements can be compared. Use the following function signature: smallest<T: PartialOrd>(v: &[T]) -> &T
Define a struct Point with two fields x and y that can be integers or floating-point numbers. Is Point{x:2,y:2.3} a valid instance of the structure Point?
Traits: Defining Shared Behavior
A trait defines the functionality a particular type has and can share with other types. We have already seen some very common traits: Clone or Debug. The trait Clone, for instance, represents the fact that a variable of a type can be duplicated (i.e. cloned) to a second instance of the type, identical to the variable but referring to a different memory location. Traits are similar to a feature often called interfaces in other languages, although with some differences.
Defining a trait consists in defining the methods we can call on that type, using the keyword trait. In general, trait names should be defined in UpperCamelCase (e.g. IsEven) and trait methods should be defined with snake_case names (e.g. is_even(&self)). Implementing a trait for a particular type is done using impl name_of_trait for name_of_type {[...]}.
Define a trait IsEven that is composed of the method is_even(&self). Implement the trait for the Rectangle type defined previously (a Rectangle is even if both height and width are even).
You can specify a default implementation of each method in the definition of the trait (an explicit implementation then overrides the default implementation).
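A short sketch of a trait with one required method and one default method is given below. The Describe trait and the Cat and Robot types are hypothetical and only serve the illustration.

```rust
// Hypothetical trait, for illustration only.
trait Describe {
    // Required method: every implementor must provide it.
    fn name(&self) -> String;
    // Default method: used unless the implementor overrides it.
    fn describe(&self) -> String {
        format!("This is {}", self.name())
    }
}

struct Cat;
struct Robot;

impl Describe for Cat {
    // Only the required method is written; 'describe' uses the default.
    fn name(&self) -> String {
        String::from("a cat")
    }
}

impl Describe for Robot {
    fn name(&self) -> String {
        String::from("a robot")
    }
    // Explicit implementation: it replaces the default one.
    fn describe(&self) -> String {
        String::from("BEEP")
    }
}

fn main() {
    println!("{}", Cat.describe());   // prints This is a cat
    println!("{}", Robot.describe()); // prints BEEP
}
```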
Trait as parameter: the Trait Bound Syntax
Traits can be used as function parameters: the function will then be valid for any type implementing the trait. The usual syntax for that is called the Trait Bound Syntax:
fn myFunction<T: TheTrait>(a_variable: &T) { [..]}
Write a function notifyEven that takes as parameter a type that implements the trait IsEven and notifies (i.e. prints) the fact that the object is even. Note that you can require both the IsEven and Debug traits by specifying IsEven+Debug.
Sometimes, this syntax can be heavy and you can use the where clause: instead of writing this:
fn some_function<T: Display + Clone, U: Clone + Debug>(t: &T, u: &U) -> i32 {
we can use a where clause, like this:
fn some_function<T, U>(t: &T, u: &U) -> i32
where
T: Display + Clone,
U: Clone + Debug,
{
The impl Trait syntax can be used to specify that the result of a function must implement a trait.
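For instance, a function can return "some iterator over u32 values" without naming the concrete iterator type. The function evens_below in this sketch is a hypothetical example.

```rust
/// Returns the even numbers below 'limit'. The caller only knows that the
/// result implements Iterator<Item = u32>, not its concrete type.
/// (Hypothetical helper, for illustration only.)
fn evens_below(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

fn main() {
    let v: Vec<u32> = evens_below(7).collect();
    println!("{:?}", v); // prints [0, 2, 4, 6]
}
```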
Using Trait Bounds to Conditionally Implement Methods
The following example (from https://doc.rust-lang.org/book/ch10-02-traits.html) illustrates the fact that trait bounds can be used to implement methods conditionally:
use std::fmt::Display;
struct Pair<T> {
x: T,
y: T,
}
impl<T> Pair<T> {
fn new(x: T, y: T) -> Self {
Self { x, y }
}
}
impl<T: Display + PartialOrd> Pair<T> {
fn cmp_display(&self) {
if self.x >= self.y {
println!("The largest member is x = {}", self.x);
} else {
println!("The largest member is y = {}", self.y);
}
}
}
Programming paradigm in Rust
This section is largely inspired by https://corrode.dev/blog/paradigms/.
Rust is a multi-paradigm programming language, accommodating imperative, object-oriented, and functional programming styles. It is important to be aware that the programming paradigm is an important design choice when you start a new Rust program. The choice of style often depends on a developer’s background and the specific problem they’re addressing, but there are also many "known habits" of Rust developers.
An originality compared to other recent languages is the important influence of functional programming in Rust.
A simple example: integer Vector sum.
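Consider a Vector of i32 values whose sum must be computed. Write a program to do it in an iterative/imperative way (i.e. a loop accumulating in a temporary variable).
Then write a second version in a functional way, using the iter() method of the Vector type and the sum() method of iterators.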
The second formulation is, of course, much more concise, as is often the case in functional programming, but it is less suited to certain types of processing (matrix calculations, for example).
A More Complete Example
Consider the following Rust code that defines a list of several languages along with the paradigms they are associated with. You will start from this code. The task will be to find the top five languages that support functional programming and have the most users.
#[derive(PartialEq,Clone,Debug)]
enum Paradigm {
Functional,
ObjectOriented,
}
#[derive(Clone,Debug)]
struct Language {
name: &'static str,
paradigms: Vec<Paradigm>,
nb_users: i32,
}
impl Language {
fn new(name: &'static str, paradigms: Vec<Paradigm>, nb_users: i32) -> Self {
Language { name, paradigms, nb_users }
}
}
let languages = vec![
Language::new("Rust", vec![Paradigm::Functional,Paradigm::ObjectOriented], 100_000),
Language::new("Go", vec![Paradigm::ObjectOriented], 200_000),
Language::new("Haskell", vec![Paradigm::Functional], 5_000),
Language::new("Java", vec![Paradigm::ObjectOriented], 1_000_000),
Language::new("C++", vec![Paradigm::ObjectOriented], 1_000_000),
Language::new("Python", vec![Paradigm::ObjectOriented, Paradigm::Functional], 1_000_000),
];
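First solve the task in an imperative way, with for loops.
Then solve it in a functional way. The following methods may help:
- the into_iter() method transforms a Vector into an iterator;
- the filter() method keeps only the elements that satisfy a property (use a lambda as the argument to filter);
- sorted_by_key() sorts all iterator elements into a new iterator in ascending order (use Reverse() for descending order); note that this method is provided by the itertools crate, not by the standard library;
- collect() transforms an iterator into a collection.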
Rust as a safe language: Pattern Matching, Handling Results/Errors, Options and Simple macros
Pierre Cochard, Tanguy Risset
- Introduction
- Pattern matching in Rust
- The Result enum type
- The Option enum type
- Simple macros with macro_rules!
Introduction
TODO
Pattern matching in Rust
Pattern matching is the act of checking a given sequence of tokens or expressions for the presence of one or more specific patterns. The concept is implemented in many programming languages (Rust, Haskell, Swift, etc.) and tools, for various purposes, such as: regular expressions, search and replace features, etc.
In Rust, patterns and pattern matching constitute a specific syntax that is used in different places in the language (match statements, if let expressions, function parameters, simple macros, etc.).
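Before looking at match in detail, here is a brief sketch of patterns used in some of these other places: a function parameter, an if let expression and a let statement. The function names sum_pair and describe_option are hypothetical.

```rust
/// A pattern in a function parameter: the tuple is destructured on entry.
/// (Hypothetical helper, for illustration only.)
fn sum_pair((a, b): (i32, i32)) -> i32 {
    a + b
}

/// 'if let' tests a single pattern and binds the inner value on success.
/// (Hypothetical helper, for illustration only.)
fn describe_option(value: Option<i32>) -> String {
    if let Some(n) = value {
        format!("got {n}")
    } else {
        String::from("nothing")
    }
}

fn main() {
    // A pattern in a 'let' statement:
    let (x, y) = (1, 2);
    println!("{}", sum_pair((x, y)));         // prints 3
    println!("{}", describe_option(Some(3))); // prints got 3
    println!("{}", describe_option(None));    // prints nothing
}
```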
Course: match statements
The primary, and most explicit, use of pattern matching in Rust is done through the match statement, which can be perceived as the Rust equivalent of a C switch, but with additional features. Its syntax is also a little bit different. For instance, let's take a look at this simple C program:
// C basic switch case:
#include <stdbool.h>
#include <stdio.h>
enum Colors {
    Red, Blue, Green, Yellow, Orange
};
bool match_color_orange(enum Colors color) {
    switch (color) {
    case Orange: {
        printf("Orange!\n");
        return true;
    }
    case Red:
    case Blue: {
        printf("Not orange :(\n");
        return false;
    }
    default: {
        printf("Still not orange\n");
        return false;
    }
    }
}
In Rust, we would have the following equivalent:
#![allow(unused)] fn main() { enum Colors { Red, Blue, Green, Yellow, Orange } fn match_color_orange(color: Colors) -> bool { match color { // 'case' statements are replaced by the // 'PATTERN => EXPRESSION' syntax: Colors::Orange => { println!("Orange!"); true } // We use '|' operators here, instead of having // multiple 'case' statements: Colors::Red | Colors::Blue => { println!("Not orange :("); false } // Anything else (equivalent to 'default'): _ => { println!("Still not orange..."); false } } } }
Once a matching pattern is found, the corresponding instructions are executed and the match statement terminates (it does not check for other matching patterns: the first matching pattern is chosen).
match statements can be directly bound to variables:
#![allow(unused)] fn main() { enum Colors { Red, Blue, Green, Yellow, Orange } let color = Colors::Red; let is_color_warm = match color { Colors::Orange => true, Colors::Red => true, Colors::Yellow => true, _ => false }; }
Matching ranges is also supported, for instance:
#![allow(unused)] fn main() { fn match_number(number: i32) { match number { 50..=99 => println!("Between 50 and 99"), 100..=1000 => println!("Between 100 and 1000"), _ => println!("Other value") } } }
And, as a matter of fact, many other kinds of values can be matched, from string types:
#![allow(unused)] fn main() { fn match_str(s: &'static str) { match s { "Orange" => println!("Orange!"), "Yellow" => println!("Not orange"), _ => println!("Something else...") } } }
to other kinds of collections:
#![allow(unused)] fn main() { fn match_tup(tup: (i32, i32)) { match tup { (0, 0) => println!("Zeroes!"), (1, 1) => println!("Ones"), _ => println!("Something else...") } } match_tup((1, 1)); match_tup((0, 1)); fn match_array(arr: [u8; 3]) { match arr { [0, 1, 2] => println!("Array match!"), _ => println!("No match") } } match_array([0, 1, 2]); match_array([4, 5, 6]); fn match_slice(sl: &[i32]) { match sl { &[0, 1, 2] => println!("Slice matches!"), _ => println!("No match") } } match_slice(&[0, 1, 2]); }
Write a match statement which applies to any i32 number. It should only have the two following patterns:
- The value is below 100 (including negative numbers);
- The value is equal to or higher than 100.
Course: match statements: "flexible" patterns
As we saw earlier, the match statement can test many kinds of values, and it also extends to custom and composite types, including struct instances. A custom struct can indeed be matched either by its contents, in a very precise manner:
#![allow(unused)] fn main() { struct Point { x: isize, y: isize } let point = Point {x: 0, y: 100}; match point { // Only match Point if its 'x' member is equal to 0 // and 'y' is equal to 100: Point {x: 0, y: 100} => println!("Match!"), _ => println!("No match!") } }
Or, in a more flexible way, using, for instance, ranges for its member values:
#![allow(unused)] fn main() { struct Point { x: isize, y: isize } let point = Point {x: 25, y: 100}; match point { // Only match if 'x' is between 0 and 100, // and 'y' is between 50 and 100 Point {x: 0..=100, y: 50..=100} => println!("Match!"), _ => println!("No match!") } }
Finally, the _ pattern can be used for any kind of value (or field value) that we want to ignore. This can also be done using the .. syntax, which will ignore all the remaining values or field values:
#![allow(unused)] fn main() { struct Point { x: isize, y: isize } let p = Point {x: 0, y: 100}; match p { // Only match if 'x' is between 0 and 100, // and ignore the 'y' field: Point {x: 0..=100, y: _} => println!("Match!"), // Only match if 'x' is between 101 and 1000, // Similarly, the '..' syntax will ignore all the struct fields after 'x': Point {x: 101..=1000, ..} => println!("Match!"), _ => println!("No match!") } }
Note: for compound/collection types, the .. syntax may be followed by other patterns:
#![allow(unused)] fn main() { let tup = (0, 1, 2, 3, 4); match tup { // The first element of the tuple should be '0', and the last should be '4', // we ignore the values in-between: (0, .., 4) => println!("Match!"), _ => println!("No match") } }
Implement a match statement on a &[i32] slice. It should match all of the following patterns:
- The slice's first value should be 0;
- The slice's second value should be either 10 or 20;
- The slice's final value should be 100;
- The slice can have an arbitrary size.
You can use the following assert! statements to test your code:
#![allow(unused)] fn main() { fn match_slice(s: &[i32]) -> bool; assert_eq!(match_slice(&[1, 20, 20, 30, 100]), false); assert_eq!(match_slice(&[0, 5, 20, 30, 100]), false); assert_eq!(match_slice(&[0, 10, 20, 30, 99]), false); assert!(match_slice(&[0, 10, 20, 30, 40, 100])); assert!(match_slice(&[0, 20, 20, 30, 50, 60, 70, 80, 90, 100])); }
The Result enum type
Course: Returning a Result from a function
To handle and propagate runtime errors, Rust relies on a simple but efficient mechanism based on an enum: the Result<T, E> enum, which is defined as:
#![allow(unused)] fn main() { enum Result<T, E> { Ok(T), Err(E) } }
The generic T and E types have no trait bound whatsoever: they could be anything, for instance:
#![allow(unused)] fn main() { // This function returns a u8 slice if there is no error, a Vec<f32> if there is. // This is probably not very useful, but it's still perfectly valid code: fn my_function(i: i32) -> Result<&[u8], Vec<f32>> {...} // This too (empty tuple type for both): fn my_function(i: i32) -> Result<(),()> { Ok(()) } }
The unwrap() method can be used to extract the Ok value from a Result<...>. This will be explained in more detail later, but you will need it to test your function:
let u = my_function(3).unwrap();
Write a function called positive() that takes an i32 as argument and checks whether it is (strictly) positive. It should return the same argument value if it is positive, or a String with an error message if the argument is negative or equal to zero.
Course: propagating Errors
An error in Rust can be propagated up the call stack (back to the caller) by using the ? syntax:
#![allow(unused)] fn main() { fn my_function(i: i32) -> Result<i32, ()> {...} fn my_other_function() -> Result<i32, ()> { // Append the '?' operator right after the function call: let mut i = my_function(1)?; // Do something with 'i': i += 1; // Return an 'Ok' result with the modified 'i' value: Ok(i) } }
Here, the my_function(1)? function call means:
- if the result enum value is Err (an error), then propagate the error now, by making my_other_function() return the same Err that my_function() produced; otherwise, continue with the rest of the code.
This code could also be implemented with an equivalent match statement, but it is a bit more verbose:
fn my_other_function() -> Result<i32, ()> {
    let mut i = match my_function(1) {
        Ok(i) => i,
        // Propagate the error to the caller:
        Err(e) => return Err(e),
    };
    i += 1;
    Ok(i)
}
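The same mechanism can be observed with a fallible function from the standard library. In the sketch below, str::parse returns a Result and the ? operator forwards its error to the caller. The function add_strs is a hypothetical example.

```rust
use std::num::ParseIntError;

/// Parses two integers and adds them. The first parse error, if any,
/// is returned to the caller by the '?' operator.
/// (Hypothetical helper, for illustration only.)
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.trim().parse()?; // returns early with Err on failure
    let y: i32 = b.trim().parse()?;
    Ok(x + y)
}

fn main() {
    println!("{:?}", add_strs("40", "2"));            // prints Ok(42)
    println!("{}", add_strs("40", "two").is_err());   // prints true
}
```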
Implement the same mechanism for the previous positive(i: i32) example, using the ? syntax, and test it with both positive and negative values in a new function with the same return type, in order to see what happens.
// Your 'positive' function: fn positive(i: i32) -> Result<i32, String> {...} // Define a new function, and call 'positive(...)' from here: fn check_positive() -> Result<i32, String> { ... // <- test positive & negative values here using the '?' syntax } // Check the return type from main: fn main() { println!("{:?}", check_positive()); }
Course: returning a Result from the main() function
In general, returning with or without error from a main function, such as in the C or C++ programming languages, is done by returning an integer exit code (0 for success, a non-zero value otherwise):
int main(void) {
// No error, return 0:
return 0;
}
In Rust, the main() function has no return type by default, and returning an i32 is not accepted by the compiler. For instance, the following code is invalid:
fn main() -> i32 { 0 }
In this case, the compiler prints the following:
error[E0277]: `main` has invalid return type `i32`
--> src/main.rs:3:14
|
3 | fn main() -> i32 {
| ^^^ `main` can only return types that implement `Termination`
|
= help: consider using `()`, or a `Result`
The Termination trait documentation indeed indicates that only the following types are valid:
#![allow(unused)] fn main() { impl Termination for Infallible; impl Termination for !; impl Termination for (); impl Termination for ExitCode; impl<T: Termination, E: Debug> Termination for Result<T, E>; }
Therefore, we can see that propagating a Result up to main() is possible, but is still a bit of a specific case. The type T held by the Ok enum value must implement the Termination trait, and the type held by the Err enum value must implement the Debug trait. For instance, the following still does not work because i32 does not implement the Termination trait:
fn main() -> Result<i32, String> { Ok(0) }
But the following works:
// Using an empty tuple as the 'Ok' result: fn main() -> Result<(), String> { Ok(()) } // Or using the ExitCode type: use std::process::ExitCode; fn main() -> Result<ExitCode, String> { Ok(ExitCode::from(0)) }
Using a valid Result type in the main() function, propagate the Result of our positive() function down to main().
fn positive(i: i32) -> Result<i32, String> {...} fn check_positive() -> Result<i32, String> {...} // Use a valid Result type here: fn main() -> Result<?, ?>{ // Call the 'check_positive' function here, and propagate its Result as the main() return type: }
Course: Handling Result types immediately
.unwrap(), .expect()
In some cases (when propagating an error is not possible, or inconvenient), dealing immediately with a Result type is preferable. This is why certain methods, such as .unwrap() or .expect(), are natively implemented, and quite commonly used:
- The .unwrap() method, for instance, will induce a panic! call and will exit the program immediately when encountering an error. Otherwise it will return the Ok value safely:
#![allow(unused)] fn main() { // Get the current working directory: let dir: std::path::PathBuf = std::env::current_dir().unwrap(); println!("{:?}", dir); }
- The .expect() method is really similar, but will allow the user to print a custom &str message on error, which will be prepended to the actual display of the Err contents:
#![allow(unused)] fn main() { fn my_function() -> Result<(), i32> { Err(1) } my_function().expect("Error! Now exiting program with error code"); }
- Other similar helper methods also exist, with different behaviors:
- .unwrap_or(other: T) returns the value other in case of an Err;
- .unwrap_or_default() returns the type's default value in case of an Err;
- .unwrap_or_else(func: Fn) executes a custom function in case of an Err;
- etc.
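The helper methods listed above can be compared side by side in a short sketch. The parse_volume function is a hypothetical helper written for this example only; the three unwrap_or* methods are the standard ones from Result.

```rust
// Hypothetical helper: parses a volume setting between 0 and 100.
fn parse_volume(input: &str) -> Result<u8, String> {
    match input.trim().parse::<u8>() {
        Ok(v) if v <= 100 => Ok(v),
        Ok(v) => Err(format!("{v} is out of range")),
        Err(e) => Err(e.to_string()),
    }
}

fn main() {
    // Falls back to the given value on error:
    let a = parse_volume("loud").unwrap_or(50);
    // Falls back to u8::default(), which is 0:
    let b = parse_volume("loud").unwrap_or_default();
    // Computes the fallback from the error value:
    let c = parse_volume("150").unwrap_or_else(|err| {
        eprintln!("warning: {err}, using maximum volume");
        100
    });
    // No error: the 'Ok' value is returned unchanged:
    let d = parse_volume("42").unwrap_or(50);
    println!("{a} {b} {c} {d}");
}
```

None of these calls can panic, which makes them a reasonable choice whenever a sensible fallback value exists.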
panic!
, assert!
and other macros
In addition to Result
types, other simple tools, in the form of macros, are provided:
- The
panic!(msg)
macro interrupts the program immediately and prints a custom error message:
fn main() {
    use std::io;
    // Read input from stdin:
    let mut buffer = String::new();
    println!("Please enter password:");
    io::stdin().read_line(&mut buffer).unwrap();
    // Remove the trailing newline kept by 'read_line':
    let password = buffer.trim_end();
    // Panic if password is not long enough:
    if password.len() < 8 {
        panic!("Password should be at least 8 characters");
    } else {
        println!("{password}");
    }
}
- The
assert!(bool)
, assert_eq!(lhs, rhs)
, and assert_ne!(lhs, rhs)
macros verify a boolean statement or check equality (or inequality) between two elements, and panic if the check fails:
fn main() {
    use std::io;
    // Read input from stdin:
    let mut buffer = String::new();
    println!("Please enter password:");
    io::stdin().read_line(&mut buffer).unwrap();
    // Remove the trailing newline kept by 'read_line':
    let password = buffer.trim_end();
    // We assert that the password is at least 8 characters long.
    // This will cause a 'panic' if the assertion is false:
    assert!(password.len() >= 8, "Password should be at least 8 characters");
    // Here, we forbid choosing 'password' as a password:
    assert_ne!(password, "password", "Choosing 'password' as password is unsafe.");
}
Using Result
, and all the previous examples, create a program which parses a password, with the following rules:
- Password must:
- be at least 8 characters long;
- contain at least one of these special characters: !, ? or _;
- contain at least one number;
- not contain any whitespace.
- A specific error message should be displayed for each rule.
Note: in order to verify your program, you can implement a
#[test]
function, such as:
#![allow(unused)] fn main() { #[test] fn password_test() { let pwd0 = String::from("pass"); let pwd1 = String::from("password"); let pwd2 = String::from("password!"); let pwd3 = String::from("pass word!"); let pwd4 = String::from("password!1"); assert!(parse_password(&pwd0).is_err()); assert!(parse_password(&pwd1).is_err()); assert!(parse_password(&pwd2).is_err()); assert!(parse_password(&pwd3).is_err()); assert!(parse_password(&pwd4).is_ok()); } }
and then run
cargo test
on your program.
The Option
enum type
The Option
enum in Rust is somewhat similar to the Result
enum, but its main purpose is to indicate the presence or absence of a value, rather than an error. It has the following definition:
#![allow(unused)] fn main() { pub enum Option<T> { None, Some(T) } }
As an example, let's suppose we want to build a list of people to contact, with various information, such as the contact's name and address, and optionally a phone number and/or an e-mail. We could, for instance, define the following struct
:
#![allow(unused)] fn main() { struct MyContact { name: &'static str, address: &'static str, email: Option<&'static str>, phone: Option<&'static str>, } let mut list: Vec<MyContact> = Vec::new(); list.push(MyContact { name: "Marlo Stanfield", address: "2601 E Baltimore St, Baltimore, MD 21224", email: None, phone: Some("+1 410-915-0909") }); }
By doing this, we can then take advantage of the Option
enum and pattern matching to decide on the best way to contact each person in the list:
#![allow(unused)] fn main() { for contact in list { match (contact.email, contact.phone) { // Both e-mail and phone are available: (Some(email), Some(phone)) => { if is_email_correct(email) { contact_by_email(email); } else { contact_by_phone(phone); } } // Only e-mail is available: (Some(email), None) => { contact_by_email(email); } // Only phone is available: (None, Some(phone)) => { contact_by_phone(phone); } // Neither phone nor email: (None, None) => { send_mail_to_address(contact.address); } } } }
As with the Result
type, Option
can be checked and handled immediately using the same .unwrap()
, .expect()
methods.
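fn money_left() -> Option<i32> {
    // Placeholder body: we pretend that the account is empty.
    None
}

fn main() {
    // Panics with the message "No money left :(", since the value is 'None':
    money_left().expect("No money left :(");
}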
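When a missing value is not a fatal error, an Option can also be handled without panicking. The sketch below shows three common ways of doing so: match, if let, and the unwrap_or / map helper methods. The money_left variant taking a balance argument and the describe function are hypothetical, written for this illustration only.

```rust
// Hypothetical helper: returns the remaining balance, if any.
fn money_left(balance: i32) -> Option<i32> {
    if balance > 0 { Some(balance) } else { None }
}

// Builds a message without ever panicking:
fn describe(balance: i32) -> String {
    match money_left(balance) {
        Some(amount) => format!("{amount} left"),
        None => String::from("No money left :("),
    }
}

fn main() {
    // 'if let' only runs its block when a value is present:
    if let Some(amount) = money_left(12) {
        println!("We can still spend {amount}");
    }
    // 'unwrap_or' substitutes a default value for 'None':
    println!("{}", money_left(-3).unwrap_or(0));
    // 'map' transforms the inner value and leaves 'None' untouched:
    println!("{:?}", money_left(12).map(|amount| amount * 2));
    println!("{}", describe(0));
}
```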
Using the previous contact list example, implement a small database of books that would be used by a library. It should have a search_book(...)
function which searches for a specific book using its name and/or the name of its author (we assume here that there is only one book per author). The function should return a reference to the Book
object if it has been found, or None
otherwise.
#![allow(unused)] fn main() { struct Book { name: String, author: String, } #[derive(Default)] struct LibraryDatabase { books: Vec<Book> } impl LibraryDatabase { // The function to implement: fn search_book(&self, name: Option<&'static str>, author: Option<&'static str> ) -> Option<&Book> {...} } }
You can use the following #[test]
function to verify your code:
#![allow(unused)] fn main() { #[test] fn test() { let mut database = LibraryDatabase::default(); database.books.push(Book { name: String::from("Peter Pan"), author: String::from("Barrie")}); assert!(database.search_book(Some("Peter Pan"), None).is_some()); assert!(database.search_book(None, Some("Barrie")).is_some()); assert!(database.search_book(None, None).is_none()); assert!(database.search_book(Some("Barrie"), None).is_none()); assert!(database.search_book(None, Some("Peter Pan")).is_none()); assert!(database.search_book(Some("Alice in Wonderland"), Some("Barrie")).is_none()); assert!(database.search_book(Some("Peter Pan"), Some("Lewis Carroll")).is_none()); } }
Simple macros with macro_rules!
Unlike other programming languages, such as C
or C++
, Rust's macro system is based on abstract syntax trees (AST) instead of string preprocessing, which makes macros a bit more complex to use, but also more reliable and powerful. Macros are expanded before the compiler interprets the meaning of the code. The main difference with functions is that macro definitions are more complex, because you're writing Rust code that writes Rust code. Due to this indirection, macro definitions are generally more difficult to read, understand, and maintain than function definitions.
Throughout this course, we have already encountered a few of them, including vec!
, panic!
, assert!
, and of course println!
. These macros are invoked with the macro!(...)
syntax (don't forget the trailing exclamation mark) and are called simple macros, as opposed to Rust's more complex macro systems, such as attribute macros (for instance the #[test]
function attribute), and derive macros (the #[derive(Debug)]
statement on top of a struct
), both of which we have already seen as well.
Course: Basic macro_rules!
usage
Unlike attribute or derive macros, which must be defined in a separate crate, simple macros can be defined anywhere in our code, using the macro_rules!
syntax:
#![allow(unused)] fn main() { macro_rules! hello_world { () => { println!("Hello World!") }; } hello_world!(); }
Here, we defined a macro!
that takes no argument, which is indicated by the ()
pattern. Our hello_world!()
macro call will be replaced, under the hood, by the contents that we defined within the => { ... }
block.
Advantages of using macros
Our hello_world!()
example is of course not very useful, and in fact adds unnecessary noise to a very simple piece of code, but think of the vec!
macro for instance:
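let v1 = vec![1, 2, 3, 4, 5];
// An empty vector needs a type annotation, since no element gives a hint:
let v2: Vec<i32> = vec![];
// One element with the value '1' (the 'vec![value; count]' form):
let v3 = vec![1; 1];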
Defining the three different vectors by hand would actually mean writing the following code:
let v1 = <[_]>::into_vec(Box::new([1, 2, 3, 4, 5]));
let v2 = Vec::<i32>::new();
let v3 = std::vec::from_elem(1, 1);
Notice how each of these three vectors is created in a very different way? In this case, the vec!
macro allows defining a more practical and unified way of instantiating a Vec
object, without having to remember all the (sometimes complex) underlying code. Furthermore, as you can see with this example, a macro!
can also accept a variable number of arguments, which is not the case with a Rust function.
The different types of arguments (or fragment specifiers)
As you may already have guessed with our first basic macro example, which uses the =>
operator, macro_rules!
relies on pattern matching to parse its arbitrary number of arguments.
macro_rules!
can parse different kinds of patterns, including:
- (): the empty pattern, which means no argument (our previous example);
- block: a block expression, surrounded by { };
- expr: any kind of Rust expression;
- ident: an identifier (the name of a variable, function, etc.);
- literal: a number, string or other kind of literal;
- etc. (see the full list here).
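To make these fragment specifiers more concrete, here is a minimal sketch combining several of them. The make_getter! and labelled! macros are hypothetical examples, not standard macros; ty (a type) is one of the additional specifiers from the full list.

```rust
// Hypothetical macro: defines a function named '$name',
// which returns '$value' as a value of type '$t'.
macro_rules! make_getter {
    ($name:ident, $t:ty, $value:expr) => {
        fn $name() -> $t {
            $value
        }
    };
}

// Hypothetical macro: takes a 'literal' label followed by a 'block',
// and formats the result of the block with its label.
macro_rules! labelled {
    ($label:literal, $body:block) => {{
        let result = $body;
        format!("{}: {}", $label, result)
    }};
}

// Each call generates one function definition:
make_getter!(answer, i32, 6 * 7);
make_getter!(greeting, &'static str, "hello");

fn labelled_sum() -> String {
    labelled!("sum", { 1 + 2 })
}

fn main() {
    println!("{} {}", answer(), greeting());
    println!("{}", labelled_sum());
}
```

Passing something that does not match the specifier (for instance a number where an ident is expected) is rejected at compile time, before the macro is expanded.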
Matching a specific pattern
Let's now try an example with an actual argument. Here, we will use the ident
designator in order to create functions from a simple macro call:
#![allow(unused)] fn main() { macro_rules! define_fn { ($fn_name:ident) => { fn $fn_name() { println!( "This function's name (ident) is: '{}()'.", stringify!($fn_name) ); } } } define_fn!(foo); define_fn!(bar); foo(); bar(); }
Let's break this code down piece by piece:
($fn_name:ident) => {
→ Instead of an empty pattern ()
, we use the pattern ($fn_name:ident)
, in which $fn_name
would be the name of the argument, and ident
its type. The dollar sign ($) is used to declare a variable in the macro system that will contain the Rust code matching the pattern.
fn $fn_name() {
→ Within our generated code block, we declare a function with the name taken from our $fn_name
ident argument.
println!(
    "This function's name (ident) is: '{}()'.",
    stringify!($fn_name)
);
→ We then define the function's body, with a println!
call, in which we print the ident's name using a utility macro called stringify!
. This very useful macro will transform our $fn_name
identifier into a &'static str
object.
What would be the generated code for the define_fn!(foo)
macro call?
Create a similar macro, but this time the generated code should define a struct
and its impl
block like the following:
#![allow(unused)] fn main() { // All of this code should be generated by our new macro, // but the name 'Foo' should be made variable: struct Foo { print: &'static str } impl Foo { fn new() -> Foo { Foo { print: "Foo" } } } }
#![allow(unused)] fn main() { // The macro to implement: macro_rules! define_struct { ... } // Use the following to verify that the macro is correct: define_struct!(Foo); let bar = Foo::new(); assert_eq!(bar.print, "Foo"); }
Course: pattern overloading
macro_rules!
definitions can accept an arbitrary number of patterns, in a very simple way. Let's try it out on our define_fn!
macro. We will add another pattern that allows appending an arbitrary code expression to the created fn
:
#![allow(unused)] fn main() { macro_rules! define_fn { ($fn_name:ident) => { fn $fn_name() { println!( "This function's name (ident) is: '{}()'.", stringify!($fn_name) ); } }; // <-- pattern blocks must end with a semicolon if they're followed by other blocks // Our new pattern: ($fn_name:ident, $additional_code:expr) => { fn $fn_name() { println!( "This function's name (ident) is: '{}()'.", stringify!($fn_name) ); // Append the additional code 'expr' at the end of the defined fn: $additional_code } } } define_fn!(foo); define_fn!(bar, println!("Additional code")); foo(); bar(); }
Here, we added the ($fn_name:ident, $additional_code:expr)
pattern, which is composed of two arguments: the same ident
argument, followed by an expr
argument, which can be any valid Rust expression. The two arguments are separated by a comma ,
but the choice of a comma is completely arbitrary, it could be (almost) any symbol.
In our previous example, try replacing the ,
symbol between $fn_name:ident
and $additional_code:expr
with another one. Then, call the define_fn!
macro with two arguments separated by the same new symbol, and see what it does.
Course: pattern matching for macro_rules!
Pattern matching for macro_rules!
is quite different from pattern matching used in the match
keyword. Macros can also easily deal with pattern repetition by using a special syntax, which resembles the one used for regular expressions. In particular, the usual repetition operators can be used: '?', '+'
or '*'
(respectively: zero or one occurrence, one or more occurrences, zero or more occurrences). In the following example, we want to replace the std::cmp::max()
function with a macro that takes an arbitrary number of arguments, instead of chaining calls as follows:
#![allow(unused)] fn main() { let mut max = std::cmp::max(1, 2); max = std::cmp::max(max, 3); max = std::cmp::max(max, 4); max = std::cmp::max(max, 5); }
In this case, having a macro like the following could prove useful, and would lighten the code a lot:
#![allow(unused)] fn main() { max!(1, 2, 3, 4, 5, 6*7, 3*4); }
The way to do this is to use the $(...),+
syntax, as follows:
#![allow(unused)] fn main() { macro_rules! max { // Only one argument 'x', return 'x': ($x:expr) => {$x}; // At least two arguments, // - 'x' being the first, // - 'y' being one or more additional argument(s), // which is defined by the '$(...),+' syntax: ($x:expr, $($y:expr),+) => { // We recursively call 'max!' on the tail 'y' std::cmp::max($x, max!($($y),+)) } } }
The +
in the $($y:expr),+
syntax means one or more instances of the ($y)
expression, separated by a comma ,
.
Note: The
*
symbol also exists, and means zero or more instances of the pattern.
Now, if we were to call the max!
macro the following way, we would only match the first pattern ($x:expr) => {$x}
:
#![allow(unused)] fn main() { max!(1); }
- With two arguments, we would match the second pattern
($x:expr, $($y:expr),+)
with a single additional argument:
#![allow(unused)] fn main() { max!(1, 2); // expands to: std::cmp::max(1, max!(2)); // expands to: std::cmp::max(1, 2); }
- And with more arguments recursively:
#![allow(unused)] fn main() { max!(1, 2, 3); // expands to: std::cmp::max(1, max!(2, 3)); // expands to: std::cmp::max(1, std::cmp::max(2, max!(3))); // expands to: std::cmp::max(1, std::cmp::max(2, 3)); }
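The expansions above can be checked with a short program. The sketch below reuses the max! macro from this section and adds a hypothetical sum! macro (not part of the course code) to illustrate the '*' operator, which also accepts an empty argument list.

```rust
// Same 'max!' macro as in the course text:
macro_rules! max {
    ($x:expr) => { $x };
    ($x:expr, $($y:expr),+) => {
        std::cmp::max($x, max!($($y),+))
    };
}

// Hypothetical 'sum!' macro, using '*' so that zero arguments are accepted:
macro_rules! sum {
    ($($x:expr),*) => {
        // The '$( ... )*' group is emitted once per captured '$x':
        0 $(+ $x)*
    };
}

fn largest() -> i32 {
    max!(1, 2, 3, 4, 5, 6 * 7, 3 * 4)
}

fn total() -> i32 {
    sum!(1, 2, 3)
}

fn empty_total() -> i32 {
    // No argument at all: the macro expands to the literal '0'.
    sum!()
}

fn main() {
    println!("{} {} {}", largest(), total(), empty_total());
}
```

With '+' instead of '*' in sum!, the sum!() call would be rejected at compile time, since at least one argument would be required.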
As a summary:
- $var captures a value in a pattern.
- $var:ident specifies a type (could be ident, expr, ty, etc.).
- $($var:pat),* or $($var:pat),+ captures repetitive sequences separated by commas.
- $var is replaced during macro expansion.
Bonus: Transform our previous define_struct!
example into a more elaborate macro. It should now have the following interface:
#![allow(unused)] fn main() { // Define the struct 'Foo': define_struct!( name: Foo, members: { bar: i32 } methods: { fn hello() { println!("hello!") } fn bar(&self) { println!("{}", self.bar); } } ); // Instantiate the struct 'Foo': let f = Foo::default(); // Call its bar method: f.bar(); // Or its static 'hello' method: Foo::hello(); }
Bonus: Add a nested macro_rules!
definition into define_struct!
which allows copying the members and methods of the defined struct into a new, different one, such as:
#![allow(unused)] fn main() { // Define Foo struct: define_struct!( name: Foo, members: { bar: i32 } methods: { fn hello() { println!("hello!") } fn bar(&self) { println!("{}", self.bar); } } ); // The define_struct! macro should define a new 'Foo!' macro, // which allows copying the members and methods of 'Foo' into another new struct: Foo!(FooCopy); let c = FooCopy::default(); c.bar(); FooCopy::hello(); }
Pierre Cochard, Tanguy Risset
- Introduction
- Using Threads and Closures to Run Code Simultaneously
- Sharing data safely between threads
- Asynchronous programming
Introduction
Using Threads and Closures to Run Code Simultaneously
Course: Closures
Closures in Rust are anonymous functions that can be stored in variables, or passed to other functions as arguments. They are found in many places in the language, where they enable functional-style programming, behavior customization, or concurrent/parallel code execution. They are defined with the following syntax: |arguments| -> return_type { body }
, for instance:
#![allow(unused)] fn main() { // Define a closure and store it into a variable: let my_closure = |x: i32| -> i32 { println!("my_closure"); x*2 }; // Execute the closure like you would normally do with a function: let y = my_closure(2); println!("{y}"); }
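As an example of the functional style mentioned above, closures are commonly passed to iterator adapters such as filter and map. In the sketch below, the even_squares function is a hypothetical helper written for this illustration; note that the closures' argument and return types are left to type inference.

```rust
// Hypothetical helper: keeps the even numbers and squares them.
fn even_squares(values: &[i32]) -> Vec<i32> {
    values
        .iter()
        // 'filter' passes a reference to each item (itself a '&i32'):
        .filter(|v| **v % 2 == 0)
        // 'map' receives each remaining '&i32' and produces an 'i32':
        .map(|v| v * v)
        .collect()
}

fn main() {
    // A closure made of a single expression does not need braces:
    let add_one = |x: i32| x + 1;
    println!("{}", add_one(41));
    println!("{:?}", even_squares(&[1, 2, 3, 4]));
}
```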
Borrowed captures
Just like in C++
, a closure can capture the environment it originates in and use external data in its own internal scope. By default, captured data, whether it is mutable or not, will be borrowed:
#![allow(unused)] fn main() { let x = 2; let my_closure = || -> i32 { // Use a borrowed (immutable) capture of 'x', // and return two times its value: x*2 }; println!("{}", my_closure()); }
#![allow(unused)] fn main() { // Same, but this time, we modify 'x' directly in the closure: let mut x = 2; // The closure itself also has to be made 'mutable' // in the case of a mutable borrow: let mut my_closure_mut = || x *= 2; my_closure_mut(); println!("{x}"); }
Moved captures
Instead of being borrowed, data can also be captured by value into a closure scope, using the move
keyword before declaration:
#![allow(unused)] fn main() { let mut x = 2; // Capturing 'x' by value. Here, it is made with a simple copy: let mut my_closure_mut = move || { x *= 2; println!("x (closure): {x}"); }; my_closure_mut(); println!("x: {x}"); }
Why is the following code invalid? How can we solve the issue?
#![allow(unused)] fn main() { let mut x = vec![31, 47, 27, 16]; let mut my_closure_mut = move || { x.push(32); println!("{:?}", x); }; my_closure_mut(); println!("{:?}", x); }
Course: Passing closures as objects or arguments
One notable particularity of closures is that each of them has a unique, anonymous type that cannot be written out. This can for instance be demonstrated by running the following piece of code:
#![allow(unused)] fn main() { // Utility function that prints out the type of a variable: fn print_type_of<T>(_: &T) { println!("Type of my closure: {}", std::any::type_name::<T>()); } let my_closure = |x: i32| -> i32 { x*2 }; print_type_of(&my_closure); }
Therefore, passing a closure as an argument to a function using a specific type is not possible in Rust. Instead, in order to do that, one would have to use a trait
. Indeed, all closures implement one or several of the following traits, depending on their nature and properties:
- FnOnce: applies to closures that can be called once. All closures implement this trait;
- Fn: applies to closures that don't move captured values out of their body and that don't mutate captured values, as well as closures that capture nothing from their environment. These closures can be called more than once without mutating their environment, which is important in cases such as calling a closure multiple times concurrently;
- FnMut: applies to closures that don't move captured values out of their body, but that might mutate the captured values. These closures can be called more than once.
A closure can then be passed as an argument in the same way as any trait-implementing object, by using the Fn/FnOnce/FnMut(argument_type) -> return_type
syntax:
#![allow(unused)] fn main() { // All of the 'Fn' traits have the format: // 'Fn(argument-types) -> return types' fn exec_closure(x: i32, closure: impl Fn(i32) -> i32) -> i32 { closure(x) } let c = |x: i32| x + 27; let r = exec_closure(31, c); println!("{r}"); }
The generic form also works (and is usually preferable):
#![allow(unused)] fn main() { fn exec_closure<T>(x: i32, closure: T) -> i32 where T: Fn(i32) -> i32 { closure(x) } let c = |x: i32| x + 27; let r = exec_closure(31, c); println!("{r}"); }
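The two examples above use the Fn bound. The following sketch shows what the FnMut and FnOnce bounds look like in practice; the functions call_twice, call_once, count_calls and take_name are hypothetical names introduced for this illustration.

```rust
// Calls a closure that may mutate its captured state:
fn call_twice<F>(mut closure: F)
where
    F: FnMut(),
{
    closure();
    closure();
}

// Calls a closure that may consume its captured values, hence only once:
fn call_once<F>(closure: F) -> String
where
    F: FnOnce() -> String,
{
    closure()
}

fn count_calls() -> i32 {
    let mut counter = 0;
    // This closure mutably borrows 'counter': it is 'FnMut', but not 'Fn':
    call_twice(|| counter += 1);
    counter
}

fn take_name() -> String {
    let name = String::from("Bird");
    // This closure moves 'name' out of its body: it is 'FnOnce' only:
    call_once(move || name)
}

fn main() {
    println!("{} {}", count_calls(), take_name());
}
```

Choosing the least restrictive bound that the function actually needs (FnOnce if the closure is called once, FnMut if it is called several times, Fn if it must also be callable through a shared reference) lets callers pass the widest range of closures.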
Using the generic form, store the following chirp_fn
closure as a member of the struct
Bird
, with a valid type signature:
let chirp_fn = |times: i32| {
    for _ in 0..times {
        println!("chirp!");
    }
};
// The struct to implement (use generics!):
struct Bird<...> {
    chirp: ...
}
// Create a new instance of 'Bird' with the chirp_fn closure:
let bird = Bird { chirp: chirp_fn };
// Call the 'chirp' closure from inside the struct.
// The parentheses are required; operator precedence can be found here:
// https://doc.rust-lang.org/reference/expressions.html
(bird.chirp)(10);
Course: Spawning Threads using Closures
A thread created with std::thread::spawn
executes a given closure in an independent (possibly parallel) context of execution. In the following example, the std::thread::spawn
call returns as soon as the thread has been created, without waiting for it to finish executing:
#![allow(unused)] fn main() { // Spawn a new thread which will execute its given closure: std::thread::spawn(|| { println!("Thread 1: chirp!"); }); // At the 'same time', print something from the main thread: println!("Main thread: chirp chirp!"); }
In this case, the main function may return before the independent thread's println!
call happens, which ends the whole program. This is why we often only see the "main thread" print output. Usually, a thread handle is bound to a local variable, and is waited upon before the parent context of execution finishes. This can be done by calling the .join()
method on the thread handle:
#![allow(unused)] fn main() { // Bind the thread's "handle" to a variable: let th = std::thread::spawn(|| { println!("Thread 1: chirp!"); }); println!("Main thread: chirp chirp!"); // Wait for 'th' to finish executing, and re-synchronise both threads: th.join().unwrap(); }
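The .join() method does more than waiting: it also gives back the value returned by the thread's closure. The sketch below illustrates this with a hypothetical sum_in_thread helper, which is not part of the course code.

```rust
use std::thread;

// Hypothetical helper: sums a vector in a separate thread.
fn sum_in_thread(values: Vec<i32>) -> i32 {
    // 'move' transfers ownership of 'values' to the new thread:
    let handle = thread::spawn(move || values.iter().sum::<i32>());
    // 'join()' blocks until the thread finishes and yields its return value;
    // it returns an 'Err' if the thread panicked:
    handle.join().unwrap()
}

fn main() {
    println!("{}", sum_in_thread(vec![1, 2, 3, 4]));
}
```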
The following code prints a modified value of variable var
from 3 different threads, running independently from one another. Is the code safe? Can you guess what will be the resulting output?
use std::thread;
let mut var = 32;
let t1 = thread::spawn(move || {
var += 1;
println!("Thread 1: reading value {}!", var);
});
let t2 = thread::spawn(move || {
var += 2;
println!("Thread 2: reading value {}!", var);
});
var += 3;
println!("Main thread: reading value {}", var);
// Re-synchronise both threads:
t1.join().unwrap();
t2.join().unwrap();
What would happen if we removed all the move
keywords from the code?
Sharing data safely between threads
Using Shared State data sets
Course: Exclusive access with Mutexes
Mutual exclusion, or mutex, is a mechanism which prevents the same data from being accessed by multiple threads at the same time. It relies on a locking system in order to do so: a thread must first ask to acquire the mutex's lock before being able to access the underlying protected data. The lock is a data structure that keeps track of whichever thread has exclusive access to the data. If the mutex happens to be locked at the time a thread tries to access the data, that thread will block until the lock is released and the data is free to acquire.
#![allow(unused)] fn main() { use std::sync::Mutex; // Instantiate a new Mutex<i32> instance with the value '32': let var: Mutex<i32> = Mutex::new(32); { // Acquire the mutex's lock (and panic in case of failure): let mut v = var.lock().unwrap(); // Modify the value safely: *v += 32; } // Print the result: println!("var = {var:?}"); }
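In the example above, the inner braces matter: the lock is held for as long as the guard returned by lock() is alive, and it is released when the guard goes out of scope. The following sketch makes this visible with try_lock, which never blocks and simply fails when the lock is already held. The guard_demo function is a hypothetical helper written for this illustration.

```rust
use std::sync::Mutex;

// Returns: (was the lock free while the guard was alive?,
//           was it free after the guard was dropped?,
//           final value).
fn guard_demo() -> (bool, bool, i32) {
    let var = Mutex::new(32);
    let free_while_held;
    {
        let mut guard = var.lock().unwrap();
        *guard += 32;
        // 'try_lock' does not wait: it fails if the lock is already held:
        free_while_held = var.try_lock().is_ok();
    } // <-- 'guard' goes out of scope here, which releases the lock
    let free_after = var.try_lock().is_ok();
    let value = *var.lock().unwrap();
    (free_while_held, free_after, value)
}

fn main() {
    println!("{:?}", guard_demo());
}
```

Keeping the guard's scope as small as possible reduces the time during which other threads have to wait for the lock.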
In the following example, we want to try to use a Mutex
to get both threads to use and modify var
, but the compiler doesn't allow it. What is the underlying issue here?
use std::thread;
use std::sync::Mutex;

// The shared value is now protected by a Mutex:
let var = Mutex::new(32);
let t1 = thread::spawn(move || {
    *var.lock().unwrap() += 1;
});
let t2 = thread::spawn(move || {
    *var.lock().unwrap() += 2;
});
// Re-synchronise both threads:
t1.join().unwrap();
t2.join().unwrap();
println!("Result: {var:?}");
Course: Reference Counted Mutexes
As we could see in our previous example, a Mutex
in itself is not sufficient to implement viable thread-safe data sharing:
- First is the issue of ownership, which could be solved using, for instance, a shared pointer.
- Second would be the issue of concurrency in accessing this shared pointer from multiple threads simultaneously.
In the Rust programming language, Atomic Reference Counting Arc<T>
, which can be seen as an atomic shared pointer, is designed to remedy this very specific problem. It firstly solves the ownership issue by being "reference-counted", just like a standard Rc
object, but also solves the concurrency issue by being atomic, meaning that each update of its reference count is guaranteed to execute as a single, indivisible operation. When an atomic operation is executed on an object by a specific thread, no other thread can read or modify the object while the operation is in progress. In other words, other threads will only see the object before or after the operation: there is no intermediate state.
Our previous example can then be replaced by the following:
#![allow(unused)] fn main() { use std::thread; use std::sync::{Arc, Mutex}; // We wrap the Mutex in a Atomically Reference-Counted object: let arc = Arc::new(Mutex::new(32)); // We prepare two clones, for moving into the two distinct closures: let rc1 = Arc::clone(&arc); let rc2 = Arc::clone(&arc); let t1 = thread::spawn(move || { let mut v = rc1.lock().unwrap(); *v += 1; }); let t2 = thread::spawn(move || { let mut v = rc2.lock().unwrap(); *v += 2; }); // Re-synchronise both threads: t1.join().unwrap(); t2.join().unwrap(); println!("Result: {arc:?}"); }
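The same pattern scales to any number of threads: each thread receives its own clone of the Arc, and all clones point to the same Mutex. The sketch below assumes a hypothetical count_with_threads helper, introduced only for this illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: spawns 'n' threads which each add 1 to a shared counter.
fn count_with_threads(n: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            // Each thread owns its own clone of the 'Arc':
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // The lock is released at the end of this statement:
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    // Wait for every thread to finish:
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_with_threads(8));
}
```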
If an Arc
object is sufficient to provide multiple ownership and thread-safe access to data, why do we still need a Mutex
guarding our data?
Course: Lock-free data sharing with Atomics
Another way of making data thread-safe is by directly using atomic data structures. The Rust standard library provides a few of them in its std::sync::atomic
module. The main difference with using Mutexes is that atomics are what we call lock-free: unlike with a Mutex, a thread is never put to sleep while waiting for access. Each operation either completes as a single indivisible step or is retried in a short loop (it spins), which is also the mechanism used to build spinlocks. In real-time contexts, where thread sleep is not an option, lock-free data structures are generally preferable.
Our previous example would for instance look like this with atomics:
use std::thread;
use std::sync::Arc;
use std::sync::atomic::AtomicI32;
use std::sync::atomic::Ordering;

// We replace 'Mutex::new()' with the following:
let arc = Arc::new(AtomicI32::new(32));
let rc1 = Arc::clone(&arc);
let rc2 = Arc::clone(&arc);
let t1 = thread::spawn(move || {
    // Read, modify and write back the value as one atomic operation.
    // A separate 'load' followed by a 'store' could silently overwrite
    // an update made by the other thread in between:
    rc1.fetch_update(Ordering::AcqRel, Ordering::Acquire, |v| Some(v + 1)).unwrap();
});
let t2 = thread::spawn(move || {
    // Another (more compact) way of doing this:
    rc2.fetch_add(2, Ordering::AcqRel);
});
// Re-synchronise both threads:
t1.join().unwrap();
t2.join().unwrap();
println!("Result: {arc:?}");
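The idea of retrying instead of sleeping can be sketched with a compare-and-exchange loop, which is how operations that have no dedicated atomic method are usually built. The atomic_double and double_in_threads functions below are hypothetical helpers written for this illustration; compare_exchange_weak is the standard method of AtomicI32.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical helper: atomically doubles the value, retrying on contention.
fn atomic_double(value: &AtomicI32) {
    let mut current = value.load(Ordering::Acquire);
    loop {
        // Store 'current * 2' only if nobody changed the value since we read it:
        match value.compare_exchange_weak(
            current,
            current * 2,
            Ordering::AcqRel,
            Ordering::Acquire,
        ) {
            Ok(_) => break,
            // The exchange did not happen: retry with the fresh value.
            Err(actual) => current = actual,
        }
    }
}

// Hypothetical helper: doubles a shared value from 'n' threads.
fn double_in_threads(start: i32, n: usize) -> i32 {
    let shared = Arc::new(AtomicI32::new(start));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || atomic_double(&shared))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    shared.load(Ordering::Acquire)
}

fn main() {
    println!("{}", double_in_threads(1, 4));
}
```

Whatever the interleaving of the threads, no doubling is lost, because a thread only writes its result if the value it read is still the current one.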
Using Message passing to transfer data between threads
Another approach to multiple ownership and thread-safety in Rust would be using an mpsc::channel
or mpsc::sync_channel
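, which are FIFO queues carrying successive values from one or more senders to a single receiver. The former is asynchronous and has an unbounded buffer (sending never blocks); the latter is synchronous and has a bounded buffer (sending blocks once the buffer is full).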
An mpsc::channel()
call will return a tuple containing a handle to a Sender
object and a Receiver
object, which are by convention respectively named tx
and rx
. These two objects are positioned at each end of a FIFO which tunnels data between the two:
// Create a 'data channel' between a Sender `tx`, and a Receiver `rx`:
let (tx, rx) = std::sync::mpsc::channel::<i32>();
// Send the values `32` and then `16` through the channel:
tx.send(32).unwrap();
tx.send(16).unwrap();
// Wait until a value is available in the channel, then read it:
println!("n = {}", rx.recv().unwrap());
println!("n = {}", rx.recv().unwrap());
Example of sending from a separate thread:
use std::thread;
use std::sync::mpsc;

let (tx, rx) = mpsc::channel();
// Do the same thing in a separate thread:
let th = thread::spawn(move || {
    tx.send(32).unwrap();
    tx.send(16).unwrap();
});
// Wait for the sending thread to finish (and panic if it failed):
th.join().unwrap();
// Wait until a value is available in the channel, then read it:
println!("n = {}", rx.recv().unwrap());
println!("n = {}", rx.recv().unwrap());
Or from multiple threads simultaneously:
#![allow(unused)] fn main() { use std::thread; use std::sync::mpsc; let (tx, rx) = mpsc::channel(); let mut vec = Vec::new(); for n in 0..8 { // Clone the Sender `tx` for each thread: let tx = tx.clone(); vec.push(thread::spawn(move || { tx.send(n).unwrap(); })); } for t in vec { t.join().unwrap(); } for _ in 0..8 { // Consume the FIFO value-by-value: let value = rx.recv().unwrap(); println!("Received value: {}", value); } }
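When the number of messages is not known in advance, the receiver can simply iterate over the channel: the iteration ends once every Sender has been dropped. The sketch below assumes a hypothetical collect_from_workers helper, written for this illustration only.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: each of the 'workers' threads sends one value;
// the received values are returned in sorted order.
fn collect_from_workers(workers: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    for n in 0..workers {
        // Clone the Sender `tx` for each thread:
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(n * n).unwrap();
        });
    }
    // Drop the original Sender: the channel closes once every clone is gone.
    drop(tx);
    // Iterating over 'rx' waits for each value and stops when the channel is closed:
    let mut received: Vec<u32> = rx.iter().collect();
    // Arrival order depends on thread scheduling, so we sort before returning:
    received.sort();
    received
}

fn main() {
    println!("{:?}", collect_from_workers(4));
}
```

Without the drop(tx) call, the original Sender would stay alive and the iteration over rx would never end.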
Using an mpsc::channel
, an Arc<T>
and a Mutex
(or an Atomic
), implement a program which creates and runs two independent threads:
- A producer thread which continuously counts from 0 to infinity.
- A consumer thread which continuously reads and prints the count produced by the producer thread.
- The two threads should run for 5 seconds and then stop.
- Hint: you can use
thread::sleep
to pause a thread for a certain amount of time.
Asynchronous programming
While programming with threads is a perfectly valid way of implementing concurrent programming, it also has a few disadvantages, such as having to rely on operating system scheduling, as well as sometimes making the code difficult to re-use or modify. To address these issues, programmers came up with a new way of structuring a program in a different set of tasks (whether they are independent, concurrent, or sequential), which has been called asynchronous programming.
Code within a thread is written in a sequential style, and the operating system executes the threads concurrently. With asynchronous programming, concurrency happens entirely within the program: the operating system is not involved, which makes context switches faster and memory overhead lower. By being natively integrated into the programming language, which is the case with Rust, it also makes control flow more flexible and expressive. For instance, consider a program using threads, like in the following example:
#![allow(unused)] fn main() { fn count_to(N: i32) { for n in 1..=N { println!("{n}"); std::thread::sleep(std::time::Duration::from_secs(1)); } } let t1 = std::thread::spawn(|| { count_to(10) }); let t2 = std::thread::spawn(|| { count_to(20) }); t1.join().unwrap(); t2.join().unwrap(); }
It could also be written like this, using Rust's async
and await
features (here with the tokio
runtime):
use tokio::{join, time::{sleep, Duration}};

async fn count_to(N: i32) {
    for n in 1..=N {
        println!("{n}");
        // Suspend this future for one second, without blocking the thread:
        sleep(Duration::from_secs(1)).await;
    }
}

#[tokio::main]
async fn main() {
    // Run both futures concurrently on the same task:
    join!(count_to(10), count_to(20));
}