Course 3: Rust Language

Advanced types: compound and collection types

Pierre Cochard, Tanguy Risset

Compound types

Compound types can group multiple values into one type. Rust has two primitive compound types: tuples and arrays.

Course: Tuples

'Tuples' are fixed-size collections of values whose types may differ. They are defined with the (Type, Type, ...) syntax:

#![allow(unused)]
fn main() {
// Explicit type:
let mut tup: (i32, i32, f32, &str) = (31, 16, 47.27, "hello!"); 

// Inferred type:
let mut tup = (31, 16, 47.27, "hello!"); 
}

Accessing individual values within a Tuple can be done by either:

  • referring to its index
  • destructuring the tuple and binding its elements to individually named variables:
#![allow(unused)]
fn main() {
let mut tup = (31, 16, 47.27, "hello!"); 

// Access by index:
tup.0 = 22;
tup.3 = "world!";

// 'Destructuring' a tuple:
let (t0, t1, t2, t3) = tup;
println!("y = ({t0}, {t2})");
}

'Tuples' are stored on the stack unless Box<...> is explicitly used.
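As a minimal sketch of the difference, the example below keeps one tuple on the stack and moves another into a Box. The helper name boxed_tuple is hypothetical and only serves the illustration:

```rust
/// Hypothetical helper: moves a tuple into a heap allocation
/// and returns the owning `Box`.
fn boxed_tuple() -> Box<(i32, i32, f32, &'static str)> {
    Box::new((31, 16, 47.27, "hello!"))
}

fn main() {
    // `on_stack` lives directly in the stack frame of `main`.
    let on_stack = (31, 16, 47.27_f32, "hello!");

    // `on_heap` is a pointer kept on the stack; the tuple itself is on the heap.
    let mut on_heap = boxed_tuple();

    // Fields are accessed the same way: the `Box` is dereferenced automatically.
    on_heap.0 = 22;
    println!("{:?} / {:?}", on_stack, on_heap);
}
```

The heap allocation is released when on_heap goes out of scope, just like any other owned value.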

Take a mutable reference to the third element (f32) of the tuple tup and pass it to a function that multiplies it by 2:

let mut tup = (31, 16, 47.27, "hello!"); 

// The function's prototype (to be implemented):
fn mul2(t: &mut f32);

mul2(...);

assert!(tup == (31, 16, 94.54, "hello!"));
Correction
#![allow(unused)]
fn main() {
let mut tup = (31, 16, 47.27, "hello!"); 

fn mul2(t: &mut f32) {
    *t += *t;
}

mul2(&mut tup.2);
assert!(tup == (31, 16, 94.54, "hello!"));
}

Rust does not allow implicit mutation: the call to mul2 requires &mut:

& → reference (not a value)

mut → reference is exclusive and mutable
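To make the distinction concrete, here is a small sketch that contrasts a shared reference (&) with an exclusive one (&mut). The function names doubled and double_in_place are made up for this illustration:

```rust
/// Reads through a shared reference: the caller's value is left untouched.
fn doubled(t: &f32) -> f32 {
    *t * 2.0
}

/// Writes through an exclusive (mutable) reference: the caller's value changes.
fn double_in_place(t: &mut f32) {
    *t *= 2.0;
}

fn main() {
    let mut x = 1.5_f32;

    // `&x` only allows reading, so `x` keeps its value:
    let y = doubled(&x);
    println!("x = {x}, y = {y}");

    // `&mut x` must be written explicitly at the call site to allow mutation:
    double_in_place(&mut x);
    println!("x = {x}");
}
```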

Tuples can be conveniently used to return multiple values from a function, which can then be assigned to distinct variables in a single statement:

#![allow(unused)]
fn main() {
// A function returning a pair of signed integers:
fn return_tuple(x: i32) -> (i32, i32) {
    return (x+1, x+2);
}

// Calling a function, and storing its result:
let y: (i32, i32) = return_tuple(8);
println!("y = {:?}", y);
println!("y = ({:?}, {:?})", y.0, y.1);

// Passing a tuple as an argument to a function:
fn print_tuple(x: &(i32, i32)) {
    println!("x = ({:?}, {:?})", x.0, x.1);
}

print_tuple(&y);
}
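Assigning the returned values to distinct variables relies on destructuring at the call site. The following sketch shows one way to do it; div_rem is a hypothetical helper introduced only for this example:

```rust
/// Hypothetical helper: returns the quotient and the remainder of `a / b`.
fn div_rem(a: i32, b: i32) -> (i32, i32) {
    (a / b, a % b)
}

fn main() {
    // Destructure the returned tuple directly into two variables:
    let (q, r) = div_rem(17, 5);
    println!("17 = 5 * {q} + {r}");

    // An underscore ignores a component that is not needed:
    let (_, r_only) = div_rem(9, 4);
    println!("9 % 4 = {r_only}");
}
```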

Write a function that transforms a (i32, i32) tuple by swapping its two values:

#![allow(unused)]
fn main() {
// The function prototype to be implemented:
fn swap(tup: &mut(i32, i32));

let mut x = (31, 27);
swap(&mut x);

assert!(x == (27, 31));
}
Correction
#![allow(unused)]
fn main() {
fn swap(tup: &mut(i32, i32)) {
    *tup = (tup.1, tup.0);
}

let mut x = (31, 27);
swap(&mut x);

assert!(x == (27, 31));
}

Course: Arrays (primitive type)

Arrays are fixed-size groups of values of the same type, and can be defined in Rust with the syntax:

  • [Subtype; Length], for instance [i32; 10]

Arrays can be defined and initialized to a specific value in a single statement. An array cannot be used before it has been initialized.

#![allow(unused)]
fn main() {
// Explicit type:
let a: [i32; 3] = [31, 16, 47];

// Inferred type:
let b = [0, 1, 2, 3, 4]; // [i32; 5]

// Create and zero-initialize an array:
let mut a: [usize; 10] = [0; 10];
// same as:
let mut a = [0 as usize; 10];
// same as:
let mut a = [0usize; 10];

// Writing at a specific index:
// Note: in Rust, as in C, array indices start at 0
a[0] = 2;
println!("a[0] = {}", a[0]);
}

'Arrays' are stored on the stack unless Box<...> is explicitly used.
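Because the length is part of an array's type, its size in memory is known at compile time. The sketch below illustrates this, along with bounds-checked access through get(), which returns None instead of panicking the way a[i] does for an out-of-range index at run time. The helper checked_read is hypothetical:

```rust
use std::mem::size_of;

/// Hypothetical helper: returns the element at `index`,
/// or `None` when the index is out of bounds.
fn checked_read(a: &[i32; 3], index: usize) -> Option<i32> {
    a.get(index).copied()
}

fn main() {
    let a: [i32; 3] = [31, 16, 47];

    // The length is fixed by the type [i32; 3]:
    println!("length: {}", a.len());
    // Three 4-byte integers stored contiguously:
    println!("size: {} bytes", size_of::<[i32; 3]>());

    // Bounds-checked reads:
    println!("{:?}", checked_read(&a, 1));
    println!("{:?}", checked_read(&a, 10));
}
```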

As in most programming languages, multidimensional (nested) arrays are also supported in Rust, and can be declared as follows:

#![allow(unused)]
fn main() {
// 2-dimensional array, 2 arrays of `i32` with a length of 10 each:
let mut multi_array = [[0 as i32; 10]; 2];
[...]
multi_array[0][0] = 1;
}

What would be the type of the following arrays?

#![allow(unused)]
fn main() {
let a1 = [(1, 2), (3, 4), (5, 6)];
let a2 = [(1, 2), (3, 4), (5, (6, 7))];
}
Correction

The type of a1 would be:

#![allow(unused)]
fn main() {
let a1: [(i32, i32); 3] = [(1, 2), (3, 4), (5, 6)];
}

a2, on the other hand, does not compile: all values must have the same type in an array.

Course: Ranges & Iterators

Arrays are convenient for storing and processing a set of contiguous data on the stack, for instance through the use of loops, ranges and iterators.

A range represents an interval of values between a start and an end point. In Rust, ranges are written with the start..end construct (which excludes the end value), or with start..=end (which includes the end value).
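The difference between the two constructs can be seen by collecting each range into a Vec. This is only an illustrative sketch; the function names exclusive and inclusive are assumptions made for the example:

```rust
/// Collects the half-open range `start..end` (end excluded).
fn exclusive(start: i32, end: i32) -> Vec<i32> {
    (start..end).collect()
}

/// Collects the closed range `start..=end` (end included).
fn inclusive(start: i32, end: i32) -> Vec<i32> {
    (start..=end).collect()
}

fn main() {
    println!("{:?}", exclusive(0, 5));
    println!("{:?}", inclusive(0, 5));

    // Ranges are iterators, so adapters such as rev() and step_by() apply:
    let countdown: Vec<i32> = (1..=3).rev().collect();
    let evens: Vec<i32> = (0..10).step_by(2).collect();
    println!("{:?} {:?}", countdown, evens);
}
```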

Examine the following assert! statements. Will this program compile, and do the assertions hold?

let a = 0..10;
let b = 1..=10;

// 'a' range:
assert!(a.contains(&0));
assert!(!a.contains(&10));

// 'b' range:
assert!(!b.contains(&0));
assert!(b.contains(&10)); 

// The 'a' and 'b' ranges have the same number of elements:
assert_eq!(a.count(), b.count());
Correction

Yes: the program compiles and all the assert!() statements hold, since both ranges contain 10 elements.

Course: Iterators

Iterators make it possible to traverse an array, a range or a collection, and to access each element one by one.

#![allow(unused)]
fn main() {
let r = 0..10;
// Iterate over a range:
for n in r.into_iter() {
    print!("{n} ");
    // -> 0 1 2 3 4 5 6 7 8 9
}
println!();

// Iterate over an array:
let mut a = [0; 10];

// Basic for-loop iteration:
for x in a {
    println!("{x}");
}
// From a 'range':
for n in 0..a.len() {
    println!("{}", a[n]);
}
// As mutable, changing the values of the array:
for x in &mut a {
    *x += 1;
}
// Equivalent to (using a 'closure'):
a.iter_mut().for_each(|x| *x += 1 );

// Iterate with both element and index:
for (i, x) in a.iter_mut().enumerate() {
    *x += i;
}
}
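Explicit loops like the ones above can often be replaced by a chain of iterator adapters. As a sketch, the hypothetical function sum_even_squares below combines filter(), map() and sum() on a borrowed sequence:

```rust
/// Hypothetical helper: sums the squares of the even values in `values`.
fn sum_even_squares(values: &[i32]) -> i32 {
    values
        .iter()                     // yields &i32 items
        .filter(|x| **x % 2 == 0)   // keep the even values only
        .map(|x| x * x)             // square each remaining value
        .sum()                      // consume the iterator and add everything up
}

fn main() {
    let a = [1, 2, 3, 4, 5, 6];
    // 2*2 + 4*4 + 6*6
    println!("{}", sum_even_squares(&a));
}
```

Adapters such as filter() and map() are lazy: nothing is computed until a consuming method such as sum() or collect() is called.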

Using ranges and/or iterators, fill the first sub-array of the following multidimensional array with values increasing from 1 to 10, and the second sub-array with values decreasing from 10 to 1, as shown below:

#![allow(unused)]
fn main() {
let mut multi_array = [[0 as i32; 10]; 2];

// The following must be true:
assert_eq!(multi_array, [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
]);
}
Correction
#![allow(unused)]
fn main() {
let mut multi_array = [[0 as i32; 10]; 2];
for n in 1..=10 {
    multi_array[0][n-1] = n as i32;
}
for n in (1..=10).rev() {
    multi_array[1][10-n] = n as i32;
}
// The resulting 2D-array
assert_eq!(multi_array, [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
]);
}

Collections

In addition to primitive compound types, the Rust standard library includes a number of very useful data structures called collections. Unlike the built-in array and tuple types, the data these collections point to is stored on the heap, which means the amount of data does not need to be known at compile time and can grow or shrink as the program runs.

Course: Vectors

'Vectors' are collections of multiple values of the same type stored on the heap. Unlike arrays, they have a dynamic size: they can grow or shrink.

A Vec object has ownership over the data located in its underlying heap-allocated buffer, which means that the buffer will be deallocated whenever the owning object goes out of scope.

#![allow(unused)]
fn main() {
// The easiest way to create a vector is to use the 'vec!()' macro:
let mut v = vec![0, 1, 2, 3, 4, 5]; // Vec<i32>
println!("Value: {:?}", v);

// 'Pushing' (appending) a new value at the end:
v.push(6);
println!("Value: {:?}", v);

// 'Popping' (removing) its last value:
let last = v.pop();
println!("Last value: {:?}, Vector: {:?}", last, v);

// Removing the value at index i (the following elements shift left):
let i = 2;
v.remove(i);
}
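The dynamic size of a Vec can be observed through its length and capacity. The following is a minimal sketch, assuming a hypothetical helper named filled that reserves its buffer up front:

```rust
/// Hypothetical helper: builds a vector holding 0..n,
/// reserving room for `n` elements before pushing.
fn filled(n: usize) -> Vec<usize> {
    let mut v = Vec::with_capacity(n);
    for i in 0..n {
        v.push(i);
    }
    v
}

fn main() {
    let mut v = filled(4);
    println!("len = {}, capacity = {}", v.len(), v.capacity());

    // When a push exceeds the current capacity, the Vec reallocates a larger buffer:
    v.push(4);
    println!("len = {}, capacity = {}", v.len(), v.capacity());

    // pop() returns an Option: it yields None once the vector is empty.
    let mut empty: Vec<i32> = Vec::new();
    println!("{:?}", empty.pop());
}
```

The exact capacity values are an implementation detail, which is why the sketch prints them rather than asserting specific numbers.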

Examine the following v0, v1 and v2 vectors and their underlying heap buffer pointers.

#![allow(unused)]
fn main() {
let mut v0: Vec<i32> = vec![0, 1, 2, 3, 4];
// get a pointer to the underlying heap memory buffer:
let v0_ptr: *const i32 = v0.as_ptr();

// Create another vec 'v1' from 'v0', and get its heap pointer again:
// i.e. move v0 to v1
let mut v1: Vec<i32> = v0;
let v1_ptr = v1.as_ptr();

// Create another vec 'v2' from 'v1' using clone():
let mut v2: Vec<i32> = v1.clone();
let v2_ptr = v2.as_ptr();
}

Which of the following assertions are true:

#![allow(unused)]
fn main() {
// Assertion A: the address of pointer 'v0' is the same as pointer 'v1'
assert_eq!(v0_ptr.addr(), v1_ptr.addr());

// Assertion B: the address of pointer 'v1' is the same as pointer 'v2'
assert_eq!(v1_ptr.addr(), v2_ptr.addr());

// Assertion C: the address of pointer 'v0' is the same as pointer 'v2'
assert_eq!(v0_ptr.addr(), v2_ptr.addr());
}
Correction

Assertion A is correct: v0_ptr and v1_ptr are the same. Since v0 was moved into v1, its buffer was not reallocated at another address; v1 now has exclusive ownership of that buffer, which makes v0 inaccessible.

Assertions B and C are incorrect: v2 has been explicitly cloned from v1, which means that all of its contents (including its heap-allocated memory buffer) have been deep-copied: another memory zone has been allocated and filled with the same contents.
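The reasoning can be checked with a single self-contained program. This sketch compares the raw pointers directly with == and wraps the experiment in a hypothetical function named move_vs_clone:

```rust
/// Hypothetical helper: returns (move kept the buffer, clone kept the buffer).
fn move_vs_clone() -> (bool, bool) {
    let v0: Vec<i32> = vec![0, 1, 2, 3, 4];
    let v0_ptr = v0.as_ptr();

    // Move: only ownership of the buffer is transferred, not its contents.
    let v1 = v0;
    let v1_ptr = v1.as_ptr();

    // Clone: a new buffer is allocated and the elements are copied into it.
    let v2 = v1.clone();
    let v2_ptr = v2.as_ptr();

    (v0_ptr == v1_ptr, v1_ptr == v2_ptr)
}

fn main() {
    let (moved_same, cloned_same) = move_vs_clone();
    println!("move keeps the buffer:  {moved_same}");
    println!("clone keeps the buffer: {cloned_same}");
}
```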

Iterating over a vector is the exact same process as for an array (most operations are inter-compatible!).

#![allow(unused)]
fn main() {
// Initializing from a range and iterator:
let mut v = Vec::from_iter((0..6).map(|i| i+1 ));
println!("Value: {:?}", v);

// Iterate/increment:
for x in &mut v {
    *x += 1;
}
println!("Value: {:?}", v);
// General operations:
v.rotate_left(1);
println!("Value: {:?}", v);
// etc.
}

Using a single loop, move the contents of vector v into array a, so that v ends up empty (equal to vec![]) and a holds the elements of v in reverse order: [5, 4, 3, 2, 1, 0]:

#![allow(unused)]
fn main() {
let mut v = vec![0, 1, 2, 3, 4, 5];
let mut a = [0; 6];

(...)

assert_eq!(v, vec![]);
assert_eq!(a, [5, 4, 3, 2, 1, 0]);
}
Correction
#![allow(unused)]
fn main() {
let mut v = vec![0, 1, 2, 3, 4, 5];
let mut a = [0; 6];

for n in (0..v.len()) {
    a[n] = v.pop().unwrap();
}
assert_eq!(v, vec![]);
assert_eq!(a, [5, 4, 3, 2, 1, 0]);
}

Course: Hash-maps

A HashMap is a heap-allocated collection of values of the same type, each indexed by a unique key. Like vectors, hash maps can grow or shrink. They are a convenient choice for representing indexes, dictionaries, or any other kind of database-like object:

#![allow(unused)]
fn main() {
// Unlike Vec, the HashMap data structure needs to be explicitly imported!
use std::collections::HashMap;

// Inferred type:
let mut departments = HashMap::new(); // HashMap<i32, &str>
departments.insert(85, "Vendée");
departments.insert(31, "Haute-Garonne");
departments.insert(44, "Loire-Atlantique");

// Indexing takes a reference to the key (&31) and panics if the key
// is absent. The stored value is a &str, which is Copy, so d31 is a
// copy of that reference and the entry stays in the HashMap.
let d31 = departments[&31];
assert_eq!(d31, "Haute-Garonne");

// Removing a key:
departments.remove(&85);

// Iterating over all entries (this consumes the map):
for department in departments {
    // We get a tuple!
    println!("Key: {}, Value: {}", department.0, department.1);
}
}
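Indexing with [..] panics when the key is missing, so lookups that may fail usually go through get(), which returns an Option. The sketch below also shows the entry() method, which inserts a value only when the key is not present yet. The word_count function is a hypothetical example, not part of the course material:

```rust
use std::collections::HashMap;

/// Hypothetical helper: counts how many times each word occurs in `text`.
fn word_count(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // entry() finds the slot for `word`; or_insert(0) fills it if it is vacant
        // and returns a mutable reference to the stored counter.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_count("the owls are not what the owls seem");

    // get() returns Some(&value) for a known key, and None otherwise:
    println!("{:?}", counts.get("owls"));
    println!("{:?}", counts.get("bunny"));
}
```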

Move the contents of the following Vec object into a BTreeMap (which offers much the same interface as a HashMap, but keeps its contents sorted by key) in order to get these athlete names sorted by score.

Note: some athletes have the same score; their names should appear under the same key.

#![allow(unused)]
fn main() {
use std::collections::BTreeMap;

let vec = vec![
    ("Y. Horigome", 281),
    ("N. Huston", 279),
    ("M. Dell", 153),
    ("J. Eaton", 281),
    ("S. Shirai", 278),
    ("K. Hoefler", 270),
    ("C. Russell", 211),
    ("R. Tury", 273),
];

let mut map = BTreeMap::new();

[...]

for score in map {
    println!("{:?}", score);
}
}

Hints:

  • build a BTreeMap with the key being the score
  • use the entry(key) method of BTreeMap (HashMap provides the same method), which returns the map entry corresponding to key
  • apply or_default on the result of entry (if the key does not exist in the map yet, it inserts the default value for this key, here an empty Vec)

The last for loop should print:

(153, ["M. Dell"])
(211, ["C. Russell"])
(270, ["K. Hoefler"])
(273, ["R. Tury"])
(278, ["S. Shirai"])
(279, ["N. Huston"])
(281, ["Y. Horigome", "J. Eaton"])
Correction
#![allow(unused)]
fn main() {
use std::collections::BTreeMap;

let vec = vec![
    ("Y. Horigome", 281),
    ("N. Huston", 279),
    ("M. Dell", 153),
    ("J. Eaton", 281),
    ("S. Shirai", 278),
    ("K. Hoefler", 270),
    ("C. Russell", 211),
    ("R. Tury", 273),
];

let mut map: BTreeMap<i32, Vec<&str>> = BTreeMap::new();
for v in vec {
    map.entry(v.1)
        .or_default()
        .push(v.0);
}
for score in map {
    println!("{:?}", score);
}
}

'string' types (str and String)

Course: str primitive

The str primitive type can be used to represent a string literal:

#![allow(unused)]
fn main() {
// String literal:
let s = "Hello, World!";
}

As a literal, a str has a static lifetime, which can also be stated explicitly in its type declaration. A static lifetime means that the object is valid throughout the entire duration of the program.

The type &str is usually called a string slice.

#![allow(unused)]
fn main() {
// Here, the three syntaxes are equivalent:
let s = "Hello, World!"; // Inferred type & lifetime
let s: &str = "Hello, World!"; // Explicit type, inferred lifetime
let s: &'static str = "Hello, World!"; // Explicit type & lifetime
}

Unlike const char* in the C programming language, &str in Rust is not null-terminated, but relies on a slice, which is composed of a pointer and a size in bytes:

#![allow(unused)]
fn main() {
let s = "Hello, World!";
println!("Pointer: {:?}, Length: {} bytes", s.as_ptr(), s.len());

for (n, c) in s.chars().enumerate() {
    println!("Char {n}: {c}");
}
}
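Since the size stored in a slice is counted in bytes, len() and chars().count() differ as soon as the text contains non-ASCII characters, which take more than one byte in UTF-8. A short sketch, with a hypothetical helper named byte_and_char_len:

```rust
/// Hypothetical helper: returns (length in bytes, number of chars).
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // Pure ASCII: one byte per character.
    println!("{:?}", byte_and_char_len("Hello"));

    // "Vendée", with 'é' written as the escape \u{e9}: it takes 2 bytes in UTF-8.
    println!("{:?}", byte_and_char_len("Vend\u{e9}e"));
}
```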

For safety reasons, Rust doesn't allow modifying the actual contents (the characters) of a &str, thus the following does not compile:

#![allow(unused)]
fn main() {
let s: &mut str = "Hello, World!";
}

Run the following code:

#![allow(unused)]
fn main() {
let s1 = "It's not about the bunny      \t"; 

// trim() Returns a string slice with leading and trailing whitespace removed.
let s2 = s1.trim();
println!("{s1}");
println!("Address: {:?}, Length: {}", s1.as_ptr(), s1.len());
println!();
println!("{s2}");
println!("Address: {:?}, Length: {}", s2.as_ptr(), s2.len());
}
  • Since Rust forbids modifying the contents of a str literal, why are we in this case allowed to use the .trim() function? What is truly happening in this code?
Correction

In this case, the underlying pointer is the same for s1 and s2, so the underlying buffer has not been modified. Instead, trim() returned another slice, which points to the same address as s1 but has a shorter length that leaves out the trailing spaces and tab.

  • What would happen if we modified s1 as follows?
#![allow(unused)]
fn main() {
let s1 = "     It's not about the bunny      \t"; 
let s2 = s1.trim();
}
Correction

Here, both the pointer and the length would differ in s2: the pointer would be offset by the number of leading whitespace characters in the string literal (here, 5):

#![allow(unused)]
fn main() {
let s1 = "     It's not about the bunny      \t"; 
let s2 = s1.trim();

// s2 pointer address minus 5 (removing the whitespaces) is equal to s1 pointer's address
assert_eq!(s2.as_ptr().addr()-5, s1.as_ptr().addr());
}

Course: String

A String, on the other hand, is a standard library collection type that can essentially be seen as a vector of bytes (Vec<u8>) guaranteed to hold valid UTF-8 text, dynamically stored on the heap. Just like a Vec, it can grow, shrink, and has ownership over its underlying buffer, which makes it an easier object to manipulate. It exposes all of the str methods (a String dereferences to a str), but its contents do not have a static lifetime.

#![allow(unused)]
fn main() {
// Create from a string literal:
let mut s = String::from("Owls are not what they seem");

// Append a 'char':
s.push('!');
println!("Value: {:?}", s);

// Append another string:
s.push_str(" Really?");
println!("Value: {:?}", s);

// Other ways of appending to the String:
s = s + " Yes, ";
s += "really!";

// Iterate over every 'char'
for c in s.chars() {
    print!("{c} ");
}
println!("");

// Example of transformation:
s = s.chars().rev().collect();

println!("Value: {:?}", s);
}
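Because a String dereferences to a str, a function that takes a &str accepts string literals, borrowed Strings and sub-slices alike. The following sketch illustrates this; the function shout is a hypothetical example:

```rust
/// Hypothetical helper: accepts any string slice and returns a new, owned String.
fn shout(s: &str) -> String {
    let mut out = s.to_uppercase();
    out.push('!');
    out
}

fn main() {
    let literal: &str = "fire walk with me";
    let owned: String = literal.to_string(); // copies the text into a heap buffer

    println!("{}", shout(literal));      // a &'static str
    println!("{}", shout(&owned));       // a &String coerces to &str
    println!("{}", shout(&owned[..4]));  // a sub-slice of the String ("fire")
}
```

Taking &str rather than &String in function signatures is the usual choice, since it places no requirement on how the caller stores the text.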

Examine the following code:

#![allow(unused)]
fn main() {
let s0: &str = "That gum you like is going to come back in style";

// Build a 'String' object from the previous '&str':
let mut s1: String = String::from(s0);

// Now modify 's1':
s1 = s1.to_ascii_uppercase();
println!("{s1}");

}

Is s0 still accessible? If so, what is its value now? Is it the same as s1, and why?

Correction

s0 is still accessible, and its value remains the same. The contents of s0 have been copied into a heap buffer owned by s1. The two are completely separate from one another.

We now call the as_str() method on s1, and store the resulting &str value in a new variable called s2. Can you guess what the lifetime of s2 is?

#![allow(unused)]
fn main() {
let s0: &str = "That gum you like is going to come back in style";

// Build a 'String' object from the previous '&str':
let mut s1: String = String::from(s0);
let s2: &str = s1.as_str();
}
Correction

s2 borrows from s1, so it cannot outlive s1: it refers to the same underlying heap buffer. In this case, this &str is not a string literal, which would have a 'static lifetime, but a slice that points to dynamically-allocated memory.
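One practical consequence is that s1 cannot be modified while a slice borrowed from it is still in use. The sketch below illustrates this with a hypothetical function first_word, whose returned slice borrows from its argument:

```rust
/// Hypothetical helper: returns the first word of `s`.
/// The returned slice borrows from `s` and cannot outlive it.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let mut s1 = String::from("That gum you like");

    // s2 points into the heap buffer owned by s1:
    let s2: &str = first_word(&s1);
    println!("{s2}"); // last use of s2: the borrow of s1 ends here

    // Moving this mutation above the println! would be rejected by the
    // compiler, because s1 would be modified while s2 still borrows it.
    s1.push_str(" is going to come back in style");
    println!("{s1}");
}
```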

Slices

A slice in Rust can be considered as a bounded pointer or reference to a contiguous sequence of elements in an array, a collection, or a string of characters, as we saw earlier. It is declared with the &[T] syntax (or &mut [T] for a mutable slice). Since it works like a reference, it does not have ownership over its contents.

#![allow(unused)]
fn main() {
// Create a byte buffer:
let mut buffer = [0 as u8; 16];

// Get a slice on half the buffer:
// (notice how the slice itself is not mutable, 
// but instead points to a mutable sequence in the buffer)
let slice: &mut[u8] = &mut buffer[0..8]; // we use the 'range syntax' here to capture the slice

// Iterate on the slice to change values:
for (i, n) in slice.iter_mut().enumerate() {
    *n = i as u8;
}
println!("{:?}", buffer);
}
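Since a slice only describes a borrowed sequence, a function taking a &[T] works equally well with arrays, vectors and sub-ranges of either. A minimal sketch, using a hypothetical function named largest:

```rust
/// Hypothetical helper: returns the largest element of a slice,
/// or `None` if the slice is empty.
fn largest(values: &[i32]) -> Option<i32> {
    values.iter().copied().max()
}

fn main() {
    let array = [3, 9, 2];
    let vector = vec![7, 1, 8, 4];

    println!("{:?}", largest(&array));        // whole array
    println!("{:?}", largest(&vector));       // whole vector
    println!("{:?}", largest(&vector[1..3])); // elements at indices 1 and 2
    println!("{:?}", largest(&[]));           // empty slice
}
```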

Take a slice out of the string object, starting at index 25 (a byte offset, which matches the character position here because the text is pure ASCII) and running to the end, then call the .make_ascii_lowercase() method on the captured slice.

#![allow(unused)]
fn main() {
let mut string = String::from("YOU REMIND ME TODAY OF A SMALL MEXICAN CHIHUAHUA");

let slice = ...;

slice.make_ascii_lowercase();

println!("{string}");
}
Correction
#![allow(unused)]
fn main() {
let mut string = String::from("YOU REMIND ME TODAY OF A SMALL MEXICAN CHIHUAHUA");

let slice = &mut string[25..];

slice.make_ascii_lowercase();

println!("{string}");
}

What is the inferred type of the slice variable?

Correction

It is a &mut str.