Course 2: Rust Language

Advanced types: compound and collection types

Pierre Cochard, Tanguy Risset

Compound types
Collections
- Course: Vectors
- Course: Hash-maps
'string' types (str and String)
- Course: str primitive
- Course: String
Slices

Compound types

Compound types can group multiple values into one type. Rust has two primitive compound types: tuples and arrays.

Course: Tuples

'Tuples' are fixed-size collections of values that may each have a different type. They are defined with the (Type, Type, ...) syntax:

#![allow(unused)]
fn main() {
// Explicit type:
let mut tup: (i32, i32, f32, &str) = (31, 16, 47.27, "hello!"); 

// Inferred type:
let mut tup = (31, 16, 47.27, "hello!"); 
}

Individual values within a tuple can be accessed by either:

- referring to their index
- destructuring the tuple and binding its values to individually named variables:

#![allow(unused)]
fn main() {
let mut tup = (31, 16, 47.27, "hello!"); 

// Access by index:
tup.0 = 22;
tup.3 = "world!";

// 'Destructuring' a tuple:
let (t0, t1, t2, t3) = tup;
println!("y = ({t1}, {t2})");
}

Take a mutable reference to the third element (f32) of the tuple tup and pass it to a function that multiplies it by 2:

let mut tup = (31, 16, 47.27, "hello!"); 

// The function's prototype (to be implemented):
fn mul2(t: &mut f32);

mul2(...);

assert!(tup == (31, 16, 94.54, "hello!"));

Tuples can be conveniently used in a function to return multiple values, which can then be assigned to distinct variables in a single expression:

#![allow(unused)]
fn main() {
// A function returning a pair of signed integers:
fn return_tuple(x: i32) -> (i32, i32) {
    return (x+1, x+2);
}

// Calling a function, and storing its result:
let y: (i32, i32) = return_tuple(8);
println!("y = {:?}", y);
println!("y = ({:?}, {:?})", y.0, y.1);

// Passing a tuple as an argument to a function:
fn print_tuple(x: &(i32, i32)) {
    println!("x = ({:?}, {:?})", x.0, x.1);
}

print_tuple(&y);
}
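The code above stores the returned pair in a single variable. As a minimal sketch of assigning the returned values to distinct variables instead, the result can be destructured directly at the call site. The `min_max` function is a hypothetical helper introduced only for this illustration:

```rust
// Hypothetical helper (not part of the course material):
// returns the smallest and largest of two integers as a pair.
fn min_max(a: i32, b: i32) -> (i32, i32) {
    if a <= b { (a, b) } else { (b, a) }
}

fn main() {
    // Destructure the returned tuple into two named variables:
    let (lo, hi) = min_max(16, 8);
    println!("lo = {lo}, hi = {hi}");

    // Use `_` to ignore a value that is not needed:
    let (_, largest) = min_max(3, 31);
    println!("largest = {largest}");
}
```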

Write a function that transforms a (i32, i32) tuple by swapping its two values:

#![allow(unused)]
fn main() {
// The function prototype to be implemented:
fn swap(tup: &mut(i32, i32));

let mut x = (31i32, 27i32);
swap(&mut x);

assert!(x == (27i32, 31i32));
}

Course: Arrays (primitive type)

Arrays are fixed-size groups of values of the same type, and can be defined in Rust with the syntax:

[Subtype; Length], for instance [i32; 10]

#![allow(unused)]
fn main() {
// Explicit type:
let a: [i32; 3] = [31, 16, 47];

// Inferred type:
let b = [0, 1, 2, 3, 4]; // [i32; 5]

// Create and zero-initialize an array:
let mut a: [usize; 10] = [0; 10];
// same as:
let mut a = [0 as usize; 10];
// same as:
let mut a = [0usize; 10];

// Writing at a specific index:
// Note: in Rust, as in C, array indices start at 0
a[0] = 2;
println!("a[0] = {}", a[0]);
}

As in most programming languages, multidimensional/nested arrays are also supported in Rust, and can be declared as follows:

#![allow(unused)]
fn main() {
// 2-dimensional array, 2 arrays of `i32` with a length of 10 each:
let mut multi_array = [[0 as i32; 10]; 2];
}
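As a short sketch of how such a nested array might be read and written, the example below uses a smaller 2 x 3 array. The `sum_all` helper and the `grid` name are assumptions made for this illustration only:

```rust
// Hypothetical helper: sums every element of a 2 x 3 array of `i32`.
fn sum_all(grid: &[[i32; 3]; 2]) -> i32 {
    let mut total = 0;
    // Iterating over the outer array yields references to the sub-arrays:
    for row in grid {
        for value in row {
            total += *value;
        }
    }
    total
}

fn main() {
    let mut grid = [[0i32; 3]; 2];

    // The first index selects the sub-array, the second the element inside it:
    grid[0][0] = 1;
    grid[1][2] = 7;

    println!("{:?}", grid);
    println!("rows = {}, columns = {}", grid.len(), grid[0].len());
    println!("sum = {}", sum_all(&grid));
}
```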

What would be the type of the following arrays?

#![allow(unused)]
fn main() {
let a1 = [(1, 2), (3, 4), (5, 6)];
let a2 = [(1, 2), (3, 4), (5, (6, 7))];
}

Course: Ranges & Iterators

Arrays are convenient for storing and processing a set of contiguous data on the stack, for instance through the use of loops, ranges and iterators.

A range represents an interval of values between a start and an end point. In Rust, ranges can be conveniently written with the start..end construct (which excludes the end value), or with start..=end (which includes the end value).
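To make the difference between the two constructs visible, a range can be collected into a vector. This is only a sketch: the `exclusive` and `inclusive` helpers are hypothetical names chosen for the illustration.

```rust
// Hypothetical helpers: collect a range into a vector to make its contents visible.
fn exclusive(start: i32, end: i32) -> Vec<i32> {
    (start..end).collect()
}

fn inclusive(start: i32, end: i32) -> Vec<i32> {
    (start..=end).collect()
}

fn main() {
    println!("{:?}", exclusive(0, 5)); // the end value is excluded
    println!("{:?}", inclusive(0, 5)); // the end value is included
}
```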

Examine the following assert! statements. Will this program compile?

let a = 0..10;
let b = 1..=10;

// 'a' range:
assert!(a.contains(&0));
assert!(!a.contains(&10));

// 'b' range:
assert!(!b.contains(&0));
assert!(b.contains(&10)); 

// The 'a' and 'b' ranges have the same number of elements:
assert_eq!(a.count(), b.count());

Course: Iterators

Iterators make it possible to go through an array, a range or a collection, and to access each element one by one.

#![allow(unused)]
fn main() {
let r = 0..10;
// Iterate over a range:
for n in r.into_iter() {
    print!("{n} ");
    // -> 0 1 2 3 4 5 6 7 8 9
}
println!();

// Iterate over an array:
let mut a = [0; 10];

// Basic for-loop iteration:
for x in a {
    println!("{x}");
}
// From a 'range':
for n in (0 .. a.len()) {
    println!("{}", a[n]);
}
// As mutable, changing the values of the array:
for x in &mut a {
    *x += 1;
}
// Equivalent to (using a 'closure'):
a.iter_mut().for_each(|x| *x += 1 );

// Iterate with both element and index:
for (i, x) in a.iter_mut().enumerate() {
    *x += i;
}
}
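The "one by one" access performed by the loops above can also be written out by hand with the `next()` method of the iterator. The following is a minimal sketch, in which `sum_manually` is a hypothetical helper:

```rust
// Hypothetical helper: walks through a slice by calling `next()` by hand,
// which is essentially what a `for` loop does behind the scenes.
fn sum_manually(values: &[i32]) -> i32 {
    let mut it = values.iter();
    let mut total = 0;
    // `next()` returns `Some(&element)` until the iterator is exhausted, then `None`:
    while let Some(x) = it.next() {
        total += *x;
    }
    total
}

fn main() {
    let a = [31, 16, 47];
    let mut it = a.iter();
    println!("{:?}", it.next());
    println!("{:?}", it.next());
    println!("{:?}", it.next());
    println!("{:?}", it.next()); // the iterator is now exhausted
    println!("sum = {}", sum_manually(&a));
}
```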

Using ranges and/or iterators, fill the first sub-array of the following multidimensional array with values incrementing from 1 to 10, and the second sub-array with values decrementing from 10 to 1, as shown below:

#![allow(unused)]
fn main() {
let mut multi_array = [[0 as i32; 10]; 2];

// The following must be true:
assert_eq!(multi_array, [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
]);
}

Collections

In addition to primitive compound types, the Rust standard library includes a number of very useful data structures called collections. Unlike the built-in array and tuple types, the data these collections point to is stored on the heap, which means the amount of data does not need to be known at compile time and can grow or shrink as the program runs.

Course: Vectors

'Vectors' are collections of multiple values of the same type, stored on the heap. Unlike arrays, they have a dynamic size: they can grow or shrink.

A Vec object has ownership over the data located in its underlying heap-allocated buffer, which means that the buffer will be deallocated whenever the owning object goes out of scope.

#![allow(unused)]
fn main() {
// The easiest way to create a vector is to use the 'vec!()' macro:
let mut v = vec![0, 1, 2, 3, 4, 5]; // Vec<i32>
println!("Value: {:?}", v);

// 'Pushing' (appending) a new value at the end:
v.push(6);
println!("Value: {:?}", v);

// 'Popping' (removing) its last value:
let last = v.pop();
println!("Last value: {:?}, Vector: {:?}", last, v);
}
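The ownership and dynamic-size properties described above can be sketched as follows. The `consume` function is a hypothetical helper, and the exact capacity values printed depend on the allocation strategy of the standard library, so they are not listed here:

```rust
// Hypothetical helper: takes ownership of the vector and returns its length.
// The vector (and its heap buffer) is dropped when this function returns.
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

fn main() {
    {
        let inner = vec![1, 2, 3];
        println!("inner = {:?}", inner);
    } // `inner` goes out of scope here: its heap buffer is deallocated

    let mut v = Vec::new();
    for i in 0..5 {
        v.push(i);
        // `len` is the number of stored elements; `capacity` is the size of
        // the allocated buffer, which is always at least `len`:
        println!("len = {}, capacity = {}", v.len(), v.capacity());
    }

    let n = consume(v);
    println!("consumed a vector of {n} elements");
    // `v` can no longer be used here: ownership was moved into `consume`.
}
```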

Examine the following v0, v1 and v2 vectors and their underlying heap buffer pointers.

#![allow(unused)]
fn main() {
let mut v0: Vec<i32> = vec![0, 1, 2, 3, 4];
// get a pointer to the underlying heap memory buffer:
let v0_ptr: *const i32 = v0.as_ptr();

// Create another vec 'v1' from 'v0', and get its heap pointer again:
let mut v1: Vec<i32> = v0;
let v1_ptr = v1.as_ptr();

// Create another vec 'v2' from 'v1':
let mut v2: Vec<i32> = v1.clone();
let v2_ptr = v2.as_ptr();
}

Which of the following assertions are true:

#![allow(unused)]
fn main() {
// Assertion A: the address of pointer 'v0' is the same as pointer 'v1'
assert_eq!(v0_ptr.addr(), v1_ptr.addr());

// Assertion B: the address of pointer 'v1' is the same as pointer 'v2'
assert_eq!(v1_ptr.addr(), v2_ptr.addr());

// Assertion C: the address of pointer 'v0' is the same as pointer 'v2'
assert_eq!(v0_ptr.addr(), v2_ptr.addr());
}

Iterating over a vector is the exact same process as for an array (most operations are inter-compatible!).

#![allow(unused)]
fn main() {
// Initializing from a range and iterator:
let mut v = Vec::from_iter((0..6).map(|i| i+1 ));
println!("Value: {:?}", v);

// Iterate/increment:
for x in &mut v {
    *x += 1;
}
println!("Value: {:?}", v);
// General operations:
v.rotate_left(1);
println!("Value: {:?}", v);
// etc.
}

Using a single loop, move the contents of vector v to array a, so that vector v is equal to vec![] (empty vector) and array a is equal to [5, 4, 3, 2, 1, 0]:

#![allow(unused)]
fn main() {
let mut v = vec![0, 1, 2, 3, 4, 5];
let mut a = [0; 6];

(...)

assert_eq!(v, vec![]);
assert_eq!(a, [5, 4, 3, 2, 1, 0]);
}

Course: Hash-maps

A HashMap is a heap-allocated collection of same-type values, each indexed by a unique key. Like vectors, hash-maps can grow or shrink. They are a convenient choice for representing indexes, dictionaries, or any other kind of database-like object:

#![allow(unused)]
fn main() {
// Unlike Vec, the HashMap data structure needs to be explicitly imported!
use std::collections::HashMap;

// Inferred type:
let mut departments = HashMap::new(); // HashMap<i32, &str>
departments.insert(85, "Vendée");
departments.insert(31, "Haute-Garonne");
departments.insert(44, "Loire-Atlantique");

// Indexing takes a reference to the key (&31) and gives access to the stored value.
// Note: indexing panics if the key is missing; `departments.get(&31)` returns an Option instead.
let d31 = departments[&31];
assert_eq!(d31, "Haute-Garonne");

// Removing a key:
departments.remove(&85);

// Iterating over all (key, value) entries (this consumes the map):
for department in departments {
    // We get a tuple!
    println!("Key: {}, Value: {}", department.0, department.1);
}
}
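Beyond `insert` and indexing, two other standard `HashMap` methods are often useful: `get`, which returns an `Option` rather than panicking on a missing key, and `entry`, which inserts a default value when a key is seen for the first time. The sketch below assumes a hypothetical `word_count` helper to show both:

```rust
use std::collections::HashMap;

// Hypothetical helper: counts how many times each word appears in a text.
fn word_count(text: &str) -> HashMap<&str, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // `entry` inserts 0 the first time a key is seen, then returns
        // a mutable reference to the stored value:
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_count("the owls are not what they seem the owls");

    // `get` returns an `Option`, so a missing key does not panic:
    println!("{:?}", counts.get("owls"));
    println!("{:?}", counts.get("bunny"));
    println!("{}", counts.contains_key("seem"));
}
```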

Move the contents of the following Vec object into a BTreeMap (which behaves the same as a HashMap, but will sort its contents by key) in order to get these athlete names sorted by their score in points.

Note: some of them have the same score; their names should appear under the same key.

#![allow(unused)]
fn main() {
use std::collections::BTreeMap;

let vec = vec![
    ("Y. Horigome", 281),
    ("N. Huston", 279),
    ("M. Dell", 153),
    ("J. Eaton", 281),
    ("S. Shirai", 278),
    ("K. Hoefler", 270),
    ("C. Russell", 211),
    ("R. Tury", 273),
];

let mut map = BTreeMap::new();

[...]

for score in map {
    println!("{:?}", score);
}
}

The last for loop should print:

(153, ["M. Dell"])
(211, ["C. Russell"])
(270, ["K. Hoefler"])
(273, ["R. Tury"])
(278, ["S. Shirai"])
(279, ["N. Huston"])
(281, ["Y. Horigome", "J. Eaton"])

'string' types (`str` and `String`)

Course: `str` primitive

The str primitive type can be used to represent a string literal:

#![allow(unused)]
fn main() {
// String literal:
let s = "Hello, World!";
}

A string literal has a static lifetime, which can also be stated explicitly in its type declaration. A static lifetime means that the object is valid throughout the entire duration of the program.

#![allow(unused)]
fn main() {
// Here, the three syntaxes are equivalent:
let s = "Hello, World!"; // Inferred type & lifetime
let s: &str = "Hello, World!"; // Explicit type, inferred lifetime
let s: &'static str = "Hello, World!"; // Explicit type & lifetime
}
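One practical consequence of the static lifetime is that a function can return a string literal without borrowing from any of its arguments. A minimal sketch, where `department_name` is a hypothetical function reusing the department names from the hash-map section:

```rust
// Hypothetical helper: because string literals live for the whole program,
// a function may return one with the `'static` lifetime.
fn department_name(code: u32) -> &'static str {
    match code {
        31 => "Haute-Garonne",
        44 => "Loire-Atlantique",
        85 => "Vendée",
        _ => "unknown",
    }
}

fn main() {
    let name = department_name(31);
    println!("31 -> {name}");
    println!("99 -> {}", department_name(99));
}
```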

Unlike const char* in the C programming language, a &str in Rust is not null-terminated. It is instead a slice, composed of a pointer to the first byte and a size in bytes:

#![allow(unused)]
fn main() {
let s = "Hello, World!";
println!("Pointer: {:?}, Length: {} bytes", s.as_ptr(), s.len());

for (n, char) in s.chars().enumerate() {
    println!("Char {n}: {char}");
}
}
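Since the length of a `&str` is expressed in bytes and the text is UTF-8 encoded, `len()` and the number of characters only coincide for ASCII text. The sketch below illustrates this with a hypothetical `byte_and_char_len` helper:

```rust
// Hypothetical helper: returns (length in bytes, number of characters).
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // ASCII only: one byte per character.
    println!("{:?}", byte_and_char_len("Hello"));
    // 'é' takes more than one byte in UTF-8, so the two counts differ.
    println!("{:?}", byte_and_char_len("Vendée"));
}
```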

For safety reasons, Rust doesn't allow modifying the actual contents (the characters) of a &str, thus the following does not compile:

#![allow(unused)]
fn main() {
let s: &mut str = "Hello, World!";
}

Run the following code:

#![allow(unused)]
fn main() {
let s1 = "It's not about the bunny      \t"; 

// Remove leading/trailing whitespace, tabs and newlines from 's1': 
let s2 = s1.trim();
println!("{s1}");
println!("Address: {:?}, Length: {}", s1.as_ptr(), s1.len());
println!();
println!("{s2}");
println!("Address: {:?}, Length: {}", s2.as_ptr(), s2.len());
}

Since Rust forbids modifying the contents of a str literal, why are we in this case allowed to use the .trim() function? What is truly happening in this code?

What would happen if we modified s1 as follows?

#![allow(unused)]
fn main() {
let s1 = "     It's not about the bunny      \t"; 
let s2 = s1.trim();
}

Course: String

A String, on the other hand, is a standard library collection type that can essentially be seen as a vector of UTF-8 encoded bytes (a Vec<u8>), dynamically stored on the heap. Just like a Vec, it can grow, shrink, and has ownership over its own underlying buffer, which makes it an easier object to manipulate. While it exposes all of the str methods (a String dereferences to a str), it does not have a static lifetime.

#![allow(unused)]
fn main() {
// Create from a string literal:
let mut s = String::from("Owls are not what they seem");

// Append a 'char':
s.push('!');
println!("Value: {:?}", s);

// Append another string:
s.push_str(" Really?");
println!("Value: {:?}", s);

// Other ways of appending to the String:
s = s + " Yes, ";
s += "really!";

// Iterate over every 'char'
for c in s.chars() {
    print!("{c} ");
}
println!("");

// Example of transformation:
s = s.chars().rev().collect();

println!("Value: {:?}", s);
}
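The relationship between `String` and `str` can be sketched as follows: a function that takes a `&str` accepts both a literal and a borrowed `String`, and `str` methods can be called directly on a `String`. The `shout` function is a hypothetical helper used only for this illustration:

```rust
// Hypothetical helper: accepts any string slice, whatever owns the characters.
fn shout(s: &str) -> String {
    // `to_uppercase` is a `str` method that allocates and returns a new `String`:
    s.to_uppercase()
}

fn main() {
    let literal: &str = "fire walk with me";
    let owned: String = String::from("fire walk with me");

    // A `&String` is automatically converted ("deref coercion") into a `&str`:
    println!("{}", shout(literal));
    println!("{}", shout(&owned));

    // `str` methods can be called directly on a `String`:
    println!("{}", owned.starts_with("fire"));
    println!("{:?}", owned.split(' ').collect::<Vec<&str>>());
}
```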

Examine the following code:

#![allow(unused)]
fn main() {
let s0: &str = "That gum you like is going to come back in style";

// Build a 'String' object from the previous '&str':
let mut s1: String = String::from(s0);

// Now modify 's1':
s1 = s1.to_ascii_uppercase();
println!("{s1}");

}

Is s0 still accessible? If yes, what is now its value? Is it the same as s1 and why?

We now call the as_str() method on s1, and store the resulting &str value in a new variable called s2. Can you guess what the lifetime of s2 is?

#![allow(unused)]
fn main() {
let s0: &str = "That gum you like is going to come back in style";

// Build a 'String' object from the previous '&str':
let mut s1: String = String::from(s0);
let s2: &str = s1.as_str();
}

Slices

A slice in Rust can be considered as a bounded pointer or reference to a contiguous sequence of elements in an array, a collection, or a string of characters, as we saw earlier. Its type is written &[T] (or &mut [T] for a mutable slice). Since it works like a reference, it does not have ownership over its contents.

#![allow(unused)]
fn main() {
// Create a byte buffer:
let mut buffer = [0 as u8; 16];

// Get a slice on half the buffer:
// (notice how the slice itself is not mutable, 
// but instead points to a mutable sequence in the buffer)
let slice: &mut[u8] = &mut buffer[0..8]; // we use the 'range syntax' here to capture the slice

// Iterate on the slice to change values:
for (i, n) in slice.iter_mut().enumerate() {
    *n = i as u8;
}
println!("{:?}", buffer);
}
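Because a slice only borrows a contiguous sequence, a function that takes a `&[T]` parameter works with arrays, vectors, and sub-ranges of either. A minimal sketch, assuming a hypothetical `largest` helper:

```rust
// Hypothetical helper: works on any contiguous sequence of `i32`,
// regardless of whether it comes from an array or a vector.
fn largest(values: &[i32]) -> Option<i32> {
    values.iter().copied().max()
}

fn main() {
    let array = [3, 31, 16];
    let vector = vec![47, 27, 8, 94];

    println!("{:?}", largest(&array));        // the whole array
    println!("{:?}", largest(&vector));       // the whole vector
    println!("{:?}", largest(&vector[1..3])); // only the elements at index 1 and 2
    println!("{:?}", largest(&[]));           // an empty slice
}
```

Taking `&[i32]` rather than `&Vec<i32>` or `&[i32; 3]` keeps the function independent of the container that owns the data.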

Take a slice out of the string object, starting from character 25 until the end, and use the .make_ascii_lowercase() method on the captured slice.

#![allow(unused)]
fn main() {
let mut string = String::from("YOU REMIND ME TODAY OF A SMALL MEXICAN CHIHUAHUA");

let slice = ...;

slice.make_ascii_lowercase();

println!("{string}");
}

What is the inferred type of the slice variable?

5TC-Rust