Advanced types: Compound and collection types
Pierre Cochard, Tanguy Risset
Compound types
Compound types can group multiple values into one type. Rust has two primitive compound types: tuples and arrays.
Course: Tuples
'Tuples' are fixed-size collections of arbitrary-typed values,
they are defined with the (Type, Type, ...) syntax:
#![allow(unused)]
fn main() {
    // Explicit type:
    let mut tup: (i32, i32, f32, &str) = (31, 16, 47.27, "hello!");
    // Inferred type:
    let mut tup = (31, 16, 47.27, "hello!");
}
Accessing individual values within a tuple can be done by either:
- referring to its index
- destructuring the tuple and binding its values to individually named variables:
#![allow(unused)]
fn main() {
    let mut tup = (31, 16, 47.27, "hello!");
    // Access by index:
    tup.0 = 22;
    tup.3 = "world!";
    // 'Destructuring' a tuple:
    let (t0, t1, t2, t3) = tup;
    println!("y = ({t0}, {t2})");
}
'Tuples' are stored on the stack unless Box<...> is explicitly used.
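As a minimal sketch of this stack/heap distinction, the following moves a tuple into a heap allocation with Box::new. The helper name boxed_tuple and the values it returns are illustrative assumptions, not part of the course material.

```rust
/// Hypothetical helper: builds a tuple and moves it into a heap allocation.
fn boxed_tuple() -> Box<(i32, f32, &'static str)> {
    Box::new((31, 47.27, "hello!"))
}

fn main() {
    // Stored inline, in the stack frame of 'main':
    let on_stack = (31, 47.27_f32, "hello!");
    // Stored on the heap; only the Box (a pointer) lives on the stack:
    let on_heap = boxed_tuple();
    // Fields are accessed the same way in both cases:
    assert_eq!(on_stack.0, on_heap.0);
    // Dereferencing the Box gives back the tuple value:
    assert_eq!(on_stack, *on_heap);
    println!("{:?} / {:?}", on_stack, on_heap);
}
```

The Box owns its heap allocation, which is released when on_heap goes out of scope.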
Take a mutable reference to the third element (f32) of the tuple tup and pass it to a function that multiplies it by 2:
let mut tup = (31, 16, 47.27, "hello!");
// The function's prototype (to be implemented):
fn mul2(t: &mut f32);
mul2(...);
assert!(tup == (31, 16, 94.54, "hello!"));
Correction
#![allow(unused)]
fn main() {
    let mut tup = (31, 16, 47.27, "hello!");
    fn mul2(t: &mut f32) {
        // Dereference the mutable reference to modify the value in place:
        *t *= 2.0;
    }
    mul2(&mut tup.2);
    assert!(tup == (31, 16, 94.54, "hello!"));
}
Rust rejects implicit mutation, so the call to mul2 requires &mut:
& → reference (not a value)
mut → reference is exclusive and mutable
Tuples can be conveniently used by a function to return multiple values, which can then be assigned to distinct variables in a single statement:
#![allow(unused)]
fn main() {
    // A function returning a pair of signed integers:
    fn return_tuple(x: i32) -> (i32, i32) {
        (x + 1, x + 2)
    }
    // Calling a function, and storing its result:
    let y: (i32, i32) = return_tuple(8);
    println!("y = {:?}", y);
    println!("y = ({:?}, {:?})", y.0, y.1);
    // Passing a tuple as an argument to a function:
    fn print_tuple(x: &(i32, i32)) {
        println!("x = ({:?}, {:?})", x.0, x.1);
    }
    print_tuple(&y);
}
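A returned tuple can also be destructured directly at the call site, which binds each value to its own variable. The sketch below assumes a hypothetical div_rem helper, introduced here only for illustration.

```rust
/// Hypothetical helper: returns the quotient and the remainder of a division.
fn div_rem(a: i32, b: i32) -> (i32, i32) {
    (a / b, a % b)
}

fn main() {
    // Destructure the returned tuple into two variables at once:
    let (q, r) = div_rem(17, 5);
    assert_eq!((q, r), (3, 2));
    println!("17 = 5 * {q} + {r}");
}
```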
Write a function that transforms an (i32, i32) tuple by swapping its two values:
#![allow(unused)]
fn main() {
    // The function prototype to be implemented:
    fn swap(tup: &mut (i32, i32));
    let mut x = (31, 27);
    swap(&mut x);
    assert!(x == (27, 31));
}
Correction
#![allow(unused)]
fn main() {
    fn swap(tup: &mut (i32, i32)) {
        // Build a new tuple with the fields in reverse order:
        *tup = (tup.1, tup.0);
    }
    let mut x = (31, 27);
    swap(&mut x);
    assert!(x == (27, 31));
}
Course: Arrays (primitive type)
Arrays are fixed-size groups of values of the same type, and can be defined in Rust with the syntax:
[Type; Length], for instance [i32; 10]
Arrays can be defined and initialized to a specific value in a single statement. They cannot be used before being initialized.
#![allow(unused)]
fn main() {
    // Explicit type:
    let a: [i32; 3] = [31, 16, 47];
    // Inferred type:
    let b = [0, 1, 2, 3, 4]; // [i32; 5]
    // Create and zero-initialize an array:
    let mut a: [usize; 10] = [0; 10];
    // same as:
    let mut a = [0 as usize; 10];
    // same as:
    let mut a = [0usize; 10];
    // Writing at a specific index:
    // Note: in Rust, as in C, array indices start at 0
    a[0] = 2;
    println!("a[0] = {}", a[0]);
}
'Arrays' are stored on the stack unless Box<...> is explicitly used.
As in most programming languages, multidimensional/nested arrays are also supported in Rust, and can be declared as follows:
#![allow(unused)]
fn main() {
    // 2-dimensional array, 2 arrays of `i32` with a length of 10 each:
    let mut multi_array = [[0 as i32; 10]; 2];
    // [...]
    multi_array[0][0] = 1;
}
What would be the type of the following arrays?
#![allow(unused)]
fn main() {
    let a1 = [(1, 2), (3, 4), (5, 6)];
    let a2 = [(1, 2), (3, 4), (5, (6, 7))];
}
Correction
The type of a1 would be:
#![allow(unused)]
fn main() {
    let a1: [(i32, i32); 3] = [(1, 2), (3, 4), (5, 6)];
}
a2, on the other hand, does not compile: all values must have the same type in an array.
Course: Ranges & Iterators
Arrays are convenient for storing and processing a set of contiguous data on the stack, for instance through the use of loops, ranges and iterators.
A range represents an interval of values between a start and an end point.
In Rust, they can be conveniently used with the start..end construct (here excluding the end value),
or with start..=end (here including the end value).
Examine the following assert! statements. Will this program compile, and do all the assertions hold?
let a = 0..10;
let b = 1..=10;
// 'a' range:
assert!(a.contains(&0));
assert!(!a.contains(&10));
// 'b' range:
assert!(!b.contains(&0));
assert!(b.contains(&10));
// The 'a' and 'b' ranges have the same number of elements:
assert_eq!(a.count(), b.count());
Correction
Yes, the program compiles and all the assert!() statements are true.
Course: Iterators
Iterators make it possible to go through an array, a range or a collection, and access each element one by one.
#![allow(unused)]
fn main() {
    let r = 0..10;
    // Iterate over a range:
    for n in r.into_iter() {
        print!("{n} "); // -> 0 1 2 3 4 5 6 7 8 9
    }
    println!();
    // Iterate over an array:
    let mut a = [0; 10];
    // Basic for-loop iteration:
    for x in a {
        println!("{x}");
    }
    // From a 'range':
    for n in 0..a.len() {
        println!("{}", a[n]);
    }
    // As mutable, changing the values of the array:
    for x in &mut a {
        *x += 1;
    }
    // Equivalent to (using a 'closure'):
    a.iter_mut().for_each(|x| *x += 1);
    // Iterate with both element and index:
    for (i, x) in a.iter_mut().enumerate() {
        *x += i;
    }
}
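Iterators can also be chained with adapter methods instead of being driven by an explicit for loop. The following is a minimal sketch using the standard filter, map and sum methods; the function name sum_even_squares is hypothetical and only serves the example.

```rust
/// Hypothetical helper: sums the squares of the even values of an array.
fn sum_even_squares(values: [i32; 6]) -> i32 {
    values
        .iter()                   // yields a reference to each element in turn
        .filter(|x| **x % 2 == 0) // keeps even values only
        .map(|x| x * x)           // squares each remaining value
        .sum()                    // consumes the iterator and adds everything up
}

fn main() {
    // Even values are 2, 4 and 6: 4 + 16 + 36 = 56
    assert_eq!(sum_even_squares([1, 2, 3, 4, 5, 6]), 56);
}
```

Nothing is computed until sum() consumes the chain: filter and map only describe the work to be done on each element.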
Using ranges and/or iterators, fill the first sub-array of the following multidimensional array with values that increment from 1 to 10, and the second sub-array with values that decrement from 10 to 1, as shown below:
#![allow(unused)]
fn main() {
    let mut multi_array = [[0 as i32; 10]; 2];
    // The following must be true:
    assert_eq!(multi_array, [
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
    ]);
}
Correction
#![allow(unused)]
fn main() {
    let mut multi_array = [[0 as i32; 10]; 2];
    // First sub-array: 1, 2, ..., 10
    for n in 1..=10 {
        multi_array[0][n - 1] = n as i32;
    }
    // Second sub-array: 10, 9, ..., 1
    for n in (1..=10).rev() {
        multi_array[1][10 - n] = n as i32;
    }
    // The resulting 2D-array
    assert_eq!(multi_array, [
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
    ]);
}
Collections
In addition to primitive compound types, the Rust standard library includes a number of very useful data structures called collections. Unlike the built-in array and tuple types, the data these collections point to is stored on the heap, which means the amount of data does not need to be known at compile time and can grow or shrink as the program runs.
Course: Vectors
'Vectors' are collections of multiple values of the same type stored on the heap. Unlike arrays, they have a dynamic size: they can grow or shrink.
A Vec object has ownership over the data located in its underlying heap-allocated buffer,
which means that the buffer will be deallocated whenever the owning object goes out of scope.
#![allow(unused)]
fn main() {
    // The easiest way to create a vector is to use the 'vec!()' macro:
    let mut v = vec![0, 1, 2, 3, 4, 5]; // Vec<i32>
    println!("Value: {:?}", v);
    // 'Pushing' (appending) a new value at the end:
    v.push(6);
    println!("Value: {:?}", v);
    // 'Popping' (removing) its last value:
    let last = v.pop();
    println!("Last value: {:?}, Vector: {:?}", last, v);
    // removing value at index i: v.remove(i)
}
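The release of a vector's contents at the end of its scope can be observed with an element type that implements Drop. This is only a sketch: the Tracked type and the dropped_after_scope helper are hypothetical names introduced for the demonstration.

```rust
use std::cell::Cell;

/// Hypothetical element type: increments a shared counter when it is dropped.
struct Tracked<'a>(&'a Cell<u32>);

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Hypothetical helper: builds a Vec of `n` elements in an inner scope and
/// returns how many elements were dropped once that scope has ended.
fn dropped_after_scope(n: usize) -> u32 {
    let counter = Cell::new(0);
    {
        let v: Vec<Tracked<'_>> = (0..n).map(|_| Tracked(&counter)).collect();
        assert_eq!(v.len(), n);
        assert_eq!(counter.get(), 0); // nothing has been dropped yet
    } // 'v' goes out of scope here: its elements are dropped, its buffer freed
    counter.get()
}

fn main() {
    assert_eq!(dropped_after_scope(3), 3);
}
```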
Examine the following v0, v1 and v2 vectors and their underlying heap buffer pointers.
#![allow(unused)]
fn main() {
    let mut v0: Vec<i32> = vec![0, 1, 2, 3, 4];
    // get a pointer to the underlying heap memory buffer:
    let v0_ptr: *const i32 = v0.as_ptr();
    // Create another vec 'v1' from 'v0', and get its heap pointer again:
    // i.e. move v0 to v1
    let mut v1: Vec<i32> = v0;
    let v1_ptr = v1.as_ptr();
    // Create another vec 'v2' from 'v1' using clone():
    let mut v2: Vec<i32> = v1.clone();
    let v2_ptr = v2.as_ptr();
}
Which of the following assertions are true?
#![allow(unused)]
fn main() {
    // Assertion A: the address held by 'v0_ptr' is the same as the one held by 'v1_ptr'
    assert_eq!(v0_ptr.addr(), v1_ptr.addr());
    // Assertion B: the address held by 'v1_ptr' is the same as the one held by 'v2_ptr'
    assert_eq!(v1_ptr.addr(), v2_ptr.addr());
    // Assertion C: the address held by 'v0_ptr' is the same as the one held by 'v2_ptr'
    assert_eq!(v0_ptr.addr(), v2_ptr.addr());
}
Correction
Assertion A is correct:
v0_ptr and v1_ptr are the same: since v0 was moved into v1, its memory was not re-allocated at another address, but v1 now has exclusive ownership over this buffer, making v0 inaccessible.
Assertions B and C are incorrect:
v2 has been explicitly cloned from v1: this means that all of its contents (including its heap-allocated memory buffer) have been deep-copied: another memory zone has been allocated and filled with the same contents.
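One consequence of the deep copy is that the clone can be modified without affecting the original. A minimal sketch, assuming a hypothetical clone_and_push helper:

```rust
/// Hypothetical helper: clones a vector, modifies the clone only,
/// and returns both the original and the clone.
fn clone_and_push(original: Vec<i32>) -> (Vec<i32>, Vec<i32>) {
    let mut copy = original.clone();
    copy.push(99);
    (original, copy)
}

fn main() {
    let (v1, v2) = clone_and_push(vec![0, 1, 2, 3, 4]);
    // The original is untouched, only the clone has grown:
    assert_eq!(v1, vec![0, 1, 2, 3, 4]);
    assert_eq!(v2, vec![0, 1, 2, 3, 4, 99]);
    // Each vector owns its own heap buffer:
    assert_ne!(v1.as_ptr(), v2.as_ptr());
}
```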
Iterating over a vector is the exact same process as for an array (most operations are inter-compatible!).
#![allow(unused)]
fn main() {
    // Initializing from a range and iterator:
    let mut v = Vec::from_iter((0..6).map(|i| i + 1));
    println!("Value: {:?}", v);
    // Iterate/increment:
    for x in &mut v {
        *x += 1;
    }
    println!("Value: {:?}", v);
    // General operations:
    v.rotate_left(1);
    println!("Value: {:?}", v);
    // etc.
}
Using a single loop, move the contents of vector v to array a, such that vector v is equal to vec![] and array a is equal to the reverse of v: [5, 4, 3, 2, 1, 0]:
#![allow(unused)]
fn main() {
    let mut v = vec![0, 1, 2, 3, 4, 5];
    let mut a = [0; 6];
    // (...)
    assert_eq!(v, vec![]);
    assert_eq!(a, [5, 4, 3, 2, 1, 0]);
}
Correction
#![allow(unused)]
fn main() {
    let mut v = vec![0, 1, 2, 3, 4, 5];
    let mut a = [0; 6];
    // The range is evaluated once, before the vector starts shrinking:
    for n in 0..v.len() {
        a[n] = v.pop().unwrap();
    }
    assert_eq!(v, vec![]);
    assert_eq!(a, [5, 4, 3, 2, 1, 0]);
}
Course: Hash-maps
HashMaps are heap-allocated collections of same-type values, each indexed by a unique key.
Like vectors, they can grow, or shrink. They make a convenient choice for representing indexes, dictionaries, or any other type of database-like objects:
#![allow(unused)]
fn main() {
    // Unlike Vec, the HashMap data structure needs to be explicitly imported!
    use std::collections::HashMap;
    // Inferred type:
    let mut departments = HashMap::new(); // HashMap<i32, &str>
    departments.insert(85, "Vendée");
    departments.insert(31, "Haute-Garonne");
    departments.insert(44, "Loire-Atlantique");
    // Indexing with [..] takes a reference to the key, hence the ampersand (&31).
    // It panics if the key is not present in the HashMap.
    let d31 = departments[&31];
    assert_eq!(d31, "Haute-Garonne");
    // Removing a key:
    departments.remove(&85);
    // Iterating over all key/value pairs (this consumes the map):
    for department in departments {
        // We get a tuple!
        println!("Key: {}, Value: {}", department.0, department.1);
    }
}
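Since indexing with [..] panics on a missing key, a lookup that may fail is usually written with the get method, which returns an Option. The sketch below assumes a hypothetical department_name helper and an arbitrary "unknown" fallback value.

```rust
use std::collections::HashMap;

/// Hypothetical helper: looks a department up by its code,
/// falling back to "unknown" when the key is absent.
fn department_name(departments: &HashMap<i32, &'static str>, code: i32) -> &'static str {
    // 'get' returns an Option<&V>; 'copied' turns Option<&&str> into Option<&str>
    departments.get(&code).copied().unwrap_or("unknown")
}

fn main() {
    let mut departments = HashMap::new();
    departments.insert(31, "Haute-Garonne");
    departments.insert(44, "Loire-Atlantique");
    assert_eq!(department_name(&departments, 31), "Haute-Garonne");
    // No panic on a missing key:
    assert_eq!(department_name(&departments, 85), "unknown");
}
```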
Move the contents of the following Vec object into a BTreeMap (which behaves the same as a HashMap, but sorts its contents by key) in order to get these athlete names sorted by their score in points.
Note: some of them have the same score, and should appear under the same key.
#![allow(unused)]
fn main() {
    use std::collections::BTreeMap;
    let vec = vec![
        ("Y. Horigome", 281),
        ("N. Huston", 279),
        ("M. Dell", 153),
        ("J. Eaton", 281),
        ("S. Shirai", 278),
        ("K. Hoefler", 270),
        ("C. Russell", 211),
        ("R. Tury", 273),
    ];
    let mut map = BTreeMap::new();
    // [...]
    for score in map {
        println!("{:?}", score);
    }
}
Hints:
- build a BTreeMap with the key being the score
- use the entry(key) method of BTreeMap, which returns the map entry corresponding to key
- apply or_default on the result of entry (if the key does not exist in the map, it inserts a default value for it, here an empty Vec, and in both cases it returns a mutable reference to the value)
The last for loop should print:
(153, ["M. Dell"])
(211, ["C. Russell"])
(270, ["K. Hoefler"])
(273, ["R. Tury"])
(278, ["S. Shirai"])
(279, ["N. Huston"])
(281, ["Y. Horigome", "J. Eaton"])
Correction
#![allow(unused)]
fn main() {
    use std::collections::BTreeMap;
    let vec = vec![
        ("Y. Horigome", 281),
        ("N. Huston", 279),
        ("M. Dell", 153),
        ("J. Eaton", 281),
        ("S. Shirai", 278),
        ("K. Hoefler", 270),
        ("C. Russell", 211),
        ("R. Tury", 273),
    ];
    let mut map: BTreeMap<i32, Vec<&str>> = BTreeMap::new();
    for v in vec {
        // Get (or create) the list of names for this score, then append the name:
        map.entry(v.1)
            .or_default()
            .push(v.0);
    }
    for score in map {
        println!("{:?}", score);
    }
}
'string' types (str and String)
Course: str primitive
The str primitive type can be used to represent a string literal:
#![allow(unused)]
fn main() {
    // String literal:
    let s = "Hello, World!";
}
As a literal, a str has a static lifetime, which can also be explicitly stated in its type declaration.
A static lifetime means that the object is valid throughout the entire duration of the program.
The type &str is usually called a string slice.
#![allow(unused)]
fn main() {
    // Here, the three syntaxes are equivalent:
    let s = "Hello, World!"; // Inferred type & lifetime
    let s: &str = "Hello, World!"; // Explicit type, inferred lifetime
    let s: &'static str = "Hello, World!"; // Explicit type & lifetime
}
Unlike const char* in the C programming language, &str in Rust is not null-terminated, but relies on a slice, which is composed of a pointer and a size in bytes:
#![allow(unused)]
fn main() {
    let s = "Hello, World!";
    println!("Pointer: {:?}, Length: {} bytes", s.as_ptr(), s.len());
    for (n, c) in s.chars().enumerate() {
        println!("Char {n}: {c}");
    }
}
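The pointer-plus-length layout can be checked with std::mem::size_of. This is a sketch that relies on the layout used by current Rust implementations (two machine words for a &str); the str_layout helper is a hypothetical name.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns the size of the '&str' reference itself
/// and the length in bytes of the text it points to.
fn str_layout(s: &str) -> (usize, usize) {
    (size_of::<&str>(), s.len())
}

fn main() {
    let word = size_of::<usize>();
    // A '&str' holds two machine words: an address and a length.
    assert_eq!(str_layout("Hello, World!"), (2 * word, 13));
    // The length is stored rather than found by scanning for a terminator,
    // so a NUL byte is ordinary content:
    assert_eq!(str_layout("ab\0cd"), (2 * word, 5));
}
```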
For safety reasons, Rust doesn't allow modifying the actual contents (the characters) of a &str, thus the following does not compile:
#![allow(unused)]
fn main() {
    let s: &mut str = "Hello, World!";
}
Run the following code:
#![allow(unused)]
fn main() {
    let s1 = "It's not about the bunny \t";
    // trim() returns a string slice with leading and trailing whitespace removed.
    let s2 = s1.trim();
    println!("{s1}");
    println!("Address: {:?}, Length: {}", s1.as_ptr(), s1.len());
    println!();
    println!("{s2}");
    println!("Address: {:?}, Length: {}", s2.as_ptr(), s2.len());
}
- Since Rust forbids modifying the contents of a str literal, why are we in this case allowed to use the .trim() function? What is truly happening in this code?
Correction
In this specific case, since we can clearly see that the underlying pointer is the same in both s1 and s2, we can deduce that the underlying buffer has not been modified, but instead we instantiated another slice object, which points to the same address as s1, but has a different length, which omits the trailing whitespaces and tab.
- What would happen if we modified s1 as follows?
#![allow(unused)]
fn main() {
    // Note the 5 leading spaces:
    let s1 = "     It's not about the bunny \t";
    let s2 = s1.trim();
}
Correction
Here, both the pointer and the length would be different in s2: the pointer would be offset by the number of whitespace characters leading the string literal (which is here equal to 5):
#![allow(unused)]
fn main() {
    let s1 = "     It's not about the bunny \t";
    let s2 = s1.trim();
    // s2 pointer address minus 5 (the removed leading spaces) is equal to s1 pointer's address
    assert_eq!(s2.as_ptr().addr() - 5, s1.as_ptr().addr());
}
Course: String
A String, on the other hand, is a standard library collection type that can basically be seen as a vector of UTF-8 encoded bytes (a Vec<u8>), dynamically stored on the heap. Just like a Vec, it can grow, shrink, and has ownership over its own underlying buffer, which makes it an easier object to manipulate. While all of the str methods are available on it (a String dereferences to a str), it does not have a static lifetime.
#![allow(unused)]
fn main() {
    // Create from a string literal:
    let mut s = String::from("Owls are not what they seem");
    // Append a 'char':
    s.push('!');
    println!("Value: {:?}", s);
    // Append another string:
    s.push_str(" Really?");
    println!("Value: {:?}", s);
    // Other ways of appending to the String:
    s = s + " Yes, ";
    s += "really!";
    // Iterate over every 'char'
    for c in s.chars() {
        print!("{c} ");
    }
    println!();
    // Example of transformation:
    s = s.chars().rev().collect();
    println!("Value: {:?}", s);
}
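Note that the underlying buffer of a String holds UTF-8 encoded bytes, so len() counts bytes rather than characters. A minimal sketch of the difference, using a hypothetical byte_and_char_len helper:

```rust
/// Hypothetical helper: returns (length in bytes, number of chars).
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // ASCII only: one byte per character
    let ascii = String::from("Owls");
    assert_eq!(byte_and_char_len(&ascii), (4, 4));
    // 'é' is encoded on two bytes in UTF-8
    let accented = String::from("Vendée");
    assert_eq!(byte_and_char_len(&accented), (7, 6));
}
```

This is also why indexing a String by range works on byte offsets, which must fall on character boundaries.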
Examine the following code:
#![allow(unused)]
fn main() {
    let s0: &str = "That gum you like is going to come back in style";
    // Build a 'String' object from the previous '&str':
    let mut s1: String = String::from(s0);
    // Now modify 's1':
    s1 = s1.to_ascii_uppercase();
    println!("{s1}");
}
Is s0 still accessible? If yes, what is now its value? Is it the same as s1, and why?
Correction
s0 is still accessible, and its value remains the same. The contents of s0 have been copied into a heap buffer owned by s1. The two are completely separate from one another.
We now call the as_str() method on s1, and store the resulting &str value in a new variable called s2. Can you guess what the lifetime of s2 is?
#![allow(unused)]
fn main() {
    let s0: &str = "That gum you like is going to come back in style";
    // Build a 'String' object from the previous '&str':
    let mut s1: String = String::from(s0);
    let s2: &str = s1.as_str();
}
Correction
The lifetime of s2 is tied to that of s1: s2 cannot outlive s1, since it refers to the same underlying heap buffer. In this case, this &str is not a string literal, which would have a 'static lifetime, but a slice that points to dynamically-allocated memory.
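In practice, this means a String cannot be modified while a &str borrowed from it is still in use. The following sketch shows an ordering that the compiler accepts; the first_word helper is a hypothetical function introduced for the example.

```rust
/// Hypothetical helper: returns a slice of the input, so the result
/// borrows from the same buffer as `s`.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let mut s1 = String::from("That gum you like");
    {
        // 's2' points into the heap buffer owned by 's1':
        let s2: &str = first_word(&s1);
        assert_eq!(s2, "That");
    } // the borrow ends here
    // 's1' can be modified again now that 's2' is gone:
    s1.push_str(" is going to come back in style");
    assert_eq!(first_word(&s1), "That");
}
```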
Slices
A slice in Rust can be considered as a bounded pointer or reference to a contiguous sequence of elements in an array,
a collection, or a string of characters, as we saw earlier. It is declared with the &[T] syntax. Since it works like a reference, it does not have ownership over its contents.
#![allow(unused)]
fn main() {
    // Create a byte buffer:
    let mut buffer = [0 as u8; 16];
    // Get a slice on half the buffer:
    // (notice how the slice itself is not mutable,
    // but instead points to a mutable sequence in the buffer)
    let slice: &mut [u8] = &mut buffer[0..8]; // we use the 'range syntax' here to capture the slice
    // Iterate on the slice to change values:
    for (i, n) in slice.iter_mut().enumerate() {
        *n = i as u8;
    }
    println!("{:?}", buffer);
}
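Because a slice only describes a contiguous sequence, a function taking a &[T] parameter accepts arrays, vectors and sub-ranges of either. A minimal sketch, with a hypothetical largest helper:

```rust
/// Hypothetical helper: works on any contiguous sequence of i32,
/// wherever it is stored. Returns None for an empty slice.
fn largest(values: &[i32]) -> Option<i32> {
    values.iter().copied().max()
}

fn main() {
    let array = [3, 9, 4];
    let vector = vec![7, 1, 8, 2];
    // A whole array, borrowed as a slice:
    assert_eq!(largest(&array), Some(9));
    // Only elements 1 and 2 of the vector (values 1 and 8):
    assert_eq!(largest(&vector[1..3]), Some(8));
    // An empty slice:
    assert_eq!(largest(&[]), None);
}
```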
Take a slice out of the string object, starting from character 25 until the end, and use the .make_ascii_lowercase() method on the captured slice.
#![allow(unused)]
fn main() {
    let mut string = String::from("YOU REMIND ME TODAY OF A SMALL MEXICAN CHIHUAHUA");
    let slice = ...;
    slice.make_ascii_lowercase();
    println!("{string}");
}
Correction
#![allow(unused)]
fn main() {
    let mut string = String::from("YOU REMIND ME TODAY OF A SMALL MEXICAN CHIHUAHUA");
    let slice = &mut string[25..];
    slice.make_ascii_lowercase();
    println!("{string}");
}
What is the inferred type of the slice variable?
Correction
It is a &mut str.