Introduction

One of the most important and distinctive features of Rust is its approach to memory management. That distinctiveness is also the reason it is one of the harder concepts to grasp for someone new to Rust. Before getting into the details of Rust’s memory management, let us review the common types of memory management in programming languages.

Types of Memory Management

  1. Manual memory management: In languages like C or C++, the programmer is responsible for allocating memory, releasing it after use, and making sure it is never used after it has been released. The disadvantage is that a small mistake by the programmer can lead to a crash or a security vulnerability, and such issues are usually hard to debug.
  2. Automatic memory management: In languages like Java, Go, and JavaScript, a runtime component called the garbage collector periodically scans memory for objects that are no longer in use and reclaims them automatically. The programmer does not need to worry about releasing memory, as the garbage collector takes care of it. The drawback is that the timing of reclamation is not under the program's control: the garbage collector decides when to run, it consumes CPU cycles, and it may even have to pause the program for some time. Hence, the performance of programs written in such languages is less predictable.

Rust does not fit squarely into either category. It takes an approach similar to RAII (Resource Acquisition Is Initialization), a pattern available in languages with manual memory management, and makes it part of the language. Rust enforces at compile time not only that memory is allocated and released correctly, but also that released memory is never used, either directly or through a reference. It does this by tracking the scope, or lifetime, of variables and the ownership of data in memory. Together, ownership and lifetimes let Rust manage memory without a garbage collector. Let us first look at ownership.

Ownership

Rust assigns ownership of data in memory to the variable that points to it. When multiple variables would point to the same area of memory, Rust defines rules for transferring ownership so that the data has only one owner at a time. These rules differ slightly depending on the data type.

Move Semantics

By default, all types follow move semantics on assignment and when a value is passed as an argument to a function. This means that in an assignment like x = y, the value is moved from y to x and y becomes invalid. Consider the following example:

 fn main(){
   let x = String::from("Hello"); // x scope starts here
   let y = x;   // x moved to y and y scope starts here
   println!("{}",y);
   println!("{}", x); // error: x was moved to y
 }

In the above example, the last line, which prints x, fails to compile. When we assigned x to y, the compiler marked x as invalid, and y now points to the String "Hello". We say that the value has moved from x to y.
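
The same transfer of ownership happens when a value is passed to a function. Here is a minimal sketch of that case; the helper names consume and append_world are made up for this illustration and do not come from any library.

```rust
// Hypothetical helper: takes ownership of `s` and returns its length.
// `s` is dropped when this function returns.
fn consume(s: String) -> usize {
    s.len()
}

// Hypothetical helper: takes ownership, modifies the value, and hands
// ownership back to the caller through the return value.
fn append_world(mut s: String) -> String {
    s.push_str(" world");
    s
}

fn main() {
    let x = String::from("Hello");
    let n = consume(x); // x is moved into the function
    println!("{}", n); // prints 5
    // println!("{}", x); // would not compile: x was moved

    let y = String::from("Hello");
    let z = append_world(y); // y is moved in, the result is moved into z
    println!("{}", z); // prints "Hello world"
}
```

Returning the value is one way to give ownership back to the caller; references, discussed later, avoid the move altogether.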

Because of move semantics, only one variable can point to a given piece of data in memory. That variable is the owner of the data, which lets Rust free the memory when the owner goes out of scope.
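
To see the release at the end of the owner's scope in action, here is a small sketch. Buffer and the LIVE counter are hypothetical names introduced only for this illustration; the Drop trait from the standard library is the hook Rust calls when an owner goes out of scope.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many `Buffer` values are currently alive (illustration only).
static LIVE: AtomicUsize = AtomicUsize::new(0);

// Hypothetical type that owns some heap memory.
struct Buffer {
    _data: Vec<u8>,
}

impl Buffer {
    fn new(size: usize) -> Buffer {
        LIVE.fetch_add(1, Ordering::SeqCst);
        Buffer { _data: vec![0; size] }
    }
}

// `drop` runs automatically when the owner goes out of scope.
impl Drop for Buffer {
    fn drop(&mut self) {
        LIVE.fetch_sub(1, Ordering::SeqCst);
    }
}

// Returns the number of live buffers inside and after the inner scope.
fn live_counts() -> (usize, usize) {
    let inside;
    {
        let _buf = Buffer::new(1024); // _buf owns the buffer
        inside = LIVE.load(Ordering::SeqCst);
    } // _buf goes out of scope here, so the buffer is freed
    let after = LIVE.load(Ordering::SeqCst);
    (inside, after)
}

fn main() {
    println!("{:?}", live_counts()); // prints (1, 0)
}
```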

But in some cases it is desirable for more than one variable to hold the same value. One way to achieve this is to explicitly copy the value into y using the clone method.

 fn main(){
   let x = String::from("Hello"); // x scope starts here.
   let y = x.clone();   // y scope starts here
   println!("{}",y); // prints "Hello"
   println!("{}", x); // prints "Hello"  
 } // x and y scope end here

With an explicit copy, x and y point to separate locations in memory, and each owns its own location. Rust frees the memory when x and y respectively go out of scope.
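
A quick way to convince ourselves that the two values are independent is to modify the clone and check that the original is unchanged. The function name independent_copies below is just an illustrative choice.

```rust
// Returns the original and its modified clone.
fn independent_copies() -> (String, String) {
    let x = String::from("Hello");
    let mut y = x.clone(); // y gets its own copy of the data
    y.push_str(" world"); // changes y only
    (x, y)
}

fn main() {
    let (x, y) = independent_copies();
    println!("{}", x); // prints "Hello"
    println!("{}", y); // prints "Hello world"
}
```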

Move semantics are the default because Rust does not want to copy values implicitly: some types can hold large amounts of data whose size is not fixed, so a copy can take a long time. For types whose size is fixed, however, an implicit copy may be desirable. Next we discuss implicit copies using copy semantics.

Copy Semantics

In some cases it is more convenient or desirable to allow implicit copies. In the following example, we first initialize x to 5 and then assign x to y. Memory is reclaimed when each scope ends: y is reclaimed first because its scope ends first, and then x. This works because x and y each hold their own copy of the value in separate memory locations. The value was implicitly copied rather than moved because integer types implement the Copy trait, so assigning x to y copies the value into y's own storage.

fn main(){
   {
     let x = 5;              // x scope start
     {
       let y = x;           // x copied to y and y scope start 
       println!("{}",y);
     }                      // y scope end
     println!("{}",x);
   }                        // x scope end
}

To understand copy semantics further, let us look at a custom type. Suppose we create a struct. Since move semantics are the default, values of this struct follow move semantics:

#[derive(Debug)]
struct Foo{}


fn main(){
   {
     let x = Foo{};              // x scope start
     {
       let y = x;           // x moved to y and y scope start 
       println!("{:?}",y);
     }                      // y scope end
     println!("{:?}",x);    // error: x was moved to y
   }                        
}

Because Foo follows move semantics by default, the above code does not compile. For Foo to support copy semantics, it needs to implement the Copy trait (which in turn requires Clone), as follows:

#[derive(Debug, Copy, Clone)]
struct Foo{}


fn main(){
   {
     let x = Foo{};              // x scope start
     {
       let y = x;           // x copied to y and y scope start
       println!("{:?}",y);
     }                      // y scope end
     println!("{:?}",x);    // works now
   }                        
}

Rust has another rule: a struct can implement the Copy trait only if it is empty, as in the case above, or if all of its fields also implement Copy. Therefore, for example, a struct containing a String cannot implement Copy, because String does not implement the Copy trait.
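
The rule can be sketched with two small structs. Point, Label and shifted_copy are hypothetical names used only for this example.

```rust
// All fields are Copy (i32), so the struct may derive Copy.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Contains a String, so it can derive Clone but not Copy.
// Adding `Copy` to this derive list would be a compile error.
#[derive(Debug, Clone, PartialEq)]
struct Label {
    text: String,
}

// Returns the untouched original and a shifted copy.
fn shifted_copy(p: Point) -> (Point, Point) {
    let mut q = p; // implicit copy: p stays valid
    q.x += 10;
    (p, q)
}

fn main() {
    let (p, q) = shifted_copy(Point { x: 1, y: 2 });
    println!("{:?} {:?}", p, q); // prints Point { x: 1, y: 2 } Point { x: 11, y: 2 }

    let a = Label { text: String::from("origin") };
    let b = a.clone(); // an explicit copy is still possible
    println!("{:?} {:?}", a, b);
}
```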

References

In the previous sections we discussed move and copy semantics. We saw how Rust ensures that data in memory has a single owner, so that it knows when to free that data. When we assigned x to y, both the value and its ownership were transferred to y, and we could no longer use x. To use the value through both x and y, we had to copy the data using either clone or copy semantics. How can we use the same data through multiple variables without copying it? Rust allows this through references. References let us create variables that point to data but do not own it.

Consider the following example:

 fn main(){
   let x = String::from("Hello"); // x scope starts here
   {
      let y = &x;   // y borrows x and y scope starts here
      println!("{}",y);
   } // y scope ends here and value pointed by it is not reclaimed as ownership is still with x
   println!("{}", x);  // prints "Hello"
 }

In the above example, y points to the same value as x, but y does not own it. Therefore, when y goes out of scope, the value is not removed from memory and x can still use it.
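
References are especially handy for function arguments, since a function can read a value without taking ownership of it. The sketch below assumes a hypothetical helper named length.

```rust
// Hypothetical helper: borrows the String instead of taking ownership.
fn length(s: &String) -> usize {
    s.len()
}

// Returns the length computed through a reference, together with the
// original value, which is still owned by x after the call.
fn borrow_then_use() -> (usize, String) {
    let x = String::from("Hello");
    let n = length(&x); // x is only borrowed, not moved
    (n, x) // x is still valid here
}

fn main() {
    let (n, x) = borrow_then_use();
    println!("{} {}", n, x); // prints "5 Hello"
}
```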

The opposite is not true: if x goes out of scope, the value is removed from memory and y can no longer use it.

  fn main(){
   let y;
   {
      let x = String::from("Hello"); // x scope starts here
      y = &x;   // y borrows x
      println!("{}",y);
   } // x scope ends here, now value pointed by y is not valid
   println!("{}", y);  // error
 }

The above code does not compile, because y points to the value owned by x, and x has gone out of scope and been reclaimed. Rust always ensures that the owner of a borrowed value lives at least as long as the borrower. Rust also allows as many immutable references to a value as required:

 fn main(){
   let x = String::from("Hello"); // x scope starts here
   {
      let y = &x;   // y borrows x and y scope starts here
      let z = &x;   // z also borrows x
      println!("{}",y);
      println!("{}",z);
   } // y and z scope ends here
   println!("{}", x);  // prints "Hello"
 }

Mutable References

So far, all the variables and references we discussed were immutable. In Rust, variables and references are immutable by default, which means they cannot be used to change the value. We need to mark variables and references as mutable to allow changes to the data. Rust makes sure that at most one mutable reference to a given piece of data is active at a time, and that no immutable references to it are active alongside it. This prevents data races, which cause inconsistencies when more than one part of a program modifies or accesses the same data at the same time.

 fn main(){
   let mut x = String::from("Hello"); // x scope starts here
   {
      let y = &mut x;   // mutable reference to mutable variable x
      y.push_str(" world!"); //modify value
      println!("{}",y); // prints "Hello world!"
     
   } // y scope ends here
   println!("{}", x);  // prints "Hello world!"
 }
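
The following sketch shows the single-mutable-reference rule at work. It relies on the fact that current Rust compilers treat a reference as active only up to its last use, so two mutable borrows are fine as long as they do not overlap. The function name append_twice is just for illustration.

```rust
fn append_twice() -> String {
    let mut x = String::from("Hello");

    let a = &mut x; // first mutable reference
    a.push_str(" world");
    // `a` is not used after this point, so a new mutable borrow is accepted.

    let b = &mut x; // second mutable reference
    b.push_str("!");
    // a.push_str("?"); // would not compile: x would be mutably borrowed twice at once

    x
}

fn main() {
    println!("{}", append_twice()); // prints "Hello world!"
}
```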

Lifetime

Functions

As we saw in the previous sections, Rust keeps track of the lifetimes of variables in order to manage memory. In particular, it needs to know the lifetime of owner variables to determine when to free memory. In some cases, it is difficult for Rust to determine the owner. As we saw in the References section, references do not own the data they point to. For a reference, Rust therefore needs to know which variable owns the referenced data, so that it can make sure the reference is not used beyond the scope of that owner. Consider the following example:

fn concatenate(first: &mut String, second: &String) -> &String { // error: missing lifetime specifier
  first.push_str(second);
  return first;
}

The function concatenate takes two string references as input and returns a reference to a string. In this case, Rust cannot tell which data the returned reference points to, and so does not know its owner. One possibility would be for the function body to create a String and return a reference to it. But that cannot work: a variable created inside the function body is reclaimed when the function returns, so the reference would be left dangling. Hence, the returned reference must be derived from one of the input references. With two reference parameters, Rust cannot tell which one, so this function does not compile as written. To solve this problem, Rust provides a syntax for linking the lifetimes of input and output references, as follows:

fn concatenate<'a>(first: &'a mut String, second: &String) -> &'a String {
  first.push_str(second);
  return first;
}

Lifetimes are declared with the same syntax as generic type parameters. Here 'a is a generic parameter denoting a lifetime. All lifetime annotations begin with a single quote. Once a lifetime has been declared in angle brackets, it can be attached to reference types in the parameters and return type, as in &'a String, where 'a is the lifetime annotation. This signature says that the lifetime of the returned String reference is tied to the first parameter. Given this information, Rust makes sure that the returned reference is not used beyond the lifetime of the variable passed as first. For example:

fn concatenate<'a> (first :  &'a mut String, second:& String)->  &'a String {
  first.push_str(second);
  return first;
}
fn main(){
  let mut f = String::from("hello");
  let r;
  {
     let s = String::from(" world");
     r = concatenate(&mut f,&s);
  }// s is out of scope
  println!("{}",r);// prints "hello world"
}

The above example works because, thanks to the lifetime annotations on concatenate, Rust knows that the returned reference r must not outlive the data passed as first, and that it is unrelated to the reference passed as second. When r is printed, only s (passed as second) is out of scope, while f (passed as first) is still in scope, so r may be used. Now consider the opposite case:

fn concatenate<'a> (first : &'a mut String, second:& String)-> &'a String {
  first.push_str(second);
  return first;
}
fn main(){
  let s = String::from(" world");
  let r;
  {
     let mut f = String::from("hello");
     r = concatenate(&mut f,&s);
  }// f is out of scope
  println!("{}",r);// error, fails to compile
}

As expected, Rust will not let this code compile: according to the signature, the returned reference is tied to the first parameter, and when r is printed, f (passed as first) is already out of scope and hence reclaimed.
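
When the returned reference may come from either input, both parameters can share the same lifetime. The sketch below uses a hypothetical function named longest to show this; it is an illustration, not part of the concatenate example above.

```rust
// Both inputs and the output share the lifetime 'a, so the result
// may be used only while both arguments are still valid.
fn longest<'a>(first: &'a str, second: &'a str) -> &'a str {
    if first.len() >= second.len() {
        first
    } else {
        second
    }
}

fn main() {
    let a = String::from("hello");
    let result;
    {
        let b = String::from("hi");
        // Copy the result into an owned String while b is still alive.
        result = longest(&a, &b).to_string();
    } // b goes out of scope here
    println!("{}", result); // prints "hello"
}
```

Keeping the bare reference in result instead of an owned copy would be rejected, because the signature says the result might borrow from b.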

Structs

Just as with functions, it is sometimes necessary to denote lifetimes explicitly with lifetime annotations when declaring structs. Usually the fields of a struct are not references. But suppose we are creating a struct that has a field which is a reference:

#[derive(Debug)]
struct Foo {
  name: &str
}
fn main(){
   let x = Foo{
       name: "hello"
   };
   println!("{:?}", x);
}

In this case, Rust fails to compile the code with the following error:

error[E0106]: missing lifetime specifier
 --> src/main.rs:3:9
  |
3 |   name: &str
  |         ^ expected named lifetime parameter
  |
help: consider introducing a named lifetime parameter
  |
2 | struct Foo<'a> {
3 |   name: &'a str
  |

error: aborting due to previous error

For more information about this error, try `rustc --explain E0106`.
error: could not compile `playground`

To learn more, run the command again with --verbose.

The compiler also helpfully suggests the fix: adding a lifetime annotation. This annotation tells Rust that an instance of the struct cannot outlive the string data that name refers to:

#[derive(Debug)]
struct Foo<'a>{
  name: &'a str
}

fn main(){
   let x = Foo{
       name: "hello"
   };
   println!("{:?}", x);
}
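
In the example above the field points at a string literal, which lives for the whole program. The annotation matters more when the struct borrows from a value owned elsewhere. Here is a minimal sketch; Excerpt and first_word are hypothetical names chosen for this illustration.

```rust
// Hypothetical struct that borrows part of a string owned elsewhere.
#[derive(Debug)]
struct Excerpt<'a> {
    text: &'a str,
}

// Returns an Excerpt borrowing the first word of `source`.
// The Excerpt cannot outlive the data behind `source`.
fn first_word<'a>(source: &'a str) -> Excerpt<'a> {
    let word = source.split_whitespace().next().unwrap_or("");
    Excerpt { text: word }
}

fn main() {
    let sentence = String::from("hello world");
    let e = first_word(&sentence);
    println!("{:?}", e); // prints Excerpt { text: "hello" }
    // Letting `sentence` go out of scope while `e` is still used would not compile.
}
```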

Lifetime Elision

In some cases Rust can figure out by itself how the lifetime of a returned reference relates to the inputs. In such cases we need not provide lifetime annotations. This is called lifetime elision. As explained in the previous section, a returned reference can only be derived from the inputs to the function. Hence, when the function has only one reference parameter, Rust assumes that the lifetime of the returned reference is tied to that parameter, and we can omit the lifetime annotation. For example:

fn add_exclamation(input: &mut String) -> &String {
  input.push_str("!");
  return input;
}
fn main(){
  let mut f = String::from("Hello World");
  let r = add_exclamation(&mut f);
  println!("{}",r); // prints "Hello World!"
}

Also, if the function is a method whose first parameter is &self or &mut self, the lifetime of the returned reference is assumed to be tied to the lifetime of self. For more detailed cases of lifetime elision, please refer to the Rust Language Reference.
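
The method rule can be sketched as follows. Person, display_name and demo_name are hypothetical names used only for this example.

```rust
struct Person {
    name: String,
}

impl Person {
    // No annotation needed: the returned reference is tied to &self,
    // even though there is a second reference parameter.
    fn display_name(&self, _title: &str) -> &str {
        &self.name
        // Returning `_title` here would not compile, because elision
        // ties the output to self, not to the other parameter.
    }
}

// Builds a Person and copies the borrowed name into an owned String.
fn demo_name() -> String {
    let p = Person { name: String::from("Ada") };
    let n = p.display_name("Dr."); // n borrows from p
    n.to_string()
}

fn main() {
    println!("{}", demo_name()); // prints "Ada"
}
```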

Conclusion

Rust's memory management is unique. Like languages with garbage collection, it releases memory automatically, in Rust's case when the owner goes out of scope, rather than relying on the programmer to release it correctly. Leaks are still possible in some cases, for example through reference cycles, but using memory after it has been released is ruled out in safe code. Because the memory-related checks are done at compile time, Rust also avoids the runtime overhead of memory management found in languages with garbage collection. It provides the best of both worlds, at the cost of a slightly steeper learning curve, which I hope this article will help you overcome.