/rust-survey

A survey on Rust works: empirical studies, verification tools, analysis tools, etc

Rust

   .rs
---> AST (macro expansion)
---> HIR (type checking)
---> MIR (borrow checker & lifetime inference, optimizations; CFG)
---> LLVM
---> exec

Empirical Studies

Usage of Rust

  • [Astrauskas et. al OOPSLA'20][PDF]: how unsafe code is used. Concretely, it aims to answer the following questions:
    • (Frequency) How often does unsafe code appear explicitly in Rust crates?
    • (Size) What is the size of unsafe blocks that programmers write? R: typically small blocks, 75% of all unsafe blocks comprise at most 21 #MIR.
    • (Self-containedness) Is the behaviour of unsafe code dependent only on code in its own crate?
    • (Encapsulation). Is unsafe code typically shielded from clients through safe abstractions?
    • (Motivation). What are the most prevalent use cases for unsafe code? R: call to unsafe function, dereference of raw pointer, use of mutable static variable, access to union field, use of extern static variable, use of inline assembly, ...
  • [Mehmet et. al OOPSLA'21] Sources of unsafety in code automatically translated by c2rust - See below
  • [Evans et. al ICSE'20] [PDF]: how unsafe code is used. Aims to answer the following questions:
    • How much do developers use Unsafe Rust?
    • How much of the Rust code is Unsafe Rust?
    • What Unsafe Rust operations are used in practice?
    • What abstract binary interfaces (programming languages) are used in the declared unsafe functions?
    • Does the use of Unsafe Rust change over time?
    • Why do Rust developers use unsafe?

Classes of Bugs:

  • [Qin et. al PLDI'2020] [PDF] [Benchmarks] [Notes] indentifies patterns of bugs and offers some insight into their common repairs too, distinguising between memory safety issues and concurrency specific ones. #unsafe #memorybugs #concurrency

  • [Xu et. al arXiv'2021] [PDF] [Benchmarks] [Notes] focuses on memory-safety bugs. #unsafe #memorybugs

Verification

Safe Code

  • Prusti [Homepage] [GitHub] [VS code extension]
    • [PDF] [notes on Astrauskas et. al OOPSLA'19] - translates Rust programs and specifications into the intermediate language used by Viper. Verifies the translated program for memory safety using the information provided by Rust's type system, and for functional correctness by adding support for user provided specification in Rust programs. #separation-logic #mir
    • [PDF] [notes on Wolff et. al OOPSLA'21] - specification and verification of closure-manipulating code in Rust in the presence of side effects on closure arguments and captured state. It relies on Rust's type system for deriving a strong framing guarantee, namely that the only mutation allowed on a captured state is by the closure itself.
  • RustHorn [GitHub] [PDF] [Yusuke et. al PLS'20]: uses Horn clauses to verify partial correctness.
  • Creusot [[GitHub]https://github.com/xldenis/creusot]: deductive verification of Rust code; allows annotations.

Unsafe Code

  • RustBelt [Homepage]
    • [PDF] [notes on Jung et. al POPL'18] - machine checked safety proofs for a subset of Rust. In particular, it derives the sufficient safety conditions for libraries using unsafe code to be deemed as safe. It works on lambda-rust, a language similar to MIR, hence RustBelt still needs a human in the loop to do the translation from MIR to lambda-rust. #separation-logic #mir #coq
    • [PDF] [notes on Yanovski et. al ICFP'21] - proposes a new Rust API, namely GhostCell, to safely implement data structures with internal sharing. The key enabler of this API is the separation of permissions from data, i.e. tracks permissions separately from the data they control.

Analysis

Static

  • MIRAI [GitHub] [Detailed description] [Video]- interpreter that supports both concrete and abstract values from the interval and tag domains. (the classes of bugs are unclear to me as of now, but I suspect it's only taint analysis) #mir #unsafe
  • RMC [[GitHub]https://github.com/model-checking/rmc)] - Rust Model Checker (not much info - need to contact authors)
  • SMACK [GitHub] [Video] - relies on LLVM IR and Boogie IVL to verify Rust code. #llvm
  • SafeDrop [PDF] [Cui et. al arVix'2021]
  • UnsafeFencer [GitHub][PDF][Huang et.al 2018]
  • RUDRA [GitHub][PDF] - used for detecting potential memory safety bugs in unsafe Rust (Panic Safety, Higher Order invariant, Send/Sync VAriance). They managed to run it on the entire Rust package registry (a list of all detected bugs). It is now integrated into the official Rust linter.

Dynamic

  • MIRI [GitHub] - interpreter for MIR, can detect certain classes of memory errors and data races. It aslo supports Stacked Borrows to check for violation of pointer aliasing discipline in unsafe code.

Testing

  • RMC [GitHub] [Post] - CBMC for Rust. Work on converting MIR to the input language of CBMC. #mir
  • SyRust [PDF]

Code transpiling and refactoring

  • c2rust - translates C code to unsafe Rust
  • [Mehmet et. al OOPSLA'21] [PDF] [Artefact] [Translation Rules] - translates C to unsafe Rust and lifts raw pointers to Optional references (in order to maintain the C interface)
  • [Sam et. al ACSW'17] [PDF] [GitHub] - syntactic changes (variable renaming, inlining, lifetime elission from functio signature)
  • [Zborowski'17] [PDF] - refactoring rust output from the outdated corrode c-to-rust translator; loops, and basic ownership translation.

Semantics Formalization

  • Oxide [PDF] [Slides] [Weiss et. al arXiv'19]
  • KRust [PDF] [Want et. al TASE'18]
  • RustSEM [PDF] [Kan et. al arXiv'18]

Mitigation Solutions

  • [Liu et. al ICSE'20] PDF - ensures data flow integrity between unsafe Rust code and safe Rust code by isolating the footprint of unsafe Rust from the footprint of safe Rust (prevents any cross-region memory corruption).

Tool Support

Check these blogs, presentations, reports:

References

Vytautas Astrauskas, Christoph Matheja, Federico Poli, Peter Müller, and Alexander J. Summers. 2020. How do programmers use unsafe rust? Proc. ACM Program. Lang. 4, OOPSLA, Article 136 (November 2020), 27 pages. DOI:https://doi.org/10.1145/3428204

Vytautas Astrauskas, Peter Müller, Federico Poli, and Alexander J. Summers. 2019. Leveraging rust types for modular specification and verification. Proc. ACM Program. Lang. 3, OOPSLA, Article 147 (October 2019), 30 pages. DOI:https://doi.org/10.1145/3360573

Fabian Wolff, Aurel Bílý, Christoph Matheja, Peter Müller, and Alexander J. Summers. 2021. Modular Specification and Verification of Closures in Rust. In Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 145. Publication date: October 2021

Mohan Cui, Chengjun Chen, Hui Xu, Yangfan Zhou. 2021. SafeDrop: Detecting Memory Deallocation Bugs of Rust Programs via Static Data-Flow Analysis. arVix.

Ana Nora Evans, Bradford Campbell, and Mary Lou Soffa. 2020. Is rust used safely by software developers? In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (ICSE '20). Association for Computing Machinery, New York, NY, USA, 246–257. DOI:https://doi.org/10.1145/3377811.3380413

Zhijian Huang, Yong Jun Wang, Jing Liu. 2018. Detecting Unsafe Raw Pointer Dereferencing Behavior in Rust. IEICE Trans. Inf. Syst., 101-D, 2150-2153.

Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, and Derek Dreyer. 2017. RustBelt: securing the foundations of the Rust programming language. Proc. ACM Program. Lang. 2, POPL, Article 66 (January 2018), 34 pages. DOI:https://doi.org/10.1145/3158154

Shuanglong Kan, David Sanán, Shang-Wei Lin, Yang Liu: "K-Rust: An Executable Formal Semantics for Rust". CoRR abs/1804.07608 (2018)

Peiming Liu, Gang Zhao, and Jeff Huang. 2020. Securing unsafe rust programs with XRust. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (ICSE '20). Association for Computing Machinery, New York, NY, USA, 234–245. DOI:https://doi.org/10.1145/3377811.3380325

Mehmet Emre, Ryan Schroeder, Kyle Dewey, and Ben Hardekopf. 2021. Translating C to safer Rust. Proc. ACM Program. Lang. 5, OOPSLA, Article 121 (October 2021), 29 pages. DOI:https://doi.org/10.1145/3485498

Garming Sam, Nick Cameron, and Alex Potanin. 2017. Automated refactoring of rust programs. In Proceedings of the Australasian Computer Science Week Multiconference (ACSW '17). Association for Computing Machinery, New York, NY, USA, Article 14, 1–9. DOI:https://doi.org/10.1145/3014812.3014826

Joshua Yanovski, Hoang-Hai Dang, Ralf Jung, and Derek Dreyer. 2021. GhostCell: Separating Permissions from Data in Rust. In Proc. ACM Program. Lang., Vol. 5, No. ICFP, Article 92. Publication date: August 2021. https://dl.acm.org/doi/10.1145/3473597

Gongming, & Luo, & Reddy, Vishnu & Almeida, Marcelo & Zhu, Yingying & Du, Ke & Omar, Cyrus. (2020). RustViz: Interactively Visualizing Ownership and Borrowing.

Boqin Qin, Yilun Chen, Zeming Yu, Linhai Song, and Yiying Zhang. 2020. Understanding memory and thread safety practices and issues in real-world Rust programs. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2020). Association for Computing Machinery, New York, NY, USA, 763–779. DOI:https://doi.org/10.1145/3385412.3386036

F. Wang, F. Song, M. Zhang, X. Zhu and J. Zhang, "KRust: A Formal Executable Semantics of Rust," 2018 International Symposium on Theoretical Aspects of Software Engineering (TASE), 2018, pp. 44-51, doi: 10.1109/TASE.2018.00014.

Aaron Weiss, Daniel Patterson, Nicholas D. Matsakis, Amal Ahmed: "Oxide: The Essence of Rust" CoRR abs/1903.00982 (2019)

Hui Xu, Zhuangbin Chen, Mingshen Sun, Yangfan Zhou, Michael Lyu. 2021. Memory-safety challenge considered solved? An in-depth study with all Rust CVEs. arXiv.

Zborowski, Adrian. "Framework for Idiomatic Refactoring of Rust Programming Language Code."

Matsushita, Yusuke, Takeshi Tsukada and Naoki Kobayashi. “RustHorn: CHC-Based Verification for Rust Programs.” Programming Languages and Systems 12075 (2020): 484 - 514.