(Closed) ACT-PM – Automating Crash-Consistency Testing for Persistent Memory

Summary

Title: Automating Crash-Consistency Testing for Persistent Memory
Reference: UTA-EXPL/CA/0080/2019
Scientific Area: Advanced Computing
Funding (PT): 49 752,45 EUR
Funding (US): 49 200,00 USD
Leading Institutions: Instituto de Engenharia de Sistemas e Computadores, Investigação e Desenvolvimento em Lisboa (INESC ID/INESC/IST/ULisboa)
Department of Computer Science, College of Natural Sciences, UT Austin
Participating Institutions: Institute for Systems and Computer Engineering, Technology and Science (INESC TEC)
Duration: 12 months, extended to 15 months
Start date: October 1, 2020
End date: September 30, 2021, extended to December 30, 2021
Keywords: Persistent memory, Crash-consistency, Fault injection, Application testing

Persistent Memory (PM) is a recent technology that promises performance similar to Dynamic Random Access Memory (DRAM) combined with the data persistence guarantees of disks, as long as no crashes occur. However, upon machine or application crashes, the application state can become corrupted, causing applications to malfunction. Implementing crash-tolerance techniques is difficult and error-prone; indeed, it has been shown that several PM applications do not always recover correctly from crashes. ACT-PM will automate crash-consistency testing for applications that use PM as memory or as disk by conducting exploratory research in observability and fault-injection techniques tailored for PM applications.

In the same way that developers need to test regular applications to ensure they match the requirements, when developing PM applications it is fundamental to test their behavior upon crashes. The main challenge in testing PM applications is to have a complete and sound testing framework that intelligently prunes the search space, so that applications under test crash only at sensitive points that are likely to reveal bugs and failures.
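
The idea of crashing only at sensitive points can be illustrated with a small sketch. It is not ACT-PM's actual tooling: the trace format (`store`, `flush`, `fence` tuples), the function names, and the persistence model are all assumptions made for illustration. The model is deliberately simplified: a store is treated as durable only once it has been flushed and a later fence has executed, so the durable state can change only at fences, and it is enough to inject a crash at the start of the trace and right after each fence.

```python
from typing import Callable, Dict, List, Tuple

Op = Tuple  # ("store", addr, value) | ("flush", addr) | ("fence",)  (hypothetical format)


def persisted_state_at(trace: List[Op], crash_index: int) -> Dict[str, int]:
    """Replay trace[:crash_index] and return the state that survives a crash.

    Simplifying assumption: only stores that were flushed and then fenced
    are durable; everything else is lost.
    """
    cache: Dict[str, int] = {}    # stored, not yet flushed
    pending: Dict[str, int] = {}  # flushed, waiting for a fence
    durable: Dict[str, int] = {}  # guaranteed to survive a crash
    for op in trace[:crash_index]:
        if op[0] == "store":
            cache[op[1]] = op[2]
        elif op[0] == "flush":
            if op[1] in cache:
                pending[op[1]] = cache.pop(op[1])
        elif op[0] == "fence":
            durable.update(pending)
            pending.clear()
    return durable


def crash_points(trace: List[Op]) -> List[int]:
    """Pruned crash points: before any operation, and right after each fence."""
    return [0] + [i + 1 for i, op in enumerate(trace) if op[0] == "fence"]


def find_crash_bugs(trace: List[Op],
                    invariant: Callable[[Dict[str, int]], bool]) -> List[int]:
    """Return the crash points whose surviving state violates the invariant."""
    return [p for p in crash_points(trace)
            if not invariant(persisted_state_at(trace, p))]


# Hypothetical recovery invariant: if the valid flag is durable, the data must be too.
def invariant(pm: Dict[str, int]) -> bool:
    return pm.get("valid", 0) == 0 or pm.get("data") == 42


# Buggy ordering: the valid flag becomes durable before the data it guards.
buggy = [("store", "valid", 1), ("flush", "valid"), ("fence",),
         ("store", "data", 42), ("flush", "data"), ("fence",)]

# Correct ordering: data is made durable first, then the valid flag.
correct = [("store", "data", 42), ("flush", "data"), ("fence",),
           ("store", "valid", 1), ("flush", "valid"), ("fence",)]

buggy_report = find_crash_bugs(buggy, invariant)
correct_report = find_crash_bugs(correct, invariant)
```

Under this model the six-operation traces need only three crash injections each instead of seven. Real hardware can also persist unflushed stores through cache eviction, so a sound framework would have to consider more states than this sketch does.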

PM provides a novel point in the traditional memory hierarchy that promises to improve the performance and efficiency of applications. However, to fully exploit these capabilities, novel tools are needed to assess the correctness of these applications under faults. The research conducted in this project, and the resulting tools, will advance the state of the art in the above areas and improve the workflow and efficiency of PM application developers, ultimately leading to safer and more performant applications that fully leverage PM capabilities.

Key Outcomes

Papers and Communications

E-Posters

Project Team

Miguel Matos

Principal Investigator in Portugal (INESC ID/INESC/IST/ULisboa) ACT-PM

Vijay Chidambaram

Principal Investigator in Austin (UT Austin) ACT-PM