|Title||Programmable and Adaptable Storage for AI-oriented HPC Ecosystems|
|Scientific Area||Advanced Computing|
|Funding (PT)||49 235,00 EUR|
|Funding (US)||49 789,00 USD|
|Leading Institutions||Institute for Systems and Computer Engineering, Technology and Science (INESC TEC); Scalable Computational Intelligence – Texas Advanced Computing Center, UT Austin|
|Duration||15 months (originally 12 months)|
|Start date||October 1, 2020|
|End date||December 31, 2021 (originally September 30, 2021)|
High-Performance Computing, Software-Defined Storage, Artificial Intelligence
Artificial Intelligence (AI)-oriented High-Performance Computing (HPC) services are redefining the way users leverage the computational and storage resources offered by supercomputers. Although the processing capabilities of HPC infrastructures can scale to handle the computational requirements of AI applications, the same does not hold for shared storage architectures when they are used by multiple instances of data-centric applications. The PAStor project aims to provide a novel Software-Defined Storage solution that can efficiently handle I/O flows from multiple AI workloads by automatically adjusting storage configurations and resources to dynamically meet application requirements.
The research output from PAStor will be released as an open-source prototype that will provide the first building block towards a novel storage architecture suited for the exascale computing infrastructure. By gathering the expertise of researchers in the AI and distributed storage fields, and by including researchers from Texas Advanced Computing Center (TACC) and Minho Advanced Computing Center (MACC) with experience in managing HPC infrastructures, the project will produce new high-quality research findings and advance the state-of-the-art for storage solutions currently deployed at HPC centers.
At the moment, no storage solution with characteristics similar to those offered by PAStor is available in the HPC market. Moreover, the optimization of storage systems for AI workloads is a pressing research challenge, and PAStor is aligned with the Advanced Computing Portugal 2030 strategy. The outcomes of the project will also enable MACC to gain access to highly innovative technology and to strengthen its position in the European HPC strategy. The findings of this project will help TACC optimize the design, deployment and management of clusters aimed at exascale computing, a priority area in the United States.
- An open-source prototype that will provide the first building block towards a novel storage architecture suited for exascale computing infrastructures;
- Dissemination of research outputs through top scientific venues in the HPC and storage fields.
Papers and Communications
- Dantas, M., Leitao, D., Cui, P., Macedo, R., Liu, X., Xu, W., & Paulo, J. (2022). Accelerating Deep Learning Training Through Transparent Storage Tiering. In 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE. https://doi.org/10.1109/ccgrid54584.2022.00011
- Miranda, M., Esteves, T., Portela, B., & Paulo, J. (2021, June 14). S2Dedup. In Proceedings of the 14th ACM International Conference on Systems and Storage (SYSTOR ’21). https://doi.org/10.1145/3456727.3463773
Conferences Best Paper Awards: https://www.acm.org/conferences/best-paper-awards
- Dantas, M., Leitao, D., Correia, C., Macedo, R., Xu, W., & Paulo, J. (2021). MONARCH: Hierarchical Storage Management for Deep Learning Frameworks. 2021 IEEE International Conference on Cluster Computing.
- Dantas, M., Leitao, D., Correia, C., Macedo, R., Xu, W., & Paulo, J. (2021). The Case for Storage Optimization Decoupling in Deep Learning Frameworks. 2021 IEEE International Conference on Cluster Computing.