BigHPC – A Management Framework for Consolidated Big Data and HPC

Summary

It is becoming increasingly difficult to manage available computational and storage resources efficiently, to provide applications with transparent access to those resources, and to ensure performance isolation and fairness across different workloads. The BigHPC project will address these challenges with a novel management framework for Big Data and parallel computing workloads.

BigHPC will thus simplify the management of Big Data applications and HPC infrastructural resources. This is expected to have a direct impact on science, industry and society by accelerating scientific breakthroughs in different fields and by increasing the competitiveness of companies through better data analysis and improved decision-support processes.

The project will advance current knowledge and develop new tools to address challenges in three components of HPC infrastructures: monitoring, virtualization and storage management. By the end of the project, BigHPC is expected to integrate these three components into a new platform, allowing more efficient use of these infrastructures and their services.

Expected Outcomes

  • A novel solution to manage and monitor HPC and Big Data workloads that:
    1) combines novel monitoring, virtualization and software-defined storage components;
    2) can cope with HPC’s infrastructural scale and heterogeneity; 
    3) efficiently supports different workload requirements while ensuring holistic performance and resource usage;
    4) can be seamlessly integrated with existing HPC infrastructures and software stacks; 
    5) will be validated with pilots running in both MACC and TACC infrastructures.
Start Date – End Date: March 31, 2020 – June 30, 2023 (originally March 31, 2023)
Scientific Area: Advanced Computing
Keywords: Big Data, High Performance Computing, HPAI

Website:
Lead Beneficiary (PT): Wavecom – Soluções Rádio S.A.
Co-beneficiaries:
INESC TEC – Instituto de Engenharia de Sistemas e Computadores, Tecnologia e Ciência
LIP, Laboratório de Instrumentação e Física Experimental de Partículas – Associação para a Investigação e Desenvolvimento
PIs at UT Austin: Vijay Chidambaram (Department of Computer Science)
John Cazes (Texas Advanced Computing Center)
Other Partners: Minho Advanced Computing Center 
Total Eligible Investment (PT): 1 183 678,08 EUR
Total Eligible Investment (US): 799 998,00 USD

Papers and Communications

  • Miranda, M., Esteves, T., Portela, B., & Paulo, J. (2021). S2Dedup: SGX-enabled secure deduplication. Proceedings of the 14th ACM International Conference on Systems and Storage (SYSTOR ’21).
  • Faria, A., Macedo, R., Pereira, J., & Paulo, J. (2021). BDUS: implementing block devices in user space. Proceedings of the 14th ACM International Conference on Systems and Storage (SYSTOR ’21).
  • Ruhela, A., Harrell, S. L., Evans, R. T., Zynda, G. J., Fonner, J., Vaughn, M., Minyard, T., & Cazes, J. (2021). Characterizing Containerized HPC Applications Performance at Petascale on CPU and GPU Architectures. Lecture Notes in Computer Science. Springer International Publishing.
  • Ruhela, A., Vaughn, M., Harrell, S. L., Zynda, G. J., Fonner, J., Evans, R. T., & Minyard, T. (2020). Containerization on Petascale HPC Clusters. State of Practice talk at the International Conference for High Performance Computing, Networking, Storage and Analysis.
  • Evans, R. T. (2020). Democratizing Parallel Filesystem Monitoring. 2020 IEEE International Conference on Cluster Computing.
  • Kadekodi, R., Kadekodi, S., Ponnapalli, S., Shirwadkar, H., Ganger, G. R., Kolli, A., & Chidambaram, V. (2021). WineFS: a hugepage-aware file system for persistent memory that ages gracefully. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles.
  • Esteves, T., Neves, F., Oliveira, R., & Paulo, J. (2021). CaT: Content-aware Tracing and Analysis for Distributed Systems. In Proceedings of the 22nd International Middleware Conference (Middleware ’21). Association for Computing Machinery, New York, NY, USA, 223–235. https://doi.org/10.1145/3464298.3493396

E-Posters 

2020 Annual Conference

2021 Annual Conference

In The News

COMPETE 2020 Newsletter | BigHPC: A Management Framework for Consolidated Big Data and HPC

July 15, 2020. Bruno Antunes, Research Manager at Wavecom and lead of the BigHPC project, in conversation with COMPETE 2020.

TechBit | Investment of more than 1.9 million euros to improve national supercomputers

July 5, 2020. Advanced Computing (High Performance Computing, HPC) offers more resources and better performance than ordinary computers, making it possible to solve complex engineering and science problems and...

Exame Informática | Portuguese research will improve the performance of supercomputers

July 3, 2020. INESC TEC participates in a project to develop a platform that improves the management of high performance computing centers in support of Big Data and...

Job Positions

Computer Engineer at Wavecom
