After the success of the first webinar series, the UT Austin Portugal Program and the BigHPC consortium are back with a second round of webinars, "Towards a new generation of Big Data & HPC applications", running every two months from March 2022 to March 2023. This time, experts from academia and industry will meet to review some of the most relevant scientific and technological advances made in response to the challenges we dissected in the first series.
Webinar 4: Experiments in implementing the BigHPC Virtual Manager
- Date: September 22, 2022 | 3.00 p.m. (GMT+1)
- Speakers: Amit Ruhela (TACC), John Cazes (TACC), Stephen Harrell (TACC), Om Saran (TACC), Sangamithra Goutham (Systems and Storage Lab, UT Austin)
- Moderator: Miguel Viana (LIP)
The Virtual Manager (VM) is a component of the BigHPC implementation that aims to stage and execute application workloads optimally on one of a variety of HPC systems. It consists mainly of two subcomponents, i.e., the VM Scheduler and the VM Repository.
The Virtual Manager Scheduler provides an interface to submit and monitor application workloads, coordinate the allocation of computing resources on the HPC systems, and optimally execute workloads by matching the workload resource requirements and QoS specified by the user with the available HPC clusters, partitions and QoS reported by the BigHPC Monitoring and Storage Manager components respectively.
Additionally, the Virtual Manager Repository provides a platform to build and store, as container images, the software services and applications that support BigHPC workloads. It then serves those uploaded images programmatically when a workload request is submitted to the Virtual Manager Scheduler for execution.
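As a rough illustration of the matching step described above, the sketch below picks a partition for a workload from a list of reported partitions. It is not the BigHPC implementation: the `Partition` and `Workload` types, the `pick_partition` function, and the best-fit rule (smallest number of leftover free nodes) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Partition:
    """Hypothetical view of a partition as a monitoring component might report it."""
    cluster: str
    name: str
    free_nodes: int
    qos: frozenset  # QoS levels this partition offers


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload request: nodes needed and the QoS the user asked for."""
    nodes: int
    qos: str


def pick_partition(workload: Workload,
                   partitions: Iterable[Partition]) -> Optional[Partition]:
    """Return a partition that satisfies the request, or None if none does.

    Assumed policy: among partitions with enough free nodes and the requested
    QoS, choose the one that leaves the fewest nodes idle (best fit).
    """
    candidates = [
        p for p in partitions
        if p.free_nodes >= workload.nodes and workload.qos in p.qos
    ]
    if not candidates:
        return None
    # Ties are broken by cluster and partition name so the result is deterministic.
    return min(
        candidates,
        key=lambda p: (p.free_nodes - workload.nodes, p.cluster, p.name),
    )
```

A real scheduler would likely weigh more signals (queue wait time, storage locality, container image availability); the point here is only the shape of the requirement-versus-availability match.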
In this talk, we first present a few possible approaches to designing the Virtual Manager, then discuss the pros and cons of each, and finally describe the approach we determined to be most feasible and adopted in the BigHPC implementation.
About the speakers:
Amit Ruhela works as a Research Associate in the HPC group at TACC, Austin. He earned his Ph.D. in computer science from IIT Delhi and gained postdoctoral experience at The Ohio State University. His research focuses on feature and performance enhancements in MPI communication through novel designs. Amit Ruhela also has deep interests in Big Data, Machine Learning, Social Computing, and Information Systems.
John Cazes joined TACC in March 2005. Prior to TACC, he served as Outreach lead to Naval Oceanographic Office Major Shared Resource Center (DOD) users for Lockheed Martin. Currently, he serves as the director of the High Performance Computing group at TACC. He has over 20 years of experience in high performance computing in public and private industry. John Cazes relies on his background in HPC, astrophysics, and climate/weather/ocean modeling to support the wide variety of researchers on TACC resources. His primary research interests are parallel I/O and advanced architectures.
Stephen Lien Harrell is an Engineering Scientist at the Texas Advanced Computing Center in the HPC Performance and Architectures group. His research interests include performance portability, performance modeling, benchmarking and HPC metric capture. Before his current appointment, Stephen worked as an HPC System Administrator and HPC Support Staff for twelve years and received his bachelor's degree in Computer Science from Purdue University.
Om Saran is a second-year Computer Science Master’s student at UT Austin. He is collaborating with TACC as a research assistant to build the Virtual Manager for the BigHPC project. He is interested in systems and has previously worked at Nutanix. He is expected to graduate in May 2023.
Sangamithra Goutham is a second-year Master’s student at The University of Texas at Austin. She works at the Systems and Storage Lab at UT as a Graduate Research Assistant and previously completed a five-year integrated Master’s degree at the College of Engineering Guindy, Anna University. Her research interest lies in pursuing sustainability by optimizing resource consumption in large-scale systems.
About the moderator:
Miguel Viana is a Linux SysAdmin at LIP-Minho. He graduated with a degree in Industrial Electronics Engineering and is finishing a master's dissertation in Embedded Systems. His interests include high-performance computing, automation, containerization and information security. He currently works mainly on the BigHPC project.
Important information about our training activities:
– Participation is free of charge. Kindly note, however, that to put on a course, we mobilize several people and incur expenses. Therefore, if you register for one of our events, we expect you to attend, because every seat we assign to one person prevents someone else from participating.
– Participants will be sent a link to the session 2 hours before the start time. If you do not receive it, please contact us at firstname.lastname@example.org
– Since our online courses are available globally, use a time zone converter to know what time the session is starting in your location.
The BigHPC Project is co-financed by the European Regional Development Fund through the Operational Program for Competitiveness and Internationalisation – COMPETE 2020, the Lisbon Portugal Regional Operational Program – Lisboa 2020 and the Portuguese Foundation for Science and Technology – FCT under UT Austin Portugal.