In traditional HPC environments, login nodes typically serve as the access point where users submit and manage jobs. Although login nodes are still used today, HPC environments are increasingly used by a broad class of users who have domain expertise but are not necessarily IT experts.
Much like dashboards in automobiles, dashboards for HPC infrastructure are crucial for understanding, at a glance, what’s happening under the hood of your HPC cluster.
Overview
System monitoring is a fundamental part of IT best practices, and high performance computing (HPC) environments are no exception. At the high end, HPC clusters can consist of thousands of servers processing millions of jobs per day.
A few days back I posted some of my initial thoughts on the MNT Reform 2 laptop, which had just arrived. I ran the usual battery of tests on the laptop, including, of course, High Performance Linpack (HPL), just for kicks.
I’ll admit it. I sat on the fence for a long time before placing an order for the MNT Reform 2 laptop. At the time, I was in the market for a laptop because my two MacBook Pro Retina laptops had been repurposed for my children's online schooling during the pandemic (and, as it turns out, were never returned to me).
IBM Spectrum LSF provides many ways to query the LSF cluster for information about workloads. As a user, once you’ve submitted a job to LSF, it’s logical to want to understand what has happened to your job.