The HPE Performance Cluster Manager (HPCM) integration provides centralized monitoring, discovery, and management of large-scale HPC (High-Performance Computing) environments. Designed for scientific, AI, ML, and enterprise HPC workloads, HPCM helps IT teams efficiently monitor thousands of nodes with high availability and performance.

With this integration, you can:

  • Automatically discover hardware, nodes, network interfaces, and firmware details.
  • Monitor critical metrics, track performance, and generate real-time alerts.
  • Organize resources hierarchically and visualize the cluster structure within OpsRamp.

Here is the highlevel architecture of HPE Performance Cluster Manager (HPCM):


Supported Target Version

HPE Performance Cluster Manager (HPCM) v1.12

Resource Hierarchy of HPCM

  • HPE Performance Cluster Manager

    • HPC Nodes
      • HPC NICs

Key Use Cases

Discovery Use cases

Gain visibility into all HPCM components with relationship mapping:

  • Automatically detects compute, management, and storage nodes.
  • Identifies network interfaces, switches, and fabric topology.
  • Discovers hardware (CPU, memory, GPU, disk), BIOS, and firmware versions.
  • Groups system components for streamlined management.

Monitoring Use cases

Track system health and performance with actionable insights:

  • Collects time-series metric data to detect anomalies and threshold breaches.
  • Triggers alerts to support fast issue resolution and reduce downtime.
  • Monitors job scheduling time, status, and related performance indicators.

Version History

Application VersionBug fixes / Enhancements
1.0.0Initial Version with discovery and monitoring features.