Bright Cluster Manager


Bright Computing

Bright Cluster Manager
Advanced Cluster Management Made Easy

Bright Cluster Manager removes the complexity from the installation, management and use of HPC clusters. With Bright Cluster Manager, an administrator can easily install, manage and use multiple clusters simultaneously, without the need for expert knowledge of Linux or HPC.

The Bright Advantage

Bright Cluster Manager offers many advantages that lead to improved productivity, uptime, scalability, performance and security, while reducing total cost of ownership:

Rapid Productivity Gains
- Easy to learn and use, with an intuitive GUI.
- Quick installation: from bare metal to a cluster ready to use, in less than an hour.
- Fast, flexible provisioning: incremental, live, disk-full, disk-less, provisioning over InfiniBand, auto node discovery.
- Comprehensive monitoring: on-the-fly graphs, rackview, multiple clusters, custom metrics.
- Powerful automation: thresholds, alerts, actions.
- Complete GPU support: NVIDIA, AMD ATI, CUDA, OpenCL.
- On-demand SMP: instant ScaleMP virtual SMP deployment.
- Powerful cluster management shell and SOAP API for automating tasks and creating custom capabilities.
- Seamless integration with leading workload managers: PBS Pro, Moab, Maui, SLURM, Grid Engine, Torque, LSF.
- Integrated (parallel) application development environment.
- Easy maintenance: automatically update your cluster from Linux and Bright Computing repositories.
- Web-based user portal.
- Cloud-readiness at no extra cost, with support for “Cluster-on-Demand” and “Cluster-Extension” scenarios.

Maximum Uptime
- Unattended, robust head node failover to spare head node.
- Powerful cluster automation functionality allows pre-emptive actions based on monitoring thresholds.
- Comprehensive cluster monitoring and health checking framework, including automatic sidelining of unhealthy nodes to prevent job failure.

Scalability from Deskside to TOP500
- Off-loadable provisioning for maximum scalability.
- Proven on some of the world’s largest clusters.

Minimum Overhead / Maximum Performance
- Single lightweight daemon drives all functionality.
- Daemon heavily optimized to minimize effect on operating system and applications.
- Single database stores all metric and configuration data.

Top Security
- Automated security and other updates from key-signed repositories.
- Encrypted external and internal communications (optional).
- X509v3 certificate-based public-key authentication.
- Role-based access control and complete audit trail.
- Firewalls and secure LDAP.

Bright Cluster Manager: easy-to-use, complete and scalable

Bright Cluster Manager removes the complexity from the installation, management and use of HPC clusters, without compromising performance or capability. With Bright Cluster Manager, an administrator can easily install, use and manage multiple clusters simultaneously, without the need for expert knowledge of Linux or HPC.

[Figure: The cluster installer takes the administrator through the installation process and offers advanced options such as “Express” and “Remote”.]

A Unified Approach

Other cluster management offerings take a “toolkit” approach in which a Linux distribution is combined with many third-party tools for provisioning, monitoring, alerting, etc. This approach has critical limitations because those separate tools were not designed to work together, were not designed for HPC, and were not designed to scale. Furthermore, each of the tools has its own interface (mostly command-line based), and each has its own daemon(s) and database(s). Countless hours of scripting and testing from highly skilled people are required to get the tools to work for a specific cluster, and much of it goes undocumented.

Bright Cluster Manager takes a much more fundamental, integrated and unified approach. It was designed and written from the ground up for straightforward, efficient, comprehensive cluster management. It has a single lightweight daemon, a central database for all monitoring and configuration data, and a single CLI and GUI for all cluster management functionality.

This approach makes Bright Cluster Manager extremely easy to use, scalable, secure and reliable, complete, flexible, and easy to maintain and support.

[Figure: By selecting a cluster node in the tree on the left and the Tasks tab on the right, the administrator can execute a number of powerful tasks on that node with just a single mouse click.]

“Bright met our demanding requirements straight out of the box.” (Dr Tommy Minyard, Director of Advanced Computing at TACC)

Ease of Installation

Bright Cluster Manager is easy to install. Typically, system administrators can install and test a fully functional cluster from “bare metal” in less than an hour. Configuration choices made during the installation can be modified afterwards. Multiple installation modes are available, including unattended and remote modes. Cluster nodes can be automatically identified based on switch ports rather than MAC addresses, improving speed and reliability of installation, as well as subsequent maintenance. All major hardware brands are supported: Dell, IBM, HP, Supermicro, Acer, Asus and more.

“Bright Cluster Manager is a comprehensive cluster management solution that provides all the functionality that we need here at CD-adapco. Our key applications, STAR-CCM and STAR-CD, were easy to install and run well on the cluster.” (Philip Jones, Euro IT Director at CD-adapco)

Ease of Use

Bright Cluster Manager is easy to use. System administrators have two options: the intuitive Cluster Management Graphical User Interface (CMGUI) and the powerful Cluster Management Shell (CMSH).

The CMGUI is a standalone desktop application that provides a single system view for managing all hardware and software aspects of the cluster through a single point of control. Administrative functions are streamlined as all tasks are performed through one intuitive, visual interface. Multiple clusters can be managed simultaneously. The CMGUI runs on Linux, Windows and MacOS (coming soon) and can be extended using plugins. The CMSH provides practically the same functionality as the Bright CMGUI, but via a command-line interface. The CMSH can be used both interactively and in batch mode via scripts.

Either way, system administrators now have unprecedented flexibility and control over their clusters.

[Figure: The Overview tab provides instant, high-level insight into the status of the cluster.]

[Figure: Cluster metrics, such as GPU and CPU temperatures, fan speeds and network statistics, can be visualized by simply dragging and dropping them from the list on the left into a graphing window on the right. Multiple metrics can be combined in one graph and graphs can be zoomed into. Graph layout and colors can be tailored to your requirements.]

Support for Linux and Windows

Bright Cluster Manager is based on Linux and is available with a choice of pre-integrated, pre-configured and optimized Linux distributions, including SUSE Linux Enterprise Server, Red Hat Enterprise Linux, CentOS and Scientific Linux. Dual-boot installations with Windows HPC Server are supported as well, allowing nodes to either boot from the Bright-managed Linux head node, or the Windows-managed head node.

Extensive Development Environment

Bright Cluster Manager provides an extensive HPC development environment for both serial and parallel applications, including the following (some optional):
- Compilers, including full suites from GNU, Intel, AMD and Portland Group.
- Debuggers and profilers, including the GNU debugger and profiler, TAU, TotalView, Allinea DDT and Allinea OPT.
- GPU libraries, including CUDA and OpenCL.
- MPI libraries, including OpenMPI, MPICH, MPICH2, MPICH-MX, MPICH2-MX, MVAPICH and MVAPICH2; all cross-compiled with the compilers installed on Bright Cluster Manager, and optimized for high-speed interconnects such as InfiniBand and Myrinet.
- Mathematical libraries, including ACML, FFTW, GMP, GotoBLAS, MKL and ScaLAPACK.
- Other libraries, including Global Arrays, HDF5, IPP, TBB, NetCDF and PETSc.

Bright Cluster Manager also provides Environment Modules to make it easy to maintain multiple versions of compilers, libraries and applications for different users on the cluster, without creating compatibility conflicts. Each Environment Module file contains the information needed to configure the shell for an application, and automatically sets these variables correctly for the particular application when it is loaded. Bright Cluster Manager includes many preconfigured module files for many scenarios, such as combinations of compilers, mathematical and MPI libraries.

[Figure: The status of cluster nodes, switches, other hardware, as well as up to six metrics can be visualized in the Rackview. A zoom-out option is available for clusters with many racks.]

Powerful Image Management and Provisioning

Bright Cluster Manager features sophisticated software image management and provisioning capability.
A virtually unlimited number of images can be created and assigned to as many different categories of nodes as required. Default or custom Linux kernels can be assigned to individual images. Incremental changes to images can be deployed to live nodes without rebooting or re-installation.

The provisioning system propagates only changes to the images, minimizing time and impact on system performance and availability. Provisioning capability can be assigned to any number of nodes on-the-fly, for maximum flexibility and scalability. Bright Cluster Manager can also provision over InfiniBand and to ramdisk.

[Figure: The parallel shell allows for simultaneous execution of commands or scripts across node groups or across the entire cluster.]

“I am very impressed with the efficiency achieved with Bright Cluster Manager. Our cluster was up and running within a few hours, ready for integration into our HPC environment. Now it is continuing to save our system administrators valuable time.” (Prof. Lennart Johnsson, Director of the TLC2 and the Advanced Computing Research Laboratory at the University of Houston)

Comprehensive Monitoring

With Bright Cluster Manager, system administrators can collect, monitor, visualize and analyze a comprehensive set of metrics. Practically all software and hardware metrics available to the Linux kernel, and all hardware management interface metrics (IPMI, iLO, etc.) are sampled. Examples include CPU and GPU temperatures, fan speeds, switches, hard disk SMART information, system load, memory utilization, network statistics, storage metrics, power systems statistics, and workload management statistics. Custom metrics can also easily be defined.

Metric sampling is done very efficiently: in one process, or out-of-band where possible. System administrators have full flexibility over how and when metrics are sampled, and historic data can be consolidated over time to save disk space.

Cluster Management Automation

Cluster management automation takes preemptive actions when predetermined system thresholds are exceeded, saving time and preventing hardware damage. System thresholds can be configured on any of the available metrics. The built-in configuration wizard guides the system administrator through the steps of defining a rule: selecting metrics, defining thresholds and specifying actions. For example, a temperature threshold for GPUs can be established that results in the system automatically shutting down an overheated GPU unit and sending an SMS message to the system administrator’s mobile phone. Several predefined actions are available, but any Linux command or script can be configured as an action.

Comprehensive GPU Management

Bright Cluster Manager radically reduces the time and effort of managing GPUs, and fully integrates these devices into the single view of the overall system. Bright includes powerful GPU management and monitoring capability that leverages functionality in NVIDIA Tesla GPUs. System administrators can easily assume maximum control of the GPUs and gain instant and time-based status insight.
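The threshold-and-action rules described under Cluster Management Automation, applied to the GPU temperature example, can be pictured with a small Python sketch. This is only an illustration of the idea, not Bright Cluster Manager's API: the `Rule` class, the `evaluate` function, the `gpu_temperature` metric name, the 90-degree threshold and the `power_off_and_notify` action are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A hypothetical automation rule: metric + threshold + action."""
    metric: str
    threshold: float
    action: Callable[[str, float], str]


def evaluate(rules: List[Rule], samples: Dict[str, Dict[str, float]]) -> List[str]:
    """Run every rule against the latest samples ({node: {metric: value}}).

    Returns the results of the actions that fired, in node-name order.
    """
    fired = []
    for node, metrics in sorted(samples.items()):
        for rule in rules:
            value = metrics.get(rule.metric)
            if value is not None and value > rule.threshold:
                fired.append(rule.action(node, value))
    return fired


def power_off_and_notify(node: str, value: float) -> str:
    # Placeholder action: a real setup would run a command or script here
    # (for example, shut the unit down and message the administrator).
    return f"power off {node}; notify admin (gpu_temperature={value})"


# Assumed rule and sample data, for illustration only.
rules = [Rule("gpu_temperature", 90.0, power_off_and_notify)]
samples = {
    "node001": {"gpu_temperature": 71.0},
    "node002": {"gpu_temperature": 95.5},
}
fired = evaluate(rules, samples)
print(fired)
```

Keeping the action as a plain callable mirrors the text's point that any command or script can serve as an action; only the rule evaluation loop stays fixed.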
In addition to the standard cluster management capabilities, Bright Cluster Manager monitors the full range of GPU metrics, including:
- Board serial, driver version, PCI info.
- GPU temperature, fan speed, utilization.
- GPU exclusivity, compute, display, persistence mode.
- GPU memory utilization, ECC statistics.
- Unit fan speed, serial number, temperature, power usage, voltages and currents, LED status, firmware.

Beyond metrics, Bright Cluster Manager features built-in support for GPU computing with CUDA and OpenCL libraries. Switching between current and previous versions of CUDA and OpenCL has also been made easy.

[Figure: The automation configuration wizard guides the system administrator through the steps of defining a rule: selecting metrics, defining thresholds and specifying actions.]

[Figure: Example graphs that visualize metrics on a GPU cluster.]

“Bright Cluster Manager is a key component of our solution. Bright’s image management capabilities make it easy for Cray to test new images in a dynamic environment and rapidly deploy upgrades. We are able to just about eliminate system downtime.” (Kim Schumann, Data Management Practice Leader at Cray)

Multi-Tasking Via Parallel Shell

The parallel shell allows simultaneous execution of multiple commands and scripts across the cluster as a whole, or across easily definable groups of nodes. Output from the executed commands is displayed in a convenient way with variable levels of verbosity. Running commands and scripts can be killed easily if necessary. The parallel shell is available through both the CMGUI and the CMSH.

Integrated Workload Management

Bright Cluster Manager is integrated with a wide selection of free and commercial workload managers. This integration provides a number of benefits:
- The selected workload manager gets automatically installed and configured.
- Many workload manager metrics are monitored.
- The GUI provides a user-friendly interface for configuring, monitoring and managing the selected workload manager.
- The CMSH and the SOAP API provide direct and powerful access to a number of workload manager commands and metrics.
- Reliable workload manager failover is properly configured.
- The workload manager is continuously made aware of the health state of nodes (see section on Health Checking).

The following user-selectable workload managers are tightly integrated with Bright Cluster Manager:
- PBS Pro, Moab, Maui, LSF.
- SLURM, Grid Engine, Torque.

Alternatively, Lava, LoadLeveler or other workload managers can be installed on top of Bright Cluster Manager.

“With Bright Cluster Manager now offering full support for ScaleMP vSMP Foundation, setting up and managing an SMP cluster has never been so easy.” (Shai Fultheim, CEO of ScaleMP)

[Figure: Creating and dismantling a virtual SMP node can be achieved with just a few clicks within the GUI or a single command in the cluster management shell.]

Integrated SMP Support

Bright Cluster Manager Advanced Edition dynamically aggregates multiple cluster nodes into a single virtual SMP node, using ScaleMP’s Versatile SMP (vSMP) architecture. Creating and dismantling a virtual SMP node can be achieved with just a few clicks within the CMGUI.
Virtual SMP nodes can also be launched and dismantled automatically using the scripting capabilities of the CMSH.

In Bright Cluster Manager a virtual SMP node behaves like any other node, enabling transparent, on-the-fly provisioning, configuration, monitoring and management of virtual SMP nodes as part of the overall system management.

Maximum Uptime with Head Node Failover

Bright Cluster Manager Advanced Edition allows two head nodes to be configured in active-active failover mode. Both head nodes are on active duty, but if one fails, the other takes over all tasks, seamlessly.

[Figure: Workload management queues can be viewed and configured from the GUI, without the need for workload management expertise.]

Maximum Uptime with Health Checking

Bright Cluster Manager Advanced Edition includes a powerful cluster health checking framework that maximizes system uptime. It continually checks multiple health indicators for all hardware and software components and proactively initiates corrective actions. It can also automatically perform a series of standard and user-defined tests just before starting a new job, to ensure a successful execution.

Examples of corrective actions include autonomous bypass of faulty nodes, automatic job requeuing to avoid queue flushing, and process “jailing” to allocate, track, trace and flush completed user processes. The health checking framework ensures the highest job throughput, the best overall cluster efficiency and the lowest administration overhead.

“Our uniquely complex cluster represents a difficult management challenge, which is why we chose Bright Cluster Manager.” (Prof. Volker Lindenstruth, FIAS, University of Frankfurt)

Web-Based User Portal

The web-based user portal provides read-only access to essential cluster information, including a general overview of the cluster status, node hardware and software properties, workload manager statistics and user-customizable graphs. The User Portal can easily be customized and expanded using PHP and the SOAP API.

[Figure: The web-based user portal.]

User and Group Management

Users can be added to the cluster through the CMGUI or the CMSH. Bright Cluster Manager comes with a pre-configured LDAP database, but an external LDAP service, or alternative authentication system, can be used instead.

Role-based Access Control and Auditing

Bright Cluster Manager’s role-based access control mechanism allows administrator privileges to be defined on a per-role basis. Administrator actions can be audited using an audit file which stores all their write actions.

Top Cluster Security

Bright Cluster Manager offers an unprecedented level of security that can easily be tailored to local requirements. Security features include:
- Automated security and other updates from key-signed Linux and Bright Computing repositories.
- Encrypted internal and external communications.
- X509v3 certificate-based public-key authentication to the cluster management infrastructure.
- Role-based access control and complete audit trail.
- Firewalls and secure LDAP.
- Secure shell access.

Multi-Cluster Capability

Bright Cluster Manager is ideal for organizations that need to manage multiple clusters, either in one or in multiple locations. Capabilities include:
- All cluster management and monitoring functionality available for all clusters through one GUI.
- Selecting any set of configurations in one cluster and exporting them to any or all other clusters with a few mouse clicks.
- Making node images available to other clusters.

[Figure: Bright Cluster Manager can manage multiple clusters simultaneously. This overview shows clusters in Oslo, Abu Dhabi and Houston, all managed through one GUI.]
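To make the per-role privileges and the audit file of write actions (Role-based Access Control and Auditing) more concrete, here is a minimal Python sketch. It is not the product's implementation: the role names, the `AuditedSettings` class, the in-memory `audit_log` list standing in for the audit file, and the `node001.category` key are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Hypothetical role definitions; the brochure does not list the actual roles.
ROLES = {
    "admin": {"read", "write"},
    "operator": {"read"},
}


class AccessDenied(Exception):
    """Raised when a user's role does not grant the requested permission."""


class AuditedSettings:
    """Key/value settings guarded by roles, with an audit trail of writes."""

    def __init__(self, user_roles):
        self.user_roles = user_roles   # {user: role name}
        self.values = {}
        self.audit_log = []            # stands in for the audit file

    def _require(self, user, permission):
        role = self.user_roles.get(user)
        if permission not in ROLES.get(role, set()):
            raise AccessDenied(f"{user!r} may not {permission}")

    def read(self, user, key):
        self._require(user, "read")
        return self.values.get(key)

    def write(self, user, key, value):
        self._require(user, "write")
        self.values[key] = value
        # Only successful write actions are recorded, as the text describes.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "key": key,
            "value": value,
        })


settings = AuditedSettings({"alice": "admin", "bob": "operator"})
settings.write("alice", "node001.category", "gpu")

try:
    settings.write("bob", "node001.category", "cpu")
    denied = False
except AccessDenied:
    denied = True

print(denied, settings.read("bob", "node001.category"), len(settings.audit_log))
```

Attaching permissions to roles rather than to individual users means a change in duties only requires reassigning a role, while the audit entries still name the individual who made each write.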

[Figure: Cluster health checks can be visualized in the Rackview. This screenshot shows that GPU unit 41 fails a health check called “AllFansRunning”.]

Cloud Bursting

Bright Cluster Manager supports two cloud bursting scenarios: “Cluster-on-Demand”, running stand-alone clusters in the cloud; and “Cluster Extension”, adding cloud-based resources to existing, onsite clusters and managing these cloud nodes as if they were local. Both scenarios can be achieved in just a few mouse clicks. Every Bright cluster is automatically cloud-ready, at no extra cost.

[Figure: Scenario 1: “Cluster on Demand”. Use Bright to create stand-alone clusters in the cloud.]

[Figure: Scenario 2: “Cluster Extension”. Use Bright to extend onsite clusters into the cloud.]

Standard and Advanced Editions

Bright Cluster Manager is available in two editions: Standard and Advanced. The table on this page lists the differences. You can easily upgrade from the Standard to the Advanced Edition as your cluster grows in size or complexity.

Feature table (columns: Feature, Standard, Advanced)

Features listed without a “-” entry:
- Choice of Linux distributions
- Intel Cluster Ready
- Cluster Management GUI
- Cluster Management Shell
- Web-Based User Portal
- SOAP API
- Node Provisioning
- Node Identification
- Cluster Monitoring
- Cluster Automation
- User Management
- Parallel Shell
- Workload Manager Integration
- Cluster Security
- Compilers
- Debuggers & Profilers
- MPI Libraries
- Mathematical Libraries
- Environment Modules
- NVIDIA CUDA & OpenCL
- GPU Management & Monitoring
- Cloud Bursting

Features with a “-” entry (described elsewhere in the text as Advanced Edition features):
- ScaleMP Management & Monitoring
- Redundant Failover Head Nodes
- Cluster Health Checking
- Off-loadable Provisioning

Additional rows: Suggested Number of Nodes (4–128 for Standard, 129–10,000 for Advanced), Multi-Cluster Management, Standard Support and Premium Support; two of the cells in these rows read “Optional”.

Documentation and Services

A comprehensive system administrator manual and user manual are included in PDF format. Customized training and professional services are available. Services include various levels of support, installation and consultancy.

Bright Computing, Inc.
2880 Zanker Road, Suite 203
San Jose, California 95134
United States
Tel: 1 408 300 9448
Fax: 1 408 715 …

Terms & Conditions apply. Copyright 2009-2012 Bright Computing, Inc. All rights reserved. While every precaution has been taken in the preparation of this publication, the authors assume no responsibility for errors or omissions, or for damage resulting from the use of the information contained herein. Bright Computing, Bright Cluster Manager and the Bright Computing logo are trademarks of Bright Computing, Inc. All other trademarks are the property of their respective owners.
