#HPEDiscover

Please give me your feedback
Session ID: T4557

Speakers: Gisli Kr., Hans Rickardt

– Use the mobile app to complete a session survey:
  1. Access “My schedule”
  2. Click on the session detail page
  3. Scroll down to “Feedback”
– If the session is not on your schedule, find it via the app’s “Session Schedule” menu, click on this session, and scroll down to “Feedback”.
– If you don’t have the app, download it today: go to your phone’s app store and search for “HPE Discover 2017”.
– To access the session survey online, go to the Agenda Builder in the event content catalog and click on your session.

Thank you for providing your feedback, which helps us enhance content for future events.


ADVANIA HPCAAS CLOUD
Advania’s HPC capabilities, delivery and technical specifications

About Advania

• Founded in 1939
• 1,200+ employees
• 5,300 corporate clients worldwide
• Collaborates with all the leading international IT partners

Advania HPC
• HPC Architecture & Design
• HPC Support
• HPC Procurement Services
• HPC Delivery
• Hybrid HPC
• Hosted HPC
• Dedicated HPC
• Full Stack HPC
• HPC on Demand

Advania Data Centers
• Natural free cooling; PUE from 1.03
• Powered by 100% green energy; no CO2 footprint
• ISO 27001 certified
• HPC growing fast, with room to grow

HPC as a Service: Cluster Design
Hans Rickardt, HPC Solution Architect, [email protected]

Agenda
• HPC as a Service
• Cluster configuration
• Cluster network and installation
• Intel Omni-Path
• Cluster verification
• Benchmark and results
• BeeGFS / BeeOND scratch storage
• LOKI HPC Cluster

Advania HPCaaS Partner Ecosystem

UberCloud Living Heart Project
• Stanford University
• UberCloud
• Advania
• Simulia
• HPE
Achieve Breakthrough in Living Heart Simulations

Cluster extension

COMPONENT DETAILS
• HPE Server XL230a: 2 x Intel® Xeon® E5-2683 v4 processors (2 x 16 cores), 2.1 GHz base frequency, 3.00 GHz max turbo; 256 GB memory, 2400 MT/s; 1.92 TB SSD; Omni-Path 100 Gbit HCA; 2 x 10 Gbit Ethernet
• HPE Apollo 6000: server chassis
• Intel® Omni-Path Edge Switches: 100 Gbit (InfiniBand)
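Not on the slide: a quick acceptance check that a delivered node matches this spec, a sketch assuming standard Linux tools and that the two 10GbE ports appear as eth0/eth1:

$ lscpu | grep -E 'Model name|Socket|Core\(s\) per socket'    # expect 2 sockets x 16 cores, E5-2683 v4
$ grep MemTotal /proc/meminfo                                 # expect ~256 GB
$ lsblk -d -o NAME,SIZE,ROTA                                  # expect a 1.92 TB SSD (ROTA=0)
$ for dev in eth0 eth1; do ethtool "$dev" | grep Speed; done  # expect Speed: 10000Mb/s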

HPE FACTORY INTEGRATION SERVICES
Cluster network:
• 10 Gbit Ethernet switches
• Intel® Omni-Path Edge Switches

NETWORK SETUP
Physical nodes, no hypervisor.
[Diagram: Internet traffic enters through a VPN to a jump server/firewall on a jump network VLAN. Login servers, a GPU server, and an NFS server (/home) sit on a 10 Gbit network VLAN; a 1 Gbit iLO network carries management traffic to the CMU master. Compute nodes are interconnected over 100 Gbit Intel Omni-Path.]

CUSTOMER SETUP
Physical nodes, no hypervisor.
[Diagram: per customer, the same pattern: VPN and jump server/firewall on a jump network VLAN; a login server, GPU server, and NFS server (/home) on a dedicated customer VLAN; compute nodes on the 10 Gbit network VLAN, interconnected over 100 Gbit Intel Omni-Path/InfiniBand and isolated per customer with pkeys.]

opafm.xml excerpt (XML markup lost in extraction; readable fields include the node GUID 0x00117501017a2123, node type FI, node description cn06 hfi1_0, port 1, and PKey 0x0009):
0x00117501017a2123 FI 1 cn06 hfi1_0 1 0x0009 0x00117501017a2123 4 8 25Gb 2
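Not from the slides: one way to confirm which partition keys a node’s HFI actually holds is sysfs, since the hfi1 driver registers as a standard RDMA device (the single-HFI path below is an assumption):

$ grep -H . /sys/class/infiniband/hfi1_0/ports/1/pkeys/*
# expect the customer PKey (here 0x0009) alongside the default/management key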

INTEL® OMNI-PATH EDGE SWITCH CONFIGURATION
2 root switches, 2:1 blocking:
• 4 leaf switches: 128 nodes
• 6 leaf switches: 192 nodes
[Diagram: roots and leaves are Intel® Omni-Path Edge Switch 100 Series 48-port switches (roots managed); each leaf connects 32 compute nodes downstream and 8 uplinks to each of the two roots.]
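The ratio falls out of the 48-port leaf: 32 ports down to nodes, 16 up to the roots. A quick port-budget check (plain shell arithmetic, not from the slides):

$ echo $((4 * 32)) $((6 * 32))   # 4 leaves -> 128 nodes, 6 leaves -> 192 nodes
$ echo $((32 / 16))              # downlinks per uplink: 2, i.e. 2:1 blocking
$ echo $((16 / 2))               # uplinks from each leaf to each of the 2 roots: 8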

INTEL® OMNI-PATH EDGE SWITCH LARGE CONFIGURATION
16 root switches, 2:1 blocking: 48 leaf switches, 1,536 nodes.
[Diagram: each 48-port leaf connects 32 compute nodes and one uplink to each of the 16 roots; 48 leaves x 32 nodes = 1,536.]

MANAGED SWITCH WEB INTERFACE

MANAGED SWITCH CLI

FABRIC MANAGER GUI

OMNI-PATH DRIVER / MPI OPTIONS

$ mpi-selector --list
mvapich2_gcc-2.2
mvapich2_gcc_hfi-2.2
mvapich2_intel_hfi-2.2
openmpi_gcc-1.10.4
openmpi_gcc_cuda_hfi-1.10.4
openmpi_gcc_hfi-1.10.4
openmpi_intel_hfi-1.10.4
intel-mpi (extra)
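To make one of these the active stack, mpi-selector sets a per-user or system-wide default that applies to subsequently started shells (the stack chosen here is only an example):

$ mpi-selector --set openmpi_gcc_hfi-1.10.4 --user
$ mpi-selector --query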

Cluster Test Verification
• System verification: BIOS settings, memory test
• Cable check
• Benchmark and results
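Not from the slides: on an Omni-Path fabric, Intel’s FastFabric tools cover much of this checklist; a sketch (verify option names against your toolkit version):

$ opafabricinfo           # fabric summary: SM state, node and link counts
$ opareport -o errors     # ports with error counters above threshold
$ opareport -o slowlinks  # links below expected speed/width, often a cabling issue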

NODE PERFORMANCE (single-node Linpack, GFlops)

Min:
# sort -nk2 linpack_nodes_result.txt | head -10
linpack_hpcclu1cn16.txt 785.0371   <- memory problems
linpack_hpcclu1cn33.txt 804.3803
linpack_hpcclu1cn57.txt 949.1486   <- slow processors
linpack_hpcclu1cn48.txt 950.0909
...
linpack_hpcclu1cn15.txt 974.0912
linpack_hpcclu1cn65.txt 976.8977

Max:
# sort -nk2 linpack_nodes_result.txt | tail -3
linpack_hpcclu1cn61.txt 1035.0789
linpack_hpcclu1cn22.txt 1035.2108
linpack_hpcclu1cn51.txt 1035.2122

Median (labeled “Average” on the slide; the command takes the 50th of ~100 sorted results):
# sort -nk2 linpack_nodes_result.txt | head -50 | tail -1
linpack_hpcclu1cn86.txt 1006.0928
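The same two-column file gives the true mean in one awk pass (not on the slide):

# awk '{ sum += $2 } END { printf "mean %.1f GFlops over %d nodes\n", sum/NR, NR }' linpack_nodes_result.txt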

SYSTEM PERFORMANCE: LINPACK USING 96 NODES

*--> Estimated Run Time for Group 0 is: 21154 seconds @ 85%, Peak=103.219 TFlops

mpirun -ppn 2 -f ./hosts -np 192 -genvall -genv I_MPI_FABRICS tmi \
  -genv I_MPI_OFA_ADAPTER_NAME=hfi1_0:1 ../sbin/xhpl.sh

** Group 0 interim data 21:33:52 hpcclu1cn94 : Column=007104 Fraction=0.005 Kernel=1216.00 Mflops=103316494.46 ** 0% completed **
** Group 0 interim data 02:40:28 hpcclu1cn40 : Column=1259136 Fraction=0.895 Kernel=94347551.26 Mflops=99357777.21 ** 80% completed **

FINAL RESULTS: Group 0 (hpcclu1cn1 ... hpcclu1cn100) PASSED
WR00L2L1 1406784 192 12 16 18693.48 9.92891e+04

Result: 9.92891e+04 GFlops = 99.289 TFlops, i.e. 96.2% of the 103.219 TFlops estimated peak.
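That peak is consistent with the node spec: 96 nodes x 2 sockets x 16 cores x 2.1 GHz x 16 DP flops/cycle (AVX2 FMA on Broadwell). A quick check of both numbers (plain arithmetic, not from the slides):

$ awk 'BEGIN { printf "peak: %.1f GFlops\n", 96*32*2.1*16 }'          # 103219.2
$ awk 'BEGIN { printf "efficiency: %.1f%%\n", 99289.1/103219.2*100 }' # 96.2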

HPE INSIGHT CLUSTER MANAGEMENT UTILITY (CMU)

Provisioning
• Simplified discovery
• Auto-install {kickstart, autoyast…}
• Fast & scalable cloning
• Diskless support

Monitoring
• ‘At a glance’ view of entire system or partitions
• Customizable and HPC friendly
• 2D Instant View
• 3D Time View visualization

Scalers
• GUI, CLI or API interfaces
• ‘One click’ control of servers
• Cluster-aware Smart Terminal
• CMU Diff: command broadcast & analysis
• Batch-scheduler ready with CMU partner connectors
• History of performance metrics

15+ years in deployment, including Top500 sites with thousands of nodes. Standard HPE support; also available as a factory-integrated cluster option.


BEEGFS / BEEOND SCRATCH STORAGE

https://www.beegfs.io/wiki/BeeOND
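BeeOND builds a temporary BeeGFS instance from the job nodes’ local SSDs for the lifetime of a job and tears it down afterwards. A minimal prologue/epilogue sketch, assuming a nodefile and local SSDs mounted at /local/scratch (paths and flags are assumptions; see the wiki above for current options):

# beeond start -n ./nodefile -d /local/scratch/beeond -c /mnt/beeond   # create the shared scratch FS
# ... run the job with its scratch I/O under /mnt/beeond ...
# beeond stop -n ./nodefile -L -d                                      # unmount and delete the data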


Thank you

Contact us
• Ægir Rafn Magnusson, Business Development Director, [email protected]
• Staffan Hansson, HPC Sales Specialist, [email protected]
• Hans Rickardt, Lead HPC Specialist, [email protected]
• Gisli Kr., HPC Marketing Manager, [email protected]