
High Performance Computing (HPC)

@Muhammad J. Abu-Emaish

Academic year: 2024


(1)

Dr. Noha MM.

Computer Science Department Thebes Academy

High Performance Computing (HPC)

Lecture 1

(2)

Content

Performance and Metrics

Anatomy of A Supercomputer

Serial Computing vs. Parallel Computing

Elementary Steps in Parallel Programming

Introduction

Why HPC

(3)

Introduction

• High Performance Computing (HPC) refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a desktop computer, in order to solve large problems in science and engineering, simulation, modeling, or big-data analytics applications.

• HPC is really a collection of multiple interrelated disciplines, each providing an important aspect of the total field.

• HPC is a field that relates to all facets of technology, methodology, and application associated with achieving the greatest computing capability possible at any point in time.

(4)

Why HPC

• Scientific simulation and modelling of many complex physical phenomena, complex biological systems, and complex social behaviors drive the need for greater computing power.

• Single-core processors cannot be made with enough resources for what these simulations need:

• Making processors with faster clock speeds is difficult due to cost and power/heat limitations.

• It is expensive to put huge memory on a single processor.

(5)
(6)

Serial Computing vs. Parallel Computing

Solution: parallel computing, which divides the work among numerous linked systems.

(7)

Cont.

Serial/Sequential:

• The load (memory) can be heavy for one machine to carry.

• The task will take a long time.

Distributed memory (parallel):

• One task is divided into small blocks across different computers.

• Parallel processing + Scientific computing = High Performance Scientific Computing

(8)

Elementary Steps in Parallel Programing

• How many computers are doing the work. (Degree of parallelism)

• What is needed to begin the work. (Initialization)

• Who does what. (Work distribution)

• How each worker accesses its part of the work. (Data/I/O access)

• Whether workers need information from each other to finish their own jobs. (Communication)

• When are they all done. (Synchronization)

• What needs to be done to collect the results.

• Common tools: Message Passing Interface (MPI), Open Multi-Processing (OpenMP), etc.

(9)

Anatomy of A Supercomputer

• A modern supercomputer: Titan, one of the fastest computers in the world.

• The layered hierarchy of the many physical and logical components contributing to a general-purpose supercomputer.

• The system hardware layer represents the physical resources that are the most visible (and audible) aspect of a supercomputer like Titan. Even in this high-level view, the principal components of the system can be seen.

• The processors that perform the calculations, and the memory that stores both the data and the program codes that operate on it, are shown here.

• The interconnection network integrates potentially many thousands, and eventually millions, of such processor/memory "nodes" into a single supercomputer.

(10)
(11)

Cont.

• The first levels of software that control the hardware and manage these resources are associated with the operating system.

• Each node has a local instance of an operating system controlling the physical memory and processor resources of the node as well as the interface to the external (to the node) system area network.

• The overall work management layer involves several support capabilities, including the programming languages (e.g., Fortran, C, C++), additional libraries often for parallelism (e.g., MPI, OpenMP), and compilers that provide machine-readable code for the processor cores translated and optimized from the user code.

(12)

Performance and Metrics

• 1.23958456606 + 4.2254568978 = ?? (a single floating-point operation)

• For HPC (i.e., for scientific and technical programming), the most widely used metric is "floating-point operations per second", or "flops".

• Modern supercomputers are measured in PFLOPS (petaflops).

• Kilo = 10³, Mega = 10⁶, Giga = 10⁹, Tera = 10¹², Peta = 10¹⁵

• Two basic measures are employed individually or in combination and in differing contexts to formulate the values used to represent the quality of a supercomputer.

• These two fundamental measures are "time" and "number of operations performed", both under prescribed conditions.

(13)

This formalization of performance degradation is referred to by the acronym SLOW, which identifies the sources as starvation, latency, overhead, and waiting for contention.

Latency: the time it takes for information to travel from one part of a system to another.

Overhead: the amount of additional work required to manage the computation, beyond the computation itself (work that would not be needed on a pure sequential processor).

Waiting: threads of action waiting for shared resources, due to contention for access, degrades performance.

Starvation (delay): relates to a critical source of performance, parallelism; there is not enough concurrent work to keep all resources busy.

(14)

Techniques to improve performance

Parallel algorithms

Performance monitoring

Hardware scaling

Task granularity control

Work and data distribution

(15)
