Are there Exascale Algorithms?

Kathy Yelick
Associate Laboratory Director and Acting NERSC Director, Lawrence Berkeley National Laboratory
EECS Professor, UC Berkeley

Computational Science has Moved through Difficult Technology Transitions

[Figure: Application Performance Growth (Gordon Bell Prizes), annotated with the "attack of the 'killer micros'".]

HPC: From Vector Supercomputers to Massively Parallel Systems

Vector supercomputers were programmed by "annotating" serial programs; massively parallel systems were programmed by completely rethinking algorithms and software for parallelism.

[Figure: number of Top 500 systems by architecture (Single Proc., SMP, SIMD, Constellation, Cluster, MPP), 1993-2011, with annotations on the share of industrial use.]

The Impact of Scientific Libraries in High Performance Computing

•  Application complexity grew due to parallelism and more ambitious science problems (e.g., multiphysics, multiscale)
•  Scientific libraries enable these applications

LAPACK:        35% of apps
ScaLAPACK:     20% of apps
Overture:      ~1,200 downloads/yr
netCDF:        ~12% of apps
Trilinos:      21,000 downloads total
METIS:         4% of apps
FFTW:          ~25% of apps
PETSc:         ~4,800 downloads/yr
hypre:         ~1,400 downloads/yr
HDF5:          ~11% of apps
ParPack:       3% of apps
FastBit:       6,300 downloads total
SuperLU:       ~4% of apps
GlobalArrays:  28% of apps

Numbers show downloads per year or total; percentages show the share of NERSC projects that use each library.
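As a small illustration of the kind of capability these libraries provide, here is a minimal sketch of solving a dense linear system with LAPACK through its LAPACKE C interface. The 3x3 system is an invented placeholder and the example is not taken from the presentation; a real application would get its data (and error handling) from its own framework.

/* Minimal sketch: solve A x = b with LAPACK's dgesv via the LAPACKE
 * C interface.  The 3x3 system here is an illustrative placeholder. */
#include <stdio.h>
#include <lapacke.h>

int main(void) {
    double A[9] = { 4.0, 1.0, 0.0,     /* row-major 3x3 matrix */
                    1.0, 3.0, 1.0,
                    0.0, 1.0, 2.0 };
    double b[3] = { 1.0, 2.0, 3.0 };   /* right-hand side, overwritten by x */
    lapack_int ipiv[3];

    /* LU-factorize A and solve in one call; info != 0 signals failure */
    lapack_int info = LAPACKE_dgesv(LAPACK_ROW_MAJOR, 3, 1, A, 3, ipiv, b, 1);
    if (info != 0) {
        fprintf(stderr, "dgesv failed: info = %d\n", (int)info);
        return 1;
    }
    printf("x = %f %f %f\n", b[0], b[1], b[2]);
    return 0;
}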

NITRD Projects Addressed Programmer Productivity of Irregular Problems

Message Passing Programming: divide the domain into pieces; compute one piece and exchange. PVM, MPI, and many libraries.

Global Address Space Programming: each thread starts computing; grab whatever data, whenever it is needed. UPC, CAF, X10, Chapel, Fortress, Titanium, GA.
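To make the contrast concrete, below is a minimal sketch in C that uses MPI for both styles: two-sided Sendrecv for the message-passing "compute one piece and exchange" pattern, and a one-sided MPI_Get as a stand-in for the global-address-space "grab whatever / whenever" pattern (the PGAS languages listed above, such as UPC or CAF, express this with shared arrays rather than explicit window calls). The domain size and the trivial "computation" are placeholders.

/* Contrast of the two programming styles on a 1-D halo exchange.
 * Sketch only: error handling and the real stencil computation are omitted. */
#include <mpi.h>
#include <stdio.h>

#define N 1000            /* size of each rank's local piece of the domain */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local[N + 2];                            /* interior plus two ghost cells */
    for (int i = 1; i <= N; i++) local[i] = rank;   /* "compute one piece" */

    int left  = (rank - 1 + size) % size;
    int right = (rank + 1) % size;

    /* Message-passing style: explicitly exchange boundary values. */
    MPI_Sendrecv(&local[N], 1, MPI_DOUBLE, right, 0,
                 &local[0], 1, MPI_DOUBLE, left,  0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&local[1],     1, MPI_DOUBLE, left,  1,
                 &local[N + 1], 1, MPI_DOUBLE, right, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* Global-address-space style: expose the local piece in a window and
     * read the neighbor's boundary value directly, whenever it is needed. */
    MPI_Win win;
    MPI_Win_create(local, (N + 2) * sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    double remote_value;
    MPI_Win_fence(0, win);
    MPI_Get(&remote_value, 1, MPI_DOUBLE, right, 1, 1, MPI_DOUBLE, win);
    MPI_Win_fence(0, win);
    MPI_Win_free(&win);

    if (rank == 0)
        printf("rank 0 read %.0f from rank %d\n", remote_value, right);
    MPI_Finalize();
    return 0;
}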

Computing Performance Improvements will be Harder than Ever

[Figure: transistors (thousands), clock frequency (MHz), power (W), and cores per chip, 1970-2010.]

Moore's Law continues, but power limits performance growth. Parallelism is used instead.

Scientists Need to Undertake Another Difficult Technology Transition

[Figure: Application Performance Growth (Gordon Bell Prizes), extrapolated to the first exascale application (a billion billion operations per second), annotated with the "attack of the 'killer micros'" and the point where the rest of the computing world gets parallelism.]

Energy Efficient Computing is Key to Performance Growth

At $1M per MW, energy costs are substantial:
•  1 petaflop in 2010 used 3 MW
•  1 exaflop in 2018 would use 130 MW with "Moore's Law" scaling

[Figure: projected system power, 2005-2020, versus the usual scaling goal.]

This problem wouldn't change if we built 1,000 1-petaflop machines instead of one exaflop machine, and it affects every university department cluster and cloud data center.
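Taking those figures at face value, and assuming the $1M per MW is an annual operating cost (an assumption; the slide does not state the billing period), the rough arithmetic is:

    3 MW   x  $1M per MW-year  ≈  $3M per year    (the 2010 petaflop system)
    130 MW x  $1M per MW-year  ≈  $130M per year  (an exaflop system under "Moore's Law" scaling)

The total is the same whether the flops come from one exaflop machine or a thousand petaflop machines, which is why the power problem does not go away by splitting the system up.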

New Processor Designs are Needed to Save Energy

Cell phone processor: 0.1 Watt, 4 Gflop/s
Server processor: 100 Watts, 50 Gflop/s

•  Server processors are designed for performance
•  Embedded and graphics processors use simple, low-power cores for good performance per watt
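Put in terms of energy efficiency, those numbers work out to roughly:

    Cell phone processor: 4 Gflop/s / 0.1 W  = 40 Gflop/s per watt
    Server processor:     50 Gflop/s / 100 W = 0.5 Gflop/s per watt

That is about an 80x advantage in performance per watt for the simple low-power design.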