* PHYS 743 - Parallel Programming
** General remarks
- QUESTION: Do we have time to visit INJ?
** Tentative agenda
*** Admin intro
**** Projects
*** Basic concepts
**** ssh, scp, rsync
**** Compilation
***** Modules
**** Debugging
*** Architecture
**** Cluster (MPI)
***** Clusters in general
***** At SCITAS
**** Multicore (OpenMP)
**** Single core (SIMD)
*** Optimization
**** Data access
**** Vectorization
**** Basic optimization techniques
*** Performance measurement
**** Key concepts
***** FLOPS, memory bandwidth
***** Timing (speedup, scaling)
**** Profiling
**** Roofline
*** Shared memory (OpenMP) [10/13]
**** [X] Task parallelism
**** [X] OpenMP terminology / Read spec
**** [X] Fork-join / Omp parallel / Implicit barriers
**** [X] Exercise Hello World / SLURM
**** [X] Omp parallel for
**** [X] Exercise
**** [-] Race condition, accumulation in array (false sharing)
**** [X] Omp critical (synchronization), atomic
**** [X] Omp private
**** [X] Omp reduction (sketch below)
**** [X] Work sharing constructs
**** [-] OpenMP (new features not covered)
**** [-] Exercise Poisson
*** Advanced
**** Schedule
**** NUMA / pinning / first touch
**** Collapse
**** Barriers
**** (GPU)
*** Distributed memory (MPI) basic
**** Introduction / Read spec
**** MPI environment / Hello world (sketch below)
***** Print before init
***** Print rank
***** Print conditionally on rank
**** MPI terminology
**** Point-to-point
***** Synchronous / Deadlock (sketch below)
***** Asynchronous / Race condition
**** Collective
***** Bcast
***** Gather/scatter
***** Reduce
**** Advanced collective
***** All variants (Allreduce, Allgather)
***** Gatherv/Scatterv
***** All-to-all
***** Barrier
**** MPI Fortran
***** Bindings
***** Asynchronous arrays
**** Exercise Poisson
*** Distributed memory (MPI) advanced
**** Derived types
**** Pack/Unpack
**** Communicators
**** Topologies
**** I/O
**** One-sided
**** Persistent
**** Non-blocking collectives
*** Hybrid programming
**** MPI init (sketch below)
**** Task/thread distribution
*** Recap
*** Projects
*** Promotion of SCITAS
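** Code sketches
Minimal C sketches for a few of the agenda items above. They are illustrations, not the course's exercise material; the compile flags and launch commands mentioned are assumptions (GCC with =-fopenmp=, SLURM's =srun=).

For the race-condition and reduction items: a shared accumulator updated concurrently by many threads is a data race, while a =reduction= clause gives each thread a private copy that is combined at the end of the loop.

#+BEGIN_SRC c
#include <stdio.h>
#include <omp.h>

int main(void) {
  const int n = 1000000;
  double sum = 0.0;

  /* reduction(+:sum): each thread accumulates into a private copy of
     sum; the copies are combined after the loop, avoiding the data
     race that a shared accumulator would cause. */
  #pragma omp parallel for reduction(+:sum)
  for (int i = 0; i < n; ++i) {
    sum += 1.0 / (double)(i + 1);
  }

  printf("harmonic sum H_%d = %f\n", n, sum);
  return 0;
}
#+END_SRC

Compile with =gcc -fopenmp= and vary =OMP_NUM_THREADS= to control the thread count.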
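For the MPI environment / Hello world item (print before init, print rank, conditional print on rank), a sketch assuming the standard C bindings:

#+BEGIN_SRC c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
  /* Before MPI_Init there is no rank yet: every process prints
     the same anonymous line. */
  printf("before init\n");

  MPI_Init(&argc, &argv);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  /* Every rank prints its identity. */
  printf("hello from rank %d of %d\n", rank, size);

  /* Conditional print: a single designated rank speaks. */
  if (rank == 0)
    printf("rank 0 says: %d ranks are up\n", size);

  MPI_Finalize();
  return 0;
}
#+END_SRC

Launch with e.g. =srun -n 4 ./hello= on a SLURM cluster.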
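For the synchronous/deadlock item: a ring exchange written as a blocking =MPI_Send= followed by =MPI_Recv= on every rank can deadlock when the sends complete synchronously, since all ranks block in the send. =MPI_Sendrecv= pairs the two operations so the library can always make progress:

#+BEGIN_SRC c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  int right = (rank + 1) % size;        /* neighbour to send to */
  int left  = (rank - 1 + size) % size; /* neighbour to receive from */
  int sendbuf = rank, recvbuf = -1;

  /* The combined send+receive cannot deadlock, unlike a naive
     Send-then-Recv on every rank. */
  MPI_Sendrecv(&sendbuf, 1, MPI_INT, right, 0,
               &recvbuf, 1, MPI_INT, left,  0,
               MPI_COMM_WORLD, MPI_STATUS_IGNORE);

  printf("rank %d received %d from rank %d\n", rank, recvbuf, left);

  MPI_Finalize();
  return 0;
}
#+END_SRC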
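For the hybrid MPI init item: hybrid MPI+OpenMP codes initialize with =MPI_Init_thread= and request a thread-support level; =MPI_THREAD_FUNNELED= suffices when only the master thread makes MPI calls. A minimal sketch:

#+BEGIN_SRC c
#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv) {
  int provided;
  /* Request FUNNELED: only the thread that called MPI_Init_thread
     will make MPI calls. Compare `provided` against the request
     before relying on it. */
  MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  #pragma omp parallel
  {
    #pragma omp master
    printf("rank %d: %d threads, thread support level %d\n",
           rank, omp_get_num_threads(), provided);
  }

  MPI_Finalize();
  return 0;
}
#+END_SRC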