* PHYS 743 - Parallel Programming
** General remarks
- QUESTION: Do we have time to visit INJ?
** Tentative agenda
*** Admin intro
**** Projects
*** Basic concepts
**** ssh, scp, rsync
**** Compilation
***** Modules
**** Debugging
*** Architecture
**** Cluster (MPI)
***** Clusters in general
***** At SCITAS
**** Multicore (OpenMP)
**** Single core (SIMD)
*** Optimization
**** Data access
**** Vectorization
**** Basic optimization techniques
*** Performance measurement
**** Key concepts
***** FLOPS, memory bandwidth
***** Timing (speedup, scalings)
**** Profiling
**** Roofline
*** Shared memory (OpenMP)
**** Task parallelism
**** OpenMP terminology / read the spec
**** Fork-join / omp parallel / implicit barriers
**** Exercise: Hello World / SLURM
**** omp parallel for
**** Exercise
**** Race conditions
**** omp critical (synchronization), atomic, accumulation in an array (false sharing)
**** omp private
**** omp reduction
**** Work-sharing constructs
**** OpenMP (new features not covered)
**** Exercise: Poisson
*** Advanced
**** Schedule
**** NUMA / pinning / first touch
**** Collapse
**** Barriers
**** (GPU)
*** Distributed memory (MPI) basics
**** Introduction / read the spec
**** MPI environment / Hello World
***** Print before init
***** Print rank
***** Print rank conditionally
**** MPI terminology
**** Point-to-point
***** Synchronous / deadlock
***** Asynchronous / race condition
**** Collectives
***** Bcast
***** Gather/Scatter
***** Reduce
**** Advanced collectives
***** All variants
***** Gatherv/Scatterv
***** All-to-all
***** Barrier
**** MPI Fortran
***** Bindings
***** Asynchronous arrays
**** Exercise: Poisson
*** Distributed memory (MPI) advanced
**** Derived types
**** (Un)pack
**** Communicators
**** Topologies
**** IO
**** One-sided
**** Persistent
**** Non-blocking collectives
*** Hybrid programming
**** MPI_Init_thread
**** Task/thread repartition
*** Recap
*** Projects
*** Promotion for SCITAS