=====Journal of Lessons, SPD year 2018-2019=====
====Journal====
* 18/02/2019 **Course introduction** -- Parallel programming frameworks and high-level approach to parallel programming over different platforms: MPI, TBB and OpenCL as main examples; course organization and prerequisites; reference books and study material. \\ -- **MPI (Message Passing Interface) standard** -- brief history and aim of the standard, single program / multiple data execution model, compilation and linkage model; issues in supporting multiple programming languages and uses (application, utility library and programming language support) with a static compilation and linkage approach. Portability in parallel programming: functional and non-functional aspects, performance tuning and performance debugging. **MPI basic concepts** -- MPI as a parallel framework that supports a structured approach to parallel programming. Basic concepts of MPI: communicators (definition, purpose, difference between inter- and intra-communicators, process ranks).
* 22/02/2019 **MPI basic concepts** -- Point to point communication (concepts of envelope, local/global completion, blocking/non-blocking primitives, send modes); collective communications (definition, communication scope, global serialization, freedom of implementation in the standard); MPI datatypes (basic meaning and use, primitive / derived datatypes).
* 25/02/2019 **MPI** -- Relationship between MPI datatypes and sequential language types. MPI library initialization and basic MPI usage; point to point communication semantics (buffer behaviour, receive, status objects, MPI_PROC_NULL), basic and derived MPI datatypes (purpose as explicitly defined meta-data provided to the MPI implementation, multiple language bindings, code-instantiated metadata, examples). MPI datatype semantics: typemap and type signature.
* 27/02/2019 **MPI** -- MPI datatypes (matching rules for communication, role in MPI-performed packing and unpacking); core primitives for datatype creation ( MPI_Type_* : contiguous, vector, hvector, indexed, hindexed, struct; commit, free) and examples. ( // Lesson held in room C40 at CNR // )
* 04/03/2019 **MPI** -- Point to point communication modes (MPI_Bsend, MPI_Ssend; MPI_Rsend usage); non-blocking communication (Wait and Test group of primitives, semantics, MPI_Request object handles to active requests); canceling and testing cancellation of non-blocking primitives (issues and pitfalls, interaction with MPI implementation, e.g. MPI_Finalize). Communicators and groups (communicator design aim and programming abstraction, local and global information, groups as local objects, primitives for locally creating and managing groups); intracommunicators (basic primitives concerning size, rank, comparison); communicator creation as a collective operation.
* 06/03/2019 **MPI** -- MPI_Comm_create basic and general case; MPI_Comm_split; MPI collective communications (definition and semantics, execution environment, basic features, agreement of key parameters among the processes, constraints on datatypes and typemaps for collective operations, overall serialization vs synchronization, potential deadlocks).
* 11/03/2019 **MPI Lab** -- Basic program structure. Simple ping-pong example, generalization. Examples with derived datatypes. Structured parallel programming in MPI, separation of concerns in practice. Assigned task for the next Lab session: MPI matrix multiplication.
* 14/03/2019 Rescheduled
* 18/03/2019 **MPI Lab** -- Matrix Multiplication. Implementing communication with assigned asynchronicity degree.
* 21/03/2019 **MPI** Extent and modification for derived datatypes. Collective operation for communication and computation. Taxonomy of MPI collectives (blocking/non-blocking, synchronization/communication/communication+computation, asymmetry of the communication pattern, variable size versions, all- versions). Blocking collectives. MPI collectives with both computation and communication: Reduce (and variants) and Scan (and variants). Using MPI operators with Reduce and Scan. Defining custom user operators, issues and implementation of operator functions.
* 25/03/2019 **MPI Lab** -- Farm skeleton with round robin and dynamic load distribution strategies.
* 28/03/2019 **Applications + MPI Lab** KDD/Data Mining intro; K-means algorithm, parallelization opportunities. Lab session with sequential code and MPI parallelization.
* 08/04/2019 **TBB** -- Threading Building Blocks: C++ template library overview: purpose, abstraction mechanisms and implementation layers (templates, runtime, supported abstractions, use of C++ concepts). **Seminar** Real-Time in Heterogeneous High-performance Systems (Dr. Matteo Andreozzi, ARM) //Co-location of multiple workloads on a single system improves the utilization of its resources, enables resource re-use and improves the efficiency of data sharing across workloads. This, however, comes at the cost of potential performance degradation due to interference on shared resources, and of increased uncertainty in workload performance predictability. The seminar provided an overview of how the System Architecture team in Arm approaches these challenges and how modelling and performance analysis play a central role in the activity.//
* 11/04/2019 Rescheduled
* 15/04/2019 **TBB** TBB introduction. Development history and current status: abstraction and implementation layers. Task-based description vs thread-based one, scalability and automatic management of hierarchical parallelism. Basic abstractions and algorithms.
* 29/04/2019 **TBB** parallel_for: ranges, partitioners, grain size for parallelism exploitation. C++ lambda expressions.
* 02/05/2019 **MPI LAB** Correction and discussion of previously assigned MPI exercises.
* 06/05/2019 **TBB LAB** Basic examples of TBB parallel for. Performing parallel computations on ranges and returning computation results.
* 09/05/2019 (3h 14:30 -- 17:30) **TBB** TBB parallel containers: variants of maps, queues and vectors with higher concurrent performance and modified semantics with respect to STL. **TBB LAB** Mandelbrot set computation with a parallel for.
* 13/05/2019 Rescheduled
* 16/05/2019 (3h 14:30 -- 17:30) **TBB** TBB mutex variants and their usage. Explicitly setting the thread-parallelism level (Task Scheduler, Task Arena). **TBB LAB** Experiments with Mandelbrot parallel for with varying thread count and computation load.
* 20/05/2019 **OpenCL** OpenCL intro. Design concepts and programming abstractions: Devices/host interaction, context, kernel, command queues; execution model; memory spaces and memory consistency in OpenCL.
* 23/05/2019 (3h 14:00 -- 17:00) **OpenCL** OpenCL C/C++ subset for kernels; kernel compilation, program objects, memory objects and kernel arguments, code execution, kernel instances and workgroups, workgroup synchronization; portability and chances for load balancing: mapping OpenCL code onto both the GPU and the CPU; examples of vector types and vector operations. Basic concepts of OpenCL 1 applied to program construction: simple array operations, matrix multiplication and related workgroup optimizations, simple parallel reduction and thread cooperation.
* 27/05/2019 Rescheduled due to EU election break.
* 28/05/2019 **OpenCL** OpenCL 1.2 and beyond. OpenCL event generation and handling (event barriers) for inter-queue, non-local synchronization. OpenCL 2.0 and 2.1 features: shared virtual memory (GPU/CPU memory space overlaying); nested kernels and recursive parallelism without host/device interaction; generic address space as a tool to avoid source code duplication; C11 atomics in OpenCL 2.0; pipes. OpenCL 2.1: moving towards the use of a proper subset of C++14 for kernels (e.g. templates, overloading, lambda functions), allowing single-source joint OpenCL and non-OpenCL programming (SYCL) and providing a more homogeneous and organized semantics. The SPIR-V interoperable, symbolic GPU machine code representation and its use in the LLVM-based development toolset.
* 30/05/2019 (3h 14:00 -- 17:00) **OpenCL LAB** Exercises from the GIT examples: exercises 4 and 5 (vector addition), 6 (matrix multiplication), 7 and 8 (exploiting private memory).
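
As a rough illustration of the K-means parallelization opportunities mentioned for the 28/03 lab, the sketch below shows one iteration organized as a local phase (each worker assigns its block of points and accumulates per-cluster sums and counts) followed by a global combination of the partial results. It is not the course reference code: all names (''Partial'', ''nearest'', ''local_step'', ''kmeans_iteration'') are hypothetical, points are one-dimensional to keep it short, and the workers are simulated by a sequential loop. In an MPI version the element-wise summation would presumably be carried out by an all-reduce with a sum operator (e.g. MPI_Allreduce with MPI_SUM).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Per-cluster partial results computed by one worker on its block of points.
struct Partial {
    std::vector<double> sum;   // sum of the points assigned to each cluster
    std::vector<long> count;   // number of points assigned to each cluster
};

// Index of the centroid closest to x (1-D points: an assumption of this sketch).
std::size_t nearest(double x, const std::vector<double>& centroids) {
    std::size_t best = 0;
    for (std::size_t c = 1; c < centroids.size(); ++c)
        if (std::fabs(x - centroids[c]) < std::fabs(x - centroids[best]))
            best = c;
    return best;
}

// Local phase: what a single worker would do on its own block [begin, end).
Partial local_step(const std::vector<double>& pts, std::size_t begin,
                   std::size_t end, const std::vector<double>& centroids) {
    const std::size_t k = centroids.size();
    Partial p{std::vector<double>(k, 0.0), std::vector<long>(k, 0L)};
    for (std::size_t i = begin; i < end; ++i) {
        const std::size_t c = nearest(pts[i], centroids);
        p.sum[c] += pts[i];
        p.count[c] += 1;
    }
    return p;
}

// One K-means iteration over `workers` simulated workers.
// The element-wise accumulation into `total` stands in for a sum all-reduce.
std::vector<double> kmeans_iteration(const std::vector<double>& pts,
                                     std::vector<double> centroids,
                                     std::size_t workers) {
    assert(workers > 0 && !centroids.empty());
    const std::size_t k = centroids.size();
    Partial total{std::vector<double>(k, 0.0), std::vector<long>(k, 0L)};
    for (std::size_t w = 0; w < workers; ++w) {
        // Block distribution of the points among the workers.
        const std::size_t begin = pts.size() * w / workers;
        const std::size_t end = pts.size() * (w + 1) / workers;
        const Partial p = local_step(pts, begin, end, centroids);
        for (std::size_t c = 0; c < k; ++c) {
            total.sum[c] += p.sum[c];
            total.count[c] += p.count[c];
        }
    }
    // Every worker could recompute the centroids from the combined totals;
    // empty clusters keep their previous centroid.
    for (std::size_t c = 0; c < k; ++c)
        if (total.count[c] > 0)
            centroids[c] = total.sum[c] / static_cast<double>(total.count[c]);
    return centroids;
}
```

Calling ''kmeans_iteration(pts, centroids, workers)'' repeatedly until the centroids stop changing gives the usual iterative scheme. Since only the per-cluster sums and counts are exchanged, the amount of data combined at each iteration depends on the number of clusters and not on the number of points.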
====Slides, Notes and References to papers====
^ Date ^ Slides ^ Notes ^ References / Info |
| 18/02 | {{ :magistraleinformaticanetworking:spd:2019:spd_feb_2019.pdf |Course introduction}} | | |
| 18/02, 22/02, 25/02, 27/02 | {{ :magistraleinformaticanetworking:spd:2019:mpi-lesson1.pdf |MPI lesson 1}} {{ :magistraleinformaticanetworking:spd:2019:mpi-lesson2.pdf |MPI lesson 2}} | | |
| 27/02 | {{ :magistraleinformaticanetworking:spd:2019:mpi-lab.pdf |MPI Lab}} | | |
| 04/03 | {{ :magistraleinformaticanetworking:spd:2019:mpi-lesson3.pdf |MPI lesson 3}} | | |
| 04/03, 06/03 | {{ :magistraleinformaticanetworking:spd:2019:mpi-lesson4.pdf |MPI lesson 4}} | | |
| 06/03, 13/03 | {{ :magistraleinformaticanetworking:spd:mpi-lesson5.pdf |MPI lesson 5}} | | |
| 11/03, 18/03, 25/03 | {{ :magistraleinformaticanetworking:spd:2019:exercise-collected.pdf |Collection of exercises}} | The list will be updated for future Lab sessions | |
| 21/03 | {{ :magistraleinformaticanetworking:spd:2019:mpi-lesson6.pdf |MPI Lesson 6}} | | |
| 25/03, 28/03 | | {{ :magistraleinformaticanetworking:spd:spd13-14-paralleldatamining_notes_ch2_3.pdf | Introductory notes about Data Mining}} {{ :magistraleinformaticanetworking:spd:spd11-12-dhillon-modha-corretto_parkmeans.ps |Dhillon and Modha Tech.R. on K-means}} {{ :magistraleinformaticanetworking:spd:2016:k-means3.tgz |Sequential reference code for K-means}} | |
| TBB - 04-05/2019 | {{ :magistraleinformaticanetworking:spd:2019:tbb-lesson1.pdf |TBB lesson 1}} {{ :magistraleinformaticanetworking:spd:2019:tbb-lesson2.pdf |TBB lesson 2}} {{ :magistraleinformaticanetworking:spd:2019:tbb-lesson3.pdf |TBB lesson 3}} {{ :magistraleinformaticanetworking:spd:2019:tbb-lesson4.pdf |TBB lesson 4}} | These presentations cover the TBB lessons spread over April and May. TBB exercises have been added to the exercise collection {{ :magistraleinformaticanetworking:spd:2019:exercise-collected.pdf |Collection of exercises}} | |
| OpenCL 20/05, 23/05 | {{ :magistraleinformaticanetworking:spd:2015:opencl-intro-tim-mattson.pdf |OpenCL 1.0 intro and tutorial}} | | [[https://www.khronos.org/sycl/ | About SYCL on the Khronos portal]] [[https://www.khronos.org/registry/SYCL/| See SYCL specification 1.2. from April 2019]]|
| OpenCL 28/05, 30/05 | {{ :magistraleinformaticanetworking:spd:2018:opencl_e_survey.pdf |OpenCL changes from 1.0 to 2.0}} {{ :magistraleinformaticanetworking:spd:2019:ocl_g_opencl_sycl_2019.pdf |OpenCL 2019 IWOCL Keynote}} | For the OpenCL lab session, on the latest OSX remember to define CL_SILENCE_DEPRECATION to avoid spurious warnings | [[https://github.com/HandsOnOpenCL/Exercises-Solutions.git| GIT URL to HandsOnOpenCL Exercises]] |