Lecture Date | Topic | Slides | Reading Assignment / Code |
---|---|---|---|
Jan 18 | Introduction | Course Motivation and Organization | MATMUL code (various versions) |
Jan 21 | Bentley Rules for Optimizing Work (Writing Efficient Programs by Jon Bentley) | Bentley Rules | |
Jan 22 | Assembly Level Optimizations | Assembly | fib.c Buffer Overflow |
Jan 23 | Optimization Blockers LS 2 | Optimizations1 Optimizations2 | code |
Jan 24 | Profiling Tools LS 3 (Branchless Merge sort vs Quick sort) | Profiling | Branch Prediction links Pipelining |
Feb 4 | Benchmarking (issues with accurately timing code); nano-level benchmarking: discussion of the X-Ray paper | X-Ray presentation | |
Feb 5 | Vectorization (SIMD programming) | SIMD Programming Compiler Vectorization | IACA User Guide |
Feb 19 | Memory Locality Optimizations | Operational Intensity Caches Introduction 1 Caches Introduction 2 Caches Stride Analysis Caches Models and Program Transformations | |
Mar 12 | Intel MIC (Xeon Phi) Based Programming | MIC (Xeon Phi Coprocessor) Native Computing (& Optimization) Symmetric Computing Offloading | |
Mar 14 | Fooling the Masses with Performance Results | Original Article Latest Stunts Fool with GPUs | |
Mar 18 | GPU Programming | Lecture on CUDA | video |