Parallel & Distributed Computer Systems HW3
January, 2025
Write a program that sorts $N$ integers in ascending order, using CUDA.
The program must perform the following tasks:

- The user specifies a positive integer $q$.
- Start a process with an array of $N = 2^q$ random integers.
- Sort all $N$ elements in ascending order.
- Check the correctness of the final result.
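
The four tasks above can be sketched as a small host-side driver. This is only a minimal sketch: `sortOnGpu` is a hypothetical placeholder name (here it simply calls `std::sort` so the file runs as is), and the default value of $q$, the fixed seed, and the upper bound on $q$ are assumptions made for illustration. The check compares against a serial reference sort, which verifies both the ordering and that no element was lost.

```cuda
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical placeholder for the GPU sort; std::sort stands in for it here.
static void sortOnGpu(std::vector<int>& a) { std::sort(a.begin(), a.end()); }

int main(int argc, char** argv) {
    int q = (argc > 1) ? atoi(argv[1]) : 20;       // user-specified exponent (default is an assumption)
    if (q < 1 || q > 27) {
        fprintf(stderr, "usage: %s q   (1 <= q <= 27)\n", argv[0]);
        return 1;
    }
    size_t n = size_t(1) << q;                     // N = 2^q

    std::vector<int> data(n);
    srand(1234);                                   // fixed seed, so runs are repeatable
    for (int& x : data) x = rand();

    std::vector<int> reference = data;             // serial result to compare against
    std::sort(reference.begin(), reference.end());

    sortOnGpu(data);

    bool ok = (data == reference);                 // same elements, same order
    printf("q = %d, N = %zu: %s\n", q, n, ok ? "PASSED" : "FAILED");
    return ok ? 0 : 1;
}
```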
Your implementation should be based on the following steps:
V0. A kernel where each thread only compares and exchanges. This "eliminates" the 1:n innermost loop. Easy to write, but too many function calls and global synchronizations.
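
A minimal sketch of what V0 might look like, assuming the compare-exchange network is a bitonic sort (the text above does not name the algorithm). The kernel name `stepV0`, the block size of 256, and the default $q$ are illustrative assumptions. The thread index replaces the loop over the $N$ elements, and the host issues one kernel launch per step; consecutive launches on the same stream run in order, which provides the global synchronization.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// One compare-exchange per pair for a fixed (k, j) step of the bitonic network.
__global__ void stepV0(int* data, unsigned n, unsigned j, unsigned k) {
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    unsigned partner = i ^ j;
    if (partner > i) {                        // the lower index of each pair does the work
        bool ascending = (i & k) == 0;        // sort direction of this subsequence
        int a = data[i], b = data[partner];
        if ((a > b) == ascending) { data[i] = b; data[partner] = a; }
    }
}

int main(int argc, char** argv) {
    int q = (argc > 1) ? atoi(argv[1]) : 16;
    if (q < 1 || q > 27) { fprintf(stderr, "q must be in [1, 27]\n"); return 1; }
    unsigned n = 1u << q;
    std::vector<int> h(n);
    srand(1234);
    for (int& x : h) x = rand();

    int* d = nullptr;
    cudaMalloc(&d, n * sizeof(int));
    cudaMemcpy(d, h.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    const unsigned threads = 256, blocks = (n + threads - 1) / threads;
    for (unsigned k = 2; k <= n; k <<= 1)             // size of the bitonic sequences
        for (unsigned j = k >> 1; j > 0; j >>= 1)     // distance between partners
            stepV0<<<blocks, threads>>>(d, n, j, k);  // one launch per step

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(h.data(), d, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);

    bool ok = true;
    for (unsigned i = 1; i < n; ++i) ok = ok && (h[i - 1] <= h[i]);
    printf("V0, N = %u: %s\n", n, ok ? "sorted" : "NOT sorted");
    return ok ? 0 : 1;
}
```

For $N = 2^q$ this issues $q(q+1)/2$ launches, which is the overhead the step description refers to.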
V1. Include the k inner loop in the kernel function. How do we handle the synchronization? Fewer calls, fewer global synchronizations. Faster than V0!
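
One possible reading of V1, again assuming a bitonic network: `__syncthreads()` only synchronizes the threads of one block, so the inner loop can move into the kernel only for partner distances smaller than the block size, where both elements of every pair belong to the same block. Larger distances still use one launch per step. The kernel names `stepGlobal` and `stepsFusedV1` and the block size are assumptions for this sketch.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

__device__ void compareExchange(int* data, unsigned i, unsigned j, unsigned k) {
    unsigned partner = i ^ j;
    if (partner > i) {
        bool ascending = (i & k) == 0;
        int a = data[i], b = data[partner];
        if ((a > b) == ascending) { data[i] = b; data[partner] = a; }
    }
}

// Single step, used while partners may live in different blocks (j >= blockDim.x).
__global__ void stepGlobal(int* data, unsigned j, unsigned k) {
    compareExchange(data, blockIdx.x * blockDim.x + threadIdx.x, j, k);
}

// All remaining steps of stage k; requires jStart < blockDim.x.
__global__ void stepsFusedV1(int* data, unsigned jStart, unsigned k) {
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    for (unsigned j = jStart; j > 0; j >>= 1) {
        compareExchange(data, i, j, k);
        __syncthreads();                  // block-wide barrier between steps
    }
}

int main(int argc, char** argv) {
    int q = (argc > 1) ? atoi(argv[1]) : 16;
    if (q < 1 || q > 27) { fprintf(stderr, "q must be in [1, 27]\n"); return 1; }
    unsigned n = 1u << q;
    std::vector<int> h(n);
    srand(1234);
    for (int& x : h) x = rand();

    int* d = nullptr;
    cudaMalloc(&d, n * sizeof(int));
    cudaMemcpy(d, h.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    // Power-of-two block size, so the grid covers the array exactly.
    const unsigned threads = n < 256 ? n : 256, blocks = n / threads;
    for (unsigned k = 2; k <= n; k <<= 1) {
        unsigned j = k >> 1;
        for (; j >= threads; j >>= 1)                 // pairs cross block boundaries
            stepGlobal<<<blocks, threads>>>(d, j, k);
        stepsFusedV1<<<blocks, threads>>>(d, j, k);   // here 0 < j < threads
    }

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(h.data(), d, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);

    bool ok = true;
    for (unsigned i = 1; i < n; ++i) ok = ok && (h[i - 1] <= h[i]);
    printf("V1, N = %u: %s\n", n, ok ? "sorted" : "NOT sorted");
    return ok ? 0 : 1;
}
```

The barrier sits outside any divergent branch, so every thread of the block reaches it in every iteration.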
V2. Modify the kernel of V1 to work with local (on-chip shared) memory instead of global memory.
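
A hedged sketch of V2, assuming "local" refers to CUDA shared memory and that the network is bitonic: each block copies its chunk into a shared tile, runs the fused steps there, and writes the chunk back once. The names `stepsFusedV2` and `tile`, and the block size, are illustrative assumptions.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Single step in global memory, for partner distances j >= blockDim.x.
__global__ void stepGlobal(int* data, unsigned j, unsigned k) {
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned partner = i ^ j;
    if (partner > i) {
        bool ascending = (i & k) == 0;
        int a = data[i], b = data[partner];
        if ((a > b) == ascending) { data[i] = b; data[partner] = a; }
    }
}

// Remaining steps of stage k in shared memory; requires jStart < blockDim.x.
__global__ void stepsFusedV2(int* data, unsigned jStart, unsigned k) {
    extern __shared__ int tile[];             // blockDim.x ints, sized at launch
    unsigned t = threadIdx.x;
    unsigned i = blockIdx.x * blockDim.x + t;
    tile[t] = data[i];                        // one global read per element
    __syncthreads();
    bool ascending = (i & k) == 0;            // direction depends on the global index
    for (unsigned j = jStart; j > 0; j >>= 1) {
        unsigned p = t ^ j;                   // partner inside the tile
        if (p > t) {
            int a = tile[t], b = tile[p];
            if ((a > b) == ascending) { tile[t] = b; tile[p] = a; }
        }
        __syncthreads();
    }
    data[i] = tile[t];                        // one global write per element
}

int main(int argc, char** argv) {
    int q = (argc > 1) ? atoi(argv[1]) : 16;
    if (q < 1 || q > 27) { fprintf(stderr, "q must be in [1, 27]\n"); return 1; }
    unsigned n = 1u << q;
    std::vector<int> h(n);
    srand(1234);
    for (int& x : h) x = rand();

    int* d = nullptr;
    cudaMalloc(&d, n * sizeof(int));
    cudaMemcpy(d, h.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    const unsigned threads = n < 256 ? n : 256, blocks = n / threads;
    const size_t tileBytes = threads * sizeof(int);
    for (unsigned k = 2; k <= n; k <<= 1) {
        unsigned j = k >> 1;
        for (; j >= threads; j >>= 1)
            stepGlobal<<<blocks, threads>>>(d, j, k);
        stepsFusedV2<<<blocks, threads, tileBytes>>>(d, j, k);
    }

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(h.data(), d, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);

    bool ok = true;
    for (unsigned i = 1; i < n; ++i) ok = ok && (h[i - 1] <= h[i]);
    printf("V2, N = %u: %s\n", n, ok ? "sorted" : "NOT sorted");
    return ok ? 0 : 1;
}
```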
You must deliver:

- A report (about 3-4 pages) that describes your parallel algorithm and implementation.
- Your comments on the speed of your parallel program compared to the serial sort, after trying your program on aristotelis for $q = [20:27]$.
- The source code of your program uploaded online.
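
For the speed comparison requested above, one way to collect timings is sketched below. It is only an illustration under stated assumptions: the GPU side is a one-launch-per-step bitonic kernel (`step` is a hypothetical name), `std::sort` stands in for the serial sort, and only kernel time is measured, not host-device transfers.

```cuda
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

__global__ void step(int* data, unsigned j, unsigned k) {
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned partner = i ^ j;
    if (partner > i) {
        bool ascending = (i & k) == 0;
        int a = data[i], b = data[partner];
        if ((a > b) == ascending) { data[i] = b; data[partner] = a; }
    }
}

int main(int argc, char** argv) {
    int q = (argc > 1) ? atoi(argv[1]) : 20;
    if (q < 1 || q > 27) { fprintf(stderr, "q must be in [1, 27]\n"); return 1; }
    unsigned n = 1u << q;
    std::vector<int> h(n), fromGpu(n);
    srand(1234);
    for (int& x : h) x = rand();

    int* d = nullptr;
    cudaMalloc(&d, n * sizeof(int));
    cudaMemcpy(d, h.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const unsigned threads = n < 256 ? n : 256, blocks = n / threads;
    cudaEventRecord(start);                           // GPU timer starts
    for (unsigned k = 2; k <= n; k <<= 1)
        for (unsigned j = k >> 1; j > 0; j >>= 1)
            step<<<blocks, threads>>>(d, j, k);
    cudaEventRecord(stop);                            // GPU timer stops
    cudaEventSynchronize(stop);
    float gpuMs = 0.0f;
    cudaEventElapsedTime(&gpuMs, start, stop);
    cudaMemcpy(fromGpu.data(), d, n * sizeof(int), cudaMemcpyDeviceToHost);

    auto t0 = std::chrono::steady_clock::now();
    std::sort(h.begin(), h.end());                    // serial baseline (assumed)
    auto t1 = std::chrono::steady_clock::now();
    double cpuMs = std::chrono::duration<double, std::milli>(t1 - t0).count();

    bool ok = (fromGpu == h);                         // GPU result matches the serial one
    printf("q = %d  GPU %.3f ms  CPU %.3f ms  speedup %.2fx  %s\n",
           q, gpuMs, cpuMs, cpuMs / gpuMs, ok ? "PASSED" : "FAILED");

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    return ok ? 0 : 1;
}
```

Whether transfer time should count toward the parallel time is a choice worth stating explicitly in the report.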
Ethics: If you use code found on the web or generated by an LLM, you should cite your source and describe the changes you made. You may work in pairs; both partners must submit a single report with both names.
Deadline: 2 February 2025.