18 Questions
What is the term used in CUDA to refer to a function that is run by all the threads in a grid?
Kernel
In CUDA, threads are organized in blocks, and blocks are organized in what structure?
Grids
What determines the sizes of the blocks and grids in CUDA programming?
Device capabilities
What type of vector does the CUDA-supplied dim3 represent?
Integer vector of three elements
How can a 1D grid made up of five blocks, each with 16 threads be invoked in CUDA programming?
foo<<<5, 16>>>();
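A launch of a 1D grid of five blocks with 16 threads each can be sketched as follows. The kernel body, the `out` parameter, and the host wrapper `launchFoo` are assumptions added for illustration; only the kernel name `foo` and the grid and block sizes come from the question.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel body: each thread writes its global index.
__global__ void foo(int *out)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    out[id] = id;
}

// Hypothetical host wrapper: launches foo on 5 blocks of 16 threads and
// returns how many threads wrote the expected value, or -1 on a CUDA error.
int launchFoo()
{
    const int numBlocks = 5;
    const int threadsPerBlock = 16;
    const int n = numBlocks * threadsPerBlock;   // 80 threads in total

    int h_out[n] = {0};
    int *d_out = nullptr;
    if (cudaMalloc((void **)&d_out, n * sizeof(int)) != cudaSuccess) return -1;

    dim3 grid(numBlocks);          // unspecified y and z dimensions default to 1
    dim3 block(threadsPerBlock);
    foo<<<grid, block>>>(d_out);   // same configuration as foo<<<5, 16>>>(d_out)

    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int ok = 0;
    for (int i = 0; i < n; ++i)
        if (h_out[i] == i) ++ok;
    return ok;
}
```

Passing plain integers in the execution configuration works because they are implicitly converted to `dim3` values whose remaining dimensions are 1.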
Which aspect of a program must be decomposed into a large number of threads to properly utilize a GPU?
The program itself
What is the primary reason for a programmer to understand how threads and warps are executed on a GPU?
To optimize performance by minimizing thread divergence
What happens when threads within a warp diverge due to a conditional operation?
The divergent paths are evaluated sequentially
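A minimal sketch of divergence, assuming a warp size of 32 threads: in the first kernel the branch depends on the thread's position inside its warp, so each warp has to run both paths one after the other; in the second the branch depends on the warp index, so all threads of a warp take the same path. Both kernels and the wrapper `runDivergentBranch` are hypothetical names introduced for illustration.

```cuda
#include <cuda_runtime.h>

// Even and odd threads of the same warp disagree on the condition, so the
// warp evaluates the two paths sequentially with part of its threads idle.
__global__ void divergentBranch(int *out)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (threadIdx.x % 2 == 0)
        out[id] = 2 * id;
    else
        out[id] = id + 100;
}

// The condition is uniform across each warp, so no warp diverges.
__global__ void warpAlignedBranch(int *out)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if ((threadIdx.x / warpSize) % 2 == 0)
        out[id] = 2 * id;
    else
        out[id] = id + 100;
}

// Hypothetical host wrapper: runs divergentBranch on one block of 64 threads
// and returns the number of correct results, or -1 on a CUDA error.
int runDivergentBranch()
{
    const int n = 64;
    int h_out[n] = {0};
    int *d_out = nullptr;
    if (cudaMalloc((void **)&d_out, n * sizeof(int)) != cudaSuccess) return -1;

    divergentBranch<<<1, n>>>(d_out);

    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int ok = 0;
    for (int i = 0; i < n; ++i) {
        int expected = (i % 2 == 0) ? 2 * i : i + 100;
        if (h_out[i] == expected) ++ok;
    }
    return ok;
}
```

Divergence affects only execution time, not correctness: the divergent kernel still produces the intended values.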
Which statement best describes the relationship between GPU memory and host memory?
GPU memory and host memory are completely separate
In the context of CUDA programming, what is the significance of operation atomicity?
It maintains data consistency when multiple threads modify shared memory
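As an illustration of atomicity, the sketch below has many threads increment a single counter with `atomicAdd`. The kernel `countAbove`, the wrapper `countAboveOnDevice`, and the choice of a counter in global memory are assumptions made for this example.

```cuda
#include <cuda_runtime.h>

// Each thread inspects one element; matching threads bump a common counter.
__global__ void countAbove(const int *data, int n, int threshold, int *counter)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id < n && data[id] > threshold)
        atomicAdd(counter, 1);   // a plain (*counter)++ here would be a data race
}

// Hypothetical host wrapper: returns the number of elements greater than
// threshold, or -1 on a CUDA error.
int countAboveOnDevice(const int *h_data, int n, int threshold)
{
    int *d_data = nullptr;
    int *d_counter = nullptr;
    int result = -1;

    if (cudaMalloc((void **)&d_data, n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc((void **)&d_counter, sizeof(int)) != cudaSuccess) {
        cudaFree(d_data);
        return -1;
    }

    cudaMemcpy(d_data, h_data, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_counter, 0, sizeof(int));

    const int threadsPerBlock = 256;
    const int numBlocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    countAbove<<<numBlocks, threadsPerBlock>>>(d_data, n, threshold, d_counter);

    if (cudaMemcpy(&result, d_counter, sizeof(int),
                   cudaMemcpyDeviceToHost) != cudaSuccess)
        result = -1;

    cudaFree(d_data);
    cudaFree(d_counter);
    return result;
}
```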
What is the primary reason why a programmer cannot directly pass a pointer to an array in the host's memory to a CUDA kernel?
GPU memory and host memory are separate and disjoint
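Because the two memory spaces are disjoint, a typical pattern is to allocate a device buffer, copy the host array into it, run the kernel on the device pointer, and copy the result back. The sketch below assumes a hypothetical `scale` kernel and a hypothetical wrapper `scaleOnDevice`; the runtime calls (`cudaMalloc`, `cudaMemcpy`, `cudaFree`) are the standard CUDA runtime API.

```cuda
#include <cuda_runtime.h>

// Multiplies every element of a device array by factor.
__global__ void scale(float *data, int n, float factor)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id < n)
        data[id] *= factor;
}

// Hypothetical host wrapper: scales h_data in place via the GPU.
// Returns 0 on success, -1 on a CUDA error.
int scaleOnDevice(float *h_data, int n, float factor)
{
    float *d_data = nullptr;   // device pointer, not dereferenceable on the host
    size_t bytes = n * sizeof(float);

    if (cudaMalloc((void **)&d_data, bytes) != cudaSuccess) return -1;

    // Host -> device copy; the kernel never sees h_data itself.
    if (cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(d_data);
        return -1;
    }

    const int threadsPerBlock = 128;
    const int numBlocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    scale<<<numBlocks, threadsPerBlock>>>(d_data, n, factor);

    // Device -> host copy; this call also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return (err == cudaSuccess) ? 0 : -1;
}
```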
What percentage of multiprocessors would be idle during the execution of the last warp of each block, according to the text?
87.5%
What type of memory allocation is needed when shared memory requirements can only be calculated at run-time?
Dynamic allocation
What is the purpose of the third parameter in the execution configuration's alternative syntax?
To specify the size of shared memory to be reserved
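A minimal sketch of dynamically sized shared memory: the kernel declares an unsized `extern __shared__` array, and the third execution-configuration parameter supplies its size in bytes at launch time. The kernel `reverseBlock` and the wrapper `reverseOnDevice` are hypothetical, and the example assumes the array fits in a single block (at most 1024 elements).

```cuda
#include <cuda_runtime.h>

// Reverses an array of n elements using one block and a shared-memory tile
// whose size is only known at launch time.
__global__ void reverseBlock(int *data, int n)
{
    extern __shared__ int tile[];   // sized by the third launch parameter
    int t = threadIdx.x;
    if (t < n) tile[t] = data[t];
    __syncthreads();                // all loads must finish before any store
    if (t < n) data[t] = tile[n - 1 - t];
}

// Hypothetical host wrapper: reverses h_data in place.
// Returns 0 on success, -1 on a CUDA error or an unsupported size.
int reverseOnDevice(int *h_data, int n)
{
    if (n <= 0 || n > 1024) return -1;   // single-block sketch only

    int *d_data = nullptr;
    size_t bytes = n * sizeof(int);
    if (cudaMalloc((void **)&d_data, bytes) != cudaSuccess) return -1;
    cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);

    // <<<grid, block, sharedBytes>>>: reserve `bytes` of shared memory per block.
    reverseBlock<<<1, n, bytes>>>(d_data, n);

    cudaError_t err = cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return (err == cudaSuccess) ? 0 : -1;
}
```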
In the given example of calculating a histogram for a grayscale image, what is the maximum number of categories (bins) allowed?
256
What is the key difference between the CUDA solution and the multithreaded solution for the histogram calculation problem?
The CUDA solution uses implicit data partitioning and coalesced memory accesses
What is the purpose of using a stride in the CUDA solution for the histogram calculation problem?
To coalesce memory accesses and cover all data
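The stride idea can be sketched with a simple global-memory histogram kernel. This is not necessarily the version the text uses: the kernel `histogram`, the wrapper `histogramOnDevice`, the one-byte-per-pixel layout, and the 4 x 128 launch configuration are all assumptions for illustration. Neighbouring threads read neighbouring pixels in each iteration, and the stride equal to the total thread count lets a fixed-size grid cover an image of any length.

```cuda
#include <cuda_runtime.h>

#define BINS 256   // one bin per 8-bit gray level

__global__ void histogram(const unsigned char *img, int n, unsigned int *hist)
{
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;   // total number of threads in the grid

    // Thread gid handles pixels gid, gid + stride, gid + 2 * stride, ...
    for (int i = gid; i < n; i += stride)
        atomicAdd(&hist[img[i]], 1u);      // bins are shared by all threads
}

// Hypothetical host wrapper: fills h_hist (BINS entries) from h_img (n pixels).
// Returns 0 on success, -1 on a CUDA error.
int histogramOnDevice(const unsigned char *h_img, int n, unsigned int *h_hist)
{
    unsigned char *d_img = nullptr;
    unsigned int *d_hist = nullptr;

    if (cudaMalloc((void **)&d_img, n) != cudaSuccess) return -1;
    if (cudaMalloc((void **)&d_hist, BINS * sizeof(unsigned int)) != cudaSuccess) {
        cudaFree(d_img);
        return -1;
    }

    cudaMemcpy(d_img, h_img, n, cudaMemcpyHostToDevice);
    cudaMemset(d_hist, 0, BINS * sizeof(unsigned int));

    histogram<<<4, 128>>>(d_img, n, d_hist);   // 512 threads, so stride = 512

    cudaError_t err = cudaMemcpy(h_hist, d_hist, BINS * sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_img);
    cudaFree(d_hist);
    return (err == cudaSuccess) ? 0 : -1;
}
```

Data partitioning is implicit here: no thread is handed an explicit range, and each one derives its pixels from its own index and the stride.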
Which of the following best describes the concept of coalesced memory accesses in the context of CUDA programming?
Threads accessing contiguous memory locations simultaneously
Learn about how to properly utilize a GPU by decomposing programs into threads that run concurrently. Understand how GPU schedulers execute threads with minimum switching overhead under various configurations. Explore the hierarchy of threads organized in blocks and grids in CUDA programming.