Understanding GPU Architecture With Cornell
Written by Nikos Vaggalis   
Friday, 11 April 2025

Find out everything there is to know about GPUs. This Cornell Virtual Workshop will be helpful for those who program in CUDA.

Cornell's Virtual Workshop is a learning platform designed to enhance the computational science skills of researchers.
It comprises many Roadmaps dedicated to various Computer Science topics, grouped by category. Each roadmap consists of about 5-10 topics listed in a suggested order. Here are a few:

Programming Languages

  • Introduction to C Programming
  • Introduction to Fortran Programming
  • Introduction to Python Programming
  • Introduction to R

Introduction to Advanced Cluster Architectures

  • Understanding GPU Architecture
  • Running Applications

Parallel Computing

  • Concepts
  • Parallel Programming Concepts and High-Performance Computing
  • Scalability
  • MPI

AI, Machine Learning, Data Science

  • An Overview of AI
  • Python for Data Science
  • AI with Deep Learning
  • Visualization

Large Data Visualization

  • ParaView
  • ParaView - Advanced
  • Interactive Data Visualization with Bokeh

From those we've picked "Understanding GPU Architecture" to look into, since here at I Programmer we have a particular interest in CUDA programming.

This roadmap focuses on preparing application programs to run on GPUs by laying out the main features of GPU hardware design. It fits perfectly with the other resources we have covered recently, "Three NVIDIA CUDA Programming Super Resources",
"LeetGPU - The CUDA Challenges" and, in particular, "Demystifying GPU Terminology".

CUDA is of course NVIDIA's toolkit and programming model, which provides a development environment for speeding up computing applications by harnessing the power of GPUs. But the problem when working with GPUs is that:

the documentation is fragmented, making it difficult to connect concepts at different levels of the stack, like Streaming Multiprocessor Architecture, Compute Capability, and nvcc compiler flags.

As such, the people at Modal created the GPU Glossary ("Demystifying GPU Terminology") to collect, amend and present that information, going to great lengths by gathering material from official documentation and dedicated Discord channels, and even trawling through old-fashioned books.

This virtual workshop follows a similar pattern to the Glossary, looking into:

  • The main architectural features of GPUs and how they differ from comparable features of CPUs
  • The implications for how programs are constructed for General-Purpose computing on GPUs (GPGPU), and what kinds of software ought to work well on these devices
  • The names, sizes, and speeds of the computational and memory components of specific models of NVIDIA GPU devices

The workshop is organized around the following sections:

GPU Characteristics
Goes into the hardware design of graphics processing units (GPUs), which is optimized for highly parallel processing. As a result, application programs for GPUs rely on programming models, like NVIDIA CUDA, that can differ substantially from traditional serial programming models based on CPUs.
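To make that contrast concrete, here's a minimal vector-addition sketch of our own, not taken from the workshop, in which each GPU thread computes a single element rather than a CPU loop iterating over all of them (the kernel and variable names are purely illustrative):

    // vecadd.cu - each thread handles one array element in parallel
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void vecAdd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
        if (i < n) c[i] = a[i] + b[i];                  // one element per thread
    }

    int main() {
        const int n = 1 << 20;
        float *a, *b, *c;
        // Managed (unified) memory keeps the sketch short; production code
        // often uses explicit cudaMalloc/cudaMemcpy instead.
        cudaMallocManaged(&a, n * sizeof(float));
        cudaMallocManaged(&b, n * sizeof(float));
        cudaMallocManaged(&c, n * sizeof(float));
        for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

        int threads = 256;
        int blocks = (n + threads - 1) / threads;   // enough blocks to cover n
        vecAdd<<<blocks, threads>>>(a, b, c, n);
        cudaDeviceSynchronize();                    // wait for the GPU to finish

        printf("c[0] = %f\n", c[0]);                // expect 3.0
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }

Compiled with nvcc (e.g. nvcc vecadd.cu -o vecadd), the same source expresses work for thousands of threads at once, which is exactly the shift in mindset this section of the roadmap addresses.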

GPU Memory
GPUs require data to be in registers before it is available for computations. This topic looks at the sizes and properties of the different elements of the GPU's memory hierarchy and how they compare to those found in CPUs.
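As an illustration of that hierarchy (again our own sketch, not the workshop's), the kernel below stages data from slow, off-chip global memory into fast, on-chip shared memory before reducing it, while the per-thread scalars live in registers:

    // blockSum computes one partial sum per thread block.
    // Launch with 256 threads per block, e.g. blockSum<<<blocks, 256>>>(...).
    __global__ void blockSum(const float *in, float *out, int n) {
        __shared__ float tile[256];                   // on-chip, shared by the block
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;   // global -> shared
        __syncthreads();                              // wait until the tile is full

        // Tree reduction: intermediate values never leave on-chip memory.
        for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
            if (threadIdx.x < stride)
                tile[threadIdx.x] += tile[threadIdx.x + stride];
            __syncthreads();
        }
        if (threadIdx.x == 0) out[blockIdx.x] = tile[0];  // shared -> global
    }

The point of the staging is bandwidth: each input value is read from global memory only once, while the many intermediate accesses hit shared memory and registers instead.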

GPU Examples: Tesla V100 and Frontera: RTX 5000
At the actual hardware level, what does a particular GPU consist of, if one peeks "under the hood"? Sometimes the best way to learn about a certain type of device is to consider one or two concrete examples. First we'll look at the Tesla V100, one of the NVIDIA models that has been favored for HPC applications, and then we'll do a similar deep dive into the Quadro RTX 5000, the GPU found in TACC's Frontera supercomputer.

Exercises
These show you how to interrogate NVIDIA devices so that you can determine certain properties of the hardware.
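To give a flavor of those exercises, here is a short stand-alone sketch, ours rather than the workshop's, that uses the CUDA runtime call cudaGetDeviceProperties to report each device's key characteristics:

    // devquery.cu - list the NVIDIA devices visible to the CUDA runtime
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        cudaGetDeviceCount(&count);
        for (int d = 0; d < count; d++) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, d);
            printf("Device %d: %s\n", d, prop.name);
            printf("  Compute capability: %d.%d\n", prop.major, prop.minor);
            printf("  Multiprocessors:    %d\n", prop.multiProcessorCount);
            printf("  Global memory:      %zu MiB\n", prop.totalGlobalMem >> 20);
            printf("  Shared mem/block:   %zu KiB\n", prop.sharedMemPerBlock >> 10);
        }
        return 0;
    }

On a system with a V100, for instance, a query like this is how you would confirm the SM count and memory sizes discussed in the hardware deep dives above.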

There are no specific requirements for this roadmap; however, access to Frontera may be helpful, or to any computer that hosts an NVIDIA GPU and has the CUDA Toolkit installed. If you don't have access to any of those, fear not, since LeetGPU has got you covered. LeetGPU, which we have already covered, is a platform where you can write and test CUDA code without setting anything up, just using your browser. Its makers, AlphaGPU, aim to "democratize access to hardware-accelerated languages" by emulating GPUs on CPUs using open-source simulators.

The roadmap does one thing and it does it right: it deconstructs a complex topic into its building blocks to convey its concepts as clearly and simply as possible.

Alongside it, another roadmap, Parallel Programming Concepts and High-Performance Computing, could be considered a companion for those who seek to expand their knowledge of parallel computing in general, as well as on GPUs.

Make sure to check out the rest of the workshop's roadmaps; it's certain that you'll find something of interest.

 

More Information

Roadmap: Understanding GPU Architecture 

Related Articles

Demystifying GPU Terminology

LeetGPU - The CUDA Challenges

Three NVIDIA CUDA Programming Super Resources 

 

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook or Linkedin.
