---
title: The Graphics Pipeline ; Part 1
date: "April 20 - 2025"
---
<script>
import Image from "../../Image.svelte"
import Note from "../../Note.svelte"
import Tip from "../../Tip.svelte"
</script>
Ever wondered how games put all that gore on your display? All that beauty is brought to life by
a process called **rendering**, and at the heart of it is the **graphics pipeline**.

In this article we'll dive deep into the intricate details of this powerful beast.
We'll cover all the terminology needed to understand each stage and restate key ideas along the way, so don't
worry if you don't fully grasp something at first. If you still have questions, feel free to contact me :)

My initial goal of putting everything in **1 article** proved to hurt the **brevity** and the **structure** of the content.
After all, the **graphics pipeline** is incredibly **complex** and has gone through quite some **evolution**.
This impelled me to split the concepts into an **article series** consisting of **4 parts**, which allows me to explain everything in sufficient depth.
But why exactly **4 parts**?

## Overview
Like any pipeline, the **graphics pipeline** is composed
of several **stages**, each of which can be a pipeline in itself or even parallelized.
Each stage takes some input (data and configuration) and generates output data for the next stage.

<Note title="A coarse division of the graphics pipeline", type="diagram">
Application --> **Geometry Processing** --> Rasterization --> Pixel Processing --> Presentation
</Note>
Before the heavy rendering work starts on the <Tip text="GPU">Graphics Processing Unit</Tip>,
we simulate and update the world through **systems** such as the physics engine, game logic, networking, etc.
during the **application** stage.
This stage mostly runs on the <Tip text="CPU">Central Processing Unit</Tip>,
which is extremely efficient at executing <Tip text="sequentially dependent logic">
A type of execution flow where the operations depend on the results of previous steps, limiting parallel execution.
In other words, **CPUs** are great at executing **branch-heavy** code, and **GPUs** are geared
towards executing a TON of **branch-less** or **branch-light** code in parallel, like executing some
code for each pixel on your screen: there are a ton of pixels, but they mostly do their own independent logic. </Tip>.

The updated scene data is then prepped and fed to the **GPU** for **geometry processing**. Here
we figure out where everything ends up on our screen by doing lots of fancy linear algebra.
We'll cover this stage in depth very soon so don't panic (yet).

Afterwards, the final geometric data is converted into <Tip text="pixels"> Pixel is shorthand for **picture element**; voxel is shorthand for **volumetric element**. </Tip>
and prepped for the **pixel processing** stage via a process called **rasterization**.
In other words, this stage converts a rather abstract, internal representation (geometry)
into something more concrete (pixels). It's called rasterization because the end product is a <Tip text="raster">Noun. A rectangular pattern of parallel scanning lines followed by the electron beam on a television screen or computer monitor. -- 1930s: from German Raster, literally screen, from Latin rastrum rake, from ras- scraped, from the verb radere. ---Oxford Languages</Tip> of pixels.

The **pixel processing** stage then uses the rasterized geometry data (pixel data) to do **lighting**, **texturing**,
and all the sweet gory details of a scene (like a murder scene).
This stage is often, but not always, the most computationally expensive.

A huge problem that a good rendering engine needs to solve is how to be **performant**. A great deal
of **optimization** can be done by **culling** the work that we deem unnecessary or redundant in each
stage before it's passed on to the next. More on **culling** later so don't worry (yet :D).

The pipeline then serves (presents) the output of the **pixel processing** stage, which is a **rendered image**,
to your pretty eyes using your <Tip text="display">Usually a monitor, but the technical term for it is
the target **surface**, which can be anything from a VR headset to some other crazy surface used for display purposes.</Tip>.

<Note type="info", title="Chapters of The Graphics Pipeline">
**Geometry Processing**: How geometry is **represented**, **interpreted**, **transformed** and **expanded**.
**Rasterization**: How the final geometric data is converted into **pixels** and what data they hold.
**Pixel Processing**: How we figure out the **final output color** of each pixel.
**Optimizations**: How modern game-engines like Unreal Engine 5 optimize the pipeline.
</Note>
I hope it is now evident why I chose to split the concepts across 4 parts. So... let's jump right into the gory details of the **geometry processing**
stage!
## Surfaces
Ever been jump-scared by this sight in an <Tip text="FPS">First person (shooter) perspective</Tip>? Why are (the inside of) things rendered like that?
<Note title="Boo!", type="image">
<Image
paths={["/images/boo.png"]}
/>
</Note>
In order to display a (murder) scene,
we need a way of **representing** the **surface** of its composing objects (like corpses) in computer memory.
We only care about the **surface** since we won't be seeing the insides anyway (not that we want to).
At this stage, we only care about the **shape** or the **geometry** of the **surface**.
Texturing, lighting, and all the sweet gory details come at a much later stage once all the **geometry** has been processed.

But how do we represent surfaces in computer memory?
## Vertices
There are several ways to **represent** the surfaces of 3D objects for a computer to understand.
For instance, <Tip text="NURBS">
**Non-uniform rational basis spline** is a mathematical model using **basis splines** (B-splines) that is commonly used in computer graphics for representing curves and surfaces. It offers great flexibility and precision for handling both analytic (defined by common mathematical formulae) and modeled shapes. ---Wikipedia</Tip> surfaces are great for representing **curves**, and they offer the
**high precision** needed to do <Tip text="CAD">Computer-Aided Design</Tip>. We could also do **ray-tracing** using fancy equations for
rendering **photo-realistic** images.

These are all great, ignoring the fact that they would take an eternity to process...
What we need instead is a **performant** approach that can do this for an entire scene with
hundreds of thousands of objects (like a lot of corpses) in a small fraction of a second. What we need is **polygonal modeling**.

**Polygonal modeling** enables us to do an exciting thing called **real-time rendering**. The idea is that we only need an
**approximation** of a surface to render it **realistically enough** for us to have some fun killing time!
We can achieve this approximation using a collection of **triangles**, **lines**, and **dots** (primitives),
which themselves are composed of a series of **vertices** (points in space).
<Note title="A sphere made out of triangles", type="image">
<Image
paths={["/images/polygon_sphere.webp"]}
/>
</Note>
A **vertex** is simply a point in space.
Once we get enough of these **points**, we can connect them to form **primitives** such as **triangles**, **lines**, and **dots**.
And once we connect enough of these **primitives** together, they form a **model** or a **mesh** (that we need for our corpse).
With some interesting models put together, we can compose a **scene** (like a murder scene :D).
<Note title="Stanford bunny model in increasing level of detail (LoD)", type="image">
<Image
paths={["/images/bunny.jpg"]}
/>
</Note>
But let's not get ahead of ourselves. The primary type of **primitive** that we care about during **polygonal modeling**
is a **triangle**. But why not squares or polygons with a variable number of edges?
## Why Triangles?
In <Tip text="Euclidean geometry"> Developed by **Euclid** around 300 BCE, it is based on five axioms. It describes properties of shapes, angles, and space using deductive reasoning. It remained the standard model of geometry for centuries until non-Euclidean geometries and general relativity showed its limits. It's still widely used in education, engineering, and **computer graphics**. ---Wikipedia </Tip>, triangles are always **planar** (they exist only in one plane).
Any polygon composed of more than 3 points may break this rule. But why is it so important
to us that polygons reside in one plane?
<Note title="Planar vs Non-Planar polygons" type="image">
<Image
paths={["/images/planar.jpg", "/images/non_planar_1.jpg", "/images/non_planar_2.png"]}
/>
</Note>
When a polygon exists only in one plane, we can safely assume that **only one face** of it can be visible
at any one time; this enables us to utilize a huge optimization technique called **back-face culling**,
which means we avoid wasting a ton of **precious processing time** on the polygons that
we know won't be visible to us. We can safely **cull** the **back-faces** since we won't
be seeing the **back** of a polygon when it's part of a closed-off model.
We figure this out by simply using the **winding order** of the triangle to determine whether we're looking at the
back of the triangle or the front of it. I'll go in depth about **culling** in part 4.
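
To make the winding-order test a bit more concrete, here is a minimal sketch. It assumes the triangle has already been projected to 2D coordinates with the y axis pointing up, and that counter-clockwise triangles count as front-facing; both are conventions you configure in a real graphics API. `Vec2`, `signed_area_2x` and `is_back_face` are hypothetical names made up for this example.

```cpp
// A 2D point after projection (hypothetical helper type for this sketch).
struct Vec2 {
    float x;
    float y;
};

// Twice the signed area of the triangle (a, b, c).
// With the y axis pointing up: positive means counter-clockwise, negative means clockwise.
float signed_area_2x(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// Assumption: counter-clockwise triangles are front-facing,
// so a clockwise triangle is one we are seeing from behind.
bool is_back_face(Vec2 a, Vec2 b, Vec2 c) {
    return signed_area_2x(a, b, c) < 0.0f;
}
```

Swapping any two vertices of a triangle flips its winding, which is why the order of the vertices matters so much.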

Triangles also have a very small **memory footprint**; for instance, when using the **triangle-strip** topology (more on this very soon), each additional triangle after the first one needs only **one extra vertex**.

The most important attribute, in my opinion, is the **algorithmic simplicity**.
Any polygon or shape can be composed from a **set of triangles**; for instance, a rectangle is simply **two coplanar triangles**.
Also, it is common practice in computer science to break down hard problems into simpler, smaller problems.
Trust me, this will be a lot more convincing when we cover the **rasterization** stage in part 2 :)
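
As a small illustration of composing a polygon from triangles, the sketch below splits a **convex** polygon into triangles that all share its first corner. This only works under that convexity assumption (concave shapes need a smarter algorithm), and `Triangle` and `triangulate_convex_polygon` are hypothetical names for this example.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Three corner numbers that together form one triangle.
using Triangle = std::array<std::size_t, 3>;

// Splits a convex polygon whose corners are numbered 0..corner_count-1 (in order)
// into corner_count - 2 triangles that all share corner 0.
std::vector<Triangle> triangulate_convex_polygon(std::size_t corner_count) {
    std::vector<Triangle> triangles;
    for (std::size_t i = 1; i + 1 < corner_count; ++i) {
        triangles.push_back(Triangle{0, i, i + 1});
    }
    return triangles;
}
```

A rectangle (4 corners) comes out as the two triangles `{0, 1, 2}` and `{0, 2, 3}`.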
<Note title="Evolution", type="info">

As a bonus point to consider, present-day **hardware** and **algorithms** have become **extremely efficient** at processing
triangles (sorting, rasterizing, etc.) after eons of evolving around them.
We literally have a **fixed function** (unprogrammable) stage in the pipeline dedicated to rasterizing
triangles.

</Note>
## Primitive Topology
So, we got our set of vertices, but having a bunch of points floating around wouldn't make a scene very lively
(or gory). We need to form **triangles** out of them to compose **models** (like our beautiful corpse).

The **input assembler** is the stage responsible for **concatenating** our vertices (the input) to assemble **primitives**.
It is a **fixed function** stage, so we can only configure it (it's not programmable).
We can tell the assembler how it should interpret the vertex data by configuring its **primitive** <Tip text="topology"> The way in which constituent parts are interrelated or arranged.--mid 19th century: via German from Greek topos place + -logy.---Oxford Languages </Tip>.

Instead of explaining with words, I'm going to show you how each type of topology works with pictures. Buckle up!

When the topology is **point list**, each **consecutive vertex** (v) defines a **single point** primitive (p)
and the number of primitives (n of p) is equal to the number of vertices (n of v).
<Note title="", type="image">
<Image
paths={["/images/primitive_topology_point_list.svg"]}
/>
</Note>
<Note type="math">
```math
\begin{aligned}
&p_i = \{ v_{i} \} \\ &n_p = n_v
\end{aligned}
```
</Note>
When the topology is **line list**, each **consecutive pair of vertices** defines a **single line**:
<Note title="", type="image">
<Image
paths={["/images/primitive_topology_line_list.svg"]}
/>
</Note>
<Note type="math">
```math
\begin{aligned}
&p_i = \{ v_{2i},\ v_{2i+1} \} \\ &n_p = \lfloor n_v / 2 \rfloor
\end{aligned}
```
</Note>
When the primitive topology is **line strip**, **one line** is defined by each **vertex and the following vertex**:
<Note title="", type="image">
<Image
paths={["/images/primitive_topology_line_strip.svg"]}
/>
</Note>
<Note type="math">
```math
\begin{aligned}
&p_i = \{ v_i, v_{i+1} \} \\ &n_p = \text{max}(0, n_v - 1)
\end{aligned}
```
</Note>
When the primitive topology is **triangle list**, each **consecutive set of three vertices** defines a **single triangle**:
<Note title="", type="image">
<Image
paths={["/images/primitive_topology_triangle_list.svg"]}
/>
</Note>
<Note type="math">
```math
\begin{aligned}
&p_i = \{ v_{3i}, v_{3i+1}, v_{3i+2} \} \\ &n_p = \lfloor n_v / 3 \rfloor
\end{aligned}
```
</Note>
When the primitive topology is **triangle strip**, **one triangle** is defined by each **vertex and the two vertices that follow it**:
<Note title="", type="image">
<Image
paths={["/images/primitive_topology_triangle_strip.svg"]}
/>
</Note>
<Note type="math">
```math
\begin{aligned}
&p_i = \{ v_i,\ v_{i + (1 + i \bmod 2)},\ v_{i + (2 - i \bmod 2)} \} \\ &n_p = \text{max}(0, n_v- 2)
\end{aligned}
```
</Note>
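
The strip formula is easier to digest as code. The following is only a sketch of the rule above, not how any particular GPU implements it; `Triangle` and `assemble_triangle_strip` are hypothetical names. Notice how every odd triangle swaps its last two vertices, which keeps the winding order consistent across the whole strip.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Three vertex numbers that together form one triangle.
using Triangle = std::array<std::size_t, 3>;

// Applies p_i = { v_i, v_(i + 1 + i mod 2), v_(i + 2 - i mod 2) } to a strip
// of `vertex_count` vertices and returns the resulting triangles.
std::vector<Triangle> assemble_triangle_strip(std::size_t vertex_count) {
    std::vector<Triangle> triangles;
    for (std::size_t i = 0; i + 2 < vertex_count; ++i) {
        const std::size_t odd = i % 2; // 1 for every second triangle, otherwise 0
        triangles.push_back(Triangle{i, i + 1 + odd, i + 2 - odd});
    }
    return triangles;
}
```

Five vertices produce three triangles: `{0, 1, 2}`, `{1, 3, 2}` and `{2, 3, 4}`. A triangle list would need nine vertices for the same three triangles.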
When the primitive topology is **triangle fan**, **triangles** are defined **around a shared common vertex**:
<Note title="", type="image">
<Image
paths={["/images/primitive_topology_triangle_fan.svg"]}
/>
</Note>
<Note type="math">
```math
\begin{aligned}
&p_i = \{ v_{i+1}, v_{i+2}, v_0 \} \\ &n_p = \text{max}(0, n_v - 2)
\end{aligned}
```
</Note>
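
To recap all six topologies in one place, here is a small sketch that computes the number of primitives from the number of vertices, following the formulas above. The `Topology` enum and `primitive_count` are hypothetical names for this example; real APIs have their own enums for this.

```cpp
#include <cstddef>

enum class Topology {
    PointList,
    LineList,
    LineStrip,
    TriangleList,
    TriangleStrip,
    TriangleFan,
};

// Number of primitives (n_p) assembled from n_v vertices for each topology.
std::size_t primitive_count(Topology topology, std::size_t n_v) {
    switch (topology) {
        case Topology::PointList:     return n_v;
        case Topology::LineList:      return n_v / 2;               // floor(n_v / 2)
        case Topology::LineStrip:     return n_v > 1 ? n_v - 1 : 0; // max(0, n_v - 1)
        case Topology::TriangleList:  return n_v / 3;               // floor(n_v / 3)
        case Topology::TriangleStrip: // same count as a fan
        case Topology::TriangleFan:   return n_v > 2 ? n_v - 2 : 0; // max(0, n_v - 2)
    }
    return 0; // unreachable for valid enum values
}
```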
## Indices
**Indices** are an array of integers that reference the **vertices** in a vertex buffer.
They define the **order** in which vertices should be read (and re-read) by the **input assembler**,
which allows **vertex reuse** and reduces memory usage by preventing duplicate vertices.
Imagine the following scenario:
```cc
float triangle_vertices[] = {
// x__, y__, z__
0.0, 0.5, 0.0, // center top
-0.5, -0.5, 0.0, // bottom left
0.5, -0.5, 0.0, // bottom right
};
```
Here we have one triangle primitive, cool! Now let's create a rectangle:
```cc
float vertices[] = {
// first triangle
// x__ y__ z__
0.5, 0.5, 0.0, // top right
0.5, -0.5, 0.0, // bottom right << DUPLICATE
-0.5, 0.5, 0.0, // top left << DUPLICATE
// second triangle
// x__ y__ z__
0.5, -0.5, 0.0, // bottom right << DUPLICATE
-0.5, -0.5, 0.0, // bottom left
-0.5, 0.5, 0.0, // top left << DUPLICATE
};
```
As indicated by the comments, two of the vertices (bottom right and top left) are stored **twice**. This situation only gets worse
for each additional attribute per vertex (yep, vertices pack a lot more information than positions, you'll understand soon).
And in a large model with hundreds of thousands of triangles, it becomes unacceptable. Hence we use
indexed rendering:
```cc
float vertices[] = {
// x__ y__ z__
0.5, 0.5, 0.0, // top right
0.5, -0.5, 0.0, // bottom right
-0.5, -0.5, 0.0, // bottom left
-0.5, 0.5, 0.0, // top left
};
unsigned int indices[] = {
0, 1, 3, // first triangle
1, 2, 3 // second triangle
};
```
And you might be asking, what about the **triangle strips** we just talked about? Well, if you try to visualize it,
a large model cannot possibly be made from a single strip of triangles, but from many. And we might not even use
triangle strips (we might use triangle lists).
Either way, indices are optional, but using them is almost always a good idea!
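
Here is a rough sketch of what reading vertices through an index buffer boils down to, plus two tiny helpers for comparing memory usage. All names (`Vertex`, `fetch_indexed`, `flat_size_bytes`, `indexed_size_bytes`) are hypothetical and only exist for this example; a real input assembler does this in hardware.

```cpp
#include <cstddef>
#include <vector>

struct Vertex {
    float x, y, z;
};

// Reads the vertex buffer in the order given by the index buffer,
// producing the vertex stream that primitives are assembled from.
std::vector<Vertex> fetch_indexed(const std::vector<Vertex>& vertices,
                                  const std::vector<unsigned int>& indices) {
    std::vector<Vertex> stream;
    stream.reserve(indices.size());
    for (unsigned int index : indices) {
        stream.push_back(vertices.at(index)); // at() throws on an out-of-range index
    }
    return stream;
}

// Bytes needed without indices: every triangle corner stores a full vertex.
std::size_t flat_size_bytes(std::size_t corner_count, std::size_t vertex_size) {
    return corner_count * vertex_size;
}

// Bytes needed with indices: the unique vertices plus one index per triangle corner.
std::size_t indexed_size_bytes(std::size_t unique_vertex_count, std::size_t corner_count,
                               std::size_t vertex_size, std::size_t index_size) {
    return unique_vertex_count * vertex_size + corner_count * index_size;
}
```

For our rectangle with position-only vertices (12 bytes each) and 4-byte indices, both approaches happen to cost 72 bytes. The savings show up once vertices get fatter: with a 32-byte vertex (say position, normal and texture coordinates, an assumed layout) it's 192 bytes without indices versus 152 bytes with them, and the gap keeps growing with the size of the model.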
<Note title="Post-Transform Vertex Cache", type="info">
Indexed rendering also allows the GPU to use a neat optimization trick called the **post-transform vertex cache**:
if an index shows up again shortly after its vertex was **transformed**, the GPU fetches the recently cached result and
doesn't re-run the transformation logic.
I'll explain how vertices are transformed soon, don't worry (yet).
</Note>
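
To get a feel for why this cache matters, here is a toy simulation. It assumes a simple first-in-first-out cache keyed by index, which is a simplification I'm making for illustration; actual GPUs differ in cache size and replacement strategy. `count_vertex_shader_runs` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Counts how many times the transformation logic would run for an index stream,
// assuming a FIFO cache that remembers the last `cache_size` transformed indices.
std::size_t count_vertex_shader_runs(const std::vector<unsigned int>& indices,
                                     std::size_t cache_size) {
    std::deque<unsigned int> cache;
    std::size_t runs = 0;
    for (unsigned int index : indices) {
        if (std::find(cache.begin(), cache.end(), index) != cache.end()) {
            continue; // cache hit: reuse the already transformed vertex
        }
        ++runs; // cache miss: the vertex has to be transformed
        cache.push_back(index);
        if (cache.size() > cache_size) {
            cache.pop_front(); // evict the oldest entry
        }
    }
    return runs;
}
```

For the rectangle's indices `{0, 1, 3, 1, 2, 3}`, a cache with room for a handful of entries needs only 4 runs instead of 6.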
## **Input Assembler**
Alrighty! Do we have everything we need?

We got our **vertices** to represent geometry. We set our **primitive topology** to determine
how to concatenate them. And we optionally (but most certainly) provided some **indices** to avoid
duplicate vertex data.

All this data (and configuration) is then fed to the very first stage of the **graphics pipeline**, called
the **input assembler**, which as stated before is responsible for **assembling** primitives from our **input** (vertices and indices).
<Note type="diagram", title="Geometry Processing">
</Note>
## Coordinate System -- Overview
We got our surface representation (vertices), we got our indices, we set the primitive topology type, and we gave these
to the **input assembler** to spit out triangles for us.

**Assembling primitives** is the **first** essential task in the **geometry processing** stage, and
everything you've read so far only went over that part.
Its **second** vital responsibility is the **transformation** of said primitives. Let me explain.

So far, all the examples show the geometry in NDC (Normalized Device Coordinates).
This is because the **rasterizer** expects the final vertex coordinates to be in the NDC range.
Anything outside of this range is **clipped** and hence not visible.
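
As a simplified sketch of what "inside the NDC range" means for a single vertex: the check below assumes OpenGL-style NDC, where x, y and z all run from -1.0 to 1.0 (other APIs such as Vulkan and Direct3D use 0.0 to 1.0 for z). Real clipping works on whole primitives rather than single vertices, so take this as an illustration only; `Vec3` and `is_inside_ndc` are hypothetical names.

```cpp
struct Vec3 {
    float x, y, z;
};

// Assumption: OpenGL-style NDC, every axis spans [-1.0, 1.0].
bool is_inside_ndc(Vec3 v) {
    return v.x >= -1.0f && v.x <= 1.0f &&
           v.y >= -1.0f && v.y <= 1.0f &&
           v.z >= -1.0f && v.z <= 1.0f;
}
```

All the vertices in the triangle and rectangle examples pass this check, which is why we could draw them without any transformation.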

Yet, as you'll understand soon, doing everything in **NDC** is inconvenient and very limiting.
What we'd like to do is to transform these vertices through 5 different coordinate systems before they end up in NDC
(or outside of it if they're meant to be clipped).
The purpose of each space will be explained shortly. But doing these **transformations** requires
a lot of **linear algebra**, specifically **matrix operations**.

So let's get a refresher on the concepts.
<Note title="Algebra Ahead!">
The concepts in the following sections may be difficult to grasp at first. And **that's okay**, you don't
need to pick up everything the first time you read them (I didn't). If you feel passionate about these topics
and want to have a better grasp, refer to the references at the bottom of this article and **take
your time** :)
</Note>
## Linear Algebra --- Vector Operations
**What is a vector**
**Addition and Subtraction**
**Division and Multiplication**
**Scalar Operations**
**Cross Product**
**Dot Product**
**Length**
**Normalization and the normal vector**
## Linear Algebra --- Matrix Operations
**What is a matrix**
**Addition and Subtraction**
**Scalar Operations**
**Multiplication**
**Division (or lack thereof)**
**Identity Matrix**
## Linear Algebra --- Transformations
**Scale**
**Rotation**
<Note type="info", title="Gimbal Lock">
Representing rotations like this makes us prone to a phenomenon called **gimbal lock** where we lose
an axis of control. A way of avoiding this is to rotate around an arbitrary axis (which makes it a lot harder
to happen but still possible).
The ideal way is to use <Tip text="quaternions" >A quaternion is a four-part hyper-complex number used in three-dimensional rotations and orientations.
A quaternion number is represented in the form a+bi+cj+dk, where a, b, c, and d parts are real numbers, and i, j, and k are the basis elements, satisfying the equation: i² = j² = k² = ijk = -1.</Tip>,
which not only avoid gimbal lock but are also more computationally friendly.
A full discussion about quaternions is beyond the scope of this article. However, if you're so interested,
I've left links at the end of this article for further study.
</Note>
**Why Translation is not a linear transformation**
**Translation**
<Note type="info", title="Homogeneous coordinates">
Why are we using 4x4 matrices for vertices that are three dimensional?
</Note>
**Embedding it all in one matrix**

Great! You've refreshed lots of cool mathematics today, so let's get back to the original discussion:
**transforming** the freshly generated **primitives** through these **five** mysterious primary coordinate systems (or spaces),
starting with the **local space**!
## Coordinate System -- Local Space
Alternatively called the **object space**, this is the space **relative** to your object's **origin**.
All objects have an origin, and it's probably at coordinates [0, 0, 0] (not guaranteed).
Think of a modelling application like **Blender**. If you create a cube in it and export it, the
**vertices** it outputs are probably something like this:
**insert outputted vertices**.
And the cube looks plain like this:
<Note title="Unit cube", type="image">
</Note>
I hope this one is easy to grasp since we've **technically** been using it in our initial triangle
and square examples already; the local space just happened to coincide with NDC, though that is not necessary.
Say we arbitrarily decide that 1 unit is 1cm, then a 10m x 10m cube would have the following
vertices whilst in the local space.
Basically, the vertices that are read from a model file are initially in local space.
## Coordinate System -- World Space
This is where our first transformation happens. If we were constructing a crime scene
without world space transformations, all our corpses would reside somewhere around [0, 0, 0] and
would be inside each other (horrid, or lovely?).
This transformation allows us to **compose** a (game) world by transforming all the models from
their local space and scattering them around the world. We can **translate** (move) the model to the desired
spot, **rotate** it because why not, and **scale** it if the model needs scaling (captain obvious here).
This transformation is stored in a matrix called the **model matrix**. This is the first of three primary
**transformation** matrices that our vertices get multiplied by.
<Note type="math", title="Model transformation">
```math
\text{model}_M * \text{local}_V
```
</Note>
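
Below is a minimal sketch of building a model matrix and applying it to a local-space vertex. It assumes column vectors (the vertex goes on the right, like in the equation above), a rotation around the z axis only, and uniform scaling. `Vec4`, `Mat4` and all the helper functions are hypothetical names; in practice you'd reach for a math library instead.

```cpp
#include <array>
#include <cmath>

struct Vec4 {
    float x, y, z, w;
};

// m[row][column], meant to be multiplied with column vectors: M * v.
using Mat4 = std::array<std::array<float, 4>, 4>;

Mat4 identity() {
    Mat4 m{}; // value-initialized: every element starts at 0
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 result{};
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            for (int k = 0; k < 4; ++k)
                result[row][col] += a[row][k] * b[k][col];
    return result;
}

Vec4 transform(const Mat4& m, Vec4 v) {
    return Vec4{
        m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z + m[0][3] * v.w,
        m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z + m[1][3] * v.w,
        m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z + m[2][3] * v.w,
        m[3][0] * v.x + m[3][1] * v.y + m[3][2] * v.z + m[3][3] * v.w,
    };
}

Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[0][3] = x;
    m[1][3] = y;
    m[2][3] = z;
    return m;
}

Mat4 uniform_scale(float s) {
    Mat4 m = identity();
    m[0][0] = m[1][1] = m[2][2] = s;
    return m;
}

Mat4 rotation_z(float radians) {
    Mat4 m = identity();
    m[0][0] = std::cos(radians);
    m[0][1] = -std::sin(radians);
    m[1][0] = std::sin(radians);
    m[1][1] = std::cos(radians);
    return m;
}

// Scale first, then rotate, then translate:
// the matrix closest to the vertex is the one applied first.
Mat4 model_matrix(float tx, float ty, float tz, float radians, float scale) {
    return multiply(translation(tx, ty, tz),
                    multiply(rotation_z(radians), uniform_scale(scale)));
}
```

For example, scaling by 2, rotating 90 degrees around z and moving 5 units along x takes the local vertex (1, 0, 0) to roughly (5, 2, 0) in world space.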
So one down, two more to go!
## Coordinate System -- View Space
Alternative names include **eye space** and **camera space**.
This is where the crucial element of **interactivity**
comes to life (well, that depends on whether you can move the view in your game or not).
Currently, we're looking at the world
through a fixed lens. Since everything that's rendered ends up in the [-1.0, 1.0] range,
**moving** ourselves, our **eyes**, or the game's **camera** doesn't have a real meaning.
Now it's you that's stuck! (haha). But don't worry your lazy-ass, instead of moving yourself
(which again would not make sense since everything visible ends up in NDC), you can move the world! (how entitled).
We can achieve this illusion of moving around the world by **reverse transforming** everything based
on our own **location** and **orientation**. So imagine we're at the [+10.0, 0.0, 0.0] coordinates. To simulate this
movement, we apply this translation matrix:
<Note type="math", title="Simplified movement to the right">
</Note>
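
As a worked example, assuming column vectors with w = 1 and a camera sitting at [+10.0, 0.0, 0.0] that hasn't been rotated, the reverse translation moves every world-space point 10 units in the opposite direction along x:

```math
\begin{bmatrix} 1 & 0 & 0 & -10 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} * \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x - 10 \\ y \\ z \\ 1 \end{bmatrix}
```

So a corpse lying at the world origin ends up at [-10.0, 0.0, 0.0] relative to us, exactly as if we had walked 10 units to the right.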
**Position**
**Orientation**

We can **rotate** the camera, or more accurately **reverse-rotate** the world, via 3 unit vectors snuggled
inside a matrix: the **up** vector (U), the **target** or **direction** vector (D), and the **right**
vector (R).
<Note type="math", title="LookAt matrix">
```math
\begin{bmatrix} \color{red}{R_x} & \color{red}{R_y} & \color{red}{R_z} & 0 \\ \color{green}{U_x} & \color{green}{U_y} & \color{green}{U_z} & 0 \\ \color{blue}{D_x} & \color{blue}{D_y} & \color{blue}{D_z} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} * \begin{bmatrix} 1 & 0 & 0 & -\color{purple}{P_x} \\ 0 & 1 & 0 & -\color{purple}{P_y} \\ 0 & 0 & 1 & -\color{purple}{P_z} \\ 0 & 0 & 0 & 1 \end{bmatrix}
```
</Note>
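
Here is a sketch of how such a matrix could be built in code. It follows the convention used in LearnOpenGL (right-handed, with the direction vector D pointing from the target back towards the camera), which is an assumption on my part since other conventions exist. `Vec3`, `Mat4`, `look_at` and the helpers are hypothetical names, and the camera position must differ from the target or the normalization divides by zero.

```cpp
#include <array>
#include <cmath>

struct Vec3 {
    float x, y, z;
};

// m[row][column], meant to be multiplied with column vectors: M * v.
using Mat4 = std::array<std::array<float, 4>, 4>;

Vec3 subtract(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 cross(Vec3 a, Vec3 b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

Vec3 normalize(Vec3 v) {
    const float length = std::sqrt(dot(v, v)); // assumed to be non-zero
    return Vec3{v.x / length, v.y / length, v.z / length};
}

Mat4 look_at(Vec3 position, Vec3 target, Vec3 world_up) {
    const Vec3 d = normalize(subtract(position, target)); // direction (D)
    const Vec3 r = normalize(cross(world_up, d));         // right (R)
    const Vec3 u = cross(d, r);                           // up (U)

    // Multiplying the rotation rows by the translation by -P folds
    // into a last column of -dot(axis, P).
    Mat4 m{};
    m[0] = {r.x, r.y, r.z, -dot(r, position)};
    m[1] = {u.x, u.y, u.z, -dot(u, position)};
    m[2] = {d.x, d.y, d.z, -dot(d, position)};
    m[3] = {0.0f, 0.0f, 0.0f, 1.0f};
    return m;
}

// Transforms a point (w = 1) by the matrix and drops the w component.
Vec3 transform_point(const Mat4& m, Vec3 p) {
    return Vec3{
        m[0][0] * p.x + m[0][1] * p.y + m[0][2] * p.z + m[0][3],
        m[1][0] * p.x + m[1][1] * p.y + m[1][2] * p.z + m[1][3],
        m[2][0] * p.x + m[2][1] * p.y + m[2][2] * p.z + m[2][3],
    };
}
```

With the camera at (10, 0, 0) looking down the negative z axis, the world origin lands at (-10, 0, 0) in view space, and a point 5 units in front of the camera lands at (0, 0, -5).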
">>>>>" explain in depth why such operation makes the view rotate.
Just like the **world space** transformation, which is stored in the **model matrix**,
this transformation is stored in another matrix called the **view matrix**.
So far we got this equation to apply the **world space** and **view space** transformations
to the **local space** vertices of our model (the matrix closest to the vertex is applied first):
<Note type="math", title="Model_View transformation">
```math
\text{view}_M * \text{model}_M * \text{local}_V
```
</Note>
That's two down, one left to slay!
## Coordinate System -- Clip Space
**Overview**
**Aspect Ratio**
**Field of view**
**Normalization**
**Putting it all together**
<Note type="math", title="Model_View_Projection transformation">
```math
\text{projection}_M * \text{view}_M * \text{model}_M * \text{local}_V
```
</Note>
## Coordinate System -- Screen Space
**Viewport transform**
## Coordinate System -- Putting it All Together
<Note title="Coordinate System", type="diagram">
</Note>
## Vertex Shader
<Note title="Shaders", type="info">
**Why is it called a "shader" when it's not "shading" anything?**
</Note>
## Geometry Shader (optional stage)
**We can generate more geometry here since some geometric details are expressed more efficiently through mathematical expressions than raw vertex data**
**Different levels of parallelism (why do we still need the vertex shader)**
**Takes as input "a" primitive, outputs any type of (but only one of) primitive(s)**
**Adjacency primitive types**
**Primitive type only indicates number of input vertices since the primitive itself will get consumed**
**Geometry shader instancing**
**Geometry shader examples**
**Tessellation/Subdivision**
**Geometry shaders are out of fashion**
**Subdivision**
**Why do we subdivide?**
**Mathematical representation more compressed than actual vertex data**
**Geometry shaders are versatile, not performant**
**Data movement bottleneck**
**LoD**
## Tessellation Shader (optional stage)
**Tessellation Control Shader** (or Hull Shader in DirectX terminology)
**Tessellator**
**Quad Primitives**
**Isolines**
**Outer tessellation / Inner tessellation**
**Tessellation Evaluation Shader** (or Domain Shader in DirectX terminology)
**Tessellation examples**
## Geometry Processing --- Conclusion
Let's wrap up!
<Note type="diagram", title="Geometry Processing">
Prepared Vertex Data ->
Input Assembly turns Vertex Data into digestible structures for the Vertex Shader ->
Vertex Shader is invoked per vertex for applying transformations via some clever linear algebra ->
Geometry & Tessellation Shaders expand the geometry on-the-fly and may apply more transformations ->
... Rasterizer
</Note>
The geometric detail that we now have is not **real**. Perfect triangles do not exist in the real world.
Our next challenge in this journey is to turn these mathematical representations into something
concrete and significant. We're gonna take these primitives and turn them into **pixels** through
a fancy process called **rasterization**.

You can continue on to [part 2](/articles/the-graphics-pipeline/rasterization) of this article series and learn all about how rasterization
works.
## Sources
<Note title="Reviewers", type="review">
MMZ ❤️
Grammarly
Some LLMs
</Note>
<Note title="Books", type="resource">
[Joey de Vries --- LearnOpenGL](https://learnopengl.com/) <br/>
[Tomas Akenine-Möller --- Real-Time Rendering (4th ed)](https://www.realtimerendering.com/intro.html) <br/>
[Gabriel Gambetta --- Computer Graphics from Scratch](https://gabrielgambetta.com/computer-graphics-from-scratch/) <br/>
</Note>
<Note title="Wikipedia", type="resource">
[Polygonal Modeling](https://en.wikipedia.org/wiki/Polygonal_modeling) <br/>
[Non-uniform Rational B-spline Surfaces](https://en.wikipedia.org/wiki/Non-uniform_rational_B-spline) <br/>
[Computer Aided Design (CAD)](https://en.wikipedia.org/wiki/Computer-aided_design) <br/>
[Rasterization](https://en.wikipedia.org/wiki/Rasterisation) <br/>
[Euclidean geometry](https://en.wikipedia.org/wiki/Euclidean_geometry) <br/>
</Note>
<Note title="Youtube", type="resource">
[Miolith --- Quick Understanding of Homogeneous Coordinates for Computer Graphics](https://www.youtube.com/watch?v=o-xwmTODTUI) <br/>
[Leios Labs --- What are affine transformations?](https://www.youtube.com/watch?v=E3Phj6J287o) <br/>
[3Blue1Brown --- Essence of linear algebra (highly recommended playlist)](https://www.youtube.com/watch?v=fNk_zzaMoSs&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab) <br/>
[3Blue1Brown --- Quaternions and 3d rotation, explained interactively](https://www.youtube.com/watch?v=zjMuIxRvygQ) <br/>
[pikuma --- Math for Game Developers (playlist)](https://www.youtube.com/watch?v=Do_vEjd6gF0&list=PLYnrabpSIM-93QtJmGnQcJRdiqMBEwZ7_) <br/>
[pikuma --- 3D Graphics (playlist)](https://www.youtube.com/watch?v=Do_vEjd6gF0&list=PLYnrabpSIM-97qGEeOWnxZBqvR_zwjWoo) <br/>
[Cem Yuksel --- Introduction to Computer Graphics (playlist)](https://www.youtube.com/watch?v=vLSphLtKQ0o&list=PLplnkTzzqsZTfYh4UbhLGpI5kGd5oW_Hh) <br/>
[Cem Yuksel --- Interactive Computer Graphics (playlist)](https://www.youtube.com/watch?v=UVCuWQV_-Es&list=PLplnkTzzqsZS3R5DjmCQsqupu43oS9CFN&pp=0gcJCV8EOCosWNin) <br/>
[javidx9 --- Essential Mathematics For Aspiring Game Developers](https://www.youtube.com/watch?v=DPfxjQ6sqrc) <br/>
</Note>
<Note title="Articles", type="resource">
[Stackoverflow --- Why do 3D engines primarily use triangles to draw surfaces?](https://stackoverflow.com/questions/6100528/why-do-3d-engines-primarily-use-triangles-to-draw-surfaces) <br/>
[The ryg blog --- The barycentric conspiracy](https://fgiesen.wordpress.com/2013/02/06/the-barycentric-conspirac/) <br/>
[Juan Pineda --- A Parallel Algorithm for Polygon Rasterization](https://www.cs.drexel.edu/~deb39/Classes/Papers/comp175-06-pineda.pdf) <br/>
[Kristoffer Dyrkorn --- A fast and precise triangle rasterizer](https://kristoffer-dyrkorn.github.io/triangle-rasterizer/) <br/>
[Microsoft --- Rasterization Rules](https://learn.microsoft.com/en-us/windows/win32/direct3d11/d3d10-graphics-programming-guide-rasterizer-stage-rules) <br/>
</Note>
<Note title="Documentations", type="resource">
[Vulkan Docs --- Drawing](https://docs.vulkan.org/spec/latest/chapters/drawing.html) <br/>
[Vulkan Docs --- Pipeline Diagram](https://docs.vulkan.org/spec/latest/_images/pipelinemesh.svg) <br/>
</Note>