When it comes to the most basic 3D theory, it's all about shapes in a 3D space, using a coordinate system to calculate their positions.
Coordinate system
WebGL uses a right-handed Cartesian coordinate system - the x axis points to the right, the y axis up, and the z axis out of the screen, towards the viewer.
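This handedness can be checked with a cross product: in a right-handed system, crossing the x axis with the y axis gives the z axis. A quick sketch in plain JavaScript (the helper function is illustrative, not part of the WebGL API):

```javascript
// Cross product of two 3D vectors.
function cross(a, b) {
  return [
    a[1] * b[2] - a[2] * b[1],
    a[2] * b[0] - a[0] * b[2],
    a[0] * b[1] - a[1] * b[0],
  ];
}

const xAxis = [1, 0, 0];
const yAxis = [0, 1, 0];
// In a right-handed system, x × y points along +z, out of the screen.
const zAxis = cross(xAxis, yAxis); // [0, 0, 1]
```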
Rendering pipeline
The rendering pipeline is the process of preparing images and outputting them to the screen. The graphics rendering pipeline takes 3D objects built from primitives described using vertices, applies processing, calculates the fragments, and renders them on the 2D screen as pixels.
All the shapes are built from vertices. Every vertex is described by these attributes:
- Position identifies it in a 3D space (x, y, z)
- Color holds an RGBA value (alpha for transparency; each channel ranges from 0.0 to 1.0)
- Normal is a way to describe the direction the vertex is facing
- Texture is a 2D image that the vertex can use instead of a simple color
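In code, these attributes are typically stored as flat typed arrays. A minimal sketch of one vertex's data (the object layout and property names here are illustrative, not a fixed WebGL API):

```javascript
// One vertex with the four attributes described above, stored the way
// WebGL-style APIs expect numeric data: as Float32Arrays.
const vertex = {
  position: new Float32Array([1.0, -1.0, 0.5]),     // x, y, z in 3D space
  color:    new Float32Array([1.0, 0.0, 0.0, 1.0]), // r, g, b, a, each 0.0-1.0
  normal:   new Float32Array([0.0, 0.0, 1.0]),      // facing out of the screen
  uv:       new Float32Array([0.5, 0.5]),           // 2D texture coordinates
};
```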
A primitive is an input to the pipeline - it's built from vertices and can be a triangle, a point or a line.
A pixel is a point on the screen arranged in the 2D grid, which holds an RGB color.
A fragment is a 3D projection of a pixel, and has all the same attributes as a pixel.
Objects
A face of the given shape is a plane between vertices. For example, a cube has 8 different vertices (points in space) and 6 different faces, each constructed out of 4 vertices. By connecting the points we create the edges of the cube. The geometry is built from vertices and faces, while the material is a texture, which uses an image. If we connect the geometry with the material we get a mesh.
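The cube's numbers can be checked with a few lines of plain JavaScript (the vertex ordering and the face index lists below are one possible convention, not a standard):

```javascript
// The 8 corner vertices of a unit cube centred on the origin.
const cubeVertices = [];
for (const x of [-0.5, 0.5])
  for (const y of [-0.5, 0.5])
    for (const z of [-0.5, 0.5])
      cubeVertices.push([x, y, z]);

// Each of the 6 faces is a quad referencing 4 of those vertices by index,
// listed in loop order around the face.
const cubeFaces = [
  [0, 1, 3, 2], // -x face
  [4, 5, 7, 6], // +x face
  [0, 1, 5, 4], // -y face
  [2, 3, 7, 6], // +y face
  [0, 2, 6, 4], // -z face
  [1, 3, 7, 5], // +z face
];

// Edges fall out of the faces: each quad contributes 4 edges,
// and every edge is shared by exactly 2 faces, giving 12 unique edges.
const edges = new Set();
for (const [a, b, c, d] of cubeFaces)
  for (const [p, q] of [[a, b], [b, c], [c, d], [d, a]])
    edges.add([Math.min(p, q), Math.max(p, q)].join('-'));
```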
The rendering pipeline consists of vertex processing and fragment processing. Both are programmable - you can write your own shaders that manipulate the output.
Transformation matrix
Vertex processing is all about transforming the coordinates and projecting them on the screen. A transform converts a vertex from one space to another, and is done by multiplying the vector with the transformation matrix.
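The matrix multiplication itself can be sketched in a few lines of JavaScript (row-major storage is used here for readability; WebGL's own matrix arrays are column-major):

```javascript
// Multiply a homogeneous vertex (x, y, z, w) by a 4x4 transformation
// matrix stored row-major as a flat array of 16 numbers.
function transform(m, v) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++)
    for (let col = 0; col < 4; col++)
      out[row] += m[row * 4 + col] * v[col];
  return out;
}

// The identity matrix leaves the vertex unchanged.
const identity = [
  1, 0, 0, 0,
  0, 1, 0, 0,
  0, 0, 1, 0,
  0, 0, 0, 1,
];

const unchanged = transform(identity, [2, 3, 4, 1]); // [2, 3, 4, 1]
```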
There are four stages of that processing: arranging the objects in the world (called world or model transformation), positioning and setting the orientation of the camera (view transformation), defining the camera settings (projection transformation) and outputting the image (viewport transformation).
Model (world) transformation
Objects are drawn in their local space, so they need to be transformed to be drawn in the global, world space. This is done with affine transforms - for example, rotation and scaling are linear transformations, while translation is not linear.
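Why translation is affine rather than linear can be shown directly: it only works because positions carry an extra homogeneous w component that the last column of the matrix picks up. A minimal sketch, assuming positions have w = 1:

```javascript
// Translate a homogeneous point (x, y, z, w) by (tx, ty, tz).
// This mirrors what the last column of a 4x4 translation matrix does:
// the offset is multiplied by w, so it only applies when w = 1.
function translate(point, tx, ty, tz) {
  const [x, y, z, w] = point;
  return [x + tx * w, y + ty * w, z + tz * w, w];
}

const local = [1, 2, 3, 1];                 // vertex in local (model) space
const world = translate(local, 10, 0, -5);  // [11, 2, -2, 1] in world space
```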
View transformation
View transformation is about placing the camera in the 3D space. The camera has three parameters: location, the direction it points at, and orientation. The view matrix is used to transform coordinates from world space to camera (view) space.
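A minimal sketch of the idea, assuming a camera that is only translated and never rotated: moving the camera to a point is equivalent to moving every vertex the opposite way, which is exactly what the view matrix encodes (the function name is illustrative):

```javascript
// Transform a world-space vertex into view space for a camera at `eye`,
// ignoring rotation: the whole world shifts by -eye.
function worldToView(vertex, eye) {
  return [vertex[0] - eye[0], vertex[1] - eye[1], vertex[2] - eye[2]];
}

const camera = [0, 0, 5];                      // camera 5 units towards the viewer
const inView = worldToView([0, 0, 0], camera); // origin ends up at z = -5
```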
Projection (perspective) transformation
Projection sets up what can be seen by the camera - the configuration includes field of view, aspect ratio and optional near and far planes. Objects outside of the view are not visible, and are ignored in the rendering process to boost performance. If an object is partially visible it is clipped to the camera's visible area. Projection transforms individual vertices.
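A common way to build such a projection matrix from those settings, as libraries like glMatrix do (column-major layout; this is a sketch of the standard OpenGL-style perspective matrix, not code mandated by WebGL):

```javascript
// Build a perspective projection matrix from the field of view (vertical,
// in radians), the aspect ratio, and the near and far clipping planes.
function perspective(fovYRadians, aspect, near, far) {
  const f = 1 / Math.tan(fovYRadians / 2);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (far + near) / (near - far), -1,
    0, 0, (2 * far * near) / (near - far), 0,
  ];
}

// 90 degree field of view, widescreen aspect, near/far planes at 0.1 and 100.
const proj = perspective(Math.PI / 2, 16 / 9, 0.1, 100);
```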
Rasterization (viewport) transformation
Rasterization converts primitives to a set of fragments and maps them to the 2D viewport.
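The viewport mapping itself is simple enough to sketch: normalized device coordinates in [-1, 1] are scaled to pixel positions, with the y axis flipped because screen pixels count downwards (the function name is illustrative):

```javascript
// Map normalized device coordinates (x and y in [-1, 1], origin at the
// centre, +y up) to pixel coordinates on a width x height canvas
// (origin at the top-left, +y down).
function ndcToPixel(ndcX, ndcY, width, height) {
  return [
    (ndcX + 1) / 2 * width,
    (1 - ndcY) / 2 * height, // y flips between NDC and pixel space
  ];
}

const px = ndcToPixel(0, 0, 800, 600); // centre of NDC -> [400, 300]
```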
Fragment processing
Fragment processing focuses on textures and lighting. It calculates final colors based on the given parameters.
Output manipulation
During output manipulation we can take advantage of the z-buffer, also called the depth buffer. Removing everything that is not visible because it is hidden behind another object can greatly increase performance.
If one object is in front of another and it's not entirely opaque (its material has transparency), the object behind it has to be rendered with that in mind - alpha blending can be used to calculate the proper colors of the objects in this situation.
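The standard "source over" blending equation can be sketched per color channel; this is a simplified model of what the GPU does when blending is enabled (in real WebGL it is configured through `gl.blendFunc`, not written by hand):

```javascript
// Blend a translucent source color over a destination color:
// each channel of the front object is weighted by its alpha,
// the object behind by the remainder.
function blend(src, dst, srcAlpha) {
  return src.map((c, i) => c * srcAlpha + dst[i] * (1 - srcAlpha));
}

// A half-transparent red surface over an opaque blue one:
const result = blend([1, 0, 0], [0, 0, 1], 0.5); // [0.5, 0, 0.5]
```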
Lighting
The color we see on the screen is a result of the light source interacting with the surface color of the object's material. Light might be absorbed or reflected. The standard Phong lighting model implemented in WebGL has four basic types of lighting:
- Diffuse is a distant directional light, like the sun
- Specular is a point light, just like a light bulb in a room or a flashlight
- Ambient is the constant light applied to everything on the scene
- Emissive is the light emitted directly by the object
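As an example, the diffuse contribution from the list above can be sketched outside a shader: the surface gets brighter the more directly its normal faces the light (all vectors are assumed to be unit length; the helper names are illustrative):

```javascript
// Dot product of two 3D vectors.
function dot(a, b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Diffuse lighting: intensity depends only on the angle between the
// surface normal and the direction to the light; surfaces facing away
// from the light receive nothing.
function diffuse(normal, lightDir, lightColor) {
  const intensity = Math.max(dot(normal, lightDir), 0);
  return lightColor.map((c) => c * intensity);
}

// A surface facing straight at a white light receives its full color:
const lit = diffuse([0, 0, 1], [0, 0, 1], [1, 1, 1]); // [1, 1, 1]
```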
Conclusion
Now you know the basic theory behind 3D manipulation. If you want to move on to practice and see some demos in action, follow up with the tutorials below:
- Building up a basic demo with Three.js
- Building up a basic demo with Babylon.js
- Building up a basic demo with PlayCanvas
- Building up a basic demo with A-Frame
Go ahead and create some cool cutting-edge 3D experiments yourself!