Kristoffer Dyrkorn, March 9, 2025

Faster horizontal lines

(This article is part of a series. You can jump to the previous section or the next section if you would like to.)

Our rasterizer is so far strictly serial. There seems to be no subtask that can easily be parallelized. Given our runtime environment, JavaScript in the browser, we have no clear path to making the code do more per time unit. (Web workers have too much overhead to be useful here.)

Still, we want to improve the performance. We can use a little trick: Although we cannot make our code run in parallel on multiple threads, we can still make it process more data per unit of time.

What we will do is called SIMD: Single instruction, multiple data. The idea is that in a tight loop that does some data processing, it can be more effective to let each iteration work on a larger chunk of data at a time. SIMD often refers to a CPU architecture (or instruction set) that lets the processor work on many data items at a time. So, here we misuse the term a bit - but the principle is the same.

In the code for drawing horizontal lines, we currently write pixel values to our screen buffer one byte at a time. We send 4 bytes to the screen buffer to update one pixel. However, the write operation will typically be just as fast when writing 4 bytes as when writing 1 byte. So we can make the loop run faster if we set up the code inside the loop to draw one pixel - 4 bytes - per iteration.

The implementation

We do this by wrapping the raw image data buffer (an array buffer created by the Canvas API) in a typed array (here: a Uint32Array). This creates a 32-bit view into the raw image buffer for us - which enables writing 4 bytes, a full pixel, at a time.

The code looks like this:

  const arrayBuffer = imageData.data.buffer;
  screenBuffer = new Uint32Array(arrayBuffer);

When we want to write pixels on screen, we first create a 32-bit integer containing the 4 byte values in sequence. This only needs to be done once per triangle (since we only draw single-color triangles now).

    // bake the pixel color as a int32 RGBA value (little-endian)
    let finalColor = color[0];
    finalColor |= color[1] << 8;
    finalColor |= color[2] << 16;
    finalColor |= 255 << 24;

Then we let the innermost pixel-drawing loop, where performance really matters, do just one write per pixel:

    while (x <= endx) {
      this.buffer[screenOffset + x] = finalColor;
      x++;
    }

And with that, we now draw our horizontal lines faster. You might wonder why we are not using the Array.prototype.fill() method in JavaScript instead of looping over each pixel. It turns out that the fill() method actually is slower than the loop! I don’t know why, but the profiling I have done gives consistent results: Using the loop above is more performant.

In the next section, we will take a deep dive into scan conversion again. We will (finally) start improving the visual quality of the animation - we will add support for subpixel precision so the triangles can rotate smoothly.

If you want, you can have a look at the demo app for this section - or, you can just jump directly to the next section.