A Series of Epic Fails…

OldDogBlog post 7.

In the last post I made the decision to use an Intel MAX10M08 FPGA for the ezPixel design. Before starting that PCB design, however, I figured it was prudent to do some prototyping in the FPGA to ascertain whether I could fit as much into the FPGA as I was hoping.

FPGA RAMs can be configured as “simple” dual port RAMs, i.e. one side is write only and the other side is read only – Figure 1. This would conveniently allow one side of the RAM to be written to by the host interface circuitry, while the read side of the RAM could be accessed by the string driving circuits. FPGA RAMs tend to be synchronous, i.e. clocked, and thus have a clock input on each side.

Figure 1: Simple Dual Port RAM.

Without putting too much more thought into it, I designed a quick circuit that had a dual port RAM connected to a “String Engine” circuit. The circuit would be easy to replicate, one per string.

I should have thought about it a bit more… I got it designed and operational – Figure 2 – but couldn’t fit anywhere near as many strings as I had expected before running out of routing resources. It turns out that having a separate RAM for each string was a poor choice, as was using them in “simple” dual port mode. I had enough registers to easily fit more strings, but connecting the host interface to the separate RAMs required a lot of routing resources. Similarly, multiplexing the RAM outputs back to the host interface used even more routing resources. We’ll label this iteration “Epic Fail #1.”

Figure 2: Epic Fail #1.

After spending a little more time thinking about it, I concluded that using one large RAM in “true” dual port mode would be a better choice, as it would allow RD/WR on both sides of the RAM. This provides full RD/WR access from the host interface side, solving the read access problem and also reducing routing. Another way to save resources would be to use one common string engine and multiplex its serial output to each of the connected strings. With all the obvious problems solved, I jumped into design iteration #2.

I should have thought about it a bit more…

In this configuration – Figure 3 – the strings are updated sequentially, which ultimately limits how fast any individual string can be updated. String one was written, then string two, etc. Worked like a champ! However, each pixel requires 30 uSec to write the R-G-B byte values. For a string of 240 pixels it takes 7.2 mSec to write the color values, and 230.4 mSec to write 32 strings. This is a pathetic 4.34 Hz refresh rate. D’oh! Cue the sad music. We’ll label this iteration “Epic Fail #2.”
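The arithmetic behind that refresh rate is easy to check. The snippet below is only a back-of-the-envelope sketch using the numbers from this post (30 uSec per pixel, 240 pixels, 32 strings); the function and constant names are made up for illustration and are not part of the design.

```python
# Back-of-the-envelope check of the sequential (Epic Fail #2) refresh rate.
# All names here are illustrative, not taken from the actual FPGA design.

PIXEL_TIME_S = 30e-6  # time to write one pixel's R-G-B bytes, per the post


def sequential_refresh_hz(pixels_per_string, num_strings,
                          pixel_time_s=PIXEL_TIME_S):
    """Refresh rate when one engine writes every string back to back."""
    frame_time_s = pixels_per_string * num_strings * pixel_time_s
    return 1.0 / frame_time_s


if __name__ == "__main__":
    string_time_ms = 240 * PIXEL_TIME_S * 1e3
    frame_time_ms = string_time_ms * 32
    print(f"one string : {string_time_ms:.1f} mSec")
    print(f"32 strings : {frame_time_ms:.1f} mSec")
    print(f"refresh    : {sequential_refresh_hz(240, 32):.2f} Hz")
```

Because the frame time scales with the number of strings, every string added to this architecture makes every other string slower.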

Figure 3: Epic Fail #2.

I needed an architecture with all the advantages of common RAM while drastically improving string update rates. Combining parts of each Epic Fail ultimately solved the problem. I separated the common String Engine into a single String Feeder connected to individual String Engines – Figure 4. The String Feeder pulls RGB values out of the pixel RAM at high speed and writes them to each String Engine. It cycles through all the attached strings and waits for the RGB bytes to be written serially to each string before feeding the next batch of RGB bytes to the String Engines. Now a string of 240 pixels can update at the max rate supported by the WS2812B, i.e. once every 7.2 mSec + 50 uSec gap time = 7.25 mSec, or approx. 138 Hz. Winner-winner, chicken dinner! We’ll label this iteration “ezPixel.”
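For comparison with the sequential design, here is a rough sketch of the same calculation when every string has its own String Engine and all strings are written at once. It assumes the 30 uSec pixel time and 50 uSec gap time quoted above, and it ignores any overhead in the String Feeder itself; the names are hypothetical.

```python
# Rough refresh-rate estimate for the parallel (ezPixel) architecture.
# Assumes all strings are driven simultaneously, so frame time depends only
# on the longest string. Names are illustrative, not from the real design.

PIXEL_TIME_S = 30e-6  # time to write one pixel's R-G-B bytes, per the post
GAP_TIME_S = 50e-6    # gap time between frames, per the post


def parallel_refresh_hz(longest_string_pixels,
                        pixel_time_s=PIXEL_TIME_S,
                        gap_time_s=GAP_TIME_S):
    """Refresh rate when all strings are written in parallel."""
    frame_time_s = longest_string_pixels * pixel_time_s + gap_time_s
    return 1.0 / frame_time_s


if __name__ == "__main__":
    for pixels in (240, 288):
        rate = parallel_refresh_hz(pixels)
        print(f"{pixels} pixels per string: {rate:.1f} Hz")
```

The number of strings no longer appears in the formula at all, which is the whole point of giving each string its own engine.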

The initial fit tests show the String Feeder using ~130 registers and each String Engine using ~40 registers. The pixel RAM is shown as one big block, but in reality it is three separate RAMs, one for each color. The three RAMs use common read and write addresses, so they act as one big RAM. I decided to try for 9KB per RAM, or 9216 byte locations for each color. That is the number of pixels in 32 strings of 288 pixels each (32 x 288 = 9216). The 288 comes from a 2 meter length of string with 144 pixels/meter.

The configuration RAM allows the user to set the size of each string independently. Strings can all be the same length or can be of different lengths, as long as the total number of pixels doesn’t exceed 9216. The pixel and config RAMs use 28 of the 42 RAMs in the 10M08 device. The remaining RAMs can be used as program memory for a NIOS processor to act as a host interface.
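To make the length rule concrete, here is a small sketch of how per-string lengths could map onto the shared pixel RAM. The post does not describe the actual config RAM layout, so this assumes the simplest possible scheme: strings packed back to back starting at address 0. The function name and the packing scheme are both assumptions for illustration only.

```python
# Hypothetical mapping of per-string lengths onto the shared pixel RAM.
# ASSUMPTION: strings are packed contiguously from address 0. The real
# config RAM format is not described in the post.

PIXEL_RAM_DEPTH = 9216  # pixel locations per color RAM, per the post


def string_base_addresses(lengths, depth=PIXEL_RAM_DEPTH):
    """Return the starting pixel address of each string.

    Raises ValueError if the strings together need more pixels than
    the pixel RAM holds.
    """
    total = sum(lengths)
    if total > depth:
        raise ValueError(f"{total} pixels requested, only {depth} available")
    bases = []
    next_free = 0
    for length in lengths:
        bases.append(next_free)
        next_free += length
    return bases


if __name__ == "__main__":
    # 32 equal strings of 288 pixels fill the RAM exactly.
    print(string_base_addresses([288] * 32)[-1])
    # Mixed lengths are fine as long as the total fits.
    print(string_base_addresses([100, 50, 200]))
```

Since the three color RAMs share addresses, one base address per string would cover all three colors under this scheme.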

Figure 4: ezPixel

I’m feeling confident that I can comfortably fit what I need into the 10M08 device. Before proceeding to a board schematic and PCB layout, I need to put on my architect hat…


Tom Burke

