Search Microcontrollers

Monday, September 22, 2014

FPGA - 7 Segments LEDs and ROM - M9K 2

What did just happen?
We were discussing about RAM an now we jump into 7 segment LEDs?

Yup, but no worries, we will be back to RAM in no time, we just needed this little detour because being able to use these devices in the FPGA board will help us in testing the code we write.

We need something visible after all, right?

My FPGA Board has a 6 digit 7 segments display and I plan to use it to show some Hex numbers.
You are probably already familiar with how these things work, if you are not, then read on.
Each digit is composed of 7 leds that are lightened up to form an image we can interpret as 0,1,2,3...
To be precise normally there are 8 leds for each digit: 7 compose the digit itself and an 8th is used for the decimal point (.) .

My board interfaces a 6 digit device in this way :

There are 8 outputs each one directed to a specific LED (or series of LEDs), plus 6 lines that activate the different digits.
In other words, all the top leds are connected together, activated by the same output, all the center ones the same way etc etc.
The 6 lines that activate specific digits are used to "scan" them.

Say we want to display the value "1" in the rightmost digit.
Then we will enable ONLY the line connected to tie rightmost digit, activate the top right and bottom right leds and wait for a few milliseconds.
Then we will deactivate the line we used (the digit will be switched off), activate the one connected to the next digit, turn on the leds we need to represent the second digit, wait for the same amount of milliseconds etc... rinse and repeat.
If the scan is fast enough, to our eyes it will look like all the digits are on at the same time.

From a FPGA perspecive a viable approach would be to have one module that scans the six lines with a given clock.
Another module will activate the needed segments depending on the digit that must be displayed.

We will then change the value to be displayed by running a counter or something similar.

Since we are using a HEX display (0..F), each digit will get exactly 4 bits of our counter.
Such counter will need to cover up to 6 hex digits -> 6x4 bits -> it will be stored in a 24 bit reg.

The demo module provided with my board (designed to display a decimal number, not hex) uses the following approach :

     _0 = 8'b1100_0000, 
     _1 = 8'b1111_1001, 
     _2 = 8'b1010_0100,
     _3 = 8'b1011_0000, 
     _4 = 8'b1001_1001, 
     _5 = 8'b1001_0010, 
        _6 = 8'b1000_0010, 
     _7 = 8'b1111_1000, 
     _8 = 8'b1000_0000,
     _9 = 8'b1001_0000;

reg [7:0]rOne_SMG_Data;
 always @ ( posedge CLK or negedge RSTn )
    if( !RSTn )
      rOne_SMG_Data <= 8'b1111_1111;
      case( One_Data )       
4'd0 : rOne_SMG_Data <= _0;
        4'd1 : rOne_SMG_Data <= _1;
4'd2 : rOne_SMG_Data <= _2;
4'd3 : rOne_SMG_Data <= _3;
4'd4 : rOne_SMG_Data <= _4;
4'd5 : rOne_SMG_Data <= _5;
4'd6 : rOne_SMG_Data <= _6;
4'd7 : rOne_SMG_Data <= _7;
4'd8 : rOne_SMG_Data <= _8;
4'd9 : rOne_SMG_Data <= _9;

Basically it defines 10 parameters with the bit patterns (active low) of the digits from 0 to 9, then it does a case statement and assigns the LEDs bus the values.
I reckon it works, but, also for the sake of experimenting, I would use a different approach.
My idea is to load the bit patterns in a ROM segment (16 x 8 bit words) and while displaying the digit, access the relevant ROM location to load the pattern.

But first we need to have the demo work the way it is, maybe a little bit re-engineered, later on we will switch those parameters to ROM locations.

The original solution (from the DVDs provided with my board) was coded with a lot of redundant code, i will not get to the details, let's just say it now works the same (in Hex instead than Decimal) with less than half of the original code.
In my book that's already a good improvement, plus it was a great exercise for me.
It is possible that there were good reasons for that, possibly explained in the docs and videos supporting the code, but hey, like Sandra Bullock would say: " No hablo Chino!"  :)

The module hierarchy is represented in the image above.
The top module just wires the demo_control and the smg_interface, the first one increments the counter and the second one displays it on the seven segment display.
In my implementation the smg_interface module, besides wiring the other 3, implements a 1MS clock (used to scan the digits) and calculates which digit (position) should be displayed at any time.
The smg_control module extracts the value of the needed digit like this :

always @ ( posedge CLK or negedge RSTn )
    if( !RSTn )
  rNumber <= 4'd0;
     case( Scan_line )
      0:rNumber <= Number_Sig[23:20];
         1:rNumber <= Number_Sig[19:16];
      2:rNumber <= Number_Sig[15:12];
      3:rNumber <= Number_Sig[11:8];
      4:rNumber <= Number_Sig[7:4];
      5:rNumber <= Number_Sig[3:0];

The scan_module activates one line at a time :

always @ ( posedge CLK or negedge RSTn )
   if( !RSTn )
rScan = 6'b111111;
        rScan <= ~(1 << (5-Scan_line ));  
assign Scan_Sig = rScan;

and finally the encode one activates the led segments for the current digit

 parameter _0 = 8'b1100_0000, _1 = 8'b1111_1001, _2 = 8'b1010_0100, 
   _3 = 8'b1011_0000, _4 = 8'b1001_1001, _5 = 8'b1001_0010, 
   _6 = 8'b1000_0010, _7 = 8'b1111_1000, _8 = 8'b1000_0000,
   _9 = 8'b1001_0000, _a = 8'b1000_1000, _b = 8'b1000_0011,
   _c = 8'b1100_0110, _d = 8'b1010_0001, _e = 8'b1000_0110,
  _f = 8'b1000_1110;  
reg [7:0]rSMG;

always @ ( posedge CLK or negedge RSTn )
    if( !RSTn )
       rSMG <= 8'b1111_1111;
       case( Number_Data )         
 4'd0 : rSMG <= _0;
 4'd1 : rSMG <= _1;
 4'd2 : rSMG <= _2;
 4'd3 : rSMG <= _3;
 4'd4 : rSMG <= _4;
 4'd5 : rSMG <= _5;
 4'd6 : rSMG <= _6;
 4'd7 : rSMG <= _7;
 4'd8 : rSMG <= _8;
 4'd9 : rSMG <= _9;  
 4'd10 : rSMG <= _a;  
 4'd11 : rSMG <= _b;  
 4'd12 : rSMG <= _c;  
 4'd13 : rSMG <= _d;  
 4'd14 : rSMG <= _e;  
                  4'd15 : rSMG <= _f;

The result is visible in the video below.

What we now need to change is the encode module, so that it reads the segments from memory and before that we need to create a ROM memory and populate it with the segment data.

Megawizard plugin allows to create a 16 x 8 bit ROM and populate it with a mif file (see previous post, same steps, but this time I chose 1 Port ROM instead of RAM).

Then we simply substitute the encode module with the newly created rom module in smg_interface:

// remove the old one 
    /*smg_encode_module U2
    .CLK( clk2 ),
     .RSTn( RSTn ),
     .Number_Data( Number_Data ),   // input - from U2
     .SMG_Data( SMG_Data )          // output - to top

// add the new one 
digit_rom rom(

Can it get any simpler than that?
And yes, it works of course, will not upload a video proof as it is boringly identical to the previous, but it really works!

Sunday, September 21, 2014

FPGA - Using RAM Memory - M9K Blocks 1

We all know that MCUs are easier than normal CPUs because pretty much all you need for an application is "included", particularly a certain amount of RAM.

FPGAs are no different, in fact most of them include a number of ram blocks (called M9K in Altera Cyclone IV devices, or M4K in Cyclone II ones... not sure about the other families).

(available memory blocks by family - From Altera's documentation)

If you want to implement a CPU within your FPGA (typically a NIOS II, but any CPU would have the same requirement), you need some ram to store the code to be executed and eventual data.

Since FPGAs have a high count of I/O pins, you can dedicate a few of them to interface an external memory chip or you can use the internal one.

Altera M9K memory blocks are 9KBit memory blocks that can be organized in different ways (i.e. in 9 bit words x 1K cells).
The motivating reason to have 9 bits instead of 8 is to have an additional bit for parity control.

Different devices have a different amount of M9K blocks, the following table lists the specs of the Cyclone IVE family.

  ( from Altera's website )

One interesting feature of these memory blocks is that they are Dual Ported.
This means that they can be configured in a way that allows reading and writing from different addresses and data busses.
Does it matter?
I guess sometimes it does, think about a VGA interface, like the one we described in the previous post.

The VGA implementation we did, did not use memory, it was simply calculating the RGB components based on the X/Y coordinates, using a binary formula.
For practical uses you will hardly use that approach, instead you will probably have a process that writes data in a memory bank where memory locations represent pixels and all together form an image.
Another, concurrent, process will read sequentially the memory bank and use the content to drive the color components.
Technically this means one process will write in some random location L(x1,y1) while another will read from L(x2,y2) at the same time.
If the two processes can use separate ports, the problem is solved!

Note : I cannot compare the quality of Altera's documentation with the one provided by competitors since I only used Altera's when dealing with FPGAs, I can only say it is GREAT.
There is a ton of useful documentation, examples, tutorials, online training etc available on the Altera website.

Here I found an example implementation of a true dual port memory block :

module true_dp_ram (

input [3:0]  address_a;
input [3:0]  address_b;
input clock;
input [12:0]  data_a;
input [12:0]  data_b;
input rden_a;
input rden_b;
input wren_a;
input wren_b;
output [12:0]  q_a;
output [12:0]  q_b;


As you can see this example implements two 4 bits addresses, two separate 13 bits inputs and two 13 bits outputs.
Additional inputs are the clock (RAM needs one) and the read enable / write enable for each port.
How cool is that?!!

Obviously you can trim down your requirements and work with a simpler 1 Port interface if this is ok with your design.

Once you create a memory block, you can specify an initialization file in Quartus II, meaning that your memory could be populated at reset.
Problem is that to actually test if the memory works, we need to provide some kind of output, so we will use in our first example just a few LEDs which will represent the bits of the memory locations we will scan (slowly, so we can see them change).

A common way to provide init data is through a MIF file, follow this link to see the specs.

The MegaWizard plugin manager comes handy here :

A simple 1 Port ram here, for this test

I am planning to use 4 LEDs to output the values, so it is handy to have 4 bits wide words, I chose an arbitrary length of 32, should be more than enough to verify my pattern.

Finally I created a MIF file and loaded it here (it is possible to update it later, even without recompiling the project).

The result is a new module which looks like this :

module ram_module (

input [4:0]  address;
input clock;
input [3:0]  data;
input wren;
output [3:0]  q;

As you can see it created a 5 bit address (we have 32 locations), and two registers for input and output, 4 bit wide each as requested.
Wren will be used to write data, meaning that for this initial test it will not be used at all (so ROM memory would have produced the same result after all :) )

Ok, so now we need a top module that cycles the ram buffer and outputs the content to the leds.

module m9ktest_top(leds ,clock, reset );

output reg[3:0] leds;
input clock;
input reset;

reg  write;
reg [4:0] addr;
reg [3:0] datain;
wire [3:0] dataout;
reg ramclk;

// instantiate and connect the ram buffer

ram_module buffer

reg [31:0] counter; //24 bits would have been enoguh
wire CounterMaxed = (counter==32'hFFFFFF);

always @ (posedge clock or negedge reset)// on positive clock edge
 if (!reset)
 begin // initial conditions
  addr <= 5'b0;
  ramclk <= 1'b0;
  counter <= 32'h0;
  write <= 1'b0;
end else if(CounterMaxed)
 counter <= 32'b0;
 addr <= addr + 5'b1; // set address
 write <= 1'b0;    // disable write
 ramclk <= 1'b1;   // clock the ram
 leds <= dataout;  // read ram location 
 if (addr == 5'b11111)
  addr <= #1 5'b0;
// just delay, my eyes are not fast enough for MHz range!
  ramclk <= 1'b0;
  counter <= counter + 32'b1;// increment counter

I know, it is not really elegant and I am sure it can be coded much better, bear with me, this is still one of my first Verilog experiments.
If you have suggestion / corrections, tho, make sure to post them in the comments!

So, what to say about the code?
It is pretty self explanatory, actually the particular thing is that I am clocking the ram only when I need to read it (is it ok? Best practices? not sure there)
You will notice there is a single always block that is sensitive to both reset and main clock, this is because in Verilog you cannot manipulate the same signal from two different processes.
Makes sens if you think about it : altering the same signal in two independent processes would result in unpredictable results, after all.

Ah, it works by the way :)
Honestly I was really surprised it did work immediately, first attempt... guess I just got lucky, but that wouldn't  have been possible without the great docs I already mentioned in this post, kudos to Altera for that one.