Lab 4: TinyRV1 Processor
Part E: FPGA Prototype v2

Lab 4 will give you experience designing, implementing, testing, and prototyping a single-cycle processor microarchitecture and a specialized accelerator. The processor will implement the TinyRV1 instruction set. The instruction set manual is located here:

https://cornell-ece2300.github.io/ece2300-mkdocs/ece2300-tinyrv1-isa

The lab will continue to leverage concepts from Topic 2: Combinational Logic, Topic 3: Boolean Algebra, Topic 4: Combinational Building Blocks, Topic 6: Sequential Logic, Topic 7: Finite-State Machines, and Topic 8: Sequential Building Blocks. The lab will also leverage concepts from Topic 9: Instruction Set Architecture and Topic 10: Single-Cycle Processors. The lab will continue to provide opportunities to leverage the three key abstraction principles: modularity, hierarchy, and regularity.

The lab includes seven parts:

Part A: Processor Components
- Due 11/6 @ 11:59pm via GitHub
- Students should work on Part A before, during, and after your assigned lab section during the week of 11/3
- Pre-lab survey on Canvas is (roughly) due by end of lab section during the week of 11/3
Part B: TinyRV1 Processor
- Due 11/13 @ 11:59pm via GitHub
- Students should work on Part B before, during, and after your assigned lab section during the week of 11/10
Part C: Accumulate Accelerator
- Due 11/25 @ 11:59pm via GitHub
- Students should plan to submit Part C before they leave for Thanksgiving Break
Part D: FPGA Prototype v1
- Due week of 11/17 during assigned lab section
- This part will focus on prototyping the code developed in Part A+B
- Even though completed with a partner, every student must turn in their own paper check-off sheet in their lab section!
Part E: FPGA Prototype v2
- Due week of 12/1 during assigned lab section
- This part will focus on prototyping the code developed in Part A+B+C
- Even though completed with a partner, every student must turn in their own paper check-off sheet in their lab section!
Part F: TinyRV1 Assembly
- Due 12/4 @ 11:59pm via GitHub
- This part will include all of the assembly developed during Part D+E
Part G: Report
- Due on 12/8 at 11:59pm for all groups!
- Post-lab survey on Canvas is due at the same time as the report

This handout assumes that you have read and understand the course tutorials and that you have attended the discussion sections. This handout assumes you have successfully completed Parts A, B, and C, meaning your processor, memory bus, SPI, and accumulate assembly program are all working in simulation.

What do we do if Parts A, B, and C are not done?

If Parts A and B are not done then you cannot get started on Part E. Based on our experiences in the very first lab section, we have decided to skip prototyping the accumulate accelerator so we can focus on the processor in this lab. If your accelerator is not working, please continue to work on fixing it and submit your revision by this Thursday night.

Here are the steps to get started:

Step 1. Find your lab partner
Step 2. Find a free workstation
Step 3. Ask the TAs for a lab check-off sheet (each student needs their own check-off sheet)

For each lab report task you must take some notes, save a screenshot, and/or record some data for your lab report. The lab report is not due until the last day of classes.

For each lab check-off task you must raise your hand and have a TA come to check off your work. The TA will ask you the questions included as part of the lab check-off task and then assess your understanding using the following rubric: mastery; accomplished; emerging; beginning. If the TA and students together feel the students have not mastered the lab check-off task, the students are encouraged to take a few minutes and try again.

The Final Lab

This is the final lab of the semester. It will require students to leverage all they have learned throughout the semester. Students will need to:

use iverilog to simulate various designs on ecelinux;
use Quartus to integrate, analyze, synthesize, and configure an FPGA prototype;
write, assemble, and load assembly programs for their TinyRV1 processor;
wire various electrical components on a breadboard; and
use the oscilloscope to visualize electrical signals.

If students have not mastered these skills, they might need to revisit material from previous lab assignments. Students must maintain a sense of urgency and leverage all they have learned in order to complete the lab assignment. There are no extensions and students cannot complete Part E at any other time except during their assigned lab section.

Lab Check-Off Task 1: Setup Lab Kit

The TAs will pass out an ECE 2300 Lab Kit to each group. The TAs will record the kit number on your check-off sheet. For this lab, you will receive the FPGA board, a USB-B cable, a USB-C cable, and a component box with some jumper wires, a piezo buzzer, a digital distance sensor, and a USB flash drive. Use the USB-B cable to plug the FPGA board into the workstation.

1. Simulate the Single-Cycle TinyRV1 Processor

Before starting to work on an FPGA prototype, you must make sure you have a working Verilog hardware design that has been thoroughly tested in simulation. We will be rerunning the same test simulations and using the same interactive simulators as last week to ensure our accumulate assembly program is fully functional. One student should start VS Code on the workstation, log into the ecelinux servers, source the setup script, and make sure their group repository is up to date.

% source setup-ece2300.sh
% cd ${HOME}/ece2300/groupXX
% git pull
% tree

Where XX is your group number.

1.1. Verify All Four Labs

Now run all of the tests from a clean build to ensure your design is fully functional.

% cd ${HOME}/ece2300/groupXX
% trash build
% mkdir build
% cd build
% ../configure
% make check

Lab Check-Off Task 2: Verify All Four Labs

Show a TA that your hardware designs are passing all of your tests. You must pass all of the tests from Lab 1, Lab 2, and Lab 3. You must also pass all of the Lab 4 Part A and B tests. You can still receive full credit if you are failing some of the tests from Part C as long as you explain your plan for fixing the accelerator by the Thursday deadline.

1.2. Verify Accumulate Assembly Program

We now want to run your accumulate program on the ISA simulator and the single-cycle processor simulator. Start by assembling your accumulate assembly program into a machine program.

% cd ${HOME}/ece2300/groupXX/build
% make proc-isa-sim
% make accumulate.bin

Recall from Part C that the interactive simulator for the accumulate accelerator is using the following array loaded into the memory.

addr  data  size  result result seven
(hex) (dec) (dec) (dec)  (hex)  segment
---------------------------------------
000   36     1       36  0x024    4
004   26     2       62  0x03e   30
008   69     3      131  0x083    3
00c   57     4      188  0x0bc   28
010   11     5      199  0x0c7    7
014   68     6      267  0x10b   11
018   41     7      308  0x134   20
01c   90     8      398  0x18e   14

020   32     9      430  0x1ae   14
024   76    10      506  0x1fa   26
028   44    11      550  0x226    6
02c   19    12      569  0x239   25
030   17    13      586  0x24a   10
034   59    14      645  0x285    5
038   99    15      744  0x2e8    8
03c   49    16      793  0x319   25

040   65    17      858  0x35a   26
044   12    18      870  0x366    6
048   55    19      925  0x39d   29
04c    0    20      925  0x39d   29
050   51    21      976  0x3d0   16
054   42    22     1018  0x3fa   26
058   82    23     1100  0x44c   12
05c   23    24     1123  0x463    3

060   21    25     1144  0x478   24
064   54    26     1198  0x4ae   14
068   83    27     1281  0x501    1
06c   31    28     1312  0x520    0
070   16    29     1328  0x530   16
074   76    30     1404  0x57c   28
078   21    31     1425  0x591   17
07c    4    32     1429  0x595   21

The above table shows the address, data, and the expected result when accumulating every possible size up to 32 elements. Now build the ISA simulator and run the accumulate machine program on the ISA simulator using the text user interface (TUI).

% cd ${HOME}/ece2300/groupXX/build
% make proc-isa-sim
% ./proc-isa-sim +bin=accumulate.bin +tui

Remember once you start the simulation you can use the following commands to emulate entering the size on the switches (i.e., input in0) and pressing the button (i.e., setting input in2 to one for a few cycles, then setting input in2 back to zero).

/in0=4
/in2=1
/20
/in2=0
/200

Since in0 is 4 we find the row in the table when the size is 4. The final sum should be 188 which is 0x0bc in hex. Verify in the TUI that the final result is indeed 0x0bc.
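If you want to cross-check a row of the table without reading it by eye, a short script can recompute it. The following is only a sketch: the data values are copied from the table above, the function name expected_row is made up for this example, and the last value assumes the seven-segment displays show only the low five bits of the result.

```python
# Array values copied from the "data (dec)" column of the table above.
DATA = [
    36, 26, 69, 57, 11, 68, 41, 90,
    32, 76, 44, 19, 17, 59, 99, 49,
    65, 12, 55,  0, 51, 42, 82, 23,
    21, 54, 83, 31, 16, 76, 21,  4,
]

def expected_row(size):
    """Return (sum, hex string, seven-segment value) for a given size.

    expected_row is a hypothetical helper for this handout, not part of
    the course scripts.
    """
    total = sum(DATA[:size])
    # Assumption: the display shows only the low five bits of the result.
    return total, f"0x{total:03x}", total & 0x1F

if __name__ == "__main__":
    for size in (4, 31):
        print(size, expected_row(size))
```

For a size of 4 this gives a sum of 188, which is 0x0bc, matching the row used above.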

The ISA simulator will display the total number of cycles in the lower right corner. Recall that our accumulate assembly program looks like this:

# set breadboard pin high for timing

addi x3, x0, 1
sw   x3, 0x21c(x0)

#''' LAB ASSIGNMENT ''''''''''''''''''''''''''''''''''''''''''''''''''''
# Write your accumulate loop
#'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
# Be sure to understand the code above and below so you know what
# register stores the size and what register stores the result.

# set breadboard pin low for timing

sw   x0, 0x21c(x0)

Recall that memory address 0x21c is the memory-mapped IO address for out3. So our assembly program includes instructions to set out3 to one before the accumulate loop and set out3 to zero after the accumulate loop. The ISA simulator monitors the number of cycles that out3 is one; we call these "stat" cycles (for performance statistics). The ISA simulator also reports the number of "stat" cycles in the lower right hand corner. Make sure the number of stat cycles makes sense given the number of instructions in each iteration of your accumulate loop and the number of iterations executed by the loop.
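As a rough sanity check on the stat cycle count, the sketch below assumes every instruction takes exactly one cycle and that your loop executes the same number of instructions on every iteration. The function name, the overhead parameter, and the example of five instructions per iteration are all hypothetical; substitute the numbers from your own loop, and use the overhead to account for any instructions between the two stores that are not part of the loop body.

```python
def expected_stat_cycles(instrs_per_iter, iterations, overhead=0):
    """Estimate the cycles spent while out3 is one.

    Assumption: one cycle per instruction. The overhead argument covers
    any instructions outside the loop body that still execute while
    out3 is one (this depends on your program and on how the simulator
    counts the two stores).
    """
    return instrs_per_iter * iterations + overhead

if __name__ == "__main__":
    # Hypothetical loop: 5 instructions per iteration, 31 iterations.
    print(expected_stat_cycles(5, 31))
```

If the number reported by the simulator differs from your estimate by a small constant, check the overhead; if it differs by a multiple of the iteration count, recount the instructions in your loop.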

Now build the single-cycle processor simulator and run the accumulate machine program on this simulator using the text user interface (TUI).

% cd ${HOME}/ece2300/groupXX/build
% make proc-scycle-sim
% ./proc-scycle-sim +bin=accumulate.bin +tui

Just like on the ISA simulator, emulate entering a size of 4 on the switches and pressing the button. Verify that the final result is indeed 0x0bc.

Lab Check-Off Task 3: Verify Accumulate Assembly Program

Show a TA your accumulate assembly program correctly executing on the ISA simulator with a size of 31. Clearly explain to the TA why the number of stat cycles makes sense given your specific accumulate loop. Then show the TA your accumulate assembly program correctly executing on the single-cycle processor simulator.

1.3. Copying Files to Workstation

We now need to get the files for your design from ecelinux onto the workstation. This requires multiple steps.

Step 1. Click Microsoft Edge on the desktop to open a web-browser on the workstation to log into GitHub and then find your repository
Step 2. Start PowerShell by clicking the Start menu then searching for Windows PowerShell
Step 3. Use the following command to change to your home directory on the workstation in the lab (where netid is your Cornell NetID)

% cd C:\Users\netid

Step 4. Clone your repo onto the workstation by using this command in PowerShell (where XX is your group number; notice we are using https!):

% git clone https://github.com/cornell-ece2300/groupXX

Step 5. In the Connect to GitHub pop-up, click Sign in with your browser
Step 6. You may be asked for your GitHub username again and you may be asked to authorize the Git Credential Manager; click authorize git-ecosystem
Step 7. Verify that you have successfully cloned your repo by changing into your repo and using tree on the workstation:

% cd groupXX
% tree

2. Setup Quartus Project

Click Quartus (Quartus Prime 18.1) on the desktop to start Quartus. Important: Ensure that the Quartus version is 18.1 and not 23.1. Then, click Run the Quartus Prime software. You might need to try starting Quartus twice. Set up a new Quartus project using the New Project Wizard:

Directory, Name, Top-Level Entity
- You must enter the working directory as follows with your NetID!
- Working directory: C:\Users\netid\lab4e
- Name of this project: lab4e
- Name of top-level design entity: lab4e
- Click Next
Directory does not exist. Do you want to create it?
- Click yes
Project Type
- Choose Empty Project
- Click Next
Add Files
- Click User Libraries...
- Click triple dots to the right of Project library name
- Click on This PC, then navigate to your cloned repo by choosing Windows (C:) > Users > netid > groupXX where XX is your group number
- Click Select Folder
- Click Add
- Click OK
- Click triple dots to right of File name
- Click on This PC, then navigate to your cloned repo by choosing Windows (C:) > Users > netid > groupXX > lab4 where XX is your group number
- Shift-click on every Verilog hardware design file (do not include any files in the test or sim subdirectories)
- Click Open
- Click Next
Family, Device, and Board Settings
- Click Board tab
- Family: Cyclone V
- Select DE0-CV Development Board
- Make sure Create top-level design file is checked
- Click Next
EDA Tool Settings
- Click Next
Summary
- Click Finish

As in previous labs, you must use the following steps to ensure Quartus knows your design includes RTL modeling:

Choose Assignments > Settings from the menu
Select the category Compiler Settings > Verilog HDL Input
Under Verilog version click SystemVerilog
Click OK

3. TinyRV1 Processor FPGA Prototype

We will now integrate your TinyRV1 single-cycle processor into a complete embedded system, synthesize the design, and configure the FPGA so we can run machine programs on the real prototype. We will use the TinyRV1 processor FPGA prototype to measure the delay of your accumulate assembly program and to implement a door monitoring embedded system.

3.1. Integrate

We want to implement a TinyRV1 processor FPGA prototype with the following specification:

Input to clock divider is connected to the 50MHz clock on the FPGA board
rst (ACTIVE LOW!) is connected to the reset button on the board using a synchronizer
The left most push button will be the go button (see below)
The left five switches are in0
The right five switches are in1
The right three push buttons are the three least significant bits of in2 (ACTIVE LOW!)
The two seven-segment displays on the left are out0
The two seven-segment displays in the middle are out1
The two seven-segment displays on the right are out2
The left most LED shows the go bit
The right eight LEDs are for out3
The general-purpose pins will be used to connect to the SPI
- SCLK connects to GPIO_1[1] which connects to D0 on USB-to-SPI adapter
- MOSI connects to GPIO_1[3] which connects to D1 on USB-to-SPI adapter
- MISO connects to GPIO_1[5] which connects to D2 on USB-to-SPI adapter
- CS connects to GPIO_1[7] which connects to D3 on USB-to-SPI adapter
The general-purpose pins will also be used with the oscilloscope and two new external I/O devices
- The least significant bit of out3 connects to GPIO_1[11] which will be probed by the oscilloscope
- The distance sensor connects to GPIO_1[13]
- The output of the multi-note player connects to GPIO_1[15]

Here is a block diagram and annotated FPGA board and breadboard diagram illustrating the system we will be prototyping for Part E.

The differences between this system and what you implemented in the previous lab are highlighted in red on the block diagram and include two new external I/O devices.

Distance Sensor

We will be using a digital distance sensor in our door monitoring embedded system. The digital distance sensor looks like this:

The sensor uses light detection and ranging (LIDAR) to measure distance by emitting laser pulses and measuring the time it takes for them to bounce back. You can read more about the sensor here:

https://www.pololu.com/product/4069

The sensor has three pins:

Ground marked with a negative (-) sign
Vdd marked with a plus (+) sign
Output marked as out

You will need to connect the Vdd and ground pins appropriately on the breadboard, and connect the output to the correct general-purpose I/O pin. Make sure to point the LIDAR towards the edge of the breadboard and that no wires obstruct the LIDAR.

The sensor is preconfigured to detect objects within 100cm. This makes it a digital distance sensor, meaning the output will be either one or zero based on whether or not the sensor detects an object. The sensor is ACTIVE LOW so the output is one when there is no object detected within 100cm and the output is zero when there is an object detected within 100cm.

Multi-Note Player

We will be integrating your multi-note player from Lab 3 as an external output device for our door monitoring embedded system. The note output from the multi-note player will be connected to a piezo buzzer using a general-purpose I/O pin just as in Lab 3. The play_note_rdy signal is connected back to the play_note_val through a DFFR. This means the multi-note player will continuously play whatever note is specified using play_note_num. The play_note_num input is connected to out3, so to play a note from an assembly program we can simply store the desired note to the appropriate memory-mapped I/O address. For example, the following assembly fragment plays the note 3:

addi x1, x0, 3
sw   x1, 0x21c(x0)

Here is a template you can use for your top-level design.

//----------------------------------------------------------------------
// Clock Divider
//----------------------------------------------------------------------
// You need to connect the ports to the appropriate top-level ports.

logic clk;

ClockDiv_RTL clock_div
(
  .clk_in     (/* fill this in */),
  .divide_sel (1),
  .clk_out    (clk)
);

//----------------------------------------------------------------------
// Reset Synchronizer
//----------------------------------------------------------------------
// You need to connect the ports to the appropriate top-level ports.

logic rst;

Synchronizer_RTL reset_sync
(
  .clk (clk),
  .d   (/* fill this in */), // REMEMBER: Reset port is active low!
  .q   (rst)
);

//----------------------------------------------------------------------
// DFFRE for go bit
//----------------------------------------------------------------------
// You need to connect the ports to the appropriate top-level ports.

logic go;

DFFRE_RTL go_reg
(
  .clk (clk),
  .rst (rst),
  .d   (1'b1),
  .en  (/* fill this in */), // REMEMBER: Push buttons are active low!
  .q   (go)
);

assign /* fill this in */ = go;

//----------------------------------------------------------------------
// Processor
//----------------------------------------------------------------------

logic        imem_val;
logic        imem_wait;
logic [31:0] imem_addr;
logic [31:0] imem_rdata;

logic        dmem_val;
logic        dmem_wait;
logic        dmem_type;
logic [31:0] dmem_addr;
logic [31:0] dmem_wdata;
logic [31:0] dmem_rdata;

logic        trace_val;
logic [31:0] trace_addr;
logic [31:0] trace_data;
logic        trace_wen;
logic [4:0]  trace_wreg;
logic [31:0] trace_wdata;

ProcScycle proc
(
  .imem_wait (~go),
  .dmem_wait (~go),
  .*
);

//----------------------------------------------------------------------
// Memory Bus
//----------------------------------------------------------------------

logic        hmem_val;
logic [31:0] hmem_addr;
logic [31:0] hmem_wdata;

logic        mem0_val;
logic        mem0_wait;
logic        mem0_type;
logic [31:0] mem0_wdata;
logic [31:0] mem0_addr;
logic [31:0] mem0_rdata;

logic        mem1_val;
logic        mem1_type;
logic [31:0] mem1_addr;
logic [31:0] mem1_wdata;
logic [31:0] mem1_rdata;
logic        mem1_wait;

logic [31:0] in0, in1, in2, in3;
logic [31:0] out0, out1, out2, out3;

MemoryBus_RTL memory_bus
(
  .*
);

//----------------------------------------------------------------------
// Physical Memory
//----------------------------------------------------------------------

Memory physical_memory
(
  .*
);

//----------------------------------------------------------------------
// SPI
//----------------------------------------------------------------------
// You need to connect sclk, mosi, miso, and cs to the appropriate
// top-level ports.

logic sclk;
logic mosi;
logic miso;
logic cs;

assign sclk = /* fill this in */;
assign mosi = /* fill this in */;
assign miso = /* fill this in */;
assign cs   = /* fill this in */;

SPI_RTL minion
(
  .*
);

//----------------------------------------------------------------------
// External Inputs
//----------------------------------------------------------------------
// You need to connect the ports to the appropriate top-level ports.

assign in0 = { 27'b0, /* fill this in */ };
assign in1 = { 27'b0, /* fill this in */ };

// REMEMBER: The push buttons are active low!
assign in2 = { 29'b0, /* fill this in */ };

// REMEMBER: The distance sensor is active low!
assign in3 = { 31'b0, /* fill this in */ };

//----------------------------------------------------------------------
// External Outputs
//----------------------------------------------------------------------
// You need to connect the ports to the appropriate top-level ports.

DisplayOpt_GL display_out0
(
  .in       (out0[4:0]),
  .seg_tens (/* fill this in */),
  .seg_ones (/* fill this in */)
);

DisplayOpt_GL display_out1
(
  .in       (out1[4:0]),
  .seg_tens (/* fill this in */),
  .seg_ones (/* fill this in */)
);

DisplayOpt_GL display_out2
(
  .in       (out2[4:0]),
  .seg_tens (/* fill this in */),
  .seg_ones (/* fill this in */)
);

assign /* fill this in */ = out3[7:0];
assign /* fill this in */ = out3[0];

//----------------------------------------------------------------------
// Multi-Note Player Clock Divider
//----------------------------------------------------------------------
// You need to connect the ports to the appropriate top-level ports.

logic mnp_clk;

ClockDiv_RTL mnp_clock_div
(
  .clk_in     (/* fill this in */),
  .divide_sel (9),
  .clk_out    (mnp_clk)
);

//----------------------------------------------------------------------
// Instantiate the multi-note player
//----------------------------------------------------------------------
// You need to connect the ports to the appropriate top-level ports.

logic play_note_val;
logic play_note_rdy;

DFFR_RTL mnp_val_reg
(
  .clk (clk),
  .rst (rst),
  .d   (play_note_rdy),
  .q   (play_note_val)
);

logic [2:0] note_sel;
logic note;

MultiNotePlayer_RTL player
(
  .clk           (mnp_clk),
  .rst           (rst),

  .note_duration (16'd512),
  .note1_period  (8'd123),
  .note2_period  (8'd109),
  .note3_period  (8'd97),
  .note4_period  (8'd91),
  .note5_period  (8'd81),
  .note6_period  (8'd72),
  .note7_period  (8'd68),

  .play_note_val (play_note_val),
  .play_note_rdy (play_note_rdy),
  .play_note_num (out3[2:0]),


  .note_sel      (note_sel),
  .note          (note)
);

assign /* fill this in */ = note;

You will need to also include the following modules.

`include "lab1/DisplayOpt_GL.v"
`include "lab3/DFFRE_RTL.v"
`include "lab3/DFFR_RTL.v"
`include "lab3/ClockDiv_RTL.v"
`include "lab3/MultiNotePlayer_RTL.v"
`include "lab4/Synchronizer_RTL.v"
`include "lab4/ProcScycle.v"
`include "lab4/MemoryBus_RTL.v"
`include "lab4/SPI_RTL.v"
`include "lab4/Memory.v"

Use the following steps when you are ready to integrate the processor.

Double-click on DE0_CV_golden_top
Instantiate the template shown above
Fill in the connections to the top-level ports
Be sure to include all the required modules at the top
Choose File > Save from the menu

Then make sure to hook up the SPI interface on the breadboard as mentioned above.

SCLK connects to GPIO_1[1] which connects to D0 on USB-to-SPI adapter
MOSI connects to GPIO_1[3] which connects to D1 on USB-to-SPI adapter
MISO connects to GPIO_1[5] which connects to D2 on USB-to-SPI adapter
CS connects to GPIO_1[7] which connects to D3 on USB-to-SPI adapter

Connect the oscilloscope probe to GPIO_1[11]. Make sure your distance sensor is connected to GPIO_1[13] and the piezo buzzer is connected to GPIO_1[15]. There is a red LED on the back of the distance sensor which will go on and off when it detects an obstacle; test it out to make sure it is working.

Lab Check-Off Task 4: Discuss TinyRV1 Processor Integration

Show a TA your breadboard wiring and your top-level integration. Discuss how the connections provided in the template implement the specification. Confirm with the TA that you have correctly accounted for the fact that reset, all push-buttons, and the distance sensor are active low! Show how you have connected GPIO_1[11] to the oscilloscope. Make sure the red LED on the back of the distance sensor goes on and off when it detects an obstacle. Take your time and double check all connections! Synthesizing the processor takes 10 minutes so get your connections right the first time!

3.2. Synthesize

You will need to use the following timing constraint file:

set_max_delay -from [all_inputs] -to [all_outputs] 20
set_min_delay -from [all_inputs] -to [all_outputs] 0

create_clock -name CLOCK_50 -period 20 [get_ports {CLOCK_50}]
create_generated_clock -name clk -divide_by 4 -source CLOCK_50 \
  [get_nets {clock_div|counter_reg|q[1]}]
create_generated_clock -name mnp_clk -divide_by 1024 -source CLOCK_50 \
  [get_nets {mnp_clock_div|counter_reg|q[9]}]

set_output_delay -add_delay -clock clk -max 0 [all_outputs]
set_output_delay -add_delay -clock clk -min 0 [all_outputs]

set_input_delay  -add_delay -clock clk -max 0 [all_inputs]
set_input_delay  -add_delay -clock clk -min 0 [all_inputs]

We are now using two different clock dividers, one for the processor and one for the multi-note player, so we must specify two different generated clocks.
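To see what the two generated clocks work out to, the sketch below computes the frequency and period of each divided clock. It assumes the clock divider output is counter bit divide_sel, so the division factor is 2^(divide_sel+1); this matches the divide_by values of 4 and 1024 in the constraint file above, but the function name and the formula itself are assumptions made for illustration.

```python
CLOCK_50_HZ = 50_000_000  # 50MHz board clock (20ns period in the constraints)

def divided_clock(divide_sel, clk_in_hz=CLOCK_50_HZ):
    """Return (divide_by, frequency in Hz, period in ns) of a divided clock.

    Assumption: the divider taps counter bit divide_sel, so the input
    clock is divided by 2**(divide_sel + 1).
    """
    divide_by = 2 ** (divide_sel + 1)
    freq_hz = clk_in_hz / divide_by
    period_ns = divide_by * 1_000_000_000 / clk_in_hz
    return divide_by, freq_hz, period_ns

if __name__ == "__main__":
    print("processor clock:", divided_clock(1))
    print("multi-note player clock:", divided_clock(9))
```

Under this assumption the processor clock has a period of 80ns and the multi-note player clock runs at roughly 48.8kHz.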

Now use the following steps to synthesize your design.

Choose Processing > Start Compilation from the menu
Wait ten minutes for synthesis to complete
While you are waiting go ahead and start reading the next section and you can even start working on your door monitoring assembly program!

How do I fix "can't open Verilog Design File" errors?

This probably means you did not set up the user library correctly, so Quartus cannot find the files you are including using the `include Verilog preprocessor directive. You can use the following steps to fix this:

Choose Assignments > Settings from the menu
Select the category Libraries
Click triple dots to the right of Project library name
Click on This PC, then navigate to your cloned repo by choosing Windows (C:) > Users > netid > groupXX where XX is your group number
Click Select Folder
Click Add
Click OK
Choose Processing > Start Compilation from the menu to see if this fixes the issue

How do I fix "Verilog HDL syntax" errors?

This might be because you did not configure Quartus to use SystemVerilog! You can use the following steps to ensure Quartus knows your design includes RTL modeling:

Choose Assignments > Settings from the menu
Select the category Compiler Settings > Verilog HDL Input
Under Verilog version click SystemVerilog
Click OK

Once your design is fully synthesized, let's use the RTL viewer to appreciate how far we have come this semester.

RTL Viewer
- Choose Tools > Netlist Viewer > RTL Viewer from the menu
- Use the Netlist Navigator to gradually drill down in the hierarchy as follows:
  - ProcScycle
  - ProcScycleDpath
  - ALU_32b
  - Adder_32b_GL
  - AdderCarrySelect_16b_GL
  - AdderRippleCarry_8b_GL
  - FullAdder_GL
- Take a screenshot of the entire RTL viewer window; it must clearly show the Netlist Navigator with the full hierarchy from the top to the full adder on the left and the gate-level implementation of the full adder on the right.
- Choose File > Close from menu to close the RTL viewer

Lab Report Task 1: Collect Data for Single-Cycle Processor

Save the screenshot of the entire RTL viewer window for your lab report.

3.3. Configure

Now we are finally ready to configure the FPGA for the TinyRV1 single-cycle processor prototype.

Choose Tools > Programmer from the menu
Click Hardware Setup
Currently selected hardware: USB-Blaster [USB-0]
Click Close
Click Start

Start by testing out the default assembly program as in the last lab. Then use PowerShell on the workstation to assemble the assembly test programs and load them into the physical memory in our FPGA prototype over SPI. These commands should be done on the workstation not on ecelinux!

% cd C:\Users\netid\groupXX
% mkdir build
% cd build
% python ..\scripts\tinyrv1-assemble -o test1.bin ..\lab4\asm\test1.asm
% python ..\scripts\tinyrv1-load test1.bin

Remember to first reset the processor, then load the program, then press the go button. Then try assembling and loading both test1.asm and test2.asm and confirm they perform as expected. Then take a look at test3.asm.

loop:
  lw   x1, 0x208(x0)
  sw   x1, 0x21c(x0)
  jal  x0, loop

This test program simply reads the push buttons and outputs the result to out3 which is connected to the oscilloscope, LEDs, and the multi-note player. Try pressing the right most push button and confirm that you can hear a note and that you can see a signal on the oscilloscope.

Lab Check-Off Task 5: Demonstrate TinyRV1 Processor Prototype

Show a TA your processor running test3.asm. Show pressing the right most push button and demonstrate you can hear a note and can see a signal on the oscilloscope.

4. TinyRV1 Accumulate Assembly Program

We will now perform an experiment to measure the delay of the accumulate assembly program you wrote in Part C. Reset your processor and load the accumulate assembly program.

% cd C:\Users\netid\groupXX
% mkdir build
% cd build
% python ..\scripts\tinyrv1-assemble -o accumulate.bin ..\lab4\asm\accumulate.asm
% python ..\scripts\tinyrv1-load accumulate.bin

Then set the size switches to be 4 and press the go button. Recall that the expected final result is 0x0bc. Since the seven-segment display only shows values from 0-31, it will only be able to show the bottom five bits of the complete result. We can look in the "seven segment" column in the table included above to see what the seven-segment display should show in the real FPGA prototype. For size 4, the seven-segment display should show 28. Verify that this is indeed what we see using the FPGA prototype. Try some other sizes.

You already calculated the number of cycles it will take your processor to execute the assembly program to accumulate 31 elements. Given the clock period of the processor, what do you expect the delay of your processor running the accumulate assembly program with a size of 31 to be in nanoseconds?
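One way to set up this estimate is sketched below. It assumes the processor clock period is 80ns, which follows from the 20ns CLOCK_50 period and the divide-by-4 generated clock in the timing constraint file. The function name and the cycle count of 155 are hypothetical placeholders; use the stat cycle count you observed for your own program.

```python
PROC_CLOCK_PERIOD_NS = 80  # assumption: 20ns CLOCK_50 period divided by 4

def expected_delay_ns(stat_cycles, period_ns=PROC_CLOCK_PERIOD_NS):
    """Convert a cycle count into an expected delay in nanoseconds."""
    return stat_cycles * period_ns

if __name__ == "__main__":
    # 155 is a hypothetical cycle count, not a reference answer.
    cycles = 155
    print(f"{cycles} cycles -> {expected_delay_ns(cycles)} ns")
```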

Go ahead and try a size of 31 on your FPGA prototype and confirm the output is 17. Recall that our assembly program includes instructions to set out3 to one before the accumulate loop and set out3 to zero after the accumulate loop. The least significant bit of out3 is connected to GPIO_1[11] so we can probe this on the oscilloscope to get an accurate measurement of the delay of our accumulate loop. Go ahead and run this experiment. Use the cursors on the oscilloscope to measure how long out3 is one in nanoseconds and capture a screenshot of the oscilloscope. Does this match your expectation?

Lab Report Task 2: Save Screenshot for Accumulate Assembly Program

Save the screenshot from the oscilloscope that clearly shows the total time of your accumulate assembly program when accumulating 31 elements.

Lab Check-Off Task 6: Demonstrate the Accumulate Assembly Program

The TA will ask you to try a size and confirm that your FPGA prototype is producing the correct result. Then show a TA your processor FPGA prototype correctly accumulating 31 elements. Show the TA the screen capture of your oscilloscope data including the cursors showing the delay of your accumulate assembly loop when accumulating 31 elements. Clearly explain to the TA why the delay measured using the oscilloscope makes sense. Hint: Consider the number of instructions per iteration, the number of iterations, and the time per cycle.

5. Door Monitoring Embedded System

The final culminating design is to implement both the software and hardware for a door monitoring embedded system which leverages all you have learned throughout the semester. The hardware is based on the TinyRV1 embedded system you implemented on the FPGA in the previous parts which includes the two new external I/O devices: a distance sensor and a multi-note player. It should be no surprise by now that we will be taking an incremental approach.

Step 1: Implement and test distance sensor
Step 2: Implement and test sensor and counter
Step 3: Implement and test sensor, counter, and alarm
Step 4: Implement and test sensor, counter, and beep

For each step, we recommend first writing out some pseudo-code. Then you can either develop the assembly program on ecelinux and use the ISA and single-cycle processor simulator first to verify its correctness or just go ahead and try developing the program directly on the FPGA prototype. If you are more confident in your mastery of both the software and hardware in your embedded system feel free to try directly developing on the FPGA prototype, or consider using the simulator for just some of the steps.

5.1. Step 1: Distance Sensor

Implement an assembly program in door-monitor-step1.asm that reads the distance sensor and outputs the result on the seven-segment displays using out0. If the system does not detect an object, then it should display a zero. If the system does detect an object it should display a one.

5.2. Step 2: Sensor and Counter

Implement an assembly program in door-monitor-step2.asm that adds a traffic counter to the door monitor embedded system. If the system detects a person walking by it should increment the traffic counter by one. The traffic counter should be displayed on the seven-segment displays using out0.

Note that you cannot simply increment the counter whenever the distance sensor is one. Whenever a person walks by, the distance sensor will output one for many, many cycles. You need to instead wait for the distance sensor to be one, then wait for the distance sensor to go back to zero, and then increment the counter.
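
The wait-for-one, wait-for-zero, then increment pattern can be modeled in software before writing any assembly. The following is a minimal Python sketch, not the assembly program itself; the function name count_passes and the list of sampled sensor values are assumptions made for illustration.

```python
def count_passes(samples):
    """Count people passing by, given a sequence of 0/1 sensor samples.

    The counter is incremented only when the sensor returns to zero
    after having been one, so a long run of ones counts once.
    """
    count = 0
    seen_one = False  # True while we are waiting for the sensor to go back to zero
    for s in samples:
        if s == 1:
            seen_one = True
        elif seen_one:
            count += 1
            seen_one = False
    return count


# Two people pass by; the first keeps the sensor at one for three samples
print(count_passes([0, 0, 1, 1, 1, 0, 0, 1, 0]))  # prints 2
```

Naively counting every sample equal to one would report four for this input, which is exactly the mistake the note above warns against.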

5.3. Step 3: Sensor, Counter, and Alarm

Implement an assembly program in door-monitor-step3.asm that adds an alarm to the door monitoring embedded system.

If the system detects a person walking by it should increment the traffic counter by one. The traffic counter should be displayed on the seven-segment displays using out0.

In addition, the system should play a note using the multi-note player whenever a person walks by. You can play the note when the sensor first goes high, or when the sensor goes low. For this step, the system should just start playing a note when it detects the first person and then never stop playing the note. You can reset the system to turn off the alarm.

5.4. Step 4: Sensor, Counter, and Beep

Implement an assembly program in door-monitor-step4.asm that turns the alarm into a beep for the door monitoring embedded system.

If the system detects a person walking by it should increment the traffic counter by one. The traffic counter should be displayed on the seven-segment displays using out0.

In addition, the system should play a beep using the multi-note player whenever a person walks by. You can play the beep when the sensor first goes high, or when the sensor goes low. To play a beep, first play a note using the multi-note player, then add a delay, then play note zero to turn off the multi-note player. To add a delay, simply create a loop that does nothing except iterate for roughly 500ms. With a clock period of 40ns, waiting 500ms might require several million iterations. You can initialize a register with a large immediate value (which will then serve as the loop counter) by loading the large number from memory like this:

  # load the value 1000000 into x1
  lw x1, 0x100(x0)
  ...

  .data
  .word 1000000
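
To pick a loop count for a target delay, a rough calculation like the following Python sketch may help. The function names are hypothetical, and the two-instruction loop body (for example, one addi to decrement the counter and one bne to branch back) is an assumption; use the instruction count of your own loop. The sketch relies on the single-cycle processor executing one instruction per cycle.

```python
def delay_loop_iterations(delay_ns, clock_period_ns, instrs_per_iter):
    """Number of loop iterations needed to wait for about delay_ns."""
    cycles = delay_ns // clock_period_ns
    return cycles // instrs_per_iter


def loop_delay_ns(iterations, clock_period_ns, instrs_per_iter):
    """Delay produced by a loop with the given number of iterations."""
    return iterations * instrs_per_iter * clock_period_ns


# 500 ms target, 40 ns clock, assumed two-instruction loop body
print(delay_loop_iterations(500_000_000, 40, 2))  # prints 6250000

# Delay for the 1000000 iterations used in the example above
print(loop_delay_ns(1_000_000, 40, 2))  # prints 80000000 (80 ms)
```

Under these assumptions the example value of 1000000 gives a beep of about 80ms, so a longer beep needs a proportionally larger .word value.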

When you are finished with this step your door monitoring embedded system should sense whenever anyone walks by, increment the traffic counter, and make a small beep.

Take a minute to appreciate all you have accomplished! You have now created all of the hardware and software for a complete (albeit simple) embedded system including:

- GL implementation of a five-digit numeric display
- GL implementation of a 32-bit ALU including an adder and equality comparator
- RTL implementation of a multi-note player including various structural datapaths and an FSM-based control unit
- GL and RTL implementation of a TinyRV1 single-cycle processor
- RTL implementation of a memory bus and SPI
- Integrating all of this into the hardware for a door monitoring embedded system with many different memory-mapped external I/O devices
- Breadboarding the SPI adapter, distance sensor, and piezo buzzer
- Developing a software door monitor program with 30+ assembly instructions

This has required implementing over 50 hardware modules using over 5000 lines of both gate-level and register-transfer level Verilog modeling. Even more importantly, you have verified these modules with hundreds of test cases. You have also manually implemented eight software programs with a total of almost 100 assembly instructions. You should be very proud of this tremendous accomplishment!

Lab Check-Off Task 7: Demonstrate the Door Monitoring Embedded System

Show a TA your door monitoring embedded system in action! You do not need to show the door monitoring system in simulation; just show the TA everything working on the real prototype.

Lab Check-Off Task 8: Turn in Lab Kit

When you are finished with your demo, pack up your ECE 2300 Lab Kit. Put the jumper wires, the piezo buzzer, the digital distance sensor, and the USB flash drive back in the component box. Return the FPGA board, USB cables, and component box to a TA who will then record the kit number on your check-off sheet, initial the final check-off, and then collect your check-off sheet.

6. Optional Extension: Advanced Door Monitoring Embedded System

This section is an optional extension to the main lab and is meant for students who have already completed all of the above tasks. Students might consider adding one of several more advanced features to their door monitoring embedded system:

- Play a few notes instead of just beeping every time a person passes by.
- Use the switches to set a "room capacity limit" which should be displayed on the seven-segment displays; when the room reaches capacity an alarm should go off indicating that the room is now full.
- Use a switch to configure the door monitoring embedded system into one of two modes: in "day mode" the system should function exactly as above; in "night mode", instead of counting traffic, the system should sound an alarm if someone passes by.
- Use switches to enable a secret alarm passcode to turn off the alarm in night mode.
- Use a second distance sensor to see if you can detect which way a person is passing; increment the counter when a person walks one way and decrement the counter when a person walks the opposite way.
