Module

Module 3 Basic Peripherals and Port I/O
1) Write the C code to initialize Port G bit 6 as an input and then test if it is 1 or 0.
Remember the I/O bit may have an analog select feature which must be disabled. Use the
structures defined in xc.h, such as LATAbits.LATA3.
ANSELGbits.ANSG6 = 0;
TRISGbits.TRISG6 = 1;
if (PORTGbits.RG6 == 1)

2) Rewrite the previous question's code using the atomic access control registers (SET, CLR,
INV).
ANSELGCLR = 1 << 6;
TRISGSET = 1 << 6;
if (PORTGbits.RG6 == 1)

3) Write the C code to initialize Port F bits 8 and 12 to outputs, setting bit 8 to 1 and bit 12 to
0. Remember the I/O bit may have an analog select feature which must be disabled. Use
the structures defined in xc.h, such as LATAbits.LATA3.
TRISFbits.TRISF8 = 0;
ANSELFbits.ANSF12 = 0;
TRISFbits.TRISF12 = 0;
LATFbits.LATF8 = 1;
LATFbits.LATF12 = 0;
4) Rewrite the previous question's code using the atomic access control registers (SET, CLR,
INV).
ANSELFCLR = 1 << 12;
TRISFCLR = (1 << 8) | (1 << 12);
LATFSET = 1 << 8;
LATFCLR = 1 << 12;
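The effect of the SET, CLR and INV registers used in questions 2 and 4 can be modeled on a host PC. This is only a simulation sketch: SimReg, reg_set, reg_clr and reg_inv are hypothetical names, not part of xc.h. On the MCU the hardware applies the mask in a single write, so no read-modify-write sequence runs in software.

```c
#include <stdint.h>

/* Stand-in for one port register such as TRISF or LATF (hypothetical type). */
typedef struct {
    uint32_t value;
} SimReg;

/* Writing a mask to xxxSET sets every bit that is 1 in the mask. */
void reg_set(SimReg *r, uint32_t mask) { r->value |= mask; }

/* Writing a mask to xxxCLR clears every bit that is 1 in the mask. */
void reg_clr(SimReg *r, uint32_t mask) { r->value &= ~mask; }

/* Writing a mask to xxxINV toggles every bit that is 1 in the mask. */
void reg_inv(SimReg *r, uint32_t mask) { r->value ^= mask; }
```

Bits that are 0 in the mask are left unchanged, which is why clearing TRIS bits 8 and 12 does not disturb the direction of any other Port F pin.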
Module 4 Basic Concurrency

Consider the following code.

Assume that each call of Delay(delay_count) takes 100 milliseconds, and all other code takes
no time at all.
5) Describe the situation which results in the shortest delay between switch 2 (connected to
BTN2_PORT_BIT) being pressed and the LEDs turning off. Draw and label a timeline to
show what code the processor executes and when.
Switch 2 is pressed immediately before the if statement. The if statement executes and
then the LEDs are cleared, so there is no delay.
6) Describe the situation which results in the longest delay between switch 2 (connected to
BTN2_PORT_BIT) being pressed and the LEDs turning off. Draw and label a timeline to
show what code the processor executes and when.
Switch 2 is pressed immediately after the if statement. There is a delay of 400 ms before
the LEDs are turned off.
7) Describe the situation which results in the longest time that switch 2 can be pressed (and
then released) without the LEDs turning off. Draw and label a timeline to show what code
the processor executes and when.
Switch 2 is pressed immediately after the if statement. It is released before the if
(BTN2_PORT_BIT) code executes. So switch 2 must be pressed for less than 400 ms.
Now consider the following state-machine-based code. An infinite loop calls
TASK_Read_Switches and then TASK_Scan_LEDs_Once_FSM.
Assume that each call of Delay(FAST_DELAY) takes 20 milliseconds, each call of
Delay(SLOW_DELAY) takes 100 milliseconds, and all other code takes no time at all.
8) Describe the situation which results in the shortest delay between switch 2 (connected to
BTN2_PORT_BIT) being pressed and all four of the LEDs turning off. Draw and label a
timeline to show what code the processor executes and when.
Switch 1 is pressed, so the delay calls use FAST_DELAY (20 ms).
Switch 2 is pressed immediately before the second if statement in TASK_Read_Switches.
LD1 is turned off immediately, LD2 after 20 ms, LD3 after 40 ms, and LD4 after 60 ms.
9) Edit the code in the case statements to reduce this delay to zero.
Edit each case so that Delay is called only if G_Allow_Lit_LEDs is true. This eliminates all
delay calls as long as G_Allow_Lit_LEDs is false.
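The effect of this edit can be checked with a small host-side model. The original listing is not reproduced in this document, so the function below is a hypothetical stand-in: it assumes four cases that run in order, each turning off its LED when G_Allow_Lit_LEDs is false and then calling Delay. The name ms_until_all_off and its parameters are assumptions made for illustration.

```c
#include <stdbool.h>

#define NUM_LEDS 4

/* Returns the simulated time in ms from the moment TASK_Read_Switches has
   cleared G_Allow_Lit_LEDs until the last LED is off.
   delay_only_when_lit = false models the original cases (always delay);
   delay_only_when_lit = true models the question 9 edit. */
unsigned ms_until_all_off(unsigned delay_ms, bool delay_only_when_lit)
{
    bool G_Allow_Lit_LEDs = false;   /* switch 2 was just read as pressed */
    bool led_on[NUM_LEDS] = { true, true, true, true };
    unsigned now_ms = 0;
    int state = 0;
    int lit = NUM_LEDS;

    while (lit > 0) {
        /* Body of one case: turn this LED off if LEDs are not allowed. */
        if (led_on[state] && !G_Allow_Lit_LEDs) {
            led_on[state] = false;
            lit--;
        }
        if (lit == 0) {
            break;                   /* all LEDs are off: stop timing */
        }
        /* The original code always delays; the edit delays only if allowed. */
        if (!delay_only_when_lit || G_Allow_Lit_LEDs) {
            now_ms += delay_ms;      /* stands in for Delay(...) */
        }
        state = (state + 1) % NUM_LEDS;
    }
    return now_ms;
}
```

With a 20 ms delay the model gives 60 ms for the original cases and 0 ms with the edit, matching the answers to questions 8 and 9.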
10) Describe the situation for the original code (not including your changes from question 9)
which results in the longest delay between switch 2 (connected to BTN2_PORT_BIT) being
pressed and the LEDs turning off. Draw and label a timeline to show what code the
processor executes and when.
Switch 1 is not pressed, so the delay calls use SLOW_DELAY (100 ms).
Switch 2 is pressed immediately after the start of any Delay call in a case statement. For
this discussion, assume it is case 1. There is a delay of 100 ms before TASK_Read_Switches
runs and polls the switch, changing G_Allow_Lit_LEDs. TASK_Scan_LEDs_Once_FSM runs
case 2, turning off LD2 and delaying again. At 200 ms TASK_Scan_LEDs_Once_FSM runs
case 3, turning off LD3 and delaying again. At 300 ms it runs case 4, turning off LD4 and
delaying again. At 400 ms it runs case 1, turning off LD1.
So the total delay is 400 ms.
11) Now include your changes from question 9. Describe the situation for your code which
results in the longest delay between switch 2 (connected to BTN2_PORT_BIT) being
pressed and the LEDs turning off. Draw and label a timeline to show what code the
processor executes and when.
Switch 2 is pressed immediately after the start of any Delay call in a case statement. For
this discussion, assume it is case 1. There is a delay of 100 ms before TASK_Read_Switches
runs and polls the switch, changing G_Allow_Lit_LEDs. TASK_Scan_LEDs_Once_FSM runs
case 2, turning off LD2 but skipping the delay. TASK_Scan_LEDs_Once_FSM runs case 3,
turning off LD3 but skipping the delay. TASK_Scan_LEDs_Once_FSM runs case 4, turning off
LD4 but skipping the delay. TASK_Scan_LEDs_Once_FSM runs case 1, turning off LD1 but
skipping the delay.
So the total delay is 100 ms.
12) Describe the situation which results in the longest time that switch 2 can be pressed (and
then released) without the LEDs turning off. Draw and label a timeline to show what code
the processor executes and when.
Switch 2 is pressed immediately after the start of any Delay call in a case statement. The
Delay call takes 100 ms, and then TASK_Read_Switches will run again. So the switch must
be released before 100 ms to be missed by the code.
Module 5 Analog Interfacing
13) What code would you expect a 12-bit ADC to produce when converting a 2.2 V input?
Assume the positive reference voltage is 3.3 V and the negative reference voltage is 0 V.
round(2^12*(2.2 V/3.3 V)) = round(4096*2.2/3.3) = 2731
14) What code would you expect a 10-bit ADC to produce if converting the voltage from a
temperature sensor at 137 °C? Assume the sensor's output voltage is 8 mV/°C * T - 200 mV.
Assume the positive reference voltage is 3.3 V and the negative reference voltage is 0 V.
round(2^10*(137 °C*8 mV/°C - 200 mV)/3.3 V) = round(1024*0.896/3.3) = 278
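The two ADC calculations above follow the same formula, so they can be sketched as one helper. This is a minimal sketch: adc_code is a hypothetical name, the result is scaled by 2^N as in the answers above, and the sensor is assumed to output 8 mV/°C times the temperature minus 200 mV, which is the reading that reproduces the answer of 278.

```c
#include <stdint.h>

/* Expected output code of an ideal N-bit ADC for input voltage vin,
   given the negative and positive reference voltages (all in volts). */
uint32_t adc_code(unsigned bits, double vin, double vref_neg, double vref_pos)
{
    double steps = (double)(1ul << bits);          /* 2^N */
    double scaled = steps * (vin - vref_neg) / (vref_pos - vref_neg);
    uint32_t max_code = (uint32_t)steps - 1u;

    if (scaled <= 0.0) {
        return 0u;                                 /* at or below the bottom of the range */
    }
    uint32_t code = (uint32_t)(scaled + 0.5);      /* round to nearest */
    return (code > max_code) ? max_code : code;    /* clamp at full scale */
}
```

For example, adc_code(12, 2.2, 0.0, 3.3) reproduces the 12-bit result, and adc_code(10, 137 * 0.008 - 0.2, 0.0, 3.3) reproduces the temperature sensor result.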
15) Write the code to configure a PIC32MZ2048EFG100 to read the analog voltage on pin 100
with the ADC. Assume the ADC has already been configured correctly to read another
input channel. Table 1-1 in the PIC32MZ EF data sheet shows pin connections for the
ADC. Remember that this MCU uses a 100-pin TQFP package. Be sure to configure the
input's ANSEL field, select channel 18 with the ADC multiplexer, and request a
software-triggered conversion.
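Table 1-1 shows that pin 100 is connected to AN18 (port pin RE4).
Configure that pin as an analog input.
ANSELEbits.ANSE4 = 1;          // Enable the analog function on RE4 (AN18)
ADCCON3bits.ADINSEL = 18;      // Select AN18 as the input to convert
ADCCON3bits.RQCNVRT = 1;       // Request a software-triggered conversion
while (!ADCDSTAT1bits.ARDY18)  // Wait until the AN18 result is ready
    ;
// Value is now available in ADCDATA18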
Module 7 Communications
16) Configure the UART to communicate at 128300 baud, with 8 data bits, no parity, and one
stop bit. Assume the peripheral bus clock is running at 100 MHz.
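Need to divide the 100 MHz clock down to either 4X (BRGH = 1) or 16X (BRGH = 0) the desired
baud rate.
100 MHz/(4*128300 Hz) = 194.86
100 MHz/(16*128300 Hz) = 48.71. We will get better noise immunity with BRGH = 0.
The baud rate register is loaded with the rounded divisor minus 1, so UxBRG = 49 - 1 = 48.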
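The divisor arithmetic above can be sketched as a pair of helpers. This is a minimal sketch that relies on the PIC32 relationship UxBRG = F_PB/(16 or 4 * baud) - 1; the function names uart_brg and uart_actual_baud are hypothetical, and no UART registers are written here.

```c
#include <stdint.h>

/* UxBRG value closest to the requested baud rate.
   brgh = 0 selects the 16X clock, brgh = 1 selects the 4X clock. */
uint32_t uart_brg(uint32_t pbclk_hz, uint32_t baud, int brgh)
{
    uint32_t div = (brgh ? 4u : 16u) * baud;
    /* Round the divisor to the nearest integer, then subtract 1. */
    return (pbclk_hz + div / 2u) / div - 1u;
}

/* Baud rate actually produced by a given UxBRG value. */
double uart_actual_baud(uint32_t pbclk_hz, uint32_t brg, int brgh)
{
    return (double)pbclk_hz / ((brgh ? 4.0 : 16.0) * (double)(brg + 1u));
}
```

With a 100 MHz peripheral bus clock and 128300 baud this gives UxBRG = 48 for BRGH = 0 (about 127551 baud) and UxBRG = 194 for BRGH = 1 (about 128205 baud), so the 4X setting lands closer to the target rate while the 16X setting samples each bit more times.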
17) Which PIC32MZ EF MCU pins (for the TQFP-100 package) could be used for the UART3
transmit (U3TX) and receive (U3RX) data signals using Peripheral Pin Select? Examine Table
12-3 for the output and Table 12-2 for the input. Specify the pin name.
Output: RPD2, RPG8, RPF4, RPD10, RPF1, RPB9, RPB10, RPC14, RPB5, RPC1, RPD14, RPG1,
RPA14
Input: RPD3, RPG7, RPF5, RPD11, RPF0, RPB1, RPE5, RPC13, RPB3, RPC4, RPD15, RPG0,
RPA15
18) If we connect U3TX to RPD10 and U3RX to RPD15 using Peripheral Pin Select, what pin
numbers will these signals be on? Assume we are using the PIC32MZ EF MCU in the TQFP-
100 package.
U3TX will be on pin 69. U3RX will be on pin 48.
19) Write the C code to connect U3TX to RPD10 and U3RX to RPD15 using Peripheral Pin
Select.
RPD10R = 1; // U3TX, binary 0001
U3RXR = 11; // U3RX, binary 1011
Module 8 Other Peripherals

20) What does a watchdog timer (WDT) do?
It resets the microcontroller if the WDT is not refreshed frequently enough.
21) What must the application program do in order to use a watchdog timer (WDT)?
The application software must periodically refresh the watchdog timer before it expires.
22) What is the difference between a watchdog timer and a windowed watchdog timer?
A watchdog timer can be refreshed at any time before it expires, while a windowed WDT
can only be refreshed within a fixed time window before it expires.
23) What does a direct memory access (DMA) controller do?
A DMA controller copies data within the processor's memory space.
24) Why is using DMA to copy data faster than using software?
The software copy operation needs to execute multiple instructions per word of data: (1)
read the data from the source memory location into a register, (2) store the data from the
register into a destination memory location. It may also need to (3) update pointers and (4)
decide whether to repeat the copy (if in a loop). Using DMA eliminates the need to fetch
and execute instructions to copy data, since the DMA controller performs the reads, writes,
pointer increments and counting. The only software overhead is configuring the DMA
controller, and possibly triggering it.
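The four software steps listed above can be seen in a plain word-copy loop. This is an illustrative sketch; copy_words is a hypothetical name, and a compiler may reorder or combine these steps.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies count 32-bit words from src to dst in software. */
void copy_words(uint32_t *dst, const uint32_t *src, size_t count)
{
    while (count > 0) {          /* (4) decide whether to repeat the copy */
        uint32_t word = *src;    /* (1) read the source word into a register */
        *dst = word;             /* (2) store the register to the destination */
        src++;                   /* (3) update the pointers */
        dst++;
        count--;
    }
}
```

Every pass through the loop costs several instruction fetches per word, which is the overhead a DMA controller removes.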
25) We wish to create an embedded system which generates a square wave whose frequency
is determined by an analog input voltage. Describe how you could use DMA, two timers,
and an analog-to-digital converter to do this.
Configure the first timer to overflow at a fixed frequency. Configure the second timer to
generate a square wave. Use the first timer's overflow event to trigger the ADC to start a
conversion. Use the ADC conversion complete event to trigger a DMA copy from the ADC
result register to the second timer's period register.
Module 10 Advanced Concurrency
26) Consider two task sets and a prioritized scheduler. Set A has 10 tasks, each of which takes
40 time units to execute. Set B has 10 tasks; one takes 100 time units to execute and the
rest take 40 time units each. Which task set would benefit more from switching from a
non-preemptive scheduler to a preemptive one, and why?
Set B would benefit more, because it has more variation in task execution time. With set A
and no preemption, the longest a task can be delayed by another task is 40 time units.
With set B and no preemption, the longest time is 100 time units, which is much longer
than 40.
27) Consider a system with a prioritized, non-preemptive scheduler and three tasks (A, B and
C, in order of decreasing priority). Show in the table below the order the tasks execute if
task B starts running at time 0, and tasks A and C are released (become ready to run) at
time 2. Assume each task takes 3 time units to execute.
Time          0  1  2  3  4  5  6  7  8  9
Running task  B  B  B  A  A  A  C  C  C  (idle)
What is the response time for each task? This is the delay between a task being released
(becoming ready to run) and its completion.
Task Response Time
A 4
B 3
C 7
28) Now consider a system with a prioritized, preemptive scheduler and three tasks (A, B and
C, in order of decreasing priority). Show in the table below the order the tasks execute if
task B starts running at time 0, and tasks A and C are released (become ready to run) at
time 2. Assume each task takes 3 time units to execute. Also show the state of each task
(Ready, running, blocked, or inactive).
Time          0-1       2-4       5         6-8       9
Running task  B         A         B         C         (idle)
Task A state  inactive  running   inactive  inactive  inactive
Task B state  running   ready     running   inactive  inactive
Task C state  inactive  ready     ready     running   inactive
Tasks B and C are ready (not blocked) while they wait for the CPU, because they are not
waiting for a resource.
What is the response time for each task? This is the delay between a task being released
(becoming ready to run) and its completion.
Task Response Time
A 3
B 6
C 7
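The schedules in questions 27 and 28 can be reproduced with a small time-stepped simulation. This is a sketch under stated assumptions: the Task structure and the simulate function are hypothetical, tasks are stored in priority order (index 0 is highest), each task is released once, and response time is measured from release to completion as defined above.

```c
#include <stdbool.h>

#define NUM_TASKS 3

typedef struct {
    int release;    /* time at which the task becomes ready to run */
    int remaining;  /* execution time still needed */
    int finish;     /* completion time, -1 until the task completes */
} Task;

/* Index of the highest-priority ready task, or -1 if none is ready. */
static int pick_highest_ready(const Task t[], int now)
{
    for (int i = 0; i < NUM_TASKS; i++) {
        if (t[i].release <= now && t[i].remaining > 0) {
            return i;
        }
    }
    return -1;
}

/* Runs one time unit per step until all tasks complete and stores
   each task's response time in response[]. */
void simulate(Task t[], bool preemptive, int response[])
{
    int running = -1;
    int done = 0;

    for (int now = 0; done < NUM_TASKS; now++) {
        /* A preemptive scheduler re-evaluates every time unit; a
           non-preemptive one only when no task is running. */
        if (preemptive || running < 0) {
            running = pick_highest_ready(t, now);
        }
        if (running < 0) {
            continue;                        /* idle this time unit */
        }
        t[running].remaining--;
        if (t[running].remaining == 0) {
            t[running].finish = now + 1;
            response[running] = t[running].finish - t[running].release;
            running = -1;
            done++;
        }
    }
}
```

To model the task set used in both questions, initialize A and C with release time 2, B with release time 0, and give every task 3 time units of execution.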
29) How can we use a mutex to provide mutually exclusive access to a resource or variable by
tasks?
A task which needs to use the resource must first obtain the mutex. After using the
resource, the task must release the mutex. This gives the other tasks a chance to try to
obtain the mutex in order to use the resource.
30) What happens if a task X tries to obtain a mutex which is not available? What happens
when the mutex finally does become available?
The scheduler moves task X into the blocked state and runs the highest priority ready task.
Task X remains blocked until the mutex becomes available, at which point task X may start
running again (if there are no higher priority ready tasks) or be placed into the ready state.
31) What code does a scheduler run if there are no ready tasks?
It runs the idle task.
32) What happens to the contents of the CPU registers when a task is preempted?
They are saved to memory in the task's task control block.
Module 11A CPU

33) How many registers are in the register set of the MIPS architecture, and how many bits
long is each?
32 registers, and 32 bits each.

34) What is the name of each of the following registers, and why is it special?
a) register 0
$0. Always holds a value of 0.
b) register 29
$sp. Holds the stack pointer.
c) register 31
$ra. Holds the return address.
35) How many bits are needed to hold a MIPS instruction?
32.
36) List the three possible locations for an operand.
Registers, memory, constants/immediates (in the instruction itself).
37) How is an immediate 32-bit value loaded into a MIPS register, if the instruction itself is
only 32 bits long?
The upper 16 bits are loaded with a LUI instruction, and then the lower 16 bits are loaded
with an ORI instruction.
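The LUI/ORI pair can be modeled in C to show how the two 16-bit halves combine. This is an illustrative sketch; load_imm32 is a hypothetical name and the local variable stands in for the destination register.

```c
#include <stdint.h>

/* Builds a 32-bit constant the way a LUI followed by an ORI does. */
uint32_t load_imm32(uint16_t upper, uint16_t lower)
{
    uint32_t reg = (uint32_t)upper << 16;  /* lui: upper half set, lower half zero */
    reg |= (uint32_t)lower;                /* ori: OR in the zero-extended lower half */
    return reg;
}
```

ORI zero-extends its immediate, so a lower half with its top bit set (such as 0x8000) does not disturb the upper half.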
38) What does the MIPS instruction beq $t1, $t0, next do?
If the values in registers $t0 and $t1 are equal, branch to the label next, else continue
executing instructions in sequence.
39) Which registers are used to pass arguments to called functions? And which is used for the
function's return value?
Arguments go in $a0, $a1, $a2, and $a3 (if more are needed, the stack is used). The return
value is passed through $v0 (or through the stack, if too large).
40) Why are high-level languages used, if you can already write any program in assembly or
machine language?
A high-level language lets the programmer work at a higher level of abstraction compared
with assembly and machine languages, so each statement in the high-level language gets
more work done. As a result, the programmer needs to write fewer statements, finishing
the program sooner.
41) How are assembly and machine language related, and what is their most important
difference?
They both represent the same program. Machine language is a computer-readable form,
while assembly language is the human-readable form.
Module 11B MIPS Memory System
42) Why does the memory system include a cache?
To speed up memory access.
43) What does temporal locality mean?
If a piece of data is used, it will probably be used again soon.
44) What does spatial locality mean?
If a piece of data is used, nearby data will probably also be used soon.
45) List and explain the three types of cache misses.
Compulsory: this is the first time the data item has been accessed, so it is not in the cache
yet.
Capacity: the cache is too small to hold all the data at the same time, so the data item was
in the cache but was evicted by other data later.
Conflict: data item was evicted because another data item used the same location later.
46) Consider a 1024-byte direct-mapped cache. Each block is 16 bytes long.

a) How many sets does the cache have?
1024 bytes/16 bytes = 64 sets
b) Which set and tag are used for address 61?
Set = int(61/16) modulo 64 = 3

Tag = int (61/1024) = 0
c) Which set and tag are used for address 2193?
Set = int(2193/16) modulo 64 = int (137.0625) modulo 64 = 137 modulo 64 = 9

Tag = int (2193/1024) = 2
47) Consider a 1024-byte two-way set-associative cache. Each block is 16 bytes long.
a) How many sets does the cache have?
1024 bytes/(2 ways * 16 bytes) = 32 sets
b) Which set and tag are used for address 61?
Set = int(61/16) modulo 32 = 3

Tag = int (61/512) = 0 (each way holds 32 sets * 16 bytes = 512 bytes)
c) Which set and tag are used for address 2193?
Set = int(2193/16) modulo 32 = int (137.0625) modulo 32 = 137 modulo 32 = 9

Tag = int (2193/512) = 4 (512 bytes per way)
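The set and tag calculations in questions 46 and 47 follow one rule: divide the address by the block size to get the block number, then split the block number by the number of sets. A minimal sketch, with the hypothetical names CacheIndex and cache_index:

```c
#include <stdint.h>

typedef struct {
    uint32_t set;  /* which set the address maps to */
    uint32_t tag;  /* remaining upper address bits */
} CacheIndex;

/* Computes the set and tag for an address in a cache of cache_bytes total
   size with the given associativity (ways) and block size in bytes. */
CacheIndex cache_index(uint32_t addr, uint32_t cache_bytes,
                       uint32_t ways, uint32_t block_bytes)
{
    uint32_t sets = cache_bytes / (ways * block_bytes);
    uint32_t block = addr / block_bytes;   /* block number in memory */
    CacheIndex r;
    r.set = block % sets;                  /* low bits of the block number */
    r.tag = block / sets;                  /* everything above the set index */
    return r;
}
```

Dividing the block number by the number of sets is the same as dividing the address by the number of bytes in one way, so the tag divisor shrinks as associativity grows for a fixed cache size.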
48) Why do set-associative caches perform better than direct-mapped caches?
In a set-associative cache, a data item can be placed into one of multiple cache locations. In
a direct-mapped cache, a data item can only be placed in one cache location. So a direct-
mapped cache will have more conflicts than a set-associative cache.
49) Assume a memory system with cache has an access time of one cycle for a hit and 20
cycles for a miss. How long will a program take to execute if it has
Answer.
50) Complete the table below describing the PIC32MZ EF caches.

Attribute             Instruction Cache     Data Cache
Size (bytes)          16384                 4096
Block size (bytes)    16                    16
Associativity (ways)  4                     4
Number of sets        16384/(4*16) = 256    4096/(4*16) = 64
Module 11C Advanced MIPS Architecture and Microarchitecture

51) What is the range of priorities available for an interrupt? Which number gives the highest
priority?
0 to 7. 7 is the highest priority.
52) Explain how to make the processor ignore interrupts with a priority of 0, 1, or 2.
Set the IPL to 2.
53) How does the processor respond to an interrupt?

a) List the three steps performed by hardware
Finish current instruction
Save critical processor state by copying it into coprocessor 0 (CP0)
Change PC to start executing ISR
b) List the five steps performed by software
Copy data from CP0 to allow nested interrupts/exceptions

Save other parts of processor state
Configure processor to operate in interrupt mode
When done, restore previous processor and CP0 state
Execute return from exception/interrupt instruction (eret)
54) How can you make an ISR interruptible?
Clear the EXL bit.
55) Which registers (if any) must an ISR preserve?
All registers except $0.
56) What are the five stages of the PIC32 instruction execution pipeline?
I: Instruction Fetch
E: Execution
M: Memory Fetch
A: Align
W: Writeback
57) What is pipeline latency?
The time taken to execute one instruction
58) What is pipeline throughput?
How many instructions are completed per unit of time
59) What is a data hazard?
An instruction which depends on the result of an instruction which hasn't written its result
to the register file yet.
60) What is a control hazard?
A conditional branch which depends on the result of an instruction which hasn't completed
yet.
61) How many pipeline stages are in the PIC32MZ EF Floating Point Unit?
Seven
62) What is the execution latency for the following single-precision floating point instructions
on a PIC32MZ EF CPU?
a) Add
4 cycles
b) Subtract
4 cycles
c) Multiply
4 cycles
d) Divide
17 cycles
e) Square root
17 cycles
f) Reciprocal
13 cycles
g) Reciprocal of square root
17 cycles
63) How long would it take a PIC32MZ EF running at 200 MHz to execute four multiplies and
three adds?
These seven instructions are pipelined, so it would take seven cycles plus three cycles for the
rest of the last add instruction, for a total of ten cycles. At 200 MHz this is 50 ns.
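The cycle count above can be written as a general rule. This is a sketch assuming the instructions are independent (no data hazards) and one instruction issues per cycle; pipelined_cycles and cycles_to_ns are hypothetical names.

```c
/* Cycles to finish n back-to-back pipelined instructions that each have
   the given execution latency: one issue per cycle, plus the remaining
   latency of the last instruction. */
unsigned pipelined_cycles(unsigned n, unsigned latency)
{
    return (n == 0u) ? 0u : n + latency - 1u;
}

/* Converts a cycle count to nanoseconds at a clock rate given in MHz. */
double cycles_to_ns(unsigned cycles, double clock_mhz)
{
    return (double)cycles * 1000.0 / clock_mhz;
}
```

For seven instructions with a 4-cycle latency this gives 7 + 4 - 1 = 10 cycles, or 50 ns at 200 MHz.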
Module 12 - Performance
64) What is the compiler's primary goal when generating code?
To generate correct code.
65) Why is premature optimization so bad?
You are likely to waste your efforts by optimizing parts of the program which don't really
matter.
66) What is a program's execution time profile, and how does it help with optimization?
Profiling shows how much time different parts of the program take, allowing you to focus
on optimizing the most important parts.
67) Why is it important to look at the object code which the compiler generates?
To see if there are any extra parts you didn't expect.
68) How does program counter sampling work to provide profiling information?
The program is occasionally interrupted by a sampling ISR, which examines the saved PC on
the stack to determine which function was executing. The ISR then updates the count of
times that function was interrupted. The relative counts provide the program's
execution profile.
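The bookkeeping done by such a sampling ISR can be sketched in portable C. This is an assumption-laden illustration: the Region table, its address ranges and the function profile_sample are hypothetical, and a real ISR would obtain the interrupted PC from the saved processor state rather than as a parameter.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;    /* function name, for the report */
    uint32_t start;      /* first address of the function */
    uint32_t end;        /* one past the last address of the function */
    unsigned long hits;  /* number of samples that landed in the function */
} Region;

/* Called once per sample with the interrupted PC. Increments the hit
   count of the function containing pc and returns its table index,
   or -1 if pc is outside every known function. */
int profile_sample(Region table[], size_t n, uint32_t pc)
{
    for (size_t i = 0; i < n; i++) {
        if (pc >= table[i].start && pc < table[i].end) {
            table[i].hits++;
            return (int)i;
        }
    }
    return -1;
}
```

After many samples, each function's hits divided by the total number of samples approximates the fraction of execution time spent in it.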
69) Which three parts of the following code do you think will dominate the program's
execution time? Explain why and list the line numbers sorted by largest execution time.
Use the following assumptions:
the CPU does not have floating point math hardware support
a floating-point operation takes 1000 cycles
an integer operation takes 1 cycle
the arrays have been initialized
all memory accesses take ten cycles
int i, m[1000];
float a[1000], c[1000], sum=0;
i=0; // line 1
while (i<1000) { // line 2
if (i & 0x1) { // If odd // line 3
c[i] = 317.1 + a[i]; // line 4
} else { // even // line 5
m[i] += i + m[i-1]; // line 6
} // line 7
sum += m[i+1]; // line 8
i++; // line 9
} // line 10
Line 8 has a memory read and a floating point operation. T = 10 + 1000 = 1010 cycles.
Line 4 has a floating-point add, a read, and a write. It only runs half of the time (for odd
values of i). T = (10 + 1000 + 10)/2 = 510
Line 6 has an integer add, a subtract, two memory reads and a write, and it happens
half of the time. T = (1 + 30 + 1)/2 = 16
70) Which three parts of the following code do you think will dominate the program's
execution time? Explain why and list the line numbers sorted by largest execution time.
Use the following assumptions:
the CPU has floating point math hardware support
a floating-point operation takes 1 cycle
an integer operation takes 1 cycle
the arrays have been initialized
all memory accesses take ten cycles
Line 6 has to read memory twice and write it once. It has two math operations. T = (20 + 10
+ 2)/2 = 16
Line 8 has one memory read and two math operations, and it runs on every loop iteration.
T = 10 + 2 = 12.
Line 4 only has to read memory once and write it once. It has one math operation. T = (10 +
10 + 1)/2 = 10.5
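The per-line estimates in questions 69 and 70 all use the same model: memory accesses at ten cycles each, plus math operations at a fixed cost, weighted by how often the line runs. A minimal sketch of that model, with the hypothetical name line_cost:

```c
#define MEM_ACCESS_CYCLES 10   /* from the stated assumptions */

/* Average cycles per loop iteration for one source line.
   mem_accesses: memory reads plus writes on the line
   math_ops:     math operations on the line
   op_cycles:    cycles per math operation (1000 or 1 in these questions)
   fraction:     share of iterations in which the line runs (1.0 or 0.5) */
double line_cost(int mem_accesses, int math_ops, int op_cycles, double fraction)
{
    int cycles = mem_accesses * MEM_ACCESS_CYCLES + math_ops * op_cycles;
    return (double)cycles * fraction;
}
```

For example, line_cost(2, 1, 1000, 0.5) gives 510 for line 4 without floating point hardware, and line_cost(2, 1, 1, 0.5) gives 10.5 for the same line with it.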
