CST 131 -- Previous Homework

7.29) Only do parts a and b. (7.25 in the 3rd ed) In this problem, there is a typo. The track to track seek time is given, but the average seek time is not. Assume that the table entry "Track-to-track seek time of 8ms" was supposed to be "Average seek time of 8ms". Make sure you show all your work in each part. Present your answer in part a in MB using the definition that MB = 10⁶Bytes (instead of 2²⁰ Bytes). Disk manufactures use this definition in order to attempt to avoid confusing consumers (though I'm not sure how much it makes things better or worse).

29) Suppose a disk drive has the following characteristics:
• 5 surfaces
• 1024 tracks per surface
• 256 sectors per track
• 512 bytes/sector
• Track-to-track seek time of 8 milliseconds
• Rotational speed of 7500 RPM.

a) What is the capacity of the drive?
b) What is the access time?
~~c) Is this disk faster than the one described in question 28?~~

7.33) (7.29 in the 3rd ed) Ignore the hint given in this question in the text! Note that the question is not about large versus small sectors, but about the relative advantages and disadvantages of having more or less of a particular size (i.e. 512 byte) of sectors in a cluster. That is, what are the relative advantages and disadvantages of dividing up a disk into fewer big clusters (clusters with more sectors in them) or more smaller clusters (clusters with fewer sectors in them). Don't just list the advantages and disadvantages of having fewer sectors per cluster, also describe how these advantages and disadvantages result from fewer sectors per cluster.

33) What are the advantages and disadvantages of having a small number of sectors per disk cluster? ~~(Hint: You may want to think about retrieval time and the required lifetime of the~~ ~~archives.)~~

C.5) Headers and Trailers in hard disk sectors

a) A unique pattern appears in the header at the beginning of each sector in order to mark the beginning of the sector. One use of this pattern is to allow the read electronics to get in sync with the bit spacing on the disk (since there will be some variation in the speed of the disk over time). Other than its use for synchronization, why is a mark at the beginning of each sector needed?
b) Why do disks need track and sector ID information in the header of each sector? How is this information used?
c) Error correcting codes (ECC) are stored in the trailer of each sector. These codes use up space we could use for storing more data. Instead of using ECCs, we could test a drive when we formatted it by writing known data to each sector and comparing the data we read back with the known value. If it didn't match, we would mark the cluster containing that sector bad in the file allocation table. How does the use of an ECC actually make more data available than the proposed approach?

C.6) Optical Storage

a) Do CDs use a constant or variable bit density along the spiral track.
b) What advantage does this type of bit density give CDs?
c) Why was this type of bit density a disadvantage for early CD reader/writers?
d) Why is Mode 2 (as described in the text) not appropriate for computer data storage on a CD?
e) Blu-Ray drives use a 405nm wavelength laser in order to allow more bits on each disk layer. Why was a shorter wavelength laser (shorter than for DVD and CD) needed in order to achieve this?

C.7) RAID (assuming the disk array has no failed disks)

a) In RAID Levels 4 and 5, the strip may be large enough that during a write, only one strip in the stripe needs to be changed. When writing one strip in either level 4 or 5, the parity for the strip also needs to be updated. It is not necessary to read all the other disks in order to determine what the new parity should be. To see how this can be, consider changing the msb on Disk 3 from 0 to 1.

Disk 0	Disk 1	Dsk 2	Disk 3	Disk 4	Parity disk
????????	????????	?????????	0.......	????????	1.......

i) What will the new MSb on the parity disk be? _____ (Assume even parity).
ii) Why don't we need to know what the other (msb) bits are?

b) Based on the above, describe the sequence of steps needed to write one strip of a stripe in RAID level 4 or 5 (assuming no disks are failed).
c) What makes RAID level 5 better for writes to individual strips than RAID level 4? Describe why this is so.
d) Sometimes the strips are small enough, or the data being changed is large enough so that it makes more sense to write the whole stripe. How does the sequence of steps you described in b) change if we write the whole stripe?
e) If we are writing the whole stripe, does RAID level 5 have an advantage over RAID level 4? Why or why not?
f) Does RAID level 5 have an advantage over RAID level 4 for reads if there are no failed disks? Why or why not?

C.8) RAID (assuming that there is one failed disk in the disk array that has not yet been replaced)

a) Assume (hypothetically) that a read of byte-wide strips in a stripe on a RAID level 3 system provides the following data (only the MSb of each byte is shown):

Disk 0	Disk 1	Disk 2	Disk 3	Parity disk
1........	0........	0........	1........	1........

i) If we assume even parity, is at least one MSb of the data bad? How do you know?
ii) Can parity tell us which disk provided the bad data?
iii) So, When a disk in a RAID system fails (provides incorrect data) we need some other mechanism to know which disk failed. How can we know which disk failed?

b) Suppose that disk 2 in a 5 disk RAID level 3 system fails. The other disks provide the following data on the read of the bit-wide strips:

Disk 0	Disk 1	Disk 2	Disk 3	Parity disk
1........	0........	bad	1........	1........

Assuming that even parity is used, what should the msb of the byte from Disk 2 actually be? ___ .

c) Based on part b, describe what must be done in a RAID level 5 system in order to be able to "read" the data from a strip on a failed disk. Assume we only need to read the missing strip. (Note that the system can continue to provide the data that would have come from the disk that failed so long as only one disk fails. This data will eventually be used to reconstruct the data on a replacement disk, but for the purposes of this question, assume we haven't replaced the disk yet.)

d) There are two factors involved in "reading" data from a strip on a failed disk (as described in part c) that will slow down the system compared to one that has no failed disks. Describe them both. (Assume we haven't replaced the failed disk yet. Also don't assume that we have a "hot" spare.) Note that one of the factors is not "using parity to compute the missing data". While this does have to be done, the time needed to do so is negligible compared to the time needed to do a disk seek.

7.2) Calculate the overall speedup of a system that spends 40% of its time in calculations with a processor upgrade that provides for 100% greater throughput.
Note: In this problem, saying that the new processor is100% faster is the same as saying that the new processor is 2 times faster than the old one.

7.8) Show your work. First determine the speedup and performance increase resulting from both options. Note that the performance increase is the amount over 100%. Then in part a) give the cost for a 1% performance increase for each option and state which is best when judged on this basis. This is also the criteria to be used in part c). Keep intermediate results in your calculator rather than using the rounded off intermediate results determined in one part in the next part of the problem. It will make a difference in this problem.

7.8) Suppose the daytime processing load consists of 60% CPU activity and 40% disk activity. Your customers are complaining that the system is slow. After doing some research, you have learned that you can upgrade your disks for $8,000 to make them 2.5 times as fast as they are currently. You have also learned that you can upgrade your CPU to make it 1.4 as fast for $5,000.
a) Which option would you choose to yield the best performance improvement for the least amount of money?
b) Which option would you choose if you don't care about the money, but want a faster system?
c) What is the break-even point for the upgrades? That is, what price would we need to charge for the CPU (or the disk – change only one) so the result was the same cost per
1% increase for both?

C.1) Complete a table like the following one naming the 4 different I/O methods, listing the hardware (beyond the CPU and the I/O device and its device interface (i.e. its "ports")) needed to implement each architecture, and ranking each architecture by how much software overhead is involved in that architecture (compared to the other methods) and describing where the software overhead comes from in that method and why it is more or less than the previous method. Note that software overhead is instructions the CPU executes in support of I/O. Also note, that (as discussed in class) Memory mapped I/O and Instruction-based I/O are not I/O methods.

I/O Method	Additional Hardware	Software Overhead Rank (most, least, less than ... etc.)	Source of Software Overhead

C.2) Response time is the time from when the I/O device requests service and when the device actually gets service. For each of the I/O control methods, describe the response time by describing the sequence of events that occurs between when a device requests service (by setting a bit in a status register, asserting an interrupt line, or asserting a DMA request line) and when the device is serviced (data is read from or written to the device).

C.3) Protocol, Handshake

a) Define the term protocol as it relates to I/O. (Use the definition the author gives early in section 7.4 of the text, but add "timing (or sequence) of signals" to the definition.)
b) Define the term handshake as it relates to I/O. (Again, use the definition the author gives in section 7.4 of the text.)
c) Describe why handshaking is necessary in I/O.
d) What is a timing diagram and what is its purpose for I/O?

B.8) Cache Write Policies

a) At first it seems that "write through" would really slow down the operation of a cache memory system. Describe why it does not slow it down as much as you might at first think for most programs.
b) In what situations can a "write back" policy be more efficient (result in fewer writes to main memory) than a "write through" policy? Consider that in write back, we have to write back the whole cache block even if only one word of it was changed since there is only a single dirty bit for the whole block.

B.9) A particular memory system consists of a cache and a main memory. (This system does not support virtual memory). The cache has a hit ratio of 95% and an access time of 12 ns. The main memory has an access time of 25ns. What is the effective access time (EAT) of this memory system? Please show your work.

B.10) The equation for EAT given on page 378 (page 367 4th ed, page 334 3rd ed) describes how to compute the effective access time for a two level memory system consisting of cache and main memory. The equation for EAT on page 392 (page 382 4th ed, page 349 3rd ed) describes how to compute the effective access time for a two-level system consisting of virtual memory (on disk) and the main memory (but note that it assumes either no TLB or that the TLB missed -- in class we adjusted this equation in order to represent what happens when a TLB hits : EAT = 0.99 × (5ns + 200ns) + .01 × 10ms : where the page fault rate was .01 (or 1%) and the TLB hit time was 5ns, the main memory access time was 200ns and the disk access time is 10ms). A full memory system including virtual memory usually has three levels: Cache, Main memory, and Disk. Let the Cache have a 12ns (nanoseconds) access time and a hit rate of 95%. Let the Main memory have an access time of 50ns and a page fault rate of 0.0001% (10^-6 or .000001). Let the disk have an access time of 10ms (milliseconds). Compute the effective access time of this memory system.

a) First, compute the effective access time of the two level system containing the Main memory and Disk. Assume that a TLB is used, that it always hits, and that a TLB hit takes 5ns.
b) Next, compute the overall effective access time of the whole memory system by assuming it is a two-level system containing the Cache and a main memory that has an access time equal to the effective access time you found in part a.

Make sure you show your work.

B.11) Virtual Memory.

a) What does the term "page fault" describe?
b) What (in general) must a computer system do in response to a page fault?
c) The page fault rate can be determined by dividing the number of page faults by _______ .

For each of the following options, indicate if it could (by itself) be used to correctly complete the previous statement (put a Y or N in each slot).

the number of page frames ___
the number of virtual pages in the program ___
100 ___
the number of times we check the page table ___
the number of times we check the TLB ___
the number of page hits ___
the total number of times the CPU attempts a data access ___
the total number of times the CPU attempts an instruction access ___
the total number of times the CPU attempts an instruction or data access ___

d) What does the term 'TLB" stand for?
e) What is a TLB?
f) Running a virtual memory system without a TLB would make it very slow because every CPU request to reference main memory for an instruction or data item would require (fill in the bank with a number) ____ references to main memory (assuming there is no page fault). Also, describe what each reference is for.

B.12) For each of the following cases, first say if the sequence of events is possible or not. Then, in those cases where it is not possible, explain why it is not possible.

a) TLB hit, Cache hit
b) TLB hit, Cache miss
c) TLB miss, Page table entry valid, Cache hit
d) TLB miss, Page table entry valid, Cache miss
e) TLB miss, Page table entry invalid (Page Fault), Cache hit (after the page fault is handled, the TLB is updated, and the access is restarted and the TLB now hits).

6.25) This problem from the text assumes that bytes are addressed. Note that the problem does not show all the page table entries, only the valid ones are given. Please show your work.

25.) A system implements a paged virtual address space for each process using a one-level page table. The maximum size of virtual address space is 16MB. The page table for the running process includes the following valid entries (the → notation indicates that a virtual page maps to the given page frame, that is, it is located in that frame):

Virtual page 0 → page frame 1 Virtual page 3 → page frame 16
Virtual page 1 → page frame 2 Virtual page 4 → page frame 9
Virtual page 2 → page frame 4

The page size is 1024 bytes and the maximum physical memory size of the machine is 2MB.

a) How many bits are required for each virtual address?
b) How many bits are required for each physical address?
c) What is the maximum number of entries in a page table?
d) To which physical address will the virtual address 0x5F4 translate?
e) Which virtual address will translate to physical address 0x400?

B.14) Suppose that a page table for a particular process contains the entries shown below.

Frame	Valid	Dirty
1	1	0
-	0	0
0	1	0
3	1	1
-	0	0
-	0	0
2	1	1
-	0	0

a) The valid bit indicates whether or not the page table entry itself is valid. What does a valid or invalid page table entry also imply?
b) What is the purpose of the dirty bit?
c) What must be done for pages that have their dirty bit set that does not have to be done for pages that don't have their dirty bit set?
d) In the page table itself, the entries do not contain virtual page numbers, so how do we know where to go in the table in order to find out what memory frame a particular virtual page maps to?
e) Based on the above table, what virtual page maps to memory frame 2?
f) Why do we have to add virtual page numbers to the page table entries in the TLB? (Consider that, as noted in part d., the page table entries don't have virtual page numbers in them. So, why should TLB entries have them?)
g) When main memory is full, and we need to bring in another page, how do we decide which page to replace?
h) Assuming that we want to bring virtual page #5 into main memory and that we will put it in memory frame 3, show how the page table changes. Make your changes directly on a copy of the above table.

B.1)

a) What is meant by the term "volatile" as it refers to a memory characteristic?
b) Why is it important that at least some of the memory in the physical address space of the processor in a computer system is non-volatile?
c) What, specifically, is meant by the term "dynamic" as it refers to a memory characteristic? Note that "dynamic" is a general term referring to more than just memory that is constructed using capacitors as storage elements, so don't reference capacitors or capacitor behavior in your answer.

B.2) In a cache, the "mapping function" determines where in the cache certain blocks of memory may be placed. Most mapping functions restrict where a block from memory can be placed in the cache and some are more restrictive than others.

a) Fill in the following table, under the column "How Restrictive" with the phrases "none", "some", or "completely restrictive" as appropriate. Under "Description", describe what restriction (if any) the mapping function imposes on memory block placement.

Mapping Function	How Restrictive	Description
Fully Associative
Direct Mapped
Set Associative

b) One would suspect that the least restrictive mapping function would result in the highest hit rate. Why then is this mapping function not commonly used?
c) Which mapping function is most commonly used and why? (Note that no mapping function is "faster" or "slower" than any other, so speed is not a reason.)

B.3) A particular Direct Mapped cache has a total of 1024 blocks with 16 32-bit words (64 bytes) in each block. The cache is designed to work with a CPU that uses 32-bit byte addresses.

a) How many tags are stored in the cache?
b) How many bits are needed to select the byte in a word?
c) How many bits are needed to select a word in a block?
d) How many bits are needed to select the cache block?
e) How many bits are in each tag?
f) In what cache block could one expect to find the word whose main memory address is 71F32AC2h? Give the answer in Hex or in decimal. Show your work.
g) How many comparators does this cache have?
h) How many bits wide is each comparator?

B.4) A particular 8-way Set-Associative cache has a total of 1024 blocks with 16 32-bit words in each block. The cache is designed to work with a CPU that uses 32-bit byte addresses.

a) How many blocks are assigned to each set?
b) How many tags are stored in the cache?
c) How many sets are in the cache?
d) How many bits are needed to select a set?
e) In what cache set could one expect to find the word whose main memory address is 71F32AC2h? Give the answer in Hex or in decimal. Show your work.
f) How many comparators does this cache have?
g) How many bits wide is each comparator?

B.5) A particular fully associative cache has a total of 1024 blocks with 16 32-bit words in each block. The cache is designed to work with a CPU that uses 32-bit byte addresses.

a) How many tags are stored in the cache?
b) How many comparators does this cache have?
c) How many bits wide is each comparator?
d) In what cache block could one expect to find the word whose main memory address is 71F32AC2h?

6.9) (Text exercise 9 at the end of chapter 6) (In the 4th and 3rd edition this problem states that it is based on word addresses, not byte addresses, but you would get the same answers).

Suppose a byte-addressable computer using set associative cache has 2¹⁶ bytes of main memory and a cache of 32 blocks, and each cache block contains 8 bytes.

a) If this cache is 2-way set associative, what is the format of a memory address as seen by the cache, that is, what are the sizes of the tag, set, and offset fields?
b) If this cache is 4-way set associative, what is the format of a memory address as seen by the cache?

6.11) (Based on text exercise 11 at the end of chapter 6) Note that the word size is 1 byte and that each block has 4 bytes. Note that the exercise in the text gives the address format. From this work out the cache block each address maps to and what other addresses are brought in to the cache block on a miss.

a) Complete the following table showing the cache block that each reference will use and note if the reference is a Hit or a Miss.

Reference (Word Address):

6E

B9

17

E0

4E

4F

50

91

A8

A9

AB

AD

93

94

Cache Block Used:

Hit or Miss

Note that the cache is initially empty before reference 6E.

b) Give the overall hit rate for this sequence. _____

c) Fill in the following table showing what will be in cache block 2 (addresses of data bytes and actual tag) after the last reference. (The text question asks you to show what is in all 4 cache blocks, but just showing the contents of block 2 will be sufficient.)

Block 2
Tag Contents	Cache Contents (by address)

B.6) In addition to a mapping function, a cache may have to have a replacement policy.

a) What is a replacement policy? (Note that since you are defining this term, you shouldn't use the words "replacement" or "policy" in your answer.)
b) What types of cache's need replacement policies?
c) When does a replacement policy need to be applied? (i.e. for what event in the course of the operation of the cache do you need to know what the replacement policy is?)
d) How does one measure the effectiveness of a replacement policy?

B.7) Fill in the following table, as we did in class, showing the contents of the tags in one set of a 4 way set associative cache that uses an LRU replacement policy. Assume that each column represents the tag fields of the four blocks (B0, B1, B2 and B3) of the set at each point in time. Assume that the "CPU tag field" row indicates the tag fields of the sequence of addresses that, over time, come to this set from the CPU. Also, mark the hits and determine the hit rate.

`time`	`1`	`2`	`3`	`4`	`5`	`6`	`7`	`8`	`9`	`10`	`11`
`CPU tag field`	`2`	`4`	`1`	`2`	`5`	`3`	`1`	`5`	`4`	`1`	`2`
`B0 tag`
`B1 tag`
`B2 tag`
`B3 tag`

Hit rate: ______

5.22) (Problem 22 in chapter 5 in the 4th and 5th eds, 5.18 on the 3rd ed.) Given that memory contains the values shown below, and that R1 contains 0x200,

Memory Address	Contents
0x100	0x600
...
0x400	0x300
...
0x500	0x100
...
0x600	0x500
...
0x700	0x800

Show what value will be loaded into the AC for each of the indicated instructions. (Also note in this problem, the memory addresses and contents are given in hex.)

Instruction	Mode	Value loaded into AC
Load #0x500	Immediate
Load 0x500	Direct
Load (0x500)	Indirect
Load 0x500(R1)	Indexed

A.6) External memory accesses tend to take a large portion of the execution time of an instruction. Complete the following table by giving the total number of external memory accesses that would be required in the execution of a MOV D, S instruction (where D is destination and S is source) assuming D and S are addressed as given. Include the initial fetch of the MOV instruction as one of the memory accesses.

	Addressing for S	Addressing for D	# of memory accesses
a)	Immediate	Direct
b)	Register	Indirect
c)	Direct	Indirect
d)	Register	Direct
e)	Immediate	Register
f)	Register	Register Indirect

A.7) Many modern computers do not provide "memory indirect" addressing (just listed as indirect addressing in the text and the handout), but do provide all the other addressing modes we talked about. Give a sequence of two Load instructions that would do the same thing as the following instruction (which uses memory indirect addressing for its source):

Load R1, (X)

(where X is a memory address) without using memory indirect addressing. Also, give a comment next to each instruction in your answer that states what addressing mode it uses for the source of the instruction.

A.8) Consider the following program:

500		MOV R1, X
501		MOV R2, Y
502		MOV R3, Z
503		PUSH R1
504		CALL 550
505		POP R3
506		CALL 560
507		HALT
·		·
550		PUSH R2
551		CALL 560
552		POP R2
553		RET
·		·
560		PUSH R3
561		PUSH R1
562		POP R2
563		POP R1
564		RET

a) Give a diagram like those handed out in class that shows the values on the stack after the execution of each of the underlined instructions. Label each stack image with the appropriate underlined instruction address -- note that the first two are already labeled. Also note that sometimes an underlined instruction will be executed twice -- give a separate diagram for each time the instruction is executed. Assume that call and return instructions use the stack. Let the initial value of memory locations X, Y, and Z be 150, 250, and 350 respectively. Assume that the program begins execution with the instruction at address 500.

Inst 504

Inst 551

b) What is the contents of each of R1, R2 and R3 after the execution of the program (i.e., when the program reaches the instruction at address 507? (be careful here!)

A.9) In some instruction sets, call and return instructions make use of a RR register to hold the return address rather than doing so on the stack. Show how the code in the above problem should be changed if call and return instructions used a RR register for return addresses rather than the stack. Don't change the program's function or its call graph (i.e. don't change the routines called or the order in which they are called and don't change what each routine does).

5.24) (4th ed. 5.24, 3rd ed. 5.20)

a) A nonpipelined system takes 100ns to process an instruction. The same instructions can be processed in a 5-segment pipeline with a clock cycle of 20ns. Determine the speedup ratio of the pipeline for 100 instructions. Assume that the pipeline is initially empty when the 100 instructions begin execution, and assume that there are no branches and no data-dependencies or resource dependencies.
b) What is the maximum theoretical speedup that could be achieved with this pipeline system over a nonpipelined system?
Also remember that nano = 10^-9 so that 1 sec = 1,000,000,000 ns!

5.29) (4th ed. 5.29, 3rd ed. 5.25)

Suppose an instruction takes four cycles to execute in a nonpipelined CPU: one cycle to fetch the instruction, one cycle to decode the instruction, one cycle to perform the ALU operation, and one cycle to store the result. In a CPU with a 4-stage pipeline, that instruction still takes four cycles to execute, so how can we say the pipeline speeds up the execution of the program?

Make sure your answer describes how the pipeline actually speeds up execution while addressing the concerns raised in the question.

A.10) Pipelines

a) List the factors that can keep a k-stage pipeline from providing a true k times speedup.
b) Why would you expect that a pipeline with more stages will usually have a larger branch penalty (the number of cycles that no useful work is getting done)?
c) What is a data-dependency? Describe both what it is and state how it can slow down a pipeline.

A.1) Write the code needed to execute the following expression: X = (A + B × C) / (D - E × F) on --

a) a 3 address machine.

Use instructions like those given in the 3 address part of example 5.7 (Example 7 in chapter 5) on page 303. Assume that you can have at most 2 memory addresses, and no more than three register addresses in each instruction.

b) a 3 address load-store machine.

In a load-store machine, the only instructions that can access memory are load and store and these instructions take only one memory address and one register (i.e. Load R2, X). All the instructions that carry out arithmetic and logical operations are register-to-register and specify at most three register addresses (and no memory addresses). For example, the instruction: Sub R1, R2, R3 subtracts the contents of register R3 from the contents of register R2 and puts the result in register R1.

c) a 2 address machine.

Use instructions like those given in the 2 address part of example 5.7. Assume that you can have at most 1 memory address, and no more than two register references in each instruction.

d) a 1 address machine that has only an AC (accumulator) register.

Use instructions like those given in the 1 address part of example 5.7. Hint: This part can be done without using any temporary memory locations other than X.

When doing these problems...

Note that only part b has the load-store restriction. That is, in the other parts, instructions other than load and store are allowed to reference memory.
Make sure you remember operation precedence -- i.e. multiplies and divides are computed before adds and subtracts unless parentheses dictate otherwise -- so, in the above, B × C takes place before adding A etc.
Note that DIV and SUB are not commutative, so operand order in the instruction is important. In these problems, use a left to right order, that is dest <- src1 OP src2. So, SUB R3,X,Y would imply that R3 <- X - Y and DIV R3,X,Y would imply that R3 <- X/Y.
Try to use the least number of instructions possible.
Assume that A, B, C, D, E, F and X represent memory addresses.
It is good programming practice to not modify the sources, so only read from A, B, C, D, E, and F.
You may need to use additional locations in memory to temporarily hold intermediate results. Let U, T and Y represent these temporary locations.
Let R1, R2, ..., etc. represent registers.
In each case try to use the fewest registers and the fewest temporary locations possible, and remember that registers are better than memory for temporary storage.

A.2) RPN and Stacks

a) Write the expression given in problem A.1 in postfix notation.
b) Write the code needed to execute this expression on a stack based machine.

Use instructions like those given in the 0 address part of example 5.7.
On such a machine, most instructions are zero address instructions.
Binary operations (operations that take two operands) operate on the top two elements of the stack, replacing those top two elements with the result of the operation.
Assume that we have the operations ADD (add top two elements of the stack), SUB (second (from the top) element of the stack - top element of the stack), MUL (multiply top two elements of the stack) and DIV (second element of the stack / top element of the stack).
The machine also has the instructions PUSH X (pushes data from memory at address X on the top of the stack) and POP X (removes the first element on the top of the stack and places it in memory at address X).

Use as few operations as possible.

A.3) Assuming that the instruction set is created as given in Example 5.8 (on page 306 in the text), is the 16-bit instruction represented by the hex value 0xFE64 a zero-address, one-address, two address, or three-address instruction? _______ Assuming the 4 bit address fields in the instruction are used to select registers, what registers (if any) does this instruction select? ______

5.17 a) (Based on text Exercise 17 at the end of chapter 5, part a) (5.15 in the 3rd ed.)
Show how to encode the opcodes for an set of instructions that are all 11 bits long, that have 4-bit address fields and provides:

5 2-address instructions,
45 1-address instructions,
32 0-address instructions.

Do this by showing the range of opcodes needed for the two, one and zero address instructions as was done in Example 5.8.
Also, list the opcodes that were not needed in your approach in order to meet the problem requirements, if any.

A.4) Consider the following program.

XOR R1, R2
XOR R2, R1
XOR R1, R2

Where R1 and R2 are registers and the XOR instruction performs a bit-wise exclusive-or function.

a) Assume that the contents of registers R1 and R2 are initially 11001011b and 11100101b respectively. Give the values that will be in each of R1 and R2 after the execution of each instruction. Put your answer in the form of the following table:

	R1	R2
original values:	1100 1011b	1110 0101b
XOR R1, R2
XOR R2, R1
XOR R1, R2

b) Describe the overall function of this program. (Hint: compare what finally ends up in R1 and R2 with their original values.)
c) Give a program using only MOV instructions (i.e. MOV R1, R2) that accomplishes the same overall function as the above program. MOV R1, R2 simply moves the contents of R2 into R1.

A.5) A 32 bit quantity is stored in byte-addressable memory.

a) How many bytes of memory does the 32-bit quantity use?
b) If the address of the 32-bit quantity is given as 0x5C4A1234, give the range the addresses of the bytes that store it. That is, what is the lowest address and highest address that refers to any of the bytes in the 32-bit quantity.

5.2) (Based on text Exercise 2 at the end of chapter 5) Fill in the following table (or one like it) to answer this question showing what bytes would be at each memory address in each case. B.E = big endian and L.E = little endian. Note that Figure 5.1 (Figure 1 in chapter 5) in the text only shows the two LSbs (least significant bits) of the memory address.

		Address
		0x10	0x11	0x12	0x13
a) 0x456789A1	B.E.
a) 0x456789A1	L.E.
b) 0x0000058A	B.E
b) 0x0000058A	L.E.
c) 0x14148888	B.E.
c) 0x14148888	L.E.

CST 131 Previous Homework Assignments -- Spring 2025