# **VTU Computer Organization - Solution (18CS34)**

**Ans 1** Q1 Connection between memory and processor (operating steps) with diagram?



Memory: Stores data and instructions.

- Instruction register (IR): Holds the instruction that is currently being executed. Its output is available to the control circuits, which generate the timing signals that control the various processing elements involved in executing the instruction.
- Program counter (PC): Contains the memory address of the next instruction to be fetched and executed.
- Memory address register (MAR): Holds the address of the location to be accessed.
- Memory data register (MDR): Contains the data to be written into or read out of the addressed location.

Operating steps

1. Programs (lists of instructions) reside in memory (they usually get there through the input unit).
2. PC is set to point to the first instruction of the program.
3. The contents of PC are transferred to MAR and a Read control signal is sent to memory.
4. After the time required to access the memory elapses, the addressed word (the first instruction in this case) is read out of memory and loaded into MDR.
5. The contents of MDR are transferred to IR.
6. If the instruction involves an operation by the ALU, the operands are obtained from memory or from general-purpose registers. If an operand resides in memory, its address is sent to MAR and a Read cycle is initiated. The operand is read into MDR and then sent to the ALU. Further operands are sent to the ALU in the same way (if required).
7. If the result is to be stored in memory, the result is sent to MDR, the address of the location where the result is to be stored is sent to MAR, and a Write cycle is initiated.
8. PC is incremented to point to the next instruction.

NOTE: If a source or destination is a register (R), the MAR and MDR steps are not required, because registers are directly accessible to the ALU; both reside inside the processor. MAR and MDR are required only when main memory is accessed for a read or write operation.
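The role of MAR and MDR in the steps above can be sketched in a few lines of Python. This is only an illustrative model, not a description of real hardware: the `Processor` class, its method names, and the choice of a Python list as memory (one instruction per word, so PC advances by 1) are all assumptions made for the example.

```python
class Processor:
    """Toy model of the processor-memory connection (hypothetical names)."""

    def __init__(self, memory):
        self.memory = memory  # assumption: memory is a list, one word per index
        self.pc = 0           # address of the next instruction
        self.mar = 0          # address of the location being accessed
        self.mdr = None       # data read from or written to memory
        self.ir = None        # instruction currently being executed

    def read(self, address):
        # Address goes to MAR, Read is issued, the word arrives in MDR.
        self.mar = address
        self.mdr = self.memory[self.mar]
        return self.mdr

    def write(self, address, value):
        # Address goes to MAR, data goes to MDR, Write is issued.
        self.mar = address
        self.mdr = value
        self.memory[self.mar] = self.mdr

    def fetch(self):
        # PC -> MAR, memory -> MDR, MDR -> IR, then PC is incremented.
        self.ir = self.read(self.pc)
        self.pc += 1  # assumption: each instruction occupies one word
        return self.ir
```

Every memory access in this model passes through `mar` and `mdr`, while a register-to-register operation would touch neither, which is the point made in the NOTE.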

## Q2 Basic performance equation and SPEC rating



Performance Measurement

Nowadays, computer performance is measured using benchmark programs. Standardized programs are used so that comparisons are meaningful. The performance measure is the time taken by a computer to execute a given benchmark. A non-profit organization called the Standard Performance Evaluation Corporation (SPEC) selects and publishes representative application programs for different application domains, together with test results for many commercially available computers. The programs selected range from game playing, compiler, and database applications to numerically intensive programs in astrophysics and quantum chemistry. In each case, the program is compiled for the computer under test, and the running time on a real computer is measured. Simulation is not allowed. The same program is also compiled and run on one computer selected as a reference.

$\text{SPEC rating} = \dfrac{\text{Running time on the reference computer}}{\text{Running time on the computer under test}}$

The test is repeated for all the programs in the SPEC suite, and the geometric mean of the results is computed. Let $\text{SPEC}_i$ be the rating for program $i$ in the suite. The overall SPEC rating is given by

$\text{SPEC rating} = \left( \prod_{i=1}^{n} \text{SPEC}_i \right)^{1/n}$

i.e. the $n$th root of the product of the $n$ values, where $n$ is the number of programs in the suite.
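As a small worked illustration of the rating and its geometric mean, the sketch below computes an overall SPEC rating from two lists of running times. The function name `spec_rating` and the sample times in the usage note are made up for the example; they are not published SPEC figures.

```python
import math


def spec_rating(reference_times, test_times):
    """Overall SPEC rating: geometric mean of the per-program ratings.

    Each per-program rating is (time on reference) / (time on machine under test).
    """
    ratings = [ref / test for ref, test in zip(reference_times, test_times)]
    return math.prod(ratings) ** (1 / len(ratings))
```

For instance, with hypothetical reference times of 100 s and 400 s and test times of 50 s and 100 s, the per-program ratings are 2 and 4, and the overall rating is the square root of 8. A rating above 1 means the computer under test is faster than the reference.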

## Q3 Byte addressability (Big-endian and Little-endian assignments with diagram)

Byte addressability: Successive addresses refer to successive byte locations in the memory; such a memory is called byte-addressable. Byte locations have addresses 0, 1, 2, ... (1 byte = 8 bits). If the word length of the machine is 32 bits, successive words are located at addresses 0, 4, 8, ..., with each word consisting of four bytes (32 bits).



Big-endian and little-endian assignments

These two methods are used to assign byte addresses across a word. A machine selects one of them.

Big-endian assignment: Lower byte addresses are used for the more significant bytes (leftmost bytes) of the word.

Little-endian assignment: Lower byte addresses are used for the less significant bytes (rightmost bytes) of the word.
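The two assignments can be compared with Python's built-in `int.to_bytes`. The helper name `byte_layout` and the sample word `0x12345678` are chosen only for illustration.

```python
def byte_layout(word, byteorder):
    """Return the bytes of a 32-bit word in order of increasing byte address."""
    return list(word.to_bytes(4, byteorder=byteorder))


word = 0x12345678
big = byte_layout(word, "big")        # most significant byte at the lowest address
little = byte_layout(word, "little")  # least significant byte at the lowest address
```

Here `big` holds the bytes 0x12, 0x34, 0x56, 0x78 at byte offsets 0 to 3, while `little` holds the same bytes in the reverse order.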



Words are said to be aligned in memory if they begin at a byte address that is a multiple of the number of bytes in a word.

## Q4 Instruction types (one-address, two-address, three-address instructions)

Basic Instruction Types

Consider the operation $C \leftarrow [A] + [B]$: add the contents of A and B, and place the sum in C. The contents of A and B are unchanged.

Three-address instruction

Format: Operation Source1, Source2, Destination

`Add A, B, C`

Two-address instruction

Format: Operation Source, Destination

`Move B, C`

`Add A, C`

One-address instruction

It may happen that a two-address instruction does not fit in one word for the usual word length and address size. In that case we may adopt one-address instructions. A processor register, usually called the accumulator, is used for this purpose; it holds values temporarily.

| Instruction | Meaning |
|---|---|
| `Load A` | Copy contents of A to accumulator |
| `Add B` | Add contents of B to accumulator |
| `Store C` | Store accumulator contents to C |
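To see that the three instruction formats compute the same result, the sketch below mimics each sequence on a dictionary standing in for memory. The function names and the dictionary model are assumptions made for illustration; each Python statement corresponds to one instruction named in its comment.

```python
def three_address(mem):
    mem = dict(mem)
    mem["C"] = mem["A"] + mem["B"]  # Add A, B, C
    return mem


def two_address(mem):
    mem = dict(mem)
    mem["C"] = mem["B"]             # Move B, C
    mem["C"] = mem["A"] + mem["C"]  # Add A, C
    return mem


def one_address(mem):
    mem = dict(mem)
    acc = mem["A"]                  # Load A
    acc = acc + mem["B"]            # Add B
    mem["C"] = acc                  # Store C
    return mem
```

All three leave A and B unchanged and place their sum in C; the formats differ only in how many instructions are needed and how many addresses each instruction must carry.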

## Module 2

1. With a neat diagram, explain general 8-bit parallel interface.





## Module 3

## **Q1 Internal organization of RAM/memory chip/128bit memory chip?**

- Memory-cells are organized in the form of an array (Figure 8.2).
- Each cell is capable of storing 1 bit of information.
- Each row of cells forms a memory-word.
- All cells of a row are connected to a common line called the Word-Line.
- The cells in each column are connected to a Sense/Write circuit by two bit-lines.
- The Sense/Write circuits are connected to the data-input or data-output lines of the chip.
- During a write-operation, the Sense/Write circuit
	- $\rightarrow$  receives input information &
	- $\rightarrow$  stores the input information in the cells of the selected word.



- The data-input and data-output of each Sense/Write circuit are connected to a single bidirectional data-line.

- Data-line can be connected to a data-bus of the computer.
- Following 2 control lines are also used:
	- 1)  $R/W'$   $\rightarrow$  Specifies the required operation.
	- 2)  $CS'$   $\rightarrow$  Chip Select input selects a given chip in the multi-chip memory-system.



Figure 5.3. Organization of a  $1K \times 1$  memory chip.

A 10-bit address is needed, but there is only one data line, resulting in 15 external connections (10 address, 1 data, 2 control, and 2 for power and ground). The 10-bit address is divided into two groups of 5 bits each to form the row and column addresses for the cell array. A row address selects a row of 32 cells, all of which are accessed in parallel. However, according to the column address, only one of these cells is connected to the external data line by the output multiplexer and input demultiplexer.
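The row/column split for the $1K \times 1$ chip can be sketched as simple bit manipulation. The text does not say which 5 bits form the row address, so this example assumes the high-order 5 bits select the row and the low-order 5 bits select the column; the function name is hypothetical.

```python
def split_1k_address(address):
    """Split a 10-bit address into (row, column) for a 32 x 32 cell array.

    Assumption: high-order 5 bits are the row, low-order 5 bits the column.
    """
    row = (address >> 5) & 0x1F  # selects one of 32 rows
    column = address & 0x1F      # selects one of the 32 cells in that row
    return row, column
```

With this assumption, address `0b1010100011` selects row 21 and column 3, and the 1024 possible addresses cover every cell of the array exactly once.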

## **Q3 Asynchronous and Synchronous DRAM Asynchronous DRAM or 2M\*8 Asynchronous DRAM**

- Less expensive RAMs can be implemented if simple cells are used.
- Such cells cannot retain their state indefinitely. Hence they are called Dynamic RAM (DRAM).
- The information is stored in a dynamic memory-cell in the form of a charge on a capacitor.
- This charge can be maintained only for tens of milliseconds.
- The contents must be periodically refreshed by restoring this capacitor charge to its full value.



Figure 8.6 A single-transistor dynamic memory cell.

- In order to store information in the cell, the transistor T is turned 'ON' (Figure 8.6).
- The appropriate voltage is applied to the bit-line, which charges the capacitor.
- After the transistor is turned off, the capacitor begins to discharge.
- Hence, the information stored in the cell can be retrieved correctly only if it is read before the charge on the capacitor drops below a threshold value.
- During a read-operation,
	- $\rightarrow$  the transistor is turned 'ON'.
	- $\rightarrow$  a sense amplifier detects whether the charge on the capacitor is above the threshold value.
		- $\triangleright$  If (charge on capacitor) > (threshold value)  $\rightarrow$  Bit-line will have logic value '1'.
		- $\triangleright$  If (charge on capacitor) < (threshold value)  $\rightarrow$  Bit-line will be set to logic value '0'.



Figure 5.7. Internal organization of a  $2M \times 8$  dynamic memory chip.

- During a Read/Write-operation,
	- $\rightarrow$  the row-address is applied first.
	- $\rightarrow$  the row-address is loaded into the row-latch in response to a signal pulse on the RAS' input of the chip (RAS = Row-address Strobe, CAS = Column-address Strobe).
- When a Read-operation is initiated, all cells on the selected row are read and refreshed.
- Shortly after the row-address is loaded, the column-address is applied to the address pins and loaded into the column-address latch under control of the CAS' signal.
- A 21-bit address is needed to access a byte in the memory. The 21 bits are divided as follows:
	- 1) 12 address bits are needed to select a row, i.e.  $A_{20-9} \rightarrow$  specifies the row-address of a byte.
	- 2) 9 address bits are needed to specify a group of 8 bits in the selected row, i.e.  $A_{8-0} \rightarrow$  specifies the column-address of a byte.

## **FAST PAGE MODE:**

When DRAM in the above diagram is accessed, the contents of all 4096 cells in the selected row are sensed, but only 8 bits are placed on the data lines D7-0, as selected by A8-0. Fast page mode makes it possible to access the other bytes in the same row without having to reselect the row. A latch is added at the output of the sense amplifier in each column.
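The address split for the $2M \times 8$ chip, and the condition under which fast page mode helps, can be sketched as follows. The function names are hypothetical; the bit ranges follow the text (row address on $A_{20-9}$, column address on $A_{8-0}$).

```python
def split_dram_address(address):
    """Split a 21-bit byte address into (row, column) for the 2M x 8 chip."""
    row = (address >> 9) & 0xFFF  # A20-9: 12 bits, one of 4096 rows
    column = address & 0x1FF      # A8-0: 9 bits, one of 512 bytes in the row
    return row, column


def same_row(address_a, address_b):
    """True if both bytes lie in the same row, so only a new column address
    is needed for the second access (the fast page mode case)."""
    return split_dram_address(address_a)[0] == split_dram_address(address_b)[0]
```

Consecutive byte addresses usually fall in the same row, which is why transferring a block of data benefits from fast page mode.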



#### **DIRECT MAPPING**

This technique is easy to implement but not very flexible.

Block j of the main memory maps onto block j modulo 128 of the cache. For example, whenever one of the main memory blocks 0, 128, 256, ... is loaded into the cache, it is stored in cache block 0. Main memory blocks 1, 129, 257, ... are stored in cache block 1 (one at a time), and so on. Contention may occur when several memory blocks need the same cache block, e.g. when a program needs both memory blocks 1 and 129 but cache block 1 can hold only one of them. To resolve this, a new block is allowed to overwrite the currently resident block.

For example, 4096 memory blocks need to be mapped to 128 cache blocks, i.e. each cache block position can hold any one of 32 memory blocks (4096/128 = 32).

The main memory address is divided into three parts:

- Tag (5 bits): identifies which memory block (out of 32 in this case) is currently resident in the cache block.
- Block (7 bits): gives the cache block position where the new memory block must be stored.
- Word (4 bits): selects one of the words of the memory block (out of 16 words per block in this case).
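The three address fields can be extracted with shifts and masks. The sketch below uses the field widths from the example (5-bit tag, 7-bit block, 4-bit word, so a 16-bit address); the function names are chosen for illustration only.

```python
def split_direct_mapped(address):
    """Split a 16-bit address into (tag, block, word) for the example cache."""
    word = address & 0xF           # low-order 4 bits: word within the block
    block = (address >> 4) & 0x7F  # next 7 bits: cache block position
    tag = (address >> 11) & 0x1F   # high-order 5 bits: which of 32 candidates
    return tag, block, word


def cache_block_for(memory_block):
    """Cache block position of a main memory block: j modulo 128."""
    return memory_block % 128
```

Memory blocks 1 and 129 both map to cache block 1 and are told apart only by their tags (0 and 1), which is the contention case described above.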

#### **ASSOCIATIVE MAPPING**



- It is more flexible than the direct mapping technique but more expensive. A main memory block can be placed into any cache block position, so cache space is used efficiently.
- The memory address is divided into two fields:
	- The low-order 4 bits identify the memory word within a block.
	- The high-order 12 bits (tag bits) identify a memory block when it is resident in the cache.
- Replacement algorithms can be used to replace an existing block in the cache when the cache is full.
- Cost is higher than for a direct-mapped cache because all 128 tag patterns must be searched to determine whether a given block is in the cache.



#### **SET-ASSOCIATIVE MAPPING**

It is a combination of direct mapping and associative mapping techniques. Blocks of the cache are grouped into sets, and the mapping allows a block of the main memory to reside in any block of a specific set.

Contention problem of direct mapping is eased by having a few choices for block placement. Hardware cost is reduced by decreasing the associative search.
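The three mapping techniques differ only in how the address bits are divided between tag and set (or block) fields. The sketch below computes the field widths for the cache used in the earlier examples (128 blocks, 16 words per block, 16-bit address). The function name and the `blocks_per_set` parameter are assumptions for illustration, the set sizes tried in the usage note are not taken from the text, and all sizes are assumed to be powers of two.

```python
def field_widths(blocks_per_set, cache_blocks=128, words_per_block=16,
                 address_bits=16):
    """Return (tag_bits, set_bits, word_bits) for a set-associative cache.

    Assumption: all sizes are powers of two.
    """
    word_bits = (words_per_block - 1).bit_length()
    sets = cache_blocks // blocks_per_set
    set_bits = (sets - 1).bit_length()
    tag_bits = address_bits - set_bits - word_bits
    return tag_bits, set_bits, word_bits
```

With one block per set this reduces to direct mapping (5, 7, 4); with all 128 blocks in a single set it becomes associative mapping (12, 0, 4). A set size of two blocks, taken here only as an example, gives 64 sets and the split (6, 6, 4).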

## Module 4

With a figure, explain circuit arrangement for binary division.

• An n-bit positive-divisor is loaded into register M. An n-bit positive-dividend is loaded into register Q at the start of the operation. Register A is set to 0.

• After division operation, the n-bit quotient is in register Q, and the remainder is in register A.

• Procedure:

Step 1: Do the following n times:

i) If the sign of A is 0, shift A and Q left one bit position and subtract M from A; otherwise, shift A and Q left and add M to A.

ii) Now, if the sign of A is 0, set q0 to 1; otherwise set q0 to 0.

Step 2: If the sign of A is 1, add M to A (restore).
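The procedure above (commonly called non-restoring division) can be traced in software. In this sketch, register A is modelled as an ordinary signed Python integer rather than a fixed-width two's-complement register, which is a simplification; the function and variable names are hypothetical.

```python
def nonrestoring_divide(dividend, divisor, n):
    """Divide two n-bit positive integers; return (quotient, remainder)."""
    a, q, m = 0, dividend, divisor
    mask = (1 << n) - 1
    for _ in range(n):
        a_was_nonnegative = a >= 0       # sign of A before the shift
        msb_of_q = (q >> (n - 1)) & 1
        a = 2 * a + msb_of_q             # shift A and Q left one position
        q = (q << 1) & mask
        if a_was_nonnegative:
            a -= m                       # sign of A was 0: subtract M
        else:
            a += m                       # sign of A was 1: add M
        if a >= 0:
            q |= 1                       # set q0 to 1 (otherwise it stays 0)
    if a < 0:
        a += m                           # Step 2: restore the remainder
    return q, a
```

For example, dividing 8 by 3 with n = 4 leaves the quotient 2 in Q and the remainder 2 in A after the final restore step.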

bit carry look-ahead adder

#### CARRY-LOOKAHEAD ADDITION

• The logic expressions for $s_i$ (sum) and $c_{i+1}$ (carry-out) of stage i are

$s_i = x_i \oplus y_i \oplus c_i$ ------(1)

$c_{i+1} = x_i y_i + x_i c_i + y_i c_i$ ------(2)


• Factoring (2) into

$c_{i+1} = x_i y_i + (x_i + y_i) c_i$

we can write $c_{i+1} = G_i + P_i c_i$, where $G_i = x_i y_i$ and $P_i = x_i + y_i$.

• The expressions $G_i$ and $P_i$ are called the generate and propagate functions.

• If $G_i = 1$, then $c_{i+1} = 1$, independent of the input carry $c_i$. This occurs when both $x_i$ and $y_i$ are 1.

• The propagate function means that an input-carry will produce an output-carry when either $x_i = 1$ or $y_i = 1$.

• All $G_i$ and $P_i$ functions can be formed independently and in parallel in one logic-gate delay.

• Expanding $c_i$ in terms of $i-1$ subscripted variables and substituting into the $c_{i+1}$ expression, we obtain

$c_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \cdots + P_i P_{i-1} \cdots P_1 G_0 + P_i P_{i-1} \cdots P_0 c_0$

• Conclusion: Delay through the adder is 3 gate delays for all carry-bits and 4 gate delays for all sum-bits.

• Consider the design of a 4-bit adder. The carries can be implemented as

$c_1 = G_0 + P_0 c_0$

$c_2 = G_1 + P_1 G_0 + P_1 P_0 c_0$

$c_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 c_0$

$c_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 c_0$

• The carries are implemented in the block labeled carry-lookahead logic. An adder implemented in this form is called a carry-lookahead adder.

• Limitation: If we try to extend the carry-lookahead adder for longer operands, we run into a problem of gate fan-in constraints.
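The 4-bit carry expressions can be checked with a short bit-level model. This is a software sketch of the logic equations, not a gate-level design; the function name `cla_4bit` is an assumption. In the code, `&` plays the role of the logical product and `|` the logical sum.

```python
def cla_4bit(x, y, c0=0):
    """Add two 4-bit numbers with carry-lookahead logic; return (sum, c4)."""
    xb = [(x >> i) & 1 for i in range(4)]
    yb = [(y >> i) & 1 for i in range(4)]
    g = [xb[i] & yb[i] for i in range(4)]  # generate:  G_i = x_i y_i
    p = [xb[i] | yb[i] for i in range(4)]  # propagate: P_i = x_i + y_i

    # Every carry is formed directly from the G, P functions and c0.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        total |= (xb[i] ^ yb[i] ^ carries[i]) << i  # s_i = x_i xor y_i xor c_i
    return total, c4
```

No carry depends on another carry, only on the G and P functions and $c_0$, which is what removes the ripple delay. The widest product term for $c_4$ already has five inputs, which hints at the fan-in limitation mentioned above.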

## Module 5

#### Give the control sequence for execution of the complete instruction ADD (R3), R1.

### Add (R3), R1

| Step | Action |
|---|---|
| 1 | PC<sub>out</sub>, MAR<sub>in</sub>, Read, Select4, Add, Z<sub>in</sub> |
| 2 | Z<sub>out</sub>, PC<sub>in</sub>, Y<sub>in</sub>, WMFC |
| 3 | MDR<sub>out</sub>, IR<sub>in</sub> |
| 4 | R3<sub>out</sub>, MAR<sub>in</sub>, Read |
| 5 | R1<sub>out</sub>, Y<sub>in</sub>, WMFC |
| 6 | MDR<sub>out</sub>, SelectY, Add, Z<sub>in</sub> |
| 7 | Z<sub>out</sub>, R1<sub>in</sub>, End |

Instruction execution proceeds as follows:

- Step 1--> The instruction-fetch operation is initiated by
	- $\rightarrow$  loading the contents of PC into MAR &
	- $\rightarrow$  sending a Read request to memory.
	- The Select signal is set to Select4, which causes the Mux to select the constant 4. This value is added to the operand at input B (the PC's content), and the result is stored in Z.
- Step 2--> The updated value in Z is moved to PC. This completes the PC increment operation, and PC now points to the next instruction.
- Step 3--> The fetched instruction is moved into MDR and then to IR.
- Steps 1 through 3 constitute the Fetch Phase. At the beginning of step 4, the instruction decoder interprets the contents of IR. This enables the control circuitry to activate the control-signals for steps 4 through 7.
- Steps 4 through 7 constitute the Execution Phase.
- Step 4--> The contents of R3 are loaded into MAR and a memory Read signal is issued.
- Step 5--> The contents of R1 are transferred to Y to prepare for the addition.
- Step 6--> When the Read operation is completed, the memory-operand is available in MDR, and the addition is performed.
- Step 7--> The sum is stored in Z, then transferred to R1. The End signal causes a new instruction fetch cycle to begin by returning to step 1.
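The seven steps can be followed as plain register transfers. The sketch below is a simplified software trace, not a model of the actual bus or timing: registers and memory are Python dictionaries, WMFC is modelled as the moment the memory word appears in MDR, and the function name and sample addresses are hypothetical.

```python
def execute_add_r3_r1(regs, memory):
    """Trace the control sequence for Add (R3), R1 on dictionary 'registers'."""
    # Step 1: PCout, MARin, Read, Select4, Add, Zin
    regs["MAR"] = regs["PC"]
    regs["Z"] = regs["PC"] + 4
    # Step 2: Zout, PCin, Yin, WMFC
    regs["PC"] = regs["Z"]
    regs["Y"] = regs["Z"]
    regs["MDR"] = memory[regs["MAR"]]  # WMFC: the instruction word arrives
    # Step 3: MDRout, IRin
    regs["IR"] = regs["MDR"]
    # Step 4: R3out, MARin, Read
    regs["MAR"] = regs["R3"]
    # Step 5: R1out, Yin, WMFC
    regs["Y"] = regs["R1"]
    regs["MDR"] = memory[regs["MAR"]]  # WMFC: the memory operand arrives
    # Step 6: MDRout, SelectY, Add, Zin
    regs["Z"] = regs["Y"] + regs["MDR"]
    # Step 7: Zout, R1in, End
    regs["R1"] = regs["Z"]
    return regs
```

Starting with PC = 100, R1 = 5, R3 = 200 and the value 7 stored at address 200, the trace ends with R1 = 12 and PC = 104.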

#### 3 Explain the differences between Hardwired and Micro-programmed control.



Figure 8.1 Basic idea of instruction pipelining.

The computer is controlled by a clock whose period is such that the fetch and execute steps of any instruction can each be completed in one clock cycle. Operation of the computer proceeds as in Figure 8.1c. In the first clock cycle, the fetch unit fetches an instruction  $I_1$  (step  $F_1$ ) and stores it in buffer B1 at the end of the clock cycle. In the second clock cycle, the instruction fetch unit proceeds with the fetch operation for instruction  $I_2$  (step  $F_2$ ). Meanwhile, the execution unit performs the operation specified by instruction  $I_1$ , which is available to it in buffer B1 (step  $E_1$ ). By the end of the



(a) Instruction execution divided into four steps





Figure 5.33 Organization of data on magnetic tape.