# Advanced VLSI Design: 2021-22

Lecture 6: Latch Based Clocking & Asynchronous Clocking

By Dr. Sanjay Vidhyadharan

ELECTRICAL

2/20/2022

# **Pipelining**

| Performance Figure     | Value                                                                    |
|------------------------|--------------------------------------------------------------------------|
| Function Delay         | $2 \cdot T_{Clock}$                                                      |
| Power                  | $3 \cdot P_{adder} + 3 \cdot P_{Register}$                               |
| Throughput             | One result every $T_{Clock}$ $(T_{clk-Q} + T_{pd\_adder} + T_{setup})$   |
| Gate complexity        | $3 \cdot G_{adder} + 2 \cdot G_{Register}$                               |
| Functional Flexibility | Nil                                                                      |
| Function Expandability | Nil                                                                      |

# **Latch Based Clocking**

- A stable input is available to combinational logic block A (CLB\_A) on the falling edge of CLK1 (at edge 2), and the block has at most $T_{CLK}/2$ to evaluate (that is, the entire low phase of CLK1). On the falling edge of CLK2 (at edge 3), the output of CLB\_A is latched and the computation of CLB\_B is launched. CLB\_B computes during the low phase of CLK2, and its output is available on the falling edge of CLK1 (at edge 4).
- It is possible for a logic block to use time that is left over from the previous logic block. This is referred to as slack borrowing.

# **Synchronous Clocking**

$$T > \max(t_{pd1}, t_{pd2}, t_{pd3}) + t_{pd,reg}$$

- The clock period is chosen to be larger than the worst-case delay of each pipeline stage.
- Hence, the throughput of the pipelined system is directly linked to the worst-case delay of the slowest element in the pipeline.
- Because all the clocks in a circuit transition at the same time, significant current flows over a very short period of time (due to the large capacitive load). This causes significant noise problems due to package inductance and power supply grid resistance.

# **Self-Timed Circuit Design**

Self-timed design avoids all problems and overheads associated with distributing high-speed clocks.
A self-timed circuit proceeds at the average speed of the hardware, in contrast to the worst-case model of synchronous logic. The automatic shut-down of blocks that are not in use can result in power savings. Additionally, the power consumption overhead of generating and distributing high-speed clocks can be partially avoided. Self-timed circuits are by nature robust to variations in manufacturing and in operating conditions such as temperature.

# **Completion-Signal Generation**

### **Dual-Rail Coding**

A redundant signal representation is used so that the transition state can be encoded.

| B                        | B0 | B1 |
|--------------------------|----|----|
| in transition (or reset) | 0  | 0  |
| 0                        | 0  | 1  |
| 1                        | 1  | 0  |
| illegal                  | 1  | 1  |

Although the dual-rail coding above allows tracking of the signal statistics, it comes at the cost of power dissipation: every single gate must transition for every new input vector, regardless of the value of the data vector.

# The Ripple-Carry Adder

### Worst-case delay is linear with the number of bits

- 4-bit adder: $\approx 12$ gate delays
- 8-bit adder: $\approx 24$ gate delays
- 16-bit adder: $\approx 48$ gate delays

# **Manchester-Carry Adder Circuit**

- 4-bit adder: $\approx 6$ gate delays
- 8-bit adder: $\approx 12$ gate delays
- 16-bit adder: $\approx 18$ gate delays

# **Self-Timed Adder Circuit**

# **Replica Delay**

The advantage of this approach is that the logic can be implemented using a standard non-redundant circuit style such as complementary CMOS. Also, if multiple logic units are computing in parallel, the overhead of the delay line can be amortized over multiple blocks.

# **Completion-Signal Generation using Current Sensing**

$$t_{MDG} > \min(t_{delay}) \Rightarrow t_{overlap} > 0$$

# **Self-Timed Signaling**

The four events, data change, request, data acceptance, and acknowledge, proceed in a cyclic order.
This protocol is called two-phase, because each data transfer has only two distinguishable phases: the active cycle of the sender and the active cycle of the receiver.

- The Req event terminates the active cycle of the sender. The sender is free to change the data during its active cycle.
- The receiver's cycle is completed by the Ack event. The receiver can only accept data during its active cycle.

# Muller C-element

(a) Schematic

| A | B | $F_{n+1}$ |
|---|---|-----------|
| 0 | 0 | 0         |
| 0 | 1 | $F_n$     |
| 1 | 0 | $F_n$     |
| 1 | 1 | 1         |

(b) Truth table

# **Self-Timed Signaling**

| Data Ready | Ack   | Ack'  | Req       |
|------------|-------|-------|-----------|
| 0          | 0     | 1     | 0         |
| 0 → 1      | 0     | 1     | 0 → 1     |
| 1 / 1 → 0  | 0     | 1     | 1         |
| 1 / 0      | 0 → 1 | 1 → 0 | 1 / 1 → 0 |
| 1 → 0      | 1     | 0     | 1 → 0     |

# **Two-Phase Self-Timed FIFO**

# **Four-Phase Self-Timed FIFO**

| Data Ready | Req   | S | S' | Ack   | Ack' |
|------------|-------|---|----|-------|------|
| 0          | 0     | 0 | 1  | 0     |      |
| 0 → 1 → 0  | 0 → 1 | 0 | 1  | 0     |      |
|            |       |   |    | 0 → 1 |      |
| 0          | 0     | 1 | 0  | 1     |      |
| 0 → 1 → 0  | 0     | 1 | 0  | 1     |      |
|            |       |   |    | 1 → 0 |      |
|            |       | 0 | 1  |       |      |
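
The Muller C-element truth table and the two-phase signaling table above can be sketched as a small behavioral model in Python. This is only an illustration under stated assumptions, not a circuit-level description: the function names `c_element`, `run_sequence`, and `sender_req` are hypothetical, and the relation Req = C(Data Ready, Ack') is read off the signaling table rather than taken from a schematic.

```python
def c_element(a: int, b: int, f_prev: int) -> int:
    """Next output F_{n+1} of a Muller C-element.

    The output follows the inputs when they agree and
    holds its previous value F_n when they differ.
    """
    if a == b:
        return a
    return f_prev


def run_sequence(inputs, f_init=0):
    """Apply a list of (A, B) pairs in order and collect the outputs."""
    outputs = []
    f = f_init
    for a, b in inputs:
        f = c_element(a, b, f)
        outputs.append(f)
    return outputs


def sender_req(data_ready: int, ack: int, req_prev: int) -> int:
    """Assumed sender logic: Req = C(Data Ready, Ack'), with Ack' = NOT Ack."""
    return c_element(data_ready, 1 - ack, req_prev)


# Both inputs must rise before F rises, and both must fall before F falls.
trace = run_sequence([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])
print(trace)  # [0, 0, 1, 1, 0]

# Walk through the rows of the signaling table.
req = 0
req = sender_req(data_ready=1, ack=0, req_prev=req)  # data ready, Req rises
req_after_data = req
req = sender_req(data_ready=1, ack=1, req_prev=req)  # Ack rises, Req holds
req_after_ack = req
req = sender_req(data_ready=0, ack=1, req_prev=req)  # Data Ready falls, Req falls
req_after_release = req
print(req_after_data, req_after_ack, req_after_release)  # 1 1 0
```

The state-holding behavior is what makes the C-element useful for handshaking: Req changes only after both the sender's Data Ready and the receiver's inverted Ack agree, so neither side can run ahead of the other.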