
VLSI

Transistor Design

Semiconductors

 A high valency implant such as phosphorus gives free electrons, creating an n-type region.

 A low valency implant such as boron gives free holes, creating a p-type region.

 A p-n junction creates a diode that only conducts if the p-type region (the anode) is more positive than the n-type region (the cathode).

 In bipolar circuits logic is determined by whether or not there is current flowing; in MOS circuits it is determined by the presence or absence of charge. The charge in MOS circuits persists for about 1/1000th of a second.

Passive circuits are always switched on, whereas active circuits are selectively switched on.

A bipolar transistor is formed by a sandwich of npn regions in a single crystalline lattice:

[Figure: npn sandwich with regions labelled n+, p and n, and terminals labelled collector, base and emitter]

  • A small current flowing from the base to the emitter induces a large current to flow from the collector to the emitter.

  • A pnp transistor has the opposite polarity

  • An enhancement mode, n-channel, metal-oxide-silicon field effect transistor (nMOS FET) consists of two n-type regions (called diffusion) lying on either side of a region of p-type substrate, which is covered by a thin layer of silicon dioxide and a metal plate (the gate):



[Figure: nMOS FET with n-type drain and source regions in the p-type substrate and the gate between them]



  • When the gate is positive with respect to the source an n-type channel is formed under the gate and current is conducted from the drain to the source.

  • Even when turned on, a MOS transistor has a resistance of about 10 kΩ

  • A p-channel MOS FET has the opposite polarity and conducts when its gate is low. However, the resistance of a p-type channel is about 2½ times that of an n-type channel of the same size:






  • The nMOS transistor operates in three modes:

  1. Off when Vgs < Vt

  2. Saturated when Vgs > Vt and Vds > Vgs - Vt

  3. Linear when Vgs > Vt and Vds < Vgs - Vt

Where Vt is the threshold voltage (about 0.2 Vdd, ie 1V for a 5V system)

  • Vt can be adjusted by implanting further impurities, to the extent where it is negative, creating a depletion mode nMOS FET which always conducts. This can be used as a compact way of making a resistor:




  • nMOS uses n-channel enhancement mode and depletion mode transistors

  • pMOS uses p-channel enhancement mode and depletion mode transistors

  • CMOS uses n-channel and p-channel enhancement mode transistors
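
The three operating modes of the nMOS transistor listed earlier in this section can be sketched as a small classifier. This is only an illustration: the function name nmos_region and the default threshold of 1V (the 5V example) are assumptions, and the notes do not say which mode applies exactly on a boundary, so the boundary handling here is arbitrary.

```python
def nmos_region(vgs, vds, vt=1.0):
    """Classify the operating mode of an enhancement mode nMOS transistor.

    vgs: gate-source voltage, vds: drain-source voltage, vt: threshold voltage.
    The default vt of 1.0 V is the 5 V example value (an assumption here).
    """
    if vgs < vt:
        return "off"            # no channel is formed under the gate
    if vds > vgs - vt:
        return "saturated"      # Vgs > Vt and Vds > Vgs - Vt
    return "linear"             # Vgs > Vt and Vds < Vgs - Vt (boundary lands here)


if __name__ == "__main__":
    for vgs, vds in [(0.5, 5.0), (5.0, 5.0), (5.0, 0.5)]:
        print(vgs, vds, nmos_region(vgs, vds))
```

A depletion mode device could be explored by passing a negative vt, in which case the transistor is never reported as off for a non-negative gate voltage.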


Stick Diagrams


In MOS:

[Figure: stick diagram symbols for an nFET, a pFET and a depletion mode nFET]

  • With CMOS, the nMOS transistors are good at conducting low signals and the pMOS transistors are good at conducting high signals, so transmission gates are often made from a pair of complementary transistors

  • MOS transistors can be used simply as switches to steer current using pass transistor logic

  • With pass transistor logic, rather than the input controlling the gate, the input is fed to the source

  • A multiplexor can be made from a line for each input, with pass transistors on each line that conduct depending upon the number of the selected input in binary (see p11)

  • Depletion mode transistors can be used as conductors to avoid using metal rails:

Simple Logic

  • In the example below of an nMOS FET, if X is high, Q will be high too:

  • Each time a MOS transistor controls a pass transistor, there is a significant voltage drop (as the source potential cannot rise above the gate potential less the threshold voltage)

  • A NAND gate in nMOS:





(If A and B are both 1 then Q is connected to ground; otherwise it is pulled high by the depletion mode nFET)


  • A NOR gate in nMOS:




  • A NAND gate in CMOS:

  • The delay through a MOS gate is the time that it takes to charge (or discharge) its output signal above (or below) the threshold voltage of any transistors in further circuitry that it drives. The series resistance can be reduced (speeding up the gate) by increasing the size of the driving transistor.

  • Stereotyped design lays out logic in space efficient gate arrays for ease of automatic layout

  • De Morgan's laws: NOT (A AND B) = (NOT A) OR (NOT B), and NOT (A OR B) = (NOT A) AND (NOT B)

  • Clocked logic often uses a two-phase non-overlapping clock, which can be produced by taking a clock and putting it through a flip flop with one of its inputs inverted

  • Eg a two phase non-overlapping clock can be used with pass transistors to build a shift register:



  • Similarly a pseudo static register (clocked latch) can be made by storing a bit as charge on the input of an inverter, and refreshing it at least 1000 times a second:

  • A one bit stack can be built from a two phase non-overlapping clock and four control signals (see p 22)

  • In both CMOS and nMOS, pull-ups have to be weaker than pull-down transistors to give the correct ratio across the potential divider, which makes switching to 1 slower than switching to 0. One solution is to pre-charge the output of a gate during one clock phase and then discharge it selectively through a pull-down during a second phase. Eg a pre-charged NOR gate:

  • Care must be taken to avoid reading the output before it has been discharged, and in particular two logic blocks can’t be concatenated as the transient high could discharge the output of the next stage: this is known as an internal race.

  • You can avoid the internal race by using a four stage clock

  • Domino logic avoids the internal race by pre-charging when φ=0, at which point the output goes low. On φ=1, the pull-down network is evaluated and the output may rise from 0 to 1. Since only a rising edge is possible, there cannot be a spurious discharge of the next stage.

  • Domino logic is limited to non-inverting structures, and extra buffers are required. There can also be charge-sharing and race problems. A solution to this is NORA logic.

  • NORA logic uses n-type and p-type blocks: p-type blocks drive n-type blocks, and n-type blocks drive p-type blocks, though blocks of the same type can be connected through an inverter (effectively using a Domino circuit).

  • On φ=0, the n-type block pre-charges its output high and the p-type block pre-charges its output low. On φ=1, both blocks evaluate.

  • You can make an FSM from a state register and a PLA
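
The nMOS NAND and NOR gates described in this section can be modelled at switch level, which also gives a quick check of De Morgan's laws. This is a rough sketch: the function names are invented for illustration, and the model only asks whether the pull-down network connects Q to ground, ignoring voltage drops and timing. The NOR model assumes the usual arrangement of two pull-down transistors in parallel.

```python
from itertools import product


def nmos_nand(a, b):
    """nMOS NAND: two pull-down transistors in series under a depletion mode pull-up."""
    pulled_down = bool(a) and bool(b)   # both series transistors must conduct
    return 0 if pulled_down else 1      # otherwise the pull-up holds Q high


def nmos_nor(a, b):
    """nMOS NOR: two pull-down transistors in parallel (assumed arrangement)."""
    pulled_down = bool(a) or bool(b)    # either parallel transistor is enough
    return 0 if pulled_down else 1


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        # De Morgan: NOT(A AND B) = NOT A OR NOT B, NOT(A OR B) = NOT A AND NOT B
        assert nmos_nand(a, b) == ((1 - a) | (1 - b))
        assert nmos_nor(a, b) == ((1 - a) & (1 - b))
        print(a, b, nmos_nand(a, b), nmos_nor(a, b))
```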

Memory

  • A memory of 2^w words each of 2^b bits will generally be constructed as 2^w rows and 2^b columns

  • ROMs store arrays of memory by vertical metal lines cutting across horizontal polysilicon lines with diffusion tabs running under the polysilicon where a 0 is to be stored. Programmable ROMs allow the diffusion tabs to be written electronically. Eg;

  • The simplest form of writeable memory is static memory. A bit is stored in a pair of cross coupled inverters, with separate circuits to control the reading and writing of the data.

  • Dynamic RAM uses fewer transistors by storing the bit as charge on the gate of a FET:

  • Data is written by putting data on Data and strobing Write.

  • Data is read by pre-charging Data and strobing Read; the value obtained has to be inverted.

  • Data has to be refreshed by re-writing at least every millisecond or so.

  • A bit can be stored with just one transistor, using a second transistor's ground state to charge the actual value.
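
The write, read and refresh behaviour of the dynamic cell can be sketched as a toy model. Everything here is an assumption made for illustration: the class name, the use of integer microseconds, the 1000 microsecond retention time (taken from "every millisecond or so") and the idea that the charge disappears all at once rather than leaking gradually.

```python
class DynamicCell:
    """Toy model of a dynamic RAM cell holding one bit as charge on a gate."""

    RETENTION_US = 1000  # assumed retention time in microseconds

    def __init__(self):
        self.charged = False
        self.written_at = None

    def write(self, bit, now_us):
        """Put the bit on Data and strobe Write."""
        self.charged = bool(bit)
        self.written_at = now_us

    def _stored(self, now_us):
        # The charge is assumed to be lost completely once the retention time passes.
        if self.written_at is None or now_us - self.written_at > self.RETENTION_US:
            return False
        return self.charged

    def read(self, now_us):
        """Pre-charge Data and strobe Read; the result is the inverse of the stored bit."""
        # A charged gate turns on the read path, which discharges the Data line.
        return 0 if self._stored(now_us) else 1

    def refresh(self, now_us):
        """Read the cell, invert the value obtained and write it back."""
        self.write(1 - self.read(now_us), now_us)


if __name__ == "__main__":
    cell = DynamicCell()
    cell.write(1, 0)
    print(cell.read(500))    # inverted value of the stored 1
    cell.refresh(900)
    print(cell.read(1500))   # still held, because of the refresh at 900
```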

ALUs

  • A barrel shifter, which concatenates two words and selects a word-sized portion of the result, is usually implemented with control signals indicating the size of the shift in unary.

  • A Manchester carry chain can be built from the signals Kill = NOT (A + B) and Propagate = A XOR B, producing the output Q and the carry out Cout:

A  B  Cin  Q  Cout  K  P
0  0  0    0  0     1  0
0  0  1    1  0     1  0
0  1  0    1  0     0  1
0  1  1    0  1     0  1
1  0  0    1  0     0  1
1  0  1    0  1     0  1
1  1  0    0  1     0  0
1  1  1    1  1     0  0

  • Carry look ahead is more efficient and is implemented as a tree structure.

  • Carry select splits the n-bit word up into blocks. Two adders are used for each block, one assuming a carry in of 0 and the other a carry in of 1. The actual carry-in is then used to select between the outputs of the two adders.

  • Carry skip exploits the observation that propagate signals are much easier to compute than generate.

  • Carry save is used in multipliers and retains the carries arising from partial sums to be included in the next addition.

  • Function generators are essentially multiplexors. Eg;

A and B select which of the four F(A,B) results is required. Yellow squares indicate a depletion mode transistor (ie no connection, just like blue metal lines).
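
The kill and propagate signals from the carry chain truth table, and the carry select scheme, can be sketched in software as follows. This is a behavioural illustration only, not a model of the Manchester circuit itself: the function names and the block size of 4 are assumptions.

```python
def full_adder(a, b, cin):
    """One bit of the adder: returns (Q, Cout, Kill, Propagate)."""
    kill = 1 - (a | b)        # both inputs 0: the carry out is forced to 0
    propagate = a ^ b         # inputs differ: the carry out follows the carry in
    q = propagate ^ cin
    if kill:
        cout = 0
    elif propagate:
        cout = cin
    else:
        cout = 1              # both inputs 1: a carry is generated
    return q, cout, kill, propagate


def ripple_add(x, y, n, cin=0):
    """Add two n-bit numbers by rippling the carry through n full adders."""
    total, carry = 0, cin
    for i in range(n):
        q, carry, _, _ = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= q << i
    return total, carry


def carry_select_add(x, y, n, block=4):
    """Carry select: add each block twice, then pick using the real carry in."""
    total, carry = 0, 0
    for low in range(0, n, block):
        width = min(block, n - low)
        mask = (1 << width) - 1
        xb, yb = (x >> low) & mask, (y >> low) & mask
        sum0, carry0 = ripple_add(xb, yb, width, 0)   # assumes carry in of 0
        sum1, carry1 = ripple_add(xb, yb, width, 1)   # assumes carry in of 1
        block_sum, carry = (sum1, carry1) if carry else (sum0, carry0)
        total |= block_sum << low
    return total, carry


if __name__ == "__main__":
    print(ripple_add(200, 100, 8))
    print(carry_select_add(200, 100, 8))
```

In hardware the two block additions run in parallel, which is where the speed-up comes from; the sequential loop here only shows that the selection gives the same answer as a plain ripple adder.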

Fabrication

  • Masks are used to control each process

  • CMOS uses either bulk p-well or bulk n-well processes, or an insulator such as sapphire to separate the two types.

  • The main problem with CMOS production, apart from needing more masks, is the formation of parasitic bipolar transistors between p wells and n substrate.

  • Design rules such as the minimum size of diffusion, metal and polysilicon exist for each process due to the limits of accuracy involved in the processes.

  • Long tracks with a large fan out (eg clock signals) have to be driven carefully to ensure reasonably sharp transitions. One method to drive the signal is to have a series of inverters, each a factor of e (the base of natural logarithms, about 2.72) bigger than the last.
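
The series of inverters for driving a large load can be sized with a few lines of arithmetic. This is a sketch under stated assumptions: the function name inverter_chain is hypothetical, capacitances are in arbitrary units relative to the first inverter's input, and the number of stages is rounded to a whole number so the actual ratio between stages is only approximately e.

```python
import math


def inverter_chain(c_load, c_in=1.0):
    """Return the relative sizes of a chain of inverters driving c_load.

    The ideal ratio between successive stages is e, so about ln(c_load / c_in)
    stages are needed; the ratio is then adjusted to fit a whole number of stages.
    """
    ratio = c_load / c_in
    stages = max(1, round(math.log(ratio)))
    step = ratio ** (1.0 / stages)          # actual size ratio between stages
    return [step ** i for i in range(stages)]


if __name__ == "__main__":
    for size in inverter_chain(1000.0):
        print(round(size, 1))
```

Each inverter in the returned list drives the next, and the last one drives the load itself, so every stage sees the same ratio between what it drives and its own size.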

Logical effort

  • Logical effort is a method for estimating gate delay, which is determined by the RC product of the driving resistance and the capacitance being driven

  • The unitless delay, d, is the sum of an effort delay and a parasitic delay: d = f + p

  • The effort delay is the product of the logical effort, g, which depends on the logic function and its implementation, and the electrical effort, h, which depends on the load that is being driven: f = gh

  • Several stages of logic may be cascaded to effect a function. The logical effort along a path is the product of all the logical efforts along it.

  • The branching effort at each stage is the ratio of the total output capacitance to the capacitance that is on the path being analysed: b = Ctotal/Cuseful

  • The path effort is the product of the path logical effort, the path branching effort and the path electrical effort: F = GBH

  • The path effort delay, DF, is the sum of the effort delays of the stages. This total is minimised when the effort is spread equally between the stages.

  • The path parasitic delay, P, is the sum of the parasitic delays of the stages.

  • The total delay of a path is the path effort delay plus the path parasitic delay: D = DF + P

  • The logical effort of each input on an n-input dynamic NAND gate is (n+1)/(γ+1)

  • The logical effort of each input on an n-input dynamic NOR gate is 2/(γ+1)

  • Due to an effect known as velocity saturation, dynamic gates are actually even faster than the logical effort analysis suggests

  • Pseudo nMOS logic uses a pFET with its gate tied low as a passive pull up. The logical effort is 4/9 for a falling output, and 4/3 for a rising output: The overall logical effort is the average, 8/9

The average logical effort for an n-input pseudo-nMOS NAND gate is 8n/9, and for a pseudo-nMOS NOR gate is 8/9. Consider the design of a 6-input NOR function in CMOS. This could be implemented in several different ways:

  • a single, static gate;

  • a chain of logic consisting of a pair of 3-input NOR gates whose outputs are combined by a NAND gate followed by an inverter;

  • a single, dynamic gate pulled up when φ=0 and evaluated when φ=1;

  • a single, pseudo-nMOS gate.

Calculate the logical effort, parasitic delay and unitless delay for each of the four designs. Which is likely to be fastest?

What difference would it make if the gate were driving a large capacitive load?



6-input NOR function in CMOS. Approximations use γ=2.

  • Single, static gate:
    Logical effort (6γ+1)/(γ+1) ≈ 4.3
    Parasitic delay 6 pinv ≈ 6

  • Chain of logic consisting of a pair of 3-input NOR gates whose outputs are combined by a NAND gate followed by an inverter:
    Logical effort (3γ+1)/(γ+1) × (γ+2)/(γ+1) × 1 ≈ 3.1
    Effort delay over three stages is 3 times cube root of 3.1 ≈ 4.4.
    Parasitic delay (3 + 2 + 1) pinv = 6 pinv ≈ 6

  • Single, dynamic gate pulled up when φ=0 and evaluated when φ=1:
    Logical effort 2/(γ+1) ≈ 0.7
    Parasitic delay [(γ/2 + 2n) / (γ + 1)] pinv ≈ 4.3

  • Single, pseudo-nMOS gate:
    Logical effort falling (4/3)/(γ+1) ≈ 0.4, rising 4/(γ+1) ≈ 1.3, average 0.9
    Parasitic delay [(γ/3 + 6×4/3) / (γ + 1)] pinv ≈ 2.9

  • Comparison:
    Pseudo nMOS faster than dynamic faster than static faster than chain.

  • Capacitive load:
    The break points are interesting, so tabulate the overall unitless delay for a range of loads from 1 up to 25. Static becomes slowest as soon as the load rises above 1. Dynamic becomes fastest when the load rises above 7, and the chain becomes fastest as the load rises above 20.
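
The tabulation suggested in the answer can be sketched as follows, using the logical efforts and parasitic delays worked out above with γ=2 and pinv=1. The function names are invented for illustration, the chain is assumed to have no branching and to spread its effort equally over its three stages, and the single gates are assumed to drive the whole load in one stage.

```python
GAMMA = 2.0   # approximation used in the worked answer
P_INV = 1.0   # parasitic delay of an inverter
N = 6         # number of inputs


def static_nor(h):
    g = (N * GAMMA + 1) / (GAMMA + 1)
    return g * h + N * P_INV


def chain(h):
    # 3-input NOR, then 2-input NAND, then inverter
    g = (3 * GAMMA + 1) / (GAMMA + 1) * (GAMMA + 2) / (GAMMA + 1) * 1
    return 3 * (g * h) ** (1 / 3) + (3 + 2 + 1) * P_INV


def dynamic_nor(h):
    g = 2 / (GAMMA + 1)
    p = (GAMMA / 2 + 2 * N) / (GAMMA + 1) * P_INV
    return g * h + p


def pseudo_nmos_nor(h):
    g_fall = (4 / 3) / (GAMMA + 1)
    g_rise = 4 / (GAMMA + 1)
    g = (g_fall + g_rise) / 2                 # average of falling and rising
    p = (GAMMA / 3 + N * 4 / 3) / (GAMMA + 1) * P_INV
    return g * h + p


DESIGNS = {
    "static": static_nor,
    "chain": chain,
    "dynamic": dynamic_nor,
    "pseudo-nMOS": pseudo_nmos_nor,
}


def delays(h):
    """Unitless delay of each design when driving an electrical effort of h."""
    return {name: design(h) for name, design in DESIGNS.items()}


def fastest(h):
    d = delays(h)
    return min(d, key=d.get)


def slowest(h):
    d = delays(h)
    return max(d, key=d.get)


if __name__ == "__main__":
    for h in range(1, 26):
        row = "  ".join(f"{name}={value:5.1f}" for name, value in delays(h).items())
        print(f"h={h:2d}  {row}  fastest={fastest(h)}")
```

Running the table shows the break points falling close to the loads quoted in the answer; the exact crossover values depend on the rounding of γ and pinv.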



Fundamental limitations

  • The constant field model of MOS scaling applies a dimensionless factor α to all manufacturing dimensions (length, width and thickness), leaving the electric field and channel thickness unchanged. Eg with α=2 all dimensions would be halved.

  • Constant voltage is an alternative model in which only the manufacturing dimensions are scaled, leaving voltages unchanged, so the channel thickness increases by a factor α.

  • The yield of a fabrication process is the proportion of manufactured chips that work. The yield roughly varies as follows with the die size A and the defect density D: Yield = k·e^(-A·D)

  • Other technologies that may be used include BiCMOS (mixing bipolar and CMOS on a single chip), Gallium arsenide, directly coupled FET logic, superconductors and quantum computing.

  • Modern chips can be tested using electron microscopes that are synchronised to the clock.
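
The yield formula in this section is easy to evaluate directly. This is a minimal sketch: the function name die_yield and the example numbers are made up for illustration, and k is simply taken as 1 unless given.

```python
import math


def die_yield(area, defect_density, k=1.0):
    """Approximate yield for die size `area` and defect density `defect_density`.

    The units only need to agree, eg area in cm^2 and defects per cm^2.
    """
    return k * math.exp(-area * defect_density)


if __name__ == "__main__":
    for area in (0.5, 1.0, 2.0):
        print(area, round(die_yield(area, 0.5), 3))
```

With k=1, doubling the die area squares the yield, which is why large dies become expensive quickly.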

Self timed circuits

Self timed circuits are coming back into fashion for a number of reasons:

  • Wire delays are becoming more significant than logic delays, so clock skew is becoming a problem

  • Distributing a fast clock and running dynamic circuitry takes up a lot of power, even when calculations aren’t being made

  • The maximum clock speed is governed by the worst delay, which may be very occasional

  • There must be a safety margin on the clock speed to cater for variations in performance with age etc.

  • A strong clock signal will produce electromagnetic radiation that may pose a threat to security or reliability.

  • Matched delays involve local clocking via a delay element chosen by watching the critical paths, which requires very careful design.

  • Completion detection involves encoding a completion signal along with the data.

Delay Models

  • Fundamental mode circuits assume upper and lower bounds on gate and wire delays so that outputs settle between changes of the inputs.

  • Speed independent circuits assume that all gate delays are finite (but unbounded) but with no wire delays. Completion detection is clearly necessary with this approach.

  • Delay insensitive circuits assume that both gate and wire delays are finite (but unbounded).

  • Quasi delay insensitive circuits broaden the DI model to allow isochronic forks, separate paths where the difference in delays on two paths is less than a gate delay.

  • Field forks are a special arrangement in MOS where a signal can control a sequence of transistors by running polysilicon across their gates. It can then be assumed that the transistors will switch in the same sequence.

  • Completion signals can be encoded using dual rail encoding. Two wires are used to represent every logical bit:

Code (Q1Q0)   Meaning
00            Clear
01            Logical 0
10            Logical 1

Eg every A in a truth table is now replaced by an A0 and A1

  • Completion detection can be achieved by ORing each pair of wires and then using a tree of Muller C-elements spanning all the signal wires.

  • The Muller C-element can be considered as a majority gate with feedback:

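
Dual rail encoding and completion detection with C-elements can be sketched behaviourally. The names encode, CElement and completion are invented for illustration, and for brevity the C-elements are connected as a simple chain rather than a balanced tree; the logical result is the same, only the depth differs.

```python
def encode(bit):
    """Dual rail code (Q1, Q0) for a logical bit; None stands for clear."""
    if bit is None:
        return (0, 0)
    return (1, 0) if bit else (0, 1)


class CElement:
    """Muller C-element: a majority gate over two inputs and its own output."""

    def __init__(self):
        self.out = 0

    def update(self, a, b):
        # The output changes only when both inputs agree; otherwise it holds.
        self.out = 1 if (a + b + self.out) >= 2 else 0
        return self.out


def completion(pairs, elements):
    """OR each wire pair, then combine the results with a chain of C-elements.

    `elements` must hold one C-element fewer than there are pairs.
    """
    valid = [q1 | q0 for q1, q0 in pairs]
    result = valid[0]
    for element, v in zip(elements, valid[1:]):
        result = element.update(result, v)
    return result


if __name__ == "__main__":
    elements = [CElement(), CElement()]
    words = [
        [None, None, None],   # all clear
        [1, None, 0],         # partly defined
        [1, 0, 0],            # fully defined
        [None, 0, 0],         # partly cleared
        [None, None, None],   # all clear again
    ]
    for word in words:
        print(word, completion([encode(b) for b in word], elements))
```

The completion signal rises only once every bit is defined and falls only once every bit is clear again, holding its value in between.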


Seitz’s weak conditions

This is a protocol to ensure that large blocks can be composed safely:

  1. Some input becomes defined before any output becomes defined

  2. All inputs become defined before all outputs become defined

  3. All outputs become defined before any input becomes undefined

  4. Some input becomes undefined before any output becomes undefined

  5. All inputs become undefined before all outputs become undefined

  6. All outputs become undefined before any input becomes defined
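
The six conditions can be read as ordering constraints on event times, which gives a simple checker for one cycle of a block. This is only one interpretation: the function name and argument layout are assumptions, "before" is taken to mean strictly earlier, and the time at which the first input becomes defined in the following cycle has to be supplied separately to check condition 6.

```python
def satisfies_weak_conditions(in_def, out_def, in_undef, out_undef, next_in_def):
    """Check Seitz's weak conditions for one cycle of a block.

    in_def, out_def:     times at which each input / output becomes defined
    in_undef, out_undef: times at which each input / output becomes undefined
    next_in_def:         time at which the first input is next defined
    """
    return all([
        min(in_def) < min(out_def),        # 1. some input before any output (defined)
        max(in_def) < max(out_def),        # 2. all inputs before all outputs (defined)
        max(out_def) < min(in_undef),      # 3. all outputs defined before any input undefined
        min(in_undef) < min(out_undef),    # 4. some input before any output (undefined)
        max(in_undef) < max(out_undef),    # 5. all inputs before all outputs (undefined)
        max(out_undef) < next_in_def,      # 6. all outputs undefined before any input defined
    ])


if __name__ == "__main__":
    print(satisfies_weak_conditions([1, 3], [2, 4], [5, 7], [6, 8], 9))
    print(satisfies_weak_conditions([1, 3], [0.5, 4], [5, 7], [6, 8], 9))
```

Note that condition 1 still allows an output to become defined before every input has arrived, which is what distinguishes the weak conditions from a stricter all-inputs-first discipline.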

Event FIFO

  • Self-timed micropipelines can be created by interposing latches controlled by the events that store values on a data bus feeding through blocks of combinatorial logic.







Unless otherwise noted, content on this site is licensed under Creative Commons Attribution 2.5. Computer_Science/VLSI.html was last modified on 2008-09-27 08:56:07