Making Monte Carlo Simulations Viable within an RFIC Design Flow
By Andy Howard, Thomas Miller, Richard Lazansky, Agilent EEsof EDA
We have developed a technique for running
Monte Carlo simulations on the extracted views of RFIC circuits
that makes these simulations possible in a very reasonable
length of time. The technique is based on the combination
of three things: 1) numerous incremental improvements to
harmonic balance that have improved its speed and capacity,
2) a Monte Carlo analysis technique that speeds up these
simulations, and 3) the use of multiple CPUs in parallel.
The application of this technique shows designers the variability
of their circuits and enables them to achieve high-yield
designs much more quickly than without applying statistical
simulation.
Index Terms: DFM, Monte Carlo methods, RFIC, power amplifiers,
simulation.
I. Introduction
The simulation of RFIC circuits is often very time consuming,
especially when using time-domain based simulators and when
simulating extracted views that include hundreds of thousands
or possibly millions of parasitic elements. Simulating such
circuits, even with nominal model parameters, can require
so much time that most designers would not even consider
running Monte Carlo simulations of their extracted views.
However, the failure to carry out such simulations can be
very expensive, as designers end up having only a limited
understanding of the yield they will obtain in production.
Quite likely, the yield will be such that a re-spin will
be necessary, incurring additional mask costs as well as
lost market opportunity.
This paper shows simulation statistics and results from
the extracted view of a power amplifier from a wireless
LAN transceiver reference design [1]. (This
design is based on a fabricated design that was modified
to use a generic PDK (Process Design Kit) in place of the
actual foundry PDK. The generic PDK is similar
in complexity to an actual foundry PDK, but the model parameters
have been changed to prevent the unauthorized disclosure
of a real foundry’s IP. The use of this generic PDK
enables us to share the design database and simulation
results freely with our customers.) Also included is a discussion
of what we have done to speed up these Monte Carlo simulations.

II. Simulation Results Using Harmonic Balance
We used GoldenGate Monte Carlo analysis to simulate the
extracted view of the power amplifier cell. The extracted
view had slightly more than 250,000 parasitic elements,
including resistors, capacitors and mutual inductors, and
464 nonlinear devices. A single, non-Monte Carlo simulation
of this extracted view required 4 minutes and 57 seconds
on a 3.4 GHz Linux PC. This is a remarkably short simulation
time, given the complexity of the circuit, and is the result
of numerous, incremental improvements to our harmonic balance
engine in recent years. It is this fundamental simulation
speed and capacity that now makes possible the Monte Carlo
simulations described here. The PDK had process and mismatch
statistical variables defined in the model files. A 100-iteration
Monte Carlo simulation using only the process variables
required 3 hours and 24 minutes on the same PC. This is
less than half of the 8 hours and 15 minutes (100 times
the non-Monte Carlo simulation time) that would be required
if there were no algorithm applied to speed up Monte Carlo
iterations after the initial one.
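The run-time comparison above can be checked with a few lines of arithmetic. The following is only a sketch that reuses the times quoted in the text; the variable names are ours.

```python
from datetime import timedelta

# Times quoted in the text for the 3.4 GHz Linux PC.
nominal = timedelta(minutes=4, seconds=57)   # single non-Monte Carlo simulation
measured = timedelta(hours=3, minutes=24)    # 100-iteration Monte Carlo run
iterations = 100

# Time needed if every iteration cost as much as the nominal simulation.
naive = nominal * iterations

# Ratio of the naive estimate to the measured run time.
speedup = naive / measured

print(naive)               # 8:15:00
print(round(speedup, 2))   # 2.43
```

On these numbers, the average cost of a Monte Carlo iteration is roughly 40% of the cost of the nominal simulation.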
The algorithm used to speed up Monte Carlo iterations after
the first one makes use of the fact that the circuit’s
response does not change much from one Monte Carlo iteration
to the next. Because of this, and since we are running harmonic
balance, which solves for the steady-state solution, the
solution of the circuit at Monte Carlo iteration N may use
the solution at iteration N-1 as an initial guess and thus
converge very quickly. This is in direct contrast to time-domain
simulators that do not solve for the steady state solution
directly and require that the simulation start from time=0
with each Monte Carlo iteration.
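The benefit of reusing the previous solution as an initial guess can be illustrated with a toy problem. The sketch below is not the GoldenGate algorithm and is not harmonic balance. It applies Newton's method to a single made-up resistor-diode node equation, and every component value and variation in it is an assumption chosen for illustration. It counts Newton iterations over a small Monte Carlo run, once starting every solve from zero and once starting from the previous iteration's solution.

```python
import math
import random

VT = 0.025  # thermal voltage in volts (illustrative value)

def newton_solve(vs, r, i_s, v0, tol=1e-9, max_iter=500):
    """Solve (v - vs)/r + i_s*(exp(v/VT) - 1) = 0 for v.

    Returns the solution and the number of Newton iterations used.
    """
    v = v0
    for k in range(1, max_iter + 1):
        e = math.exp(v / VT)
        f = (v - vs) / r + i_s * (e - 1.0)   # node current residual
        df = 1.0 / r + i_s * e / VT          # derivative of the residual
        step = f / df
        v -= step
        if abs(step) < tol:
            return v, k
    raise RuntimeError("Newton did not converge")

def run_monte_carlo(n, warm_start, seed=0):
    """Run n toy Monte Carlo solves; return (total Newton iterations, solutions)."""
    rng = random.Random(seed)    # same seed gives the same parameter samples
    guess = 0.0                  # cold-start initial guess
    total = 0
    solutions = []
    for _ in range(n):
        # Hypothetical process variation on the resistor and the diode.
        r = 1000.0 * (1.0 + rng.gauss(0.0, 0.03))
        i_s = 1e-14 * (1.0 + rng.gauss(0.0, 0.05))
        v, iters = newton_solve(5.0, r, i_s, guess)
        total += iters
        solutions.append(v)
        if warm_start:
            guess = v            # reuse this solution for the next iteration
    return total, solutions

cold_total, cold_v = run_monte_carlo(50, warm_start=False)
warm_total, warm_v = run_monte_carlo(50, warm_start=True)
print("Newton iterations, cold start:", cold_total)
print("Newton iterations, warm start:", warm_total)
```

In this toy case both runs reach the same solutions, but the warm-started run needs far fewer Newton iterations after the first solve, because each new solution lies close to the previous one.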
Each harmonic balance Monte Carlo iteration was simulated at
two input power levels, one in the linear region at -30
dBm, and one near the 1-dB gain compression region at -1
dBm.
Figure 1 shows the variation in the small-signal
gain of the extracted view of the power amplifier. It clearly
shows that there is a significant variation in the gain
and that if, for example, a specification of 22 dB were
required, there would be significant yield loss.
Figure 2 shows the variation in the output
power of the power amplifier when the input power is -1
dBm. It indicates that if an output power of 19 dBm is sufficient
for this application, the yield should be quite high. If
an output power of 21 dBm or more is required, then the
yield loss might make it worthwhile to modify the design.
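Reading yield from results such as these amounts to counting how many Monte Carlo iterations meet a specification. The following is a minimal sketch of that calculation. The function name is ours, and the sample values are placeholders made up for illustration, not data from Figures 1 or 2.

```python
def estimate_yield(samples, spec_min):
    """Return the fraction of Monte Carlo samples that meet a lower-limit spec."""
    if not samples:
        raise ValueError("need at least one sample")
    passing = sum(1 for s in samples if s >= spec_min)
    return passing / len(samples)

# Placeholder gain values in dB; NOT taken from the simulation results.
placeholder_gain_db = [21.4, 22.3, 22.9, 21.8, 23.1, 22.0, 21.1, 22.6]

print(estimate_yield(placeholder_gain_db, spec_min=22.0))  # 0.625
```

With only a small number of iterations, such an estimate is coarse; the 100-iteration runs described here give a correspondingly finer picture.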
III. Speeding Up Monte Carlo Simulations by Using Parallel CPUs
Monte Carlo simulations may be sped up greatly by running
them in parallel on different machines or by using different
CPUs on a single machine.

When running a parallel Monte Carlo simulation, a “sentinel”
simulation is done first to derive a “start from”
condition for the other iterations. With enough processors, it is
possible for the parallel Monte Carlo simulation time to approach twice the
time required for the initial Monte Carlo iteration; i.e.,
a 200-iteration Monte Carlo run can be done in less than
twice the time of a single nominal simulation. This is not very practical,
but running 200 iterations on 20 processors is. A rule of
thumb for the time required to run a parallel Monte Carlo
simulation is 1.5 × (n/p) × x, where n is the number of Monte
Carlo iterations, p is the number of processors, and x is
the time required for the first Monte Carlo iteration.
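The rule of thumb is easy to evaluate for a planned run. The sketch below applies it to the 200-iteration, 20-processor case mentioned above. As an assumption, it uses the 4 minute 57 second nominal simulation time of the PA as a stand-in for x; the function name is ours.

```python
from datetime import timedelta

def parallel_mc_time(n, p, x_seconds, factor=1.5):
    """Rule-of-thumb run time in seconds: factor * (n / p) * x."""
    if n <= 0 or p <= 0:
        raise ValueError("n and p must be positive")
    return factor * (n / p) * x_seconds

# Assumption: x is taken as the nominal simulation time (4 min 57 s).
x = 4 * 60 + 57

estimate = parallel_mc_time(n=200, p=20, x_seconds=x)
print(timedelta(seconds=estimate))  # 1:14:15
```

This is only a rough planning estimate. The actual time depends on the speed of the processors and on how quickly the later iterations converge.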
This same 100-iteration Monte Carlo simulation of the PA’s
extracted view was repeated on an LSF (Load Sharing Facility
from Platform™ [2]) cluster of 5 medium-capability
CPUs. This required just over 1 hour. (With PCs that have
multiple CPUs becoming more common, another option is to
run a parallel Monte Carlo simulation on such a PC, which
would not require the use of LSF.)
Using this technique, it is possible to get statistical
information on the performance of your circuit, even if
it is the extracted view, in a very reasonable amount of
time. Assuming the statistical process information is accurate,
this gives you the information you need to understand what
your circuit’s variability and yield will be in manufacturing,
and helps you determine whether modifications to the design
would be worthwhile.
Figure 3 shows graphically the speedup
in simulation time due to the application of this technique.
Line 1 shows the amount of time that would be required to run the Monte
Carlo simulations if each iteration had to start over from
scratch. Line 2 shows the simulation time achieved on a
single, fast CPU when the initial Monte Carlo iteration
is used as an initial guess for all subsequent iterations.
Line 3 shows the simulation time when
5 medium-speed CPUs are used in parallel. In Figure
3, “A” is the speedup due to the fast
Monte Carlo algorithm, and “B” is the speedup
due to running the Monte Carlo iterations in parallel. This
secondary speedup will vary with the number of CPUs used
as well as their speed.

IV. Simulation Results Using Fast Envelope Transient
It is also possible to use the Fast Envelope Transient analysis
technique [3] to simulate the modulated output power and
spectrum of the extracted view of the same PA driven by
a WLAN input signal. This simulation makes use of a source
that is able to read arbitrary I and Q time-domain data
from a file, enabling designers to determine the performance
of their designs with real, modulated signals. They no longer
have to rely on performance estimates derived from one-
or two-tone simulations.
Figure 4 shows the output spectra with
a WLAN input signal centered at 2.45 GHz, with input power
at -10 dBm and at 0 dBm, for the nominal model parameters.

Each simulation of the extracted view of the PA using
Fast Envelope Transient required about 14 minutes for a
single input power level of -2 dBm. Therefore, 100 Monte
Carlo iterations using Fast Envelope would require about
24 hours without running them in parallel. Unfortunately,
when running Fast Envelope Transient with Monte Carlo, the
simulator is unable to reuse the solution from the initial
Monte Carlo iteration. Running the iterations on multiple CPUs
in parallel can nevertheless reduce this to less than an
overnight simulation.
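The Fast Envelope estimate can be worked through in the same way. The sketch below assumes ideal scaling, in which iterations are spread evenly over the CPUs with no overhead and no reuse of earlier solutions. The function name and the choice of 5 CPUs are ours.

```python
import math
from datetime import timedelta

def ideal_parallel_time(n_iterations, n_cpus, per_iteration):
    """Run time when iterations are spread evenly over CPUs with no overhead."""
    batches = math.ceil(n_iterations / n_cpus)   # rounds of simultaneous runs
    return batches * per_iteration

per_iter = timedelta(minutes=14)   # Fast Envelope time per iteration, from the text

print(ideal_parallel_time(100, 1, per_iter))   # 23:20:00
print(ideal_parallel_time(100, 5, per_iter))   # 4:40:00
```

A real cluster adds scheduling and licensing overhead, so these figures are lower bounds rather than predictions.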
V. Conclusion
With recent advances in simulator speed and capacity as
well as the ability to use parallel CPUs, it is now possible
to obtain statistical information on even very large RFIC
blocks, including extracted parasitics. These simulations
would have been completely beyond consideration just a few
years ago. Using the techniques described here, designers
are now able to understand the variability of their designs
before they are fabricated. The Monte Carlo simulations
described here should now be a standard part of any RFIC
design flow.
Acknowledgement
The authors wish to acknowledge the assistance and support
of the GoldenGate team.
References
[1] A. Howard, “An innovative approach to faster RFIC
transmitter design,” http://www.wirelessdesignmag.com/0818_2005.html
[2] http://www.platform.com
[3] E. Ngoya, R. Larcheveque, “Envelop transient analysis:
a new method for the transient and steady state analysis
of microwave communication circuits and systems,”
IEEE MTT Symposium Digest, pp. 1365-1368, 1996.
[4] V. Veremey, “Simulation and design verification
for fully integrated radio frequency (RF) transceivers,
problems and solutions,” 2006 IEEE Int. Conf. on Mathematical
Methods in Electromagnetic Theory, pp. 258-263, June 26-29,
2006.
Agilent
EEsof EDA
www.agilent.com