
Microarchitecture Verification by Compositional Model Checking
Ranjit Jhala¹ * and Kenneth L. McMillan²
¹ University of California at Berkeley
² Cadence Berkeley Labs
Abstract. Compositional model checking is used to verify a processor
microarchitecture containing most of the features of a modern
microprocessor, including branch prediction, speculative execution,
out-of-order execution and a load-store buffer supporting re-ordering and
load forwarding. We observe that the proof methodology scales well, in that
the incremental proof cost of each feature is low. The proof is also quite
concise with respect to proofs of similar microarchitecture models using
other methods.

1 Introduction
Compositional model checking methods reduce the proof of a complex system,
through decomposition and abstraction, to a set of lemmas that can be
verified by a model checker. It has been shown that the proof of systems with
unbounded or infinite state can be reduced to tractable model checking
problems on finite state abstractions. For example, an instruction processing
unit using Tomasulo's algorithm [Tom67] was proved using the method [McM00]
for unbounded resources. The proof was substantially simpler than that of a
similar model using a general purpose theorem prover [AP99]. The safety proof
involved just three simple lemmas verified by a model checker. The relative
simplicity of the proof using compositional model checking owed principally
to the lack of user generated inductive invariants and the lesser need for
manual proof guidance. Nonetheless, the important question of the scalability
of the method remains open. That is, does the manual proof effort increase in
reasonable proportion to the size and complexity of a system?
We approach this question by considering the verification of a complete
processor microarchitecture, containing most of the important features of a
modern microprocessor. These include branch prediction, speculative
execution, out-of-order execution (with in-order retirement and clean
exceptions) and a load-store buffer supporting re-ordering and load
forwarding. The question is whether the complexity of the proof increases by
some reasonable increment with each new architectural feature, or whether it
increases intractably, making proofs of complex systems impractical. We find
that the incremental proof cost of each architectural feature is small (just
a few additional lemmas) and that the interaction of these features, though
complex, does not make the proof expand intractably.
* Supported by SRC contract 99-TJ-683.003, AFOSR MURI grant F49620-00-1-0327,
NSF Theory grant CCR-9988172
The microarchitecture model that we verify is similar in its feature set to
models that have been verified using theorem proving methods [HGS00,SH98].
We compare our proof to the proofs obtained by these methods, with emphasis
on the use of inductive invariants and its effect on proof complexity.
Section 2 provides a brief overview of the proof method. Then section 3
describes the microarchitecture model that we verified, and its
specification. In section 4 we discuss the proof, and consider the question
of scalability. Section 5 compares the proof with proofs obtained previously
for similar microarchitectures. In section 6 we conclude with some remarks on
the strengths and weaknesses of the method, and how the weaknesses might be
addressed.
2 Background
To verify the microarchitecture, we use the SMV proof assistant [McM00]. This
tool supports the reduction of correctness conditions for unbounded or
infinite-state systems to lemmas that can be verified by model checking. The
general approach is to divide the intended computation into "units of work"
that use only finite resources in the implementation, such as instructions in
a processor, or packets in a packet router. Correctness of a given unit of
work is then reduced to a finite state problem using a built-in collection of
abstract interpretations. In effect, we disregard those components of the
system state not involved in the given unit of work. Because specifications
can be temporal, we avoid the need to write and verify an inductive invariant
of the system. Instead, we exploit the model checker's ability to compute the
reachable states (strongest invariant) of the abstract models. This greatly
simplifies the proofs.
The proof methodology. A system is specified with respect to a reference
model. For a processor, this is an "instruction set architecture" (ISA) model
that executes one instruction at a time in program order. The correctness
condition is a temporal property relating executions of the implementation to
executions of the reference model. We decompose correctness into "units of
work" by specifying refinement relations. These are temporal properties
specifying the data values at internal points in the implementation in terms
of the reference model. For example, in a processor we may specify the
operands read from the register file and the results computed by the ALU. To
make such specifications possible, we may add auxiliary state variables that
record the correct data values as they are computed by the reference model. A
definitional mechanism in the proof assistant allows us to add auxiliary
variables in a sound manner.
Mutually inductive temporal proofs. The refinement relations are then
proved by mutual induction over time. Each refinement relation is a temporal
property of the form $G\,\phi$, meaning that $\phi$ is true at all times $t$.
To prove that $\phi$ is true at time $t$, we may assume by induction that the
other refinement relations hold for all times less than $t$. This is useful
in a methodology based on model checking, because the notion that $q$ up to
time $t-1$ implies $p$ at time $t$ can be expressed in temporal logic as
$\neg(q \; U \; \neg p)$. Hence, this proposition can be checked by a model
checker.¹ This mutually inductive approach is important to the proof
decomposition. It allows us to assume, for example, when proving correctness
of an instruction's source operand, that the results of all earlier
instructions have been correct. Note that this is quite different from the
method of proof by invariant, in which we show that some state property at
time $t-1$ implies itself at $t$. Here the properties are temporal, and the
inductive hypotheses are assumed for all times less than $t$, and not just at
$t-1$. This is important, since it allows us to avoid writing auxiliary
invariants.

¹ In some cases we can also assume that another refinement relation holds for
all times less than or equal to $t$, provided we do not do this in a circular
way.
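The equivalence between the inductive step and the formula
$\neg(q \; U \; \neg p)$ can be checked by brute force on short traces. The
following Python sketch is only an illustration and is not part of the SMV
proof: it interprets the until operator over finite Boolean traces (an
assumption made for the sake of a small example), and the function names are
hypothetical.

```python
from itertools import product

def until_holds(q, not_p):
    """q U not_p on a finite trace: not_p holds at some time t,
    and q holds at every time strictly before t."""
    return any(not_p[t] and all(q[:t]) for t in range(len(q)))

def induction_step_holds(q, p):
    """For every time t: if q held at all times < t, then p holds at t."""
    return all(p[t] for t in range(len(p)) if all(q[:t]))

def equivalent_on_all_traces(n):
    """Compare the two formulations on every pair of traces of length n."""
    for bits in product([False, True], repeat=2 * n):
        q, p = list(bits[:n]), list(bits[n:])
        not_p = [not x for x in p]
        if induction_step_holds(q, p) != (not until_holds(q, not_p)):
            return False
    return True

def mutual_induction_sound(n):
    """If p follows from earlier q and q follows from earlier p,
    then both hold at all times of the trace."""
    for bits in product([False, True], repeat=2 * n):
        q, p = list(bits[:n]), list(bits[n:])
        if induction_step_holds(q, p) and induction_step_holds(p, q):
            if not (all(p) and all(q)):
                return False
    return True
```

The second function illustrates why the mutual assumption is not circular:
at time 0 there are no earlier times, so each property must hold
unconditionally there, and the argument then advances one step at a time.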
Temporal case splitting. Next we specialize the properties we wish to prove,
so that they depend on only a finite part of the overall state. For example,
suppose there is a state variable $v$, which is read and written by processes
$p_1, \ldots, p_n$. We wish to prove a property $G\,\phi$ of $v$. We add an
auxiliary state variable $w$ which points to the most recent writer of
variable $v$. Now, suppose we can prove for all process indices $i$ that
$G((w = i) \Rightarrow \phi)$. That is, $\phi$ holds whenever the most recent
writer is $p_i$. Then $G\,\phi$ must hold, since at all times $w$ must have
some value. We call this "splitting cases" on the variable $w$, since it
generates a parameterized property with one instance for each value of $w$.
For a given value of $i$, we may now be able to abstract away all processes
except $p_i$, since the case $w = i$ depends directly only on process $p_i$.
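As a small illustration of the case split, the Python sketch below checks
$G\,\phi$ on a finite trace by checking one instance
$G((w = i) \Rightarrow \phi)$ per writer index. The trace encoding (a list of
dictionaries with a key "w") and the function names are assumptions made for
this example only.

```python
def globally(trace, pred):
    """G pred on a finite trace: pred holds in every state."""
    return all(pred(state) for state in trace)

def case_split_holds(trace, writers, phi):
    """Check G((w = i) => phi) separately for each writer index i.
    When w always takes a value in `writers`, this is equivalent to G phi."""
    return all(
        globally(trace, lambda s, i=i: s["w"] != i or phi(s))
        for i in writers
    )
```

Each instance mentions only one value of w, which is what allows the other
processes to be abstracted away when that instance is verified.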
Abstract interpretation. Finally, we wish to reduce the verification of each
parameterized property to a set of tractable model checking problems. The
difficulty is that there may be variables in the model with large or
unbounded ranges (such as memory addresses) and arrays with a large or
unbounded number of elements (such as memory arrays). We solve this problem
by using abstract interpretation to reduce each data type to a small number
of abstract values. For example, suppose we have a property with a parameter
$i$ ranging over memory addresses. We reduce the type $A$ of memory addresses
to a set containing two values: the parameter value $i$, and a symbol
$A \setminus i$ representing all values other than $i$. In the abstract
interpretation, accessing an array at location $i$ will produce the value of
that location, whereas accessing the array at $A \setminus i$ produces
$\bot$, a symbol representing an unknown value.
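The two-valued address abstraction can be mimicked in a few lines of Python.
This is a hedged sketch rather than the abstraction built into the proof
assistant: the sentinel values OTHER (for $A \setminus i$) and BOTTOM (for
$\bot$) and all function names are hypothetical.

```python
BOTTOM = None      # stands for the unknown value (bottom)
OTHER = "other"    # stands for the abstract value A \ i

def abstract_addr(addr, i):
    """Map a concrete address to the abstract type {i, A \\ i}."""
    return i if addr == i else OTHER

def abstract_read(memory, addr, i):
    """Reading location i yields its value; any other location is unknown."""
    return memory[i] if abstract_addr(addr, i) == i else BOTTOM

def abstract_eq(a, b):
    """Three-valued equality on abstract addresses: two addresses that both
    fall in A \\ i may or may not be equal, so the answer is unknown."""
    if a == OTHER and b == OTHER:
        return BOTTOM
    return a == b
```

The sketch shows why the abstraction is sound but possibly too coarse: every
definite answer agrees with the concrete model, and the remaining cases are
reported as unknown rather than guessed.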
In effect, for each time the user "splits cases" on a variable of a given
type, there is one value in the abstract type and one element in each
abstracted array indexed by that type. If there are two parameters $i$ and
$j$ of type $A$, the proof assistant may split the problem into two cases:
one where $i = j$ and one where $i \neq j$. Alternatively, it may consider
separately the cases $i < j$, $i = j$ and $i > j$, if information about the
order of these values is important to the property.
The abstractions used by the proof assistant are sound, in the sense that
validity of a formula in the abstract interpretation implies validity in the
concrete model for all valuations of the parameters. Of course, the
abstraction may be too coarse to verify the given property (i.e., the truth
value in the abstract model may be $\bot$) even though the property is true.
Note, however, that the user does not need to verify the correctness of the
abstraction, since this is drawn from a fixed set built into the proof
assistant.
The proof process proceeds as follows. First, the user specifies refinement
relations (and other lemmas, as necessary), which are proved by mutual
temporal induction. These properties are parameterized by "splitting cases"
on appropriate variables, so that any particular case depends on only a
finite part of the system state. Finally, the proof assistant abstracts the
model relative to the parameter values, reducing the types with large or
unbounded ranges to small finite sets. The resulting proof obligations are
discharged by a model checker.
We now consider how this methodology can be applied to processor
microarchitectures with features such as speculative execution, out-of-order
execution and load-store buffers.
3 The Processor Model
The processor microarchitecture that we model has out-of-order, speculative
execution using a variant of Tomasulo's algorithm with a reorder buffer. It
implements branch prediction and precise exceptions, and has an out-of-order
load-store buffer with load forwarding. For simplicity, we separate program
and data memories. The model is generic, in that many functions, such as the
ALU (arithmetic-logic unit) and the instruction decoder have been replaced by
uninterpreted function symbols. A specific ISA may be implemented by defining
these functions appropriately. Our proof, however, is independent of these
functions.
3.1 The Specification
The microarchitecture is specified with respect to a reference model, which
executes one instruction per step in program order. The ISA consists of the
following instruction classes. A load (LD) takes two register operands,
source address and destination. It reads data memory at the source address,
and loads the value into the destination register. A store (ST) takes two
register operands, the source and the destination address. It stores the
source value at the destination address in data memory. An ALU operation
(ALU) takes two register operands and a destination register. This generic
instruction models all the instructions using the ALU by a single
uninterpreted function. Although we do not explicitly model immediate
operands, these can be folded into the generic ALU function. A branch (BC)
performs a test on its two register operands. If true, it sets the program
counter to the branch target value. Both the test and the branch target
computation are modeled by uninterpreted functions. A jump (JMP) sets the
program counter to the address in the source register. This is to implement
non-local jumps such as returns from exception handlers. Finally, an output
operation (OUT) sends its register operand to the processor's output port.
The LD, ST and ALU operations can cause an exception to be raised, in which
case control is transferred to the exception handler address. Asynchronous
interrupts are not modeled.
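The instruction classes above can be summarized as a one-step interpreter.
The Python sketch below is an illustration under several assumptions that are
not taken from the SMV model: instructions are encoded as tuples, the
uninterpreted functions are passed in as a dictionary `fns` with hypothetical
keys ("alu", "test", "target", "inc", "raises", "handler"), and unwritten
memory reads as 0.

```python
def isa_step(state, program, fns):
    """Execute one instruction of the reference model, in program order."""
    pc, regs, mem = state["pc"], state["regs"], state["mem"]
    instr = program[pc]
    op, args = instr[0], instr[1:]
    # LD, ST and ALU may raise an exception; control goes to the handler.
    if op in ("LD", "ST", "ALU") and fns["raises"](instr, regs):
        state["pc"] = fns["handler"]
        return state
    next_pc = fns["inc"](pc)          # uninterpreted PC increment
    if op == "LD":                    # (LD, address register, destination)
        addr_reg, dst = args
        regs[dst] = mem.get(regs[addr_reg], 0)
    elif op == "ST":                  # (ST, source, address register)
        src, addr_reg = args
        mem[regs[addr_reg]] = regs[src]
    elif op == "ALU":                 # (ALU, operand a, operand b, destination)
        a, b, dst = args
        regs[dst] = fns["alu"](regs[a], regs[b])
    elif op == "BC":                  # (BC, operand a, operand b)
        a, b = args
        if fns["test"](regs[a], regs[b]):
            next_pc = fns["target"](pc, instr)
    elif op == "JMP":                 # (JMP, source register)
        (src,) = args
        next_pc = regs[src]
    elif op == "OUT":                 # (OUT, register)
        (src,) = args
        state["out"].append(regs[src])
    state["pc"] = next_pc
    return state
```

Because the ALU, the branch test, the branch target and the PC increment are
parameters, any concrete choice of these functions gives a specific ISA, which
mirrors the role of the uninterpreted function symbols in the model.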
3.2 The Implementation Model
The microarchitecture is depicted in figure 1. It is out-of-order, in that
instructions are executed when their operands are available, not necessarily
in program order. Instruction execution begins by fetching the instruction
from program memory at the program counter address (PC). The instruction is
then decoded to determine the operation type, the operand registers, the
branch target, etc. The program counter is updated by incrementing its
current value. Since the increment depends on the instruction width, we model
incrementation by an uninterpreted function. In case of a conditional branch,
however, the branch predictor guesses the value of the branch condition. Thus
we continue fetching instructions even though the actual branch condition is
not yet known, at the risk of having to cancel the ensuing instructions if
the guess is incorrect. If the predicted branch condition is true, the PC is
loaded from the branch target. Since branch predictions do not affect
correctness, the branch predictor is modeled as a non-deterministic choice,
though this can be replaced by any desired function.

[Fig. 1. Microarchitecture. Blocks shown: program memory, PC, branch
predictor, decode, register file, reservation stations (RS), execution units
(EU), reorder buffer (RB), load-store queue (LSQ) and data memory. Labeled
paths: instructions, OPS, RES, branch results, retired results.]
The instruction then reads its source operands from the register file, and
is loaded into the next available reservation station (RS) to await
execution. A source register may contain an actual data value, or it may
contain a tag, pointing to the RS that will produce the data value when it
completes. In the case of a tag, the RS must wait until the corresponding
data value returns on the result bus (RES). When both operand values are
available, the instruction may be issued to an execution unit. When the
result of the operation is computed, it returns on the result bus, with its
tag, and may be forwarded to any instructions holding that tag. The result is
stored in the reorder buffer (RB) until the instruction retires. At
retirement, the result is written to the register file. Instructions are
retired in program order, so that the state of the register file is always
consistent. This allows clean recovery from exceptions or mispredicted
branches.
When a branch instruction retires, we compare the computed value of the
branch condition to the predicted value. If these are not the same,
subsequent instructions may have been fetched from an incorrect program
counter. Thus, they must be flushed. When this happens, the program counter
is set to the alternative that was not chosen at fetch time.
Load and store operations are recorded in a load-store buffer (LSQ) in
program order. In our model, this buffer is unbounded; however, it could be
refined by any fixed size buffer. Loads and stores are not necessarily
executed in program order. A load operation may execute after it has issued
(i.e., its operands have been obtained) and after all earlier stores to the
same address have executed. Alternatively, a load instruction may execute by
forwarding the data value from the most recent store to that address, even if
that store has not yet executed. A store instruction can execute after it has
issued, and after all previous loads and stores to the same address have
executed.²
The above conditions avoid the classic hazard conditions (RAW, WAR and
WAW), guaranteeing correct operation even when operations occur out of
program order. In addition, we must ensure that a store cannot execute until
the instruction has actually retired, since the store cannot be undone if the
instruction were to be flushed. When a store instruction retires, it is
marked committed in the load-store buffer, and cannot subsequently be
flushed. The choice of which available operation to execute is
non-deterministic, though this could be replaced by any desired scheduling
policy.
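The execution conditions for the load-store buffer can be written as
predicates over a queue held in program order. The following Python sketch is
an illustration only: the entry fields (kind, addr, executed, committed), the
use of `addr is None` to represent an operation that has not yet issued, and
the function names are assumptions, not part of the SMV model.

```python
def entry(kind, addr, executed=False, committed=False):
    """Hypothetical queue entry; addr is None until the operation issues."""
    return {"kind": kind, "addr": addr, "executed": executed, "committed": committed}

def conflicts(earlier, current):
    """An earlier operation may conflict if its address is unknown or equal."""
    return earlier["addr"] is None or earlier["addr"] == current["addr"]

def may_execute(queue, idx):
    """Return True if the operation at queue[idx] may execute now."""
    cur = queue[idx]
    if cur["executed"] or cur["addr"] is None:
        return False
    earlier = queue[:idx]
    if cur["kind"] == "LD":
        # A load waits for all earlier stores to the same address.
        return all(e["executed"] for e in earlier
                   if e["kind"] == "ST" and conflicts(e, cur))
    # A store must be committed (retired) and waits for all earlier
    # loads and stores to the same address.
    return cur["committed"] and all(e["executed"] for e in earlier
                                    if conflicts(e, cur))

def forwarding_store(queue, idx):
    """Index of the most recent earlier store to the load's address,
    or None if there is none or an intervening store address is unknown."""
    cur = queue[idx]
    if cur["kind"] != "LD" or cur["addr"] is None or cur["executed"]:
        return None
    for j in range(idx - 1, -1, -1):
        e = queue[j]
        if e["kind"] != "ST":
            continue
        if e["addr"] is None:
            return None
        if e["addr"] == cur["addr"]:
            return j
    return None
```

Treating an unknown address as a potential conflict reflects the remark that
the address operands of earlier operations must be known before a later
operation to the same address can safely execute.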
4 Verification
Our correctness criterion is that the sequence of output values produced by
the reference model and the microarchitecture model should be the same, for
corresponding initial states. The reference model chooses
non-deterministically at each time whether to take a step. By witnessing this
choice, we align the reference model's operation temporally with that of the
implementation.
The two most interesting aspects of the proof deal with speculative execution
and with partially ordered operations, such as register reads/writes or
memory loads/stores. We introduce proof decompositions to handle these
situations, using compositional model checking.³
4.1 Specifying refinement relations
Our basic approach is to decompose the proof into "units of work", in this
case instructions. We prove correctness of a single instruction, relative to
the reference model, given that all earlier instructions execute correctly.
To reduce the verification complexity, we may further decompose the
instruction into smaller steps, such as operand read, result computation,
memory load, etc. We then write refinement relations, specifying the data
values at various points in the implementation, in terms of the reference
model.
Of course, to specify data items in the implementation, we must determine
their correct values. This is done by defining auxiliary variables that
record the correct data values as computed by the reference model. For
example, when an instruction is fetched, the reference model executes it
atomically, computing the correct operand and result values. The instruction
is then stored in an RS. We record the correct operands and result for that
RS. For example, here is the SMV code that does this:
    if(¬stallout ∧ iopin in {ALU,LD,ST,BC}){
      next(aux[st_choice].opra) := opra;
      next(aux[st_choice].oprb) := oprb;
      next(aux[st_choice].res) := res;}
² Note this implies that the actual address operands of all earlier stores
(and loads) must be known before a load (store) can execute.
³ Proof and prover may be found at http://www-cad.eecs.berkeley.edu/~kenmcmil
Here, st_choice is the index of the reservation station, and opra, oprb and
res are values from the reference model. We now specify that, when the
reservation station holds an operand value, it is equal to the stored correct
value in the aux structure (and similarly for result values).
To do this, we must take into account speculative execution. That is, if an
instruction occurs after an exception or a mispredicted branch, we say it is
shadowed. A shadowed instruction does not correspond to any instruction
executed by the reference model. Thus we cannot specify its correct operand
and result values. In fact, these values are spurious, and must never affect
the register file or memory. To write refinement relations, we must know
whether an instruction in the implementation is shadowed. Fortunately, this
is easy to determine. We set an auxiliary state bit shadow when the predicted
branch condition differs from the correct branch condition, or when an
exception occurs. The shadow bit is cleared when a flush occurs. Here is the
SMV description:
    init(shadow) := 0;
    next(shadow) := ¬flush ∧ (shadow ∨
        ¬stallarch ∧ (exn_raised ∨ (opin = BC ∧ taken ≠ itaken)));
Here, taken is the correct branch condition (from the reference model) and
itaken is the predicted branch condition. Now, any instruction fetched while
shadow is true is marked shadowed, by setting the auxiliary bit
aux[st_choice].shadow. While shadow is set, we stall the reference model,
since no valid instructions are being executed. Now we write the refinement
relation for operands. We specify that if a non-shadowed RS holds an operand
value, it must be the correct value. Here is the specification for the a
operand:
    forall(k in TAG) layer lemma1 :
      if(st[k].valid ∧ st[k].opra.valid ∧ ¬aux[k].shadow)
        st[k].opra.val := aux[k].opra;
This specifies the a operand value for RS k, when it is valid (holding an
instruction), and when the a operand is valid, and when it is not shadowed.
Otherwise the value is unspecified. We can write a similar specification for
the result value, and for other data values in the machine as necessary.
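The update rule for the shadow bit described in this subsection can be read
as a small next-state function. The Python sketch below is a transliteration
for illustration, assuming the signal names given in the text (flush,
stallarch, exn_raised, opin, taken, itaken) and Boolean-valued inputs; it is
not the SMV source.

```python
def next_shadow(shadow, flush, stallarch, exn_raised, opin, taken, itaken):
    """Next value of the auxiliary shadow bit.

    The bit is cleared by a flush. Otherwise it stays set once set, and it
    becomes set when the reference model is not stalled and either an
    exception is raised or a branch prediction differs from the correct
    branch condition."""
    mispredicted = (opin == "BC") and (taken != itaken)
    return (not flush) and (shadow or ((not stallarch) and (exn_raised or mispredicted)))
```

For example, starting from a cleared bit, a mispredicted branch sets it, it
remains set while further instructions are fetched, and the flush caused by
the retiring branch clears it again.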
4.2 Verifying operand correctness
Now we must verify the above lemma. To verify data, we split cases on the
possible sources of the data. Here, an operand value we read is generated by
the most recent instruction to write the source register. We can identify
this instruction's RS by recording the tag of the most recent RS to write
each register. We then assume, by induction, that results computed at earlier
times are correct. We need one additional fact, however: that the most recent
writer in execution order is in fact the most recent writer in program order.
If this is the case, then we must read the same value read by the reference
model.
One way to establish this is to split cases on both the most recent writer in
the implementation and the most recent writer in program order. Since the
implementation retires instructions in program order, these two must be the
same, hence correct values are always read. However, there is a complexity
problem: the abstraction in this case will involve three distinct tag values,
and hence the states of three distinct RS's. In practice, we found the time
and space required to verify this model prohibitive. Instead, we used an
intermediate lemma to simplify the problem. We observed that a register value
is only read when no writes to the register are pending, in which case its
value is up-to-date with respect to the reference model. Thus, we specified
the register contents as follows:
    forall (i in REG) layer uptodateReg :
      if (¬ir[i].resvd) ir[i].val := r[i];
That is, if no write is pending to register ir[i], its value matches
reference model register r[i]. This is verified using the case split
described above, which is given to SMV as follows:
    subcase uptodateReg[i][k][c] of ir[j].val//uptodateReg
      for auxLastIssuedRS[j]=i ∧ auxLastWriterRS[j]=k ∧ r[j]=c;
That is, we let i be the last writer to register j in program order, k the
last writer in the implementation, and c the correct data value. In this case
there are only two distinguished tag values, i and k, so the abstraction
contains only two RS's.
In fact, the first attempt to check this property produced a counterexample
in which some abstracted instruction causes a flush, cancelling the
instruction that should write register j. The abstract model allows this
because the states of RS's other than i and k are unknown. To deal with this,
we introduce a non-interference lemma, stating that no unshadowed instruction
is flushed:
    forall(i in TAG) lemma5[i] : assert G
      (flush ⇒ shadow ∧ (complete_st ≠ i ⇒ ¬(st[i].valid ∧ ¬aux[i].shadow)));
Here, complete_st is the tag of the RS causing the flush. We prove this by
splitting cases on the flushing instruction. This eliminates the above
counterexample to the up-to-date register property, leaving another
counterexample in which a shadowed instruction writes register j and corrupts
its value. This calls for another lemma stating that no shadowed instruction
retires:
    lemma6 : assert G (retiring ⇒ ¬aux[complete_st].shadow);
This can be proved by splitting cases on the currently retiring instruction
and the instruction that set the shadow bit (e.g. a mispredicted branch).
That is, the latter must retire and cause a flush before the shadowed
instruction can retire. With this additional lemma, the up-to-date register
property is verified. Now operand correctness is easily proved by splitting
cases on the source register and the operand's tag, which indicates the data
source when forwarding from the result bus:
    subcase lemma1[i][j][c] of st[k].opra.val//lemma1
      for st[k].opra.tag = i ∧ aux[k].srca = j ∧ aux[k].opra = c;
The specification for results returning from execution units can be verified
using operand correctness. This requires a non-interference lemma stating
that unexpected results are never returned.
4.3 Verifying memory data correctness
We also specify the results returning from the data memory, as follows:
    lemma4 : assert G (¬mqaux[mq_head].shadow ∧ mem_ld ∧ mem_enable
      ∧ load_from_mem ⇒ mem_rd_data = mqaux[mq_head].data);
Here, mq_head points to the currently executing operation in the load-store
queue. That is, if the current operation is an unshadowed load, then the data
from memory are the correct data stored in the auxiliary array mqaux. We
break this into two cases: when data are read from memory and when data are
forwarded from the load-store queue. Here we consider only the former case,
although the latter is similar.
This property is similar to the one specifying values read from the register
file. Here, we must prove that, for any load, the most recently executed
store to the same address (call it $S_E$) is also the most recent in program
order (call it $S_P$). As before, we use auxiliary variables to identify
$S_E$ and $S_P$ in the queue. Splitting cases on these two stores and the
current load, we should be able to prove that $S_E$ and $S_P$ are the same,
hence read data are correct.
Unfortunately, the abstract model with two stores and one load is too large
to model check. We cannot solve this problem as before by writing an
"up-to-date" lemma for the memory, since we may read the memory when it is
not up-to-date. Instead, we split cases only on the current load $L$ and on
$S_E$. This produces a counterexample in which $S_E < S_P < L$ in program
order. That is, at the time $L$ occurs, $S_E$ has executed but $S_P$ has not.
This cannot really happen, because the unexecuted store $S_P$ would block
load $L$. However, since $S_P$ is abstracted, this information is lost. To
avoid splitting cases on $S_P$, we simply state as a lemma that
$S_P \le S_E$. In SMV, we say:
    lemma4a : assert G (¬mqaux[mq_head].shadow ∧ mem_ld ∧ mem_enable
      ∧ load_from_mem ⇒ (imtag[mem_addr] ≥ mqaux[mq_head].lastWrite));
Here imtag[mem_addr] is $S_E$, while mqaux[mq_head].lastWrite is $S_P$. This
can be proved using another lemma, stating that stores always occur in
program order. All three properties can be proved using just two memory queue
elements. We reduce the problem further by writing a refinement relation for
the data in the load-store queue. This allows us to abstract out the RS's
when proving memory properties. This required a lemma stating that unshadowed
queue elements are never flushed, which follows directly from the fact that
unshadowed RS's are never flushed. The resulting abstract models can be
handled easily by the model checker. At the cost of additional lemmas, we
have reduced an intractable problem to a tractable one.
4.4 Remaining steps
For the program counter (PC), we write a refinement relation stating that,
when the shadow bit is not set, the implementation PC equals the reference
model PC:
    layer opok : if(¬shadow) ipc := pc;
Since the PC can be loaded from an RS (in case of a flush) or from a register
(for a JMP), we split cases on the most recent reservation station to and on
the source register of the previous instruction. We also use the two lemmas
about speculation. Further refinement relations specify the decoded
instruction and branch target. This isolates the uninterpreted functions
computing these values.

Model                         Proof size
A (baseline)                  5700 bytes
B = A + out-of-order          7000 bytes
C = B + speculation           13K bytes
D = C + load-store buffer     18K bytes
Table 1. Proof size vs. feature set.
Finally, we must prove our overall correctness criterion, correctness of
outputs. The OUT instruction reads a register and sends its value to the
output port. Thus, the up-to-date register property suffices to prove output
correctness. Overall, the proof⁴ consists of the following elements: (1)
refinement maps for the program counter, instruction decoder, register file,
RS's and load-store queue, (2) two non-interference lemmas for speculative
execution, two for the result bus, and four for the load-store queue, (3)
case splitting instructions for the above and hints for adjusting the
abstractions, and (4) auxiliary variable declarations. All told, this
information comprises less than 18K bytes, somewhat less than the size of the
microarchitecture model and its specification.
To summarize, our strategy is to reduce the verification problem to "units of
work", in this case instructions. Since each instruction uses only finite
resources, we can verify its correctness using a finite abstraction of the
system. We identify the resources used by the instruction (e.g. RS's,
registers, etc.), by introducing auxiliary variables. Once we "split cases"
on these resources, the pointer types and arrays are automatically reduced,
yielding a finite abstract model.
The novel aspects of this proof are in the treatment of speculation, and of
read/write hazards. We handled speculation by introducing an auxiliary shadow
bit for each instruction in the machine. We then show two key facts about the
system: that unshadowed instructions are never canceled, and that shadowed
instructions never retire. To handle read/write hazards, we use an
abstraction strong enough to prove that the most recent writes to an address
in execution and program order are the same.
Finally, to address the question of scalability, we consider four designs of
increasing complexity: design A is a simple in-order processor, design B adds
Tomasulo's algorithm for out-of-order execution, design C adds speculative
execution and design D adds a load-store buffer. Table 1 shows the textual
size of the proofs we obtained for these four designs. Adding Tomasulo's
algorithm is the simplest step, involving only a few additional case splits
and two non-interference lemmas. Adding speculation and the load-store buffer
is more complex, because of the register and memory ordering properties we
must prove. Nonetheless, we find that the complexity of the interactions
between these features does not make the proof intractable. Rather, the proof
increment associated with adding a feature remains moderate, at least for
this example.
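The increments implied by Table 1 can be made explicit with a few lines of
arithmetic. The Python sketch below only restates the table; it reads "13K"
and "18K" as 13000 and 18000 bytes, which is an approximation since the table
rounds those entries.

```python
# Proof sizes from Table 1, in bytes (13K and 18K taken as round figures).
sizes = [("A", 5700), ("B", 7000), ("C", 13000), ("D", 18000)]

# Increment in proof size contributed by each added feature.
increments = [
    (later[0], later[1] - earlier[1])
    for earlier, later in zip(sizes, sizes[1:])
]

for design, delta in increments:
    print(f"design {design}: +{delta} bytes")
```

Under this reading, out-of-order execution adds about 1300 bytes, speculation
about 6000 bytes and the load-store buffer about 5000 bytes, so no single
feature multiplies the size of the proof.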
⁴ By "proof", we mean all the input used to guide a mechanical prover, and
not a proof in the mathematical sense.
5 Comparison with Other Approaches
We now compare our proof with proofs of similar microarchitecture models
using other methods. We consider proofs by Sawada and Hunt [SH98], Velev and
Bryant [VB00] and Hosabettu et al. [HGS00]. All of these proofs are
variations in some form on the method of Burch and Dill [BD94], in which an
abstraction function is constructed by "flushing" the implementation, i.e.,
inserting null operations until all pending instructions are completed. This
yields a "clean" state which can be compared to the reference model state.
One then proves a commutative diagram, that is, that taking one
implementation step and then applying the abstraction function yields the
same state as applying the abstraction function followed by zero or more
reference model steps. This can be done in an almost fully automated way for
simple pipelines, and has the advantage that the abstraction function is
mechanically constructed.
However, the method has two distinct disadvantages. First, for complex
architectures, the abstraction function is generally not strong enough to be
inductively invariant. It must be manually strengthened with information
about reachability of control states. In our method, no such information is
required. Second, the abstraction function depends on the entire machine
state, including all the instructions that are currently in the machine. For
complex architectures, it becomes intractable to deal with it automatically.
In our method, we reason about only one or two instructions. Thus, the proof
obligations are local, and can be handled by model checking. By contrast,
most recent work using abstraction functions manually decomposes the flushing
function into smaller, more tractable parts. Thus the Burch and Dill method's
advantage of full automation is lost. To see this, we consider the extant
proofs in more detail. A comparison of textual sizes of models and proofs is
given in table 2.
Sawada and Hunt. The work of Sawada and Hunt [SH98] is perhaps the first
formal proof of a "modern" microprocessor architecture. Their processor model
uses Tomasulo's algorithm, branch prediction, precise exceptions and a load
store buffer with forwarding. The model is qualitatively similar to ours,
with a few differences. They model asynchronous interrupts, while we do not.
They use a fixed set of execution units (one per instruction type) while we
do not. Thus, they associate RS's statically with execution units, while we
choose the execution unit at issue time, to maximize use of the execution
units. Also, their load-store buffer holds two loads and one store, while we
model an arbitrary number of entries.
The model is defined by a collection of Common LISP functions in the theorem
prover ACL2 [KM96]. We report in Table 2 the approximate textual size of the
functions describing the processor architecture, excluding theorems and generic
functions not related to processor modeling. This is roughly three times the
textual size of our model in the SMV language. In our estimation, this
difference is largely accounted for by the greater conciseness of the SMV
language as a hardware description language. However, some details present in
the Sawada and Hunt model, such as an explicit instruction decoding function,
are not present in our model, since we model them generically using
uninterpreted functions. Defining these functions explicitly would increase the
description size, but would not affect the proof.
Sawada and Hunt use an intermediate abstraction called a MAETT, a table
tracking the status of all instructions being executed in the machine. They
then relate the MAETT to the implementation and the reference model using
invariants, which are proved by induction. We do not use an intermediate
abstraction, although our auxiliary variables do contain information similar to
that in the MAETT. The chief difficulty reported by Sawada and Hunt is that the
invariant must be strengthened by auxiliary invariants of the implementation
state. No such invariants occur in our proof (although we do need a few lemmas
concerning which events may occur in certain states). This leads to a stark
difference in the textual size of the proofs: their proof (for the FM9801
processor) is roughly 1909K bytes, of which nearly a megabyte is the inductive
invariant. Our proof is less than 20K bytes, smaller than the model description
itself. This difference of two orders of magnitude is more than enough to
account for differences in models, the succinctness of representation,
whitespace, etc. By another measure, the Sawada and Hunt proof has roughly 4000
lemmas, whereas ours has approximately 18 (depending on how one counts).
Velev and Bryant. The approach of Velev and Bryant [VB00] is closely based on
the Burch-Dill technique. They focus on efficiently checking the commutativity
condition for complex microarchitectures by reducing the problem to checking
equivalence of two terms in a logic with equality and uninterpreted function
symbols. Under certain conditions, their decision algorithm is able to check
equivalence of the massive formulas obtained from flushing complex models. Some
manual work is required, however, to put the problem in a form suitable for the
tool. They handle architectures with deep and multiple pipelines,
multiple-issue, multi-cycle execution units, exceptions and branch prediction,
for fixed finite models (note, we treat models with unbounded resources).
Notably, they do not treat out-of-order execution, or load-store buffers. We
conjecture that this is due to the complexity of the flushing functions, and
the need for complex auxiliary invariants in these cases.
Hosabettu et al. Hosabettu et al. have published a series of papers on
microprocessor verification, based on the "completion functions" approach. The
microarchitecture they model in [HGS00] is similar to ours in that it has
out-of-order execution, branch prediction, precise exceptions and it buffers
stores (but not loads, which are atomic). Stores are executed in program order,
while in our model they can be out-of-order. Also, they model a processor
status word, while we do not.
Technique Used                 Proof Assistant   Size of Machine Spec   Size of Proof
Sawada & Hunt [SH98]           ACL2              ~60K bytes             1909K bytes
Hosabettu et al. [HGS00]       PVS               ~70K bytes             ~2300K bytes
Compositional Model Checking   SMV               20K bytes              18K bytes
Table 2. Textual sizes of the models and proofs

Hosabettu et al. prove a commutative diagram, but decompose the abstraction
function into completion functions for each instruction in the machine. A
completion function specifies the future effect of an unfinished instruction on
the observable state. They define completion functions for each instruction
type, in terms of the present status of the instruction in the machine, and
also whether that instruction will squash subsequent instructions, ensuring
they do not affect the program state. The abstraction function is the
composition of the completion functions. A commutative diagram is proved using
PVS [ORSvH95] for the decomposed abstraction function.
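The idea of composing per-instruction completion functions can be sketched as
follows. This is a toy under stated assumptions, not the PVS formalization of
[HGS00]: the InFlight record, its fields, and the add-only instruction set are
hypothetical, and the observable state is reduced to a register file.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class InFlight:
    """Hypothetical record for one unfinished instruction."""
    dst: int
    srcs: Tuple[int, int]
    result: Optional[int] = None  # set once the instruction has executed
    squashes: bool = False        # later instructions must have no effect


def completion(instr: InFlight, regs: Tuple[int, ...]) -> Tuple[int, ...]:
    """Future effect of one unfinished instruction on the register file."""
    if instr.result is not None:  # already executed: only write-back remains
        value = instr.result
    else:                         # not yet executed: compute, then write back
        value = regs[instr.srcs[0]] + regs[instr.srcs[1]]
    out = list(regs)
    out[instr.dst] = value
    return tuple(out)


def abstraction(regs: Tuple[int, ...],
                in_flight: Sequence[InFlight]) -> Tuple[int, ...]:
    """Compose completion functions in program order, honoring squashes."""
    for instr in in_flight:
        regs = completion(instr, regs)
        if instr.squashes:
            break  # squashed instructions do not affect the program state
    return regs
```

For example, with registers (1, 2, 0) and three pending instructions of which
the second squashes its successors, only the first two completion functions
contribute to the abstract state.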
This approach has the advantage of avoiding applying a decision procedure to
the entire flushing function. However, proofs of the commutativity obligations
require auxiliary invariants that characterize the reachable states of the
model. To reason about the composite abstraction function, one must enumerate
manually the various instructions in a particular state, the exact transitions
they might make, the position of the "squashing" instruction, and so on. While
decomposing the abstraction function makes reasoning about each case simpler,
considerable manual effort is still required in stating invariants and guiding
the prover.
The authors report that the proof took much less time than that of Sawada and
Hunt. However, the textual size is comparable. The proof uses approximately
300K bytes of PVS specifications, and 2000K bytes of proof script (manual
prover guidance). The latter, while generated manually, contains considerable
redundancy. Thus its large size may not accurately reflect the effort needed to
create it. We conjecture the large proof size results from the need for
auxiliary invariants, and the theorem prover's greater need for manual guidance
vis-a-vis model checkers.
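The size comparison can be made concrete with a short calculation. The sketch
below uses the proof sizes reported in Table 2 (in K bytes, two of them
approximate); the dictionary and function names are illustrative only.

```python
import math

# Proof sizes in K bytes, as reported in Table 2.
PROOF_KBYTES = {
    "Sawada & Hunt": 1909,
    "Hosabettu et al.": 2300,
    "Compositional Model Checking": 18,
}


def ratio_to_ours(name):
    """How many times larger a proof is than the compositional proof."""
    return PROOF_KBYTES[name] / PROOF_KBYTES["Compositional Model Checking"]


def orders_of_magnitude(name):
    """Base-10 logarithm of the size ratio."""
    return math.log10(ratio_to_ours(name))
```

Both ratios exceed 100, that is, slightly more than two orders of magnitude.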
6 Conclusion
We have shown that compositional model checking methods can verify a processor
microarchitecture with most of the architectural features of a modern
microprocessor. We introduced proof strategies to handle speculative execution
(using shadow bits) and to handle read/write hazards (case splitting on the
most recent writes in program and execution order). The proof methodology
scales well in that the incremental proof cost associated with each processor
feature is low. Moreover, the proof is concise relative to proofs using other
methods (and is smaller than the model description itself). Although proof size
is not necessarily an indication of the human effort required, we consider the
difference of two orders of magnitude to reflect a qualitative difference in
proof complexity. We ascribe this difference to several factors.
First, as reported both by Sawada and Hunt and by Hosabettu et al., one of the
most time consuming aspects of their methods is specifying auxiliary
invariants. We exploit the model checker's ability to compute reachable states
to avoid writing such invariants. Second, by stating refinement relations as
temporal properties we can decompose the proof into "units of work", such as
instructions, that are temporally and spatially distributed but use finite
resources. This avoids reasoning about the entire state of the machine, and
allows us to use small, finite-state abstractions. Finally, we exploit the fact
that model checkers require less manual guidance than theorem provers do.
Nonetheless, there remains much room for improvement. For example, some lemmas
in our proof could be eliminated if the model checker were able to handle three
instructions in the abstraction instead of two. We have found that the symbolic
model checker can handle abstract models with only about half the number of
state bits that can be handled with concrete models. The reason for this is
unclear, though it may be that the abstract state spaces are less sparse, or
that there is greater nondeterminism in the transition relation. This does not
affect the scalability of the proof methodology, but the "constant factor"
would be improved if the model checker could handle larger abstract models.
To handle asynchronous interrupts, it would be useful to implement "prophecy
variables", so that the witness function that stalls the reference model could
depend on the future of the implementation. Also, to implement a specific
instruction set architecture, we must substitute concrete functions for the
uninterpreted functions in our model. Support for this is currently lacking in
the prover, though it would be straightforward to implement.
On the whole, although proofs of this sort are considerably more laborious than
model checking finite state machines, we feel that the methodology scales well,
and that additional processor features, such as a first-level cache, an address
translation unit, or multiple-issue could be handled in a straightforward
manner, with the addition of a few lemmas for each feature.
References
[AP99] T. Arons and A. Pnueli. Verifying Tomasulo's algorithm by refinement. In
12th Int. Conf. on VLSI Design (VLSI'99), pages 306-9. IEEE Comput. Soc., June
1999.
[BD94] J. R. Burch and D. L. Dill. Automated verification of pipelined
microprocessor control. In D. L. Dill, editor, Computer-Aided Verification
(CAV94), LNCS 818, pages 68-80. Springer-Verlag, 1994.
[HGS00] R. Hosabettu, G. Gopalakrishnan, and M. Srivas. Verifying advanced
microarchitectures that support speculation and exceptions. In E. A. Emerson
and A. P. Sistla, editors, Computer-Aided Verification (CAV2000), LNCS 1855,
pages 521-37. Springer-Verlag, 2000.
[KM96] M. Kaufmann and J. S. Moore. ACL2: An industrial strength version of
Nqthm. In Conf. on Computer Assurance (COMPASS-96), pages 23-34. IEEE Comp.
Soc. Press, 1996.
[McM00] K. L. McMillan. A methodology for hardware verification using
compositional model checking. Sci. of Comp. Prog., 37(1-3):279-309, May 2000.
[ORSvH95] S. Owre, J. Rushby, N. Shankar, and F. von Henke. Formal verification
for fault tolerant architectures: Prolegomena to the design of PVS. IEEE Trans.
on Software Eng., 21(2):17-125, Feb 1995.
[SH98] J. Sawada and W. D. Hunt. Processor verification with precise exceptions
and speculative execution. In A. J. Hu and M. Y. Vardi, editors, Computer-Aided
Verification (CAV98), LNCS 1427, pages 135-146. Springer, 1998.
[Tom67] R. M. Tomasulo. An efficient algorithm for exploiting multiple
arithmetic units. IBM J. of Research and Development, 11(1):25-33, Jan. 1967.
[VB00] M. Velev and R. E. Bryant. Formal verification of superscalar
microprocessors with multicycle functional units, exceptions and branch
prediction. In 37th Design Automation Conference (DAC 2000). IEEE, June 2000.
