The Retiming Lemma: A Simple Proof and Applications

Guy Even

Abstract

We present a new proof of the Retiming Lemma, which was first formulated and proved by Leiserson and Saxe [LS81]. Our proof relies on space-time transformations, and shows how retiming can be interpreted in the domain of space-time transformations. Two applications of the Retiming Lemma are given: one for designing circuits and the other for testing circuits.
Keywords: retiming, synchronous circuits, space-time transformations, circuit initialization, circuit testing.
Universität des Saarlandes, FB 14 Informatik, Lehrstuhl Prof. Paul, Bau 36.1, Im Stadtwald, 66123 Saarbrücken, Germany. e-mail:
[email protected]. Research conducted while in the Computer Science Dept., Technion, Haifa 32000, Israel, and supported by the Miriam and Aaron Gutwirth Memorial Fellowship.
1 Introduction

Retiming is a transformation that can be employed to improve the performance of synchronous circuits. It was first defined and treated in a broad and general setting by Leiserson and Saxe [LS81]. They modeled a synchronous circuit as a directed graph with non-negative integer weights on the edges. In this model, computations are performed in the vertices, which model gates (i.e., combinational units), and communication is performed by the edges. The weight of an edge denotes the number of registers (also called unit delay elements) along the wire connecting the vertices. Each register samples its input at the end of each clock cycle, and the sampled value is output during the next clock cycle. Therefore, it takes an edge $k$ clock cycles to deliver a signal if its weight equals $k$.

Retiming only changes the number of registers along edges, and does not change the graph. From the point of view of a single vertex, retiming shifts the vertex in time, causing it to perform the same computations either earlier or later than it originally did. This shift in time is done by moving registers from incoming edges to outgoing edges or vice versa. A vertex that is shifted to perform the computation earlier is said to be advanced, whereas a vertex shifted to perform computations later is said to be lagged.

Leiserson and Saxe utilized retiming for improving the performance of synchronous circuits. In particular, they used retiming for transforming synchronous circuits into systolic circuits. They state the Retiming Lemma, which claims that under "appropriate initializations", retiming does not change the functionality of the circuit. The main difficulty in the proof is to show that every initial state of the original circuit that is "sufficiently old" has a corresponding initial state in the retimed circuit, so that the input/output patterns of both (initialized) circuits are identical.
This correspondence should not depend on the inputs that are fed after the initialization. The proof of Leiserson and Saxe is "not enlightening", and the exact quantification of the extent to which retiming preserves functionality is vague. An exact definition and quantification are given by Even and Litman [EL91b] without proof. Litman [Li92] proves the Retiming Lemma using predicate calculus. Neither proof gives an intuition of the exact characterization of "appropriate initializations". The paper of German and Wang [GW85] and the paper of Brookes [B93] prove the correctness of specific retimings of specific circuits that were given as examples by Leiserson and Saxe to demonstrate the benefits of retiming (in fact, for these specific retimings, all initializations are "appropriate"). The book of Megson [Me92] misrepresents the Retiming Lemma, and mixes issues such as slow-down and algorithms that find retiming functions. The proof presented by Megson completely ignores issues such as functional equivalence and fails to mention that retiming may affect functionality. We believe that the proof of the general case presented in this paper succeeds in giving intuition, and in explaining the extent to which functionality is preserved after retiming.

The proof presented here is based on space-time transformations. Space-time transformations of synchronous circuits are a convenient tool for analyzing the behavior of such circuits [H61]. For example, Vergis and Steiglitz [VS86] utilized space-time transformations for testing bidirectional systolic arrays. The main advantage of a space-time diagram representation of a synchronous circuit is that it is combinational (i.e., it is acyclic and does not contain registers). This representation facilitates the analysis of the communication and the relative delays of signals in the circuit. We believe that analyzing synchronous circuits according to
the methods of this paper can help establish a better understanding of communication issues in synchronous circuits, and in particular provide a model that describes "what goes on" in the Retiming Lemma.

We also describe two applications of the Retiming Lemma. These applications demonstrate the advantages of using non-negative retiming functions (i.e., no vertices are advanced). The first application examines the effect of retiming on the functionality of circuits designed to function properly regardless of their initial state. We call such circuits instantly-initializable. In instantly-initializable circuits, beginnings of computations are marked by a special input (called a reset signal) that is fed by a designated input host, and consequently causes the circuit to function as if it is initialized to a fixed state. We show that applying a non-negative retiming to an instantly-initializable circuit results in an instantly-initializable circuit. Moreover, the original circuit and the retimed one are equivalent with respect to input patterns that "mark" the beginnings of computations. This application was used implicitly by [LS81, EL91a, E92] when retiming was employed for eliminating broadcast.

The second application deals with the effect of non-negative retimings on the detection capabilities of functional test patterns. A functional test pattern is an input pattern that, when given to the circuit, can verify that the circuit was in a correct initial state and that each of the vertices is functioning correctly. We show that non-negative retiming does not make testing harder. In particular, a test pattern for a synchronous circuit is also a test pattern for a non-negative retiming of it, and hence, non-negative retimings preserve testability.

This paper is organized as follows: In Section 2 we review the communication graph model for synchronous circuits. In Section 3 we define space-time transformations and space-time circuits.
In Section 4 we discuss terms related to the functionality of circuits. In Section 5 we define retiming both on synchronous circuits and on space-time circuits. In Section 6 we prove the Retiming Lemma. In Section 7 we describe two applications of the Retiming Lemma.
2 Synchronous circuits

We follow Leiserson and Saxe [LS81] and model synchronous circuits by communication graphs. A communication graph, $Y = \langle V, E, w, F \rangle$, consists of a directed graph $(V, E)$ with non-negative integer weights, $w(e)$, on the edges. The vertices model combinational functional units (e.g., gates). The edges model interconnects between functional units (e.g., wires), and therefore, only point-to-point interconnections are allowed between components in the circuits. The weight of an edge models the number of registers along the corresponding interconnect. Each register has exactly one input and one output; at the end of each clock cycle, a register samples its input, and the sampled value is output during the next clock cycle. Therefore, a signal sent from vertex $u$ along a positive-weight edge $u \to v$ during the $i$th clock cycle arrives at vertex $v$ at the beginning of clock cycle $i + w(u \to v)$.

Vertices having only incoming edges are called output hosts. Vertices having only outgoing edges are called input hosts. The input and output hosts enable communication with the external world. The functionality of the circuit is defined by the mapping $F$, which specifies the functions computed by the vertices that are not hosts. Namely, the
function computed at vertex $u$, the result of which is sent along the edge $u \to v$, is denoted by $F_{u \to v}$.

In a communication graph, every cycle has at least one edge with a positive weight. Without this restriction, values of the signals carried along zero-weight cycles might not be well defined, due to instability problems. Moreover, this condition is sufficient for uniquely defining the computation performed by a synchronous circuit: First, note that removing the edges containing registers partitions the circuit into disjoint acyclic combinational components. The inputs of each such combinational component are given by the registers and by the input hosts. The outputs of each combinational component enter the output hosts or the registers.

The computation in a synchronous circuit is defined as follows: During every clock cycle, the registers output the values they sampled and stored at the end of the previous clock cycle, and the input hosts send values according to the input sequence. Therefore, the inputs of each combinational component are well defined. The outputs of each combinational component are computed according to the input values, and the output values enter either an output host or a register. Values entering registers are sampled at the end of the clock cycle, and stored for output during the next clock cycle.
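The requirement that every cycle contain a positive-weight edge is equivalent to the register-free (zero-weight) edges forming an acyclic subgraph. The following is a minimal sketch of such a check, not part of the original paper. The function name and the representation of the graph as a dictionary from edge pairs to weights are assumptions made for illustration, and parallel edges are not modeled.

```python
from collections import defaultdict, deque


def is_valid_communication_graph(vertices, weights):
    """Check the cycle condition of a communication graph.

    `weights` is a hypothetical representation: a dict mapping an edge
    (u, v) to its non-negative register count w(u -> v).
    """
    if any(w < 0 for w in weights.values()):
        return False

    # Keep only the register-free edges.
    successors = defaultdict(list)
    indegree = {v: 0 for v in vertices}
    for (u, v), w in weights.items():
        if w == 0:
            successors[u].append(v)
            indegree[v] += 1

    # Kahn's algorithm: the zero-weight subgraph is acyclic
    # exactly when every vertex can be removed.
    queue = deque(v for v in vertices if indegree[v] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return removed == len(vertices)
```

When the check succeeds, the removal order produced by the queue is also an order in which the combinational components can be evaluated within a single clock cycle.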
3 Space-time circuits and transformations

Space-time circuits are combinational circuits having an iterative property. These circuits are useful for analyzing synchronous circuits.
Definition 1: A space-time circuit is a 3-tuple $\langle TV, TE, TF \rangle$, where $\langle TV, TE \rangle$ is a directed graph, and $TF$ is a mapping which assigns a function to each edge. A space-time circuit satisfies the following conditions:
C1: The vertex set is a Cartesian product of a set $V$ and the integers; in other words, $TV = V \times \mathbb{Z}$. We refer to the second component of each vertex as the time-component.
C2: (no "backward edges") If $(u, i) \to (v, j) \in TE$ then $i \le j$.

C3: The graph $\langle TV, TE \rangle$ is acyclic.

C4: (repetitive structure and functionality) If $(u, i) \to (v, j) \in TE$ then $(u, i + k) \to (v, j + k) \in TE$ for every $k \in \mathbb{Z}$. Moreover, for every edge $(u, i) \to (v, j) \in TE$ and every integer $k$, the mapping $TF$ assigns identical functions to the edges $(u, i) \to (v, j)$ and $(u, i + k) \to (v, j + k)$.

The input and output hosts of space-time circuits are defined in the same fashion as they are in synchronous circuits.

A space-time transformation maps a synchronous circuit into a space-time circuit by trading time for space. Intuitively, it assigns a copy of each vertex to each clock cycle, and thus facilitates reference to the computation performed by a vertex during the $i$th clock cycle.
Definition 2: The space-time transformation maps a synchronous circuit $Y = \langle V, E, w, F \rangle$ into the space-time circuit $TY = \langle TV, TE, TF \rangle$ defined as follows:

1. $TV \triangleq V \times \mathbb{Z}$
2. $TE \triangleq \{(u, i) \to (v, j) : u \to v \in E,\ w(u \to v) = j - i\}$
3. $TF((u, i) \to (v, j)) \triangleq F(u \to v)$
Note that a space-time circuit contains an infinite number of copies of each vertex of the synchronous circuit. Each copy has the same input ports, output ports and functionality as the vertex of the synchronous circuit. Figure 1 depicts a synchronous circuit and its space-time transformation. A space-time circuit may be transformed back into a synchronous circuit simply by "folding" all vertices with the same first component (i.e., all copies $(v, i)$ of a vertex $v$) into a single vertex. The repetitive structure and functionality of the space-time circuit enable this folding.
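Since the space-time circuit is infinite, any concrete illustration has to restrict it to a finite window of time-components. The following sketch, which is not taken from the paper, enumerates the edges of $TE$ from Definition 2 whose endpoints both lie in a given window. The function name, the window parameters `t_min` and `t_max`, and the dictionary representation of $w$ are assumptions for illustration.

```python
def space_time_edges(weights, t_min, t_max):
    """Edges (u, i) -> (v, j) of the space-time transformation
    with t_min <= i and j <= t_max.

    `weights` maps an edge (u, v) of the synchronous circuit to
    its register count w(u -> v), so that j - i = w(u -> v).
    """
    edges = set()
    for (u, v), w in weights.items():
        # The copy of u at time i feeds the copy of v at time i + w.
        for i in range(t_min, t_max - w + 1):
            edges.add(((u, i), (v, i + w)))
    return edges


# Example: a register-free edge a -> b and an edge b -> c with two registers.
example = space_time_edges({("a", "b"): 0, ("b", "c"): 2}, 0, 3)
```

Every edge produced this way satisfies condition C2 because the weights are non-negative, and shifting the window by any integer yields the same edges shifted in time, which is the repetitive structure of condition C4.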
4 Functionality

Our goal is to show that retiming hardly affects functionality. Therefore, we briefly discuss terms related to functionality such as inputs, outputs, states, initial states and computations, both for synchronous circuits and for space-time circuits.

For the sake of simplicity let us assume that the circuit has only a single input host and a single output host. Moreover, assume that the degrees of the input and output hosts are one. An input pattern to the circuit is a sequence of values $\{In(i)\}_{i=0}^{\infty}$, where $In(i)$ denotes the value input during the $i$th clock cycle. In a synchronous circuit, $In(i)$ is the value sent by the input host at the beginning of the $i$th clock cycle along the edge emanating from it. In a space-time circuit, $In(i)$ is the value sent along the edge emanating from the instance of the input host whose time-component equals $i$. An output pattern is defined similarly by a sequence $\{Out(i)\}_{i=0}^{\infty}$.

The state of a synchronous circuit describes the values stored by the registers. Hence, the state of a circuit at the $i$th clock cycle specifies the values that the registers output at the beginning of the $i$th clock cycle. We define states in space-time circuits using cuts. A cut $Cut_i(TY)$ is the set of edges $(u, j) \to (v, k)$ in $TY$ that satisfy $j < i$ and $k \ge i$. A state in a space-time circuit, $TY$, at the $i$th clock cycle defines the values sent along the edges of the cut $Cut_i(TY)$. It is straightforward to translate a state of a synchronous circuit to a state of its corresponding space-time circuit, and vice versa.

An initial state is a state of the circuit at the beginning of a computation (usually at clock cycle 0). The initial state is imposed on the circuit by an external mechanism regardless of its previous state. Equivalence of states is defined in the following definition.
Figure 1: (A) a synchronous circuit with vertices a, b, c, d (B) a segment of its space-time transformation, showing the copies of each vertex for time-components 0 to 3
Definition 3: Two states, $S_1$ and $S_2$, of a circuit are (functionally) equivalent if, for every input pattern, the same output pattern is output regardless of whether the initial state of the circuit is $S_1$ or $S_2$.
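To illustrate the cut-based notion of state used in this section, the following sketch (an illustration, not the paper's own construction) enumerates the edges of $Cut_i(TY)$ for the space-time transformation of a synchronous circuit. The function name and the dictionary representation of $w$ are assumptions. Under our reading of the definitions, an edge $u \to v$ of weight $w(u \to v)$ contributes exactly $w(u \to v)$ edges to every cut, one per register, which is why translating states between the two models is straightforward.

```python
def cut_edges(weights, i):
    """Edges (u, j) -> (v, k) of the space-time circuit with j < i <= k.

    `weights` maps an edge (u, v) of the synchronous circuit to
    its register count w(u -> v); every space-time edge has k = j + w.
    """
    cut = set()
    for (u, v), w in weights.items():
        # j < i <= j + w holds exactly for j = i - w, ..., i - 1.
        for j in range(i - w, i):
            cut.add(((u, j), (v, j + w)))
    return cut


weights = {("a", "b"): 0, ("b", "c"): 2, ("c", "a"): 1}
cut = cut_edges(weights, 5)
# The cut has as many edges as the circuit has registers.
assert len(cut) == sum(weights.values())
```

Register-free edges never cross a cut, which matches the fact that they carry no stored values.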
Computations in synchronous circuits were described in Section 2. One of the difficulties with this description is that edges have memory and simultaneously carry as many values as there are registers along them. Describing computations in space-time circuits is much easier: a computation is simply a labeling of the edges, in which each edge is labeled with the value sent along it.

This paper focuses on determining which initial states of a synchronous circuit have a corresponding initial state in the retimed circuit, and in this way determining the extent to which the retimed circuit can simulate the original circuit. This requires being able to distinguish between states that are obtainable during the course of a computation and states that are obtainable only at the beginning of computations or shortly after. The following definition, due to Even and Litman [EL91b], quantifies this property.
Definition 4: A state is $m$-old if it is equivalent to a state of a circuit during clock cycle $m$, for some computation of the circuit that started at clock cycle 0.

Note that every state is 0-old, and that every $(m + 1)$-old state is also $m$-old.
5 Retiming

Retiming is a transformation that maps a synchronous circuit into a synchronous circuit having the same topology, but with different weights on the edges (in other words, retiming can only add registers to or delete registers from existing edges). The purpose of this transformation is to improve the performance of the circuit without changing its functionality. Retiming is carried out by assigning each vertex an integral value called a lag. The lag specifies the number of clock cycles by which the vertex is "shifted in the time axis". From the point of view of a single vertex, retiming amounts to performing the same computation either earlier or later. A vertex with a positive lag performs the computations later than it does in the original circuit, and a vertex with a negative lag performs the computations earlier than it originally does.

Retiming was defined by Leiserson and Saxe using the abstraction of a synchronous system as a directed graph with non-negative integer weights on the edges [LS81]. In Subsection 5.1 we review the definition of retiming, and in Subsection 5.2 we define retiming in the domain of space-time circuits.
5.1 Retiming of synchronous circuits

For the sake of completeness we review the definition of retiming on synchronous circuits [LS81] (see Figure 2 for an example of a retimed synchronous circuit). Let $\langle V, E, w \rangle$ be a synchronous circuit. Suppose we want to retime this circuit according to a function $lag : V \to \mathbb{Z}$, where $lag(v)$ denotes the lag of vertex $v$. Intuitively, when a vertex is lagged by $k$, we remove $k$ registers from each edge emanating from it, and add $k$ registers to each edge entering it. This implies that after performing the retiming, an edge $u \to v$ will have $w(u \to v) - lag(u) + lag(v)$ registers along it. Since the number of registers along an edge must be non-negative, we require that the function $lag()$ satisfy the following inequality:

$$\forall\, u \to v : \quad w(u \to v) \ge lag(u) - lag(v) \qquad (1)$$

A function which satisfies inequality (1) is called a retiming function. The weights of the retimed circuit, $\langle V, E, w' \rangle$, resulting from retiming the circuit $\langle V, E, w \rangle$ according to the retiming function $lag()$ are defined by:

$$w'(u \to v) \triangleq w(u \to v) - lag(u) + lag(v)$$

Figure 2: (A) synchronous circuit (B) retimed circuit: lags are written next to each vertex
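As a concrete illustration of the weight update $w'$ and of the requirement that no edge end up with a negative number of registers, here is a minimal sketch. It is not code from the paper: the function name, the dictionary representations of $w$ and $lag$, and the three-vertex example are assumptions chosen for illustration.

```python
def retime(weights, lag):
    """Return w'(u -> v) = w(u -> v) - lag(u) + lag(v) for every edge.

    Raises ValueError if `lag` is not a retiming function, i.e. if
    some edge would receive a negative number of registers.
    """
    new_weights = {}
    for (u, v), w in weights.items():
        w_new = w - lag[u] + lag[v]
        if w_new < 0:
            raise ValueError(f"edge {u} -> {v} would get {w_new} registers")
        new_weights[(u, v)] = w_new
    return new_weights


# Hypothetical cycle a -> b -> c -> a with two registers in total.
weights = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 0}
lag = {"a": 0, "b": 1, "c": 0}   # lag vertex b by one clock cycle
retimed = retime(weights, lag)   # {a->b: 2, b->c: 0, c->a: 0}
```

In this example the register on the edge leaving b moves to the edge entering b. The total number of registers along the cycle stays at two, because the lag terms cancel when the weights are summed around any cycle.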
5.2 Retiming of space-time circuits

Retiming of space-time circuits is an isomorphism of (infinite) graphs which maps a space-time circuit into an isomorphic space-time circuit. We call this isomorphism a retiming because one can retime a synchronous circuit as follows: perform a space-time transformation, retime the space-time circuit, and perform the inverse of the space-time transformation on the retimed space-time circuit. This property is formally stated in Lemma 1. Performing retimings of synchronous circuits via retimings of space-time circuits will enable us to picture the properties of retiming more clearly.

Let $TY = \langle TV, TE \rangle$ be a space-time circuit, and let $lag : V \to \mathbb{Z}$ be a function. The retimed circuit, $TY' = \langle TV, TE' \rangle$, can be visualized as follows: Suppose that the vertices $TV$ are placed on a two-dimensional grid, so that vertices with identical time-components are on the same row, and vertices of the set $\{v\} \times \mathbb{Z}$ are on the same column. Suppose the edges $TE$ are elastic. The retiming function describes how far each column should be "moved" vertically: if $lag(v) = \ell$, then the column $\{v\} \times \mathbb{Z}$ is pushed $\ell$ units downwards (if $\ell < 0$ then it is pulled $|\ell|$ units upwards). The retimed circuit corresponds to the original circuit after experiencing a "geological shear". See Figure 4 for space-time representations of the circuits of Figure 2.
Formally, the edge set of the retimed circuit, $TE'$, is defined as:

$$TE' \triangleq \{(u, i + lag(u)) \to (v, j + lag(v)) : (u, i) \to (v, j) \in TE\}$$

Since we want the circuit obtained by retiming a space-time circuit to be a space-time circuit as well, we require that the function $lag()$ satisfy:

$$\text{for every edge } (u, i) \to (v, j) \in TE : \quad i + lag(u) \le j + lag(v) \qquad (2)$$
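A function which satisfies inequality (2) is called a retiming function of space-time circuits. A retimed space-time circuit does not contain any "backward edges", and is, therefore, a space-time circuit. Note that when the space-time circuit is the space-time transformation of a synchronous circuit, every edge $(u, i) \to (v, j)$ satisfies $j - i = w(u \to v)$, so inequality (2) reads $w(u \to v) \ge lag(u) - lag(v)$, which is exactly inequality (1). The following lemma summarizes the relationship between retiming synchronous circuits, retiming space-time circuits, and space-time transformations. The proof is straightforward.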
Lemma 1: The retiming and the space-time transformations commute. In other words, suppose that the synchronous circuit $\langle V, E, w', F \rangle$ is obtained by retiming the synchronous circuit $\langle V, E, w, F \rangle$ according to the function $lag()$. Suppose that the circuit $\langle TV, TE, TF \rangle$ is the space-time transformation of the synchronous circuit $\langle V, E, w, F \rangle$. Suppose that the circuit $\langle TV, TE', TF \rangle$ is obtained by retiming the circuit $\langle TV, TE, TF \rangle$ according to the function $lag()$. Then the circuit $\langle TV, TE', TF \rangle$ is the space-time transformation of the synchronous circuit $\langle V, E, w', F \rangle$ (see Figure 3).

$$\begin{array}{ccc}
\langle V, E, w, F \rangle & \longrightarrow & \langle TV, TE, TF \rangle \\
\Big\downarrow {\scriptstyle lag()} & & \Big\downarrow {\scriptstyle lag()} \\
\langle V, E, w', F \rangle & \longrightarrow & \langle TV, TE', TF \rangle
\end{array}$$

Figure 3: Retiming and space-time transformations commute
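Lemma 1 can be checked mechanically on a finite window of time-components. The sketch below is a sanity check on one hypothetical example, not a proof and not code from the paper. All function names, the dictionary representations, and the window bounds are assumptions. The original space-time circuit is generated on a window enlarged by the largest absolute lag, so that no retimed edge inside the compared window is lost at the boundary.

```python
def space_time_edges(weights, t_min, t_max):
    """Edges (u, i) -> (v, i + w(u -> v)) with both endpoints in the window."""
    return {((u, i), (v, i + w))
            for (u, v), w in weights.items()
            for i in range(t_min, t_max - w + 1)}


def retime_weights(weights, lag):
    """Retiming of a synchronous circuit: w' = w - lag(u) + lag(v)."""
    return {(u, v): w - lag[u] + lag[v] for (u, v), w in weights.items()}


def retime_space_time(edges, lag):
    """Retiming of a space-time circuit: shift each column by its lag."""
    return {((u, i + lag[u]), (v, j + lag[v])) for (u, i), (v, j) in edges}


def clip(edges, t_min, t_max):
    """Keep the edges whose endpoints both lie in the window."""
    return {e for e in edges if all(t_min <= t <= t_max for (_, t) in e)}


weights = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 0}
lag = {"a": 0, "b": 1, "c": 0}
margin = max(abs(l) for l in lag.values())

# Path 1: space-time transformation first, then retiming.
big = space_time_edges(weights, -10 - margin, 10 + margin)
path1 = clip(retime_space_time(big, lag), -10, 10)

# Path 2: retiming first, then space-time transformation.
path2 = space_time_edges(retime_weights(weights, lag), -10, 10)

assert path1 == path2
```

The two paths correspond to the two ways of traversing the diagram of Figure 3 from its top-left corner to its bottom-right corner.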
Figure 4: (A) original space-time circuit (B) retimed space-time circuit