Architecture rtl of arith_div
The basic approach is radix 2 (RAPOW=1). Specifying a higher radix power unrolls the corresponding number of elementary radix-2 steps into one clock cycle, which cuts the computational latency accordingly. In return, the amount of combinational logic increases, and the critical path will eventually be affected.
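The latency effect of RAPOW can be illustrated with a small behavioral model in Python. This is only a sketch and not the RTL itself: the function name `divide_unrolled`, its parameters, and the rounding of the cycle count up to the next whole cycle are assumptions made for illustration.

```python
def divide_unrolled(a, d, a_bits, rapow=1):
    """Software model of restoring division that performs `rapow`
    elementary radix-2 steps per modeled clock cycle.

    Returns (quotient, remainder, cycles). All names are hypothetical;
    the cycle count ceil(a_bits / rapow) is an assumption.
    """
    if d == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    cycles = -(-a_bits // rapow)      # ceiling division
    width = cycles * rapow            # dividend zero-extended to whole cycles
    rem, quo = 0, 0
    for cycle in range(cycles):
        # These rapow steps would form one combinational chain in hardware.
        for k in range(rapow):
            bit_index = width - 1 - (cycle * rapow + k)
            rem = (rem << 1) | ((a >> bit_index) & 1)  # shift in next dividend bit
            quo <<= 1
            if rem >= d:              # test the residue against the divisor
                rem -= d
                quo |= 1
    return quo, rem, cycles


if __name__ == "__main__":
    for rapow in (1, 2, 3):
        print(rapow, divide_unrolled(200, 7, 8, rapow))
```

In this model an 8-bit dividend takes 8 cycles with `rapow=1`, 4 with `rapow=2`, and 3 with `rapow=3`, while the inner loop (the per-cycle logic depth) grows with `rapow`.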
A PIPELINED instantiation unrolls all necessary iteration steps in space so that every stage uses its own set of registers. The pipeline can accept a new pair of inputs in every clock cycle.
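The pipelined behavior can be sketched with a clock-by-clock model in Python. It assumes radix 2, one stage per dividend bit, and a divisor copy carried along in every stage; `step` and `simulate_pipeline` are hypothetical names, and the real design may organize its stage registers differently.

```python
def step(ar, dr, a_bits):
    """One elementary radix-2 step on the combined state register."""
    ar <<= 1                          # move the next dividend bit into the prefix
    if (ar >> a_bits) >= dr:          # prefix P tested against the divisor
        ar -= dr << a_bits            # P' = P - D
        ar |= 1                       # quotient digit 1 enters on the right
    return ar


def simulate_pipeline(pairs, a_bits):
    """Feed one (dividend, divisor) pair per clock cycle.

    Returns a list of (cycle, quotient, remainder) in completion order.
    Divisors are assumed to be non-zero.
    """
    mask = (1 << a_bits) - 1
    regs = [None] * a_bits            # one register set per stage
    pending = list(pairs)
    results = []
    cycle = 0
    while pending or any(r is not None for r in regs):
        cycle += 1
        new = pending.pop(0) if pending else None
        inputs = [new] + regs[:-1]    # what each stage sees before the clock edge
        regs = [None if x is None else (step(x[0], x[1], a_bits), x[1])
                for x in inputs]
        if regs[-1] is not None:      # a finished operation leaves the last stage
            ar, _ = regs[-1]
            results.append((cycle, ar & mask, ar >> a_bits))
    return results


if __name__ == "__main__":
    for row in simulate_pipeline([(13, 3), (7, 2), (15, 4)], a_bits=4):
        print(row)
```

With three back-to-back inputs and a 4-bit dividend, the model delivers results in cycles 4, 5, and 6: the latency is unchanged, but one result completes per cycle once the pipeline is filled.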
Internally, the divider uses two state registers:
DR: the divisor, which remains constant throughout one operation, and
AR: the actual computation state, which is iteratively transformed from the initial dividend to the quotient.
The active computation is performed on the left end of AR, which represents the relevant prefix of the current residue to be tested against the divisor. The remaining residue is shifted step-by-step into this region. The freed space on the right is immediately re-used for the generated quotient digits.
The layout transformation of AR (either over time or in space) is as follows:
Initial state:       | 00 ... 00 | A |
Step:                the prefix P on the left end is tested against the divisor (P-D ?)
Intermediate state:  | P' | << A << | Q |
Final state:         | R | Q |
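The layout transformation can be traced with a short Python sketch. It assumes an AR width of `a_bits + d_bits`, radix 2, and a divisor with `0 < d < 2**d_bits`; the function name `ar_states` and the three-field text format (prefix, remaining dividend bits, quotient digits) are illustrative choices and are not taken from the design.

```python
def ar_states(a, d, a_bits, d_bits):
    """Return the AR contents before the first and after every radix-2 step.

    Each state is rendered as 'prefix|remaining dividend|quotient digits'.
    Assumes 0 < d < 2**d_bits and 0 <= a < 2**a_bits.
    """
    width = a_bits + d_bits
    ar = a                                    # upper d_bits start out as zeros
    states = []
    for done in range(a_bits + 1):
        bits = format(ar, f"0{width}b")
        states.append(
            f"{bits[:d_bits]}|{bits[d_bits:width - done]}|{bits[width - done:]}"
        )
        if done < a_bits:
            ar <<= 1                          # shift residue into the active region
            if (ar >> a_bits) >= d:           # prefix tested against the divisor
                ar -= d << a_bits             # subtract only when the test succeeds
                ar |= 1                       # freed rightmost bit takes the digit
    return states


if __name__ == "__main__":
    for state in ar_states(13, 3, a_bits=4, d_bits=2):
        print(state)
```

For 13 / 3 the trace ends in `01||0100`: the prefix holds the remainder 1 and the former dividend field has been entirely replaced by the quotient 4.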
| Name | Description |
|---|---|
| residue | |
| divisor | |
| residue_vector | |
| divisor_vector | |