Discussion:
[OpenRISC] Pipelined FPU
BAndViG
2014-10-25 12:21:21 UTC
The pipelined behaviour of FPU-100 has been restored by a fresh commit to
https://github.com/openrisc/mor1kx/tree/withfpu .

Actually, it isn't completely pipelined, because it doesn't implement:
- pipelined division,
- intermediate registers for PC and destination identifier,
- etc.

In fact, in the cappuccino pipeline environment the FPU operates in
non-pipelined mode (the pipeline stalls until the FPU raises its ready flag).
So this is just a starting point for further development of both the FPU
itself and a new, more efficient pipeline (perhaps similar to BA25).

Unlike the non-pipelined version, the pipelined variant includes a two-stage
multiplier for the fractional parts. The 24x24-bit multiplier is split into
four 13x13 multipliers (1st stage) and an adder (2nd stage). That allows the
module to be synthesized using the built-in FPGA DSP cells. Multiplication
now takes 6 clocks (the original FPU100 takes 12 or 35 clocks for the
parallel or serial implementation, respectively).
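
To illustrate the arithmetic behind the split, here is a minimal C model. It is only a sketch, not the RTL from the pfpu32 folder. The function name mul24_split is made up for this example, and it assumes each 24-bit fraction is cut into two 12-bit halves (which fit into 13x13 multipliers). The actual module may partition the operands differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 24x24-bit multiply built from four narrow
 * multiplies (stage 1) and one adder (stage 2).
 * Assumption: operands are split into 12-bit halves. */
uint64_t mul24_split(uint32_t a, uint32_t b)
{
    /* Both operands must be 24-bit fractions. */
    assert(a < (1u << 24) && b < (1u << 24));

    uint32_t a_lo = a & 0xFFFu, a_hi = a >> 12;
    uint32_t b_lo = b & 0xFFFu, b_hi = b >> 12;

    /* Stage 1: four independent partial products, each below 2^24. */
    uint32_t p_ll = a_lo * b_lo;
    uint32_t p_lh = a_lo * b_hi;
    uint32_t p_hl = a_hi * b_lo;
    uint32_t p_hh = a_hi * b_hi;

    /* Stage 2: align the partial products and add them. */
    return ((uint64_t)p_hh << 24)
         + (((uint64_t)p_lh + p_hl) << 12)
         + p_ll;
}
```

The four products do not depend on each other, so in hardware they can all be computed in the same clock, with the adder in the next one.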

Additionally, instead of using a counter as the operation-complete flag, the
ready signal is propagated directly through the pipeline. This approach
removes the extra delays present in the non-pipelined variant of the FPU
(a legacy of the OpenRISC-1200 design).
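
The idea of a ready flag travelling along with the data can be modelled in C roughly as below. This is a behavioural sketch only. The names ready_pipe_t, ready_pipe_clock and PIPE_DEPTH are invented for the example, and the depth of 6 is just an assumption borrowed from the 6-clock multiply mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 6 stages, taken from the 6-clock multiply in the post. */
#define PIPE_DEPTH 6

/* One valid bit per pipeline stage; bit 0 is the first stage. */
typedef struct { uint32_t valid; } ready_pipe_t;

/* Model one clock edge: shift the valid bits one stage forward, insert
 * 'start' into the first stage, and return the bit now sitting in the
 * last stage (the ready flag). */
int ready_pipe_clock(ready_pipe_t *p, int start)
{
    assert(p != 0);
    p->valid = ((p->valid << 1) | (start ? 1u : 0u))
             & ((1u << PIPE_DEPTH) - 1u);
    return (int)((p->valid >> (PIPE_DEPTH - 1)) & 1u);
}
```

In this model ready goes high on the PIPE_DEPTH-th clock, counting the clock that accepts start, and back-to-back operations each get their own ready pulse without any counter.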

Intermediate benchmarking versus the previous variant.

The previous variant (let me repeat):

case #2: -mhard-float, fpu32_v1.0:
Single Precision C/C++ Whetstone Benchmark

Loop content              MFLOPS      MOPS   Seconds

N1 floating point          2.400               0.008
N2 floating point          2.240               0.060
N3 if then else                      3.450     0.030
N4 fixed point                       3.938     0.080
N5 sin,cos etc.                      0.019     4.300
N6 floating point          1.199               0.450
N7 assignments                       1.680     0.110
N8 exp,sqrt etc.                     0.009     4.300

MWIPS                      1.071               9.338


The new one:

Single Precision C/C++ Whetstone Benchmark

Loop content              MFLOPS      MOPS   Seconds

N1 floating point          4.800               0.004
N2 floating point          3.360               0.040
N3 if then else                      3.450     0.030
N4 fixed point                       4.500     0.070
N5 sin,cos etc.                      0.019     4.300
N6 floating point          1.635               0.330
N7 assignments                       1.680     0.110
N8 exp,sqrt etc.                     0.009     4.300

MWIPS                      1.089               9.184
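
For a quick comparison of the two runs, the per-loop gain is simply the new rate divided by the old one. A tiny helper for that (the name speedup is made up for this example) could look like this:

```c
#include <assert.h>

/* Ratio of new rate to old rate (MFLOPS, MOPS or MWIPS).
 * A value above 1.0 means the new variant is faster. */
double speedup(double old_rate, double new_rate)
{
    assert(old_rate > 0.0);
    return new_rate / old_rate;
}
```

With the figures above, speedup(2.400, 4.800) gives 2.0 for N1, N2 gives about 1.5, N6 about 1.36, and MWIPS only about 1.017. The overall gain is small because N5 and N8, which did not change, account for 8.6 of the 9.184 seconds.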


To activate the pipelined FPU:
- add the following lines to the parameter list of the mor1kx unit instance:
  .FEATURE_FPU("ENABLED")
  .FEATURE_PIPELINED_FPU("ENABLED") // makes sense only if FEATURE_FPU==ENABLED
- add all files from the "pfpu32" folder to the project

Andrey
