Matt Thomas
2014-08-20 17:30:27 UTC
I recently started porting NetBSD to OpenRISC.
I'm using binutils from the top of the tree and GCC 4.9 for my toolchain.
I looked at using llvm-openrisc, but NetBSD's LLVM is 3.6 while llvm-openrisc
is 3.1. Since my toolchain expertise is more GCC-centric, I went that
way.
So I'm wondering which ISA features I can count on. Are the ORBIS32 II
instructions widely implemented? What about floating point?
I was trying to decide whether to support only the no-delay version of the
ISA. I found that PIC code and -mno-delay seem to be incompatible at the
moment.
The problem is that the code that computes the GOT pointer doesn't take
-mno-delay or -mcompat-delay into account. It's always emitted as:
    l.jal 8
    l.movhi r16,gotpchi(_GLOBAL_OFFSET_TABLE_-4)
    l.ori r16,r16,gotpclo(_GLOBAL_OFFSET_TABLE_+0)
    l.add r16,r16,r9
For no-delay, the l.jal should have an argument of 4; otherwise the l.movhi
is never executed because it is branched over. I think for -mno-delay or
-mcompat-delay it should be:
    l.jal 4
    l.movhi r16,gotpchi(_GLOBAL_OFFSET_TABLE_+0)
    l.ori r16,r16,gotpclo(_GLOBAL_OFFSET_TABLE_+4)
    l.add r16,r16,r9
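The addends in both sequences can be checked with a small C model of the
arithmetic. This is only a sketch under two assumptions: that gotpchi() and
gotpclo() yield the high and low halves of (symbol + addend - address of the
instruction carrying the relocation), and that l.jal leaves jal + 8 in r9 when
there is a delay slot and jal + 4 when there is not. The function names and
any addresses passed in are made up for illustration; this is not how GCC or
binutils compute anything internally.

```c
#include <assert.h>
#include <stdint.h>

/* Value of a gotpchi()/gotpclo() expression, assuming the relocation is
   PC-relative to the instruction that carries it (an assumption). */
static uint32_t gotpc(uint32_t got, int32_t addend, uint32_t insn_addr)
{
    return got + (uint32_t)addend - insn_addr;
}

/* r16 after the four-instruction sequence that starts at jal_addr.
   delay_slot != 0: l.jal 8, addends -4 / +0, r9 = jal_addr + 8.
   delay_slot == 0: l.jal 4, addends +0 / +4, r9 = jal_addr + 4. */
uint32_t got_pointer(uint32_t got, uint32_t jal_addr, int delay_slot)
{
    assert((jal_addr & 3u) == 0);   /* instructions are 4-byte aligned */

    uint32_t r9 = jal_addr + (delay_slot ? 8u : 4u);
    int32_t hi_addend = delay_slot ? -4 : 0;
    int32_t lo_addend = delay_slot ? 0 : 4;

    /* l.movhi sits at jal_addr + 4 and supplies the upper 16 bits. */
    uint32_t r16 = gotpc(got, hi_addend, jal_addr + 4) & 0xffff0000u;
    /* l.ori sits at jal_addr + 8 and supplies the lower 16 bits. */
    r16 |= gotpc(got, lo_addend, jal_addr + 8) & 0x0000ffffu;
    /* l.add r16,r16,r9 */
    return r16 + r9;
}
```

Under those assumptions both variants return the GOT address itself, because
each pair of addends makes the l.movhi and l.ori halves describe the same
distance from the value that l.jal leaves in r9.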
I notice that r16 is being used as the GOT pointer and r10 as the thread
pointer, though they aren't documented as such in the OpenRISC 1.1
Architecture manual.
I was surprised to see that patterns for ffssi2, ctzsi2, and clzsi2 aren't
present in GCC, given the l.ff1 and l.fl1 instructions.
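To show why those patterns look like a natural fit, here is a C model of the
two instructions and of the three operations built on them. It assumes l.ff1
and l.fl1 behave as the architecture manual describes them: each returns the
1-based position of the lowest (l.ff1) or highest (l.fl1) set bit, and 0 when
no bit is set. The names or1k_ff1, or1k_fl1 and model_* are hypothetical and
are not taken from GCC's machine description.

```c
#include <assert.h>
#include <stdint.h>

/* Model of l.ff1: 1-based position of the lowest set bit, 0 if x == 0. */
unsigned or1k_ff1(uint32_t x)
{
    unsigned pos;
    for (pos = 1; pos <= 32; pos++, x >>= 1)
        if (x & 1u)
            return pos;
    return 0;
}

/* Model of l.fl1: 1-based position of the highest set bit, 0 if x == 0. */
unsigned or1k_fl1(uint32_t x)
{
    unsigned pos = 0;
    while (x != 0) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* ffssi2: same convention as l.ff1, so no adjustment is needed. */
unsigned model_ffs(uint32_t x)
{
    return or1k_ff1(x);
}

/* ctzsi2: l.ff1 minus 1; treated as undefined for x == 0 in this sketch. */
unsigned model_ctz(uint32_t x)
{
    assert(x != 0);
    return or1k_ff1(x) - 1;
}

/* clzsi2: 32 minus l.fl1; treated as undefined for x == 0 in this sketch. */
unsigned model_clz(uint32_t x)
{
    assert(x != 0);
    return 32 - or1k_fl1(x);
}
```

In this model ffs is a single l.ff1, while ctz and clz each need one extra
arithmetic instruction after l.ff1 or l.fl1.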
Looking at the code GCC emits, I see:
    l.addi r1,r1,16
    l.lwz r9,-4(r1) # SI load
    l.lwz r1,-16(r1) # SI load
The load of r1 after the l.addi serves no useful purpose.
One nice thing I have noticed is that it is rather easy to convert
PowerPC assembly to OpenRISC.