[OpenRISC] mor1kx multicore
Stefan Wallentowitz
2014-04-28 12:08:03 UTC
Dear all,

Finally, here is a big blob with some first multicore changes, plus some
massive changes in newlib for better usability. Summary:

* mor1kx changes (https://github.com/wallento/mor1kx/tree/multicore)
  * adds SPR_COREID and SPR_NUMCORES
  * cherry-pick Stefan's l.lwa/l.swa
  * adds snoop port to abort atomic on write to same address
  * adds trace port (mor1kx_monitor does not work for two cores), this
    can then be used for synthesizable trace processing
  * adds ISR0 and ISR1 as "shadow register" alternative for calculations
    in the prolog of exceptions

* or1k-src general changes (https://github.com/wallento/or1k-src)
  * reentrant libc as we may want to use libc in exception handling
  * different stacks for exception and software due to virtual memory
  * Extend or1k support with restore and some service functions
  * malloc_lock and malloc_unlock to avoid interrupts during malloc

* multicore stuff (https://github.com/wallento/or1k-src/tree/multicore)
  * build two libgloss versions: libor1k.a and libor1k_mc.a, crt0{_mc}.o
  * use SPR_COREID and SPR_NUMCORES in or1k_coreid() and or1k_numcores()
  * cherry-pick Stefan's l.lwa/l.swa
  * Basic synchronization functions in or1k-support
  * Reentrant or1k-support for multiple cores
  * libc Reentrancy for multiple cores
  * malloc_lock uses l.lwa/l.swa

* or1k-gcc (https://github.com/wallento/or1k-gcc/tree/multicore)
  * Add -mmulticore to switch between libor1k.a/crt0.o and
    libor1k_mc.a/crt0_mc.o
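
To illustrate the "malloc_lock uses l.lwa/l.swa" item, here is a minimal
sketch of a spinlock keyed by core id. It is not the code from or1k-src: the
demo_* names are hypothetical, and a C11 compare-exchange stands in for the
l.lwa/l.swa pair (load the lock word, store only if nobody wrote it in
between). The lock is deliberately non-recursive to keep the sketch short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = free, otherwise (core id + 1) of the owner. */
static atomic_uint demo_malloc_lock_word = 0;

/* Try once to take the lock for `core`. The compare-exchange plays the role
 * of an l.lwa/l.swa pair: the store only happens if the word is still 0. */
bool demo_malloc_trylock(unsigned core)
{
    unsigned expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &demo_malloc_lock_word, &expected, core + 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Spin until the lock is ours. */
void demo_malloc_lock(unsigned core)
{
    while (!demo_malloc_trylock(core)) {
        /* busy-wait; a real implementation might back off here */
    }
}

/* Release the lock; only the owning core may call this. */
void demo_malloc_unlock(unsigned core)
{
    assert(atomic_load_explicit(&demo_malloc_lock_word,
                                memory_order_relaxed) == core + 1);
    atomic_store_explicit(&demo_malloc_lock_word, 0, memory_order_release);
}
```

In this sketch the value passed as `core` would come from something like
or1k_coreid(); storing the owner in the lock word makes misuse easy to catch
in the unlock path.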

You can run a demo (currently in Modelsim only) using fusesoc:

* Modified orpsoc-cores at
https://github.com/wallento/orpsoc-cores/tree/multicore-demo
* (for 64-bit Modelsim: fusesoc at https://github.com/wallento/fusesoc)

There is a very small example at: http://pastebin.com/1dyUnygK

If you want to try it out and have ~20 minutes, try this walkthrough:
http://pastebin.com/uDsh5DJE

I am looking forward to all discussion and input.

Bye,
Stefan
Stefan Kristiansson
2014-04-29 06:02:27 UTC
On Mon, Apr 28, 2014 at 3:08 PM, Stefan Wallentowitz
Post by Stefan Wallentowitz
Dear all,
finally, a big blob with some first multicore changes plus some massive
Wow, this is way cool!
And massive bonus points for using bleeding edge "OpenRISC community
technology", like fusesoc/orpsoc-cores etc.
I will definitely set aside some time to play with this.

As for the mor1kx changes, I think some of them can be picked into
openrisc/mor1kx without much discussion, so I'd like to do that ASAP.
I have added some small comments on the bullet points below.
Post by Stefan Wallentowitz
* mor1kx changes (https://github.com/wallento/mor1kx/tree/multicore)
* adds SPR_COREID and SPR_NUMCORES
This can go in as is, but we'll need to add this to the architecture
specification.
Do you think you could lay out the text for this so it can easily be
copy-pasted into the arch spec?
Post by Stefan Wallentowitz
* cherry-pick Stefan's l.lwa/l.swa
No comment ;)
Post by Stefan Wallentowitz
* adds snoop port to abort atomic on write to same address
No comment here either; we can apply this as is, it just needs a
better commit message.
Post by Stefan Wallentowitz
* adds trace port (mor1kx_monitor does not work for two cores), this
can then be used for synthesizable trace processing
This is nice, I've been meaning to do something like this for a while.
I'll take a closer look at it and apply it if I don't find any issues with it.
Post by Stefan Wallentowitz
* adds ISR0 and ISR1 as "shadow register" alternative for calculations
in the prolog of exceptions
I'll leave this one out.
*But*, I've given this some more thought since our last conversation about this.
Short of properly implementing the "fast context switch" stuff (which is
perhaps a bit overkill), I think that adding "SPR scratch regs", the way
you are using ISR0 and ISR1 here, is the best of the options you gave.
If we go down that path, we'll need to properly document those
in the architecture specification.
I'd really like others to chip in on this discussion before moving
forward with that, though.

Stefan
Olof Kindgren
2014-04-29 06:16:30 UTC
On Tue, Apr 29, 2014 at 8:02 AM, Stefan Kristiansson
Post by Stefan Kristiansson
On Mon, Apr 28, 2014 at 3:08 PM, Stefan Wallentowitz
Post by Stefan Wallentowitz
Dear all,
finally, a big blob with some first multicore changes plus some massive
Wow, this is way cool!
And massive bonus points for using bleeding edge "OpenRISC community
technology", like fusesoc/orpsoc-cores etc.
I will definitely set aside some time to play with this.
I totally agree. It's so encouraging to see everyone's hard work being used
and elevated into a cool multicore platform. Great work! I'll do something about
the Modelsim issue, and maybe we could even get it running in Icarus as well?

//Olof
Stefan Wallentowitz
2014-04-29 07:07:57 UTC
Post by Stefan Kristiansson
As for the mor1kx changes, I think some of them can be picked into
openrisc/mor1kx without much discussion, so I'd like to do that ASAP.
I have added some small comments on the bullet points below.
Some of this is still experimental. In particular, I think you should not
pull the snoop changes until they also include cache coherency (which I
am integrating).
Post by Stefan Kristiansson
Post by Stefan Wallentowitz
* mor1kx changes (https://github.com/wallento/mor1kx/tree/multicore)
* adds SPR_COREID and SPR_NUMCORES
This can go in as is, but we'll need to add this to the architecture
specification.
Do you think you could lay out the text for this so it can easily be
copy-pasted into the arch spec?
Yes, there is a proposal at the bottom of the wiki page:
http://opencores.org/or1k/Architecture_Specification. I think
describing the SPRs in the table is sufficient.
Post by Stefan Kristiansson
Post by Stefan Wallentowitz
* adds snoop port to abort atomic on write to same address
No comment here either; we can apply this as is, it just needs a
better commit message.
Yes, I will extend this when I also have the cache coherency in there.
Post by Stefan Kristiansson
Post by Stefan Wallentowitz
* adds trace port (mor1kx_monitor does not work for two cores), this
can then be used for synthesizable trace processing
This is nice, I've been meaning to do something like this for a while.
I'll take a closer look at it and apply it if I don't find any issues with it.
At the moment it only captures the PC, the instruction and a potential
write-back. In OpTiMSoC we use it to extract l.nop K and R3 combinations
for software instrumentation/tracing, plus program counter traces. I have
already named it an execution trace, since I think it is rather simple to
add other trace ports, such as a memory trace.
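
As a rough illustration of the l.nop K / R3 extraction described above, the
sketch below walks execution-trace records, remembers the last value written
back to r3, and reports an event whenever an l.nop K retires. The record
layout and all demo_* names are assumptions made for this example, not the
actual mor1kx trace port or the OpTiMSoC code. The only encoding detail relied
on is that l.nop has 0x15 in its top eight bits and K in its low 16 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical trace record: PC, instruction word, optional write-back. */
struct demo_trace_rec {
    uint32_t pc;
    uint32_t insn;
    bool     wb_valid;  /* true if this instruction wrote a register */
    uint8_t  wb_reg;    /* destination register number */
    uint32_t wb_data;   /* value written back */
};

/* Tracer state: the most recent value written to r3. */
struct demo_tracer {
    uint32_t r3;
};

/* Returns true and stores K if insn is an l.nop K. */
bool demo_is_lnop(uint32_t insn, uint16_t *k)
{
    if ((insn >> 24) != 0x15u)
        return false;
    *k = (uint16_t)(insn & 0xFFFFu);
    return true;
}

/* Feed one record. Returns true with (K, r3) filled in when the record is
 * an l.nop K; otherwise only updates the tracked r3 value. */
bool demo_trace_step(struct demo_tracer *t, const struct demo_trace_rec *rec,
                     uint16_t *k, uint32_t *r3)
{
    if (rec->wb_valid && rec->wb_reg == 3)
        t->r3 = rec->wb_data;
    if (!demo_is_lnop(rec->insn, k))
        return false;
    *r3 = t->r3;
    return true;
}
```

Tracking r3 from the write-back field is one way to pair each l.nop K with
its argument using only the three captured items; a hardware version would
do the same with a small register next to the trace port.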
Post by Stefan Kristiansson
Post by Stefan Wallentowitz
* adds ISR0 and ISR1 as "shadow register" alternative for calculations
in the prolog of exceptions
I'll leave this one out.
*But*, I've given this some more thought since our last conversation about this.
Short of properly implementing the "fast context switch" stuff (which is
perhaps a bit overkill), I think that adding "SPR scratch regs", the way
you are using ISR0 and ISR1 here, is the best of the options you gave.
If we go down that path, we'll need to properly document those
in the architecture specification.
I'd really like others to chip in on this discussion before moving
forward with that, though.
Yes, definitely. I just needed this and wanted to demonstrate the
advantages and simplicity of this approach ;) We should keep the discussion
alive, as the multicore version will rely on this and or1k-src cannot be
cleanly merged until this is clarified.

I made vast changes to newlib and libgloss. Do you or anybody else
(Jeremy, Julius?) have opinions or comments on the reentrancy for
exceptions and the separate exception stack? We, for example, build our
lean runtime system directly on this libc, as I have always loved the
flexibility in exception handling that or1k-support provides. But I think
that, independent of the need for reentrancy in SMP multicore, using a
separate reentrancy structure for exceptions (at least for printf)
increases that flexibility further.
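
To make the idea of a separate reentrancy structure concrete, here is a
minimal sketch that keeps one library-state block per core for normal code
and another per core for exception context. newlib holds this kind of state
in struct _reent; the demo_reent type and the selection function below are
hypothetical stand-ins, not the or1k-src implementation.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_CORES 4u

/* Stand-in for newlib's struct _reent: just an errno-like field here. */
struct demo_reent {
    int err;
};

/* One state block per core for normal code, one per core for exceptions. */
static struct demo_reent demo_reent_normal[DEMO_MAX_CORES];
static struct demo_reent demo_reent_exception[DEMO_MAX_CORES];

/* Pick the state block for the given core and context.
 * Returns NULL if the core id is out of range. */
struct demo_reent *demo_select_reent(unsigned core, bool in_exception)
{
    if (core >= DEMO_MAX_CORES)
        return NULL;
    return in_exception ? &demo_reent_exception[core]
                        : &demo_reent_normal[core];
}
```

With this split, a printf called from an exception handler touches only the
exception block, so it cannot corrupt the state of a printf that was
interrupted on the same core.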

Bye,
Stefan
Christian Svensson
2014-04-29 08:03:00 UTC
Way cool!

Let's make this a thing! Multi-core OpenRISC is just what Debian for
OpenRISC needs ;)

Regards,
Christian
