Discussion:
[OpenRISC] mor1kx: More than two ways in caches
Stefan Wallentowitz
2014-04-14 11:45:46 UTC
Permalink
Hi all,

finally I found time to commit the multiple ways caches. Everybody is
invited to test it from: https://github.com/wallento/mor1kx/commits/master

There is one commit that are simple naming fixes:
https://github.com/wallento/mor1kx/commit/06527d424fa7818d258513936a60beb94cc7a98f

Then there is the LRU unit, which is optimized with respect to memory
utilization:
https://github.com/wallento/mor1kx/commit/5c71606c2cfaa6950758cee097abbb7f697a8da7

Finally, this patch contains the actual updates to the caches:
https://github.com/wallento/mor1kx/commit/d734c4330fef4b0dacaa9d5b428c565383446591

I would be happy if you could review it and give me some bug reports and
other feedback!

Bye,
Stefan
Stefan Kristiansson
2014-04-14 12:41:06 UTC
Permalink
On Mon, Apr 14, 2014 at 2:45 PM, Stefan Wallentowitz
Post by Stefan Wallentowitz
Hi all,
finally I found time to commit the multiple ways caches. Everybody is
invited to test it from: https://github.com/wallento/mor1kx/commits/master
https://github.com/wallento/mor1kx/commit/06527d424fa7818d258513936a60beb94cc7a98f
Then there is the LRU unit, which is optimized with respect to memory
https://github.com/wallento/mor1kx/commit/5c71606c2cfaa6950758cee097abbb7f697a8da7
https://github.com/wallento/mor1kx/commit/d734c4330fef4b0dacaa9d5b428c565383446591
I would be happy if you could review it and give me some bug reports and
other feedback!
Great!
I just browsed through the patches briefly now, and I see the minor
nits I had from the first round of (part of the patches) have been
dealt with.
I'll take a more thorough look later in the evening and comment inline
on github (if there is anything to be commented).
If there are no major issues, do you mind if I pull this into openrisc/mor1kx?

Stefan
Stefan Wallentowitz
2014-04-14 12:47:37 UTC
Permalink
Post by Stefan Kristiansson
Great!
I just browsed through the patches briefly now, and I see the minor
nits I had from the first round of (part of the patches) have been
dealt with.
I'll take a more thorough look later in the evening and comment inline
on github (if there is anything to be commented).
If there are no major issues, do you mind if I pull this into openrisc/mor1kx?
Stefan
No, of course it would be nice if you pull it after review.
I will also finalize the smal change of the multicore core identifier
including or1ksim and or1k-src extensions as soon as possible.
Afterwards I will integrate the snooping cache (still working on a nice
LSU solution), so maybe we can reuse parts of this for the LL/SC
implementation you proposed recently.

Thanks and Bye,
Stefan
Stefan Kristiansson
2014-04-14 19:26:40 UTC
Permalink
On Mon, Apr 14, 2014 at 3:47 PM, Stefan Wallentowitz
Post by Stefan Wallentowitz
No, of course it would be nice if you pull it after review.
I didn't really find anything to complain about, so I've applied the
patches with only a small change to a comment referring to
"OPTIMSOC_I/DCACHE_WAYS".
Post by Stefan Wallentowitz
I will also finalize the smal change of the multicore core identifier
including or1ksim and or1k-src extensions as soon as possible.
Afterwards I will integrate the snooping cache (still working on a nice
LSU solution), so maybe we can reuse parts of this for the LL/SC
implementation you proposed recently.
Yes, I think the snooping unit for cache coherency and atomicity
should be able to share a lot of logic.
I've actually added the LL/SC functionality to mor1kx, but I'm holding
that off until we are sure about the details.
If you're going to poke around the LSU, you might want to base your
work upon it, I don't think the final version will be a lot different.
https://github.com/skristiansson/mor1kx/commit/3f77e56e8e89ac1de2b77c996d0104ccae69b52a

Stefan
Stefan Kristiansson
2014-04-16 06:46:24 UTC
Permalink
On Mon, Apr 14, 2014 at 10:26 PM, Stefan Kristiansson
Post by Stefan Kristiansson
On Mon, Apr 14, 2014 at 3:47 PM, Stefan Wallentowitz
Post by Stefan Wallentowitz
No, of course it would be nice if you pull it after review.
I didn't really find anything to complain about, so I've applied the
patches with only a small change to a comment referring to
"OPTIMSOC_I/DCACHE_WAYS".
After some more in-depth testing, I found one small issue, for which I
have already pushed a fix for:
https://github.com/openrisc/mor1kx/commit/7db7a341f9886851fb18d6c4a1a0b00de7016f24

As a slightly related note - in retro-perspective, I think it was a
mistake to keep all the ways in *one* tag memory.
I did that as a way to more efficiently use the block ram in FPGAs,
but in the end, I think the trade-offs are to costly.
The split-up of the tag memory data when it's read and written that
you did in those commits have already made it easier to make a
transition for that.

Stefan
Stefan Wallentowitz
2014-04-16 07:38:13 UTC
Permalink
Hi,
Post by Stefan Kristiansson
After some more in-depth testing, I found one small issue, for which I
https://github.com/openrisc/mor1kx/commit/7db7a341f9886851fb18d6c4a1a0b00de7016f24
Perfect, thank you!
Post by Stefan Kristiansson
As a slightly related note - in retro-perspective, I think it was a
mistake to keep all the ways in *one* tag memory.
I did that as a way to more efficiently use the block ram in FPGAs,
but in the end, I think the trade-offs are to costly.
The split-up of the tag memory data when it's read and written that
you did in those commits have already made it easier to make a
transition for that.
At least Synplify (did not try other synthesis) still uses Block RAM
rather efficiently, but maybe it makes more sense to split it.

Another thing along those lines: I read that one way to cope with the
issues of VIPT caches is to store tag plus index. Should we consider
this? At the moment we set the cache size to page size to avoid the issues.

Bye,
Stefan
Stefan Kristiansson
2014-04-16 08:14:01 UTC
Permalink
On Wed, Apr 16, 2014 at 10:38 AM, Stefan Wallentowitz
Post by Stefan Wallentowitz
Post by Stefan Kristiansson
As a slightly related note - in retro-perspective, I think it was a
mistake to keep all the ways in *one* tag memory.
I did that as a way to more efficiently use the block ram in FPGAs,
but in the end, I think the trade-offs are to costly.
The split-up of the tag memory data when it's read and written that
you did in those commits have already made it easier to make a
transition for that.
At least Synplify (did not try other synthesis) still uses Block RAM
rather efficiently, but maybe it makes more sense to split it.
Yeah, I wasn't claiming that the Block RAM usage would be suffering,
the trade-off I'm speaking about is that we have to save a copy of the
tag data from *all* the ways (instead of just the way we are
updating).
With two ways as a maximum, we could work-around it by only saving the
"other" way, but with increasing ways, the data to save increases
quite quickly.
It's not a huge issue, just wanted to make clear that I don't have any
strong feelings for keeping the "one tag RAM for all ways".
Post by Stefan Wallentowitz
Another thing along those lines: I read that one way to cope with the
issues of VIPT caches is to store tag plus index. Should we consider
this? At the moment we set the cache size to page size to avoid the issues.
That's cache size/ways to page size, so with e.g. 4 ways you get max
32 KB cache (since we have 8KB pages).
That should be enough for everybody, right? =)
Joking aside, sure, why not?
If we can make it conditional on cache size/ways > page size, I think
it'd be a win-win situation.

Stefan
Stefan Wallentowitz
2014-04-16 08:16:49 UTC
Permalink
Post by Stefan Kristiansson
Yeah, I wasn't claiming that the Block RAM usage would be suffering,
the trade-off I'm speaking about is that we have to save a copy of the
tag data from *all* the ways (instead of just the way we are
updating).
With two ways as a maximum, we could work-around it by only saving the
"other" way, but with increasing ways, the data to save increases
quite quickly.
It's not a huge issue, just wanted to make clear that I don't have any
strong feelings for keeping the "one tag RAM for all ways".
Ah, okay, got it. Sounds reasonable.
Post by Stefan Kristiansson
That's cache size/ways to page size, so with e.g. 4 ways you get max
32 KB cache (since we have 8KB pages).
That should be enough for everybody, right? =)
Joking aside, sure, why not?
If we can make it conditional on cache size/ways > page size, I think
it'd be a win-win situation.
Okay, perfect, I have it on my mid-term TODO list :)

Bye, Stefan

Loading...