mail archive of the barebox mailing list
 help / color / mirror / Atom feed
From: Lucas Stach <l.stach@pengutronix.de>
To: Ahmad Fatoum <a.fatoum@pengutronix.de>
Cc: barebox@lists.infradead.org
Subject: Re: [PATCH 3/5] ARM: cache-armv7: start invalidation from outer levels
Date: Thu, 25 Apr 2019 11:57:16 +0200	[thread overview]
Message-ID: <1556186236.2584.7.camel@pengutronix.de> (raw)
In-Reply-To: <bf60018a-fcf5-20cb-1650-35aabe7d3658@pengutronix.de>

Am Dienstag, den 23.04.2019, 19:21 +0200 schrieb Ahmad Fatoum:
> On 4/23/19 7:18 PM, Ahmad Fatoum wrote:
> > The ARM Cortex-A Series Programmer's Guide notes[1]:
> > > Some care is required with cache maintenance activity in multi-core
> 
> Unfortunately, the only error scenario I could find in the ARM docs
> was this one here, which presumes a multi-core system.
> 
> Lucas, do you have a better example in context of barebox on how things
> could go awry without this patch?

In the Barebox case we don't care about SMP, but about the HW
prefetcher, which is an independent master and isn't bound by the
instructions executed on the core.

So the sequence that could go wrong in Barebox is as follows:
1. CPU core starts invalidating at L1 cache level
2. HW prefetcher decides that a specific address should be brought into
the L1 cache.
3. HW prefetcher finds a valid block for the requested address in L2
cache and moves cached data from L2 to L1.
4. Only now CPU core invalidates L2 cache.

In the above sequence we now have invalid data in the L1 cache line.
The correct sequence will avoid this issue:

1. CPU core starts invalidating at L2 cache level
2. HW prefetcher decides that a specific address should be brought into
the L1 cache.
3. HW prefetcher sees invalid tags for the requested address in L2
cache and brings in the data from external memory.
4. CPU core invalidates L1 cache, discarding the prefetched data.

With the correct sequence we never end up with invalid data in valid
cache lines, we just waste some external memory bandwidth on the
discarded prefetch.

Regards,
Lucas

> 
> > > systems that include a L2C-310 L2 cache (or similar). Cleaning or
> > > invalidating the L1 cache and L2 cache will not be a single atomic
> > > operation. A core might therefore perform cache maintenance on
> > > a particular address in both L1 and L2 caches only as two discrete
> > > steps. If another core were to access the affected address between
> > > those two actions, a coherency problem can occur. Such problems can
> > > be avoided by following two simple rules.
> > > 
> > > * When cleaning, always clean the innermost (L1) cache first and then
> > >   clean the outer cache(s).
> > > * When invalidating, always invalidate the outermost cache first and
> > >   the L1 cache last.
> > 
> > The current code correctly iterates from inner to outer cache levels
> > when flushing/cleaning (r8 == 0), invalidation (r8 == 1) occurs in the
> > same direction though. Adjust the invalidation iteration order to start
> > from the outermost cache instead.
> > 
> > Equivalent C-Code:
> > 
> > 	enum cache_op { CACHE_FLUSH = 0, CACHE_INVAL = 1 };
> > 	register enum cache_op operation asm("r8");
> > 	register int i asm("r12");
> > 	register int limit asm("r3") = max_cache_level << 1; // e.g. 4 with L2 max
> > 
> > 	+if (operation == CACHE_FLUSH) {
> > > > 	 	i = 0;
> > 	+} else {
> > > > 	+	i = limit - 2;
> > 	+}
> > 
> > 	 bool loop_again;
> > 	 do {
> > > > 		/* [snip] */
> > > > 	+	if (operation == CACHE_FLUSH) {
> > > > 			i += 2;
> > > > 			loop_again = limit > i;
> > > > 	+	} else {
> > > > 	+		loop_again = i > 0;
> > > > 	+		i -= 2;
> > > > 	+	}
> > 	} while (loop_again);
> > 
> > [1]: 18.6 "TLB and cache maintenance broadcast", Version 4.0
> > 
> > > > Suggested-by: Lucas Stach <l.stach@pengutronix.de>
> > > > Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
> > ---
> >  arch/arm/cpu/cache-armv7.S | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm/cpu/cache-armv7.S b/arch/arm/cpu/cache-armv7.S
> > index 0eb0ecfee756..b2d70bece7b9 100644
> > --- a/arch/arm/cpu/cache-armv7.S
> > +++ b/arch/arm/cpu/cache-armv7.S
> > @@ -83,7 +83,10 @@ hierarchical:
> > > > > > > >  		ands	r3, r0, #0x7000000	@ extract loc from clidr
> > > > > > > >  		mov	r3, r3, lsr #23		@ left align loc bit field
> > > > > > > >  		beq	finished		@ if loc is 0, then no need to clean
> > > > > > > > -		mov	r12, #0			@ start clean at cache level 0
> > > > > > +		cmp	r8, #0
> > > > > > > > +THUMB(		ite	eq			)
> > > > > > +		moveq	r12, #0
> > > > > > > > +		subne	r12, r3, #2		@ start invalidate at outmost cache level
> >  loop1:
> > > > > > > >  		add	r2, r12, r12, lsr #1	@ work out 3x current cache level
> > > > > > > >  		mov	r1, r0, lsr r2		@ extract cache type bits from clidr
> > > > > > > > @@ -118,8 +121,15 @@ THUMB(		ite	eq			)
> > > > > > > >  		subs	r7, r7, #1		@ decrement the index
> > > > > >  		bge	loop2
> >  skip:
> > > > > > +		cmp	r8, #0
> > > > > > +		bne	inval_check
> > > > > > > >  		add	r12, r12, #2		@ increment cache number
> > > > > >  		cmp	r3, r12
> > > > > > +		b	loop_end_check
> > +inval_check:
> > > > > > +		cmp	r12, #0
> > > > > > > > +		sub	r12, r12, #2		@ decrement cache number
> > +loop_end_check:
> >  #ifdef CONFIG_ARM_ERRATA_814220
> > > >  		dsb
> >  #endif
> > 
> 
> 

_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox

  reply	other threads:[~2019-04-25  9:57 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-23 17:18 [PATCH 0/5] ARM: mmu: misc armv7 cache/MMU fixes Ahmad Fatoum
2019-04-23 17:18 ` [PATCH 1/5] ARM: cache-armv7: add work-around for errata 814220 Ahmad Fatoum
2019-04-23 17:39   ` Sam Ravnborg
2019-04-25 10:38     ` Ahmad Fatoum
2019-04-25 10:02   ` Lucas Stach
2019-04-23 17:18 ` [PATCH 2/5] ARM: imx: work around i.MX6UL ERR008958 (ARM errata 814220) Ahmad Fatoum
2019-04-23 17:41   ` Sam Ravnborg
2019-04-25 10:39     ` Ahmad Fatoum
2019-04-23 17:18 ` [PATCH 3/5] ARM: cache-armv7: start invalidation from outer levels Ahmad Fatoum
2019-04-23 17:21   ` Ahmad Fatoum
2019-04-25  9:57     ` Lucas Stach [this message]
2019-04-23 17:18 ` [PATCH 4/5] ARM: mmu: remove doubly defined macro Ahmad Fatoum
2019-04-23 17:18 ` [PATCH 5/5] ARM: mmu: mark uncached regions as eXecute never on v7 Ahmad Fatoum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1556186236.2584.7.camel@pengutronix.de \
    --to=l.stach@pengutronix.de \
    --cc=a.fatoum@pengutronix.de \
    --cc=barebox@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox