mail archive of the barebox mailing list
From: Ahmad Fatoum <a.fatoum@pengutronix.de>
To: Lucas Stach <l.stach@pengutronix.de>,
	Fabian Pflug <f.pflug@pengutronix.de>,
	barebox@lists.infradead.org
Cc: rouven.czerwinski@linaro.org
Subject: Re: [PATCH 2/2] ARM: optee-early: invalidate caches before jump to OP-TEE
Date: Tue, 3 Jun 2025 12:18:56 +0200
Message-ID: <6ce5c98c-4f8e-46f6-8dd3-7c911578feb8@pengutronix.de>
In-Reply-To: <569963942cf35755dfdf34b240c350986fda4727.camel@pengutronix.de>

Hello Lucas,

On 6/3/25 11:57, Lucas Stach wrote:
> Hi Fabian,
> 
> On Tuesday, 03.06.2025 at 11:20 +0200, Fabian Pflug wrote:
>> The optee-early code was initially added for i.MX6UL. Trying to naively
>> enable it on an i.MX6Q board was observed to cause spurious hangs on
>> return from OP-TEE to barebox.
>>
>> The root cause seems to be inadequate cache handling by OP-TEE: OP-TEE
>> enables the MMU and caches with it, but didn't take care to invalidate
>> all cache lines before enabling the MMU, which triggered the
>> aforementioned hangs.
>>
>> To paper over this issue, let's just invalidate the cache lines on the
>> barebox side instead before jumping to OP-TEE. This issue likely did not
>> affect the original i.MX6UL, because its Cortex-A7 has an architected L2
>> cache that's guaranteed zeroed (no dirty cache lines) on power-on reset,
>> unlike the i.MX6Q's Cortex-A9, where the external L2 cache powers on
>> with unpredictable content including the dirty bits.
>>
> The explanation here doesn't make too much sense to me. I don't think
> the outer L2 cache is even enabled at this point, but even if it were
> arm_early_mmu_cache_invalidate() only handles architected caches, so it
> wouldn't affect the PL310 on the i.MX6Q/DL.

You're right. I recalled issues that bit us in the past on the
Cortex-A9, but not on the A7 and took a wrong turn trying to rationalize
this change with a spotty recollection.

> The real issue with the Cortex A9 caches is that the tags aren't
> cleared on power-up, so some sets/ways may end up in "valid" state if
> not explicitly invalidated.

I see, thanks for the clarification. So the issue is with our handling
of the L1 cache instead.

> Thus any write to memory may get stuck in
> the cache, even if caching is disabled, as this knob only turns off 
> allocation in the cache, but doesn't prevent updates of such bogus
> valid lines.

Ok, so if CR_C is unset, the cache is still used when reading/writing,
provided that the cache line is valid.

> If you then proceed to invalidate the cache, you may
> discard data that has not yet reached DRAM. So IMO this fix here seems
> risky, as it assumes that there have been no writes to memory that are
> worth keeping before calling start_optee_early(). While this might be
> the case in the current implementation, this assumption is quite non-
> obvious to someone just looking at the individual functions.

Agreed. If the issue is with the valid and not the dirty bit,
invalidation at this location is incorrect.

> The stuck writes are also why OP-TEE is unable to handle this itself:
> any cache invalidation there would risk discarding writes from software
> running before OP-TEE. So the only way to handle this properly is to
> invalidate the caches before issuing any writes.

This makes me wonder though about the regular case without any OP-TEE as
we are already doing arm_early_mmu_cache_invalidate() inside
__barebox_arm_entry:

 - Low-level init code writes something to the handoff object or to the
   scratch area

 - The freshly written data ends up in the (L1) cache, as the tag was valid

 - arm_early_mmu_cache_invalidate() discards these writes

 - The uncompressor or barebox proper ends up with corrupted data

We don't have many objects that are accessed both before and after
arm_early_mmu_cache_invalidate, so maybe that's why we didn't run into
more problems?
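
To make the sequence above concrete, here is a toy model in C. It is only
a sketch under loud assumptions: one cache line per memory word, no dirty
tracking, no write-back, and all names (toy_system, toy_write, ...) are
made up for illustration. It does not touch real cache maintenance
operations and is not barebox code. It only models the behaviour Lucas
describes: the cache enable bit controls allocation, while a line that is
already marked valid still absorbs writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MEM_WORDS 8

/* Toy model (hypothetical): one cache line per memory word. */
struct toy_system {
	uint32_t dram[MEM_WORDS];
	uint32_t line[MEM_WORDS];
	bool valid[MEM_WORDS];	/* tag state, may be bogus after power-up */
	bool cache_enabled;	/* models the C bit: only gates allocation */
};

static void toy_write(struct toy_system *s, unsigned int addr, uint32_t val)
{
	assert(addr < MEM_WORDS);

	if (s->valid[addr]) {
		/* Hit on a (possibly bogus) valid line: data stays in the cache */
		s->line[addr] = val;
		return;
	}

	if (s->cache_enabled) {
		/* Miss with caching on: allocate a line (simplified) */
		s->valid[addr] = true;
		s->line[addr] = val;
		return;
	}

	/* Miss with caching off: the write goes straight to DRAM */
	s->dram[addr] = val;
}

static uint32_t toy_read(const struct toy_system *s, unsigned int addr)
{
	assert(addr < MEM_WORDS);
	return s->valid[addr] ? s->line[addr] : s->dram[addr];
}

/* Invalidate without clean: every line is dropped, contents are lost */
static void toy_invalidate_all(struct toy_system *s)
{
	memset(s->valid, 0, sizeof(s->valid));
}

/*
 * Write with caching disabled, invalidate, then read back.
 * bogus_valid selects whether the line was left "valid" at power-up.
 */
static uint32_t toy_write_then_invalidate(bool bogus_valid)
{
	struct toy_system s = { 0 };

	s.valid[3] = bogus_valid;
	toy_write(&s, 3, 0xcafe);
	toy_invalidate_all(&s);

	return toy_read(&s, 3);
}
```

In this model, toy_write_then_invalidate(false) reads back the written
value, while toy_write_then_invalidate(true) reads back the stale DRAM
content, which corresponds to the corrupted handoff data in the last step
of the list.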

> I guess it would be much better to simply have the
> arm_early_mmu_cache_invalidate() as part of the Cortex A9 lowlevel CPU
> initialization at the very start of the PBL entry.

Unfortunately, we don't have a dedicated Cortex-A9 lowlevel entry
function, only ones for specific processors, e.g.
imx6_cpu_lowlevel_init().

We could add CONFIG_CPU_CORTEX_A9, select it from the relevant SoC
options and depending on it, add the invalidation to
arm_cpu_lowlevel_init()? What do you think?

Thanks,
Ahmad


> 
> Regards,
> Lucas
> 
>> This means on e.g. the i.MX6UL, we will now do one extra cache invalidation
>> that's not needed. This should be negligible and we already had an
>> unconditional invalidation in __barebox_arm_entry.
>>
>> Note that this is a different implementation than what we do on ARM64:
>> there, we load TF-A, which in turn jumps to OP-TEE. Assuming
>> non-architected caches or caches with uninitialized content on power-on
>> to be a dying breed, our ARM64 implementation is likely not affected.
>>
>> Co-authored-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
>> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
>> Signed-off-by: Fabian Pflug <f.pflug@pengutronix.de>
>> ---
>>  arch/arm/lib32/optee-early.c | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/arch/arm/lib32/optee-early.c b/arch/arm/lib32/optee-early.c
>> index 0cda0ab163..b1dba67d42 100644
>> --- a/arch/arm/lib32/optee-early.c
>> +++ b/arch/arm/lib32/optee-early.c
>> @@ -35,6 +35,19 @@ int start_optee_early(void *fdt, void *tee)
>>  	/* We use setjmp/longjmp here because OP-TEE clobbers most registers */
>>  	ret = setjmp(tee_buf);
>>  	if (ret == 0) {
>> +		/*
>> +		 * At least OP-TEE v4.1.0 seems to not invalidate all dirty cache
>> +		 * lines before enabling the MMU. This can lead to spurious hangs
>> +		 * on return to barebox on systems where there might be left-over
>> +		 * dirty cache lines, whether from BootROM or because L2 cache
>> +		 * is non-architected and powers on with unpredictable content
>> +		 * like is the case with PL310 on i.MX6Q.
>> +		 *
>> +		 * Let's invalidate the caches here, so board entry points need
>> +		 * not bother.
>> +		 */
>> +		arm_early_mmu_cache_invalidate();
>> +
>>  		tee_start(0, 0, fdt);
>>  		longjmp(tee_buf, 1);
>>  	}
> 
> 

-- 
Pengutronix e.K.                  |                             |
Steuerwalder Str. 21              | http://www.pengutronix.de/  |
31137 Hildesheim, Germany         | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686  | Fax:   +49-5121-206917-5555 |




