mail archive of the barebox mailing list
* [PATCH] mtd: nand_mxs: fix NAND error when change clk rate
@ 2016-12-21 21:38 Christian Hemp
  2016-12-21 22:29 ` Sam Ravnborg
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Christian Hemp @ 2016-12-21 21:38 UTC
  To: barebox

The function "mxs_nand_enable_edo_mode" changes the NAND clk rate without gating
the clock first. In this case it is possible to get the following errors:
    MXS NAND: Error sending command
    MXS NAND: Error sending command
    MXS NAND: DMA read error

This can be fixed if the NAND clk is disabled before we change the clk
rate.

Tested with:
nand: NAND device: Manufacturer ID: 0x2c, Chip ID: 0xdc (Micron
MT29F4G08ABADAWP), 512MiB, page size: 2048, OOB size: 64

Signed-off-by: Christian Hemp <c.hemp@phytec.de>
---
 drivers/mtd/nand/nand_mxs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/mtd/nand/nand_mxs.c b/drivers/mtd/nand/nand_mxs.c
index cba0bee..ce79bca 100644
--- a/drivers/mtd/nand/nand_mxs.c
+++ b/drivers/mtd/nand/nand_mxs.c
@@ -2047,7 +2047,9 @@ static int mxs_nand_enable_edo_mode(struct mxs_nand_info *info)
 	nand->select_chip(mtd, -1);
 
 	/* [3] set the main IO clock, 100MHz for mode 5, 80MHz for mode 4. */
+	clk_disable(info->clk);
 	clk_set_rate(info->clk, (mode == 5) ? 100000000 : 80000000);
+	clk_enable(info->clk);
 
 	dev_dbg(info->dev, "using asynchronous EDO mode %d\n", mode);
 
-- 
1.9.1


_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox


* Re: [PATCH] mtd: nand_mxs: fix NAND error when change clk rate
  2016-12-21 21:38 [PATCH] mtd: nand_mxs: fix NAND error when change clk rate Christian Hemp
@ 2016-12-21 22:29 ` Sam Ravnborg
  2016-12-22  8:50   ` Christian Hemp
  2016-12-23 10:39 ` Sam Ravnborg
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Sam Ravnborg @ 2016-12-21 22:29 UTC
  To: Christian Hemp; +Cc: barebox

Hi Christian.

On Wed, Dec 21, 2016 at 10:38:41PM +0100, Christian Hemp wrote:
> The function "mxs_nand_enable_edo_mode" changes the NAND clk rate without gating
> the clock first. In this case it is possible to get the following errors:
>     MXS NAND: Error sending command
>     MXS NAND: Error sending command
>     MXS NAND: DMA read error
> 
> This can be fixed if the NAND clk is disabled before we change the clk
> rate.
Very interesting.
I have ~50 targets and a few of these are used for power cycle tests.
On one of these targets I have seen this exact same pattern - once...

Can you give any hints that make it simpler to verify your fix?


This is a board with i.MX6 SoloCore - where we have 2 GiB FLASH + 512 MB RAM.

Anyway - it saved my day to see this landing on the mailing list!

	Sam


* Re: [PATCH] mtd: nand_mxs: fix NAND error when change clk rate
  2016-12-21 22:29 ` Sam Ravnborg
@ 2016-12-22  8:50   ` Christian Hemp
  2016-12-22 17:55     ` Sam Ravnborg
  0 siblings, 1 reply; 9+ messages in thread
From: Christian Hemp @ 2016-12-22  8:50 UTC
  To: Sam Ravnborg; +Cc: barebox

Hello Sam,


On 21.12.2016 23:29, Sam Ravnborg wrote:
> Hi Christian.
>
> On Wed, Dec 21, 2016 at 10:38:41PM +0100, Christian Hemp wrote:
>> The function "mxs_nand_enable_edo_mode" changes the NAND clk rate without gating
>> the clock first. In this case it is possible to get the following errors:
>>      MXS NAND: Error sending command
>>      MXS NAND: Error sending command
>>      MXS NAND: DMA read error
>>
>> This can be fixed if the NAND clk is disabled before we change the clk
>> rate.
> Very interesting.
> I have ~50 targets and a few of these are used for power cycle tests.
> On one of these targets I have seen this exact same pattern - once...
>
> Can you give any hints that make it simpler to verify your fix?
To reproduce the issue and test the fix I reset the board in a loop.
For this I added 'reset' to /env/bin/init. With 'reset' in /env/bin/init I
saw the issue after 1 - 2 minutes.
I have also done 1000 power cuts and resets without any error in our
test rack.

power cut:
1. start barebox to prompt
2. power cut

soft reset:
1. start barebox to prompt
2. reset

     Christian

>
>
> This is a board with i.MX6 SoloCore - where we have 2 GiB FLASH + 512 MB RAM.
>
> Anyway - it saved my day to see this landing on the mailing list!
>
> 	Sam



* Re: [PATCH] mtd: nand_mxs: fix NAND error when change clk rate
  2016-12-22  8:50   ` Christian Hemp
@ 2016-12-22 17:55     ` Sam Ravnborg
  0 siblings, 0 replies; 9+ messages in thread
From: Sam Ravnborg @ 2016-12-22 17:55 UTC
  To: Christian Hemp; +Cc: barebox

Hi Christian,

thanks for the quick reply.

> >Can you give any hints that make it simpler to verify your fix?
> To reproduce the issue and test the fix I reset the board in a loop.
> For this I added 'reset' to /env/bin/init. With 'reset' in /env/bin/init I
> saw the issue after 1 - 2 minutes.
> I have also done 1000 power cuts and resets without any error in our
> test rack.
> 
> power cut:
> 1. start barebox to prompt
> 2. power cut
> 
> soft reset:
> 1. start barebox to prompt
> 2. reset

We have deployed an updated barebox in our test setup that does
all our ON/OFF testing.
A quick grep of the serial logs shows that we have seen this
error message much more often than I originally thought.
So I expect to have feedback as early as tomorrow.

Based on your information I am quite optimistic.

	Sam


* Re: [PATCH] mtd: nand_mxs: fix NAND error when change clk rate
  2016-12-21 21:38 [PATCH] mtd: nand_mxs: fix NAND error when change clk rate Christian Hemp
  2016-12-21 22:29 ` Sam Ravnborg
@ 2016-12-23 10:39 ` Sam Ravnborg
  2016-12-27 17:07 ` Fabio Estevam
  2017-01-09 10:21 ` Sascha Hauer
  3 siblings, 0 replies; 9+ messages in thread
From: Sam Ravnborg @ 2016-12-23 10:39 UTC
  To: Christian Hemp; +Cc: barebox

On Wed, Dec 21, 2016 at 10:38:41PM +0100, Christian Hemp wrote:
> The function "mxs_nand_enable_edo_mode" changes the NAND clk rate without gating
> the clock first. In this case it is possible to get the following errors:
>     MXS NAND: Error sending command
>     MXS NAND: Error sending command
>     MXS NAND: DMA read error
> 
> This can be fixed if the NAND clk is disabled before we change the clk
> rate.
> 
> Tested with:
> nand: NAND device: Manufacturer ID: 0x2c, Chip ID: 0xdc (Micron
> MT29F4G08ABADAWP), 512MiB, page size: 2048, OOB size: 64
> 
> Signed-off-by: Christian Hemp <c.hemp@phytec.de>
Tested-by: Sam Ravnborg <sam@ravnborg.org>

We had it running overnight - not a single DMA error.
Tested with 16 different targets in an ON/OFF test.
I do not have numbers for the ON/OFF cycles but it is more than 10 each.
Previously we saw these errors frequently.
Last night there were none.

Christian - if we ever meet in person I owe you a beer!

	Sam


* Re: [PATCH] mtd: nand_mxs: fix NAND error when change clk rate
  2016-12-21 21:38 [PATCH] mtd: nand_mxs: fix NAND error when change clk rate Christian Hemp
  2016-12-21 22:29 ` Sam Ravnborg
  2016-12-23 10:39 ` Sam Ravnborg
@ 2016-12-27 17:07 ` Fabio Estevam
  2017-01-09 10:21 ` Sascha Hauer
  3 siblings, 0 replies; 9+ messages in thread
From: Fabio Estevam @ 2016-12-27 17:07 UTC
  To: Christian Hemp; +Cc: barebox

Hi Christian,

On Wed, Dec 21, 2016 at 7:38 PM, Christian Hemp <c.hemp@phytec.de> wrote:
> The function "mxs_nand_enable_edo_mode" changes the NAND clk rate without gating
> the clock first. In this case it is possible to get the following errors:
>     MXS NAND: Error sending command
>     MXS NAND: Error sending command
>     MXS NAND: DMA read error
>
> This can be fixed if the NAND clk is disabled before we change the clk
> rate.
>
> Tested with:
> nand: NAND device: Manufacturer ID: 0x2c, Chip ID: 0xdc (Micron
> MT29F4G08ABADAWP), 512MiB, page size: 2048, OOB size: 64
>
> Signed-off-by: Christian Hemp <c.hemp@phytec.de>
> ---
>  drivers/mtd/nand/nand_mxs.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/mtd/nand/nand_mxs.c b/drivers/mtd/nand/nand_mxs.c
> index cba0bee..ce79bca 100644
> --- a/drivers/mtd/nand/nand_mxs.c
> +++ b/drivers/mtd/nand/nand_mxs.c
> @@ -2047,7 +2047,9 @@ static int mxs_nand_enable_edo_mode(struct mxs_nand_info *info)
>         nand->select_chip(mtd, -1);
>
>         /* [3] set the main IO clock, 100MHz for mode 5, 80MHz for mode 4. */
> +       clk_disable(info->clk);
>         clk_set_rate(info->clk, (mode == 5) ? 100000000 : 80000000);
> +       clk_enable(info->clk);

Yes, this is needed to fix erratum ERR007117.
(http://cache.nxp.com/assets/documents/data/en/errata/IMX6DQCE.pdf)

I will prepare the same fix for the kernel, thanks.


* Re: [PATCH] mtd: nand_mxs: fix NAND error when change clk rate
  2016-12-21 21:38 [PATCH] mtd: nand_mxs: fix NAND error when change clk rate Christian Hemp
                   ` (2 preceding siblings ...)
  2016-12-27 17:07 ` Fabio Estevam
@ 2017-01-09 10:21 ` Sascha Hauer
  2017-08-21 16:26   ` Uwe Kleine-König
  3 siblings, 1 reply; 9+ messages in thread
From: Sascha Hauer @ 2017-01-09 10:21 UTC
  To: Christian Hemp; +Cc: barebox

On Wed, Dec 21, 2016 at 10:38:41PM +0100, Christian Hemp wrote:
> The function "mxs_nand_enable_edo_mode" changes the NAND clk rate without gating
> the clock first. In this case it is possible to get the following errors:
>     MXS NAND: Error sending command
>     MXS NAND: Error sending command
>     MXS NAND: DMA read error
> 
> This can be fixed if the NAND clk is disabled before we change the clk
> rate.
> 
> Tested with:
> nand: NAND device: Manufacturer ID: 0x2c, Chip ID: 0xdc (Micron
> MT29F4G08ABADAWP), 512MiB, page size: 2048, OOB size: 64
> 
> Signed-off-by: Christian Hemp <c.hemp@phytec.de>
> ---
>  drivers/mtd/nand/nand_mxs.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/mtd/nand/nand_mxs.c b/drivers/mtd/nand/nand_mxs.c
> index cba0bee..ce79bca 100644
> --- a/drivers/mtd/nand/nand_mxs.c
> +++ b/drivers/mtd/nand/nand_mxs.c
> @@ -2047,7 +2047,9 @@ static int mxs_nand_enable_edo_mode(struct mxs_nand_info *info)
>  	nand->select_chip(mtd, -1);
>  
>  	/* [3] set the main IO clock, 100MHz for mode 5, 80MHz for mode 4. */
> +	clk_disable(info->clk);
>  	clk_set_rate(info->clk, (mode == 5) ? 100000000 : 80000000);
> +	clk_enable(info->clk);

Calling clk_disable doesn't guarantee that the clock is actually
disabled. If there's another user of the same clock then clk_disable
will only decrease the usage counter.
I think if possible we should fix this in the clock driver instead.
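
The usage-counter behaviour described above can be illustrated with a
minimal sketch in C. This is a simplified, hypothetical model (the names
demo_clk, demo_clk_enable and demo_clk_disable are made up for
illustration) and not the barebox clk implementation:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified clock model; not the barebox clk API. */
struct demo_clk {
	unsigned int enable_count;	/* number of users that enabled the clock */
	bool gate_open;			/* true while the hardware gate passes the clock */
};

static void demo_clk_enable(struct demo_clk *clk)
{
	/* Only the first user actually opens the gate. */
	if (clk->enable_count++ == 0)
		clk->gate_open = true;
}

static void demo_clk_disable(struct demo_clk *clk)
{
	assert(clk->enable_count > 0);

	/* Only the last user actually closes the gate. */
	if (--clk->enable_count == 0)
		clk->gate_open = false;
}
```

With two users, a single demo_clk_disable() call only drops the count from
2 to 1 and the gate stays open, so a rate change that follows would still
happen on a running clock.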

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: [PATCH] mtd: nand_mxs: fix NAND error when change clk rate
  2017-01-09 10:21 ` Sascha Hauer
@ 2017-08-21 16:26   ` Uwe Kleine-König
  2017-08-21 16:31     ` Fabio Estevam
  0 siblings, 1 reply; 9+ messages in thread
From: Uwe Kleine-König @ 2017-08-21 16:26 UTC
  To: Sascha Hauer; +Cc: barebox

Hello,

On Mon, Jan 09, 2017 at 11:21:33AM +0100, Sascha Hauer wrote:
> On Wed, Dec 21, 2016 at 10:38:41PM +0100, Christian Hemp wrote:
> > > The function "mxs_nand_enable_edo_mode" changes the NAND clk rate without gating
> > > the clock first. In this case it is possible to get the following errors:
> >     MXS NAND: Error sending command
> >     MXS NAND: Error sending command
> >     MXS NAND: DMA read error
> > 
> > This can be fixed if the NAND clk is disabled before we change the clk
> > rate.

BTW, this is even documented in the reference manual, at least partially.
The description of CCM_CS2CDR has, for example:

	20-18 enfc_clk_pred
	  Divider for enfc clock pred divider.
	  NOTE: Divider should be updated when output clock is gated.

I patched the imx clk driver to not allow changes to any clock that
has this note. The nand_mxs driver then changes enfc_clk_podf, which
doesn't have this note, and still hangs occasionally.
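
For reference, the bit range quoted from the manual (20-18) can be handled
with a small sketch like the following. The macro and function names are
hypothetical, the helpers only operate on a register value passed in by the
caller, and nothing is assumed here about how the field value maps to an
actual divider:

```c
#include <assert.h>
#include <stdint.h>

/* Field position taken from the manual excerpt: bits 20-18 of CCM_CS2CDR. */
#define ENFC_CLK_PRED_SHIFT	18
#define ENFC_CLK_PRED_MASK	(UINT32_C(0x7) << ENFC_CLK_PRED_SHIFT)

/* Extract the 3-bit enfc_clk_pred field from a CS2CDR register value. */
static uint32_t enfc_clk_pred_get(uint32_t cs2cdr)
{
	return (cs2cdr & ENFC_CLK_PRED_MASK) >> ENFC_CLK_PRED_SHIFT;
}

/* Return a new register value with the field replaced, other bits kept. */
static uint32_t enfc_clk_pred_set(uint32_t cs2cdr, uint32_t field)
{
	assert(field <= 0x7);
	return (cs2cdr & ~ENFC_CLK_PRED_MASK) | (field << ENFC_CLK_PRED_SHIFT);
}
```

Per the note in the manual, writing the updated value back would have to
happen while the output clock is gated.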

> > Tested with:
> > nand: NAND device: Manufacturer ID: 0x2c, Chip ID: 0xdc (Micron
> > MT29F4G08ABADAWP), 512MiB, page size: 2048, OOB size: 64
> > 
> > Signed-off-by: Christian Hemp <c.hemp@phytec.de>
> > ---
> >  drivers/mtd/nand/nand_mxs.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/mtd/nand/nand_mxs.c b/drivers/mtd/nand/nand_mxs.c
> > index cba0bee..ce79bca 100644
> > --- a/drivers/mtd/nand/nand_mxs.c
> > +++ b/drivers/mtd/nand/nand_mxs.c
> > @@ -2047,7 +2047,9 @@ static int mxs_nand_enable_edo_mode(struct mxs_nand_info *info)
> >  	nand->select_chip(mtd, -1);
> >  
> >  	/* [3] set the main IO clock, 100MHz for mode 5, 80MHz for mode 4. */
> > +	clk_disable(info->clk);
> >  	clk_set_rate(info->clk, (mode == 5) ? 100000000 : 80000000);
> > +	clk_enable(info->clk);
> 
> Calling clk_disable doesn't guarantee that the clock is actually
> disabled. If there's another user of the same clock then clk_disable
> will only decrease the usage counter.
> I think if possible we should fix this in the clock driver instead.

I agree, this is something the clock driver should be aware of, all the
more so as not only the nand clk is affected.

I wonder how this should be done though. This would imply that a clk can
somehow determine its children. And this is not static as for example
clko could be a child of enfc_clk_pred. And I think we need to recurse,
because a direct child might be a mux that cannot be disabled, so the
gates below the mux must be disabled instead.  Alternatively we must
provide a list L of clks to such a clk A such that if A is changed the
clks in L can be disabled first (maybe after a check that the affected
clk is really a descendant of A).

And we'd need a function different from clk_disable that hard disables the
clk independent of its enable count. (Or the change function of A knows
about the internals of all clocks in L and can just disable them.) In
Linux this might be possible with clk notifiers, but I don't know for
sure.
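
The "list L" alternative could look roughly like the sketch below. It is a
hypothetical model in plain C (demo_gate, demo_divider and the functions
are invented names, not barebox or Linux clk API), it assumes the list is
supplied statically, and it covers neither the recursion through muxes nor
the check that a listed clk is really a descendant:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not the barebox or Linux clk API. */
struct demo_gate {
	unsigned int enable_count;	/* software users; untouched by the hard gate */
	bool gate_open;			/* state of the hardware gate */
};

struct demo_divider {
	unsigned long parent_rate;
	unsigned int div;
	struct demo_gate **gates;	/* the list "L": gates to close during a change */
	size_t num_gates;
};

/* Change the divider with all listed gates closed; returns 0 or -1. */
static int demo_divider_set_div(struct demo_divider *d, unsigned int div)
{
	size_t i;

	if (div == 0)
		return -1;

	/* Hard-gate every listed clock, regardless of its enable count. */
	for (i = 0; i < d->num_gates; i++)
		d->gates[i]->gate_open = false;

	/* The divider is only touched while its outputs are gated. */
	d->div = div;

	/* Reopen only the gates that still have users. */
	for (i = 0; i < d->num_gates; i++)
		d->gates[i]->gate_open = d->gates[i]->enable_count > 0;

	return 0;
}

static unsigned long demo_divider_rate(const struct demo_divider *d)
{
	assert(d->div > 0);
	return d->parent_rate / d->div;
}
```

The enable counts are deliberately left alone, so other users of a gate
never notice the temporary hard disable apart from the clock pausing.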

I haven't found an idea yet that makes this all look less evil, so
comments are welcome.

@Fabio, in a different branch of this thread you wrote that you want to fix
this for Linux. Did you already address this?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |


* Re: [PATCH] mtd: nand_mxs: fix NAND error when change clk rate
  2017-08-21 16:26   ` Uwe Kleine-König
@ 2017-08-21 16:31     ` Fabio Estevam
  0 siblings, 0 replies; 9+ messages in thread
From: Fabio Estevam @ 2017-08-21 16:31 UTC
  To: Uwe Kleine-König; +Cc: barebox

Hi Uwe,

On Mon, Aug 21, 2017 at 1:26 PM, Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:

> @Fabio, in a different branch of this thread you wrote that you want to fix
> this for Linux. Did you already address this?

Not really.

Initially I was thinking of fixing it in the GPMI driver, but it seems
that the right location for the fix would be drivers/clk/imx. I never
had a chance to implement it properly, though.

Regards,

Fabio Estevam


end of thread, other threads:[~2017-08-21 16:32 UTC | newest]

Thread overview: 9+ messages
2016-12-21 21:38 [PATCH] mtd: nand_mxs: fix NAND error when change clk rate Christian Hemp
2016-12-21 22:29 ` Sam Ravnborg
2016-12-22  8:50   ` Christian Hemp
2016-12-22 17:55     ` Sam Ravnborg
2016-12-23 10:39 ` Sam Ravnborg
2016-12-27 17:07 ` Fabio Estevam
2017-01-09 10:21 ` Sascha Hauer
2017-08-21 16:26   ` Uwe Kleine-König
2017-08-21 16:31     ` Fabio Estevam
