mail archive of the barebox mailing list
* Designware MAC reset timeout after Linux reboot
@ 2016-11-07 17:56 Ian Abbott
  2016-11-08  8:08 ` Sascha Hauer
  2016-11-08  8:59 ` Steffen Trumtrar
  0 siblings, 2 replies; 6+ messages in thread
From: Ian Abbott @ 2016-11-07 17:56 UTC
  To: barebox

Hi everyone,

I'm using barebox 2016.10.0 with some custom BSP patches for my Cyclone 
V socfpga based board.  I've noticed that after issuing a reboot in 
Linux, followed by an 'ifup eth0' command in barebox, I get an "eth0: MAC
reset timeout" error, which causes dwc_ether_init() to bail out early. 
My Linux kernel is Linux 4.1.17, plus LTSI-4.1.17 patches, plus Altera 
patches from linux-socfpga kernel branch socfpga-4.1.22-ltsi, in that 
order (git rebase is a wonderful thing!).

Socfpga has two Ethernet MAC controllers.  Like several other Cyclone V 
boards, my board's device tree disables the first one (&gmac0) and 
aliases ethernet0 to the second one (&gmac1).

I don't need the ethernet to work to boot Linux, and Linux manages to 
reinitialize the ethernet okay, so it's more of an inconvenience to me
than a show-stopper - I just need to power-cycle the board if I want 
ethernet access in barebox.

I am aware of Trent Piepho's patch (commit 
f0ae0c33f52ced89da080673ca89a3c5f2ea70e6) which brings the PHY out of 
power-down mode before resetting the MAC DMA controller.  In fact, the 
PHY doesn't seem to be in power-down mode in my case, as the value read 
from the MII_BMCR in phy_resume() is 0x1140 (BMCR_ANENABLE | 
BMCR_FULLDPLX | BMCR_SPEED1000).
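
For anyone wanting to check that decoding, here is a minimal sketch in C.
The bit values are the standard MII Basic Mode Control Register
definitions (the same values Linux uses in <linux/mii.h>).  The helper
name bmcr_is_powered_down() is made up for illustration and is not part
of barebox.

```c
#include <stdbool.h>
#include <stdint.h>

/* MII Basic Mode Control Register bits (values as in <linux/mii.h>) */
#define BMCR_SPEED1000	0x0040	/* MSB of speed selection (1000 Mbit/s) */
#define BMCR_FULLDPLX	0x0100	/* full duplex */
#define BMCR_PDOWN	0x0800	/* PHY power-down */
#define BMCR_ANENABLE	0x1000	/* auto-negotiation enable */

/* Hypothetical helper: true if the BMCR value has the power-down bit set. */
bool bmcr_is_powered_down(uint16_t bmcr)
{
	return (bmcr & BMCR_PDOWN) != 0;
}
```

With the value 0x1140 read in phy_resume(), bit 11 (BMCR_PDOWN) is clear,
which matches the observation that the PHY is not in power-down mode.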

There must be something else stopping the software reset of the MAC 
completing successfully, but I'm not sure what.  The Cyclone V Hard 
Processor System Technical Reference Manual says this about the MAC DMA 
software reset bit:

| Note: * The Software reset system is driven only by this bit. *
| The reset operation is completed only when all resets in all
| active clock domains are de-asserted. Therefore, it is
| essential that all the PHY inputs clocks (applicable for the
| selected PHY interface) are present for the software reset
| completion.

Perhaps the timeout isn't waiting long enough.  If I interrupt the 'ifup 
eth0' command and display the appropriate 'Bus_Mode' register
(0xff703000) with the 'md' command, the DMAMAC_SRST bit (bit 0) is no 
longer set:

barebox@xxxx:/ md -l 0xff703000+4
ff703000: 00020100
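
As a rough sketch of how that register dump can be picked apart in C:
DMAMAC_SRST is bit 0 as stated above, but the two burst-length field
positions below are an assumption based on the usual DesignWare GMAC
Bus_Mode layout and should be checked against the TRM.  The helper and
macro names other than DMAMAC_SRST are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define DMAMAC_SRST		(1u << 0)	/* software reset, self-clearing */

/* ASSUMED field positions (typical DesignWare GMAC layout, unverified) */
#define BUS_MODE_PBL_SHIFT	8		/* programmable burst length */
#define BUS_MODE_PBL_MASK	0x3fu
#define BUS_MODE_RPBL_SHIFT	17		/* Rx programmable burst length */
#define BUS_MODE_RPBL_MASK	0x3fu

/* Hypothetical helper: true while the software reset is still pending. */
bool dma_reset_pending(uint32_t bus_mode)
{
	return (bus_mode & DMAMAC_SRST) != 0;
}

/* Hypothetical helpers: extract the assumed burst-length fields. */
unsigned int bus_mode_pbl(uint32_t bus_mode)
{
	return (bus_mode >> BUS_MODE_PBL_SHIFT) & BUS_MODE_PBL_MASK;
}

unsigned int bus_mode_rpbl(uint32_t bus_mode)
{
	return (bus_mode >> BUS_MODE_RPBL_SHIFT) & BUS_MODE_RPBL_MASK;
}
```

For the value 0x00020100 the reset bit is clear, and under the assumed
layout the only bits left set are the two burst-length fields.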

I tried porting over a few old patches from the U-Boot version of the 
driver, in particular these two patches for the mac_reset() function:

http://git.denx.de/?p=u-boot.git;a=patch;h=7091915ad7a58d7884b7353b87373847ae943e1c

http://git.denx.de/?p=u-boot.git;a=patch;h=227ad7b2b6fab024fff6f60613b0e90c9e3a6724

They didn't solve my problem, but I'll send those two patches and a 
couple of others adapted from the U-Boot version of the driver to the 
list separately.

Sorry for waffling on for so long.  Thanks for your time, and any 
helpful hints you can offer!  On the whole, hacking PTXdist and barebox 
is a much more pleasant experience than hacking U-Boot and Yocto!

-- 
-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti@mev.co.uk> )=-
-=(                          Web: http://www.mev.co.uk/  )=-

_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox


* Re: Designware MAC reset timeout after Linux reboot
  2016-11-07 17:56 Designware MAC reset timeout after Linux reboot Ian Abbott
@ 2016-11-08  8:08 ` Sascha Hauer
  2016-11-08 12:13   ` Ian Abbott
  2016-11-08  8:59 ` Steffen Trumtrar
  1 sibling, 1 reply; 6+ messages in thread
From: Sascha Hauer @ 2016-11-08  8:08 UTC
  To: Ian Abbott; +Cc: barebox

Hi Ian,

On Mon, Nov 07, 2016 at 05:56:51PM +0000, Ian Abbott wrote:
> Hi everyone,
> 
> I'm using barebox 2016.10.0 with some custom BSP patches for my Cyclone V
> socfpga based board.  I've noticed that after issuing a reboot in Linux,
> followed by an 'ifup eth0' command in barebox, I get an "eth0: MAC reset
> timeout" error, which causes dwc_ether_init() to bail out early. My Linux
> kernel is Linux 4.1.17, plus LTSI-4.1.17 patches, plus Altera patches from
> linux-socfpga kernel branch socfpga-4.1.22-ltsi, in that order (git rebase
> is a wonderful thing!).
> 
> Socfpga has two Ethernet MAC controllers.  Like several other Cyclone V
> boards, my board's device tree disables the first one (&gmac0) and aliases
> ethernet0 to the second one (&gmac1).
> 
> I don't need the ethernet to work to boot Linux, and Linux manages to
> reinitialize the ethernet okay, so it's more of an inconvenience to me than a
> show-stopper - I just need to power-cycle the board if I want ethernet
> access in barebox.

Have you looked through the Linux code to see what it does differently
so that it can successfully reset the MAC?

> 
> I am aware of Trent Piepho's patch (commit
> f0ae0c33f52ced89da080673ca89a3c5f2ea70e6) which brings the PHY out of
> power-down mode before resetting the MAC DMA controller.  In fact, the PHY
> doesn't seem to be in power-down mode in my case, as the value read from the
> MII_BMCR in phy_resume() is 0x1140 (BMCR_ANENABLE | BMCR_FULLDPLX |
> BMCR_SPEED1000).
> 
> There must be something else stopping the software reset of the MAC
> completing successfully, but I'm not sure what.  The Cyclone V Hard
> Processor System Technical Reference Manual says this about the MAC DMA
> software reset bit:
> 
> | Note: * The Software reset system is driven only by this bit. *
> | The reset operation is completed only when all resets in all
> | active clock domains are de-asserted. Therefore, it is
> | essential that all the PHY inputs clocks (applicable for the
> | selected PHY interface) are present for the software reset
> | completion.
> 
> Perhaps the timeout isn't waiting long enough.  If I interrupt the 'ifup
> eth0' command and display the appropriate 'Bus_Mode' register (0xff703000)
> with the 'md' command, the DMAMAC_SRST bit (bit 0) is no longer set:
> 
> barebox@xxxx:/ md -l 0xff703000+4
> ff703000: 00020100

The timeout is 10 ms, which should be more than enough. The return value
of dwc_ether_init() is not checked, so the driver happily continues with
further register writes. I assume something clears this bit afterwards,
either directly or indirectly.

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: Designware MAC reset timeout after Linux reboot
  2016-11-07 17:56 Designware MAC reset timeout after Linux reboot Ian Abbott
  2016-11-08  8:08 ` Sascha Hauer
@ 2016-11-08  8:59 ` Steffen Trumtrar
  2016-11-08 12:25   ` Ian Abbott
  1 sibling, 1 reply; 6+ messages in thread
From: Steffen Trumtrar @ 2016-11-08  8:59 UTC
  To: Ian Abbott; +Cc: barebox

Hi!

On Mon, Nov 07, 2016 at 05:56:51PM +0000, Ian Abbott wrote:
> Hi everyone,
> 
> I'm using barebox 2016.10.0 with some custom BSP patches for my Cyclone V
> socfpga based board.  I've noticed that after issuing a reboot in Linux,
> followed by an 'ifup eth0' command in barebox, I get an "eth0: MAC reset
> timeout" error, which causes dwc_ether_init() to bail out early. My Linux
> kernel is Linux 4.1.17, plus LTSI-4.1.17 patches, plus Altera patches from
> linux-socfpga kernel branch socfpga-4.1.22-ltsi, in that order (git rebase
> is a wonderful thing!).
> 

FYI: I just tested on a Socrates board with Linux 4.9-rc3 and barebox 2016.08.0
and cannot reproduce your problem. Does it always happen or just sometimes?

Regards,
Steffen

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: Designware MAC reset timeout after Linux reboot
  2016-11-08  8:08 ` Sascha Hauer
@ 2016-11-08 12:13   ` Ian Abbott
  2016-11-09 14:10     ` Ian Abbott
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Abbott @ 2016-11-08 12:13 UTC
  To: Sascha Hauer; +Cc: barebox

On 08/11/16 08:08, Sascha Hauer wrote:
> Hi Ian,
>
> On Mon, Nov 07, 2016 at 05:56:51PM +0000, Ian Abbott wrote:
>> Hi everyone,
>>
>> I'm using barebox 2016.10.0 with some custom BSP patches for my Cyclone V
>> socfpga based board.  I've noticed that after issuing a reboot in Linux,
>> followed by an 'ifup eth0' command in barebox, I get an "eth0: MAC reset
>> timeout" error, which causes dwc_ether_init() to bail out early. My Linux
>> kernel is Linux 4.1.17, plus LTSI-4.1.17 patches, plus Altera patches from
>> linux-socfpga kernel branch socfpga-4.1.22-ltsi, in that order (git rebase
>> is a wonderful thing!).
>>
>> Socfpga has two Ethernet MAC controllers.  Like several other Cyclone V
>> boards, my board's device tree disables the first one (&gmac0) and aliases
>> ethernet0 to the second one (&gmac1).
>>
>> I don't need the ethernet to work to boot Linux, and Linux manages to
>> reinitialize the ethernet okay, so it's more of an inconvenience to me than a
>> show-stopper - I just need to power-cycle the board if I want ethernet
>> access in barebox.
>
> Have you looked through the Linux code to see what it does differently
> so that it can successfully reset the MAC?

The Linux code paths are more convoluted, including calls into the reset 
manager.  I found the code that resets the MAC DMA controller though - 
see below....

>> I am aware of Trent Piepho's patch (commit
>> f0ae0c33f52ced89da080673ca89a3c5f2ea70e6) which brings the PHY out of
>> power-down mode before resetting the MAC DMA controller.  In fact, the PHY
>> doesn't seem to be in power-down mode in my case, as the value read from the
>> MII_BMCR in phy_resume() is 0x1140 (BMCR_ANENABLE | BMCR_FULLDPLX |
>> BMCR_SPEED1000).
>>
>> There must be something else stopping the software reset of the MAC
>> completing successfully, but I'm not sure what.  The Cyclone V Hard
>> Processor System Technical Reference Manual says this about the MAC DMA
>> software reset bit:
>>
>> | Note: * The Software reset system is driven only by this bit. *
>> | The reset operation is completed only when all resets in all
>> | active clock domains are de-asserted. Therefore, it is
>> | essential that all the PHY inputs clocks (applicable for the
>> | selected PHY interface) are present for the software reset
>> | completion.
>>
>> Perhaps the timeout isn't waiting long enough.  If I interrupt the 'ifup
>> eth0' command and display the appropriate 'Bus_Mode' register (0xff703000)
>> with the 'md' command, the DMAMAC_SRST bit (bit 0) is no longer set:
>>
>> barebox@xxxx:/ md -l 0xff703000+4
>> ff703000: 00020100
>
> The timeout is 10 ms, which should be more than enough. The return value
> of dwc_ether_init() is not checked, so the driver happily continues with
> further register writes. I assume something clears this bit afterwards,
> either directly or indirectly.

The bit is supposed to clear itself, but I guess something else could be 
clearing it too.

The code to reset the MAC DMA controller in Linux kernel 4.1 is 
dwmac1000_dma_init() in 
"drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c".  In Linux kernel 
4.6, the function is dwmac_dma_reset() in "dwmac_lib.c".  In both cases, 
the code to reset the DMA controller is basically as follows:

	u32 value = readl(ioaddr + DMA_BUS_MODE);
	int limit;

	/* DMA SW reset */
	value |= DMA_BUS_MODE_SFT_RESET;
	writel(value, ioaddr + DMA_BUS_MODE);
	limit = 10;
	while (limit--) {
		if (!(readl(ioaddr + DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET))
			break;
		mdelay(10);
	}
	if (limit < 0)
		return -EBUSY;

It's interesting that it only bothers to check for reset completion 
every 10 ms (timing out after 100 ms), so it must be expecting it to 
take a while!

I'll experiment with the timeout on my board to see if the bit ever 
clears itself.

-- 
-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti@mev.co.uk> )=-
-=(                          Web: http://www.mev.co.uk/  )=-


* Re: Designware MAC reset timeout after Linux reboot
  2016-11-08  8:59 ` Steffen Trumtrar
@ 2016-11-08 12:25   ` Ian Abbott
  0 siblings, 0 replies; 6+ messages in thread
From: Ian Abbott @ 2016-11-08 12:25 UTC
  To: Steffen Trumtrar; +Cc: barebox

On 08/11/16 08:59, Steffen Trumtrar wrote:
> Hi!
>
> On Mon, Nov 07, 2016 at 05:56:51PM +0000, Ian Abbott wrote:
>> Hi everyone,
>>
>> I'm using barebox 2016.10.0 with some custom BSP patches for my Cyclone V
>> socfpga based board.  I've noticed that after issuing a reboot in Linux,
>> followed by an 'ifup eth0' command in barebox, I get an "eth0: MAC reset
>> timeout" error, which causes dwc_ether_init() to bail out early. My Linux
>> kernel is Linux 4.1.17, plus LTSI-4.1.17 patches, plus Altera patches from
>> linux-socfpga kernel branch socfpga-4.1.22-ltsi, in that order (git rebase
>> is a wonderful thing!).
>>
>
> FYI: I just tested on a Socrates board with Linux 4.9-rc3 and barebox 2016.08.0
> and cannot reproduce your problem. Does it always happen or just sometimes?

It always happens on my board.  I have a couple of Socrates version 1.2
boards and a Socrates 2.0 board, so I could try to reproduce the problem
on one of those if I find time to set it up.

My board is actually a prototype.  The PHY clock was originally wired up 
to completely the wrong pin on the FPGA (since it was based on an older 
NiosII based design).  It has been surgically altered so the PHY clock 
is on a different wrong pin, but at least the new pin is clocked at the 
correct frequency.  This may or may not be related to my problem, but 
the PHY seems to work OK before bringing up the MAC controller - miitool 
shows it manages to establish a link at the physical level.

-- 
-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti@mev.co.uk> )=-
-=(                          Web: http://www.mev.co.uk/  )=-


* Re: Designware MAC reset timeout after Linux reboot
  2016-11-08 12:13   ` Ian Abbott
@ 2016-11-09 14:10     ` Ian Abbott
  0 siblings, 0 replies; 6+ messages in thread
From: Ian Abbott @ 2016-11-09 14:10 UTC
  To: Sascha Hauer; +Cc: barebox

On 08/11/16 12:13, Ian Abbott wrote:
> On 08/11/16 08:08, Sascha Hauer wrote:
>> Hi Ian,
>>
>> On Mon, Nov 07, 2016 at 05:56:51PM +0000, Ian Abbott wrote:

>>> Perhaps the timeout isn't waiting long enough.  If I interrupt the 'ifup
>>> eth0' command and display the appropriate 'Bus_Mode' register
>>> (0xff703000)
>>> with the 'md' command, the DMAMAC_SRST bit (bit 0) is no longer set:
>>>
>>> barebox@xxxx:/ md -l 0xff703000+4
>>> ff703000: 00020100
>>
>> The timeout is 10 ms, which should be more than enough. The return value
>> of dwc_ether_init() is not checked, so the driver happily continues with
>> further register writes. I assume something clears this bit afterwards,
>> either directly or indirectly.
>
> The bit is supposed to clear itself, but I guess something else could be
> clearing it too.
>
> The code to reset the MAC DMA controller in Linux kernel 4.1 is
> dwmac1000_dma_init() in
> "drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c".  In Linux kernel
> 4.6, the function is dwmac_dma_reset() in "dwmac_lib.c".  In both cases,
> the code to reset the DMA controller is basically as follows:
>
>     u32 value = readl(ioaddr + DMA_BUS_MODE);
>     int limit;
>
>     /* DMA SW reset */
>     value |= DMA_BUS_MODE_SFT_RESET;
>     writel(value, ioaddr + DMA_BUS_MODE);
>     limit = 10;
>     while (limit--) {
>         if (!(readl(ioaddr + DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET))
>             break;
>         mdelay(10);
>     }
>     if (limit < 0)
>         return -EBUSY;
>
> It's interesting that it only bothers to check for reset completion
> every 10 ms (timing out after 100 ms), so it must be expecting it to
> take a while!
>
> I'll experiment with the timeout on my board to see if the bit ever
> clears itself.
>

The problem seems to be related to some other problems I've been having
with the Ethernet on this prototype board, which have something to do with
the PHY chip's passive support components (inductors, capacitors, etc.).
Those problems manifest as lower-than-expected 'iperf' results when the
Ethernet port is plugged into certain models of Ethernet switch.

I experimented with the timeout in mac_reset() in designware.c, setting 
it to 1 second, and printing out a debug message with the time taken for 
the reset to complete.
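
The measurement can be sketched roughly as follows.  This is a
self-contained simulation and not the actual change to designware.c:
the register and clock are faked (sim_read_busmode(), sim_now_ns and
sim_reset_done_ns are invented names), and measure_reset_ns() is a
hypothetical stand-in for the instrumented mac_reset().  In barebox the
clock would come from get_time_ns() and the timeout test from
is_timeout().

```c
#include <stdint.h>

#define DMAMAC_SRST	(1u << 0)	/* software reset bit in Bus_Mode */
#define MSECOND		1000000ull	/* nanoseconds per millisecond */
#define SECOND		1000000000ull	/* nanoseconds per second */

/*
 * Simulated hardware: the reset bit self-clears once the fake clock
 * reaches sim_reset_done_ns.  A real driver would read the register
 * with readl() and use a hardware timer instead.
 */
uint64_t sim_now_ns;
uint64_t sim_reset_done_ns;

uint32_t sim_read_busmode(void)
{
	sim_now_ns += 1000;	/* pretend each register read takes 1 us */
	return sim_now_ns < sim_reset_done_ns ? DMAMAC_SRST : 0;
}

/*
 * Poll until the reset bit clears or timeout_ns has elapsed.
 * Returns the elapsed time in nanoseconds, or -1 on timeout.
 */
int64_t measure_reset_ns(uint64_t timeout_ns)
{
	uint64_t start = sim_now_ns;

	while (sim_read_busmode() & DMAMAC_SRST) {
		if (sim_now_ns - start > timeout_ns)
			return -1;	/* "MAC reset timeout" */
	}

	return (int64_t)(sim_now_ns - start);
}
```

With a simulated reset that takes 300 ms, a 10 ms timeout reports
failure, while a 1 second timeout lets the loop finish and returns how
long the reset actually took.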

After 20 trials of rebooting from Linux to barebox and issuing the 'ifup 
eth0' command, I got a pretty random spread of times between 29.3 and 
850.2 ms, with a mean of 312.5 ms.  (It looks like a pretty linear 
distribution.  Some other stats: Q1: 141.4 ms, Median: 240.1 ms, Q3: 
480.2 ms, SD: 211.9 ms.)

I ran another trial with the is_timeout() call replaced with 
is_timeout_non_interruptible() and got a similar random spread of times 
(but smaller than the first trial) from 11.4 ms to 654.6 ms, with a mean 
of 266.2 ms.

Both of those trials were performed with the Ethernet port connected to 
a 1000 Base-T Ethernet switch.

Now here's the kicker.... If I plug it into a different brand of 1000 
Base-T Ethernet switch, the mac_reset() times (after rebooting from 
Linux) are more like 360 ns (not ms!).  If I plug it into a 100 Base-T 
switch, the times are more like 900 ns to 2300 ns.  If I disconnect
it completely, the times are about 360 ns.

For comparison, after running 'ifup eth0' after powering up into 
barebox, the mac_reset() times are about 360 ns independent of what the 
Ethernet port is plugged into.


I'm still not sure what state my Linux kernel is leaving the Ethernet 
controller and PHY in following a reboot, but I'm reasonably confident 
the problem is related to the PHY hardware components on my board.

-- 
-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti@mev.co.uk> )=-
-=(                          Web: http://www.mev.co.uk/  )=-

