mail archive of the barebox mailing list
From: "Jan Lübbe" <jlu@pengutronix.de>
To: Oleksij Rempel <fishor@gmx.net>,
	Oleksij Rempel <o.rempel@pengutronix.de>,
	barebox@lists.infradead.org
Subject: Re: [PATCH v1 6/6] watchdog: add watchdog poller
Date: Thu, 08 Mar 2018 16:33:39 +0100
Message-ID: <1520523219.31759.140.camel@pengutronix.de>
In-Reply-To: <cd46083c-123e-8ec9-015b-cd121c87434b@gmx.net>

Hi Oleksij,

On Thu, 2018-03-08 at 15:16 +0100, Oleksij Rempel wrote:
> > Also, it should be documented explicitly, that this will cause barebox
> > to keep triggering the watchdog, even when it drops to the shell after
> > a boot error. This makes it unsuitable for unattended use.
> 
> I would prefer a controlled reboot over an uncontrolled watchdog reset.
> For example, it would be better to have a boot-and-fail strategy. In the
> case of network boot, it would be better to retry the download after some
> time and not cause a watchdog reset. If the retry count is exceeded, then
> something should be done: power off, reboot, or fall back to the CLI.

In my experience, the watchdog is a last resort for handling any
*unanticipated* problems. So, by definition, there isn't any code to
handle these problems. The way to achieve this is to trigger the
watchdog only when the boot process has made actual progress towards a
running system. For example:
- once barebox probes the watchdog driver
- from the shell init scripts
- after loading the kernel, just before jumping to the kernel

This way, there is no path that leaves barebox just waiting at the
prompt: an idle or hung system will always be restarted via the
watchdog.
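
As a rough sketch of that idea (plain C with a simulated timer, not
barebox code; every name below is made up for illustration and the
60 s timeout is an arbitrary assumption): the watchdog is fed only
when a boot milestone is reached, so a system that stops making
progress runs into the deadline.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated watchdog: it fires once "now" reaches the deadline. */
struct sim_wdt {
	uint64_t timeout_ms;	/* assumed hardware timeout */
	uint64_t deadline_ms;	/* absolute time at which the reset would fire */
};

/* Boot milestones taken from the list above. */
enum boot_stage {
	STAGE_WDT_PROBED,	/* barebox probed the watchdog driver */
	STAGE_INIT_SCRIPTS,	/* shell init scripts are running */
	STAGE_KERNEL_LOADED,	/* kernel loaded, about to jump to it */
};

/* Re-arm the watchdog: push the deadline one timeout into the future. */
static void wdt_feed(struct sim_wdt *w, uint64_t now_ms)
{
	w->deadline_ms = now_ms + w->timeout_ms;
}

/* True if the watchdog would have reset the system by now_ms. */
static bool wdt_expired(const struct sim_wdt *w, uint64_t now_ms)
{
	return now_ms >= w->deadline_ms;
}

/*
 * The only place that feeds the watchdog. There is deliberately no
 * periodic poller: without a new milestone, nothing extends the deadline.
 */
static void boot_progress(struct sim_wdt *w, enum boot_stage stage,
			  uint64_t now_ms)
{
	(void)stage;	/* a real implementation might log the stage */
	wdt_feed(w, now_ms);
}
```

With this structure, a boot that reaches the init scripts and then
drops to an idle prompt is reset one timeout after the last milestone,
without any code that anticipates the specific failure.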

> The reason for controlled reboot is the fact that the reset impact (or
> Reset Sensitivity) is different for every product and source of reset.
> 
> This example is taken from the MiniRISC EZ4021-FC documentation:
> Module               Reset  Soft   PrRst  ERst   TRST   TAP Ctrl
>                             Reset                       Reset
> CPU                  yes    yes    yes    no     no     no
> CP0                  yes    yes    yes    no     no     no
> ICCi                 yes    yes    yes    no     no     no
> DCC                  yes    yes    yes    no     no     no
> BIU                  yes    yes    yes    no     no     no
> MMU                  yes    no     no     no     no     no
> MDU                  yes    yes    yes    no     no     no
> EJTAG iface:
> - DMA/CPU Acc logic  yes    yes    yes    yes    yes    yes
> - Protocol engine    yes    no     no     yes    yes    yes
> - Breakpoint         yes    no     no     yes    no     no
> - PC trace           yes    no     no     yes    no     no

It is not clear to me from this table which reset is triggered by the
hardware watchdog. I would expect that it is the first column, which
resets everything.

> Most Atheros/QCA WiSoCs will not reset the complete SoC even with a
> watchdog-triggered reset.

If you can't be sure that the watchdog resets enough to recover from
any transient problem, you cannot rely on it at all (and should
possibly use an external watchdog).

Regards,
Jan
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox
