mail archive of the barebox mailing list
* [PATCH 0/3] Documentation: devel: add new troubleshooting
@ 2025-07-04 14:38 Ahmad Fatoum
  2025-07-04 14:38 ` [PATCH 1/3] Documentation: devel: porting: split out architecture intro Ahmad Fatoum
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Ahmad Fatoum @ 2025-07-04 14:38 UTC
  To: barebox; +Cc: David Picard

A consequence of running bare metal is that early failures are difficult
to diagnose. Let's add a troubleshooting section to help users take
the first step in diagnosing issues.

Ahmad Fatoum (3):
  Documentation: devel: porting: split out architecture intro
  Documentation: devel: architecture: detail first/second stage handling
  Documentation: devel: troubleshooting: add new chapter

 Documentation/devel/architecture.rst    | 200 +++++++++++++
 Documentation/devel/devel.rst           |   2 +
 Documentation/devel/porting.rst         |  83 +-----
 Documentation/devel/troubleshooting.rst | 377 ++++++++++++++++++++++++
 Documentation/devicetree/index.rst      |   2 +
 5 files changed, 588 insertions(+), 76 deletions(-)
 create mode 100644 Documentation/devel/architecture.rst
 create mode 100644 Documentation/devel/troubleshooting.rst

-- 
2.39.5





* [PATCH 1/3] Documentation: devel: porting: split out architecture intro
  2025-07-04 14:38 [PATCH 0/3] Documentation: devel: add new troubleshooting Ahmad Fatoum
@ 2025-07-04 14:38 ` Ahmad Fatoum
  2025-07-04 14:38 ` [PATCH 2/3] Documentation: devel: architecture: detail first/second stage handling Ahmad Fatoum
  2025-07-04 14:38 ` [PATCH 3/3] Documentation: devel: troubleshooting: add new chapter Ahmad Fatoum
  2 siblings, 0 replies; 4+ messages in thread
From: Ahmad Fatoum @ 2025-07-04 14:38 UTC
  To: barebox; +Cc: David Picard, Ahmad Fatoum

In preparation for adding a debugging chapter, move the parts of the
porting guide that are equally applicable to debugging into their own
file, so they can be referenced easily.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 Documentation/devel/architecture.rst | 78 ++++++++++++++++++++++++++
 Documentation/devel/porting.rst      | 83 +++-------------------------
 2 files changed, 85 insertions(+), 76 deletions(-)
 create mode 100644 Documentation/devel/architecture.rst

diff --git a/Documentation/devel/architecture.rst b/Documentation/devel/architecture.rst
new file mode 100644
index 000000000000..83556095098a
--- /dev/null
+++ b/Documentation/devel/architecture.rst
@@ -0,0 +1,78 @@
+.. _architecture:
+
+####################
+barebox architecture
+####################
+
+The usual barebox binary consists of two parts: a prebootloader doing
+the bare minimum initialization, followed by the barebox proper binary.
+
+barebox proper
+==============
+
+This is the main part of barebox and, like a multi-platform Linux kernel,
+is platform-agnostic: The program starts, registers its drivers and tries
+to match the drivers with the devices it discovers at runtime.
+It initializes file systems and common management facilities and finally
+starts an init process. barebox knows no privilege separation and the
+init process is built into barebox.
+The default init is the :ref:`Hush`, but can be overridden if required.
+
+For such a platform-agnostic program to work, it must receive external
+input about what kind of devices are available: For example, is there a
+timer? At what address and how often does it tick? For most barebox
+architectures this hardware description is provided in the form
+of a flattened device tree (FDT). As part of barebox' initialization
+procedure, it unflattens (parses) the device tree and starts probing
+(matching) the devices described within with the drivers that are being
+registered.
+
+The device tree can also describe the RAM available in the system. As
+walking the device tree itself consumes RAM, barebox proper needs to
+be passed information about an initial memory region for use as stack
+and for dynamic allocations. When barebox has probed the memory banks,
+the whole memory will become available.
+
+As a result of this design, the same barebox proper binary can be reused for
+many different boards. Unlike Linux, which can expect a bootloader to pass
+it the device tree, barebox *is* the bootloader. For this reason, barebox
+proper is prefixed with what is called a prebootloader (PBL). The PBL
+handles the low-level details that need to happen before invoking barebox
+proper.
+
+Prebootloader (PBL)
+===================
+
+The :ref:`prebootloader <pbl>` is a small chunk of code whose objective is
+to prepare the environment for barebox proper to execute. This means:
+
+ - Setting up a stack
+ - Determining a memory region for initial allocations
+ - Providing the device tree
+ - Jumping to barebox proper
+
+The prebootloader often runs from a constrained medium like a small
+(tens of KiB) on-chip SRAM or sometimes even directly from flash.
+
+If the size constraints allow, the PBL will contain the barebox proper
+binary in compressed form. After ensuring any external DRAM can be
+addressed, it will unpack barebox proper there and call it with the
+necessary arguments: an initial memory region and the FDT.
+
+If this is not feasible, the PBL will contain drivers to chain load
+barebox proper from the storage medium. As this is usually the same
+storage medium the PBL itself was loaded from, shortcuts can often
+be taken: e.g. a SD-Card could already be in the correct mode, so the
+PBL driver can just read the blocks without having to reinitialize
+the SD-card.
+
+barebox images
+==============
+
+In a typical build, the barebox build process generates multiple images
+(:ref:`multi_image`).  All enabled PBLs are each linked with the same
+barebox proper binary and then the resulting images are processed to be
+in the format expected by the loader.
+
+The loader is often a BootROM, but maybe another first stage bootloader
+or a hardware debugger.
diff --git a/Documentation/devel/porting.rst b/Documentation/devel/porting.rst
index 9dab2a301f2a..88847f42a8f7 100644
--- a/Documentation/devel/porting.rst
+++ b/Documentation/devel/porting.rst
@@ -15,83 +15,14 @@ about porting barebox to new hardware.
 Introduction
 ************
 
-Your usual barebox binary consists of two parts. A prebootloader doing
-the bare minimum initialization and then the proper barebox binary.
+Before starting your barebox port, you'll want to familiarize yourself with
+key concepts of the :ref:`barebox architecture <architecture>`, namely the
+prebootloader, barebox proper and the multi-image support.
 
-barebox proper
-==============
-
-This is the main part of barebox and, like a multi-platform Linux kernel,
-is platform-agnostic: The program starts, registers its drivers and tries
-to match the drivers with the devices it discovers at runtime.
-It initializes file systems and common management facilities and finally
-starts an init process. barebox knows no privilege separation and the
-init process is built into barebox.
-The default init is the :ref:`Hush`, but can be overridden if required.
-
-For such a platform-agnostic program to work, it must receive external
-input about what kind of devices are available: For example, is there a
-timer? At what address and how often does it tick? For most barebox
-architectures this hardware description is provided in the form
-of a flattened device tree (FDT). As part of barebox' initialization
-procedure, it unflattens (parses) the device tree and starts probing
-(matching) the devices described within with the drivers that are being
-registered.
-
-The device tree can also describe the RAM available in the system. As
-walking the device tree itself consumes RAM, barebox proper needs to
-be passed information about an initial memory region for use as stack
-and for dynamic allocations. When barebox has probed the memory banks,
-the whole memory will become available.
-
-As result of this design, the same barebox proper binary can be reused for
-many different boards. Unlike Linux, which can expect a bootloader to pass
-it the device tree, barebox *is* the bootloader. For this reason, barebox
-proper is prefixed with what is called a prebootloader (PBL). The PBL
-handles the low-level details that need to happen before invoking barebox
-proper.
-
-Prebootloader (PBL)
-===================
-
-The :ref:`prebootloader <pbl>` is a small chunk of code whose objective is
-to prepare the environment for barebox proper to execute. This means:
-
- - Setting up a stack
- - Determining a memory region for initial allocations
- - Provide the device tree
- - Jump to barebox proper
-
-The prebootloader often runs from a constrained medium like a small
-(tens of KiB) on-chip SRAM or sometimes even directly from flash.
-
-If the size constraints allow, the PBL will contain the barebox proper
-binary in compressed form. After ensuring any external DRAM can be
-addressed, it will unpack barebox proper there and call it with the
-necessary arguments: an initial memory region and the FDT.
-
-If this is not feasible, the PBL will contain drivers to chain load
-barebox proper from the storage medium. As this is usually the same
-storage medium the PBL itself was loaded from, shortcuts can often
-be taken: e.g. a SD-Card could already be in the correct mode, so the
-PBL driver can just read the blocks without having to reinitialize
-the SD-card.
-
-barebox images
-==============
-
-In a typical build, the barebox build process generates multiple images
-(:ref:`multi_image`).  All enabled PBLs are each linked with the same
-barebox proper binary and then the resulting images are processed to be
-in the format expected by the loader.
-
-The loader is often a BootROM, but maybe another first stage bootloader
-or a hardware debugger.
-
-Let us now put these new concepts into practice. We will start by adding
-a new board for a platform, for which similar boards already exist.
-Then we'll look at adding a new SoC, then a new SoC family and finally
-a new architecture.
+Once you've read through it, let us put these new concepts into practice.
+We will start by adding a new board for a platform, for which similar boards
+already exist.  Then we'll look at adding a new SoC, then a new SoC family
+and finally a new architecture.
 
 **********************
 Porting to a new board
-- 
2.39.5





* [PATCH 2/3] Documentation: devel: architecture: detail first/second stage handling
  2025-07-04 14:38 [PATCH 0/3] Documentation: devel: add new troubleshooting Ahmad Fatoum
  2025-07-04 14:38 ` [PATCH 1/3] Documentation: devel: porting: split out architecture intro Ahmad Fatoum
@ 2025-07-04 14:38 ` Ahmad Fatoum
  2025-07-04 14:38 ` [PATCH 3/3] Documentation: devel: troubleshooting: add new chapter Ahmad Fatoum
  2 siblings, 0 replies; 4+ messages in thread
From: Ahmad Fatoum @ 2025-07-04 14:38 UTC
  To: barebox; +Cc: David Picard, Ahmad Fatoum

barebox is at the mercy of the BootROM in relation to how its first/second
stages need to be structured. In the optimal case, the user need not
care about it, but when issues arise, it becomes very important to know
_where_ the issue occurs.

Let's add some general background information about that.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 Documentation/devel/architecture.rst | 136 +++++++++++++++++++++++++--
 1 file changed, 129 insertions(+), 7 deletions(-)

diff --git a/Documentation/devel/architecture.rst b/Documentation/devel/architecture.rst
index 83556095098a..b557ec830dc7 100644
--- a/Documentation/devel/architecture.rst
+++ b/Documentation/devel/architecture.rst
@@ -60,19 +60,141 @@ addressed, it will unpack barebox proper there and call it with the
 necessary arguments: an initial memory region and the FDT.
 
 If this is not feasible, the PBL will contain drivers to chain load
-barebox proper from the storage medium. As this is usually the same
-storage medium the PBL itself was loaded from, shortcuts can often
-be taken: e.g. a SD-Card could already be in the correct mode, so the
-PBL driver can just read the blocks without having to reinitialize
-the SD-card.
+barebox proper from the storage medium, usually the same
+storage medium the PBL itself was loaded from.
 
 barebox images
 ==============
 
 In a typical build, the barebox build process generates multiple images
-(:ref:`multi_image`).  All enabled PBLs are each linked with the same
-barebox proper binary and then the resulting images are processed to be
+(:ref:`multi_image`).  Normally, all enabled PBLs are each linked with the
+same barebox proper binary and then the resulting images are processed to be
 in the format expected by the loader.
 
 The loader is often a BootROM, but maybe another first stage bootloader
 or a hardware debugger.
+
+Ideally, a single image is all that's needed to boot into barebox.
+Depending on the BootROM, this may not be possible, because the BootROM hardcodes
+assumptions about the image that it loads. This is often the case with
+BootROMs that boot from a file system (often FAT): They expect a ``BOOT.BIN``
+file, ``MLO`` or similarly named file that does not exceed a fixed file size,
+because that file needs to be loaded into the small on-chip SRAM available on
+the SoC. In such cases, most barebox functionality will be located in a separate
+image, e.g. ``barebox.bin``, which is loaded by ``BOOT.BIN``, once DRAM has
+been successfully set up.
+
+There are two ways we generate such small first stage bootloaders in barebox:
+
+Old way: ≥ 2 configs
+---------------------
+
+The old way is to have a dedicated barebox first stage config that builds a
+very small non-interactive barebox for use as first stage.
+Due to size constraints, the first stage config is usually board-specific, while
+the second-stage config can target multiple boards at once.
+
+In this setup, the first and second stage each consist of their own
+prebootloader and barebox proper.
+
+* first stage prebootloader: Does DRAM setup and extracts first stage
+  barebox proper into DRAM.
+
+* first stage barebox proper: runs in DRAM and chainloads the second stage binary.
+
+* second stage prebootloader: is already running in DRAM, so it doesn't need to do
+  any hardware setup and instead directly extracts second stage barebox proper
+
+* second stage barebox proper: your usual barebox experience, which can have an
+  interactive shell and boot an operating system
+
+An example for this is ``am335x_mlo_defconfig``.
+
+New way: single config
+----------------------
+
+The new way avoids having a dedicated barebox first stage config by doing both
+the low level first stage setup and the chainloading from the same barebox
+prebootloader. Despite the prebootloader lacking nearly all driver frameworks found
+in barebox proper, this is often made possible by reusing the hardware set up
+by the BootROM:
+
+BootROMs usually don't deinitialize hardware before jumping to the first stage
+bootloader. That means that the barebox prebootloader could just keep reusing
+the pre-configured SD-Card and host controller for example and issue read block
+commands right away without having to reinitialize the SD-card.
+
+In this setup, only one prebootloader binary will be built. Depending on
+the bootflow defined by the BootROM, it may be executed more than once, however:
+
+BootROM loads directly into DRAM
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some BootROMs have built-in support for executing multiple programs and
+loading them to different locations. In that case, a full second stage
+barebox binary can be loaded as second stage with the first stage RAM
+setup handled outside barebox.
+
+Examples of a mask ROM loading one image into SRAM and another afterwards
+into DRAM are the boards in ``rockchip_v8_defconfig``: A DRAM setup blob
+is loaded as first stage followed by barebox directly into DRAM as second
+stage.
+
+A slightly different example is what most boards in ``imx_v7_defconfig``
+are doing: The i.MX BootROM can execute the bytecode located in the DCD
+(Device Configuration Data) table that's part of the bootloader header.
+This bytecode has simple memory read/write and control flow primitives
+that are sufficient to set up a DDR2/DDR3 DRAM, so that barebox can be
+loaded into it right away.
+
+The option of being loaded into SRAM first and chainloading from there
+is also available, but not used frequently for the 32-bit i.MX platforms.
+For the 64-bit platforms with (LP)DDR4, the RAM controller setup is
+too complex to express with DCD opcodes, leading to the approach described
+below.
+
+BootROM loads into SRAM from offset
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For BootROMs where the first stage bootloader is loaded from a raw offset
+on the boot medium, the barebox image is usually a single binary, which
+is processed as follows:
+
+* The BootROM reads the first X bytes containing the prebootloader (and
+  some truncated barebox proper content) into the on-chip SRAM and executes
+  it.
+
+* The prebootloader will set up DRAM and then chainload the whole of barebox
+  proper into it. The offset and size of barebox proper are compiled into
+  the PBL, so it knows where to look.
+
+* The prebootloader will then invoke itself again, but this time while running
+  from DRAM. The re-executed prebootloader detects that it's running in DRAM
+  or at a lower exception level and will then proceed to extract barebox
+  proper to the end of the initial memory region and execute it.
+
+An example for this is ``imx_v8_defconfig``.
+
+.. note:: Some SoCs like the i.MX8M Nano and Plus provide a boot API in ROM
+  that can be used by the prebootloader to effortlessly chainload the second stage,
+  cutting down complexity in the prebootloader greatly. Thanks BootROM authors!
+
+BootROM loads into SRAM from file
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When the BootROM expects a file, it most often does a size check,
+necessitating a second binary for the whole of barebox proper.
+
+The boot flow then looks as follows:
+
+* The BootROM reads the first binary, which includes only the prebootloader,
+  from the file system into on-chip SRAM and executes it.
+
+* The prebootloader will set up DRAM and then load the second binary
+  into it. It has no knowledge of its offsets and sizes, but gets that
+  information out of the FAT filesystem.
+
+* The second stage binary contains both its own prebootloader and a barebox
+  binary. The second stage prebootloader does not need to do any special
+  hardware setup, so it will proceed to extract barebox proper to the end of
+  the initial memory region and execute it.
-- 
2.39.5





* [PATCH 3/3] Documentation: devel: troubleshooting: add new chapter
  2025-07-04 14:38 [PATCH 0/3] Documentation: devel: add new troubleshooting Ahmad Fatoum
  2025-07-04 14:38 ` [PATCH 1/3] Documentation: devel: porting: split out architecture intro Ahmad Fatoum
  2025-07-04 14:38 ` [PATCH 2/3] Documentation: devel: architecture: detail first/second stage handling Ahmad Fatoum
@ 2025-07-04 14:38 ` Ahmad Fatoum
  2 siblings, 0 replies; 4+ messages in thread
From: Ahmad Fatoum @ 2025-07-04 14:38 UTC
  To: barebox; +Cc: David Picard, Ahmad Fatoum

A consequence of running bare metal is that early failures are difficult
to diagnose. Let's add a troubleshooting section to help users take
the first step in diagnosing issues.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 Documentation/devel/devel.rst           |   2 +
 Documentation/devel/troubleshooting.rst | 377 ++++++++++++++++++++++++
 Documentation/devicetree/index.rst      |   2 +
 3 files changed, 381 insertions(+)
 create mode 100644 Documentation/devel/troubleshooting.rst

diff --git a/Documentation/devel/devel.rst b/Documentation/devel/devel.rst
index d985bff40d42..b90805263bbd 100644
--- a/Documentation/devel/devel.rst
+++ b/Documentation/devel/devel.rst
@@ -8,7 +8,9 @@ Contents:
 .. toctree::
    :maxdepth: 2
 
+   architecture
    porting
+   troubleshooting
    filesystems
    background-execution
    project-ideas
diff --git a/Documentation/devel/troubleshooting.rst b/Documentation/devel/troubleshooting.rst
new file mode 100644
index 000000000000..67c4e3102be2
--- /dev/null
+++ b/Documentation/devel/troubleshooting.rst
@@ -0,0 +1,377 @@
+.. _troubleshooting:
+
+##########################
+Boot Troubleshooting Guide
+##########################
+
+Especially during development or bring-up, very early failure situations can leave
+the system hanging before recovery is even possible.
+
+This guide helps diagnose and debug such issues across barebox' different boot stages.
+
+Boot Flow Overview
+==================
+
+A barebox binary consists of two main stages:
+
+1. **PBL (Pre-Bootloader)**: This is a smaller barebones loader that does
+   what's necessary to load the full barebox binary.
+   At the very least, this is decompressing barebox proper and jumping
+   to it while passing it a device tree.
+   Depending on the platform, it may also need to set up DRAM, install a secure
+   monitor like TF-A or a secure operating system like OP-TEE and chainload
+   barebox from a boot medium.
+2. **barebox proper**: The main bootloader logic. This is always loaded
+   by a prebootloader, which passes it a device tree. It includes drivers for
+   device initialization, environment setup, and booting the OS.
+
+If barebox hangs, it's essential to identify *where* in this process the
+failure occurs. Here's how to debug different stages.
+
+Refer to the :ref:`barebox architecture <architecture>` for more background
+information on the different stages and the images.
+
+Completely silent console
+=========================
+
+Even the barebox prebootloader is most often loaded by another
+bootloader. This is commonly a mask BootROM hardwired into the
+System-on-chip.
+
+**Common problems**:
+
+- Wrong bootloader image or format
+- Bootloader installed to wrong location
+- System hang before serial driver probe
+- ``CONFIG_DEBUG_LL`` enabled, but misconfigured
+
+**What to try**:
+
+- Check for BootROM boot indicators:
+
+  Some BootROMs (e.g. AT91) write to a serial port when they start up
+  or blink a GPIO (e.g. STM32MP) if they fail to boot the next stage
+  bootloader.
+
+- Check that barebox is in the format and at the location that the
+  previous stage bootloader expects. Compare with a previously working
+  bootloader image, refer to the barebox documentation and/or the
+  vendor documentation or ask around.
+
+- Enable ``CONFIG_DEBUG_LL``
+
+  This enables very early low-level UART debugging.
+  It bypasses console frameworks and writes directly to UART registers.
+  Many boards in barebox print a ``>`` character when ``CONFIG_DEBUG_LL``
+  is enabled. If you see such a character after enabling ``DEBUG_LL``, it
+  indicates that the barebox prebootloader has been found and control was
+  successfully handed over to it. Note that on some SoCs, ``DEBUG_LL``
+  requires co-operation from the board entry point, e.g., the pin muxing for
+  the serial console needs to be done in software in some situations before
+  the UART is accessible from the outside.
+
+  .. note::
+     Make sure the correct UART index or address is selected under
+     **Kernel low-level debugging port** in ``menuconfig``.
+     Configuring the wrong UART might hang your system, because barebox would
+     be tricked into accessing hardware that's not there or is powered off.
+     The numbering/addresses of ports are described in the System-on-Chip
+     datasheet or reference manual and may differ from labels on the hardware.
+     Refer to the config symbol help text and ``/chosen/stdout-path`` in the
+     device tree if unsure.
+
+- Enable ``CONFIG_PBL_CONSOLE`` and ``CONFIG_DEBUG_PBL``
+
+  For boards that don't have an early ``putc_ll('>');``, the first output
+  being printed is often the debugging output from the uncompress entry
+  point (``barebox_pbl_start()``). Enable these options to see if the
+  CPU gets that far.
+
+  .. warning::
+     ``CONFIG_DEBUG_PBL`` increases the size of the PBL, which can make it
+     exceed a hard limit imposed by a previous stage bootloader.
+     Best case, this will be caught by the build system, but it might not be
+     if you are adding a new board and haven't told the build system the limit yet.
+
+- Toggle a GPIO from the board entry point
+
+  A number of platforms (e.g. i.MX or STM32MP) have header-only GPIO helper
+  functions that can be used to toggle a GPIO. These can be used for
+  debugging early hangs by toggling an LED for example.
+
+- Trace BootROM activity
+
+  If you have no indication that the barebox prebootloader is being started,
+  consider tracing what the BootROM is doing, e.g. via JTAG or a logic analyzer
+  for the SD-Card.
+
+If you managed to get some serial output, move along to the next step.
+
+Hang after first stage PBL console output
+=========================================
+
+The first stage prebootloader handles basic initialization
+(e.g., clocks, SDRAM), the installation of secure firmware
+if applicable and, finally, the invocation of the second
+stage.
+
+**Common problems**:
+
+- issues in board entry point
+- Hang in firmware
+
+**What to try**:
+
+- Check where hang occurs
+
+  If you get just some early output, you'll need to pinpoint where the issue
+  occurs. If enabling ``CONFIG_PBL_CONSOLE`` along with a correctly configured
+  ``CONFIG_DEBUG_PBL`` doesn't help, try adding ``putc_ll('@')`` (or any other
+  character) to find out where the startup is stuck. ``putc_ll`` has the
+  benefit of being usable everywhere, even before ``setup_c()`` or
+  ``relocate_to_current_adr()`` is called. Once these are called, you may
+  also use ``puts_ll()`` or just normal ``printf`` if ``CONFIG_PBL_CONSOLE=y``.
+
+- Check if hang occurs in other loaded firmware
+
+  On platforms like i.MX8/9 and RK35xx, barebox will install Arm Trusted
+  Firmware (TF-A) as secure monitor and possibly OP-TEE as secure OS.
+  Hangs can happen if TF-A or OP-TEE is configured to access the wrong
+  console (hang/abort on accessing peripheral with gated clock).
+  If output ends with the banner of the firmware, jumping back to barebox
+  may have failed. In that case, double check that the memory size
+  configured for TF-A/OP-TEE is correct and that the entry addresses
+  used in barebox and TF-A/OP-TEE are identical.
+
+Hang during chainloading
+========================
+
+Once basic system initialization is done, the barebox prebootloader
+will load the second stage.
+
+**Common problems**:
+
+- wrong SDRAM setup
+- corrupted barebox proper read from boot medium
+
+**What to try**:
+
+- Check computed addresses
+
+  If your last output is ``jumping to uncompressed image``, this suggests that
+  the hang occurred while trying to execute barebox proper. barebox prints
+  the regions it uses for its stack, barebox itself and the initial RAM
+  as debug output. Verify these against the actual size of RAM installed and
+  check if values are sane.
+
+- Check that barebox was loaded correctly
+
+  You can enable ``CONFIG_COMPILE_TEST`` and ``CONFIG_PBL_VERIFY_PIGGY``
+  to have the barebox build system compute a hash of barebox proper,
+  which the prebootloader will compare against the hash it computes
+  over the compressed data read from the boot medium.
+
+- Check SDRAM setup
+
+  SDRAM setup differs according to the RAM chip being used, the System-on-chip,
+  the PCB traces between them as well as outside factors like temperature.
+  When a System-on-Module is used, the hardware vendor will ideally provide
+  a validated RAM setup to be used. If the RAM layout is custom, the System-on-Chip
+  vendor usually provides tools for calculating initial timings and tuning them
+  at runtime.
+
+  Because writes can be posted, issues with wrongly set up SDRAM may only become
+  apparent on first execution or read and not during mere writing.
+
+  Issues of writes silently misbehaving should be detectable by
+  ``CONFIG_PBL_VERIFY_PIGGY``, which reads back the data to hash it.
+
+  If the prebootloader is already running from SDRAM, boot hangs due to completely
+  wrong SDRAM setup are less likely, but running a memory test from within barebox
+  proper is still recommended.
+
+- Check if an exception happened
+
+  barebox can print symbolized stack traces on exceptions, but support for that
+  is only installed in barebox proper. Early exceptions are currently not enabled
+  by default, but can be enabled manually with ``CONFIG_ARM_EXCEPTIONS_PBL``.
+
+Preinitcall Stage
+=================
+
+The prebootloader ``barebox_pbl_start`` ends up calling ``barebox_non_pbl_start``
+in barebox proper. This function does:
+
+- relocation and setting up the C environment
+- setting up the malloc area and KASAN
+- calling ``start_barebox``, which runs the registered initcalls
+
+**Common problems**:
+
+- None, this is quite straightforward code
+
+**What to try**:
+
+- Check if the code is executed. This can be done with ``putc_ll``. ``printf``
+  is not safe to use everywhere in this function, because the C environment
+  may not be set up yet.
+
+initcall Stage
+==============
+
+After decompression and jumping to barebox proper, barebox will walk through
+the compiled-in initcalls.
+
+**Symptoms**:
+
+- Hangs after PBL output but before typical barebox banners
+
+**What to try**:
+
+- Enable ``CONFIG_DEBUG_INITCALLS`` while ``CONFIG_DEBUG_LL`` is enabled
+
+  This shows output for each initcall level, helping pinpoint where execution stops.
+  ``CONFIG_DEBUG_LL`` is useful here, because it allows showing output, even
+  before the first serial driver is probed.
+
+Driver Probe Stage
+==================
+
+Initcalls don't necessarily correspond to driver probes, as a driver may be
+registered before a device or the device probe may be postponed until resources
+become available.
+
+**Symptoms**:
+
+- Hangs during hardware initialization
+
+**What to try**:
+
+- Enable ``CONFIG_DEBUG_PROBES``
+
+  This prints each driver probe attempt and can help isolate the problematic peripheral.
+
+- Disable drivers selectively to see if a shell can be reached.
+
+Interactive Console
+===================
+
+If you see output only with ``CONFIG_DEBUG_LL``, but not otherwise, you may not
+have any consoles enabled or you are looking at the wrong console.
+
+For testing, you can enable ``CONFIG_CONSOLE_ACTIVATE_ALL`` to have barebox
+proper print out logs on all console devices that it registers.
+
+Once you have the correct console figured out, consider enabling the option
+``CONFIG_CONSOLE_ACTIVATE_ALL_FALLBACK``. This will fall back to activating all
+consoles, when no console was activated by normal means (e.g. via the environment
+or the device tree ``/chosen/stdout-path`` property).
+
+Kernel hang
+===========
+
+**Symptoms**:
+
+- Hang after a line like
+  ``Loaded kernel to 0x40000000, devicetree at 0x41730000``
+
+With kernel hangs, it's important to find out whether the hang still happens in
+barebox or already while executing the kernel.
+Without EFI loader support in barebox, there is no calling back from kernel to barebox,
+so a kernel hanging is usually indicative of an issue within the kernel itself.
+
+It's often useful to copy the kernel image into ``/tmp`` instead of booting directly
+to verify that the hang is not just a very slow network connection for example.
+The ``-v`` option to :ref:`command_cp` is useful for that.
+The file size copied may differ from the original if the means of transport rounds
+up to a specific block size. In that case, round up the size on the host system
+and run a digest function like :ref:`command_md5sum` to check that the image
+was transferred successfully.
+
+If the image is transferred correctly, the :ref:`command_boot` verbosity can be
+increased with each extra ``-v`` option. At higher verbosity levels, this will also print out
+the device tree passed to the kernel. The :ref:`command_of_diff` command is useful
+to :ref:`visualize only the fixups that were applied by barebox to the device tree<of_diff>`.
+
+If you are sure that the kernel is indeed being loaded, the ``earlycon`` kernel
+feature can enable early debugging output before kernel serial drivers are loaded.
+barebox can fix up an earlycon option if ``global.bootm.earlycon=1`` is specified.
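+
+For example, a minimal sequence on the barebox shell could look as follows.
+This is only a sketch: it assumes that the default boot target is the one
+that hangs; otherwise pass the boot target to ``boot`` explicitly.
+
+.. code-block:: sh
+
+   global.bootm.earlycon=1
+   boot -v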
+
+Spurious aborts/hangs
+=====================
+
+**Symptoms**:
+
+- Hangs/panics/aborts that happen in a non-deterministic fashion and whose
+  probability is greatly influenced by enabling/disabling barebox options
+  and corresponding shifts in the barebox binary
+
+It's generally advisable to run a memory test to verify basic operation and to check
+if the RAM size is sane. barebox provides two commands for this: :ref:`command_memtest`
+and :ref:`command_memtester`. In addition, some silicon vendors like NXP provide their
+own memory test blobs, which barebox can load to SRAM via :ref:`command_memcpy` and
+execute using :ref:`command_go`. By having the memory test outside DRAM, a much more
+thorough memory test is possible.
+
+With ``CONFIG_MMU=y``, the decompression of barebox proper in the prebootloader
+and the runtime of barebox proper will execute with MMU enabled for improved performance.
+
+This increase in performance is due to caches and speculative execution.
+barebox will mark memory mapped I/O devices and secure firmware as ineligible for
+being accessed speculatively, but it can only do so if the memory size it's told
+is correct and if secure memory is marked reserved in the device tree.
+
+The memory map as barebox sees it can be printed with the :ref:`command_iomem`
+command. Everything outside ``ram`` regions is mapped non-executable and uncacheable
+by default. Everything inside ``ram`` regions that doesn't have a ``[R]`` next
+to it is cacheable by default. The :ref:`command_mmuinfo` command can be used
+to show specific information about the MMU attributes for an address.
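+
+A short illustration follows. The address is only an example, borrowed from
+the kernel load address in the log line quoted earlier, and needs to be
+replaced with one that is relevant for the board:
+
+.. code-block:: sh
+
+   iomem
+   mmuinfo 0x40000000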
+
+Memory Corruption Issues
+========================
+
+Some hangs might be caused by heap corruption, stack overflows, or use-after-free bugs.
+
+**What to try**:
+
+- Enable ``CONFIG_KASAN`` (Kernel Address Sanitizer)
+
+  This provides runtime memory checking in barebox proper and can detect
+  invalid memory accesses.
+
+  .. warning::
+     KASAN greatly increases memory usage and may itself cause hangs in
+     constrained environments.
+
+
+Summary of Debug Options
+========================
+
++-----------------------------+-------------------------------------------------------+
+| Option                      | Description                                           |
++=============================+=======================================================+
+| CONFIG_DEBUG_LL             | Early low-level UART output                           |
++-----------------------------+-------------------------------------------------------+
+| CONFIG_PBL_CONSOLE          | Print statements from PBL                             |
++-----------------------------+-------------------------------------------------------+
+| CONFIG_DEBUG_PBL            | Enable all debug output in the PBL                    |
++-----------------------------+-------------------------------------------------------+
+| CONFIG_PBL_VERIFY_PIGGY     | Verify barebox proper in PBL before decompression     |
++-----------------------------+-------------------------------------------------------+
+| CONFIG_ARM_EXCEPTIONS_PBL   | Enable exception handlers in PBL                      |
++-----------------------------+-------------------------------------------------------+
+| CONFIG_DEBUG_INITCALLS      | Logs each initcall                                    |
++-----------------------------+-------------------------------------------------------+
+| CONFIG_DEBUG_PROBES         | Logs each driver probe                                |
++-----------------------------+-------------------------------------------------------+
+| CONFIG_KASAN                | Detects memory corruption                             |
++-----------------------------+-------------------------------------------------------+
+
+Final Tips
+==========
+
+- If all else fails, a JTAG debugger to single-step through the code can
+  be very useful. To help with this, ``CONFIG_PBL_BREAK`` triggers an
+  exception at the start of execution of the individual barebox stages,
+  which ``scripts/gdb/helper.py`` can use to correctly set the base
+  address, so symbols are correctly located.
diff --git a/Documentation/devicetree/index.rst b/Documentation/devicetree/index.rst
index 94e8d04f63c3..4f25b6c6869b 100644
--- a/Documentation/devicetree/index.rst
+++ b/Documentation/devicetree/index.rst
@@ -175,6 +175,8 @@ In the ``chosen``-node, barebox fixes up
 These values can be read from the booted linux system in ``/proc/device-tree/``
 or ``/sys/firmware/devicetree/base``.
 
+.. _of_diff:
+
 To see a dry run of what barebox would fixup, the ``of_diff`` command can be
 used::
 
-- 
2.39.5



