* [PATCH 1/2] ARM: i.MX: xload-gpmi-nand: fix bad block mark swapping @ 2021-06-07 9:09 Andrej Picej 2021-06-07 9:09 ` [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page Andrej Picej 0 siblings, 1 reply; 11+ messages in thread From: Andrej Picej @ 2021-06-07 9:09 UTC (permalink / raw) To: barebox Until now GPMI NAND xloader assumed NAND bad block marker start bit was always 0! This is not true for every NAND chip out there. Thus make xloader more robust and port NAND bad block marker swapping functionality from the nand_mxs driver and make use of it in PBL. This patch was tested on a PHYTEC phyCARD i.MX6Q board with next NANDs: - Spansion S34ML08G201FI00, - Samsung K9K8G08U0E and - Winbond W29N08GVSIAA. Signed-off-by: Andrej Picej <andrej.picej@norik.com> --- arch/arm/mach-imx/xload-gpmi-nand.c | 41 +++++++++++++++++++++++++++-- 1 file changed, 39 insertions(+), 2 deletions(-) diff --git a/arch/arm/mach-imx/xload-gpmi-nand.c b/arch/arm/mach-imx/xload-gpmi-nand.c index b3fd479cb..661302762 100644 --- a/arch/arm/mach-imx/xload-gpmi-nand.c +++ b/arch/arm/mach-imx/xload-gpmi-nand.c @@ -1086,6 +1086,43 @@ static int block_is_bad(struct mxs_nand_info *info, int blocknum) return 0; } +/* Function ported from nand_mxs driver -> mxs_nand_swap_block_mark() */ +static void bad_block_marker_swap(struct mxs_nand_info *info, + uint8_t *data_buf, uint8_t *oob_buf) +{ + struct fcb_block *fcb = &info->fcb; + + uint32_t bit_offset; + uint32_t buf_offset; + + uint32_t src; + uint32_t dst; + + /* Location of bad block marker is specified in FCB. */ + bit_offset = fcb->BadBlockMarkerStartBit; + buf_offset = fcb->BadBlockMarkerByte; + + /* + * Get the byte from the data area that overlays the block mark. Since + * the ECC engine applies its own view to the bits in the page, the + * physical block mark won't (in general) appear on a byte boundary in + * the data. + */ + + src = data_buf[buf_offset] >> bit_offset; + src |= data_buf[buf_offset + 1] << (8 - bit_offset); + + dst = oob_buf[0]; + + oob_buf[0] = src; + + data_buf[buf_offset] &= ~(0xff << bit_offset); + data_buf[buf_offset + 1] &= 0xff << bit_offset; + + data_buf[buf_offset] |= dst << bit_offset; + data_buf[buf_offset + 1] |= dst >> (8 - bit_offset); +} + static int read_firmware(struct mxs_nand_info *info, int startpage, void *dest, int len) { @@ -1121,8 +1158,8 @@ static int read_firmware(struct mxs_nand_info *info, int startpage, return ret; } - *((u8 *)dest + fcb->BadBlockMarkerByte) = - *(u8 *)(dest + pagesize); + /* Read DMA completed, now swap the bad block marker. */ + bad_block_marker_swap(info, dest, dest + pagesize); numpages--; dest += pagesize; -- 2.25.1 _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page 2021-06-07 9:09 [PATCH 1/2] ARM: i.MX: xload-gpmi-nand: fix bad block mark swapping Andrej Picej @ 2021-06-07 9:09 ` Andrej Picej 2021-06-07 20:03 ` Trent Piepho 0 siblings, 1 reply; 11+ messages in thread From: Andrej Picej @ 2021-06-07 9:09 UTC (permalink / raw) To: barebox Some NAND update tools/flashers do not take the full advantage of NAND's entire page area for ECC purposes. For example, they might only use 2112 bytes of available 2176 bytes. In this case, ECC parameters have to be read from the FCB table and taken into account in GPMI NAND xloader to properly calculate page data length so DMA chain can be executed correctly. Tested on PHYTEC phyCARD i.MX6Q board with following NANDs: - Samsung K9K8G08U0E (pagesize: 0x800, oobsize: 0x40) - Winbond W29N08GVSIAA (pagesize: 0x800, oobsize: 0x40) and - Spansion S34ML08G201FI00 (pagesize: 0x800, oobsize: 0x80). All NANDs having set ECC strength to 4 (13 bytes) despite Spansion NAND chip supporting ECC strength of 9 (29 bytes). Signed-off-by: Andrej Picej <andrej.picej@norik.com> --- arch/arm/mach-imx/xload-gpmi-nand.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/arch/arm/mach-imx/xload-gpmi-nand.c b/arch/arm/mach-imx/xload-gpmi-nand.c index 661302762..f8c456927 100644 --- a/arch/arm/mach-imx/xload-gpmi-nand.c +++ b/arch/arm/mach-imx/xload-gpmi-nand.c @@ -87,6 +87,7 @@ struct mxs_dma_chan { #define NAND_ONFI_CRC_BASE 0x4f4e #define apbh_dma_is_imx23(aphb) ((apbh)->id == IMX23_DMA) +#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d)) /* udelay() is not available in PBL, need to improvise */ static void __udelay(int us) @@ -324,6 +325,7 @@ static int mxs_nand_read_page(struct mxs_nand_info *info, int writesize, int oobsize, int pagenum, void *databuf, int raw) { void __iomem *bch_regs = info->bch_base; + struct fcb_block *fcb = &info->fcb; unsigned column = 0; struct mxs_dma_cmd *d; int cmd_queue_len; @@ -332,10 +334,17 @@ static int mxs_nand_read_page(struct mxs_nand_info *info, int writesize, uint8_t *status; int i; int timeout; + int readtotal, nchunks, eccstrength; int descnum = 0; int max_pagenum = info->nand_size / info->organization.pagesize; + eccstrength = fcb->EccBlockNEccType << 1; + nchunks = writesize / MXS_NAND_CHUNK_DATA_CHUNK_SIZE; + readtotal = fcb->MetadataBytes; + readtotal += fcb->EccBlockNSize * nchunks; + readtotal += DIV_ROUND_UP(13 * eccstrength * nchunks, 8); + memset(info->desc, 0, sizeof(*info->desc) * MXS_NAND_DMA_DESCRIPTOR_COUNT); @@ -418,12 +427,12 @@ static int mxs_nand_read_page(struct mxs_nand_info *info, int writesize, GPMI_CTRL0_WORD_LENGTH | GPMI_CTRL0_CS(info->cs) | GPMI_CTRL0_ADDRESS_NAND_DATA | - (writesize + oobsize); + readtotal; d->pio_words[1] = 0; d->pio_words[2] = GPMI_ECCCTRL_ENABLE_ECC | GPMI_ECCCTRL_ECC_CMD_DECODE | GPMI_ECCCTRL_BUFFER_MASK_BCH_PAGE; - d->pio_words[3] = writesize + oobsize; + d->pio_words[3] = readtotal; d->pio_words[4] = (dma_addr_t)databuf; d->pio_words[5] = (dma_addr_t)(databuf + writesize); @@ -436,7 +445,7 @@ static int mxs_nand_read_page(struct mxs_nand_info *info, int writesize, GPMI_CTRL0_WORD_LENGTH | GPMI_CTRL0_CS(info->cs) | GPMI_CTRL0_ADDRESS_NAND_DATA | - (writesize + oobsize); + readtotal; } /* Compile DMA descriptor - de-assert the NAND lock and interrupt. */ -- 2.25.1 _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page 2021-06-07 9:09 ` [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page Andrej Picej @ 2021-06-07 20:03 ` Trent Piepho 2021-06-08 6:28 ` Sascha Hauer 2021-06-08 7:23 ` Andrej Picej 0 siblings, 2 replies; 11+ messages in thread From: Trent Piepho @ 2021-06-07 20:03 UTC (permalink / raw) To: Andrej Picej; +Cc: Barebox List On Mon, Jun 7, 2021 at 2:32 AM Andrej Picej <andrej.picej@norik.com> wrote: > Some NAND update tools/flashers do not take the full advantage of NAND's > entire page area for ECC purposes. For example, they might only use 2112 > bytes of available 2176 bytes. In this case, ECC parameters have to be > read from the FCB table and taken into account in GPMI NAND xloader to > properly calculate page data length so DMA chain can be executed > correctly. > > Tested on PHYTEC phyCARD i.MX6Q board with following NANDs: > - Samsung K9K8G08U0E (pagesize: 0x800, oobsize: 0x40) > - Winbond W29N08GVSIAA (pagesize: 0x800, oobsize: 0x40) and > - Spansion S34ML08G201FI00 (pagesize: 0x800, oobsize: 0x80). > > All NANDs having set ECC strength to 4 (13 bytes) despite Spansion NAND > chip supporting ECC strength of 9 (29 bytes). There is a bug in NXP's latest imx kernel, lf-5.10.y-1.0.0, that results in the kernel driver incorrectly using the minimum ECC specified in the ONFI nand specs instead of calculating a maximal ecc value and using that, which is what prior kernels and the upstream kernel use. It was caused by incorrectly resolving a conflict when they rebased one of their old patches to 5.10. The common pagesize 0x800, oobsize 0x40 should use 8-bit ECC. That's what the uboot, barebox, and linux drivers would do since the first mxs nand support years ago. It's only the recent kernel bug in nxp's kernel that will choose 4. So rather than switch to 4-bit, it would be better to fix these boards to use 8-bit like they should. More reliable ECC, and it will work correctly on barebox, u-boot, old imx kernels, current upstream kernels, and hopefully future imx kernels. Using the FCB data here might not be such a good idea. While it seems like the right thing, there are some issues: The barebox main gpmi nand driver doesn't use the FCB U-boot doesn't use the FCB No Linux kernel uses the FCB If you try to read/write nand from any of those places, it won't work. The only way to make it work, is to have the FCB match what those drivers do. I think it would have been better if the original design had been for the bootloader to read the FCB, use that to load the kernel, and then fixup the ECC config into the device tree for the kernel to use too. One source, the FCB, which is propagated to all users. Everyone will agree on the ECC and there are no independent settings to keep in sync. But they didn't do that. Each driver figures it out on it's own and hopefully they use matching algorithms that arrive at the same answer. But of course this fails, like with nxp's lf-5.10.y-1.0.0 kernel. This isn't the first time, this same type of bug appeared back in 2013 in 2febcdf84b and was fixed in 031e2777e. So while your commit will allow these boards using poorly chosen FCB values to work with the xloader, they will be corrupted if nand is written to from barebox non-xload or from linux. _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page 2021-06-07 20:03 ` Trent Piepho @ 2021-06-08 6:28 ` Sascha Hauer 2021-06-08 7:23 ` Andrej Picej 1 sibling, 0 replies; 11+ messages in thread From: Sascha Hauer @ 2021-06-08 6:28 UTC (permalink / raw) To: Trent Piepho; +Cc: Andrej Picej, Barebox List On Mon, Jun 07, 2021 at 01:03:41PM -0700, Trent Piepho wrote: > On Mon, Jun 7, 2021 at 2:32 AM Andrej Picej <andrej.picej@norik.com> wrote: > > Some NAND update tools/flashers do not take the full advantage of NAND's > > entire page area for ECC purposes. For example, they might only use 2112 > > bytes of available 2176 bytes. In this case, ECC parameters have to be > > read from the FCB table and taken into account in GPMI NAND xloader to > > properly calculate page data length so DMA chain can be executed > > correctly. > > > > Tested on PHYTEC phyCARD i.MX6Q board with following NANDs: > > - Samsung K9K8G08U0E (pagesize: 0x800, oobsize: 0x40) > > - Winbond W29N08GVSIAA (pagesize: 0x800, oobsize: 0x40) and > > - Spansion S34ML08G201FI00 (pagesize: 0x800, oobsize: 0x80). > > > > All NANDs having set ECC strength to 4 (13 bytes) despite Spansion NAND > > chip supporting ECC strength of 9 (29 bytes). > > There is a bug in NXP's latest imx kernel, lf-5.10.y-1.0.0, that > results in the kernel driver incorrectly using the minimum ECC > specified in the ONFI nand specs instead of calculating a maximal ecc > value and using that, which is what prior kernels and the upstream > kernel use. It was caused by incorrectly resolving a conflict when > they rebased one of their old patches to 5.10. > > The common pagesize 0x800, oobsize 0x40 should use 8-bit ECC. That's > what the uboot, barebox, and linux drivers would do since the first > mxs nand support years ago. It's only the recent kernel bug in nxp's > kernel that will choose 4. > > So rather than switch to 4-bit, it would be better to fix these boards > to use 8-bit like they should. More reliable ECC, and it will work > correctly on barebox, u-boot, old imx kernels, current upstream > kernels, and hopefully future imx kernels. > > Using the FCB data here might not be such a good idea. While it seems > like the right thing, there are some issues: > The barebox main gpmi nand driver doesn't use the FCB > U-boot doesn't use the FCB > No Linux kernel uses the FCB > > If you try to read/write nand from any of those places, it won't work. > The only way to make it work, is to have the FCB match what those > drivers do. > > I think it would have been better if the original design had been for > the bootloader to read the FCB, use that to load the kernel, and then > fixup the ECC config into the device tree for the kernel to use too. > One source, the FCB, which is propagated to all users. Everyone will > agree on the ECC and there are no independent settings to keep in > sync. We could still go that path. The properties are all there, we have nand-ecc-strength, nand-ecc-step-size and even nand-ecc-maximize properties. The Linux GPMI driver doesn't honour these flags of course, but that could be changed. With that we could ensure that at least barebox and the kernel are consistent. Sascha -- Pengutronix e.K. | | Steuerwalder Str. 21 | http://www.pengutronix.de/ | 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 | _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page 2021-06-07 20:03 ` Trent Piepho 2021-06-08 6:28 ` Sascha Hauer @ 2021-06-08 7:23 ` Andrej Picej 2021-06-08 12:38 ` Trent Piepho 1 sibling, 1 reply; 11+ messages in thread From: Andrej Picej @ 2021-06-08 7:23 UTC (permalink / raw) To: Trent Piepho; +Cc: Barebox List Hi Trent, firstly thanks for your input. Please find my comments bellow. On 7. 06. 21 22:03, Trent Piepho wrote: > On Mon, Jun 7, 2021 at 2:32 AM Andrej Picej <andrej.picej@norik.com> wrote: >> Some NAND update tools/flashers do not take the full advantage of NAND's >> entire page area for ECC purposes. For example, they might only use 2112 >> bytes of available 2176 bytes. In this case, ECC parameters have to be >> read from the FCB table and taken into account in GPMI NAND xloader to >> properly calculate page data length so DMA chain can be executed >> correctly. >> >> Tested on PHYTEC phyCARD i.MX6Q board with following NANDs: >> - Samsung K9K8G08U0E (pagesize: 0x800, oobsize: 0x40) >> - Winbond W29N08GVSIAA (pagesize: 0x800, oobsize: 0x40) and >> - Spansion S34ML08G201FI00 (pagesize: 0x800, oobsize: 0x80). >> >> All NANDs having set ECC strength to 4 (13 bytes) despite Spansion NAND >> chip supporting ECC strength of 9 (29 bytes). > > There is a bug in NXP's latest imx kernel, lf-5.10.y-1.0.0, that > results in the kernel driver incorrectly using the minimum ECC > specified in the ONFI nand specs instead of calculating a maximal ecc > value and using that, which is what prior kernels and the upstream > kernel use. It was caused by incorrectly resolving a conflict when > they rebased one of their old patches to 5.10. > > The common pagesize 0x800, oobsize 0x40 should use 8-bit ECC. That's > what the uboot, barebox, and linux drivers would do since the first > mxs nand support years ago. It's only the recent kernel bug in nxp's > kernel that will choose 4. OK, I wasn't aware of this kernel bug, but this is not what we are trying to fix here. Our use-case for this, is migration from eboot (some old WinCE version) to barebox with some proprietary flasher tool. This tool uses NAND settings used by eboot, which are hardcoded to fixed pagesize of 0x800 bytes and oobsize of 0x40 bytes (8 ECC bits). If for example some other NAND is used with different page size (e.g. pagesize of 0x800 bytes with oobsize of 0x80 bytes) the BCH ECC page organization will only use 0x840 bytes. > > So rather than switch to 4-bit, it would be better to fix these boards > to use 8-bit like they should. More reliable ECC, and it will work > correctly on barebox, u-boot, old imx kernels, current upstream > kernels, and hopefully future imx kernels. I agree that it would be better to use all of the space available, but if flasher used wrong settings to copy barebox binary to NAND these settings (although not optimal) should be used to make booting even possible. > > Using the FCB data here might not be such a good idea. While it seems > like the right thing, there are some issues: > The barebox main gpmi nand driver doesn't use the FCB > U-boot doesn't use the FCB > No Linux kernel uses the FCB The main reason why I think we should use FCB here for this is because i.MX6's ROM already uses these values for booting into pre-bootloader. That's why we try to act in xloader like ROM does (reading NAND parameters from FCB). Nevertheless flasher tools should be responsible to match the BCH ECC page with what it is written into FCB. If that is not the case then we can only presume that the flasher used the optimal size for ECC. > > If you try to read/write nand from any of those places, it won't work. > The only way to make it work, is to have the FCB match what those > drivers do. In our case the described proprietary flasher tool only flashes barebox so only NAND pages with barebox binary are using not optimal ECC settings. If for example kernel, devicetree and rootfs would be flashed from barebox the NAND pages there would use correct ECC size and booting into linux and updating those NAND pages from linux works. Updating barebox from barebox itself (using barebox_update) would mean that the barebox binary will be overwritten in NAND with optimal ECC settings and FCB will be updated accordingly. > > I think it would have been better if the original design had been for > the bootloader to read the FCB, use that to load the kernel, and then > fixup the ECC config into the device tree for the kernel to use too. > One source, the FCB, which is propagated to all users. Everyone will > agree on the ECC and there are no independent settings to keep in > sync. > > But they didn't do that. Each driver figures it out on it's own and > hopefully they use matching algorithms that arrive at the same answer. > But of course this fails, like with nxp's lf-5.10.y-1.0.0 kernel. > This isn't the first time, this same type of bug appeared back in 2013 > in 2febcdf84b and was fixed in 031e2777e. > > So while your commit will allow these boards using poorly chosen FCB > values to work with the xloader, they will be corrupted if nand is > written to from barebox non-xload or from linux. > We are only using this ECC values to read barebox binary from NAND and copy it to RAM. If other NAND pages will be using different ECC values that doesn't break anything, I think. Only problem that I can see here is barebox or linux reading NAND pages occupied by barebox binary, this will most likely fail, but I don't see why that would be necessary anyway. I don't think we are braking anything here, we are just fixing booting barebox from NAND whit not optimal ECC settings. Please correct me if I'm wrong or if I'm missing something here? BR, Andrej _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page 2021-06-08 7:23 ` Andrej Picej @ 2021-06-08 12:38 ` Trent Piepho 2021-06-09 6:34 ` Andrej Picej 0 siblings, 1 reply; 11+ messages in thread From: Trent Piepho @ 2021-06-08 12:38 UTC (permalink / raw) To: Andrej Picej; +Cc: Barebox List On Tue, Jun 8, 2021 at 12:23 AM Andrej Picej <andrej.picej@norik.com> wrote: > On 7. 06. 21 22:03, Trent Piepho wrote: > > On Mon, Jun 7, 2021 at 2:32 AM Andrej Picej <andrej.picej@norik.com> wrote: > >> - Samsung K9K8G08U0E (pagesize: 0x800, oobsize: 0x40) > >> - Winbond W29N08GVSIAA (pagesize: 0x800, oobsize: 0x40) and > >> - Spansion S34ML08G201FI00 (pagesize: 0x800, oobsize: 0x80). > >> > >> All NANDs having set ECC strength to 4 (13 bytes) despite Spansion NAND > >> chip supporting ECC strength of 9 (29 bytes). > tool uses NAND settings used by eboot, which are hardcoded to fixed > pagesize of 0x800 bytes and oobsize of 0x40 bytes (8 ECC bits). If for Ok, so 4 ecc bits was used for testing, but your actual use case is for flash that uses 8 bits when NAND has 128 OOB bytes, which the current code uses a value different than 8? My calculation is that 0x800+0x80 would use 18 bit ECC. But really, the exact numbers don't matter. Just that your nand flash tool, barebox xload, barebox main, uboot, uboot spl, linux, kobs-ng, etc. don't all agree on ECC values. > I agree that it would be better to use all of the space available, but > if flasher used wrong settings to copy barebox binary to NAND these > settings (although not optimal) should be used to make booting even > possible. But, how does one know 2nd stage barebox is flashed with the same ECC as 1st stage xload? See below. > > The main reason why I think we should use FCB here for this is because > i.MX6's ROM already uses these values for booting into pre-bootloader. > That's why we try to act in xloader like ROM does (reading NAND > parameters from FCB). Nevertheless flasher tools should be responsible > to match the BCH ECC page with what it is written into FCB. If that is I think it's fair to assume that the barebox xload is using the ECC from the FCB, otherwise it would not boot. But does barebox 2nd stage use same ECC as xload? In your case, the answer is currently yes. But is this always the case? I don't know of a specific board where it is not, but I do know this: It is common that a Linux based software update system will not update the bootloader. It might just do rootfs, or rootfs+kernel, but bootloader is less common. In two a stage system, xload + main, maybe the xload is not updated. It is a pain from Linux, with different versions of kobs and/or kobs-ng, which are poorly maintained and documented, a special attribute in sysfs that old Freescale kernels had and that isn't around anymore that is sometimes needed and sometimes not, etc. And as I have just discovered, iMX6UL and iMX6ULL use a different encoding of FCB that all other iMX and of course some kobs-ng versions don't know this and create a broken FCB. I even made a system that did this: barebox-xload had A/B support for 2nd stage and 2nd stage was updated, but the xload wasn't, since it wasn't fail-safe. But this was for CycloneV and doesn't apply here. So, suppose we have updated barebox 2nd stage from Linux (or barebox)? Now it uses "common" ECC values (IMHO, "optimal" is not an accurate term here) from Linux kernel. Barebox-xload current works to boot this, but your change will break that. It is a difficult problem, either choice of a ECC values could be the correct one. > In our case the described proprietary flasher tool only flashes barebox > so only NAND pages with barebox binary are using not optimal ECC > settings. If for example kernel, devicetree and rootfs would be flashed > from barebox the NAND pages there would use correct ECC size and booting > into linux and updating those NAND pages from linux works. Updating > barebox from barebox itself (using barebox_update) would mean that the > barebox binary will be overwritten in NAND with optimal ECC settings and > FCB will be updated accordingly. Does barebox_update run in 2nd stage barebox update both 2nd stage barebox and barebox-xload + FCB? Consider what happens if barebox 2nd stage is updated from Linux. Usually software update systems run on Linux, e.g. rauc or mender. In this case it will use Linux ECC settings, not FCB settings. You've got boards with barebox-xload and barebox using different ECC settings than kernel and rootfs. And not just two different settings, but also 2nd stage barebox and Linux don't know this. I predict this will be a source of much future pain. > We are only using this ECC values to read barebox binary from NAND and > copy it to RAM. If other NAND pages will be using different ECC values > that doesn't break anything, I think. Only problem that I can see here > is barebox or linux reading NAND pages occupied by barebox binary, this > will most likely fail, but I don't see why that would be necessary anyway. > > I don't think we are braking anything here, we are just fixing booting > barebox from NAND whit not optimal ECC settings. > > Please correct me if I'm wrong or if I'm missing something here? You've got ECC settings for: (xload barebox) (kernel rootfs) But if someone had this: (xload) (barebox kernel rootfs) Then it breaks. Why would they have that? As I describe above, everything in the 2nd set is updated from Linux using some software update system. Of course, the most common way is this: (xload barebox kernel rootfs) With just one set, when the xload has two choices, FCB vs common values, both are the same, so even if barebox is updated from Linux it still works. A solution that works for boths cases, but is also ugly and difficult, is to try both. If xload sees FCB values != calculated values, then just try both settings. One is virtually assured that the incorrect settings will produce massive numbers of errors from BCH. Read a couple pages and the settings which result in uncorrectable ECC errors on all pages are the wrong ones. _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page 2021-06-08 12:38 ` Trent Piepho @ 2021-06-09 6:34 ` Andrej Picej 2021-06-14 22:14 ` Trent Piepho 0 siblings, 1 reply; 11+ messages in thread From: Andrej Picej @ 2021-06-09 6:34 UTC (permalink / raw) To: Trent Piepho; +Cc: Barebox List On 8. 06. 21 14:38, Trent Piepho wrote: > On Tue, Jun 8, 2021 at 12:23 AM Andrej Picej <andrej.picej@norik.com> wrote: >> On 7. 06. 21 22:03, Trent Piepho wrote: >>> On Mon, Jun 7, 2021 at 2:32 AM Andrej Picej <andrej.picej@norik.com> wrote: >>>> - Samsung K9K8G08U0E (pagesize: 0x800, oobsize: 0x40) >>>> - Winbond W29N08GVSIAA (pagesize: 0x800, oobsize: 0x40) and >>>> - Spansion S34ML08G201FI00 (pagesize: 0x800, oobsize: 0x80). >>>> >>>> All NANDs having set ECC strength to 4 (13 bytes) despite Spansion NAND >>>> chip supporting ECC strength of 9 (29 bytes). > >> tool uses NAND settings used by eboot, which are hardcoded to fixed >> pagesize of 0x800 bytes and oobsize of 0x40 bytes (8 ECC bits). If for > > Ok, so 4 ecc bits was used for testing, but your actual use case is > for flash that uses 8 bits when NAND has 128 OOB bytes, which the > current code uses a value different than 8? My calculation is that > 0x800+0x80 would use 18 bit ECC. Actually 8 ECC bits was used for testing. Maybe it was wrong that I named EccBlockNEccType (from i.MX 6Dual/6Quad Applications Processor Reference Manual) as ECC strength (in commit message) as it gets shifted to the left for one bit to get ECC size in bits. So yes, we agree, 8 bit ECC for 0x800+0x80 (4<<1 = 8) and 18 bit ECC for 0x800+0x80 (9<<1 = 18). > > But really, the exact numbers don't matter. Just that your nand flash > tool, barebox xload, barebox main, uboot, uboot spl, linux, kobs-ng, > etc. don't all agree on ECC values. > >> I agree that it would be better to use all of the space available, but >> if flasher used wrong settings to copy barebox binary to NAND these >> settings (although not optimal) should be used to make booting even >> possible. > > But, how does one know 2nd stage barebox is flashed with the same ECC > as 1st stage xload? See below. > >> >> The main reason why I think we should use FCB here for this is because >> i.MX6's ROM already uses these values for booting into pre-bootloader. >> That's why we try to act in xloader like ROM does (reading NAND >> parameters from FCB). Nevertheless flasher tools should be responsible >> to match the BCH ECC page with what it is written into FCB. If that is > > I think it's fair to assume that the barebox xload is using the ECC > from the FCB, otherwise it would not boot. But does barebox 2nd stage > use same ECC as xload? In your case, the answer is currently yes. > But is this always the case? > > I don't know of a specific board where it is not, but I do know this: > It is common that a Linux based software update system will not update > the bootloader. It might just do rootfs, or rootfs+kernel, but > bootloader is less common. In two a stage system, xload + main, maybe > the xload is not updated. It is a pain from Linux, with different > versions of kobs and/or kobs-ng, which are poorly maintained and > documented, a special attribute in sysfs that old Freescale kernels > had and that isn't around anymore that is sometimes needed and > sometimes not, etc. And as I have just discovered, iMX6UL and iMX6ULL > use a different encoding of FCB that all other iMX and of course some > kobs-ng versions don't know this and create a broken FCB. > > I even made a system that did this: barebox-xload had A/B support for > 2nd stage and 2nd stage was updated, but the xload wasn't, since it > wasn't fail-safe. But this was for CycloneV and doesn't apply here. > > So, suppose we have updated barebox 2nd stage from Linux (or barebox)? > Now it uses "common" ECC values (IMHO, "optimal" is not an accurate > term here) from Linux kernel. Barebox-xload current works to boot > this, but your change will break that. OK, I see. This is a valid point. Didn't really understand that updating only 2nd stage barebox is a common practice. Do you know of any imx6 board that does that, because this xloader is imx6 specific? > > It is a difficult problem, either choice of a ECC values could be the > correct one. Yes I agree, either way we break booting in one of our use cases. In my case pre-bootloader wouldn't get correctly read and in your case main bootloader wouldn't get correctly read. > >> In our case the described proprietary flasher tool only flashes barebox >> so only NAND pages with barebox binary are using not optimal ECC >> settings. If for example kernel, devicetree and rootfs would be flashed >> from barebox the NAND pages there would use correct ECC size and booting >> into linux and updating those NAND pages from linux works. Updating >> barebox from barebox itself (using barebox_update) would mean that the >> barebox binary will be overwritten in NAND with optimal ECC settings and >> FCB will be updated accordingly. > > Does barebox_update run in 2nd stage barebox update both 2nd stage > barebox and barebox-xload + FCB? Yes, it does. > > Consider what happens if barebox 2nd stage is updated from Linux. > Usually software update systems run on Linux, e.g. rauc or mender. In > this case it will use Linux ECC settings, not FCB settings. > > You've got boards with barebox-xload and barebox using different ECC > settings than kernel and rootfs. And not just two different settings, > but also 2nd stage barebox and Linux don't know this. I predict this > will be a source of much future pain. > > >> We are only using this ECC values to read barebox binary from NAND and >> copy it to RAM. If other NAND pages will be using different ECC values >> that doesn't break anything, I think. Only problem that I can see here >> is barebox or linux reading NAND pages occupied by barebox binary, this >> will most likely fail, but I don't see why that would be necessary anyway. >> >> I don't think we are braking anything here, we are just fixing booting >> barebox from NAND whit not optimal ECC settings. >> >> Please correct me if I'm wrong or if I'm missing something here? > > You've got ECC settings for: > (xload barebox) (kernel rootfs) > But if someone had this: > (xload) (barebox kernel rootfs) > Then it breaks. Yes I agree, as i already wrote above, I didn't know this is common way of doing bootloader update. > > Why would they have that? As I describe above, everything in the 2nd > set is updated from Linux using some software update system. > > Of course, the most common way is this: > (xload barebox kernel rootfs) > > With just one set, when the xload has two choices, FCB vs common > values, both are the same, so even if barebox is updated from Linux it > still works. > > A solution that works for boths cases, but is also ugly and difficult, > is to try both. If xload sees FCB values != calculated values, then > just try both settings. One is virtually assured that the incorrect > settings will produce massive numbers of errors from BCH. Read a > couple pages and the settings which result in uncorrectable ECC errors > on all pages are the wrong ones. > Yes that would be an ugly fix for this. But I see one problem. If different ECC values are used for pre-bootloader and main bootloader (like it is the case in example that you provided) we would have to read pre-bootloader and main bootloader with different ECC settings. So the xload would look something like: - read a couple of pages from pre-bootloader and select appropriate "readtotal_pbl" - copy pre-bootloader to RAM with selected "readtotal_pbl" - read a couple of pages from main-bootloader and select appropriate "readtotal_main" - copy the remaining pages (main barebox) to RAM with selected "readtotal_main" Now for this we would need to find out where PBL ends and main barebox starts (probably from boot data?). This would solve all of the problems right? But is this all needed for such extreme use case? As I said, I don't know how common it is for user to update only 2nd stage barebox, and how common it is to use flasher tools which would use different ECC settings than barebox and kernel for example. Both of these are needed to get ECC mismatch. And I can't think of other cases where a mismatch between ECC settings between pre-bootloader and 2nd stage barebox would happen. BR, Andrej _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page 2021-06-09 6:34 ` Andrej Picej @ 2021-06-14 22:14 ` Trent Piepho 2021-06-15 14:35 ` Sascha Hauer 0 siblings, 1 reply; 11+ messages in thread From: Trent Piepho @ 2021-06-14 22:14 UTC (permalink / raw) To: Andrej Picej; +Cc: Barebox List On Tue, Jun 8, 2021 at 11:34 PM Andrej Picej <andrej.picej@norik.com> wrote: > On 8. 06. 21 14:38, Trent Piepho wrote: > > On Tue, Jun 8, 2021 at 12:23 AM Andrej Picej <andrej.picej@norik.com> wrote: > >> On 7. 06. 21 22:03, Trent Piepho wrote: > > Ok, so 4 ecc bits was used for testing, but your actual use case is > > for flash that uses 8 bits when NAND has 128 OOB bytes, which the > > current code uses a value different than 8? My calculation is that > > 0x800+0x80 would use 18 bit ECC. > > Actually 8 ECC bits was used for testing. Maybe it was wrong that I > named EccBlockNEccType (from i.MX 6Dual/6Quad Applications Processor > Reference Manual) as ECC strength (in commit message) as it gets shifted > to the left for one bit to get ECC size in bits. So yes, we agree, 8 bit > ECC for 0x800+0x80 (4<<1 = 8) and 18 bit ECC for 0x800+0x80 (9<<1 = 18). Ok, I see. I discovered this too, as the kernel bug caused Linux to use 4 bit ECC instead of 8 bit. I instrumented Linux driver and found it was using 4 bit. I inspected the FCB manually and it was 0x04, so 4 bit ECC is connect? But I have all these NAND errors! I was sure it would be bad BCH config! Maybe NAND timing? Bad PCB? Nope. FCB ECC field is half the real ECC value and this is not documented in RM. If the FCB is 0x04 that is called 8 bits in all BCH documentation in iMX RM and also Linux driver, dts properties, and so on. > OK, I see. This is a valid point. Didn't really understand that updating > only 2nd stage barebox is a common practice. Do you know of any imx6 > board that does that, because this xloader is imx6 specific? I do not see any imx nand xloader users at all in mainlinux Barebox codebase. Which is annoying, since I'm trying to boot barebox from NAND on iMX6ULL and there are no boards in barebox that do this. So I have to do it from scratch, even though I know such boards exist and in fact you are using one. Do I do not know of any specific imx6 boards that do this. As there are apparently no imx boards booting barebox from nand.... I do know of a specific cyclone5 board that does this. There is also plenty of documentation that calls for writing 2nd stage bootloader in Linux with nandwrite. So I think it's reasonable someone would do this. I think if I setup rauc to do a software update, I do not see support for FCB update, but there is a raw nand partition that can easily do 2nd stage bootloader. But consider that even if barebox spl can support two different BCH configs on same nand device for booting, neither Barebox nor Linux support this concept at all. BCH must be the same for entire device, even with ecc info in the device tree it cannot be done. If there was a manufacturing step that did a post programming re-flash on first boot, which reprogrammed to standard BCH parameters, this would almost certainly save future pain. > > A solution that works for boths cases, but is also ugly and difficult, > > is to try both. If xload sees FCB values != calculated values, then > > just try both settings. One is virtually assured that the incorrect > > settings will produce massive numbers of errors from BCH. Read a > > couple pages and the settings which result in uncorrectable ECC errors > > on all pages are the wrong ones. > > > > Yes that would be an ugly fix for this. > > But I see one problem. If different ECC values are used for > pre-bootloader and main bootloader (like it is the case in example that > you provided) we would have to read pre-bootloader and main bootloader > with different ECC settings. Since only the ROM loader boots the pre-bootloader, then it should not matter if linux can read it. Well, it would be nice if it worked, but it doesn't need too. > So the xload would look something like: > - read a couple of pages from pre-bootloader and select appropriate > "readtotal_pbl" > - copy pre-bootloader to RAM with selected "readtotal_pbl" > - read a couple of pages from main-bootloader and select appropriate > "readtotal_main" > - copy the remaining pages (main barebox) to RAM with selected > "readtotal_main" > > Now for this we would need to find out where PBL ends and main barebox > starts (probably from boot data?). > > This would solve all of the problems right? I think only the last two steps are needed? Since the xloader is already in RAM, it only needs to load the main bootlaoder. _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page 2021-06-14 22:14 ` Trent Piepho @ 2021-06-15 14:35 ` Sascha Hauer 2021-06-15 20:25 ` Trent Piepho 0 siblings, 1 reply; 11+ messages in thread From: Sascha Hauer @ 2021-06-15 14:35 UTC (permalink / raw) To: Trent Piepho; +Cc: Andrej Picej, Barebox List On Mon, Jun 14, 2021 at 03:14:22PM -0700, Trent Piepho wrote: > On Tue, Jun 8, 2021 at 11:34 PM Andrej Picej <andrej.picej@norik.com> wrote: > > On 8. 06. 21 14:38, Trent Piepho wrote: > > > On Tue, Jun 8, 2021 at 12:23 AM Andrej Picej <andrej.picej@norik.com> wrote: > > >> On 7. 06. 21 22:03, Trent Piepho wrote: > > > Ok, so 4 ecc bits was used for testing, but your actual use case is > > > for flash that uses 8 bits when NAND has 128 OOB bytes, which the > > > current code uses a value different than 8? My calculation is that > > > 0x800+0x80 would use 18 bit ECC. > > > > Actually 8 ECC bits was used for testing. Maybe it was wrong that I > > named EccBlockNEccType (from i.MX 6Dual/6Quad Applications Processor > > Reference Manual) as ECC strength (in commit message) as it gets shifted > > to the left for one bit to get ECC size in bits. So yes, we agree, 8 bit > > ECC for 0x800+0x80 (4<<1 = 8) and 18 bit ECC for 0x800+0x80 (9<<1 = 18). > > Ok, I see. I discovered this too, as the kernel bug caused Linux to > use 4 bit ECC instead of 8 bit. I instrumented Linux driver and found > it was using 4 bit. I inspected the FCB manually and it was 0x04, so > 4 bit ECC is connect? But I have all these NAND errors! I was sure it > would be bad BCH config! Maybe NAND timing? Bad PCB? Nope. FCB ECC > field is half the real ECC value and this is not documented in RM. If > the FCB is 0x04 that is called 8 bits in all BCH documentation in iMX > RM and also Linux driver, dts properties, and so on. > > > OK, I see. This is a valid point. Didn't really understand that updating > > only 2nd stage barebox is a common practice. Do you know of any imx6 > > board that does that, because this xloader is imx6 specific? > > I do not see any imx nand xloader users at all in mainlinux Barebox > codebase. Which is annoying, since I'm trying to boot barebox from > NAND on iMX6ULL and there are no boards in barebox that do this. So I > have to do it from scratch, even though I know such boards exist and > in fact you are using one. > > Do I do not know of any specific imx6 boards that do this. As there > are apparently no imx boards booting barebox from nand.... There are several i.MX6 boards mainline that support booting from NAND, one of them being the phyFLEX-i.MX6 board which I am using all day. It doesn't use any xload mechanism, theres's only one stage in NAND. We also have imx6_nand_start_image() which can be used when a xload mechanism is desired for cases when the SDRAM shall be initialized in code rather than using DCD data. This part indeed has no users in tree, but there shouldn't be much to implement to get this working. I wonder which parts are missing that make you think that booting from NAND ist not supported? > > I do know of a specific cyclone5 board that does this. > > There is also plenty of documentation that calls for writing 2nd stage > bootloader in Linux with nandwrite. So I think it's reasonable > someone would do this. I think if I setup rauc to do a software > update, I do not see support for FCB update, but there is a raw nand > partition that can easily do 2nd stage bootloader. > > But consider that even if barebox spl can support two different BCH > configs on same nand device for booting, neither Barebox nor Linux > support this concept at all. At least the OMAP NAND driver in barebox supports different ECC configs that are switchable during runtime. The ECC scheme once switched is used for the whole device though, what's missing is indeed to attach an ECC scheme to a partition rather than to the whole device. I have that dream each time when hacking OMAP NAND... Sascha -- Pengutronix e.K. | | Steuerwalder Str. 21 | http://www.pengutronix.de/ | 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 | _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page 2021-06-15 14:35 ` Sascha Hauer @ 2021-06-15 20:25 ` Trent Piepho 2021-06-16 7:48 ` Andrej Picej 0 siblings, 1 reply; 11+ messages in thread From: Trent Piepho @ 2021-06-15 20:25 UTC (permalink / raw) To: Sascha Hauer; +Cc: Andrej Picej, Barebox List On Tue, Jun 15, 2021 at 7:35 AM Sascha Hauer <sha@pengutronix.de> wrote: > > > > I do not see any imx nand xloader users at all in mainlinux Barebox > > codebase. Which is annoying, since I'm trying to boot barebox from > > NAND on iMX6ULL and there are no boards in barebox that do this. So I > > have to do it from scratch, even though I know such boards exist and > > in fact you are using one. > > > > Do I do not know of any specific imx6 boards that do this. As there > > are apparently no imx boards booting barebox from nand.... > > There are several i.MX6 boards mainline that support booting from NAND, > one of them being the phyFLEX-i.MX6 board which I am using all day. It > doesn't use any xload mechanism, theres's only one stage in NAND. We > also have imx6_nand_start_image() which can be used when a xload > mechanism is desired for cases when the SDRAM shall be initialized in > code rather than using DCD data. This part indeed has no users in tree, > but there shouldn't be much to implement to get this working. That's the issue, this patch is to the code for xload, which has no in tree users. The patch either doesn't matter, fixes, or breaks a board, depending on what it did with NAND. A normal NAND layout doesn't matter, but there are "wrong", or at least non-standard, layouts that it will either fix or break. Andrej has one real example where it fixes. There are no known specific examples where it would break, but absence of evidence is not evidence of absence! If there was an in-tree board using xload gpmi, then one could also look for a BSP software update system or published software update instructions for that board, and see what they do. For example, Variscite's DART-6UL instructions for u-boot update from Linux in §5.4 of https://variwiki.com/index.php?title=Yocto_NAND_Flash_Burning&release=RELEASE_ZEUS_V1.2_DART-6UL These have both SPL + 2nd stage replacement in one block, which is good and this patch will not break a board flashed this way. Of course, SPL flash uses kobs-ng, which needs the debugfs bch hack that's not in the mainline kernel, so maybe someone skips that step and doesn't update SPL? > > But consider that even if barebox spl can support two different BCH > > configs on same nand device for booting, neither Barebox nor Linux > > support this concept at all. > > At least the OMAP NAND driver in barebox supports different ECC configs > that are switchable during runtime. The ECC scheme once switched is used > for the whole device though, what's missing is indeed to attach an ECC > scheme to a partition rather than to the whole device. I have that dream > each time when hacking OMAP NAND... Even if Barebox could do it, it won't work in Linux, which would be a huge drawback. AFAIK, even the new dts based ECC properties are still attached to a device and not a partition. When hardware ECC first appeared, it seemed obvious to me that an mtd ioctl to get/set ecc values was needed. kobs-ng needs this. The thing I can't remember for the MXS boot system needed it. One could have different ECC for different partitions. One could switch the ECC used when reflashing a partition. But no, we have stuff like the debugfs bch file that is still not mainlined after a decade. _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page 2021-06-15 20:25 ` Trent Piepho @ 2021-06-16 7:48 ` Andrej Picej 0 siblings, 0 replies; 11+ messages in thread From: Andrej Picej @ 2021-06-16 7:48 UTC (permalink / raw) To: Trent Piepho, Sascha Hauer; +Cc: Barebox List Ok, I understand that second patch fixes really specific problem, which is not likely to happen. And if this patch would be applied it would break other specific problem that Trent pointed out. Nevertheless, I think first patch doesn't break anything and only fixes bad block marker swapping. Sascha, if you think only first patch can be applied, should I create a v2 and only send first patch or is this not necessary? Thank you both. _______________________________________________ barebox mailing list barebox@lists.infradead.org http://lists.infradead.org/mailman/listinfo/barebox ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-06-16 7:50 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-06-07 9:09 [PATCH 1/2] ARM: i.MX: xload-gpmi-nand: fix bad block mark swapping Andrej Picej 2021-06-07 9:09 ` [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page Andrej Picej 2021-06-07 20:03 ` Trent Piepho 2021-06-08 6:28 ` Sascha Hauer 2021-06-08 7:23 ` Andrej Picej 2021-06-08 12:38 ` Trent Piepho 2021-06-09 6:34 ` Andrej Picej 2021-06-14 22:14 ` Trent Piepho 2021-06-15 14:35 ` Sascha Hauer 2021-06-15 20:25 ` Trent Piepho 2021-06-16 7:48 ` Andrej Picej
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox