mail archive of the barebox mailing list
From: Tobias Waldekranz <tobias@waldekranz.com>
To: Ahmad Fatoum <a.fatoum@pengutronix.de>, barebox@lists.infradead.org
Subject: Re: [PATCH 1/5] string: add strtok/strtokv
Date: Mon, 08 Sep 2025 11:26:42 +0200
Message-ID: <87ms752t2l.fsf@waldekranz.com>
In-Reply-To: <d321b000-bdbc-479e-886a-568299aebcf1@pengutronix.de>

On Fri, Sep 05, 2025 at 18:28, Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
> Hi,
>
> On 9/4/25 3:35 PM, Tobias Waldekranz wrote:
>> On Thu, Sep 04, 2025 at 13:00, Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
>>> Hello Tobias,
>>>
>>> On 8/28/25 5:05 PM, Tobias Waldekranz wrote:
>>>> Add an implementation of libc's standard strtok(3), which is useful
>>>> for tokenizing strings.
>>>
>>> strtok was previously removed in favor of strsep as it doesn't suffer
>>> from re-entrancy issues (poller and bthreads can run during delays). If
>>> you want to allow escapes, there's also strsep_unescaped.
>> 
>> Aha, my bad. I did not realize that there was more than one thread of
>> execution.
>
> The pollers are run during delay loops for stuff like feeding a
> watchdog, blinking a heartbeat LED or polling network link state.
> Bthreads are currently only used for the USB mass storage gadget[1] and
> out-of-tree for baredoom, so it can be played while booting.

V-e-r-y cool way to demo Barebox, BTW :)
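
The re-entrancy concern quoted above comes from strtok()'s hidden static
cursor. A minimal userspace sketch, assuming a POSIX libc rather than
barebox itself: the nested loop stands in for any other code (a poller or
bthread, say) that tokenizes while the outer loop is between calls. The
function names are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L	/* exposes strtok_r() */
#include <string.h>

/*
 * Count the ','-separated fields of every ';'-separated record using
 * nested strtok() loops. The inner loop overwrites strtok()'s single
 * hidden cursor, so the outer loop loses the remaining records.
 */
int count_fields_strtok(char *buf)
{
	int n = 0;
	char *rec, *field;

	for (rec = strtok(buf, ";"); rec; rec = strtok(NULL, ";"))
		for (field = strtok(rec, ","); field; field = strtok(NULL, ","))
			n++;
	return n;
}

/* Same loops, but each level keeps its own cursor in a local variable. */
int count_fields_strtok_r(char *buf)
{
	int n = 0;
	char *rec, *field, *rec_state, *field_state;

	for (rec = strtok_r(buf, ";", &rec_state); rec;
	     rec = strtok_r(NULL, ";", &rec_state))
		for (field = strtok_r(rec, ",", &field_state); field;
		     field = strtok_r(NULL, ",", &field_state))
			n++;
	return n;
}
```

For a writable copy of "a,b;c,d", the strtok() variant counts 2 fields
because the second record is never visited, while the strtok_r() variant
counts all 4.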

>> strsep() is not quite the same thing though, I am really after the
>> strtok()'s behavior of skipping empty tokens.
>
> Ah, right. strsep is used in a loop, where it's just an extra check to
> skip empty tokens.
>
>> How would you feel about adding strtok_r() instead?
>
> You are not using it anyway, though, so what does it matter compared to
>
> while ((token = strsep(&sep, delims))) {
>         if (!*token)
>                 continue;
>
> ?

I just thought it might be a good hint for future readers: "we're
following strtok() semantics here". Anyway, it is not important, I'll
drop it in v2.
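
For reference, the strsep() loop quoted above can be fleshed out as
follows. This is a userspace sketch that assumes a libc providing strsep()
(glibc, musl, the BSDs); split_nonempty() and its parameters are
hypothetical names, not anything from the patch.

```c
#define _DEFAULT_SOURCE	/* exposes strsep() on glibc */
#include <stddef.h>
#include <string.h>

/*
 * Split s in place on any byte in delims, skipping empty tokens the
 * way strtok() would. Stores at most max pointers in out and returns
 * the number stored. All state lives in the caller's variables.
 */
size_t split_nonempty(char *s, const char *delims, char **out, size_t max)
{
	char *token;
	size_t n = 0;

	while (n < max && (token = strsep(&s, delims))) {
		if (!*token)
			continue;	/* adjacent delimiters yield "" */
		out[n++] = token;
	}
	return n;
}
```

Called on a writable copy of ",,a,,b," with "," as the delimiter set, this
stores "a" and "b" and returns 2; the leading, doubled, and trailing
delimiters produce only empty tokens, which are skipped.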

>>>> Also, add a version that will collect all tokens from a string into an
>>>> array, which is useful in situations where you need to know how many
>>>> tokens there are, and when a token's relative position in the order is
>>>> significant.
>>>
>>> We have the inverse as strjoin, but not this. Maybe call it strsplit
>>> instead?
>> 
>> If you accept my strtok_r() suggestion, do you still think strsplit() is
>> a better name, or is there value in signaling the underlying strtok()
>> behavior?
>
> I can see the argument that strjoin(strsplit(s)) should be s.
>
> Ok, let's keep it at strtokv. Some final bikeshedding: Would it be
> cleaner to return the string argument and have the length be the pointer
> argument?

Agreed, the array does seem like the "primary" return value. I'll swap
them for v2.
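
One way the swapped signature could look, as a sketch only: the actual v2
prototype does not appear in this thread, so the name strtokv_sketch(), the
int *cntp parameter, and the error convention are all assumptions. It uses
plain realloc() from userspace libc in place of barebox's xrealloc(), and a
strsep() loop that skips empty tokens instead of strtok().

```c
#define _DEFAULT_SOURCE	/* exposes strsep() on glibc */
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical v2-style strtokv(): returns the token array and stores
 * the token count in *cntp. Returns NULL with *cntp == 0 when str holds
 * no tokens, or NULL with *cntp == -1 if allocation fails. The tokens
 * point into str, so only the array itself needs free().
 */
char **strtokv_sketch(char *str, const char *delim, int *cntp)
{
	char *tok, **vec = NULL, **tmp;
	int cnt = 0;

	while ((tok = strsep(&str, delim))) {
		if (!*tok)
			continue;	/* skip empty tokens, like strtok() */
		tmp = realloc(vec, (cnt + 1) * sizeof(*vec));
		if (!tmp) {
			free(vec);
			*cntp = -1;
			return NULL;
		}
		vec = tmp;
		vec[cnt++] = tok;
	}

	*cntp = cnt;
	return vec;
}
```

Splitting a writable copy of "0 2048 linear  /dev/sda 0" on " " yields five
tokens despite the doubled space, and the caller can index them by position
(vec[2] is "linear") before freeing the array.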

> [1]: Implementing stackful coroutines was less of a hassle than
> rewriting a complex state machine implemented as a kthread.
>
> Cheers,
> Ahmad
>
>> 
>>> Cheers,
>>> Ahmad
>>>
>>>>
>>>> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
>>>> ---
>>>>  include/string.h |  2 ++
>>>>  lib/string.c     | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  2 files changed, 68 insertions(+)
>>>>
>>>> diff --git a/include/string.h b/include/string.h
>>>> index 71affe48b6..c8df8540d8 100644
>>>> --- a/include/string.h
>>>> +++ b/include/string.h
>>>> @@ -8,6 +8,8 @@
>>>>  void *mempcpy(void *dest, const void *src, size_t count);
>>>>  int strtobool(const char *str, int *val);
>>>>  char *strsep_unescaped(char **, const char *, char *);
>>>> +char *strtok(char *str, const char *delim);
>>>> +int strtokv(char *str, const char *delim, char ***vecp);
>>>>  char *stpcpy(char *dest, const char *src);
>>>>  bool strends(const char *str, const char *postfix);
>>>>  
>>>> diff --git a/lib/string.c b/lib/string.c
>>>> index 73637cd971..be7e65eb45 100644
>>>> --- a/lib/string.c
>>>> +++ b/lib/string.c
>>>> @@ -593,6 +593,72 @@ char *strsep_unescaped(char **s, const char *ct, char *delim)
>>>>          return sbegin;
>>>>  }
>>>>  
>>>> +/**
>>>> + * strtok - extract tokens from string
>>>> + * @str:	string to split
>>>> + * @delim:	set of delimiter characters
>>>> + *
>>>> + * The strtok() function breaks up a string into zero or more nonempty
>>>> + * tokens.  On the first call, the string to be parsed should be
>>>> + * specified in @str.  In each subsequent call that should parse the
>>>> + * same string, @str must be NULL.
>>>> + *
>>>> + * @delim specifies a set of bytes that delimit the tokens in the
>>>> + * string.
>>>> + *
>>>> + * Each call to strtok() returns a pointer to a string containing the
>>>> + * next token.  This is done by replacing the first delimiter with a
>>>> + * NUL character; the operation is thus destructive to the string. If
>>>> + * no more tokens are found, strtok() returns NULL.
>>>> + */
>>>> +char *strtok(char *str, const char *delim)
>>>> +{
>>>> +	static char *cursor;
>>>> +
>>>> +	if (str)
>>>> +		cursor = str;
>>>> +
>>>> +	if (!cursor)
>>>> +		return NULL;
>>>> +
>>>> +	cursor += strspn(cursor, delim);
>>>> +	if (*cursor == '\0') {
>>>> +		cursor = NULL;
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	return strsep(&cursor, delim);
>>>> +}
>>>> +EXPORT_SYMBOL(strtok);
>>>> +
>>>> +/**
>>>> + * strtokv - split string into array of tokens based on a delimiter set
>>>> + * @str:	string to split
>>>> + * @delim:	set of delimiter characters
>>>> + * @vecp:	array of tokens
>>>> + *
>>>> + * Split @str into tokens delimited by @delim, using strtok(), and
>>>> + * store the allocated token array in @vecp, which the caller is
>>>> + * responsible for freeing.
>>>> + *
>>>> + * Return: The number of tokens in the array.
>>>> + */
>>>> +int strtokv(char *str, const char *delim, char ***vecp)
>>>> +{
>>>> +	char *tok, **vec = NULL;
>>>> +	int cnt = 0;
>>>> +
>>>> +
>>>> +	for (tok = strtok(str, delim); tok; tok = strtok(NULL, delim)) {
>>>> +		vec = xrealloc(vec, (cnt + 1) * sizeof(*vec));
>>>> +		vec[cnt++] = tok;
>>>> +	}
>>>> +
>>>> +	*vecp = vec;
>>>> +	return cnt;
>>>> +}
>>>> +EXPORT_SYMBOL(strtokv);
>>>> +
>>>>  #ifndef __HAVE_ARCH_STRSWAB
>>>>  /**
>>>>   * strswab - swap adjacent even and odd bytes in %NUL-terminated string
>>>
>>> -- 
>>> Pengutronix e.K.                  |                             |
>>> Steuerwalder Str. 21              | http://www.pengutronix.de/  |
>>> 31137 Hildesheim, Germany         | Phone: +49-5121-206917-0    |
>>> Amtsgericht Hildesheim, HRA 2686  | Fax:   +49-5121-206917-5555 |
>> 
>




Thread overview: 34+ messages
2025-08-28 15:05 [PATCH 0/5] dm: Initial work on a device mapper Tobias Waldekranz
2025-08-28 15:05 ` [PATCH 1/5] string: add strtok/strtokv Tobias Waldekranz
2025-09-04 11:00   ` Ahmad Fatoum
2025-09-04 13:35     ` Tobias Waldekranz
2025-09-05 16:28       ` Ahmad Fatoum
2025-09-08  9:26         ` Tobias Waldekranz [this message]
2025-08-28 15:05 ` [PATCH 2/5] dm: Add initial device mapper infrastructure Tobias Waldekranz
2025-09-05 16:14   ` Ahmad Fatoum
2025-09-08  9:27     ` Tobias Waldekranz
2025-09-05 17:26   ` Ahmad Fatoum
2025-08-28 15:05 ` [PATCH 3/5] dm: linear: Add linear target Tobias Waldekranz
2025-08-29  5:56   ` Ahmad Fatoum
2025-09-05 16:37   ` Ahmad Fatoum
2025-08-28 15:05 ` [PATCH 4/5] test: self: dm: Add test of " Tobias Waldekranz
2025-09-05 16:50   ` Ahmad Fatoum
2025-09-08  9:27     ` Tobias Waldekranz
2025-08-28 15:05 ` [PATCH 5/5] commands: dmsetup: Basic command set for dm device management Tobias Waldekranz
2025-09-05 16:54   ` Ahmad Fatoum
2025-09-08  9:27     ` Tobias Waldekranz
2025-08-29  8:29 ` [PATCH 0/5] dm: Initial work on a device mapper Sascha Hauer
2025-08-31  7:48   ` Tobias Waldekranz
2025-09-02  8:40     ` Ahmad Fatoum
2025-09-02  9:44       ` Tobias Waldekranz
2025-08-29 11:24 ` Ahmad Fatoum
2025-08-31  7:48   ` Tobias Waldekranz
2025-09-02  9:03     ` Ahmad Fatoum
2025-09-02 13:01       ` Tobias Waldekranz
2025-09-03  7:05         ` Jan Lübbe
2025-09-02 14:46       ` Jan Lübbe
2025-09-02 21:34         ` Tobias Waldekranz
2025-09-03  6:50           ` Jan Lübbe
2025-09-03 20:19             ` Tobias Waldekranz
2025-09-05 14:44               ` Jan Lübbe
2025-09-02 14:34   ` Jan Lübbe
