Extract #3 item from a text list with separators

Ed_Goble1 · ‎Sep 18, 2022

I have been playing around with @Justin_Barrett’s excellent script “Parent Child tree hierarchy” (Parent Child tree hierarchy).

The output of this is a text field with any separator you like, for example:
World > UK > London > Hammersmith > Elspeth Avenue > Joe’s Cafe
World > France > Paris > La Defence > BNP Paribas
World > USA > New York > Manhattan > Central Park

Question - is there any way to extract the #3 item (or the #2 or #4 for that matter) from these text strings? In this case, to pick out “London”, “Paris” or “New York”. I assume this should involve counting 2 instances of the separator (“>”), and then selecting the text from there up to the next separator.

Justin_Barrett · ‎Sep 19, 2022

@Ed_Goble1 Glad that the script has been helpful! Here’s a REGEX solution to the problem:

IF(Origin, TRIM(REGEX_EXTRACT(Origin, "(?:[^>]*> ){2}([^>]*)")))

The part that drives which piece is extracted is the number in the middle of the regular expression. That tells the interpreter how many pieces to skip before grabbing the next one in line. In other words, that number should be one smaller than the Nth piece that you actually want; e.g. if you want the 3rd piece, the number should be 2 (as in the above example). It will even work with zero to grab the first piece.

FWIW, if the delimiter—the character you’re using in-between each part—is something besides >, replace all instances of > with the appropriate delimiter. (NOTE: some characters have special meaning to a regular expression so this isn’t a blanket solution that will work with any delimiter, but it’ll likely work with most without further tweaking.)

See Solution in Thread

TheTimeSavingCo · ‎Sep 19, 2022

Hi @Ed_Goble1, if you’re trying to do this via formulas only, the only way I know how to do this is via a whole bunch of LEFT(), FIND() and SUBSTITUTE()s

So for example, #1 is easy, just do a

LEFT(
  {Text},
  FIND(
    ">",
    {Text}
  )
)

Once you hit #2, you have to do a SUBSTITUTE() to remove #1 from the original text, and then do the same thing:

LEFT(
  SUBSTITUTE(
    {Text},
    {#1} & " > ",
    ""
  ),
  FIND(
    ">",
    SUBSTITUTE(
      {Text},
      {#1} & " > ",
      ""
    )
  ) - 2
)

And so on and so forth

With luck someone else has a better idea heh

Adam, Consultant @ The Time Saving Co

Ed_Goble1 · ‎Sep 19, 2022

Thanks. This was great for my purpose!

(With the tiny addition of a “-2” towards the end of the first script so it didn’t return the " > ")

Justin_Barrett · ‎Sep 19, 2022

@Ed_Goble1 Glad that the script has been helpful! Here’s a REGEX solution to the problem:

IF(Origin, TRIM(REGEX_EXTRACT(Origin, "(?:[^>]*> ){2}([^>]*)")))

The part that drives which piece is extracted is the number in the middle of the regular expression. That tells the interpreter how many pieces to skip before grabbing the next one in line. In other words, that number should be one smaller than the Nth piece that you actually want; e.g. if you want the 3rd piece, the number should be 2 (as in the above example). It will even work with zero to grab the first piece.

FWIW, if the delimiter—the character you’re using in-between each part—is something besides >, replace all instances of > with the appropriate delimiter. (NOTE: some characters have special meaning to a regular expression so this isn’t a blanket solution that will work with any delimiter, but it’ll likely work with most without further tweaking.)

Ed_Goble1 · ‎Sep 20, 2022

Wow. Even better. I just added an IF statement to strip out the Error you get if there aren’t enough levels and it’s more elegant (in my case) to return a blank. I’m very impressed by this community. Thanks both!

valeriecrisp · ‎Aug 07, 2023

Hello and thank you to this community which assures me there's a solution to my similar problem.

In my case, I'd like to select the #1 item from a comma separated list of "Tag Names" to populate the value in "Tag 1" field. Same thing for the #2 item item. (Currently I am manually selecting them from the referenced list.)

I've tried some of the above solutions but I think I'm missing/misunderstanding a step... because I keep getting an ERROR.

Does the formula source field have to be single line text? Perhaps the commas are not compatible as delimiters?

Thanks in advance for your help