Help

Help extracting a string from a rollup (FIND,MID,LEFT,RIGHT)

Topic Labels: Formulas
Solved
Jump to Solution
3216 4
cancel
Showing results for 
Search instead for 
Did you mean: 
Ricardo
7 - App Architect
7 - App Architect

Hello community, my record has a date and looks up a roll-up like the below example. I’m trying to figure out the right formula to extract the string below:

  • Rollup: …5. 04.05 Vila Falo,6. 05.05 Alvero,8. 07.05 Gjirokastra…
  • Record date: 05.05
  • Current formula: MID({Rollup}&“”,FIND(DATETIME_FORMAT(Date,‘DD.MM’),{Rollup}&“”)+5,XXX)
  • Output: Alvero,8. 07.05 Gjirokastra…
  • Desired output: Alvero

XXX is the part where my monkey brain fails to come up with a solution to determine the length of the string to be extracted. Your advice would be appreciated. :slightly_smiling_face:

1 Solution

Accepted Solutions
Justin_Barrett
18 - Pluto
18 - Pluto

This is where regular expressions come in handy. Regular expressions excel at finding patterns, which is ideal for cases where the length will vary, but you have consistent patterns that you can use to determine where to stop. In this case, you would look for the date matching the record date, then extract all characters until you get to a comma. This worked for me:

IF(
  AND(Date, Rollup),
  REGEX_EXTRACT(
    Rollup,
    "(?:" & SUBSTITUTE(DATETIME_FORMAT(Date,"DD.MM"), ".", "\\.") & " )([^,]*)")
)

Screen Shot 2022-06-28 at 9.11.52 PM

(NOTE: I used a single line text field for the “Rollup” field, but it should still work with an actual rollup field as long as the output is the same as you outlined above)

Here’s how this breaks down…

At the start of the regular expression, we first want to find—but not actually extract—the matching date. The syntax for this find-but-ignore process looks like this: (?:XXX), where XXX is the text/pattern you want to find. In this case we want the date followed by a single space. However the “DD.MM” date format that you’re using includes a reserved REGEX token: the period. In REGEX, a period normally matches any single character, but we want to actually match a period. To do this, we need to escape the period by preceding it with a single backward slash. To actually include a backward slash, though, we need another backward slash in front of it.

To mark the group that we actually want to both match and extract, we surround it with parentheses. Inside those parentheses we need to tell the REGEX parser to match as many characters as possible up to—but not including—the next comma. Individual characters are matched by putting them inside square braces. To match anything except those listed characters, you include the caret symbol— ^ —before the rest. To then indicate that the match should be made as many times as possible, you add an asterisk after the square braces. All of this explains this combo at the end of the regular expression: ([^,]*) In short, extract a group that matches any non-comma character zero or more times. Once it hits a comma, it stops.

When it’s all finished, the expression matches—but doesn’t capture—the record date followed by a single space, then actually captures as many non-comma characters as possible.

See Solution in Thread

4 Replies 4

Hi @Ricardo ,

Im sure someone else can come up with an easier formula, but this one should work

LEFT(RIGHT(Rollup,{formula copy copy}),FIND(“,”,Rollup,FIND(DATETIME_FORMAT(Date,‘DD.MM’),{Rollup}&“”)+5)-(FIND(DATETIME_FORMAT(Date,‘DD.MM’),{Rollup}&“”)+5))

Justin_Barrett
18 - Pluto
18 - Pluto

This is where regular expressions come in handy. Regular expressions excel at finding patterns, which is ideal for cases where the length will vary, but you have consistent patterns that you can use to determine where to stop. In this case, you would look for the date matching the record date, then extract all characters until you get to a comma. This worked for me:

IF(
  AND(Date, Rollup),
  REGEX_EXTRACT(
    Rollup,
    "(?:" & SUBSTITUTE(DATETIME_FORMAT(Date,"DD.MM"), ".", "\\.") & " )([^,]*)")
)

Screen Shot 2022-06-28 at 9.11.52 PM

(NOTE: I used a single line text field for the “Rollup” field, but it should still work with an actual rollup field as long as the output is the same as you outlined above)

Here’s how this breaks down…

At the start of the regular expression, we first want to find—but not actually extract—the matching date. The syntax for this find-but-ignore process looks like this: (?:XXX), where XXX is the text/pattern you want to find. In this case we want the date followed by a single space. However the “DD.MM” date format that you’re using includes a reserved REGEX token: the period. In REGEX, a period normally matches any single character, but we want to actually match a period. To do this, we need to escape the period by preceding it with a single backward slash. To actually include a backward slash, though, we need another backward slash in front of it.

To mark the group that we actually want to both match and extract, we surround it with parentheses. Inside those parentheses we need to tell the REGEX parser to match as many characters as possible up to—but not including—the next comma. Individual characters are matched by putting them inside square braces. To match anything except those listed characters, you include the caret symbol— ^ —before the rest. To then indicate that the match should be made as many times as possible, you add an asterisk after the square braces. All of this explains this combo at the end of the regular expression: ([^,]*) In short, extract a group that matches any non-comma character zero or more times. Once it hits a comma, it stops.

When it’s all finished, the expression matches—but doesn’t capture—the record date followed by a single space, then actually captures as many non-comma characters as possible.

This is amazing! Thanks for explaining how it works, this is better than the original Airtable guide to REGEX functions. The syntax is quite complex and to write I need to look-up and process every possible expression and its meaning, but that’s very powerful. :slightly_smiling_face:

I used a single line text field for the “Rollup” field, but it should still work with an actual rollup field as long as the output is the same as you outlined above

I added an &“” to the rollup to turn the array into a string, I suppose ARRAYJOIN() would do the same.

Correct. I had assumed that you’d already converted the array into a string as part of the rollup field’s aggregation formula (the commas that show up by default if the aggregation formula is simply values are for display only; they’re not actually part of the field output). If your rollup aggregation formula is ARRAYJOIN(values, ","), then you won’t need to add the & "" to the formula when referencing the rollup field. Either way, the formula will work as long as the commas are actually there.