Help

New: REGEX formula functions

cancel
Showing results for 
Search instead for 
Did you mean: 
Jason
Airtable Employee
Airtable Employee

regex_functions

Hello everyone! We recently launched some exciting new additions to our formula field with three new regex (regular expression) functions:

  1. REGEX_MATCH(string, regex)
  2. REGEX_EXTRACT(string, regex)
  3. REGEX_REPLACE(string, regex, replacement)

These functions can be used to match character combinations within text strings. For those interested, Airtable’s REGEX functions are implemented using the RE2 regular expression library.

You can learn more about these functions in this article or check out some example use cases shared in the posts below.

50 Comments
kuovonne
18 - Pluto
18 - Pluto

Per the formula field reference page, the REGEX functions use the RE2 library, which is slightly different from the REGEX testing websites.

It is unlikely that the underlying REGEX library will be changed. You might want to use the RE2 library reference when building your expressions.

Karlstens
11 - Venus
11 - Venus

In testing on an RE2 website, I noticed that it returned an array for matches - so I’m not too sure why Airtable devs returned just a boolean instead of the match count array length.

Until we hear back from Airtable development, I’m going to see what workarounds that I can employ via automated scripting - I have a hunch that this will return the match results I’m expecting, so all I need to do is to automate the script with a field updated trigger comparison.

EDIT:
Looking into a Script solution, I’ve put a working test into play that I think (with a bit more development) will work well. In summary, I’m gearing a Script to check all field names within a table for any that have a prefix of :mag: . Upon finding them the field name is used as a search term for matching. The below screenshot shows that “ :mag: Test_1” is being matching the term “Test_1” four times within the Target.

image

Now I just need to create some loop structures so that all the fields with :mag: are parsed, and that if the result time-stamp is before the File Name time-stamp then to execute the script.

Suggestions welcome. :grinning_face_with_big_eyes:

Ockenden_Tech
5 - Automation Enthusiast
5 - Automation Enthusiast

@Justin_Barrett thanks for your reply. But as stated in my original post, I verified the regular expressions I posted above on Regex101.com under the golang variant, and they were approved; however, when I tried to use them in Airtable, they didn’t work.

That’s how round brackets are defined with the Golang variant on Regex101.com . Look in the lower-right corner at the Quick Reference section:

Screen Shot 2021-10-18 at 9.00.36 AM

Right, but in this picture, (...) is specified on a standalone basis. What I was querying was the case when round brackets are meant to symbolise a group of options (...|...). I believe that in Golang, the presence of the outer (...) doesn’t transform the expression into a capturing group, i.e. other matching parts of the regular expression continue to be returned; however, in Airtable only the matches within the (...) are returned, even when the round brackets only serve to define a group, as in (...|...). This also makes it difficult to use expressions like (?:...) shown in your picture, and others I mentioned in my posts above.

Hence why I was questioning whether the Airtable regex language was indeed golang or something else? I have tried the RE2 variant as well, and it also doesn’t line up with Airtable.

Can the Airtable developers provide an exact reference guide to Airtable’s regex language?

Thanks.

Justin_Barrett
18 - Pluto
18 - Pluto

The documentation only indicates that Airtable uses the RE2 library, but only the devs can say for sure how/if that library has been modified to be used by Airtable’s formula system. I recommend reaching out to Airtable support directly (in the app: Help → Contact support). While Airtable staffers do frequent the forum, they prefer to be approached directly with support requests, and questions like this definitely lean more in that direction.

Bryan_Smith
6 - Interface Innovator
6 - Interface Innovator

This seemed like a good place to ask my regex question, which has two parts. One error and second how to make more than one search string (if possible).

I’m attempting to extract a variety of strings from a file path and then use those strings to fill in a linked or multiselect field with automation. I’ve had some success with a little support from the gurus at regex101, but since I’m getting an error on this expression when used in an airtable formula I thought it more appropriate to ask for help here.

file path
/3D Projects/21-951 LH Hot Cyclone/BFA/7353 LH Hot Cyclone to App 5-24-21/7353 LH Hot Cyclone to App 5-24-21.pdf

regex - from regex101
\bBFA\b =>BFA

airtable formula
REGEX_EXTRACT({file path},"\bBFA\b") => BFA --automation to set Status to BFA (success)

my problem is if the path does not contain BFA
/3D Projects/21-951 LH Hot Cyclone/7353 LH Hot Cyclone to App 5-24-21/7353 LH Hot Cyclone to App 5-24-21.pdf
REGEX_EXTRACT({file path},"\bBFA\b") => #ERROR!

The second thing is that I’d like to search for “BFA or APP or DES”, so expanding to look for 3 strings.

Any help is greatly appreciated.

Bryan_Smith
6 - Interface Innovator
6 - Interface Innovator

Once again I found an answer to part of my question thanks to another post. The following solves the error part of my question. Is this the best way to handle this?

IF(
REGEX_MATCH({file path},“BFA”),
REGEX_EXTRACT({file path},“BFA”)
)

Bryan_Smith
6 - Interface Innovator
6 - Interface Innovator

Answered part 2 of my question with this.

IF(
REGEX_MATCH({file path},(?:BFA|APP|DES)),
REGEX_EXTRACT({file path},(?:BFA|APP|DES))
)

Justin_Barrett
18 - Pluto
18 - Pluto

Thanks for sharing your solution, but you’re missing quotes in that final example:

IF(
  REGEX_MATCH({file path}, "(?:BFA|APP|DES)"),
  REGEX_EXTRACT({file path}, "(?:BFA|APP|DES)")
)
Matt_Kennedy1
7 - App Architect
7 - App Architect

I’d like to transform an attachments (images) column into a delimited list of URLs. I can do it with Regex using look aheads (i.e. the ?= syntax mentioned above) but since this feature is not supported by RE2, I’m wondering if anyone has a work around?

Example String:
filename1.jpg (https://domain.com/image1), filename2.jpg (https://domain.com/image2)

Desired Output:
https://domain.com/image1 | https://domain.com/image2

This deletes the filenames using lookahead (with error)
REGEX_REPLACE({Image Field}, “[^(),]+(?=[(]|$)”, “”)

I would probably nest this in more Replace functions(s) to get rid of the () and swap the comma to a pipe etc.

Justin_Barrett
18 - Pluto
18 - Pluto

@Matt_Kennedy1 Nesting definitely takes care of it. Here’s what I came up with, nested from inside to outside:

  1. Replace the closing parenthesis after each URL, optionally adding everything up to the opening parenthesis of the next, with a single space. This gets rid of all filenames and all parentheses except for the very beginning.
  2. Replace everything at the beginning—up to and including the opening parenthesis—with nothing.
  3. Trim the string to get rid of the single space on the end.
  4. Substitute the remaining single spaces between URLs with the desired separator.

The full formula:

SUBSTITUTE(TRIM(REGEX_REPLACE(REGEX_REPLACE(Attachments, "\\)([^(]+\\()?", " "), ".*\\(", "")), " ", " | ")

Divided into separate lines for easier reading:

SUBSTITUTE(
    TRIM(
        REGEX_REPLACE(
            REGEX_REPLACE(
                Attachments, "\\)([^(]+\\()?", " "
            ), ".*\\(", ""
        )
    ), " ", " | "
)

Here’s how it looks with your sample. I also tested it with six attachments, and it works just as well.

Screen Shot 2021-11-06 at 8.02.47 PM