New: REGEX formula functions

Hello everyone! We recently launched some exciting new additions to our formula field with three new regex (regular expression) functions:

  1. REGEX_MATCH(string, regex)
  2. REGEX_EXTRACT(string, regex)
  3. REGEX_REPLACE(string, regex, replacement)

These functions can be used to match character combinations within text strings. For those interested, Airtable’s REGEX functions are implemented using the RE2 regular expression library.

You can learn more about these functions in this article or check out some example use cases shared in the posts below.

5 Likes

Validating and normalizing phone numbers

If you need to validate a set of phone numbers against specific criteria, you can use REGEX_MATCH() like in the formula below.

REGEX_MATCH( {Possible Phone Number}, '^([+]?[0-9]( |-)?)?(\\(?[0-9]{3}\\)?|[0-9]{3})( |-)?([0-9]{3}( |-)?[0-9]{4}|[a-zA-Z0-9]{7})$')

This regular expression handles a number of different phone number formats:

  • Country codes
  • Using dashes or spaces as delimiters
  • Parentheses around the area code
  • Using letters instead of numbers

Next, if it’s a valid phone number, we can normalize it using REGEX_REPLACE to make it easier to work with in the rest of our Airtable base:

IF({Is Valid Phone Number?}, UPPER(REGEX_REPLACE( {Possible Phone Number}, '[^A-Za-z0-9]', '')), ERROR('Invalid phone number'))

If the phone number passed validation, we normalize it by using REGEX_REPLACE() to replace non-alphanumeric characters with an empty string, resulting in just the dialable digits. We also use the UPPER() formula to make the casing consistent.

Normalizing phone numbers into a prettier format is an exercise left to the reader :slight_smile:

2 Likes

Validating email addresses

Similar to the other example above, you can use REGEX_MATCH() to validate a list of email addresses as well.

REGEX_MATCH( {Email address}, "(\\W|^)[\\w.\\-]{0,25}@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}(\\W|$)")

This regex handles basic validation for email addresses:

  • Starts with a username that is composed of letters, numbers, or characters like underscore, period or dash.
  • An at-sign (@).
  • A domain name, which is a string of letters, numbers, or characters like underscore, period, or dash, followed by a period, followed by another string of letters, numbers, or characters like underscore, period or dash.

Email validation is a very complicated subject, but this regex is a good starting point for most use cases!

validate_email

2 Likes

Extracting domain from URLs

As one more example, you can use REGEX_EXTRACT to extract the domain name from a URL field:

REGEX_EXTRACT( {URL}, '^(?:https?:\\/\\/)?(?:[^@\n]+@)?(?:www\\.)?([^:\\/\n?]+)')

URLs start with ‘http’ or ‘https’, possibly a username followed by an ‘@’, the domain, and then the rest of the URL. Here, we strip off the subdomain if it’s ‘www’, but otherwise preserve it:

1 Like

Regular expressions are a powerful capability for working with textual data, and these examples are just a taste for what’s possible. Stack Overflow has many highly-voted examples of usecases for regexes, and regex101.com’s RE2 flavor is a great way of testing your regexes that is compatible with Airtable.

And lastly, for those musically inclined, I’m just going to leave this here…

1 Like

Thanks, @Jason! Very exciting stuff, and thanks for all the examples! :slight_smile:

I’ve discovered that this is a great REGEX function for extracting email addresses from a string of text:
[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,63}

However, as discussed in this thread, the REGEX_EXTRACT function only returns the first matching string, instead of all matching strings from a field.

Is this something that the Airtable team is planning to address in the future?

In the meantime, this tool from MiniExtensions allows for extracting multiple strings with REGEX:

This is precisely why none of these functions should have been developed in the first place.

The right approach to introducing RegEx() and other less obvious but needed functions such as Split() and Math() is to introduce a single function that allows us to call arbitrary Script Blocks that perform computations and return values.

If Airtable would do this – and only this – a vast number of functions would begin to emerge. There would be aftermarket function libraries that you plugin and use. The functions would be able to use all of the power of javascript functionality including every aspect of RegEx(), arrays, and multi-pass processing. There would be no limitations to what we might create and distribute to other users.

It’s great to see that Airtable is always chugging ahead, but it’s unfortunate they haven’t created a universal ability to leverage that which already exists - a fully-baked javascript environment without limitations.

I just don’t get it - they could look like geniuses or they could do something that is once-again, half-baked and extremely limited.

6 Likes

Thanks,
Now it adds more power/command in search and replace data.

These things are also working fine with API. while we fetch data from AirTable.
Now we have less complexity in search records from the table.

Previously :
filterByFormula=AND({FIELD_NAME}=“FULL_VALUE”)
But if there is a problem with this. Because it matches the exact string only.

Now :
REGEX_MATCH({FIELD_NAME}, ‘PARTIAL_VALUE’)
If you add any word that contains that field, that’s also matches and returns in the result.