Help

Re: @alias.com extraction

536 0
cancel
Showing results for 
Search instead for 
Did you mean: 
Sean_Lake1
8 - Airtable Astronomer
8 - Airtable Astronomer

Hey All,

I’m sure this is somewhere and I’ve searched but I’ve not seen an explanation to certain “things” within the REGEX_EXTRACT field:
REGEX_REPLACE(Email, “(\w+)\.(\w+)@.+”, “$1 $2”)
for example, I get what it’s doing, but not sure of those internal things, like \w+) etc.
What I need is really simple and I was thinking I’d use the regex to extract just the @email.com part.
IF({Attendee Email - for Invites}, REGEX_EXTRACT({Attendee Email - for Invites},UNKNOWN PART OF FORMULA),“BLANK FIELD”)

2 Replies 2

Hey @Sean_Lake1!

So, I created a quick example to run through:

image

The formula field in this screenshot is using the following formula body:

REGEX_EXTRACT(
  {Email},
  "@\\w+\\.\\w+$"
)

Looking at what you’re trying to do here:

We can shift things around to get this:

IF(
    {Email},
    REGEX_EXTRACT(
        {Email},
        "@\\w+\\.\\w+$"
    ),
    "No Email Available"
)

image

Now, there’s something important to note about this setup.
The REGEX_EXTRACT function will return an error if the Email field contains data that does not match any portion of the regex pattern we provide it.

Here’s an example:

image

What you’re referring to are called ‘patterns.’
They are basically a blueprint for what we want to search for using regex.

If we look at the pattern I provided in the formula, we can properly troubleshoot the returned error.
I’ll break it down by section.

@\\w+\\.\\w+$

  1. @ → We’re telling regex to literally look for an asperand.
  2. \\w → This represents any word character that is not a numerical character.
  3. + → These are called quantifiers. There are a few different types of them, but this one represents one or unlimited. When placed after the \\w, we declare that we are looking for any word character or characters placed either once or consecutively.
  4. \\. → A single . in regex represents literally anything. This includes whitespace, digits, word characters, special characters, etc. The problem is that we’re not looking for any possible character. We’re specifically looking for a period. So to bypass the original syntax, we can ‘escape’ the syntax by placing the \\ before whatever we’re trying to escape. In this case, we use \\. to declare that we’re literally looking for a period.
  5. \\w+ → This is a repeat and combination of #2 and #1.
  6. $ → This represents the end of a line. It’s not necessary, and there are even arguments for why this shouldn’t be included. Nonetheless, I like to include it in my regex work.

If you’re like me, you don’t like the fact that an invalid email address will return an error, so here’s how I get around it.

IF(
    {Email},
    IF(
        REGEX_MATCH(
            {Email},
            "@\\w+\\.\\w+$"
        ),
        REGEX_EXTRACT(
            {Email},
            "@\\w+\\.\\w+$"
        ),
        "Invalid Email Format"
    ) 
)

This formula only returns a value of the Email field is holding any data.
Now, if the field contains data, then it will use the REGEX_MATCH function to evaluate the email value and determine if it’s in a format that we define as being valid.
If it passes our validation, then it will return the extracted email domain.

If the validation fails, then it will return the string: Invalid Email Format.

If the email field is empty altogether, then the formula will also return blank.

image

As a little twist, if you wanted to remove the asperand from the extracted email address, then you can just nest it all within a REGEX_REPLACE or SUBSTITUTE function in order to remove the character:

IF(
    {Email},
    IF(
        REGEX_MATCH(
            {Email},
            "@\\w+\\.\\w+$"
        ),
        REGEX_REPLACE(
            REGEX_EXTRACT(
                {Email},
                "@\\w+\\.\\w+$"
            ),
            "@",
            ""
        ),
        "Invalid Email Format"
    ) 
)

image

If Airtable would implement Split(), a customary and pervasively available method for string parsing, this would be the answer:

"<whateverThePersonsNameIs>@email.com".Split("@")[1]

But, despite 41 upvotes for this more than three years ago and requests that date back 20 2016, we still do not have a Split() formula function. [sigh]

The use of RegEx by @Ben.Young is impressive, but it reminds me that customers do not want 1/2" drill bits; that only want 1/2" holes. Why not simply Find() “@” and use the right-most characters remaining? What am I missing?