Help

The Community will be temporarily unavailable starting on Friday February 28. We’ll be back as soon as we can! To learn more, check out our Announcements blog post.

Help composing REGEX formula to extract strings from URL

Topic Labels: Formulas
Solved
Jump to Solution
4017 5
cancel
Showing results for 
Search instead for 
Did you mean: 
Stephen_Piscura
5 - Automation Enthusiast
5 - Automation Enthusiast

Hi there, I recognize the power of regex expressions, but I’m just pathetic at composing them, so I’m reaching out for help from someone who actually understands the syntax.

I have a table in which all entries contain a URL with this structure:

https://scholarship.ode.state.oh.us/Student/Home/Details/555555?stdntID=444444

From record to record, only two components of the URL change:

  1. The number immediately following Details/ (i.e. 555555)
  2. The number immediately following stdntID= (i.e. 444444)

I need to write two formulas, one for each of the components named above, to extract these unique numbers. If anyone could help in this regard, goodness what a kindness it would be.

Thank you for taking the time to read and consider.

1 Solution

Accepted Solutions
kuovonne
18 - Pluto
18 - Pluto

Try these formulas. They might be a bit more than you need, but they should work.

IF(
  REGEX_MATCH({My URL Field}, "Details/\\d+"),
  SUBSTITUTE(
    REGEX_EXTRACT({My URL Field}, "Details/\\d+"),
    "Details/", 
    ""
  )
)

IF(
  REGEX_MATCH({My URL Field}, "stdntID=\\d+"),
  SUBSTITUTE(
    REGEX_EXTRACT({My URL Field}, "stdntID=\\d+"),
    "stdntID=", 
    ""
  )
)

Here is some more information about Airtable’s REGEX functions:

  • Airtable adds the extra backslashes to escape the backslash. I’m not sure why they need to be escaped, but that is not the source of your problem.

  • Airtable RegEx does not support look ahead / look behind.

  • Airtable REGEX_EXTRACT returns an error if it cannot find a match.

  • Airtable REGEX_EXTRACT returns only the first match.

See Solution in Thread

5 Replies 5
augmented
10 - Mercury
10 - Mercury

Try these:

Details

\d+(?=?)

student id

\d+(.\d+)?$

Thank you, @augmented, for taking a swing at it!

When I attempt your solution, i’m noticing extra backslashes are added when I save the formula, like this:

IF(
  {My URL Field}, REGEX_EXTRACT({My URL Field}, "\\d+(?=?)")
)

(returns #ERROR!)

and like this:

IF(
  {My URL Field}, REGEX_EXTRACT({My URL Field}, "\\d+(.\\d+)?$")
)

(returns blanks cells)

The formulas as of now are not extracting the numbers as hoped, but I’m wondering if it’s because of those extra backslashes being added.

augmented
10 - Mercury
10 - Mercury

Interesting. I’ve never used REGEX in Airtable, but outside it many times. The extra backslashes are definitely not wanted.

I just tried to replicate your example in Airtable and I couldn’t avoid the extra backslashes. I’m not even sure that REGEX is fully implemented in Airtable. No way to know without getting rid of the auto backslashes.

Really sorry that I can’t help.

Really appreciate your kind efforts nevertheless, @augmented! I may attempt to mark this as a bug, if I can.

kuovonne
18 - Pluto
18 - Pluto

Try these formulas. They might be a bit more than you need, but they should work.

IF(
  REGEX_MATCH({My URL Field}, "Details/\\d+"),
  SUBSTITUTE(
    REGEX_EXTRACT({My URL Field}, "Details/\\d+"),
    "Details/", 
    ""
  )
)

IF(
  REGEX_MATCH({My URL Field}, "stdntID=\\d+"),
  SUBSTITUTE(
    REGEX_EXTRACT({My URL Field}, "stdntID=\\d+"),
    "stdntID=", 
    ""
  )
)

Here is some more information about Airtable’s REGEX functions:

  • Airtable adds the extra backslashes to escape the backslash. I’m not sure why they need to be escaped, but that is not the source of your problem.

  • Airtable RegEx does not support look ahead / look behind.

  • Airtable REGEX_EXTRACT returns an error if it cannot find a match.

  • Airtable REGEX_EXTRACT returns only the first match.