Feb 10, 2021 07:25 AM
Hi there, I recognize the power of regex expressions, but I’m just pathetic at composing them, so I’m reaching out for help from someone who actually understands the syntax.
I have a table in which all entries contain a URL with this structure:
https://scholarship.ode.state.oh.us/Student/Home/Details/555555?stdntID=444444
From record to record, only two components of the URL change:
Details/
(i.e. 555555
)stdntID=
(i.e. 444444
)I need to write two formulas, one for each of the components named above, to extract these unique numbers. If anyone could help in this regard, goodness what a kindness it would be.
Thank you for taking the time to read and consider.
Solved! Go to Solution.
Feb 12, 2021 08:43 AM
Try these formulas. They might be a bit more than you need, but they should work.
IF(
REGEX_MATCH({My URL Field}, "Details/\\d+"),
SUBSTITUTE(
REGEX_EXTRACT({My URL Field}, "Details/\\d+"),
"Details/",
""
)
)
IF(
REGEX_MATCH({My URL Field}, "stdntID=\\d+"),
SUBSTITUTE(
REGEX_EXTRACT({My URL Field}, "stdntID=\\d+"),
"stdntID=",
""
)
)
Here is some more information about Airtable’s REGEX functions:
Airtable adds the extra backslashes to escape the backslash. I’m not sure why they need to be escaped, but that is not the source of your problem.
Airtable RegEx does not support look ahead / look behind.
Airtable REGEX_EXTRACT returns an error if it cannot find a match.
Airtable REGEX_EXTRACT returns only the first match.
Feb 10, 2021 12:38 PM
Try these:
Details
\d+(?=?)
student id
\d+(.\d+)?$
Feb 10, 2021 07:58 PM
Thank you, @augmented, for taking a swing at it!
When I attempt your solution, i’m noticing extra backslashes are added when I save the formula, like this:
IF(
{My URL Field}, REGEX_EXTRACT({My URL Field}, "\\d+(?=?)")
)
(returns #ERROR!
)
and like this:
IF(
{My URL Field}, REGEX_EXTRACT({My URL Field}, "\\d+(.\\d+)?$")
)
(returns blanks cells)
The formulas as of now are not extracting the numbers as hoped, but I’m wondering if it’s because of those extra backslashes being added.
Feb 11, 2021 05:41 AM
Interesting. I’ve never used REGEX in Airtable, but outside it many times. The extra backslashes are definitely not wanted.
I just tried to replicate your example in Airtable and I couldn’t avoid the extra backslashes. I’m not even sure that REGEX is fully implemented in Airtable. No way to know without getting rid of the auto backslashes.
Really sorry that I can’t help.
Feb 11, 2021 06:11 PM
Really appreciate your kind efforts nevertheless, @augmented! I may attempt to mark this as a bug, if I can.
Feb 12, 2021 08:43 AM
Try these formulas. They might be a bit more than you need, but they should work.
IF(
REGEX_MATCH({My URL Field}, "Details/\\d+"),
SUBSTITUTE(
REGEX_EXTRACT({My URL Field}, "Details/\\d+"),
"Details/",
""
)
)
IF(
REGEX_MATCH({My URL Field}, "stdntID=\\d+"),
SUBSTITUTE(
REGEX_EXTRACT({My URL Field}, "stdntID=\\d+"),
"stdntID=",
""
)
)
Here is some more information about Airtable’s REGEX functions:
Airtable adds the extra backslashes to escape the backslash. I’m not sure why they need to be escaped, but that is not the source of your problem.
Airtable RegEx does not support look ahead / look behind.
Airtable REGEX_EXTRACT returns an error if it cannot find a match.
Airtable REGEX_EXTRACT returns only the first match.