Solved

Split a URL into its components

Forum|Forum|4 years ago
May 19, 2021
5 replies
100 views

+12

Paolo_Perrone
Inspiring

I want to be able to split a URL into its components and use them as needed.

What I have:
“https://airtable.com/universe”

What I’d like to achieve
“https://”
“airtable.com”
universe

how would you handle this use-case?
can you provide any smart solutions?

thanks in advance

Best answer by Paolo_Perrone

If I’m understanding you correctly, this regex will do the job here:

REGEX_EXTRACT(URL,"https://") & "\n" & REGEX_EXTRACT(URL,"https://(.*)/") & "\n" & REGEX_EXTRACT(URL,"https.*\\.(?:com|net|org)\\/(.*)")

If you want to match the final slash in the address, just push the \/ construct into the parentheses-bound capture group at the end. The regex itself is a bit hacky but I’ll chalk that up as a plus in case you wanted to improve upon it and learn a few things yourself. :grinning_face_with_sweat:
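
For anyone who wants to poke at the three capture steps outside Airtable, here is a rough Python sketch. Treat it as an approximation: Python's `re` module is not the engine behind Airtable formulas, the helper name `split_url` is made up for illustration, and the patterns are tightened a little (`[^/]+` for the host instead of a greedy `.*`) rather than copied from the formula above.

```python
import re


def split_url(url):
    """Hypothetical helper: return (scheme prefix, host, path) or None for missing parts."""
    # Step 1: the scheme prefix, e.g. "https://"
    scheme = re.search(r"^https?://", url)
    # Step 2: everything after the prefix up to the first slash (the host)
    host = re.search(r"^https?://([^/]+)", url)
    # Step 3: everything after the first slash that follows the host (the path)
    path = re.search(r"^https?://[^/]+/(.*)", url)
    return (
        scheme.group(0) if scheme else None,
        host.group(1) if host else None,
        path.group(1) if path else None,
    )


print(split_url("https://airtable.com/universe"))
# → ('https://', 'airtable.com', 'universe')
```

Using `[^/]+` for the host means the match stops at the first slash, so deeper paths such as `/universe/some/page` stay in the third part instead of leaking into the second.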

this is very good stuff - thanks!

I managed the same result through a bunch of text functions, but this solution is way more sophisticated.

Any recommended resources to learn REGEX?

Dominik_Bosnjak
May 20, 2021

In what field is that link? Those quotes make me think you might have copied it from another formula, in which case it would be better to parse the original rather than the mirror. Formulas are tricky to poll anyway, due to not having static values (duh), so Airtable doesn’t exactly commit comparable resources to storing their states. If you can explain how the exact field storing the data looks, you’ll get a better answer faster. Ditto if there’s any (formula) code you have already tried.

Paolo_Perrone
Author
May 20, 2021

the quotation marks are there to show the web address without previewing it
if I copy the address w/o quotes we get a preview like this

however the link I want to parse is in a url field

Dominik_Bosnjak
May 22, 2021

If I’m understanding you correctly, this regex will do the job here:

REGEX_EXTRACT(URL,"https://") & "\n" & REGEX_EXTRACT(URL,"https://(.*)/") & "\n" & REGEX_EXTRACT(URL,"https.*\\.(?:com|net|org)\\/(.*)")

Paolo_Perrone
Author
Answer
May 23, 2021

this is very good stuff - thanks!

I managed the same result through a bunch of text functions, but this solution is way more sophisticated.

Any recommended resources to learn REGEX?

Dominik_Bosnjak
May 25, 2021

Whew, thank you but I think some of the veterans here would disagree with that part where you used ‘regex’ and ‘sophisticated’ in the same sentence. :grinning_face_with_sweat:

Regular expressions are one of the first real-world skills you’ll learn no matter the language or platform you tackle, from Python and PowerShell’s don’t-call-me-C# to Airtable’s JS-based formulae, Swift, etc. And no matter the environment, your regex-taming ability will be among the biggest, baddest, and heaviest-hitting tools in your arsenal.

Getting the hang of it will inevitably leave a long trail of ones and zeroes in your wake; all smashed to bits, naturally, but sooner or later, you’ll realize that half of the places you busted into were waiting for you to knock, and the other half moved their valuables to another address… in 2005.

So, yeah, you can certainly crack a lot of things with Regex, just don’t forget to keep learning after you get the hang of its parsing logic as there’s a whole other world of language processing beyond it and the most difficult Regex skill is knowing when to abstain from using Regex and why.
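
As a small illustration of the "know when to abstain" point, here is a hedged Python sketch that splits the same address without any regex at all, using the standard library's `urllib.parse.urlsplit`. This is outside Airtable entirely, and the helper name `split_url_no_regex` is invented for the example.

```python
from urllib.parse import urlsplit


def split_url_no_regex(url):
    """Hypothetical helper: split a URL into (scheme prefix, host, path) with a URL parser."""
    parts = urlsplit(url)
    scheme_prefix = parts.scheme + "://"  # urlsplit returns "https" without the "://"
    host = parts.netloc                   # e.g. "airtable.com"
    path = parts.path.lstrip("/")         # drop the leading slash to get "universe"
    return scheme_prefix, host, path


print(split_url_no_regex("https://airtable.com/universe"))
# → ('https://', 'airtable.com', 'universe')
```

A dedicated parser also keeps query strings and fragments out of the path (they land in `parts.query` and `parts.fragment`), which a quick regex tends to get wrong.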

Everything else comes down to abstracting your targets from their surroundings and that’s what the brain is the best at doing, anyway.

As for a good resource for learning, would you rather learn by example or trial-and-error? Maybe a mix of both? Assuming you’re not looking for book recommendations, RegExr will probably have content to your liking. It’s also a place where you can occasionally encounter what I’d dare label a “sophisticated” regex. The way you can tell it apart from the rest is that there’s just no way of understanding wtf is going on without tracing the interpreter’s path, character by character.
