Extract all (URL) from Rich Text

Topic Labels: Formulas

Hello!
I have blog posts containing URLs like this: Some text (https://www.example.com Example Link) some text
I need to extract all the links from each blog post.

Regex blew my mind, so I'm here to ask for help. Please help me!

3 Replies

Note that an Airtable formula will realistically only be able to get the first URL. There are workarounds to find multiple URLs, but the maximum number would be hardcoded, and the formula becomes insanely complex after a couple of URLs.

You are better off using scripting to do this. You would probably still need to use regular expressions, but you would use JavaScript regular expressions instead of RE2. You can Google regular expressions that match URLs.

Yeah, RegEx needs a PR firm. 😉

Here’s a script block that’ll do it.

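output.markdown('# URLify');
// Sample text; in practice this would come from a record's rich text field.
let text = "Find me at http://www.example.com and also at http://stackoverflow.com";
// Match every http:// or https:// URL up to the next whitespace (g = all matches, i = case-insensitive).
// match() returns null when nothing is found, so fall back to an empty array.
let matches = text.match(/\bhttps?:\/\/\S+/gi) || [];
console.log(matches);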
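
One caveat with the script above: `\S+` grabs everything up to the next whitespace, so a link written as `[docs](https://example.org/page).` comes back with the `).` still attached. Here is a minimal sketch of one way to tidy that up. The helper name `extractUrls` and the sample text are made up for illustration, and the set of characters to trim is an assumption you may want to adjust.

```javascript
// Hypothetical helper, not part of the original script.
function extractUrls(text) {
  // Same pattern as above: http:// or https:// followed by non-whitespace.
  const candidates = text.match(/\bhttps?:\/\/\S+/gi) || [];
  // Assumption: closing brackets and sentence punctuation at the very end
  // of a match are not part of the URL, so strip them.
  const cleaned = candidates.map((url) => url.replace(/[)\]>.,;:!?'"]+$/, ""));
  // Drop duplicates while keeping first-seen order.
  return [...new Set(cleaned)];
}

const sample =
  "Some text (https://www.example.com Example Link) and [docs](https://example.org/page).";
console.log(extractUrls(sample));
```

Note that the trimming step would also cut a closing parenthesis off a URL that legitimately ends with one, so treat it as a starting point rather than a complete solution.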

Similarly, if you want a “lower code” way of extracting all the URLs from a string of text — and yes, I realize the extreme irony of calling anything involving REGEX low-code — you can use the text parsing module of Make.com.

And you could use the following REGEX code that I discovered, which will parse all the links that start with the prefix https://, http://, or www.

(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})

This is how it would actually look in your Make.com scenario:

image

This will return an array of links, which you can then use however you like in Airtable (or any other app or service).
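
If you want to try the longer pattern quoted above before wiring it into a scenario, it also runs as a JavaScript regular expression. The sketch below is only a quick check with a made-up sample string; the names `urlPattern`, `sampleText`, and `links` are illustrative. Because the pattern uses a negative lookahead, `(?!www)`, it works in JavaScript but not in RE2, which does not support lookaheads.

```javascript
// The pattern from the post, with the global flag so match() returns every hit.
const urlPattern = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})/g;

// Made-up sample covering all three prefixes: https://, bare www., and http://
const sampleText =
  "Visit https://www.example.com or www.airtable.com then http://stackoverflow.com";

// match() returns null when nothing is found, so fall back to an empty array.
const links = sampleText.match(urlPattern) || [];
console.log(links);
```

As with the shorter pattern, the trailing `[^\s]{2,}` accepts any non-whitespace character, so punctuation typed directly after a link (a comma or closing parenthesis, for example) will be included in the match.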