Extract all (URL) from Rich Text

Topic Labels: Formulas

Hello!
I have blog posts containing URLs like this: Some text (https://www.example.com Example Link) some text
I need to extract all the links from each blog post.

Regex blew my mind, so I'm here to ask for help. Please help me!

3 Replies

Note that an Airtable formula will realistically only be able to get the first URL. There are workarounds to find multiple URLs, but the limit would be hardcoded, and the formula becomes insanely complex after a couple of URLs.

You are better off using scripting to do this. You would probably still need to use regular expressions, but you would use JavaScript regular expressions instead of RE2 (the flavor Airtable formulas use). You can Google regular expressions that match URLs.

Yeah, RegEx needs a PR firm. 😉

Here’s a script block that’ll do it.

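// Runs in an Airtable scripting block.
output.markdown('# URLify');
let text = "Find me at http://www.example.com and also at http://stackoverflow.com";
// Match every http:// or https:// URL up to the next whitespace character.
// match() returns null when nothing is found, so fall back to an empty array.
let matches = text.match(/\bhttps?:\/\/\S+/gi) || [];
console.log(matches);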
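
One thing to watch for with the original poster's format: the links sit inside parentheses, so a plain \S+ will swallow the closing bracket whenever there is no label after the URL, as in (http://stackoverflow.com). Here is a minimal sketch of a variation, assuming plain JavaScript (it should also run in a scripting block). `extractUrls` is just a hypothetical helper name, not something built into Airtable.

```javascript
// Hypothetical helper: collect every http(s) URL in a block of text.
function extractUrls(text) {
    // Stop at whitespace or a closing parenthesis, so a URL wrapped as
    // "(https://... Label)" or "(https://...)" is captured without the bracket.
    const matches = text.match(/\bhttps?:\/\/[^\s)]+/gi) || [];
    // Trim punctuation that usually belongs to the sentence, not the URL.
    return matches.map((url) => url.replace(/[.,;:!?]+$/, ''));
}

const sample =
    'Some text (https://www.example.com Example Link) some text, see also (http://stackoverflow.com).';
const urls = extractUrls(sample);
console.log(urls);
```

The trade-off is that a URL which legitimately contains a closing parenthesis will be cut short, so treat this as a starting point rather than a complete URL parser.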

Similarly, if you want a “lower code” way of extracting all the URLs from a string of text — and yes, I realize the extreme irony of calling anything involving REGEX low-code — you can use the text parsing module of Make.com.

And you could use the following REGEX code that I discovered, which will parse all the links that start with the prefix https://, http://, or www.

(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})

This is how it would actually look in your Make.com scenario:

[Screenshot of the Make.com scenario]

This will return an array of links, which you can then use however you like in Airtable (or any other app or service).
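
If you want to try that longer pattern outside Make.com first, here is a rough sketch that runs it in plain JavaScript with the global flag. It assumes the pattern behaves the same way in JavaScript's regex engine; the sample text and the variable names `urlPattern` and `found` are made up for illustration.

```javascript
// The pattern from the post above, unchanged, with the global flag added
// so that match() returns every hit instead of only the first.
const urlPattern = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})/g;

// Made-up sample mixing the three prefixes: https://, a bare www., and http://
const blogText =
    'Some text (https://www.example.com Example Link) and www.airtable.com plus http://stackoverflow.com/questions today';

// match() returns null when nothing is found, so fall back to an empty array.
const found = blogText.match(urlPattern) || [];
console.log(found);
```

Because the pattern ends with "anything that is not whitespace", a link followed directly by a bracket or a full stop will carry that character along, so some cleanup may still be needed afterwards.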