Jun 03, 2022 05:57 AM
Hello!
I have blog posts that contain URLs, like this: Some text (https://www.example.com Example Link) some text
I need to extract all the links from each blog post.
Regex blew my mind, so I'm here to ask for help. Please help me!
Jun 03, 2022 06:51 AM
Note that an Airtable formula will realistically only be able to get the first url. There are workarounds to find multiple urls, but the limit would be hardcoded and the formula becomes insanely complex after a couple of urls.
You are better off using scripting to do this. You would probably still need to use regular expressions, but you would use JavaScript regular expressions instead of RE2. You can Google regular expressions that match urls.
Jun 03, 2022 01:55 PM
Yeah, RegEx needs a PR firm. 😉
Here’s a script block that’ll do it.
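// Heading shown in the scripting block's output panel
output.markdown('# URLify');
let text = "Find me at http://www.example.com and also at http://stackoverflow.com";
// Match every run of non-whitespace characters that starts with http:// or https://
let matches = text.match(/\bhttps?:\/\/\S+/gi) || []; // match() returns null when nothing is found
console.log(matches);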
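One thing to watch with text like the example in the original question: when a link sits inside parentheses or at the end of a sentence, \S+ can pick up the trailing punctuation too. Here is a minimal sketch of one way to handle that, assuming it is acceptable to trim common trailing punctuation. The helper name extractUrls and the sample string are made up for illustration and are not part of the Airtable API.

```javascript
// Hypothetical helper (not part of the Airtable scripting API).
function extractUrls(text) {
  // match() returns null when nothing matches, so fall back to an empty array.
  const matches = (text || "").match(/\bhttps?:\/\/\S+/gi) || [];
  // Trim punctuation that often trails a URL in prose, such as ")" or ".".
  return matches.map((url) => url.replace(/[)\].,;:!?'"]+$/, ""));
}

// Sample text modelled on the question, plus a link that ends a sentence.
const sample =
  "Some text (https://www.example.com Example Link) some text, see also http://stackoverflow.com.";
console.log(extractUrls(sample));
```

The trade-off is that a URL that legitimately ends in one of those characters (a closing parenthesis, for example) would get clipped, so adjust the trim list to suit your data. In a scripting block you would presumably call this once per record, passing in the text of your blog field.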
Jun 04, 2022 02:43 AM
Similarly, if you want a “lower code” way of extracting all the URLs from a string of text — and yes, I realize the extreme irony of calling anything involving REGEX low-code — you can use the text parsing module of Make.com.
And you could use the following REGEX code that I discovered, which will parse all the links that start with the prefix https://, http://, or www.
(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})
This is how it would actually look in your Make.com scenario:
This will return an array of links, which you can then use however you like in Airtable (or any other app or service).
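If you would rather try that same pattern in a script instead of Make.com, here is a rough sketch in JavaScript. The only change is the g flag, added so that match() returns every hit rather than just the first; the sample text and variable names are assumptions for illustration.

```javascript
// The pattern from the post above, with the global flag added.
const urlPattern = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})/g;

// Made-up sample text covering all three prefixes.
const text =
  "Visit https://www.example.com or www.airtable.com and http://stackoverflow.com today";

// match() returns null when nothing matches, so fall back to an empty array.
const links = text.match(urlPattern) || [];
console.log(links);
```

Unlike the shorter pattern earlier in the thread, this one also picks up bare www. links. It still ends each match at the next whitespace character, so punctuation directly after a link would be included in the result.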