data:image/s3,"s3://crabby-images/ea854/ea85446765e35028c55613504ad46ff6e2043c41" alt="Jen_Baguhin Jen_Baguhin"
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Nov 16, 2018 06:15 AM
Is there any way to change different versions of domains to its root domains?
example:
If we add
Column 1
https://www.domain.com/
https://domain.com/
https://www.domain.com
https://domain.com
http://www.domain.com/
http://domain.com/
http://www.domain.com
http://domain.com
Column 2 Result in all rows of above
domain.com
Solved! Go to Solution.
Accepted Solutions
data:image/s3,"s3://crabby-images/eb783/eb7836c06f693bce0956bbddc8ca57eb20516abb" alt="Joachim_Brindea Joachim_Brindea"
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Mar 16, 2023 04:59 AM
This removes subdomains, folders, etc and doesn't output anything if there is no url.
IF({url}="","",REGEX_EXTRACT( {url}, '^(?:https?:\\/\\/)?(?:[^@\n]+@)?(?:www\\.)?(?:[^.]+\\.)?([^:\\/\n?]+\\.[^:\\/\n?]+)'))
data:image/s3,"s3://crabby-images/d47e5/d47e52c11531ff4aac91a3d49bc4d3128b8be61d" alt="Helmi Helmi"
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Nov 16, 2018 09:35 AM
I’m not a big formula pro yet but I tried it out and it seems to work:
MID(
URL,
IF(
SEARCH('//www.', URL),
SEARCH('//www.', URL) + 6,
IF(
SEARCH('//', URL),
SEARCH('//', URL) + 2,
IF(
SEARCH('www.', URL),
SEARCH('www.', URL) + 4,
1
)
)
),
LEN(URL)
)
here’s what it looks for me when testing:
I’m sure there are more efficient ways to do this and there might be some things left over that you don’t want but maybe a good starter still.
data:image/s3,"s3://crabby-images/26433/26433101725e1bd577b291d9c79d4b063049f39c" alt="Jeremy_Oglesby Jeremy_Oglesby"
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Nov 16, 2018 09:51 AM
To add one little bit to @Helmi’s solution, this will also catch the ‘/’ at the end if it’s there:
MID(
URL,
IF(
SEARCH('//www.', URL),
SEARCH('//www.', URL) + 6,
IF(
SEARCH('//', URL),
SEARCH('//', URL) + 2,
IF(
SEARCH('www.', URL),
SEARCH('www.', URL) + 4,
1
)
)
),
IF(
SEARCH('com/', URL),
LEN(URL) -
IF(
SEARCH('//www.', URL),
SEARCH('//www.', URL) + 6,
IF(
SEARCH('//', URL),
SEARCH('//', URL) + 2,
IF(
SEARCH('www.', URL),
SEARCH('www.', URL) + 4,
1
)
)
),
LEN(URL)
)
)
data:image/s3,"s3://crabby-images/d47e5/d47e52c11531ff4aac91a3d49bc4d3128b8be61d" alt="Helmi Helmi"
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Nov 17, 2018 03:04 AM
hmm but then only if it’s .com, right? I’d better not rely on that. But good idea trying to catch that / - I was also thinking about trying to remove paths in the URL
data:image/s3,"s3://crabby-images/ea854/ea85446765e35028c55613504ad46ff6e2043c41" alt="Jen_Baguhin Jen_Baguhin"
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Nov 17, 2018 05:53 AM
WORKED! :slightly_smiling_face: thank you!
data:image/s3,"s3://crabby-images/63a47/63a47d7f87e32b84f585c8c6499d6ae3d7d49d5b" alt="Kane_Jamison Kane_Jamison"
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Aug 19, 2019 11:06 AM
Here’s what we use to turn raw urls into domains (retaining subdomain if there is one):
LEFT(SUBSTITUTE(SUBSTITUTE({URL cell},“http://”,""),“https://”,""),FIND("/",SUBSTITUTE(SUBSTITUTE({URL cell},“http://”,""),“https://”,""))-1)
This removes http(s)://, then looks for the first forward slash, and deletes anything after that.
It could fail if you have a slash-less URL such as domain:8080 or something like a parameter occurring without a trailing slash on the TLD.
data:image/s3,"s3://crabby-images/3d308/3d308110466b9276d5e2124e2faba7b8e2e64355" alt="Brad_Siefert Brad_Siefert"
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Jun 19, 2020 12:41 PM
I came up with a much more robust, but also sorta dumb formula because I couldn’t get @Kane_Jamison’s solution to work.
IF(
LEFT(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({URL},"https://",""),"http://",""
),"www.",""
),
FIND("/",
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({URL},"https://",""),"http://",""
),"www.",""
)
)-1
),
LEFT(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({URL},"https://",""),"http://",""
),"www.",""
),
FIND("/",
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({URL},"https://",""),"http://",""
),"www.",""
)
)-1
),
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({URL},"https://",""),"http://",""
),"www.",""
)
)
data:image/s3,"s3://crabby-images/16a00/16a00607bb48a6e328782d70be6715bafa1393ea" alt="Yonathan_Cohen Yonathan_Cohen"
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Aug 25, 2020 01:08 AM
My formula works perfectly, use it without moderation, just REPLACE “main domain name” by the URL_column_name that you want to clear:
LEFT(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""),
IF(
FIND("/",
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""))=0,
LEN(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", "")),
FIND("/",
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""))-1))
airtable formula to extract domain name from URL airtable
data:image/s3,"s3://crabby-images/16a00/16a00607bb48a6e328782d70be6715bafa1393ea" alt="Yonathan_Cohen Yonathan_Cohen"
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Jul 27, 2021 01:30 AM
This is a bit more advanced, it will also remove the first part of such an URL: en.website.com
For example:
=>https://www.en.orchestrasales.com/home/view/
will be turned into “orchestrasales.com”
LEFT(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""), "fr.", ""), "en.", ""),
IF(
FIND("/",
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""), "fr.", ""), "en.", ""))=0,
LEN(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""), "fr.", ""), "en.", "")),
FIND("/",
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""), "fr.", ""), "en.", ""))-1))
- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
May 16, 2022 07:38 AM
Hi @Jeremy_Oglesby I don’t suppose I could ask you to add to this?
I have multiple top level domains in my data set e.g .com .org .ie etc etc
How can I modify the tld seach string below to pick up multiple tld values?
IF(
SEARCH(‘com/’, URL),
LEN(URL) -
Would really appreciate any help
Thanks,
Brian
data:image/s3,"s3://crabby-images/fd698/fd698713d0baac0a63656a05635b6caad01a7886" alt=""