Help

The Community will be temporarily unavailable starting on Friday February 28. We’ll be back as soon as we can! To learn more, check out our Announcements blog post.

Root domain formula

Solved
Jump to Solution
9247 15
cancel
Showing results for 
Search instead for 
Did you mean: 
Jen_Baguhin
5 - Automation Enthusiast
5 - Automation Enthusiast

Is there any way to change different versions of domains to its root domains?

example:

If we add

Column 1
https://www.domain.com/
https://domain.com/
https://www.domain.com
https://domain.com
http://www.domain.com/
http://domain.com/
http://www.domain.com
http://domain.com

Column 2 Result in all rows of above
domain.com

1 Solution

Accepted Solutions
Joachim_Brindea
6 - Interface Innovator
6 - Interface Innovator

This removes subdomains, folders, etc and doesn't output anything if there is no url.

IF({url}="","",REGEX_EXTRACT( {url}, '^(?:https?:\\/\\/)?(?:[^@\n]+@)?(?:www\\.)?(?:[^.]+\\.)?([^:\\/\n?]+\\.[^:\\/\n?]+)'))

  

See Solution in Thread

15 Replies 15
Helmi
6 - Interface Innovator
6 - Interface Innovator

I’m not a big formula pro yet but I tried it out and it seems to work:

MID(
  URL, 
  IF(
    SEARCH('//www.', URL),
    SEARCH('//www.', URL) + 6,
    IF(
      SEARCH('//', URL),
      SEARCH('//', URL) + 2,
      IF(
        SEARCH('www.', URL),
        SEARCH('www.', URL) + 4,
        1
      )
    )
  ),
  LEN(URL)
)

here’s what it looks for me when testing:

image.png

I’m sure there are more efficient ways to do this and there might be some things left over that you don’t want but maybe a good starter still.

To add one little bit to @Helmi’s solution, this will also catch the ‘/’ at the end if it’s there:

MID(
  URL, 
  IF(
    SEARCH('//www.', URL),
    SEARCH('//www.', URL) + 6,
    IF(
      SEARCH('//', URL),
      SEARCH('//', URL) + 2,
      IF(
        SEARCH('www.', URL),
        SEARCH('www.', URL) + 4,
        1
      )
    )
  ),
  IF(
    SEARCH('com/', URL),
    LEN(URL) - 
    IF(
      SEARCH('//www.', URL),
      SEARCH('//www.', URL) + 6,
      IF(
        SEARCH('//', URL),
        SEARCH('//', URL) + 2,
        IF(
          SEARCH('www.', URL),
          SEARCH('www.', URL) + 4,
          1
        )
      )
    ),
    LEN(URL)
  )
)

hmm but then only if it’s .com, right? I’d better not rely on that. But good idea trying to catch that / - I was also thinking about trying to remove paths in the URL

Jen_Baguhin
5 - Automation Enthusiast
5 - Automation Enthusiast

WORKED! :slightly_smiling_face: thank you!

Kane_Jamison
4 - Data Explorer
4 - Data Explorer

Here’s what we use to turn raw urls into domains (retaining subdomain if there is one):

LEFT(SUBSTITUTE(SUBSTITUTE({URL cell},“http://”,""),“https://”,""),FIND("/",SUBSTITUTE(SUBSTITUTE({URL cell},“http://”,""),“https://”,""))-1)

This removes http(s)://, then looks for the first forward slash, and deletes anything after that.

It could fail if you have a slash-less URL such as domain:8080 or something like a parameter occurring without a trailing slash on the TLD.

Brad_Siefert
4 - Data Explorer
4 - Data Explorer

I came up with a much more robust, but also sorta dumb formula because I couldn’t get @Kane_Jamison’s solution to work.

IF(
  LEFT(
    SUBSTITUTE(
      SUBSTITUTE(  
        SUBSTITUTE({URL},"https://",""),"http://",""
      ),"www.",""
    ),
    FIND("/",
      SUBSTITUTE(
        SUBSTITUTE(  
          SUBSTITUTE({URL},"https://",""),"http://",""
        ),"www.",""
      )
    )-1
  ),
  LEFT(
    SUBSTITUTE(
      SUBSTITUTE(  
        SUBSTITUTE({URL},"https://",""),"http://",""
      ),"www.",""
    ),
    FIND("/",
      SUBSTITUTE(
        SUBSTITUTE(  
          SUBSTITUTE({URL},"https://",""),"http://",""
        ),"www.",""
      )
    )-1
  ),
  SUBSTITUTE(
    SUBSTITUTE(  
      SUBSTITUTE({URL},"https://",""),"http://",""
    ),"www.",""
  )
)
Yonathan_Cohen
5 - Automation Enthusiast
5 - Automation Enthusiast

My formula works perfectly, use it without moderation, just REPLACE “main domain name” by the URL_column_name that you want to clear:

LEFT(
  SUBSTITUTE(
    SUBSTITUTE(
      SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""),
  IF(
    FIND("/",
      SUBSTITUTE(
        SUBSTITUTE(
          SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""))=0, 
    LEN(
      SUBSTITUTE(
        SUBSTITUTE(
          SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", "")), 
    FIND("/",
      SUBSTITUTE(
        SUBSTITUTE(
          SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""))-1))

airtable formula to extract domain name from URL airtable

This is a bit more advanced, it will also remove the first part of such an URL: en.website.com
For example:
=>https://www.en.orchestrasales.com/home/view/
will be turned into “orchestrasales.com

LEFT(
  SUBSTITUTE(
    SUBSTITUTE(
      SUBSTITUTE(
          SUBSTITUTE(
              SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""), "fr.", ""), "en.", ""),
  IF(
    FIND("/",
      SUBSTITUTE(
        SUBSTITUTE(
          SUBSTITUTE(
              SUBSTITUTE(
                  SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""), "fr.", ""), "en.", ""))=0, 
    LEN(
      SUBSTITUTE(
        SUBSTITUTE(
          SUBSTITUTE(
              SUBSTITUTE(
                  SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""), "fr.", ""), "en.", "")), 
    FIND("/",
      SUBSTITUTE(
        SUBSTITUTE(
          SUBSTITUTE(
              SUBSTITUTE(
                  SUBSTITUTE({main domain name}, "http://",""), "https://",""), "www.", ""), "fr.", ""), "en.", ""))-1))

Hi @Jeremy_Oglesby I don’t suppose I could ask you to add to this?

I have multiple top level domains in my data set e.g .com .org .ie etc etc

How can I modify the tld seach string below to pick up multiple tld values?

IF(
SEARCH(‘com/’, URL),
LEN(URL) -

Would really appreciate any help

Thanks,

Brian