Skip to main content
Solved

Removing the tail end from URLs

  • October 7, 2022
  • 10 replies
  • 81 views

Hey AT Community!

Is there a way to remove the tail ends from formulas in bulk?

Example-

I have 3 urls in 3 different rows:

https://google.com/index?test
https://yahoo.com/example?test
https://apple.com/ending?test

Using a formula (or something else) I’d like to get rid of the tails so that I am left with:

https://google.com/
https://yahoo.com/
https://apple.com/

I feel like this should be possible?

Appreciate any pointers!

Best answer by kuovonne

I was thinking -

REGEX_REPLACE({URL}, "^https?:\/\/[^\/]+", "")

What about

IF(
    REGEX_MATCH({URL}, "https?:\/\/[^\/]+"),
    REGEX_EXTRACT({URL}, "https?:\/\/[^\/]+")
)

10 replies

TheTimeSavingCo
Forum|alt.badge.img+31

Hey John, try:

REGEX_REPLACE(
  {Text},
  '\\?(.*)',
  ''
)

Justin_Barrett
Forum|alt.badge.img+21

Hey John, try:

REGEX_REPLACE(
  {Text},
  '\\?(.*)',
  ''
)

@Adam_TheTimeSavingCo Your example removes everything starting with the question mark, but John was asking about taking away everything after the last forward slash. Here’s what I came up with:

IF({URL}, REGEX_REPLACE({URL}, "[^/]*$", ""))

This focuses on the end of the string and strips away everything that’s not a forward slash.


  • Author
  • Participating Frequently
  • October 7, 2022

@Justin_Barrett Thanks so much! This is definitely heading in the right direction. This works really great for most of my links.

I should have been more clear, because I have a bunch of different link formats with different amounts of /'s.

For example:

https://google.com/index/test
https://yahoo.com/example/test/e231/

It seems like really, the common denominator is that I want to keep everything to the left of the 3rd /.

Is there a way to modify the formula to approach it from that side?


Forum|alt.badge.img+19
  • Inspiring
  • October 7, 2022

@Justin_Barrett Thanks so much! This is definitely heading in the right direction. This works really great for most of my links.

I should have been more clear, because I have a bunch of different link formats with different amounts of /'s.

For example:

https://google.com/index/test
https://yahoo.com/example/test/e231/

It seems like really, the common denominator is that I want to keep everything to the left of the 3rd /.

Is there a way to modify the formula to approach it from that side?


I was thinking -

REGEX_REPLACE({URL}, "^https?:\/\/[^\/]+", "")

kuovonne
Forum|alt.badge.img+29
  • Brainy
  • October 7, 2022

@Justin_Barrett Thanks so much! This is definitely heading in the right direction. This works really great for most of my links.

I should have been more clear, because I have a bunch of different link formats with different amounts of /'s.

For example:

https://google.com/index/test
https://yahoo.com/example/test/e231/

It seems like really, the common denominator is that I want to keep everything to the left of the 3rd /.

Is there a way to modify the formula to approach it from that side?


Having a complete set of test data that covers all use cases is always important. When using REGEX, a complete set of test data is absolutely essential!


  • Author
  • Participating Frequently
  • October 7, 2022

I was thinking -

REGEX_REPLACE({URL}, "^https?:\/\/[^\/]+", "")

@Bill.French Interesting, thanks for sharing!

This formula is giving me everything after the 3rd “/”. What I’m looking for is essentially the opposite, everything before (can include) the 3rd “/”.


kuovonne
Forum|alt.badge.img+29
  • Brainy
  • Answer
  • October 7, 2022

I was thinking -

REGEX_REPLACE({URL}, "^https?:\/\/[^\/]+", "")

What about

IF(
    REGEX_MATCH({URL}, "https?:\/\/[^\/]+"),
    REGEX_EXTRACT({URL}, "https?:\/\/[^\/]+")
)

  • Author
  • Participating Frequently
  • October 7, 2022

Having a complete set of test data that covers all use cases is always important. When using REGEX, a complete set of test data is absolutely essential!


@kuovonne appreciate the feedback! The data is confidential at this point so I’m unable to share it in bulk.


  • Author
  • Participating Frequently
  • October 7, 2022

What about

IF(
    REGEX_MATCH({URL}, "https?:\/\/[^\/]+"),
    REGEX_EXTRACT({URL}, "https?:\/\/[^\/]+")
)

You nailed it! Thanks so much @kuovonne. This is exactly what I needed :slightly_smiling_face:


Forum|alt.badge.img+19
  • Inspiring
  • October 7, 2022

@kuovonne appreciate the feedback! The data is confidential at this point so I’m unable to share it in bulk.


It doesn’t have to be the actual data, but should represent the universe of possible patterns.