Skip to main content
Solved

Data cleaning formula


Hi,

I am trying to clean some of my data but I’m not sure which formula to use.

I have three columns (URL, Date, Data).

I need to extract the strings from the data column, the date and the URL (which are always different).

In the Date column, I have LEFT(Data,10) which returns the date perfectly.

In my URL column, I have MID(Data,11,256) which returns the URL. I’ve set up a count of 256 characters to be sure I get all of it but I’m sure there is a more elegant way to do it.

Now the main problem is that sometimes I get 2 identical URL in the data column but I only need one of them.

So how can I be sure to always get the first URL and discard the second one if I get a duplicate one.

Many thanks in advance!

Best answer by _Pierre

kuovonne wrote:

Do you know if there will always be a space after the first url if there are multiple urls? Can you always discard the seond url?

MID(
  {Data},
  11,
  IF(
    FIND(" ", {Data}, 12),
    FIND(" ", {Data}, 12)-11,
    LEN({Data})
  )
)

Thank you @kuovonne! It is working perfectly now :slightly_smiling_face:

I forgot to mention it was not a space between the URLs but a new line so I modified your formula slightly. Also I wrapped it in a trim function because I was having a empty line before the URL.

   TRIM(
     MID(
       {Data},
       11,
       IF(
         FIND("\n", {Data}, 12),
         FIND("\n", {Data}, 12)-11,
         LEN({Data})
       )
     )
    )
View original
Did this topic help you find an answer to your question?

2 replies

kuovonne
Forum|alt.badge.img+27
  • Brainy
  • 6001 replies
  • October 26, 2020

Do you know if there will always be a space after the first url if there are multiple urls? Can you always discard the seond url?

MID(
  {Data},
  11,
  IF(
    FIND(" ", {Data}, 12),
    FIND(" ", {Data}, 12)-11,
    LEN({Data})
  )
)

  • Author
  • New Participant
  • 1 reply
  • Answer
  • October 26, 2020
kuovonne wrote:

Do you know if there will always be a space after the first url if there are multiple urls? Can you always discard the seond url?

MID(
  {Data},
  11,
  IF(
    FIND(" ", {Data}, 12),
    FIND(" ", {Data}, 12)-11,
    LEN({Data})
  )
)

Thank you @kuovonne! It is working perfectly now :slightly_smiling_face:

I forgot to mention it was not a space between the URLs but a new line so I modified your formula slightly. Also I wrapped it in a trim function because I was having a empty line before the URL.

   TRIM(
     MID(
       {Data},
       11,
       IF(
         FIND("\n", {Data}, 12),
         FIND("\n", {Data}, 12)-11,
         LEN({Data})
       )
     )
    )

Reply