Skip to main content

@Sho answered my first URL audit Find formula question

My followup question/challenge:

I want to clean up (not just identify) the {Link URL} field and join it with {reftag} field:

Add

  • missing https://
  • and/or missing "s" in http://

Remove

  • invalid characters; e.g., ' §
  • the following querystrings:
    • ?ref= 
    • ?ref_=
    • &ref=
    • &ref_=

Then join two fields ({Link URL} & {reftag}) based on logic:

  • If {Link URL} Does NOT contain '?' append {Link URL} with '?ref_=' & {reftag} field
  • If {Link URL} contains '?' append {Link URL} with '&ref_=' & {reftag} field

Here is the original solution used to callout errors in the {Link URL} field:

 

IF(

{Link URL},

IF(

OR(

FIND("http://", {Link URL}),

NOT(FIND("https://", {Link URL}))

),

'missing https, '

) &

IF(

FIND("http://", {Link URL}),

'missing https, '

) &

IF(

FIND("ref=", {Link URL}),

'remove ref=, '

) &

IF(

FIND("ref_=", {Link URL}),

'remove ref_=, '

) &

IF(

FIND("'", {Link URL}),

'remove apostrophe, '

) &

IF(

FIND("§", {Link URL}),

'remove §, '

)

)

 

 

 

 

This would be easier to do with a script. If you use a formula though, here are some functions you'll want to look at to get construct a method for each case:

  • SEARCH()
  • SUBSTITUTE()
  • LEFT()
  • RIGHT()
  • MID()
  • LEN()

I really suggest using a script though. This formula is going to look EXTREMELY messy as you'll have to repeat a few steps multiple times.


Hey @airballer86 Great question. Like @Kenneth_Raghuna said I'd recommend an automation here. I recorded a quick Loom Video to show you step by step how I'd set it up. 

*** Remember to change all my Table / Field / Variable names to match yours or this won't work for you ***

Sample Table Setup And Desired Output:

 

Automation Trigger Setup:

 

Script Input Variables:

 

 Automation Script:

var urlTable = base.getTable("URL Markdown") // change this table name to match yours



var inputConfig = input.config();

var recordID = inputConfig.recordID;

var originalURL = inputConfig.originalURL;

var refTag = inputConfig.refTag;



function cleanUrl(linkUrl, reftag) {

// Add missing 'https://' or 'http://'

if (!linkUrl.startsWith('http')) {

linkUrl = 'https://' + linkUrl;

}



// Add missing 's' in 'http://'

linkUrl = linkUrl.replace('http://', 'https://');



// Remove invalid characters

linkUrl = linkUrl.replace(/(\'§]/g, '');



// Remove everything after '?ref='

const index1 = linkUrl.indexOf('?ref=');

if (index1 !== -1) {

linkUrl = linkUrl.slice(0, index1 + 5);

}



// Remove everything after '?ref_='

const index3 = linkUrl.indexOf('?ref_=');

if (index3 !== -1) {

linkUrl = linkUrl.slice(0, index3 + 6);

}



// Remove specified query strings

const queryStringsToRemove = '?ref=', '?ref_=', '&ref=', '&ref_='];

for (const qs of queryStringsToRemove) {

linkUrl = linkUrl.replace(qs, '');

}







// Join fields based on logic

if (refTag != null) {

if (!linkUrl.includes('?')) {

linkUrl += `?ref_=${reftag}`;

} else {

linkUrl += `&ref_=${reftag}`;

}

}



return linkUrl;

}



// Example usage

// const linkUrl = 'example.com/page?param=value&ref_=123&other=456';

// const reftag = '456';



const cleanedUrl = cleanUrl(originalURL, refTag);

console.log(cleanedUrl);



// Output Cleaned URL

var updates = {

"id": recordID,

fields: {

"Cleaned URL": cleanedUrl

}

}]

console.log(updates)



await urlTable.updateRecordsAsync(updates);

Let me know if this works for you!


Hey @airballer86 Great question. Like @Kenneth_Raghuna said I'd recommend an automation here. I recorded a quick Loom Video to show you step by step how I'd set it up. 

*** Remember to change all my Table / Field / Variable names to match yours or this won't work for you ***

Sample Table Setup And Desired Output:

 

Automation Trigger Setup:

 

Script Input Variables:

 

 Automation Script:

var urlTable = base.getTable("URL Markdown") // change this table name to match yours



var inputConfig = input.config();

var recordID = inputConfig.recordID;

var originalURL = inputConfig.originalURL;

var refTag = inputConfig.refTag;



function cleanUrl(linkUrl, reftag) {

// Add missing 'https://' or 'http://'

if (!linkUrl.startsWith('http')) {

linkUrl = 'https://' + linkUrl;

}



// Add missing 's' in 'http://'

linkUrl = linkUrl.replace('http://', 'https://');



// Remove invalid characters

linkUrl = linkUrl.replace(/(\'§]/g, '');



// Remove everything after '?ref='

const index1 = linkUrl.indexOf('?ref=');

if (index1 !== -1) {

linkUrl = linkUrl.slice(0, index1 + 5);

}



// Remove everything after '?ref_='

const index3 = linkUrl.indexOf('?ref_=');

if (index3 !== -1) {

linkUrl = linkUrl.slice(0, index3 + 6);

}



// Remove specified query strings

const queryStringsToRemove = '?ref=', '?ref_=', '&ref=', '&ref_='];

for (const qs of queryStringsToRemove) {

linkUrl = linkUrl.replace(qs, '');

}







// Join fields based on logic

if (refTag != null) {

if (!linkUrl.includes('?')) {

linkUrl += `?ref_=${reftag}`;

} else {

linkUrl += `&ref_=${reftag}`;

}

}



return linkUrl;

}



// Example usage

// const linkUrl = 'example.com/page?param=value&ref_=123&other=456';

// const reftag = '456';



const cleanedUrl = cleanUrl(originalURL, refTag);

console.log(cleanedUrl);



// Output Cleaned URL

var updates = {

"id": recordID,

fields: {

"Cleaned URL": cleanedUrl

}

}]

console.log(updates)



await urlTable.updateRecordsAsync(updates);

Let me know if this works for you!


@Arthur_Tutt , Thank you for the prompt and amazing response! I am putting this in place now. I left you one follow up question on the Loom video:

  • for the cleanUrl functions
    • can the " // Remove everything after '?ref=' " be modified to remove everything until next ?/&
    • and can you add logic to look for ?ref= and &ref= 

I'm beyond thankful and impressed!

I did run the script as is (updating the fields as indicated) and getting the following errors:

ERROR

TypeError: Cannot read properties of undefined (reading 'startsWith')

at cleanUrl on line 10

at main on line 56

Hey @airballer86 loved the video follow up, that's the first one ever received! 😆

Alrighty, so let's do a few examples so I'm clear on what you're looking for:

example.com?apple&ref=square&orange  would become:

https://example.com?apple&orange 

and example.com?ref_=rectangle&ref_=square&apple&ref=triangle&orange would become:

https://example.com?apple&orange 

So essentially you're just trying to eliminate anything directly related to a 'ref' (in it's various formats) but keep the other query parameters (and then add your ref tags if applicable after). Do I have that correct?

If not share a few example cases so I'm clear. Thanks! 


Hey @airballer86 loved the video follow up, that's the first one ever received! 😆

Alrighty, so let's do a few examples so I'm clear on what you're looking for:

example.com?apple&ref=square&orange  would become:

https://example.com?apple&orange 

and example.com?ref_=rectangle&ref_=square&apple&ref=triangle&orange would become:

https://example.com?apple&orange 

So essentially you're just trying to eliminate anything directly related to a 'ref' (in it's various formats) but keep the other query parameters (and then add your ref tags if applicable after). Do I have that correct?

If not share a few example cases so I'm clear. Thanks! 


@Arthur_Tutt Your examples are correct. Here are two more:

https://www.amazon.com/gcx/Customers'-Most-Loved-gifts/gfhz/events/?ref=trackthecampaign&canBeEGifted=false&canBeGiftWrapped=false

https://www.amazon.com/gcx/Customers'-Most-Loved-gifts/gfhz/events/?canBeEGifted=false&ref=trackthecampaign&canBeGiftWrapped=false&getItByTomorrow=false


@airballer86 tonight's your lucky night - was able to work some magic and further refine the script to work to your examples. Here's the sample base and expected output:

And the script (note: all input variables are the same as before)

var urlTable = base.getTable("URL Markdown")



var inputConfig = input.config();

var recordID = inputConfig.recordID;

var originalURL = inputConfig.originalURL;

var refTag = inputConfig.refTag;



function cleanUrl(linkUrl, reftag) {

// Add missing 'https://' or 'http://'

if (!linkUrl.startsWith('http')) {

linkUrl = 'https://' + linkUrl;

}



// Add missing 's' in 'http://'

linkUrl = linkUrl.replace('http://', 'https://');



// Remove invalid characters

linkUrl = linkUrl.replace(/[\'§]/g, '');



// Split the linkUrl by '&'

const parts1 = linkUrl.split('?');

console.log(parts1)



if (parts111] == null) {

let cleanedUrlComplete = parts110].toString()



if (refTag != null) {

if (!cleanedUrlComplete.includes('?')) {

cleanedUrlComplete += `?ref_=${reftag}`;

} else {

cleanedUrlComplete += `&ref_=${reftag}`;

}

}

return cleanedUrlComplete;

}



if (parts111] != null) {

const parts2 = parts111].split('&');

console.log(parts2)



// Filter out parameters starting with 'ref_=' or 'ref='

const cleanedParts1 = parts110]].filter(part => !/^ref(_|=)/.test(part));

console.log(cleanedParts1)

const cleanedParts2 = parts2.filter(part => !/^ref(_|=)/.test(part));

console.log(cleanedParts2)

let cleanedUrl2 = cleanedParts2.join('&');

console.log(cleanedUrl2)



// Join fields based on logic

let cleanedUrl1 = cleanedParts1.toString()

console.log(cleanedUrl1)



let cleanedUrl2Question;



if (cleanedParts2.length == 0) {

cleanedUrl2Question = cleanedUrl2

console.log(cleanedUrl2Question)

} else {

cleanedUrl2Question = `?${cleanedUrl2}`

console.log(cleanedUrl2Question)

}



let cleanedUrlComplete = cleanedUrl1 + cleanedUrl2Question

console.log(cleanedUrlComplete)



if (refTag != null) {

if (!cleanedUrlComplete.includes('?')) {

cleanedUrlComplete += `?ref_=${reftag}`;

} else {

cleanedUrlComplete += `&ref_=${reftag}`;

}

}

return cleanedUrlComplete;

}

}



// Example usage

// const linkUrl = 'example.com/page?param=value&ref_=123&other=456';

// const reftag = '456';



const cleanedUrl = cleanUrl(originalURL, refTag);

console.log(cleanedUrl);



// Output Cleaned URL

var updates = {

"id": recordID,

fields: {

"Cleaned URL": cleanedUrl

}

}]

// console.log(updates)



await urlTable.updateRecordsAsync(updates);

 

Note: this script could prob use some clean up / refinement, but test it out, it at least should be working! 


@airballer86 tonight's your lucky night - was able to work some magic and further refine the script to work to your examples. Here's the sample base and expected output:

And the script (note: all input variables are the same as before)

var urlTable = base.getTable("URL Markdown")



var inputConfig = input.config();

var recordID = inputConfig.recordID;

var originalURL = inputConfig.originalURL;

var refTag = inputConfig.refTag;



function cleanUrl(linkUrl, reftag) {

// Add missing 'https://' or 'http://'

if (!linkUrl.startsWith('http')) {

linkUrl = 'https://' + linkUrl;

}



// Add missing 's' in 'http://'

linkUrl = linkUrl.replace('http://', 'https://');



// Remove invalid characters

linkUrl = linkUrl.replace(/[\'§]/g, '');



// Split the linkUrl by '&'

const parts1 = linkUrl.split('?');

console.log(parts1)



if (parts111] == null) {

let cleanedUrlComplete = parts110].toString()



if (refTag != null) {

if (!cleanedUrlComplete.includes('?')) {

cleanedUrlComplete += `?ref_=${reftag}`;

} else {

cleanedUrlComplete += `&ref_=${reftag}`;

}

}

return cleanedUrlComplete;

}



if (parts111] != null) {

const parts2 = parts111].split('&');

console.log(parts2)



// Filter out parameters starting with 'ref_=' or 'ref='

const cleanedParts1 = parts110]].filter(part => !/^ref(_|=)/.test(part));

console.log(cleanedParts1)

const cleanedParts2 = parts2.filter(part => !/^ref(_|=)/.test(part));

console.log(cleanedParts2)

let cleanedUrl2 = cleanedParts2.join('&');

console.log(cleanedUrl2)



// Join fields based on logic

let cleanedUrl1 = cleanedParts1.toString()

console.log(cleanedUrl1)



let cleanedUrl2Question;



if (cleanedParts2.length == 0) {

cleanedUrl2Question = cleanedUrl2

console.log(cleanedUrl2Question)

} else {

cleanedUrl2Question = `?${cleanedUrl2}`

console.log(cleanedUrl2Question)

}



let cleanedUrlComplete = cleanedUrl1 + cleanedUrl2Question

console.log(cleanedUrlComplete)



if (refTag != null) {

if (!cleanedUrlComplete.includes('?')) {

cleanedUrlComplete += `?ref_=${reftag}`;

} else {

cleanedUrlComplete += `&ref_=${reftag}`;

}

}

return cleanedUrlComplete;

}

}



// Example usage

// const linkUrl = 'example.com/page?param=value&ref_=123&other=456';

// const reftag = '456';



const cleanedUrl = cleanUrl(originalURL, refTag);

console.log(cleanedUrl);



// Output Cleaned URL

var updates = {

"id": recordID,

fields: {

"Cleaned URL": cleanedUrl

}

}]

// console.log(updates)



await urlTable.updateRecordsAsync(updates);

 

Note: this script could prob use some clean up / refinement, but test it out, it at least should be working! 


@Arthur_Tutt I'm not sure if it's user error on my part but I'm getting the following error:

ERROR

TypeError: Cannot read properties of undefined (reading 'startsWith')

at cleanUrl on line 10

at main on line 81

@airballer86 can you paste in your entire script? I'm guessing some variable didn't get updated properly converting from my setup to yours. And also take a screenshot of how you setup your input variables (in the left panel)


@airballer86 can you paste in your entire script? I'm guessing some variable didn't get updated properly converting from my setup to yours. And also take a screenshot of how you setup your input variables (in the left panel)


@Arthur_Tutt here you go:

 

var urlTable = base.getTable("Link Checker")



var inputConfig = input.config();

var recordID = inputConfig.recordID;

var originalURL = inputConfig.originalURL;

var refTag = inputConfig.refTag;



function cleanUrl(linkUrl, reftag) {

// Add missing 'https://' or 'http://'

if (!linkUrl.startsWith('http')) {

linkUrl = 'https://' + linkUrl;

}



// Add missing 's' in 'http://'

linkUrl = linkUrl.replace('http://', 'https://');



// Remove invalid characters

linkUrl = linkUrl.replace(/[\'§]/g, '');



// Split the linkUrl by '&'

const parts1 = linkUrl.split('?');

console.log(parts1)



if (parts111] == null) {

let cleanedUrlComplete = parts110].toString()



if (refTag != null) {

if (!cleanedUrlComplete.includes('?')) {

cleanedUrlComplete += `?ref_=${reftag}`;

} else {

cleanedUrlComplete += `&ref_=${reftag}`;

}

}

return cleanedUrlComplete;

}



if (parts111] != null) {

const parts2 = parts111].split('&');

console.log(parts2)



// Filter out parameters starting with 'ref_=' or 'ref='

const cleanedParts1 = parts110]].filter(part => !/^ref(_|=)/.test(part));

console.log(cleanedParts1)

const cleanedParts2 = parts2.filter(part => !/^ref(_|=)/.test(part));

console.log(cleanedParts2)

let cleanedUrl2 = cleanedParts2.join('&');

console.log(cleanedUrl2)



// Join fields based on logic

let cleanedUrl1 = cleanedParts1.toString()

console.log(cleanedUrl1)



let cleanedUrl2Question;



if (cleanedParts2.length == 0) {

cleanedUrl2Question = cleanedUrl2

console.log(cleanedUrl2Question)

} else {

cleanedUrl2Question = `?${cleanedUrl2}`

console.log(cleanedUrl2Question)

}



let cleanedUrlComplete = cleanedUrl1 + cleanedUrl2Question

console.log(cleanedUrlComplete)



if (refTag != null) {

if (!cleanedUrlComplete.includes('?')) {

cleanedUrlComplete += `?ref_=${reftag}`;

} else {

cleanedUrlComplete += `&ref_=${reftag}`;

}

}

return cleanedUrlComplete;

}

}



// Example usage

// const linkUrl = 'example.com/page?param=value&ref_=123&other=456';

// const reftag = '456';



const cleanedUrl = cleanUrl(originalURL, refTag);

console.log(cleanedUrl);



// Output Cleaned URL

var updates = {

"id": recordID,

fields: {

"Cleaned URL": cleanedUrl

}

}]

// console.log(updates)



await urlTable.updateRecordsAsync(updates);

 

and a screen shot of the variables

 


@airballer86 Couple ideas to try 

https://www.loom.com/share/58320af793454079999b5ed14a2e3704

 

If that's still not working can you screenshot your table? Need to check all field names got converted properly


@airballer86 Couple ideas to try 

https://www.loom.com/share/58320af793454079999b5ed14a2e3704

 

If that's still not working can you screenshot your table? Need to check all field names got converted properly


@Arthur_Tutt It's working now! User error on my part; I did not capitalize the RL in the variable name originalURL. Thank you so much for your solution, and introducing me to Loom! I do wish there was a way to give your solution a SuperKudos!


Reply