Hi guys,
First of all, I’m happy to be here and I would love to get your help with a method I want to develop.
I’ll try to explain myself as clearly as I can so you can understand my needs and help.
We’re using LinkedIn search URLs on a daily basis. Most of them contain many strings and terms in order to filter our target audience. For example:
*(Lead OR Architect OR Director OR SVP OR EVP OR Vice OR Chief OR VP OR Head) AND Data AND (Engineering OR “Application Development” OR “Software Development”) AND NOT (Expert OR Technician OR Support OR Maintenance OR QA OR quality OR Customer OR Specialist OR student *
And here’s part of one such URL, for example:
companyIncluded=NOT%2520(college%2520OR%2520university%2520OR%2520school%2520OR%2520HP%2520OR%2520%2522Hewlett%2520Packard%2522%2520OR%2520Samsung%2520OR%2520Nvidia%2520OR%2520Paypal%2520OR%2520google%2520OR%2520BAE)&companySize=F%2CG%2CH%2CI&companyTimeScope=CURRENT&doFetchHeroCard=false&geoIncluded=103644278%2C100506914%2C101165590%2C101174742%2C101452733%2C102454443%2C103291313%2C104738515%2C103323778%2C105146118%2C100446943%2C105490917&industryExcluded=71%2C75%2C77%2C96&keywords=(KYC%20OR%20AML%20OR%20%22authentication%22%20OR%20Fraud%20OR%20Identity%20OR%20passport%20OR%20%22customer%20experience%22%20OR%20%22Compliance%20and%20Assurance%22%20OR%20compliancence%20OR%20Onboarding%20OR%20Identification%20OR%20Journey%20OR%20%22call%20center%22%20OR%20%22contact%20center%22%20OR%20KYC%20OR%20PI%20OR%20%22Personal%20Information%22%20OR%20%22personal%20data%22%20OR%20%22data%20governance%22)%20NOT%20(Intern%20OR%20Student)&listExcluded=all&listType=ACCOUNT&logHistory=true&page=1&rsLogId=529923394&searchSessionId=LDepG29UTV6p3xAdg0C5mg%3D%3D&seniorityIncluded=6%2C4%2C7%2C5%2C8&titleIncluded=(Risk%2520OR%2520Credit%2520OR%2520Prevention%2520OR%2520%2522Compliance%2520and%2520Assurance%2522)%2520AND%2520(Fraud%2520OR%2520Identity%2520OR%2520Authentication%2520OR%2520Risk%2520OR%2520%2522Digital%2520Identity%2522)%2520AND%2520NOT%2520(professor%2520OR%2520office%2520OR%2520Sales%2520OR%2520owner%2520OR%2520software%2520OR%2520Consultant%2520OR%2520Adviser%2520OR%2520Consulting%2520OR%2520Board%2520OR%2520professor%2520OR%2520intern%2520OR%2520assistant%2520OR%2520junior%2520OR%2520JR%2520OR%2520office%2520OR%2520founder%2520OR%2520%2522co-founder%2522%2520OR%2520owner%2520OR%2520extern%2520OR%2520graduate%2520OR%2520undergrad%2520OR%2520contractor%2520OR%2520%2522Chief%2520Executive%2522%2520OR%2520Associate%2520OR%2520advisor%2520OR%2520entry%2520OR%2520journalist%2520OR%2520writer%2520OR%2520secretary%2520OR%2520trainee%2520OR%2520volunteer%2520OR%2520volunteering%252
0OR%2520aide%2520OR%2520apprentice%2520OR%2520recruit%2520OR%2520novice%2520OR%2520beginner%2520OR%2520adviser%2520OR%2520Postgrad%2520OR%2520author%2520OR%2520freshman%2520OR%2520novice%2520OR%2520Undergraduate%2520OR%2520postgraduate%2520OR%2520Sales%2520OR%2520coed%2520OR%2520cofounder%2520OR%2520council%2520OR%2520partner%2520OR%2520entry%2520OR%2520expert%2520OR%2520supervisor%2520OR%2520Learning%2520OR%2520Project%2520OR%2520Tutor%2520OR%2520Support%2520OR%2520CEO%2520OR%2520analyst%2520OR%2520Agent%2520OR%2520Rep%2520OR%2520Representative%2520OR%2520Manager%2520OR%2520Lead%2520OR%2520HR%2520OR%2520Human%2520OR%2520Talent%2520OR%2520Recruiter%2520OR%2520Recruiting%2520OR%2520Audit%2520OR%2520Research%2520OR%2520Researcher%2520OR%2520Project%2520OR%2520Account%2520OR%2520Sales%2520OR%2520Presales%2520OR%2520Audit%2520OR%2520Analyst%2520OR%2520Engineer%2520OR%2520Developer%2520OR%2520Regional%2520OR%2520Area%2520OR%2520Local%2520OR%2520account%2520OR%2520Sales%2520OR%2520Specialist%2520OR%2520Modelling%2520OR%2520Investigator%2520OR%2520Instruction%2520OR%2520Instructional%2520OR%2520Products%2520OR%2520Solutions)&titleTimeScope=CURRENT
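One thing worth noting about the URL above: some parameters (like titleIncluded) are percent-encoded twice (%2520), while keywords is encoded once (%20). Before looking for duplicates it probably helps to split the query into one cell per parameter and decode each value into readable text. Here is a minimal Python sketch of that step. The function names `fully_decode` and `split_query` are made up for illustration, and the sample string is a shortened version of the URL above. Airtable's own scripting runs JavaScript, so this is only meant to show the logic.

```python
from typing import Dict, Tuple
from urllib.parse import unquote


def fully_decode(value: str, max_rounds: int = 5) -> Tuple[str, int]:
    """Percent-decode repeatedly until the text stops changing.

    Returns the readable text and how many decoding rounds it took,
    so the value can later be re-encoded to the same depth.
    Caveat: a value containing a literal '%' followed by two hex digits
    would be over-decoded by this loop.
    """
    rounds = 0
    while rounds < max_rounds:
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
        rounds += 1
    return value, rounds


def split_query(query: str) -> Dict[str, Tuple[str, int]]:
    """Split 'a=1&b=2' into {name: (decoded_value, encoding_depth)}."""
    cells = {}
    for pair in query.split("&"):
        name, _, raw = pair.partition("=")
        cells[name] = fully_decode(raw)
    return cells


# Shortened sample built from the URL in the post.
sample = (
    "companySize=F%2CG%2CH%2CI"
    "&keywords=(KYC%20OR%20AML)%20NOT%20(Intern%20OR%20Student)"
    "&titleIncluded=(Risk%2520OR%2520Credit)"
)

for name, (text, depth) in split_query(sample).items():
    print(f"{name} (depth {depth}): {text}")
```

Keeping the encoding depth next to each value means a cleaned value can be written back in the same form it came in.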
Basically, we have a lot of duplicate terms in these links, and I want to get rid of them in order to optimize our search and targeting and, of course, to free up space for different filters (space is limited).
So, I’ve created a new workspace on Airtable for this task.
I want to use a script/method/app to spot all the duplicates in the chain above, remove them and write the output to a different column.
So the script needs to flag and remove these duplicates automatically, and reconstruct a clean chain.
Perhaps the way to do this is to break the link into cells and regroup them into a chain.
The chain follows these rules:
- A space, OR, and another space (" OR ") sit between every two keywords.
- Quotes " " combine 2 or more words into one keyword. For example, “Thank you” will go into a cell of its own.
- Parentheses ( ) group keywords for AND/OR conditions.
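The rules above can be sketched in Python roughly like this. It is only an illustration under a few assumptions: the value is already decoded into readable text, operators are written in uppercase, each parenthesized group uses OR between its keywords (as in the examples), and duplicates are matched case-insensitively. A keyword is removed only when it repeats inside the same group, because a keyword that appears in two different groups joined by AND (like Risk in the title filter) changes the meaning if it is dropped. The names `tokenize` and `dedupe` are hypothetical.

```python
import re

# A token is a quoted phrase, a single parenthesis, or a bare word.
TOKEN = re.compile(r'"[^"]*"|[()]|[^\s()"]+')
OPERATORS = {"AND", "OR", "NOT"}


def tokenize(expr):
    """Break a boolean search string into cells (tokens)."""
    # Normalize curly quotes so "Thank you" stays one cell either way.
    expr = expr.replace("\u201c", '"').replace("\u201d", '"')
    return TOKEN.findall(expr)


def dedupe(expr):
    """Return (clean_expression, dropped_keywords)."""
    kept, dropped = [], []
    seen_stack = [set()]  # one set of seen keywords per open group
    for tok in tokenize(expr):
        if tok == "(":
            seen_stack.append(set())
            kept.append(tok)
        elif tok == ")":
            if len(seen_stack) > 1:
                seen_stack.pop()
            kept.append(tok)
        elif tok in OPERATORS:
            if tok != "OR":
                # Be conservative: only compare keywords within one OR run.
                seen_stack[-1].clear()
            kept.append(tok)
        else:
            key = tok.strip('"').lower()
            if key in seen_stack[-1] and kept and kept[-1] == "OR":
                kept.pop()           # remove the OR that led to the duplicate
                dropped.append(tok)  # flag it for review
            else:
                seen_stack[-1].add(key)
                kept.append(tok)
    text = " ".join(kept).replace("( ", "(").replace(" )", ")")
    return text, dropped


expr = (
    '(Risk OR Credit OR "Compliance and Assurance") AND '
    "(Fraud OR Identity OR Risk) AND NOT "
    "(professor OR office OR Sales OR owner OR professor OR office OR sales)"
)
clean, flagged = dedupe(expr)
print(clean)
print(flagged)
```

The flagged list could go into one column and the clean chain into another, so the removals can be checked before the new URL is used.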
Any idea to solve this problem would be very helpful. Thanks for your attention!