Jun 29, 2020 04:10 PM
I am creating an Airtable base for HR recruiting purposes. I use an Airtable form to collect information from potential candidates.
I would like to collect only their LinkedIn Profile URL and contact info (email or phone).
Using the collected LinkedIn Profile URL, I would like to pull info such as Name, Location, Experience, and Education from LinkedIn into Airtable fields. Can this be done with the Scripting block, or is there a better alternative?
Your help is greatly appreciated. Thanks in advance.
Jun 29, 2020 04:27 PM
Yes. But… and there are many (my $0.02).
Finally, in my opinion, no one should ever build a crawler in Script Block: it would be too brittle, your investment would be exposed to far too many ways it could or should be shut down, and it’s not a sustainable approach. Better to use something more viable for the task.
Certainly, small numbers of fetches of content over HTTP(s) should not be a big deal and it might work really well until it doesn’t. :winking_face:
Jun 30, 2020 03:28 AM
Hello @Aswanth_Selva_Pragat, @Bill.French,
I had started building a Python web crawler app that would then populate an Airtable base via the Airtable public API.
This web crawler was meant to serve several of my goals, including those Bill explained here.
I have always preferred a site’s API to any other data collection technique, for the reasons Bill explained and for many others that I discovered while experimenting with low-scale web crawling in Python.
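For readers wondering what "populate an Airtable via the public API" might involve, here is a minimal sketch, not the app described above. It assumes the Airtable REST endpoint `https://api.airtable.com/v0/{baseId}/{tableName}` with a bearer token and its limit of 10 records per create request. The base ID, table name, token, and field names are placeholders, and the functions only build requests without sending them.

```python
import json
import urllib.request
from urllib.parse import quote

AIRTABLE_API = "https://api.airtable.com/v0"
MAX_BATCH = 10  # Airtable accepts at most 10 records per create request


def chunk(items, size=MAX_BATCH):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_create_request(base_id, table, token, rows):
    """Build (but do not send) a POST request that creates one batch of records.

    `rows` is a list of {field name: value} dicts; the field names must
    match the columns of the target table.
    """
    if len(rows) > MAX_BATCH:
        raise ValueError(f"at most {MAX_BATCH} records per request")
    body = json.dumps({"records": [{"fields": row} for row in rows]})
    return urllib.request.Request(
        f"{AIRTABLE_API}/{base_id}/{quote(table, safe='')}",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder rows standing in for whatever the crawler collected.
    crawled = [{"Name": f"Candidate {i}"} for i in range(23)]
    for batch in chunk(crawled):
        req = build_create_request("appXXXXXXXXXXXXXX", "Candidates", "YOUR_TOKEN", batch)
        print(req.get_method(), req.full_url, len(batch), "records")
        # To actually send: urllib.request.urlopen(req)
        # (pause between calls to stay within Airtable's rate limit)
```

Keeping request construction separate from sending makes the batching logic easy to check without touching a real base.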
Then the Scripting block launched and, with it, JavaScript came into my life, taking up my free time and attention alongside my work.
But as a JavaScript student, I discovered that everything I had started in my experimental Python web crawling could also be done in JavaScript. It was even interesting to dig into subjects like headless Chromium/Chrome, Puppeteer, and writing your own Chrome extensions.
I share this with good intentions, although it would still have to be put into practice.
That said, I could share my curated list of URLs on controlling and automating Chrome / Chromium from JavaScript, if anyone is interested.
But my priorities today are the Airtable Scripting and Custom blocks, so my web crawling projects have gone dormant.
Cheers,
olπ
Jun 30, 2020 04:10 AM
Thank you very much, @Bill.French and @Olpy_Acaflo, for your detailed inputs.
All the points you raised make total sense.
We do have access to the LinkedIn APIs and are planning to leverage them for this process. But I don’t think the Scripting block is the right place to build this. A separate service that reads from Airtable through the Airtable API, calls the LinkedIn APIs to pull the info, and writes it back into Airtable through the Airtable API would probably be the better option.
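As a rough illustration of that flow, here is a minimal sketch under stated assumptions, not a finished integration. It assumes Airtable's REST update endpoint (a PATCH to `https://api.airtable.com/v0/{baseId}/{tableName}` with up to 10 records, each carrying its record `id`). The `LinkedIn URL` field name, the base ID, and `fetch_profile_stub` are hypothetical; the stub stands in for whichever LinkedIn API call your access actually permits.

```python
import json
import urllib.request
from urllib.parse import quote

AIRTABLE_API = "https://api.airtable.com/v0"
URL_FIELD = "LinkedIn URL"  # hypothetical field name filled in by the form


def fetch_profile_stub(linkedin_url):
    """HYPOTHETICAL placeholder for a real LinkedIn API call.

    Replace with a call to the LinkedIn endpoints you are licensed to use;
    it should return a dict keyed by Airtable field names.
    """
    return {"Name": "", "Location": "", "Experience": "", "Education": ""}


def plan_updates(records, fetch_profile):
    """Turn Airtable records into update payloads, skipping rows without a URL."""
    updates = []
    for record in records:
        url = record.get("fields", {}).get(URL_FIELD)
        if not url:
            continue
        updates.append({"id": record["id"], "fields": fetch_profile(url)})
    return updates


def build_update_request(base_id, table, token, updates):
    """Build (but do not send) a PATCH request updating up to 10 records."""
    if len(updates) > 10:
        raise ValueError("Airtable accepts at most 10 records per request")
    body = json.dumps({"records": updates}).encode("utf-8")
    return urllib.request.Request(
        f"{AIRTABLE_API}/{base_id}/{quote(table, safe='')}",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


if __name__ == "__main__":
    # Records shaped like the output of Airtable's "list records" call.
    records = [
        {"id": "rec1", "fields": {URL_FIELD: "https://www.linkedin.com/in/example"}},
        {"id": "rec2", "fields": {}},  # no URL submitted, so it is skipped
    ]
    updates = plan_updates(records, fetch_profile_stub)
    req = build_update_request("appXXXXXXXXXXXXXX", "Candidates", "YOUR_TOKEN", updates)
    print(req.get_method(), req.full_url, len(updates), "update(s)")
    # To actually send: urllib.request.urlopen(req)
```

Passing the profile lookup in as a function keeps the Airtable side testable on its own and lets the LinkedIn side be swapped without touching the rest.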
Thanks again! Appreciate it much :slightly_smiling_face:
Aug 03, 2020 12:53 PM
I built a way to collect this data using the Web Clipper’s CSS selector feature rather than the Scripting block. This would be most useful for an individual recruiter or sales rep who can collect information from accounts as they’re browsing.
It may require using the dedupe block to clean up every once in a while, but I didn’t find it to be too much of a hassle when I was using Airtable every day in my sales job.
Here’s a write-up I did on the Web Clipper: https://www.notion.so/Building-a-LinkedIn-Scrapper-on-Airtable-af052e46f3454d40909990d75fbdfde8
Jan 19, 2022 07:37 AM
Hey Olpy, I am looking to hire someone to help scrape data from TikTok posts and feed it into Airtable. Is this something you can help with, or can you perhaps point me in the right direction? Thanks a lot!
Jan 21, 2022 03:50 PM
Hi @Max_Bernstein,
Sorry for the late reply.
I hope you will find someone in the Airtable Community to hire for this request: excellent experts are regularly involved in the most active topics of the moment.
oLπ