Web scraping info from website links in AT records

Jayme_Richardso
6 - Interface Innovator

Hi All!

I have about 10k records in my Airtable base, and each has a specific link to a webpage that I regularly check for certain data. I would love to be able to scrape those pages for the info needed and automatically put it into AT. Is this possible? I assume I will have to run a script of some sort, but I don't know exactly how to go about that. We are currently updating the 10k records manually, and using some sort of AI would save SO much time!

TIA!

4 Replies

My thoughts: read on...

itsmike
5 - Automation Enthusiast

Hi Jayme, 

Yes, this should be possible, depending on the website. The most important question is 'can we extract the data'?

I've built a tool that's good at this (Simplescraper), so if you'd like to share an example of what data you're extracting and from which URL, it will be possible to verify what's possible pretty quickly.

After that it's a case of pulling in the data and mapping it to your fields - the scripting extension should make this step straightforward. 
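To make the "pull in the data and map it to your fields" step concrete, here is a minimal sketch in plain Python (standard library only), not Simplescraper's method and not code for the scripting extension itself. It assumes the value you want sits inside an element with a known `id`; the `price` id, the `Price` field name, and the record id in the usage note are all made up for illustration, so swap in whatever your pages and base actually use.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta", "wbr"}


class TextById(HTMLParser):
    """Collects the text inside the first element carrying a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # > 0 while we are inside the target element
        self.done = False   # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> carry no text and no nesting.
        pass

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_text(html, target_id):
    """Return the stripped text of the element with this id, or None."""
    parser = TextById(target_id)
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks).strip()
    return text or None


def to_airtable_update(record_id, field_name, value):
    """Shape one scraped value as a record update (id plus fields)."""
    return {"id": record_id, "fields": {field_name: value}}
```

For example, `to_airtable_update("rec123", "Price", extract_text(page_html, "price"))` would give you one update per record, where `page_html` is the HTML fetched from that record's link. Pages that build their content with JavaScript won't work with a plain fetch like this, which is part of why the "can we extract the data" question comes first.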

Happy to help you through the steps.


@Jayme_Richardso @itsmike used Simplescraper to collect every post I've made in this community into Airtable itself, and it found all 3,188, with author links, etc. Cool stuff. Having a consultant who knows Airtable as well as he knows his scraping craft is golden.

I would have edited my first post, but the edit feature rarely works in this shiny new platform, @ChrisShernaman. 😉 Little things like this turn a simple content improvement from a 20-second task into a three-minute process. I suspect the problem lies in one of the JavaScript errors that occur frequently.

[Image: jserrors.png]

carrgordon
4 - Data Explorer

I get where you're coming from. Manually updating 10k records sounds like a huge time sink! I was in a similar boat a while ago when I needed to scrape data from multiple pages and store it in a database. I ended up writing a web scraping script in PHP, and it saved me so much time! It's pretty straightforward once you get the hang of it. I followed a guide from Scraping Bytes, which really helped me understand the process. If you can set up a scheduled script, it'll pull the data from those pages and update your Airtable base automatically.
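For the "update your Airtable base automatically" part, here is a rough sketch of how a scheduled script might push scraped values back through Airtable's REST API. It is in Python rather than the PHP mentioned above, and it is only an outline under a few assumptions: the base id, table name, and token are placeholders, the batch size reflects Airtable's limit of 10 records per update request as I understand it, and the pause between requests is a conservative guess at staying under the per-base rate limit. Check the current API docs before relying on either number.

```python
import json
import time
import urllib.parse
import urllib.request

BATCH_SIZE = 10  # assumed per-request record limit; verify against the API docs


def chunked(items, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_patch_request(base_id, table_name, token, records):
    """Build (but do not send) a PATCH request updating a batch of records."""
    table = urllib.parse.quote(table_name, safe="")
    url = f"https://api.airtable.com/v0/{base_id}/{table}"
    body = json.dumps({"records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send_updates(base_id, table_name, token, updates, pause_seconds=0.25):
    """Send all updates in batches, pausing briefly between requests."""
    for batch in chunked(updates):
        request = build_patch_request(base_id, table_name, token, batch)
        with urllib.request.urlopen(request) as response:
            response.read()  # raises urllib.error.HTTPError on a 4xx/5xx reply
        time.sleep(pause_seconds)
```

Each item in `updates` is a dict like `{"id": "rec...", "fields": {"Price": "19.99"}}`. With 10k records that works out to roughly 1,000 requests per run, so a scheduler such as cron running it nightly is one reasonable way to handle the "scheduled" part.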