Comparing two tables and update the status and new data

Ma_Te · ‎Jan 03, 2022

Hi community,

I am looking for a solution where I can compare two tables and update the main table with both status and also the new data entries from the second table. Appreciate if anyone can shed a light on how this could be achieved.

Basically we have a data scraping code that writes the entire dataset to a table (let’s call this secondary table) on regular basis. And we have a main table used in production. What we want to achieve is to use the secondary table to update the main one.

If the dataset does already exist in the main table and also exist during the latest script run on the secondary table (there is timestamp), then the “status” field in the main should be updated to “available”.
If the dataset does not exist in the main table but show up during the latest script run on the secondary table, then the “status” field in the main table should be “new”
If the dataset do exist in the main table but does not exist anymore during the latest script run on the secondary table, then the “status” field in the main table should be “unavailable”

I am not sure if this is too complicate for Airtable to handle as it requires sync on both tables in both ways with logic. In case someone have done anything similar, would be happy to learn how this can be achieved!

Many thanks for all of you who would like to share this knowledge.

Alessio_Monino · ‎Jan 03, 2022

Hi,
I did something very similar for a client, just that he had the data in a google sheet, data that would be updated every so often. He needed a “tracking” mechanism in Airtable that would track each record created/updated/deleted and flag it accordingly. For that solution I used an apps script connected to a trigger that would check for any change in the google sheet and update the Airtable accordingly through an API call to Airtable.
I believe your project could be tackled in 3 different ways:

(WORST) You could attempt using an Airtable automation that gets triggered each time a record is changed in the secondary table (this would not keep track of deletions or new record creations but only of record updates). The automation would first look for the record in the main table through a primary key search field you should define. If it finds it, it would update it. This would only solve the problem for record updates and would risk triggering a lot of automation runs, depending on the number of records you have.
(MEDIUM) You could use some external tool like Zapier or Integromat to track any changes in your secondary base and build some conditional logic by first looking for the record through a primary key search field you should define. It you find it in the main table, you would update the record in the main table, if not you would create it. Like the other option this could become very costly depending on the number of changes you are making and records you have and would be fairly slow
(BEST) You could create an Airtable script that triggers every time after your data scraping code runs. The code could, for each record in your secondary base, look for a match in the main base. If it finds a match it would update the record in the main base, if it does not it would create a new record. It could then also cross check that all records in the main base that are not in the secondary base are already marked as deleted. If not they were deleting in the last run and should be flagged accordingly. This is, I think the best solution because even though it would take potentially more to set up than the others, it performs all syncing functionalities and is scalable. Only aspect to be careful about is to set up the code properly so that it is efficient and takes little time to run
Hope it helps!

Alessio
Website: alessiomonino.com
Email: alessio.monino@gmail.com

Ma_Te · ‎Jan 04, 2022

Many thanks for your insight Alessio!

Happy to discuss more in details if you have time. I will drop you an email.