Hello,
I’m trying to sanely bulk-update a set of records from an external source into Airtable. Mechanically, I have it working, but I think I need to adopt some better practices for managing these updates.
Here’s the current state:
Data
- Have a JSON payload of multiple rows, preprocessed to match the Airtable schema
- Do not have the Airtable record ID in that dataset; we have our own identifier
Process/Script
- Loop over each record in the JSON payload and add a timestamp (for sanity checks)
- Run a select query against the API, filtering by our ID
- If nothing is found, issue a create request with the payload
- If a record is found, issue an update request, passing the Airtable record ID returned by the select in the previous step (rough sketch below)
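For concreteness, here’s roughly what that loop looks like today, using the official airtable.js client. Table, field, and variable names are simplified placeholders (our identifier is assumed to live in an `{External ID}` field):

```ts
import Airtable from "airtable";

// Placeholder base/table names; substitute your own.
const base = new Airtable({ apiKey: process.env.AIRTABLE_API_KEY }).base("appXXXXXXXXXXXXXX");
const table = base("Imports");

async function upsertRow(row: { externalId: string; fields: Record<string, any> }) {
  // Timestamp each write so we can sanity-check the sync afterwards.
  const payload = { ...row.fields, "Synced At": new Date().toISOString() };

  // Select by our own identifier.
  // Note: naive quoting here; escape externalId properly in real use.
  const matches = await table
    .select({ filterByFormula: `{External ID} = '${row.externalId}'`, maxRecords: 1 })
    .firstPage();

  if (matches.length === 0) {
    await table.create(payload); // nothing found: create
  } else {
    await table.update(matches[0].id, payload); // found: update by record ID
  }
}
```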
The problem
As written, a number of async requests get issued concurrently and we run into conflicts, and then data doesn’t get written. I want to make the sync bulletproof so we don’t run into data loss or have to do a lot of additional QA. I’ve mitigated it for now by setting a timeout on the first request. I’m thinking of two ways to improve this:
- Separate out the select and bulk-match the records in a preprocessing step, then cache that mapping (we’re only talking ~1,000 records)
- Use promises or async/await to wait until each record is updated or created before moving on to the next record in the payload. The tradeoff is that this becomes a sequential process, but I think throughput matters less than data fidelity. (A sketch combining both ideas follows below.)
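For what it’s worth, here’s an untested sketch of what I have in mind, combining both ideas. Same placeholder names as the snippet above:

```ts
async function syncAll(rows: Array<{ externalId: string; fields: Record<string, any> }>) {
  // 1. Pre-match: fetch every existing record once and cache {External ID} -> record ID.
  //    .all() pages through the whole table, which is fine at ~1,000 records.
  const existing = await table.select({ fields: ["External ID"] }).all();
  const idMap = new Map<string, string>();
  for (const r of existing) {
    idMap.set(r.get("External ID") as string, r.id);
  }

  // 2. Sequential writes: awaiting each request means only one is in flight,
  //    which sidesteps the conflicts and keeps us under the API rate limit.
  for (const row of rows) {
    const payload = { ...row.fields, "Synced At": new Date().toISOString() };
    const recordId = idMap.get(row.externalId);
    if (recordId) {
      await table.update(recordId, payload);
    } else {
      const created = await table.create(payload);
      idMap.set(row.externalId, created.id); // keep the cache current
    }
  }
}
```

Step 1 costs a few reads up front instead of one select per record; step 2 trades throughput for certainty, which is the tradeoff I mentioned above.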
Do other folks have code samples or examples of a similar bulk process with a lot of record updates? My main goal is to have confidence that the data is actually landing in Airtable.
As an aside, a bulk option for updates would be great: we could send the whole record payload (or send it in chunks) and get back a summary response when the operation completes.