Anyone have example code or ideas around managing bulk updates?


#1

Hello,

I’m trying to sanely bulk update a set of records from a source to Airtable. Mechanically, I have it working, but I’m thinking that I need to employ some better practices for managing these updates.

Here’s the current state:

Data

  1. Have a JSON payload of multiple rows, preprocessed to match the Airtable schema
  2. Do not have the Airtable record ID in that dataset, we have our own identifier

Process/Script

  1. Loop over each record in the JSON payload, add a timestamp (for sanity checks)
  2. Run a select query on the API filtering by our ID
  3. If nothing found, issue a create request with the payload
  4. If found, issue an update request, passing the record ID from Airtable returned in 2

The problem
As written a number of async requests get issued and we run into conflicts. Then data doesn’t get written. I want to make sure the sync is bullet proof so we don’t run into data loss issues or have to do a lot of additional QA. I mitigated by setting a timeout for the first request. I’m thinking of two ways to improve this:

  1. Separate out the select and just bulk match the records in a pre-processing step, cache that (we’re only talking 1000ish records)
  2. Use promises or async await to wait until a record gets updated or created before moving to the next record in the payload. The tradeoff is this turns it into a sequential process, but I think async is less important than data fidelity.

Do other folks have code samples or examples of doing a similar bulk process where you have a lot of updates on records? My main goal is to have confidence that data is loading to Airtable.

Additionally, as an aside, a bulk option for updates would be great so that we can just send the record payload or send it in chunks and get back a summary response when the operation completes.


#2

Check out this comment regarding how to work around the api limits:


#3

Oh, I’m going to give that a shot, that looks like it may solve the main problem for me! Thanks.

Update: was able to quickly refactor using bottleneck and added some logging via Winston for good measure. Seems to be working great! Thanks again for the tip. I poked around, but missed that post.