How fast is searching for a specific value in a field (column)? Results of a test

Continuing the discussion from How fast is searching for a specific value in a field (column)?:

I did a test to check in practice how searching for a single value using Airtable API scales with the number of rows and it turns out that there’s no problem with that.

Here’s a table that I got that shows the number of rows in an Airabse table and how long it takes to find a given user ID string in seconds:

rows, time
1, 0.52
10, 0.39
100, 0.39
1000, 0.41
10000, 0.47
50000, 0.75
1 Like

This is great info - you should consider publishing your methodology for these tests because the context and approach matters greatly.

I have some queries that average < 20ms and some as low as 5ms, but in a specific context. For example imagine you had 50,000 records and you needed to perform 200,000 queries against the same record set in one process. It is impractical to perform 200,000 queries at 750ms each - in fact, it’s not possible with Airtable.

But, a clever approach using a JSON hash index or an inverted index makes this both possible and very performant.

Good point.

I tested a situation when you search for a single text value in a table. I measured how much time it takes to find it. I used Python’s airtable-python-wrapper package to communicate with Airtable API. The table had two text attributes/columns. I executed the following steps for each inspected number of rows.

  1. Delete all the rows in the table.
  2. Add a given number of rows and fill each field with a randomly generated UUID to simulate some real-life unique IDs.
  3. Look for three values from the table in one of the attributes/columns: the one from the first row, the one from the last row, and one from a row in the middle (it turned out that it takes a similar amount of time to find each of those values, so I left only one of them in the table above). Search for each value using match() function from the mentioned Python package (this seems to be retrieving all records through Airtable REST API with a parameter filterByFormula= {Username}={My ID}) and measure the time of execution of this function.

BTW, using the hash table as you suggest makes total sense, but I was interested in a different use-case.

This is brilliant. Should be included in the API docs, because this matters a lot for algorithm design.

This topic was solved and automatically closed 15 days after the last reply. New replies are no longer allowed.