How to query a table ... or do I always have to process all records?

Rory_Kingan · ‎May 18, 2020

In the API definition for RecordQueryResult it says

A RecordQueryResult represents a set of records. It’s a little bit like a one-off Airtable view: it contains a bunch of records, filtered to a useful subset of the records in the table.

but in the API for Table there’s no way I see to filter when calling selectRecordsAsync(). Is there no way to query only a subset of rows in the table, i.e. the equivalent of a SQL WHERE clause? If so, what is the RecordQueryResult documentation referring to regarding ‘filtered to a useful subset’?

kuovonne · ‎May 18, 2020

You can call selectRecordsAsync() from a view instead of a table to get only the records in the view. If you want a different method of filtering, you need to do the filtering yourself after you get the records.

VictoriaPlummer · ‎May 18, 2020

++1 on pulling from a view. Additionally, you can add the fields you want to pull in between the parentheses of selectRecordsAsync() so it looks like selectRecordsAsync({fields:['Field Name 1', 'Field Name 2']) This will only pull those fields from the table. It helps because if your table’s huge you’re not pulling more than you need.

PeterB · ‎Jul 12, 2021

I have the same question as Rory and I don’t think it was answered. We need to be able to create filtered subsets (without using an existing view) and that is what this function offers i.e. one-off views. However the documentation examples don’t use this function. Please explain how we create dynamic views and filtered subsets in script.

kuovonne · ‎Jul 12, 2021

It was answered. The only way to get a subset of the records in a table is with a predefined view. Otherwise you need to filter the records in your code after you get all the records.

PeterB · ‎Jul 12, 2021

Could you please provide an example using RecordQueryResult?

kuovonne · ‎Jul 12, 2021

A record query results is what you get when you perform selectRecordsAsync. Which records are the query results depends on how the query results was created. If the query results was created by running selectRecordsAsync on a table, it contains all the records in the table. If the query results was created by running selectRecordsAsync on a view, it contains only the records in the view. If you are building a custom app, there are also other ways of obtaining a query result.

Chris_Parker · ‎Jul 12, 2021

Sure wish there was a “filter” key just like we have “fields” and “sorts”. That would solve this issue and bring a lot of efficiency. Or do it just like it’s done with the HTTP API.

I’m currently looping through all the records I have returned to find stuff. Sure wouldn’t mind learning a better way!

Dominik_Bosnjak · ‎Jul 28, 2021

For what it’s worth, JavaScript’s for…of performance is off the charts and always has been, not sure how you’d define “better” hah.

But having Views pretty much eliminated the need for robust filtering on my end.

Bill_French · ‎Sep 13, 2021

Indeed, views provide a very powerful way to scope and optimize access to data. However, they cannot be dynamically altered to achieve what I consider hyper-performance in script processes. Here’s an example…

Imagine a table (a) with a view and when a record enters that view, search another table (b) that has 45,000 records in it for a single matching key so that a linkage can be established in (a) to (b).

I’m not an expert in all manners of Airtable scripting, but I think the only way to do this is to iterate across the 45000 records in table (b) to locate the record ID such that the linkage can be updated in table (a). Ironically, searching for discussions about this place me in the commentary mix here. And even then, I’m not sure I expressed the issues succinctly or even know what I was talking about.

Two ideas come to mind when presented with this quandary -

Ideally, there should be a searchRecordsAsync() method that uses the internal search capabilities to return a record (or record collection) given a field/record-level query.
Less than ideal, but likely even faster than search would be a way to create and maintain arbitrary hash indices (like an inverted index) that can be loaded quickly instead of entire tables followed by a filter across all records. Indeed, we need a direct pointer much the way we used to find data in assembly code using registers.

When presented with this challenge I use the approach shown below. It’s ideal in that it’s very fast and how fast depends on the data set and the types of lookups you need to perform. This example approach assumes we need only the record ID to establish a link, and it also assumes the link is based on a single key value. Performance, of course, will vary depending on the number of fields one might need access to for additional processing logic or record updates.

In any case, the analytics speak for themselves - this approach is 6.5 times faster for ~20,000 records and a sizeable 14.5 times faster for approximately 49,000 records. This is not surprising since Airtable typically bogs down the closer you get to 50,000 records.

But it’s important to realize that performance data like this is not linear and use cases vary so greatly. However, if you have to perform lots of lookups across record sets, the performance gap between filtering across records and using a hash index begin to accumulate significantly.

Indices Table

Of deeper interest, I’m sure you’ll wonder what the Indices Table looks like.

The purpose of this table is simple - have a place to persist and update hash indices that can be accessed and queried very quickly.

It’s actually an extremely rudimentary approach - a single record holding an arbitrary hash index definition with long text fields representing the shards blocks. A Shard block is a JSON object of 100k bytes or less and this is necessary when creating indices involving large record collections because they cannot all fit inside a single long test cell. But even with this added complexity, the load and parse time of ten fully populated shards are still under 100 milliseconds; Airtable it seems is very fast when it comes to dealing with a single record. :winking_face: