Simple filter script keeps timing out?

UPDATE: See my last post for the ultimate solution. I came up with some inadequate solutions earlier in the post, so save yourself some time if you are looking up how to generate random items from a linked field.

EDIT: to make this more clear for future readers—I was trying to do this in an automation script. It worked as a script app in Airtable, though my final script below is much more performant.

I make heavy use of scripting in Airtable, as well as the API, and I am pretty comfortable with using both, but this simple script is kicking my butt for some reason… it either hangs after running for 3000ms CPU time, or quits with an “unexpected error”. At one point it also threw a “syntax error”, even though there is no syntax error, and that error went away on its own without me changing anything.

AFAIK there is no way to ‘select’ records with a filter in Airtable Scripts, before fetching them all, so you have to select the whole friggin’ table (even if it’s 6000 records), and then use Javascript’s native Array.filter() method to get a subset of the whole table. As you can see, what I want to do here in this script is:

  1. Cycle through each of 17 or so “topics”.
  2. For each of those topics, generate a subset of “videos” whose topic link field id matches the current topic’s id.
  3. Pick a random video from the subset (by using the subset of videos and selecting a random index in that array, generated from that’ array’s length).
  4. Update the linked record field “Random Video” on the current topic with the random video from the previous step.
    const topics = base.getTable('Topics');
    const videos = base.getTable('Videos');

    const { records: queryTopics } = await topics.selectRecordsAsync();
    const { records: queryVideos } = await videos.selectRecordsAsync();

    queryTopics.forEach(topic => {
         const topicVideos = queryVideos.filter(video => video.getCellValue('Topic Link')[0].id === topic.id);
         const randomVideo = topicVideos[Math.floor(Math.random() * topicVideos.length)];

         await topics.updateRecordAsync(topic.id, {
            'Random Video': [{ id: randomVideo.id }]
         });
    });

Any ideas? This seems exceptionally simple, and from what I understand, unlike the API, Airtable Scripts already have the data loaded in memory—the await call is supposedly just “formal”, so it should just be like running a simple Array.filter() method… I don’t understand why this keeps getting hung up.

I think you should get about a fifty-times performance improvement by simply updating fifty records at a time. You’re doing them one-by-one and that’s extremely hard on the server.

I also think your evaluation logic could be improved although I have zero time today to ponder a faster approach.

1 Like

While that is true, and I will update that, it actually fails BEFORE the updateRecordAsync() call—I can’t even get topicVideos to be logged out to console.

FYI in this base there are 17 topics and 5711 videos with one topic each.

What’s going on?

Also, with regard to my original post, I have realized that even with a filterByFormula on the await videos.selectRecordsAsync() call for each Topic, since I need to do this for all topics, the total evaluation time would actually be longer to do 17 separate filterByFormula calls (if it were even possible) and then evaluate their responses and update the random video, for each topic separately, compared with just selecting all the records and use Array.filter(), as I do now. So I am not sure what I can do to get this to work, or if it’s just the number of records causing it to fail?

Circling back to your original post, if you’re doing this in a script block, the Airtable client has already loaded all of the records, otherwise, they would not be visible in the table pane. What this means is that you’re really using script to access the already loaded records and performance should be blistering fast as evidenced by these tests.

Does this run without error? If not, fix it; then add your deeper logic.

queryTopics.forEach(topic => {
    console.log(topic.id);
});

Actually, weirdly Airtable is not letting me use my destructuring syntax, so I had to go back to using queryTopics.records without destructuring.

Yes, I can get the topic IDs logged out to console, but I can’t get the filter to work. Is there something wrong with my filter function? I believe all is kosher as far as video => video.getCellValue('Topic Link')[0].id === topic.id is concerned. At least the code introspection leads me to believe there is an id property on the 0th index of video.getCellValue('Topic Link') (which is a field that exists and is a linked record field.)

Some deeper details—you say the records are already loaded, but why is this so slow?

    const videos = base.getTable('Videos');
    const videosQuery = await videos.selectRecordsAsync();
    console.log(videosQuery.records[0]);

Screen Shot 2020-12-18 at 7.58.47 AM

I do not know, but when I run your example on a 22,538 record set, is shows it processing about 8000 records/second.

This code …

It’s a shot in the dark, but try telling selectRecordsAsync to return only the fields that you really need.

This thread has more info on how selecting fewer fields can improve performance. (Read to the end of the thread to see Bill’s post where selecting zero fields was 18 times faster.)

You can also improve performance by looping through the videos only ONCE to create an object that maps topics to arrays of records. The resulting object would look something like this:

{
topic1: [recordForVideoOnTopic1, recordForVideoOnTopic1, recordForVideoOnTopic1, ...],
topic2: [recordForVideoOnTopic2,recordForVideoOnTopic2, recordForVideoOnTopic2, ...],
...
}

Then loop through the keys and pick out your video on each topic.

Finally do a single .updateRecordsAsync to write all the updates.

I just discovered that my script works fine when I use it in a standalone scripting app in a dashboard. What is giving me problems is when it is run as an automation.

That would have been good to know 20 hours ago. :wink:

Airtable has three [internal[ scripting environments so it’s always helpful to be clear about the runtime climate.

Ok, sure. But they all use the same language, why should the runtime matter, and shouldn’t any limitations be made clear? Are there limitations I should know about with automations? The expectation as a consumer is that any script I write in a script app could be automated… nevertheless your help has been greatly appreciated.

I’m probably not the guy to ask because most of my work with Airtable is in NodeJS on other systems. However, imagine an automation action with an input element or output elements - that won’t work.

Well, it can be automated, but “automation” – by definition – must not contain blocking code, right? It also probably shouldn’t perform any sort of inefficient enumeration that would likely cross some arbitrary time boundaries, otherwise, a “consumer” could take the platform down.

I have a hunch you were seeing Airtable’s defence mechanisms reacting to protect its environment.

Both of your suggestions were great.

I incorporated both of yours and @Bill.French 's original tip to use updateRecordsAsync() , and now this script is lightning-fast. I may share it with others here, because I suspect a script that chooses a random record from one table, to a linked field of another table performs a fairly useful function for many. (eg. get to know a random employee, word of the day, etc).

Unfortunately it still doesn’t work as an automation. Lucky for me I don’t need it to change that often, and I can manually run it every once in a while, but it would be nice to have it run by itself, or more ideally, once a day (wish this was a trigger option). @Bill.French if it’s so fast now (nearly instantaneous in human terms), why should it trigger Airtable’s defense mechanisms? It’s not querying more data than the view loads.

let topics = base.getTable('Topics');
let videos = base.getTable('Videos');
let allOnlineVideos = videos.getView("All Online Videos");

let { records: videosQuery } = await allOnlineVideos.selectRecordsAsync({ fields: ['Topic Link'] });

let topicVideos = {}, randomVideos = [];

for (let video of videosQuery) {
    let topicId = video.getCellValue('Topic Link')[0].id;
    if (!topicVideos[topicId]) topicVideos[topicId] = [];
    topicVideos[topicId].push(video.id);
}

for (let [topic, videoIds] of Object.entries(topicVideos)) {
    let randomVideoId = videoIds[Math.floor(Math.random() * videoIds.length)];
    
    randomVideos.push({
        id: topic,
        fields: {
            'Random Video': [{ id: randomVideoId }]
        }
    });
}

await topics.updateRecordsAsync(randomVideos);
1 Like

There are several limitations with automations. You can find the script specific ones in this support article. There are other automation limitations described here.

The timeout after 3000ms CPU time is one of the limitations described in that support article. This is different from running in other scripting situations because the script runs on Airtable’s servers versus running on your local computer. Airtable doesn’t care how much of your own CUP time you take, but they need to protect their server CPU time to ensure that the service remains running for everyone.

Ok, that makes sense. Of course an automation would need to run on their servers. I am a little surprised now that I have got the script running so quickly due to both yours and Bill’s suggestions, that it still doesn’t work, though.

Automations need to be activated by a triggering record. In order to run an automation once a day, you need to have a control record that triggers the automation once per day. I don’t have time for a writeup now, but this thread has more info and there are other threads on the same topic.

What error are you getting?

@kuovonne - you know actually I made a mistake. It seems to be working now with all your combined suggestions. Just have to figure out the daily trigger now.

Thanks so much @kuovonne and @Bill.French.

PS: I could only select one ‘Solution’ to this problem, though I attribute both of your responses to the ultimate resolution of this issue. I suspect, however, that the biggest performance gains were from @kuovonne’s suggestions, and I am not sure which was the one that really helped the most, though I suspect her second suggestion not to loop through 5711 records 17 times was the one that really did it, though filtering the initial selectRecordsAsync() call by a single field could have been of great benefit in this case because each record has many, many fields.

1 Like

Good to hear its working.

I was referring to the timeouts you said you were experiencing. You weren’t getting good error feedback and it didn’t make sense so I surmised it could have been the platform shrugging off process as untenable.