Help

Upcoming database upgrades. Airtable functionality will be reduced for ~15 minutes at 06:00 UTC on Feb. 4 / 10:00 pm PT on Feb. 3. Learn more here

Simple filter script keeps timing out?

Topic Labels: Automations
Solved
Jump to Solution
1806 22
cancel
Showing results for 
Search instead for 
Did you mean: 

UPDATE: See my last post for the ultimate solution. I came up with some inadequate solutions earlier in the post, so save yourself some time if you are looking up how to generate random items from a linked field.

EDIT: to make this more clear for future readers—I was trying to do this in an automation script. It worked as a script app in Airtable, though my final script below is much more performant.

I make heavy use of scripting in Airtable, as well as the API, and I am pretty comfortable with using both, but this simple script is kicking my butt for some reason… it either hangs after running for 3000ms CPU time, or quits with an “unexpected error”. At one point it also threw a “syntax error”, even though there is no syntax error, and that error went away on its own without me changing anything.

AFAIK there is no way to ‘select’ records with a filter in Airtable Scripts, before fetching them all, so you have to select the whole friggin’ table (even if it’s 6000 records), and then use Javascript’s native Array.filter() method to get a subset of the whole table. As you can see, what I want to do here in this script is:

  1. Cycle through each of 17 or so “topics”.
  2. For each of those topics, generate a subset of “videos” whose topic link field id matches the current topic’s id.
  3. Pick a random video from the subset (by using the subset of videos and selecting a random index in that array, generated from that’ array’s length).
  4. Update the linked record field “Random Video” on the current topic with the random video from the previous step.
    const topics = base.getTable('Topics');
    const videos = base.getTable('Videos');

    const { records: queryTopics } = await topics.selectRecordsAsync();
    const { records: queryVideos } = await videos.selectRecordsAsync();

    queryTopics.forEach(topic => {
         const topicVideos = queryVideos.filter(video => video.getCellValue('Topic Link')[0].id === topic.id);
         const randomVideo = topicVideos[Math.floor(Math.random() * topicVideos.length)];

         await topics.updateRecordAsync(topic.id, {
            'Random Video': [{ id: randomVideo.id }]
         });
    });

Any ideas? This seems exceptionally simple, and from what I understand, unlike the API, Airtable Scripts already have the data loaded in memory—the await call is supposedly just “formal”, so it should just be like running a simple Array.filter() method… I don’t understand why this keeps getting hung up.

1 Solution

Accepted Solutions

Update #2: I improved this once again by:

  1. Checking to make sure a Tag has a Video before trying to choose a random one
  2. Checking to see if each Tag has more than one Video (not very random if it’s only one!) and if so, then…
  3. Making sure the random video is different than the current video repeatedly choosing a random video if it is initially the same, or…
  4. For Tags with only one Video, use just that one video as the random video.

BONUS: I also updated the second script below that always generates the first record to use for an hourly trigger, simplifying the logic, and reducing the number of getCellValue() calls by n-1 times.

Update: For future readers of this thread, here is a much simpler and more flexible version of the above script that will also handle situations when more than one record can be linked, (The above is for one-to-one relationships, not one-to-many), as well as situations where the total updates are greater than 50 per batch (Airtable’s limit). My above version was a rather silly way to do this, querying two tables and pivoting on one, when all the information needed is actually just in one.

This script will cycle through all the items in a linked record field (videos, here) for each row of a table (tags, here). A linked item (video, here) is chosen at random and then added as the only item in a ‘Random Video’ linked record field, in a queue of updates to the tags table. This queue of updates is then dispatched to Airtable in batches of 50, for maximum efficiency.

let tags = base.getTable('Tags');

let { records: tagsQuery } = await tags.selectRecordsAsync({ fields: ['Videos', 'Random Video'] });

let randomVideos = [];

for (let tag of tagsQuery) {
    let videos = tag.getCellValue('Videos');
    
    if (videos) { 
        let currentVideo = tag.getCellValue('Random Video')[0].id,
        randomVideo = videos[0].id;

        while (videos.length > 1 && currentVideo === randomVideo) {
            randomVideo = videos[Math.floor(Math.random() * videos.length)].id;
        }

        randomVideos.push({
            id: tag.id,
            fields: {
                'Random Video': [{ id: randomVideo }]
            }
        });
    }
}

while (randomVideos.length > 0) {
    await tags.updateRecordsAsync(randomVideos.slice(0, 50));
    randomVideos = randomVideos.slice(50);
}

If you want this script to run daily on a specific hour, follow my instructions above, but use the following script to keep your items (tags, here) updated, so that the script runs only once for all items. As I mentioned above, you can do this manually, but the following second script run as an automation on create and update will ensure that if you add/edit/re-order items, your trigger still runs.

const tags = base.getTable('Tags');

let { records: tagsQuery } = await tags.selectRecordsAsync({ fields: ['ID']});

let firstIds = [];

for (let tag of tagsQuery) {
    const first = tagsQuery[0].getCellValue('ID');

    firstIds.push({
        id: tag.id,
        fields: {
            'First ID': first
        }
    });
}

while (firstIds.length > 0) {
    await tags.updateRecordsAsync(firstIds.slice(0, 50));
    firstIds = firstIds.slice(50);
}

See Solution in Thread

22 Replies 22

I think you should get about a fifty-times performance improvement by simply updating fifty records at a time. You’re doing them one-by-one and that’s extremely hard on the server.

I also think your evaluation logic could be improved although I have zero time today to ponder a faster approach.

While that is true, and I will update that, it actually fails BEFORE the updateRecordAsync() call—I can’t even get topicVideos to be logged out to console.

FYI in this base there are 17 topics and 5711 videos with one topic each.

What’s going on?

Also, with regard to my original post, I have realized that even with a filterByFormula on the await videos.selectRecordsAsync() call for each Topic, since I need to do this for all topics, the total evaluation time would actually be longer to do 17 separate filterByFormula calls (if it were even possible) and then evaluate their responses and update the random video, for each topic separately, compared with just selecting all the records and use Array.filter(), as I do now. So I am not sure what I can do to get this to work, or if it’s just the number of records causing it to fail?

Circling back to your original post, if you’re doing this in a script block, the Airtable client has already loaded all of the records, otherwise, they would not be visible in the table pane. What this means is that you’re really using script to access the already loaded records and performance should be blistering fast as evidenced by these tests.

Does this run without error? If not, fix it; then add your deeper logic.

queryTopics.forEach(topic => {
    console.log(topic.id);
});

Actually, weirdly Airtable is not letting me use my destructuring syntax, so I had to go back to using queryTopics.records without destructuring.

Yes, I can get the topic IDs logged out to console, but I can’t get the filter to work. Is there something wrong with my filter function? I believe all is kosher as far as video => video.getCellValue('Topic Link')[0].id === topic.id is concerned. At least the code introspection leads me to believe there is an id property on the 0th index of video.getCellValue('Topic Link') (which is a field that exists and is a linked record field.)

Some deeper details—you say the records are already loaded, but why is this so slow?

    const videos = base.getTable('Videos');
    const videosQuery = await videos.selectRecordsAsync();
    console.log(videosQuery.records[0]);

Screen Shot 2020-12-18 at 7.58.47 AM

I do not know, but when I run your example on a 22,538 record set, is shows it processing about 8000 records/second.

image

This code …

image

It’s a shot in the dark, but try telling selectRecordsAsync to return only the fields that you really need.

This thread has more info on how selecting fewer fields can improve performance. (Read to the end of the thread to see Bill’s post where selecting zero fields was 18 times faster.)

You can also improve performance by looping through the videos only ONCE to create an object that maps topics to arrays of records. The resulting object would look something like this:

{
topic1: [recordForVideoOnTopic1, recordForVideoOnTopic1, recordForVideoOnTopic1, ...],
topic2: [recordForVideoOnTopic2,recordForVideoOnTopic2, recordForVideoOnTopic2, ...],
...
}

Then loop through the keys and pick out your video on each topic.

Finally do a single .updateRecordsAsync to write all the updates.

I just discovered that my script works fine when I use it in a standalone scripting app in a dashboard. What is giving me problems is when it is run as an automation.

That would have been good to know 20 hours ago. :winking_face:

Airtable has three [internal[ scripting environments so it’s always helpful to be clear about the runtime climate.

Ok, sure. But they all use the same language, why should the runtime matter, and shouldn’t any limitations be made clear? Are there limitations I should know about with automations? The expectation as a consumer is that any script I write in a script app could be automated… nevertheless your help has been greatly appreciated.

I’m probably not the guy to ask because most of my work with Airtable is in NodeJS on other systems. However, imagine an automation action with an input element or output elements - that won’t work.

Well, it can be automated, but “automation” – by definition – must not contain blocking code, right? It also probably shouldn’t perform any sort of inefficient enumeration that would likely cross some arbitrary time boundaries, otherwise, a “consumer” could take the platform down.

I have a hunch you were seeing Airtable’s defence mechanisms reacting to protect its environment.

Both of your suggestions were great.

I incorporated both of yours and @Bill.French 's original tip to use updateRecordsAsync() , and now this script is lightning-fast. I may share it with others here, because I suspect a script that chooses a random record from one table, to a linked field of another table performs a fairly useful function for many. (eg. get to know a random employee, word of the day, etc).

Unfortunately it still doesn’t work as an automation. Lucky for me I don’t need it to change that often, and I can manually run it every once in a while, but it would be nice to have it run by itself, or more ideally, once a day (wish this was a trigger option). @Bill.French if it’s so fast now (nearly instantaneous in human terms), why should it trigger Airtable’s defense mechanisms? It’s not querying more data than the view loads.

let topics = base.getTable('Topics');
let videos = base.getTable('Videos');
let allOnlineVideos = videos.getView("All Online Videos");

let { records: videosQuery } = await allOnlineVideos.selectRecordsAsync({ fields: ['Topic Link'] });

let topicVideos = {}, randomVideos = [];

for (let video of videosQuery) {
    let topicId = video.getCellValue('Topic Link')[0].id;
    if (!topicVideos[topicId]) topicVideos[topicId] = [];
    topicVideos[topicId].push(video.id);
}

for (let [topic, videoIds] of Object.entries(topicVideos)) {
    let randomVideoId = videoIds[Math.floor(Math.random() * videoIds.length)];
    
    randomVideos.push({
        id: topic,
        fields: {
            'Random Video': [{ id: randomVideoId }]
        }
    });
}

await topics.updateRecordsAsync(randomVideos);

There are several limitations with automations. You can find the script specific ones in this support article. There are other automation limitations described here.

The timeout after 3000ms CPU time is one of the limitations described in that support article. This is different from running in other scripting situations because the script runs on Airtable’s servers versus running on your local computer. Airtable doesn’t care how much of your own CUP time you take, but they need to protect their server CPU time to ensure that the service remains running for everyone.

Ok, that makes sense. Of course an automation would need to run on their servers. I am a little surprised now that I have got the script running so quickly due to both yours and Bill’s suggestions, that it still doesn’t work, though.

Automations need to be activated by a triggering record. In order to run an automation once a day, you need to have a control record that triggers the automation once per day. I don’t have time for a writeup now, but this thread has more info and there are other threads on the same topic.

What error are you getting?

@kuovonne - you know actually I made a mistake. It seems to be working now with all your combined suggestions. Just have to figure out the daily trigger now.

Thanks so much @kuovonne and @Bill.French.

PS: I could only select one ‘Solution’ to this problem, though I attribute both of your responses to the ultimate resolution of this issue. I suspect, however, that the biggest performance gains were from @kuovonne’s suggestions, and I am not sure which was the one that really helped the most, though I suspect her second suggestion not to loop through 5711 records 17 times was the one that really did it, though filtering the initial selectRecordsAsync() call by a single field could have been of great benefit in this case because each record has many, many fields.

Good to hear its working.

I was referring to the timeouts you said you were experiencing. You weren’t getting good error feedback and it didn’t make sense so I surmised it could have been the platform shrugging off process as untenable.