Help

Simple filter script keeps timing out?

Topic Labels: Automations
Solved
Jump to Solution
10077 22
cancel
Showing results for 
Search instead for 
Did you mean: 
Avana_Vana
6 - Interface Innovator
6 - Interface Innovator

UPDATE: See my last post for the ultimate solution. I came up with some inadequate solutions earlier in the post, so save yourself some time if you are looking up how to generate random items from a linked field.

EDIT: to make this more clear for future readers—I was trying to do this in an automation script. It worked as a script app in Airtable, though my final script below is much more performant.

I make heavy use of scripting in Airtable, as well as the API, and I am pretty comfortable with using both, but this simple script is kicking my butt for some reason… it either hangs after running for 3000ms CPU time, or quits with an “unexpected error”. At one point it also threw a “syntax error”, even though there is no syntax error, and that error went away on its own without me changing anything.

AFAIK there is no way to ‘select’ records with a filter in Airtable Scripts, before fetching them all, so you have to select the whole friggin’ table (even if it’s 6000 records), and then use Javascript’s native Array.filter() method to get a subset of the whole table. As you can see, what I want to do here in this script is:

  1. Cycle through each of 17 or so “topics”.
  2. For each of those topics, generate a subset of “videos” whose topic link field id matches the current topic’s id.
  3. Pick a random video from the subset (by using the subset of videos and selecting a random index in that array, generated from that’ array’s length).
  4. Update the linked record field “Random Video” on the current topic with the random video from the previous step.
    const topics = base.getTable('Topics');
    const videos = base.getTable('Videos');

    const { records: queryTopics } = await topics.selectRecordsAsync();
    const { records: queryVideos } = await videos.selectRecordsAsync();

    queryTopics.forEach(topic => {
         const topicVideos = queryVideos.filter(video => video.getCellValue('Topic Link')[0].id === topic.id);
         const randomVideo = topicVideos[Math.floor(Math.random() * topicVideos.length)];

         await topics.updateRecordAsync(topic.id, {
            'Random Video': [{ id: randomVideo.id }]
         });
    });

Any ideas? This seems exceptionally simple, and from what I understand, unlike the API, Airtable Scripts already have the data loaded in memory—the await call is supposedly just “formal”, so it should just be like running a simple Array.filter() method… I don’t understand why this keeps getting hung up.

1 Solution

Accepted Solutions
Avana_Vana
6 - Interface Innovator
6 - Interface Innovator

Update #2: I improved this once again by:

  1. Checking to make sure a Tag has a Video before trying to choose a random one
  2. Checking to see if each Tag has more than one Video (not very random if it’s only one!) and if so, then…
  3. Making sure the random video is different than the current video repeatedly choosing a random video if it is initially the same, or…
  4. For Tags with only one Video, use just that one video as the random video.

BONUS: I also updated the second script below that always generates the first record to use for an hourly trigger, simplifying the logic, and reducing the number of getCellValue() calls by n-1 times.

Update: For future readers of this thread, here is a much simpler and more flexible version of the above script that will also handle situations when more than one record can be linked, (The above is for one-to-one relationships, not one-to-many), as well as situations where the total updates are greater than 50 per batch (Airtable’s limit). My above version was a rather silly way to do this, querying two tables and pivoting on one, when all the information needed is actually just in one.

This script will cycle through all the items in a linked record field (videos, here) for each row of a table (tags, here). A linked item (video, here) is chosen at random and then added as the only item in a ‘Random Video’ linked record field, in a queue of updates to the tags table. This queue of updates is then dispatched to Airtable in batches of 50, for maximum efficiency.

let tags = base.getTable('Tags');

let { records: tagsQuery } = await tags.selectRecordsAsync({ fields: ['Videos', 'Random Video'] });

let randomVideos = [];

for (let tag of tagsQuery) {
    let videos = tag.getCellValue('Videos');
    
    if (videos) { 
        let currentVideo = tag.getCellValue('Random Video')[0].id,
        randomVideo = videos[0].id;

        while (videos.length > 1 && currentVideo === randomVideo) {
            randomVideo = videos[Math.floor(Math.random() * videos.length)].id;
        }

        randomVideos.push({
            id: tag.id,
            fields: {
                'Random Video': [{ id: randomVideo }]
            }
        });
    }
}

while (randomVideos.length > 0) {
    await tags.updateRecordsAsync(randomVideos.slice(0, 50));
    randomVideos = randomVideos.slice(50);
}

If you want this script to run daily on a specific hour, follow my instructions above, but use the following script to keep your items (tags, here) updated, so that the script runs only once for all items. As I mentioned above, you can do this manually, but the following second script run as an automation on create and update will ensure that if you add/edit/re-order items, your trigger still runs.

const tags = base.getTable('Tags');

let { records: tagsQuery } = await tags.selectRecordsAsync({ fields: ['ID']});

let firstIds = [];

for (let tag of tagsQuery) {
    const first = tagsQuery[0].getCellValue('ID');

    firstIds.push({
        id: tag.id,
        fields: {
            'First ID': first
        }
    });
}

while (firstIds.length > 0) {
    await tags.updateRecordsAsync(firstIds.slice(0, 50));
    firstIds = firstIds.slice(50);
}

See Solution in Thread

22 Replies 22

I think you should get about a fifty-times performance improvement by simply updating fifty records at a time. You’re doing them one-by-one and that’s extremely hard on the server.

I also think your evaluation logic could be improved although I have zero time today to ponder a faster approach.

While that is true, and I will update that, it actually fails BEFORE the updateRecordAsync() call—I can’t even get topicVideos to be logged out to console.

FYI in this base there are 17 topics and 5711 videos with one topic each.

What’s going on?

Also, with regard to my original post, I have realized that even with a filterByFormula on the await videos.selectRecordsAsync() call for each Topic, since I need to do this for all topics, the total evaluation time would actually be longer to do 17 separate filterByFormula calls (if it were even possible) and then evaluate their responses and update the random video, for each topic separately, compared with just selecting all the records and use Array.filter(), as I do now. So I am not sure what I can do to get this to work, or if it’s just the number of records causing it to fail?

Circling back to your original post, if you’re doing this in a script block, the Airtable client has already loaded all of the records, otherwise, they would not be visible in the table pane. What this means is that you’re really using script to access the already loaded records and performance should be blistering fast as evidenced by these tests.

Does this run without error? If not, fix it; then add your deeper logic.

queryTopics.forEach(topic => {
    console.log(topic.id);
});

Actually, weirdly Airtable is not letting me use my destructuring syntax, so I had to go back to using queryTopics.records without destructuring.

Yes, I can get the topic IDs logged out to console, but I can’t get the filter to work. Is there something wrong with my filter function? I believe all is kosher as far as video => video.getCellValue('Topic Link')[0].id === topic.id is concerned. At least the code introspection leads me to believe there is an id property on the 0th index of video.getCellValue('Topic Link') (which is a field that exists and is a linked record field.)

Some deeper details—you say the records are already loaded, but why is this so slow?

    const videos = base.getTable('Videos');
    const videosQuery = await videos.selectRecordsAsync();
    console.log(videosQuery.records[0]);

Screen Shot 2020-12-18 at 7.58.47 AM

I do not know, but when I run your example on a 22,538 record set, is shows it processing about 8000 records/second.

image

This code …

image

It’s a shot in the dark, but try telling selectRecordsAsync to return only the fields that you really need.

This thread has more info on how selecting fewer fields can improve performance. (Read to the end of the thread to see Bill’s post where selecting zero fields was 18 times faster.)

You can also improve performance by looping through the videos only ONCE to create an object that maps topics to arrays of records. The resulting object would look something like this:

{
topic1: [recordForVideoOnTopic1, recordForVideoOnTopic1, recordForVideoOnTopic1, ...],
topic2: [recordForVideoOnTopic2,recordForVideoOnTopic2, recordForVideoOnTopic2, ...],
...
}

Then loop through the keys and pick out your video on each topic.

Finally do a single .updateRecordsAsync to write all the updates.

I just discovered that my script works fine when I use it in a standalone scripting app in a dashboard. What is giving me problems is when it is run as an automation.