Re: My first script: specify selectRecordsAsync fields, getCellValues, updateRecordsAsync

Patrick-Kennedy · ‎Sep 10, 2021

Hello,

I just made my first Script app writing my own custom code :tada: and wanted to share :blush:

Context: I am working with a base that has gone through several iterations of development and am trying to resolve some contradictory data sets. Specifically, for a directory of people who get assigned a “level.” There are five different fields to compare and resolve so as to make a single source of truth for each person. My fiirst step is to identify where there are mismatches across the five fields (a rollup, two lookups, and two single-select fields).

My first thought was to use a nested IF() ƒormula, but I found it difficult to construct the matching logic for the fields values and for when any them are without values (i.e. blank).

My next thought was to use an array formula to make an ARRAYUNIQUE() out of an ARRAYCOMPACT() values (i.e. remove the blank values, then look for unique values). To my surprise, the Array functions are only applicable in Rollups or when the field invoked in a ƒormula is a lookup - never crossed that nridge before with Airtable ƒormulas :astonished: Per the Formula reference page: “Note that array functions can only be used in rollup fields or when the input field is a lookup.”

So, maybe there’s a simpler ƒormulaic way, but not finding a solution with my first couple of ideas, I decided to give a Script app a go at finding mismatches (and then adding a :white_check_mark: to a Checkbox field).

This was also my first time using the options with selectRecordsAsync (to avoid the future deprecation). Specifying the fields definiitely makes a big difference in performance!

Per the comments, I have a lot of questions. Specifically regarding:

selectRecordsAsync default sorting, am I seeing things right? I couldn’t find the default behavior documented, so it made me wonder.
Also wondering if the selectRecordsAsync's fields: [] can be accessed by the field ID, or just the name :thinking:
getCellValue behavior when encountering null values - is there a better way?
Better solution suggestions for dealing with null - e.g. a different method, or a better way to approach the ternary operator?
Suggestioins for determiining objects vs arrays to handle the different ways of invoking the key:value elements and indexes?
Solutions for handling tier limits of under 50 updateRecordsAsync at a time?

If you have any answers or suggestions on how I could improve my script, I would love to hear from you.

:nerd_face:

Thanks!

// Read in the records from the Members table
let table = base.getTable("Members");
let query = await table.selectRecordsAsync({
  sorts: [
    // // just noting for syntax reference
    // {field: "Name", direction: 'desc'}
    // // w/o a 'directiion' the sort defaults to
    // // the 'asc' option for an ascending sort
    // {field: "Name"}
    // I am pretty sure {field: '{CREATED_TIME()}', direction 'asc'}
    // is the default sorting field and does not require declaring a
    // sort option at all, but I put it in below anyways
    {field: "When their record was created"}
  ],
  // declaring the specific fields reduces the Script's Run time significantly!
  fields: [
    "Level 1",
    "Level 2",
    "Level 3",
    "Level 4",
    "Level 5"
  ]
});

// I noticed below when using getCellValue that
// the script would stop iterating over the table
// where a null value was encourntered.
// Noting here for future investigation & learning,
// so I added a variable to count and confirm the
// total number of records iterated over
let totalCount = 0;
// to count the number of mismatches and confirm
// the accuracy of the checkbox update
let countMismatch = 0;

// helper function which will remove the null values from an array
let removeNull = (arr) => {
  let noNull = []
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] != null) {
      noNull.push(arr[i]);
    }
  }
  return noNull;
}

// helper function to reduce an array to unique values
// and check if there is more than one unique values
// i.e. proof-positive of a mismatch
let mismatch = (arr) => {
  return [... new Set(arr)].length > 1;
}

// iterate over all the records on the Members table
// and make an array out of the values retrieved
// from each of the five "Level" fields
for (let q of query.records) {
  totalCount += 1;
  let arr = [];

  // Airscript seems to handle null values in a weird way -
  // it balks when it finds a null and stops iterating over the set of records
  // so I used a ternary to set a null value if getCellValue finds one and balks.
  // I am probably not using the getCellValue method correctly, or there's something more apropos?
  arr.push(q.getCellValue("Level 1") == null ? null : q.getCellValue("Level 1")[0]); // Roll-up (array)
  arr.push(q.getCellValue("Level 2") == null ? null : q.getCellValue("Level 2").name); // Single-Select (object)
  arr.push(q.getCellValue("Level 3") == null ? null : q.getCellValue("Level 3")[0]); // Lookup (array)
  arr.push(q.getCellValue("Level 4") == null ? null : q.getCellValue("Level 4")[0]); // Lookup (array)
  arr.push(q.getCellValue("Level 5") == null ? null : q.getCellValue("Level 5").name); // Single-Select (object)
  // IMHO the above should be broken out to make the array,
  // and the bottom it's own function for writing to a checkbox field on Members
  // e.g. a "getLevels" function which iterates over a "fields" array
  // containing the field names/identifiers (requires navigating the
  // object/array differences?)  For the future . . .

  // condition: pass the array of all five "Level" fields to the "removeNull"
  // function, then use the "mismatch" function to evaluate the resulting array
  // of level names to see if there are any mismatches, and if so, then add a check
  if (mismatch(removeNull(arr)) == true) {
    countMismatch += 1;

    // to review ea. person's resulting array when mismatches are found
    console.log(`${totalCount} ${q.name}`);
    console.log(removeNull(arr));

    // add the check when the condition is met
    await table.updateRecordAsync(q, {
      "Level Mismatch": true,
    });
  }
}

// Summary
console.log(`${countMismatch} mismatches marked in "Level Mismatch" from  ${totalCount} records on ${table.name} table`);

// Final thoughts: add a while loop to reduce the writes to records to under 50 at a time?

Dominik_Bosnjak · ‎Sep 10, 2021

From what I know about JS’s internal sorting methods, there is no default sorting. That is to say, the order isn’t guaranteed. Not unless you specifically sort your query result. If you’re dealing with massive data quantities, investing a few hours into implementing a straightforward bubble sort tailored to your particular use case will be worth the hassle before the day is over - the performance difference is still staggering, even in 2021.

Also wondering if the selectRecordsAsync 's fields: [] can be accessed by the field ID, or just the name :thinking:

Field names, ids, or even entire objects (like if you did base.tables[0].fields[0]) are fair game.

getCellValue behavior when encountering null values - is there a better way?

I don’t understand the issue. Null values are awesome because as far as relational databases an JS are concerned, the absence of evidence is clear evidence of absence.

If your code is getting tripped up on null values, though, just preface the cell query with a question mark:

record?.getCellValue('possiblyNullField');

The built-in linter will even warn you about possibly null values in most cases, so shouldn’t be an issue.

Suggestioins for determiining objects vs arrays to handle the different ways of invoking the key:value elements and indexes?

const whatAre = You => (typeof You === 'object') 
    ? Array.isArray(You) 
    ? 'am array' 
    : 'am object' 
    : 'am potato';

But you could just use Object.keys to iterate over both arrays and objects, assuming the issue here is how to loop throught the entirety of passed data.

Solutions for handling tier limits of under 50 updateRecordsAsync at a time?

while(my500RecordHeavyUpdateQueue.length)
    await base
        .getTable(targetTable)
            .createRecordsAsync(
                my500RecordHeavyUpdateQueue.splice(0,50));

Patrick-Kennedy · ‎Sep 10, 2021

Thanks so much @Dominik_Bosnjak!! :heart_eyes:

Patrick-Kennedy · ‎Sep 10, 2021

re: the behaviour on null, this was the error message when querying the value of a “Single-Select” Field object:

TypeError: null is not an object (evaluating 'q.getCellValue("Level 2").name')
    at asyncFunctionResume
    at promiseReactionJobWithoutPromise
    at promiseReactionJob

This code invoked the error:

arr.push(q.getCellValue("Level 2").name);

So I chose the ternary expression:

arr.push(q.getCellValue("Level 2") == null ? null : q.getCellValue("Level 2").name); // Single-Select (object)

Not sure why the ternary is needed. I tried the question mark syntax within the for (let q of query.records) {...} block, but that threw the same error :thinking:

TypeError: null is not an object (evaluating 'q?.getCellValue("Level 2").name')

i.e.

console.log(q.getCellValue("Level 2").name); // <--- errors, Script doesn't run
console.log(q.getCellValue("Level 2")); // <--- logs the nulls and the handful of objects returned
// so how to get into the objects and handle the nulls?

Dominik_Bosnjak · ‎Sep 10, 2021

Okie, I’ll try to help the best I can then:

your general attitude is pretty healthy - you might have found your true calling with coding. You just need an occasional reminder that the computer is here to solve your problems, not the other way around. So, plan goes first, the typing part is usually last and not really that big of a deal once you learn to recognize problems on a structural level.
so, while I know that getting to this point took tremendous willpower, whether the toughest learning period is behind you is solely up to you. But I’d say it’s time for you to keep pushing past your am-able-to-experiment-in-relative-comfort zone and expanding your skillset.

Because you’re already experiencing the inconvenient limitations of imperative programming: micromanagement is cumbersome. It has its uses if you’re, say, rendering 3D graphics using godeffin HTML, for example, but that’s David Blaine territory.

We just need to filter a few arrays. How hard can it be? :grinning_face_with_sweat:

Well, that and nitpick some syntax while we're at it.

But the excerpt below does raise some performance concerns, depending on the size of your dataset.

What the runtime/browser understands here is roughly:

   arr.push(              // push the following into array arr:
      
   q.getCellValue(      //  if cell value 
    "Level 1") == null  // 'Level 1' of q is null   
      ? null                  // push null

    : q.getCellValue(   // else push cell value 'Level 1'
      "Level 1")[0])      //-'s first element

All of this is problematic in several aspects.

First of all, it’s not how you write ternaries; your goal is to get the statement evaluated as quickly as possible without sacrificing or while even improving readability. Meaning you want to finish evaluating the statement ASAP on every loop. But here, you’re wasting time spelling out that zero means zero. That’s like… one of two things computers understand beyond doubt.

By then, something like this would already be at the end of its next iteration, or the one after:

arr.push(q.getCellValue('Level 1')[0]||null)

I’m using null here to draw a parallel with your snippet, but the actual code below uses an empty array instead; you’ll understand why in a sec.

As for the ternary operator, there’s no need for it here because we don’t have another outcome that we want to keep tabs on. Like, sure, it’s possible, likely even, that you get an empty||null cell, but judging from the 40-something lines of code dealing with the leftovers of your initial query, I’m going to assume you don’t actually need them.

We’re not using the above line btw, this is just an FYI.

Ditto for this being a really bad idea:

a rollup, two lookups, and two single-select fields

“This” being: “five fields called Level # that are actually nothing alike, half the time”. The way you can tell is because it does nothing to help us navigate the data structure and might even confuse us. Or so I hear.

the verbosity of your comments was actually pretty helpful to both of us here, so keep at it for as long as you feel is necessary - better to overexplain than get lost in your own code.
with that said, a good rule of thumb — even with imperative coding — is that any function you only call once is a good destructuring candidate.

Oh, and as for the "balking bug:"

//Airscript seems to handle null values in a weird way
//it balks when it finds a null and stops iterating over the set of records

Good catch.

so I used a ternar-

-aand this is what I meant by the healthy attitude comment above: you sound almost… entertained haha? Which is better than frustrated, but it would be even better to save your energy. Instead of rushing to write more code with extra complexity layers that’s more likely to add to your list of bugs than anything else, take a breather a look at the big picture.

Well, slightly bigger picture, I guess:

for (let q of query.records) {
  totalCount += 1;
  let arr = [];

  // Airscript seems to handle null values in a weird way -
  // it balks when it finds a null and stops iterating

This post is already getting too long, so regarding this peculiarity, I’ll just add that everything you need to debug it is right here in that block above.

All of this headache can be avoided if we leave the number-crunching to the computers;

a minuscule trade-off to doing that is that our code might be dash slower than if we were doing everything by hand - but unless you’re Doc Brown designing some really bleeding-edgy stuff, what matters is having a working prototype in minutes instead of hours. Not to mention that fewer lines of code means fewer mistakes.

This is what a functional approach allows us to do. On top of improving code readability from the perspective of anyone with even just an elementary understanding of certain design patterns.

Now, suggestion #1:

in an ideal world, you’d have done 90% of the prepwork with a single ARRAYUNIQUE() formula field beforehand - then realized another formula field (or even that same one, if you want to get fancy could get the job done all the way.

~~Here’s the 90%. ampersand concatenates strings in Airtable formulae not unlike how + behaves in a JS runtime.~~

ARRAYUNIQUE(
{Level 1} & ‘,’ &
{Level 2} & ‘,’ &
{Level 3} & ‘,’ &
{Level 4} & ‘,’ &
{Level 5}
)

EDIT: Correction:

Dominik_Bosnjak:

One note though about the ARRAYUNIQUE() suggestion… when I use that with the fields and & ',' & I just end up with a comma-separated string listing all the field values, i.e. not a short list of only the unique ones :man_shrugging:t2: Am I missing something?

My bad, I was thinking about doing an object literal like this:
@AllLevels:

"[" & '"' & {Level 1} & '"' & "," & '"' & {Level 2} & '"' & "," & '"' & {Level 3} & '"' & "," & '"' & {Level 4} & '"' & "," & '"' & {Level 5} & '"' & "]"
Then just polling that single cell value with a script. At which point we’d go straight to the aforementioned spread syntax that surfaces uniques:
const uniques = [...new Set(q.getCellValue('AllLevels'))]
But since there are only five fields, you could definitely still wing it with a few regex_replace calls to check for duplicates. Though writing it out is definitely more elegant.

But since I don't have the base in front of me and it seems like you've been wrecking your head with this for a while now... let's just make this a learning opportunity:

const

  table = base.getTable("Members"),

  levels = table.fields.filter(f => f.name.startsWith('Level')),

  query = await table.selectRecordsAsync({ fields: levels }),

  reportBot = n => console.log(`${n.length || n}/${query.records.length} mismatched records.`),

  allRecords = query.records.flatMap(rec => {

    const lookMumNoLoops =

      //step 1: flatMap
      levels.flatMap(F => F = rec.getCellValue(F)

        //step 2: type check
        ?.[(F.type) === 'singleSelect' ? 'name' : 0]

        //step 3:
        ?? []
      );

        // step 4: confirming that step 3 means flatMap + the nullish 
        // coalescing operator. which result in a falsy-ish 
        // (yes zero, no null||undefined) filter


    // you may have to tweak the next part, depending on how
    // the actual base is composed. But following the original
    // plan, let's count the unique 'levels' per record:

    const Mismatch = [...new Set(lookMumNoLoops)].length;


    // then use that to gauge if this record needs
    // updating while trying to demo 'proper'
    // usage of the ternary operator:

    return (Mismatch >= 1)
      ? []
      : {
        id: rec.id, fields: {
          "Level Mismatch": true
        }
      }

  });

// think the logging bit is redundant
// in this context, but there you go:

reportBot(allRecords)

while (allRecords.length)
  await table
    .updateRecordsAsync(
      allRecords.splice(0, 50))

So, there you have it: if you’re going to power through mindbenders like that one, functional programming is the way to go - forget about the how and just tell the heap of silicon in front of your what is it that you want. Which reminds me, we have now successfully… marked a few checkboxes? Yeah, remember that part above about having a plan haha?

Just to be clear: this happened to be a pretty ideal case for showcasing the strengths of FP: everything you learned beforehand about loops and key-value manipulation in objectss is valid and very much relevant. But no one can be expected to pick out the correct tool for the job if they haven’t even seen half the toolboxes JS has to offer. And it kind of seems like you’re ready to make that step. If only because you sound too chill with your recent failures, so maybe learning about currying raises your blood pressure to a functional temperature. :joy:

P.S.

Seriously, as you don’t sound burned out, your next learning milestone seems pretty clear: get acquainted with a few more Array.prototype methods: map, filter, forEach, and reduce - ideally in that order.

Long-term: maybe tackle Eloquent JS, and not just because of the fabulous cover.

P.P.S.

The “balking” bug stems from your careless declaration of arr. It happened inside the for…of loop. Meaning that if the code reaches an unresolved state like, say, null, it’s going to throw, the said array instance very much included. Meaning the wrong query record count comes secondary to your data integrity going to the bin. Moral of the story: you don’t learn scope in JavaScript so much as you learn to live with it. Don’t rock that boat, always check your ~~angles~~curly brackets, and remember to sacrifice a Java programmer to Cthulhu at the start of every winter solstice.

Dominik_Bosnjak · ‎Sep 10, 2021

This code invoked the error:

arr.push(q.getCellValue("Level 2").name);

First reaction in this type of case should be duct-taping your data flow with optional chaining.

Not sure why the ternary is needed.

Pls don’t ever say a ternary is needed or you’ll jinx us. :joy:

It works because you do the chaining part in a roundabout way while going “zero means zero”. See the above post for a detailed explanation, you’ve posted this reply just as I started drafting that wall of text, so I ended up missing it.

Patrick-Kennedy · ‎Sep 11, 2021

Thank you so much - your insights and kind words are very much appreciated. There’s a lot here to review. For now tho, I do have map, filter, & reduce on my radar and just completed chapter four of Eloquent JS :slightly_smiling_face:

One note though about the ARRAYUNIQUE() suggestion… when I use that with the fields and & ',' & I just end up with a comma-separated string listing all the field values, i.e. not a short list of only the unique ones :man_shrugging:t2: Am I missing something?

The bigger prize, however, has definitely been in taking this as a learning opportunity to get better at Javascript and functional programming. Thank you (and apologies for jinxing everyone, and praise Cthulu!! :joy: )

Dominik_Bosnjak · ‎Sep 11, 2021

One note though about the ARRAYUNIQUE() suggestion… when I use that with the fields and & ',' & I just end up with a comma-separated string listing all the field values, i.e. not a short list of only the unique ones :man_shrugging:t2: Am I missing something?

My bad, I was thinking about doing an object literal like this:

@AllLevels:

"[" & '"' & {Level 1} & '"' & "," & '"' & {Level 2} & '"' & "," & '"' & {Level 3} & '"' & "," & '"' & {Level 4} & '"' & "," & '"' & {Level 5} & '"' & "]"

Then just polling that single cell value with a script. At which point we’d go straight to the aforementioned spread syntax that surfaces uniques:

const uniques = [...new Set(q.getCellValue('AllLevels'))]

But since there are only five fields, you could definitely still wing it with a few regex_replace calls to check for duplicates. Though writing it out is definitely more elegant.

EDIT: Here’s a succinct high-level overview of FP. The author does a good job explaining the reasoning behind it without glossing over the shortcomings. Her coding style should also be a notch easier to follow than the final snippet I shared above, most notably because she uses function declarations over expressions.