My script never stops

Kim_Trager1 · ‎Jan 15, 2021

I’m trying to make monthly reports of a few thousands entries.
However my script never stops, and not sure what the mistake is.

I’m pretty new to JS and not sure if it’s my loop within a loop that is the problems?

let wip, wipOrderView ,tableWIP, queryWIP
let cost, costReviewView, costTable, date, customer, garment, switchs, totalFactCost, costTime, estTotalCost, price
let currentMonth

wip='Work in Progress'
wipOrderView='Monthly_order_review'

cost = 'Costing'
costReviewView = 'Monthly_Overview'
date='Date'
customer='Customer'
garment = 'Garment'
switchs = 'Switch'
totalFactCost = 'Total fact cost'
costTime = 'Time'
estTotalCost = 'estTotalCost'
price = 'priceScripting'

tableWIP = base.getTable(wip).getView(wipOrderView)
costTable = base.getTable(cost).getView(costReviewView)


/////////////////////////////////////EVERY MONTHLY REVIEW //////////////////////////////////

let everyMonth = async () =>{
    output.text(`This will take a little time as the system is compiling thousands of records...`)
    let monthlyCostQuery, months, allUniqueMonths, allUniqeGarments, garmentTotals
    
    monthlyCostQuery = await costTable.selectRecordsAsync()
    months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'November', 'December']

    // Find all unique [Years and Months] and unique [Customer:Garment]
    allUniqueMonths = []
    allUniqeGarments=[]

    monthlyCostQuery.records.forEach(el=> {
        
        //Filter out unique [Year:Month]
        let getMonth = new Date(el.getCellValue(date)).getMonth()
        let getYear = new Date(el.getCellValue(date)).getFullYear()
        
        let dateObj = {'year':getYear, 'month':months[getMonth]}
        
        if(!allUniqueMonths.some(obj => obj['year']===getYear && obj['month']===months[getMonth])){
            allUniqueMonths.push(dateObj)
        }


        //Filter out unique [Customer:Garment:EstimateCost:Price]
        let getCustomer = el.getCellValueAsString(customer)
        let getGarment = el.getCellValueAsString(garment)
        let estCost = el.getCellValue(estTotalCost)[0]
        let garPrice = 0
        
        //rollup numbers - avoid if no number rolled up
        if(el.getCellValue(price)){
           garPrice = el.getCellValue(price)[0]
        }
        
        

        let garmentObj = {'customer':getCustomer, 'garment': getGarment, 'totalEst': estCost, 'price':garPrice}

        if(!allUniqeGarments.some(obj=> obj['customer']===getCustomer && obj['garment']===getGarment && obj['totalEst']===estCost && obj['price'])){
            allUniqeGarments.push(garmentObj)
        }
        
    })

    //get total time, cost, staffcost, materialcost

    garmentTotals = []
    allUniqeGarments.forEach(obj=>{
        let custArr = obj['customer']
        let garmArr = obj['garment']
        let estArr = obj['totalEst']
        let priceArr = obj['price']
        let totalTime = 0
        let costStaff = 0
        let costMaterial = 0

        monthlyCostQuery.records.forEach(el => {
            let custCost = el.getCellValueAsString(customer)
            let garmCost = el.getCellValueAsString(garment)
            let estCost = el.getCellValue(estTotalCost)[0]
            let priceCost = 0

            //rollup number issue when no number is rolled up
            if(el.getCellValueAsString(price)){
                priceCost = el.getCellValue(price)[0]
            }
            


            if(custArr === custCost && garmArr===garmCost && estArr === estCost && priceArr === priceCost){
                if(el.getCellValueAsString(switchs)==='Material' || el.getCellValueAsString(switchs)==='Leather' ){
                   
                    costMaterial += el.getCellValue(totalFactCost)
                }else if(el.getCellValueAsString(switchs)==='Staff'){
    
                    costStaff += el.getCellValue(totalFactCost)
                    totalTime += el.getCellValue(costTime)
                }
            }
            
        })
        garmentTotals.push({'customer': custArr, 'garment':garmArr,'price':priceArr, 'timeTotal': totalTime, 'estimate':estArr, 'totalCost':+costStaff+costMaterial, 'staffCost':costStaff, 'materialCost':costMaterial})
        

    })


    console.log(garmentTotals)
    

    
    
}

await everyMonth()

Bill_French · ‎Jan 19, 2021

Two ideas here -

Hash index
Inverted index

#1 Hash Index

This is simply a JSON document intended to make it fast and simple to perform lookups; it’s literally a copy of a collection of items where the key represents a way to directly access the data related to that key.

Let’s say you had a thousand records in a table and you needed to loop through it to find a given record whose name matched a specific customer name. You have three options -

Perform a FilterByFormula approach for each lookup; ugly, slow, convoluted.
Loop through every record seeking a match; very slow and CPU-intensive, added code complexity.
Extract the data from a pre-built hash index; almost instant, elegant.

Approach 3 requires a single pass across all the records containing the customer names (that’s the downside). That pass would create a collection of objects in a single JSON document (ex., oCustomerNameHash that is global to your app) that looks something like this:

{ "ABC Transportation" : {
   "customer_id" : "1001",
   "customer_contact" : "Jimmy Johns",
   other attributes needed for the index...
   }
}

When you want to know the contact name for “ABC Transportation”, the code is simple. effortless, and requires only 6 milliseconds.

let customerContact = oCustomerNameHash["ABC Transportation"].customer_contact;

Even if the hash index has 50,000 items in it, the response time will be sub- 20 milliseconds. This single line eliminates a nested loop and the complexity of doing so. It is possible because you performed the loop in advance of needing to perform the lookups. This makes it possible to vastly compress the code and complexity while also sparing the number of loop introspections required to just one pass instead of vastly many passes.

With a hash index like this, you often find multiple cases where oCustomerNameHash can be utilized. Furthermore, you may need multiple hash indexes for the same data. Imagine you wanted to also lookup by customer ID. This would require a new hash index like oCustomerIDHash. This can be built using the same loop so there’s no need to loop for every index you need.

Lastly, you can mix both indexes into one hash index document by simply creating two keys for each customer - one for the customer name, and one for the customer ID. The lookup process is constant.

#2 Inverted Index

This is a very different idea and ascribed to full-text indices. It makes it possible to perform lookups similar to a hash index but it uses an inverted architecture to make it possible to index multiple fields and perform global search for tokens or field-level search for specific tokens.

I believe you want to focus on the hash index for your process.

See Solution in Thread

Jeremy_Oglesby · ‎Jan 15, 2021

I’m definitely not an expert in asynchronous programming, but it seems to me like you are telling your script to await the results of the everyMonth() async function, but then never telling it to do anything with those results.

Is there a reason that you need to have your everyMonth() function be asynchronous?

You might try just removing the async and await keywords from your script entirely. I’m not sure I see any need for them in your script. And if you can’t do that because the everyMonth() function is surrounding other async functions, then maybe just try doing away with the wrapping everyMonth() function and let everything inside of it run inline in the script environment.

Kim_Trager1 · ‎Jan 15, 2021

@Jeremy_Oglesby I just made it a function as I’m planning much more to the script, so to easier get around in it.
I’ve done this before and it definitely worked fine.
I just tried to take it out, but nothing changed though.

kuovonne · ‎Jan 15, 2021

You definitely have a lot of nested loops. I saw at least three loops, including two loops on the same query result.

I recommend two things:

Throw in some console.log() statements in the loops to teach what is going one. Maybe with all the loops it just takes super long to process everything.
Look for a different way of calculating what you want that doesn’t require all those nested loops.

I wouldn’t worry about the asynchronous call in this case. It looks like you do only one async call and if isn’t in a loop. Of course, I might have missed something as I am reading this on a 4” screen.

Kim_Trager1 · ‎Jan 15, 2021

It’s the last loop that seems to be going on forever…
I’ll attempt another way of achieving roughly the same.

Out of curiosity what determines the speed that the script finishes except from amount of data to run through?
Whats the 80/20 - own CPU power, airtable serverspeed, internet speed or something else?

kuovonne · ‎Jan 16, 2021

All of those things can affect the speed of a script. And, you don’t have much control over those things.

However, you do have control over the number one thing that affects the speed of the script—the actual code itself. There are many ways to do things in code. The more complicated the task, the more skill it takes to pick how to accomplish that task.

Here is an example. Suppose you need to bake enough cookies so that every student in the class gets three cookies. You could loop through the class roster and add three cookies for each student. The more students there are, the longer it will take to calculate the number of cookies. Or you could multiply the number of students by three, and the size of the class won’t really affect the calculation time. However, this faster method requires knowing that multiplying exists and that it produces the same result.

Now you situation is much more complicated, and you probably will need to retain some of your loops. However, maybe you don’t need all of the loops or maybe they don’t need to be nested. I can’t say for sure what specific change to you code will make it run faster. That would require a lot of time to (1) find out your exact goals and (2) understand your current algorithm.

Bill_French · ‎Jan 16, 2021

Gosh, I hate this … (sorry - no answer in my message - just opinions because that’s what I do on Saturdays).

As developers we get roped into codifying a vast and seemingly endless escalation of conditional processing. Where does it end? Apparently it doesn’t because as soon as you resolve this, the client (or management) is going ask for a “slight” modification or you will invariably discover an unanticipated case where you’re standing in a corner holding a paintbrush without egress.

This and vastly many of our projects are fundamentally based on two essential tasks -

Filter out the data that we need
Perform computations on the filtered data

Typically, we use loops to achieve #1. #2 is easy if we get #1 right. But #1 is never simple and rarely performant given enough conditional complexity.

What’s better than this approach? Fewer loops for sure because they are slow, and they make our lives shi**y.

The Remedy…

The term “filter”, and especially in a javascript sense, is subject to abuse. Why not exchange this term with “search”. Indeed, a JSON hash index or an actual inverted index is exactly the right chainsaw to mitigate the looping and increase direct access to lists of record IDs that match what the loop is intended to provide.

I use hash indexes all the time, but increasingly, I use Lunr to establish inverted indexes that are very agile and extremely fast. Imagine finding 200 record IDs from a collection of 30,000 rows where three fields match specific values and the fourth is “fuzzy”. Looping requires lots of code and plenty of time. An inverted index can do this in 20 milliseconds.

And when you factor in the advantages of term presence, wildcard searches, and field-specific boosting, the one loop that builds the index is time extremely well spent. Plus, you can create and cache-forward index documents independent of your business logic.

One last comment - there are many computations and metrics that we perform repeatedly for business analytics. The amount of electricity consumed by the world to sustain analytics is likely many times the GDP of small industrial countries. Why not compute these metrics and store them in the search index so that when the CEO wants to see how her business is performing, you’ve reduced the overhead to a simple search query rendered at the edge in 500ms.

Jeremy_Oglesby · ‎Jan 16, 2021

I am now engrossed in reading lunrjs documentation as my wife is elbowing me to (not so)subtly remind me that I am supposed to be installing new bath and shower trim in the hall bathroom…

Thanks a lot @Bill.French

Bill_French · ‎Jan 16, 2021

Had you indexed that task, you would have been able to find it quickly. Just sayin…

Kim_Trager1 · ‎Jan 19, 2021

Hi @Bill.French, @kuovonne & @Jeremy_Oglesby

Thank you for all your feedback.
I’m very green doing any-form of coding. Airtable has been the gateway drug to finally give into the temptation at giving JS a try.
So this is super valuable information.

I think the immediate problem has been as @kuovonne points out, that I might not initially have spend the time upfront really figuring out what I’m trying to do. Instead I just been building as I go. Which have ended up with my code being 1000 times too convoluted and complicated.

I’m very intrigued by @Bill.French and a JSON hash index - however might be a little to over my head at the moment - considering I’m using loops left front and center. However is there a place I can learn more about an inverted hash index and how to implement it? I’ve looked across the internet and it haven’t become much wiser.

As a general rule of thumb do you (@Bill.French , @kuovonne & @Jeremy_Oglesby ) recommend to always make designated views that has filtered out most of the irrelevant records before doing a selectRecordsAsync()?
As I think I have had this idea that I should try to solve this with code, so wonder how you guys look at this?

Sorry for all these stupid questions…