Help

Re: Forecasting & Machine Learning

8387 0
cancel
Showing results for 
Search instead for 
Did you mean: 

While scoping a new Airtable solution for a client, the topic of forecasting came up and specifically by employing machine learning. There are many definitions of machine learning and while my focus was to at least prove to myself that a Script Block could perform the forecasting process natively (i.e., without calling out to any external services), I took a stab at crafting a simple data science block to forecast sales by month for the next three years.

Setting the Table - Aggregation by Month/Year

The sales data exists in a table which includes about 3,000 transactions spanning 2018 and 2019. Instead of using roll-ups or formula fields I decided to see how well the Script Block was able to perform the aggregation of sales by year and month.

The objective is simple – total sales by month and year. The performance of this part of the process is blistering fast; about 1.3 seconds.

image

Forecasting Algorithm

As I mentioned, there are many AI approaches to forecasting but I self-constrained this project to allow me to keep it simple; the model has two key requirements:

  1. It must be native Script Block code;
  2. It must be able to learn as the actual sales data expands.

Many of the forecasting models in AI require GPU’s to process and employ relatively sizeable model resources. As such, I chose an algorithm that I can code in every-day javascript using triple exponential smoothing which explicitly adds support for seasonality to the univariate time series. This method is sometimes called Holt-Winters Exponential Smoothing, named for two contributors to the method: Charles Holt and Peter Winters. In addition to the alpha and beta smoothing factors, a new parameter is added called gamma (g) that controls the influence on the seasonal component.

How it "Learns"

The script is designed to utilize historical data (2018/2019) to forecast 2020. Then it automatically expands the next forecast to use three year’s data (2 historical values for each month plus one forecast value) to determine yet another forecast value. As the future inches forward and actual data expands, the forecasting process becomes increasingly more intelligent and likely more precise.

image

Why Not Use Formulas?

Airtable formulas are wonderful tools, but they tend to work against you in complex processes. And in some case, they expand dependencies and the number of fields in your data model to the point where it’s difficult to manage and maintain.

Examples like this make it abundantly clear - script-based blocks have a very bright future for Airtable builders because they demonstrate how to craft complex computational features that interchange with other data processes and reporting requirements. Formulaic-based processes will never be able to achieve these objectives, nor should they be expected to. Imagine how difficult it would be to replicate this script as a formula. It might be possible but it would be very ugly indeed.

The question is [rather] …

Why not make it possible for Script Blocks to become formulas?

What Next?

This example doesn’t make any assumptions about what to do with the forecasted values; that’s up to you. As a simple block app it’s useful in a variety of forecasting use cases as is. It could be applied to any time-series data as a prediction tool.

With this data and the power of integrated javascript in the block, this data grid can be updated to another table, or emailed as a report, or incorporated into many other use cases. Instantiated as a table that is updated from time-to-time, it could drive data visualizations with other blocks.

AirDrop

A few people have asked about the name “AirDrop”. This is just a tag name I have assigned to a class of scripts I openly share with the Airtable community. It helps me keep my script libraries organized so I can remember what has been sanitized for open sharing. “AirDrops” are not declared under an open-source license, but they are offered here for anyone to use to learn from or shape into other solutions.

Here’s the complete code… enjoy.

/*

   ***********************************************************
   Airdrop - Forecasting
   Copyright (c) 2020 by Global Technologies Corporation
   ALL RIGHTS RESERVED
   ***********************************************************
   
*/

// get the table name
let sourceTableName = "Sales Data";

// display app title
output.markdown('# AirDrop Forecasting');

// get the source data
let sourceTable = base.getTable(sourceTableName);

// get the data set for 2018/2019
let result = await sourceTable.selectRecordsAsync();

// set a counter for testing
var count = 0;

// build the dataframe
var aDataFrame = [];
for (var i = 1; i < 13; i++)
{
    aDataFrame.push({
        "Month" : i,
        "2018" : 0,
        "2019" : 0,
        "2020" : 0,
        "2021" : 0,
        "2022" : 0
    });  
}

// iterate across all the records
for (let record of result.records)
{
  if ((record.getCellValue("MONTH_ID")) && (record.getCellValue("YEAR_ID")) && (record.getCellValue("YEAR_ID") != "2005") )
  {
    updateDataFrame(record);
    count += 1;
  }
}

//
// compute the future
//
for (var i = 0; i < aDataFrame.length; i++)
{
  aDataFrame[i]["2020"] = forecast(3, [aDataFrame[i]["2018"], aDataFrame[i]["2019"]], [1,2]);
  aDataFrame[i]["2021"] = forecast(4, [aDataFrame[i]["2018"], aDataFrame[i]["2019"], aDataFrame[i]["2020"]], [1,2,3]);
  aDataFrame[i]["2022"] = forecast(5, [aDataFrame[i]["2018"], aDataFrame[i]["2019"], aDataFrame[i]["2020"], aDataFrame[i]["2021"]], [1,2,3,4]);
}

// format the final results
for (let i in aDataFrame)
{
    aDataFrame[i]["2018"] = aDataFrame[i]["2018"].toFixed(0).replace(/(\d)(?=(\d\d\d)+(?!\d))/g, "$1,");
    aDataFrame[i]["2019"] = aDataFrame[i]["2019"].toFixed(0).replace(/(\d)(?=(\d\d\d)+(?!\d))/g, "$1,");
    aDataFrame[i]["2020"] = aDataFrame[i]["2020"].toFixed(0).replace(/(\d)(?=(\d\d\d)+(?!\d))/g, "$1,");
    aDataFrame[i]["2021"] = aDataFrame[i]["2021"].toFixed(0).replace(/(\d)(?=(\d\d\d)+(?!\d))/g, "$1,");
    aDataFrame[i]["2022"] = aDataFrame[i]["2022"].toFixed(0).replace(/(\d)(?=(\d\d\d)+(?!\d))/g, "$1,");
}

//
// forecast (based on triple exponential smoothing)
//
function forecast(x, ky, kx)
{
   var i=0, nr=0, dr=0,ax=0,ay=0,a=0,b=0;
   function average(ar) {
          var r=0;
      for (i=0;i<ar.length;i++){
         r = r+ar[i];
      }
      return r/ar.length;
   }
   ax=average(kx);
   ay=average(ky);
   for (i=0;i<kx.length;i++){
      nr = nr + ((kx[i]-ax) * (ky[i]-ay));
      dr = dr + ((kx[i]-ax)*(kx[i]-ax))
   }
  b=nr/dr;
  a=ay-b*ax;
  return (a+b*x);
}

//
// display the forcasted dataframe
//
output.table(aDataFrame);

//
// update dataframe
//
function updateDataFrame(record)
{
    try {
      var yearID = parseInt(record.getCellValue("YEAR_ID")) + 15;
      var itemID  = findElement(aDataFrame, "Month", record.getCellValue("MONTH_ID"));
      aDataFrame[itemID][yearID] += record.getCellValue("SALES");
    } catch(e) {
      output.inspect(e);
    }
    return(true);
}

//
// find an element in the aggregate list
//
function findElement(arr, propName, propValue)
{
  for (var i = 0; i < arr.length; i++)
  {
    if (arr[i][propName] == propValue)
      return i;
  }
}
19 Replies 19

Hi Burak! Welcome to the community!

The data I used is actual data from a client and it’s not even from the years displayed; I had to write additional algorithms to obscure the original dates/times in the sourced sales data and this is not exactly obvious in the code. As such, the intent of this shared code example is to give other coders a leg up for building similar forecasting approaches.

Indeed, this is not intended to be a solution or an example of a working solution. Rather, it’s intended to share an approach I was successful with so that other Airtable users who have coding experience.

As one reader wrote -

This is much more sophisticated than anything I could do…

Machine learning and forecasting is complex stuff and attempting to apply it is not easy, and especially not easy without some basic development skills. But this can be said for almost any use of the Script Block.

As this analysis shows, even someone with deep experience in building forecasting systems can be sideswiped by development issues. Ironically, I forecast a 4 hour effort to build this example in Script Block but it ultimately took two more than two full development days to get it right -

image

At the risk of sounding a bit harsh, integrating a forecasting model into your own data is likely to (a) require a keen understanding of your data and how to shape it so that machine learning can be applied, and (b) require code, and perhaps a fair bit of complex code.

I would encourage other developers in the community to expand upon this approach to create derivatives that employ exponential smoothing in a fully shared base. Unfortunately, my resources for this topic are tapped unless my boss has me on a clock developing it. :winking_face:

Oh I see, now it’s clear many thanks for the detailed response.

So, I hope someone in the community will publish a template that unexperienced people can use.

Fingers crossed! :slightly_smiling_face:

Ahhhh! Got it to work, too!

Wow, thank you so much for sharing this code, my airtable mind just went Kaboom!! - just when I was about to look for other software to process our historic sales data.

Adam_Rehm
4 - Data Explorer
4 - Data Explorer

Bill,

I am having issues with type errors on the Data frame.

I also don’t have much clue what I am doing, so that might not help either. I have month and year columns created already, but am still getting the error.

image

Hi Adam, and welcome to the community!

It’s really difficult to say exactly where the issue lay until I see all of the data and how you’ve configured the example to apply in this use case.

It’s important to understand that this is an approach, not a solution. I reluctantly share the source code because so many people ask for it and I’m trying to avoid justifying why I shouldn’t share it. Data science, after all, is a little complex and as you can see in the code, it involves a pretty good understanding of multi-dimensional array handling. Your issues are likely very much related to @Burak_Bardakci and @Joe_Levy ran into with my somewhat convoluted example. If I get some time this week I’ll try to assist but I will need to have your data to be helpful.

I would love to see someone take this to a higher level as Burak suggested -

I hope someone in the community will publish a template that unexperienced people can use.

Brett_Snelgrov1
4 - Data Explorer
4 - Data Explorer

This is fascinating! I have been looking for a solution similar to Excel’s Forecast or Trend formulas within Airtable and this is the closest I have seen thus far in Airtable.

One question, what is required in terms of the layout of the fields within Airtable. This is what I have below.
Screenshot 2022-07-09 at 12.27.34 PM

But I am getting this result in the Script output:
Screenshot 2022-07-09 at 12.28.59 PM

If anyone can provide some tips, that would be appreciated.

Thanks!

Thanks for your kind comments. One of the upsides of contributing example approaches is that someone might use it for benefit. It’s also the downside - it might waste a lot of precious time. LOL

I wish I had free time to help, but my advice - use the approach to glean a pattern and then apply that pattern to your own data and possibly rewrite the code substantially. My experience with this - and invariably all data science projects - is that your own data model is typically just different enough to cause great anguish trying to fit the data to a retrograde legacy implementation.

One more consideration - this was designed using the “blocks” SDK at a time in Airtable’s history when Blocks had just been released. As such, I consider this an interesting historical artefact that would never be developed in this manner today. :winking_face:

Grace_Kochanek
4 - Data Explorer
4 - Data Explorer

@Bill.French Hello! I was wondering if you could help me out with a quick question…

I work with builder clients using the Monday.com platform and most of them need accurate forecasting for projects and revenue that can be flexible using specific factors. In my time working with Monday, I have yet to find an app, workaround or any feature that can do what we need. I outlined the key features below for your reference.

  • Inputting milestone dates (i.e. installation date, installation ready, etc) and having other dates adjust automatically. I have been able to create this in Monday, but only for forward dating. I can add the “build ready” date and populate the later dates (projected build complete), but I cannot add a final date and calculate when I need to be ready (i.e. back dating)… at least not in a way that doesn’t take 30 columns.
  • Flexibility is largely something Monday does not offer for scheduling currently… lots of our clients want to be able to see these dates and then change the variables that they are dependent on to see how it reflects the timeline (i.e. how many other projects do we already have on the books, how much capacity do we have for workers, how long do we have to complete this, etc.).

Overall, we often hear that Monday is good because it is visual and easy to understand, but the more complex scheduling or relations between entities are quite limited.

I was wondering if you have seen functionalities similar to this in airtable and/or if you think this could be completed in airtable. I am looking forward to your response!

Hi Grace, and thanks for reaching out on this forum.

Proper disclosure - I am not the best person to ask questions concerning project management or anything involving calendar integration. However, there are some bright developers here that are, and I’m sure we’ll hear their voices as well. @Adam_TheTimeSavingCo, and @Kamille_Parks come to mind, but there are many you can spot by searching this community concerning project management.

Understandably. I think there’s a tipping point at which project management reaches a level of dependency complexity that suggests you need software designed for specifically this challenge. Microsoft Project comes to mind, of course, and it has proven to work well for projects with deep dependencies. Does it make it any simpler to manage complex projects? That’s a big no. But it does provide the underlying architecture for predicting impact when dates and times change.

Yep - a challenge. This is exactly the way SpaceX works; they must enter an immovable launch window and then manage the tasks in reverse to hit that window.

I enjoy questions that do not require me to do anything. :winking_face: Here’s my opinion…

If there are any custom nuances or specific requirements that will make your solution a success, Monday[.com] is almost certainly not going to get you there. Airtable [out of the box] will also struggle with some of the deeper dependencies that I suspect are lurking in your ideal solution.

Unlike Monday, Airtable has a few “outs” as they say in poker, and let’s be clear - this is all about betting which pathway will help us win (I recommend Thinking in Bets by Annie Duke, BTW).

At the outset, Airtable supports three key integration and automation facilities that enable deep customizations of workflow and data flow. This makes it possible to create schedule refactoring processes that appear seamless to the user’s experience. While its event model is weak (by most standards), Monday[.com] doesn’t have one, and Airtable’s is suitable for building automations that trigger based on many kinds of changes. This seems to be a critical element for your solution.

Example… You need the ability to react to changes that are recognized when a record enters a specific view - do (x) where (x) may be a simple field change performed in a no-code configuration or a script process that updates 20 different dependencies across three tables. It is this diversity - from #no-code to #script that sets Airtable apart.

Script

Can you can pull this off without some script programming? Probably not. So, to ride the betting metaphor, I would wager a nice sub sandwich that somewhere in your solution, there will be code. But that’s not a significant issue since Airtable’s customization flexibility is based on javascript throughout and there are tens of millions who know javascript. This community has a broad array of people who write really elegant script that drive very custom Airtable behaviours.

Indeed, there are very few outs. Ideally, you never want to paint yourself into a corner, but with some poor planning or unanticipated requirements, this is bound to happen and Airtable will provide those outs.

Grace_Kochanek
4 - Data Explorer
4 - Data Explorer

Hi @Bill.French

I really appreciate you taking the time and responding to my question. You were extremely helpful and thorough with your answers.

I am going to take this back to my team and bounce these idea off of our clients. I will reach back out if I have more questions for you.

I hope you have a wonderful week!

Grace