Help

The Community will be undergoing maintenance on Friday January 10 at 2:00pm - Saturday January 11 at 2:00pm EST, and will be "read-only." For assistance during this time, please visit our Help Center.

Re: Splitting Values Into Quartiles

4368 0
cancel
Showing results for 
Search instead for 
Did you mean: 
VictoriaPlummer
7 - App Architect
7 - App Architect

You may be in a situation where you have a dataset with a lot of numbers, and you can’t find any meaning in it. (It’s me. I’m in this situation :raising_hand_woman:t5: ) If I could only break those numbers up into smaller groups, maybe I’d be able to better digest and distill some meaning.

As a first start, I figured why not group these numbers into a nice set of four quartiles: Low, Below Median, Above Median, and High. That way I can use Airtable’s grouping to get a better sense of what I’m dealing with. I’ll preface by saying that I majored in English and I do not stand behind any mathematical concepts I’m using. If they’re wrong, call me out, I seriously don’t know better.

Until recently, the only way you could run calculations on the content of an entire field was through rollups, which was pretty inconvenient and took some mental gymnastics to understand. But, with the scripting block you can do this analysis pretty neatly. I found this nice script over on Stack Exchange and built around it so the script also accounts for 0 & empty values.

Screen Recording 2020-05-04 at 07.04 PM

I realized that I didn’t want to group by three different fields (too much nesting), but I did want those three fields to inform my groups. Basically I wanted to tag each record based on a criteria:

:star2: All Star = 3/3 Groups are High or Above Median
:mushroom: Positive Signal = 2/3 Groups are High or Above Median
:rotating_light: Negative Signal = Everything Else

I started writing a complicated IF statement, that only I could decipher, so I thought, maybe a script will do this better. Which it totally did.

Screen Recording 2020-05-04 at 07.17 PM

Here’s a link to the base!

Happy Scripting!

let table = base.getTable('Data');

// Define the field names to loop through, the field names to write to, and the value to write if a cell is Zero/Null
let analysis = [
    
    {analysis:'Volume',output:'Volume Range',nullvalue:'Negligible'},
    {analysis:'Price',output:'Price Range',nullvalue:'Negligible'},
    {analysis:'Revenue',output:'Revenue Range',nullvalue:'No Revenue'},

]
let analysisFields = analysis.map(c => c.analysis);
let outputFields = analysis.map(c => c.output);
let nullValues = analysis.map(c => c.nullvalue);
let query = await table.selectRecordsAsync({fields:analysisFields.concat(outputFields).concat('Signal')});
let records = query.records;

/* Loop through values */
for(let field of analysisFields){

let index = analysisFields.indexOf(field)

//Get the Array of Records
let rawArr = records.map( c => ({id:c.id,fields:{[field]:c.getCellValue(field)}}))

// Filter out Zeroes
let bigArr = rawArr.filter( c => c.fields[field] != 0)
let arr = bigArr.map(c => c.fields[field])

// Method from this awesome Stack Exchange Find: https://stackoverflow.com/questions/48719873/how-to-get-median-and-quartiles-percentiles-of-an-array-in-javascript-or-php

const asc = arr => arr.sort((a, b) => a - b);

const sum = arr => arr.reduce((a, b) => a + b, 0);

const mean = arr => sum(arr) / arr.length;

// sample standard deviation
const std = (arr) => {
    const mu = mean(arr);
    const diffArr = arr.map(a => (a - mu) ** 2);
    return Math.sqrt(sum(diffArr) / (arr.length - 1));
};

const quantile = (arr, q) => {
    const sorted = asc(arr);
    const pos = (sorted.length - 1) * q;
    const base = Math.floor(pos);
    const rest = pos - base;
    if (sorted[base + 1] !== undefined) {
        return sorted[base] + rest * (sorted[base + 1] - sorted[base]);
    } else {
        return sorted[base];
    }
};

const q25 = arr => quantile(arr, .25);

const q50 = arr => quantile(arr, .50);

const q75 = arr => quantile(arr, .75);

const max = arr => quantile(arr, 1);

const median = arr => q50(arr);


var range = [{
  Low: [0, q25(arr)+1]
}, {
  'Below Median': [q25(arr), q50(arr)+1]
}, {
  'Above Median': [q50(arr), q75(arr)+1]
}, {
  High: [q75(arr), max(arr)+1]
}];

function getBuckets(c){

var res = range.filter(function(el) {
  var key = el[Object.keys(el)];
  return c > key[0] && c < key[1]
});

return Object.keys(res[0])[0];

};


// Format values for BatchAnd and updateRecordsAsync()
let update = bigArr.map( c => ({id:c.id,fields:{
    
    [outputFields[index]]:{name:getBuckets(c.fields[field])}

}}))

// Update more than 50 records at a time using my favorite script 😇
await batchAnd('Update',table,update)

// Update Null Values

let nullArr = rawArr.filter( c => c.fields[field] == null || c.fields[field] == 0);
let nullUpdate = nullArr.map( c => ({id:c.id,fields:{
    
    [outputFields[index]]:{name:nullValues[index]}

}}))

// Update more than 50 records at a time using my favorite script 😇
await batchAnd('Update',table,nullUpdate)

};




//Add Signals
// I had to declare my query again for some reason. I'm not sure why but it works.
let newQuery = await table.selectRecordsAsync({fields:analysisFields.concat(outputFields).concat('Signal')});
let newRecords = newQuery.records

// Create an array with the indicators we want to assess
let arr = newRecords.map( c => ({id:c.id,indicators:[

    c.getCellValue('Volume Range').name,
    c.getCellValue('Price Range').name,
    c.getCellValue('Revenue Range').name
    ]
}));

/* Filter out 'High' and 'Above Median' from the array and get the legth of the filtered array, and pass the length through Signal()
For example, we have an the following array: ['High', 'High', 'Low']. Once 'High' is filtered out, the Array is now ['Low'] with a length of 1. 
This made sense at the time, but now explaining it, it sounds crazy. Would gladly take recommendations for improvements here. */
let signalUpdate = arr.map( c => ({
    
    id:c.id,
    fields:{'Signal':signal(c.indicators.map(x => x).filter( x => x != "High" &&  x != "Above Median").length) }
    
    }))

// This function uses the length o f the filtered array to determine the signal
function signal(elem){

    if(elem == 1){
        return {name:'🍄Positive Signal'}
    }else if (elem == 0){
        return {name:'🌟All Star'}
    }else
        return {name:'🚨Negative Signal'}

}

// Update more than 50 records at a time using my favorite script 😇
await batchAnd('Update',table,signalUpdate)


/*
    Use this function to perform 'Update', 'Create', or 'Delete'
    async actions on batches of records that could potentially 
    more than 50 records.

    ::PARAMETERS::
    action = string; one of 3 values:
           - 'Update' to call table.updateRecordsAsync()
           - 'Create' to call table.createRecordsAsync()
           - 'Delete' to call table.deleteRecordsAsync()

    table = Table; the table the action will be performed in

    records = Array; the records to perform the action on
            - Ensure the record objects inside the array are
            formatted properly for the action you wish to
            perform

    ::RETURNS::
    recordsActedOn = integer, array of recordId's, or null; 
                   - Update Success: integer; the number of records processed by the function
                   - Delete Success: integer; the number of records processed by the function
                   - Create Success: array; the id strings of records created by the function
                   - Failure: null;
*/
async function batchAnd(action, table, records) {
    let recordsActedOn;

    switch (action) {
        case 'Update':
            recordsActedOn = records.length;
            while (records.length > 0) {
                await table.updateRecordsAsync(records.slice(0, 50));
                records = records.slice(50);
            };
            break;
        
        case 'Create':
            recordsActedOn = [];
            while (records.length > 0) {
                let recordIds = await table.createRecordsAsync(records.slice(0, 50));
                recordsActedOn.push(...recordIds)
                records = records.slice(50);
            };
            break;

        case 'Delete':
            recordsActedOn = records.length;
            while (records.length > 0) {
                await table.deleteRecordsAsync(records.slice(0, 50));
                records = records.slice(50);
            }
            break;

        default:
            output.markdown(`**Please use either 'Update', 'Create', or 'Delete' as the "action" parameter for the "batchAnd()" function.**`);
            recordsActedOn = null;
    }
    return recordsActedOn;
}
9 Replies 9

Hi @VictoriaPlummer

Do you have a [Link to the base here] ?

All the best,

olπ

And yet, you are quite good at this. :winking_face: Your approach places you at the doorstep of machine-learning. You should step across.

Updated! Thanks for the reminder.

Ha! Thanks Bill, glad this is co-signed by you!

English majors unite! English majors can code, just like anyone else. In fact, I often think of my code in terms of writing a composition. Functions are like paragraphs. Variables are vocabulary. Methods are verbs. Properties are adjectives. Syntax is grammar. Debugging is proofreading.

Kendy511
6 - Interface Innovator
6 - Interface Innovator

@VictoriaPlummer Well, I’ve to say this script 's what I’m looking for. This’s called RFM method to calculate for ranking point.
In your case, You script Volume range from volume field Q1(Quartile 1 (25%) ,Q2 (Quartile 2 (50%) ,Q3 (Quartile 3 (75%). What would I change something to display this values of 25% ?

So glad this helped! The below section in the script is where i think you can change the values for the quartiles. You can even remove some if you’d like.

const q25 = arr => quantile(arr, .25);

const q50 = arr => quantile(arr, .50);

const q75 = arr => quantile(arr, .75);

const max = arr => quantile(arr, 1);

Still get error when edit, Can you help a bit script to display value of q25, q50, q75 ?

Ricardo_Pimente
4 - Data Explorer
4 - Data Explorer

Hi Victoria, congratulations on pulling all this together and thanks for sharing.

In the description you say this accounts for empty values, however I am running into an error if I delete any of the values in the “Volume”, “Price” or “Revenue” columns (as is the case in the data I am trying to apply this to).

Is there an easy fix without me having to manually add zeros to my data? (in my use case it would distort the data).

PS: This is the error I get

ERROR
TypeError: Cannot convert undefined or null to object
at getBuckets on line 84
at on line 92
at main on line 90