Help

Spotting Fraud in Airtable Data

832 0
cancel
Showing results for 
Search instead for 
Did you mean: 

It’s not likely many of you known about Benford’s Law, but it will surprise you especially if you happen to unwittingly bet against it.

Imagine a scenario - you’re having lunch at a counter cafe and the old guy next to you (i.e., me) says out of the blue…

I will give you 20 bucks if you open any page in that newspaper and the first number you see starts with a 7, 8, or 9. And as part of this little game, if the number starts with a 1, 2, or 3 - you have to give me just ten bucks.

Would you play that game knowing that the odds are stacked against me? It’s a two-for-one bet after all and numbers, of course, occur in the wild in even distributions, right?

Um, wrong.

Here’s a 10,000 record [data se](https://download .csv file)t of insurance information that I randomly chose for this article - download it and import it into a base. Then run the script block (included below) on these fields:

  • hu_site_limit
  • tiv_2011
  • tiv_2012

Here are the distributions of the numbers from the insurance table. Note how predictable the beginning numbers are.

Still wanna’ bet on that newspaper game? :winking_face:

image

With Benford’s Law, we can analyze expense records and determine if their patterns fall outside these normal distributions. If the pattern does not follow a Benford distribution, there’s a high probability that the data is contrived.

Benford’s Law Script Block

output.markdown("# Benford's Law");

// get the table name
let sourceTable     = await input.tableAsync("Pick the table:");
let sourceTableName = sourceTable.name;

// get the field name
let sourceField = await input.fieldAsync("Pick the field to be analyzed:", sourceTable.id);
let sourceFieldName = sourceField.name;

// get the data set for this gallery
let result = await sourceTable.selectRecordsAsync();
let sourceRecords = result.records;
let totalValues = sourceRecords.length;

// get the record count
let recordCount = sourceRecords.length;

// iterate across all the records
let benfordCounts = {};

for (var r in sourceRecords)
{
  
    // get the value of this record
    let thisValue = sourceRecords[r].getCellValue(sourceFieldName);

    // get the first digit in the number
    let thisDigit = thisValue.toString()[0];

    if (thisDigit != "0") {
        benfordCounts[thisDigit] = (benfordCounts[thisDigit] === undefined) ? {"count" : 1, "percent" : 0.0, "chart" : ""} : {"count" : benfordCounts[thisDigit].count + 1, "percent" : 0.0, "chart" : ""};
    }

}

//
// compute the percentage distribution
//
for (var i in benfordCounts)
{
    // create the percentage for this digit
    benfordCounts[i]["percent"] = (benfordCounts[i].count/totalValues * 100).toFixed(1);
    // create the bar for this digit
    benfordCounts[i]["chart"] = "█".repeat((benfordCounts[i].count/totalValues * 100).toFixed(0))
}

// output.inspect(benfordCounts);
output.table(benfordCounts);
0 Replies 0