Scripting Block CSV Importing Long Numbers as Scientific Notation

Hello,

I’ve been working with a script for a while to import some CSVs. The purpose is to cleanse some of the data prior to importing it and to add more controls around how these files are uploaded, versus using the CSV import block, which we can’t control as well.

My question: when I try to import one of my files that contains a 16-digit tracking number, it comes in formatted as scientific notation and loses the digits beyond the point where the number was rounded.

I understand that this is often regarded as an Excel issue, and there are workarounds within Excel, but when I try those workarounds, save the file, and try to import it, I still get numbers formatted in scientific notation. I was wondering if anyone has a potential solution, or if anyone has run into similar issues. My current workaround would be mapping where we currently get our data from to an Excel file and using the scripting import on an .xlsx file type instead, but we would really prefer to keep the files in .csv format.

Thanks for all the help, let me know if there are any further questions.

Yes - seen this; fixed this many times.

Insanity.

The solution is to cleanse and normalize the data while it’s in flight. I use JavaScript to achieve this by first casting the value as a string (this will expose the true number) and then converting it to an int or float as needed.
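As a rough sketch of what that in-flight cleansing can look like (the function name and the integer-detection regex here are my own illustration, not part of the scripting block API — and note this only helps if the cell value reaches you as a string rather than an already-parsed float):

```javascript
// Illustrative cleansing step for one cell value. Values that look
// like plain integers are kept as strings so no digits are lost;
// everything else is converted to a number when possible.
function cleanseCell(value) {
  // Cast to string first - this exposes what we actually received
  // (e.g. "1.2345678901234568e+15" if a float snuck in).
  const asString = String(value);

  // A long ID like a 16-digit tracking number should stay a string.
  if (/^\d+$/.test(asString)) {
    return asString;
  }

  // Genuinely numeric fields can be converted to int/float.
  const asNumber = Number(asString);
  return Number.isNaN(asNumber) ? asString : asNumber;
}
```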

But your answer depends somewhat on the nature of the data.

There’s Number.toFixed(), but it falls back to scientific notation if the number is >= 1e21, and it has a maximum precision of 20. If neither of those approaches works, you can roll your own, but it may get deep (as in deep sh*t).

// Expand a number in scientific notation into a plain digit string.
function toFixed(x) {
  if (Math.abs(x) < 1.0) {
    // Small numbers, e.g. 1e-7: shift the digits and pad with
    // leading zeros after the decimal point.
    let e = parseInt(x.toString().split('e-')[1]);
    if (e) {
      x *= Math.pow(10, e - 1);
      x = '0.' + (new Array(e)).join('0') + x.toString().substring(2);
    }
  } else {
    // Large numbers, e.g. >= 1e21: divide down below the threshold
    // where toString() switches to scientific notation, then pad
    // the result with trailing zeros.
    let e = parseInt(x.toString().split('+')[1]);
    if (e > 20) {
      e -= 20;
      x /= Math.pow(10, e);
      x += (new Array(e + 1)).join('0');
    }
  }
  return x;
}
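One caveat worth spelling out: a function like this can only re-expand the digits that survived parsing. Once a value has been read in as a JavaScript Number (an IEEE 754 double), anything past about 15–16 significant digits is already rounded away, which is exactly why long tracking numbers tend to come back with trailing zeros. A quick demonstration:

```javascript
// Integers are exact only up to Number.MAX_SAFE_INTEGER (2^53 - 1).
console.log(Number.MAX_SAFE_INTEGER);          // 9007199254740991

// A 19-digit literal silently loses its trailing digits at parse
// time - no string-formatting trick can bring them back.
const tracking = 1234567890123456789;
console.log(tracking.toString());              // "1234567890123456800"
console.log(Number.isSafeInteger(tracking));   // false
```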

Hi Bill.

Thanks for the response. I thought I was really close with this answer but ended up just getting a bunch of zeroes instead of the actual number after 15 digits or so.

I was trying to figure out at which point the numbers are getting formatted into scientific notation, and after using output.table to view the parsed contents of my CSV, I noticed that even in the parsed contents, the data is formatted as scientific notation. My question is: does this point to an issue in my CSV, or does it point to an issue in the way the scripting block parses my CSV?

Thanks again for all the help.

I’m not certain; it may be neither, because there are issues in Airtable with floating-point math.

Thanks for the reply. I don’t know if that’s the issue. I think where my issue differs is that my long string of numbers won’t even import correctly from my script. I can convert it to a string in the same script, but by then it has already been imported in scientific notation.

That’s why I think this is either happening in Excel or somewhere within Airtable as part of the parsedContents parsing step.

A sample CSV file may help move this forward. Can you share a file that demonstrates the problem? The smaller the better :slight_smile:

Sites like pastebin.com will host text files like that without requiring sign-up. Alternatively, you could try just posting the literal file contents into this discussion thread (though there’s some risk that we’ll miss an important detail if we create our own CSV file to try to reproduce the bug).


Thanks for the response.

After doing more digging and testing on my own, I was able to determine that my issue was happening within the scripting block’s automatic parsing functionality. I created a new forum post here to tackle this specific issue; it doesn’t currently look like there is a solution within the scripting block, but it has become a feature request!

Thanks for all the help on this issue, hopefully we get this feature implemented in the near future.

Sounds like you made progress in identifying the source of the issue and @Billy_Littlefield was quick to provide some detailed insight that is beyond my intellectual pay grade. :wink:

Precisely what automatic parsing functionality are you referring to?

As I’ve mentioned a few times, good workarounds often arise when developers are given a chance to actually see the code that’s responsible for the roadblock, and if nothing else, we can all become familiar with the pattern to avoid the experience you’ve had.


Sure, here is an example of the code I was using:

let testFile = await input.fileAsync(
    'Please upload file',
    {
        allowedFileTypes: ['.csv'],
        hasHeaderRow: true
    }
);
let parsedFile = testFile.parsedContents;
output.table(parseFile);
output.inspect(testFile);

If you take this code and run it, you’ll notice that calling output.inspect() on the file gives an object whose parsedContents array already contains the data from the file, and that when you call output.table(), much of what was in the file has been formatted automatically (i.e. date/time fields, numbers, etc.). This led me to believe that there was a hidden step within the block causing it to format my long numbers as scientific notation, that step being the automatic parsing.

The automatic parsing functionality is discussed here, when the input.fileAsync was initially announced.

P.S. I know there’s a much better way to format my code in these posts, and maybe I’m just completely clueless, but if someone could point me to how that’s done, that would be great. I’ve seen other posts where bits of code are highlighted with color just like in the actual scripting block.

Thanks, Sam. This is helpful to see. To run this and experience what you are seeing, we’d need to construct a CSV example - can you also share a three-record CSV that includes the data we’d need to see this?

Understood. But isn’t output.table() (by its very nature) a rendering method that is already predisposed to show data with biased optics? Can we trust what we’re actually seeing with this method?

With regard to this approach (i.e., testFile.parsedContents), wouldn’t it be fairly simple to overcome this limitation by simply parsing the CSV document natively in JavaScript, thus avoiding the assumptions made by parsedContents? I’m still not certain this would remedy your issue, but I’ll gladly give it a try when you drop three nasty CSV records on me. :wink:
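A minimal sketch of that native approach, assuming you can get at the raw CSV text (quoted fields and embedded commas aren’t handled here, and parseCsvAsStrings is my own illustrative name, not a scripting block API):

```javascript
// Parse simple CSV text into row objects, keeping every field as a
// string so long numbers are never coerced into floats.
function parseCsvAsStrings(text) {
  const [headerLine, ...lines] = text.trim().split(/\r?\n/);
  const headers = headerLine.split(',');
  return lines.map(line => {
    const cells = line.split(',');
    // Values stay strings: a 19-digit tracking number survives intact.
    return Object.fromEntries(headers.map((h, i) => [h, cells[i]]));
  });
}

const rows = parseCsvAsStrings('id,tracking\n1,1234567890123456789');
console.log(rows[0].tracking);   // "1234567890123456789"
```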

One housekeeping issue - I assume parseFile should really be parsedFile.

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.