.csv Parsing through Scripting Block

I am attempting to map a csv file to one of my bases via scripting due to some pre-processing I would like to do, however, I began running into issues with parsing. Figuring that I had made a mistake I created a new script to test a more simple csv with state names and regions essentially hoping I could troubleshoot on my own.

This is the output table I get with an csv I upload:
image

And here’s the script I’m using:

// Prompt for a file upload
let csvFileResult = await input.fileAsync(
    'Upload a CSV file',
    {allowedFileTypes: ['.csv'], hasHeaderRow: true}
);

//Display table w/ parsed contents
let csvRows = csvFileResult.parsedContents;
output.table(csvRows)

I’ve been looking at the spreadsheet importer example and the documentation within the block but have had no luck thus far, any help would be much appreciated.

Thank you,
Sam

Could you share the first few rows of the CSV file?

1 Like

Sure, thanks for the response Mike.

States & Regions Screenshot

Keep in mind that I’m using this csv as test. Appreciate the help.

Your script works for me! There’s something fishy about the output in your
original post: it includes a cell with the text “Sheet1”. A “sheet” is a
concept in a spreadsheet file (which normally has a filename like .xls or
.ods); CSV files don’t have sheets.

Usually, the file extension accurately describes the format of the data
inside, but sometimes things get mixed up. What makes this a bit tough to see
is that many popular spreadsheet applications quietly convert between the
formats to help you get your work done. So even though your seconds screenshot
shows that the file works in a spreadsheet application, it might not actually
be formatted in the way Airtable expects.

Are you able to verify that the file is indeed formatted with plain-test
comma-separated values? Or could you add the example file as an attachment
here?

2 Likes

There’s something fishy about the output in your
original post: it includes a cell with the text “Sheet1”.

I had the same thought, I’ve never seen a sheet number come up when uploading a CSV anywhere.

I don’t think file uploads are permitted through this site (I at least don’t see it, I could be missing it) but now that I know it’s not the script, I attempted the same script with other CSVs I had lying around and got the same result. Most of these CSVs have been converted from .xlsx or .xls file types.

It sounds like you’re right about something happening quietly that ends up in a format that Airtable doesn’t expect.

I’m not sure if this is particularly helpful but here’s a screenshot of the CSV text:

States & Regions Text

Do you have any advice for how I might be able to work around this issue? Do you need any additional information?

Thanks again.

Looks like the forum’s “upload” capabilities are limited to image files, so scratch that.

There are some small differences between the data in the second and third screenshots, but it doesn’t look like any of them are significant here.

We can take a look at the “raw” text contents using the HTML File API:

output.text(JSON.stringify(await csvFileResult.file.text()));

Does that report CSV-formatted text?

That seems to show us what we needed to see. Here is the output:

"statename,stateabr,Region,Region#\r\nAlabama,AL,Southeast,4\r\nAlaska,AK,West,1\r\nArizona,AZ,Southwest,2\r\nArkansas,AR,Southeast,4\r\nCalifornia,CA,West,1\r\nColorado,CO,West,1\r\nConnecticut,CT,Northeast,5\r\nDelaware,DE,Northeast,5\r\nFlorida,FL,Southeast,4\r\nGeorgia,GA,Southeast,4\r\nHawaii,HI,West,1\r\nIdaho,ID,West,1\r\nIllinois,IL,Midwest,3\r\nIndiana,IN,Midwest,3\r\nIowa,IA,Midwest,3\r\nKansas,KS,Midwest,3\r\nKentucky,KY,Southeast,4\r\nLouisiana,LA,Southeast,4\r\nMaine,ME,Northeast,5\r\nMaryland,MD,Northeast,5\r\nMassachusetts,MA,Northeast,5\r\nMichigan,MI,Midwest,3\r\nMinnesota,MN,Midwest,3\r\nMississippi,MS,Southeast,4\r\nMissouri,MO,Midwest,3\r\nMontana,MT,West,1\r\nNebraska,NE,Midwest,3\r\nNevada,NV,West,1\r\nNew Hampshire,NH,Northeast,5\r\nNew Jersey,NJ,Northeast,5\r\nNew Mexico,NM,Southwest,2\r\nNew York,NY,Northeast,5\r\nNorth Carolina,NC,Southeast,4…

Didn’t feel the need to put all the text in there. Definitely not just comma separated, looks like it’s returning to the start of a line then going to a new one. You have any advice on how to handle this? Thank you.

This is strange. The file apparently has Windows-style line endings, but I’ve
confirmed that works on my end, too.

Working backwards from your first screen shot, I have some guesses about the
structure of the parsedContents object. It may be that if you access the
Sheet1 property, you’ll see the output you’re after:

But I wouldn’t recommend relying on that. We might be dealing with a platform
bug. Could you share your browser and operating system (I’m using Chromium on
Ubuntu 18.04). Also, I appreciate that you don’t want to fill this thread with
a lot of data, but there’s a chance that some content later on in the file is
triggering unexpected parser behavior. Could you shrink the file down to a size
that is small enough to share here (using the “raw” text method from my
previous reply) and that still renders incorrectly for you?

It seems like the block may be misinterpreting the file’s type on your system. Could you also share the result of:

output.text(csvFileResult.file.type);

(It’s "text/csv" for me.)

I’m running Chrome on Windows 10.

Here’s the full raw text:

“statename,stateabr,Region,Region#\r\nAlabama,AL,Southeast,4\r\nAlaska,AK,West,1\r\nArizona,AZ,Southwest,2\r\nArkansas,AR,Southeast,4\r\nCalifornia,CA,West,1\r\nColorado,CO,West,1\r\nConnecticut,CT,Northeast,5\r\nDelaware,DE,Northeast,5\r\nFlorida,FL,Southeast,4\r\nGeorgia,GA,Southeast,4\r\nHawaii,HI,West,1\r\nIdaho,ID,West,1\r\nIllinois,IL,Midwest,3\r\nIndiana,IN,Midwest,3\r\nIowa,IA,Midwest,3\r\nKansas,KS,Midwest,3\r\nKentucky,KY,Southeast,4\r\nLouisiana,LA,Southeast,4\r\nMaine,ME,Northeast,5\r\nMaryland,MD,Northeast,5\r\nMassachusetts,MA,Northeast,5\r\nMichigan,MI,Midwest,3\r\nMinnesota,MN,Midwest,3\r\nMississippi,MS,Southeast,4\r\nMissouri,MO,Midwest,3\r\nMontana,MT,West,1\r\nNebraska,NE,Midwest,3\r\nNevada,NV,West,1\r\nNew Hampshire,NH,Northeast,5\r\nNew Jersey,NJ,Northeast,5\r\nNew Mexico,NM,Southwest,2\r\nNew York,NY,Northeast,5\r\nNorth Carolina,NC,Southeast,4\r\nNorth Dakota,ND,Midwest,3\r\nOhio,OH,Midwest,3\r\nOklahoma,OK,Southwest,2\r\nOregon,OR,West,1\r\nPennsylvania,PA,Northeast,5\r\nRhode Island,RI,Northeast,5\r\nSouth Carolina,SC,Southeast,4\r\nSouth Dakota,SD,Midwest,3\r\nTennessee,TN,Southeast,4\r\nTexas,TX,Southwest,2\r\nUtah,UT,West,1\r\nVermont,VT,Northeast,5\r\nVirginia,VA,Southeast,4\r\nWashington,WA,West,1\r\nWest Virginia,WV,Southeast,4\r\nWisconsin,WI,Midwest,3\r\nWyoming,WY,West,1\r\n”

And the file type:

application/vnd.ms-excel

Well, that’s problematic I assume. Does this help clarify what’s wrong? I appreciate the quick responses, it’s incredibly helpful.

And the file type:

application/vnd.ms-excel

Well, that’s problematic I assume. Does this help clarify what’s wrong?

Yes, that’s unexpected. This bug report from another project has a little more detail:

I think Windows is messing with MIME types where it shouldn’t. The official MIME type for CSV files is text/csv. It appears that application/vnd.ms-excel is also the MIME type for actual Excel files

@Stephen_Suen @somehats This seems like something the Scripting block may have to address.

In the mean time, two workarounds come to mind (both seem safer than parsedContents.Sheet1). You might try disassociating .csv files with Excel in the “Default Programs” settings in Windows. That was suggested in the above issue, but no one reported whether it worked. It could also interrupt other workflows you have around CSV files.

Alternatively, you could try exporting the file as a .xls or .xlsx file, and then updating your script like so

 // Prompt for a file upload
 let csvFileResult = await input.fileAsync(
     'Upload a CSV file',
-    {allowedFileTypes: ['.csv'], hasHeaderRow: true}
+    {allowedFileTypes: ['.xls', '.xlsx'], hasHeaderRow: true}
 );

 //Display table w/ parsed contents
 let csvRows = csvFileResult.parsedContents;
 output.table(csvRows)

This may work if Windows and Airtable might have more agreement on what it means to be a spreadsheet file than they do on what it means to be a CSV file.

I appreciate the quick responses, it’s incredibly helpful.

Sure thing!

1 Like

Thanks for the detailed report and debugging efforts @Sam_Cederwall and @Mike_Pennisi. You’ve identified the underlying issue here perfectly. To summarize –

The Sheet1 root property on parsedContents is the result of the scripting block parsing this file as an XLS file, which can contain multiple sheets and therefore needs this extra layer at the root to differentiate between rows in different sheets. “Sheet1” is the default fallback when no sheet name is provided. When a CSV file is parsed, we omit this “Sheet” layer and parsedContents just returns the rows directly (as mentioned above in the thread, there’s no concept of multiple “sheets” within a CSV file)

The unexpected bit here is the mismatch between the mime-type and the file extension. I think this qualifies as a bug - we can update the scripting block to remove this “Sheet” layer for both files ending in “.csv” and files of type “text/csv”.

2 Likes

Thank you guys for all the hard work on this. Appreciate the help, glad we could uncover this bug.

Also @Billy_Littlefield, if you could let me know here when this has been fixed, that would be a tremendous help. Thanks again.

@Sam_Cederwall Sure thing, I’ll post here once the fix is live.

@Sam_Cederwall The patch just went out, so parsedContents should directly return an array of row items for that CSV. Let me know if you’re still experiencing issues!

2 Likes

Thanks for the update @Billy_Littlefield, it’s working great so far!