URL to Attachment

Hi! I have sooo enjoyed this script for images. But I have a PDF automatic download link like this: https://www.teacherspayteachers.com/Preview/Noun-Sort-Cut-and-Paste-083291900-1381063936?fn=valentinesdayworksheetspreview.pdf

That is NOT pulling the PDF correctly. Any ideas?

One idea - that URL is not publicly accessible, a requirement of Airtable attachments.

Ah, makes sense. Ok, thanks!

How about this? I know it is in a viewer…https://www.teacherspayteachers.com/pdfjs/view.html?file=https%3A%2F%2Fpreview-2.teacherspayteachers.com%2F0%2F944%2F944153%2F6d9dcd81962c9b98d6d16b223d687828%3Fpreview-token%3Dexp%3D1596657536~hmac%3D7376d51a55862c7f9837c6e18591f1b34907f90be611c2387ba77cb1ee6c25f9%26file_name%3Dwinterworksheetspreview.pdf

Viewers do note make it possible for a GET request to access the file. Viewers are for humans; Airtable’s attachment system is a machine.

Ha, yeah, I know. I want the impossible :slight_smile:

The viewer you shared is an HTML proxy for viewing what appears to be a PDF document. Someone needs to tell you how to look at the PDF without going through that proxy. Then, and only then, will Airtable be able to ingest the documents.

1 Like

So, i’ve been following along on this, trying to apply it to my own table.

Could you possibly post the entire script, start to finish, that you were able to execute fully? Then I might be able to more clearly see where my error is…!

Hi - This script works great for one URL to attachment , but is it possible to make it work with several URLs? I have 7 columns of URL images (some records do not have all 7 filled), and tried to get it to do more than one column but have been unsuccessful. (My scripting levels are not great but I usually can get there in the end after a lot of trial and error, but I’ve come to a standstill on this attempt. )

Thanks

@Stephen_Suen @Kasra, this is all working great for me except for one component: My image in the Attachment field doesn’t display as it should. See screenshot above. The link I am trying to load goes to a Google Drive image file. The permissions are set so anyone with the link can view.

How can I modify the code or the link so it correctly imports the Google Drive link as an attachment?

The link follows this pattern:

https://drive.google.com/file/d/file_id/view?usp=sharing

Here is the code:

let submissionsTable = base.getTable('Table 10');
let urlField = submissionsTable.getField('Attachment Source URL');
let attachmentField = submissionsTable.getField('Attachment');

let submissionsQuery = await submissionsTable.selectRecordsAsync();
let updates = [];
for (let record of submissionsQuery.records) {
    let url = record.getCellValue(urlField);
    let attachments = record.getCellValue(attachmentField);

    // If this record already has an attachment, skip it.
    if (attachments !== null) {
        continue;
    }

    // Otherwise, attach the image at the URL.
    updates.push({
        id: record.id,
        fields: {
            [attachmentField.id]: [{url: url}]
        }
    });
}

// Update records in batches of 50.
while (updates.length > 0) {
    await submissionsTable.updateRecordsAsync(updates.slice(0, 50));
    updates = updates.slice(50);
}
1 Like

@Marty_McCoy It looks like the URL you’re trying to attach points to a web page, rather than a direct URL for the raw image. This means when you’re updating the record, you’re uploading HTML rather than the PNG/JPG/etc itself.

Have you considered using the URL preview app instead? This will allow you to preview Google Drive share URLs by clicking cells in grid view:

2 Likes

Thanks @Stephen_Suen, I need to get them out of Google Drive because of limited storage. Does URL Preview work with Box? I have unlimited storage there.

Alternatively, do you know a way for me to access the URL for the raw image in Google Drive? It’d be great to use this URL to Attachment scripting solution for scalability.

Update

I got the script to import by using a link as follows: https://drive.google.com/uc?export=view&id=file_id

Update 2

After further testing, the script is not working consistently, even though all of the Google Drive links now point to the raw image, not the HTML. It works for a couple of records, then stops.

@Stephen_Suen any ideas to troubleshoot are appreciated.

Possibly helpful intel:

  • When I use Python’s requests library to download files from Google Drive, I use a base downloading URL https://docs.google.com/uc?export=download
  • Then, I pass in the the file ID and the destination where I want to save the file.

Maybe something similar needs to happen in this script? Use this base URL and the file ID, rather than using the other format (https://drive.google.com/uc?export=view&id=file_id).

Anybody else have a solve for this?

This sounds precisely like the known bug documented here and here.

I see this failure often across many clients despite many attempts hone the URLs into Google Drive using different patterns such as:

  • https://drive.google.com/uc?export=download&id=" + thisID
  • https://drive.google.com/uc?export=view&id=" + thisD
1 Like

@Bill.French I tried using this URL pattern and it’s working right now. I am now not confident that it’ll work consistently, but glad it’s working at least sometimes. Thank you!

Can’t edit my comment above. Last week, I used this script to process 322 google drive photo links in one click! But today it’s not working.

The URL Preview app doesn’t currently work with Box, but that’s a fantastic feature request! Many of my clients use Box, and I’m sure that this would be a very handy feature for them! Please submit this as a feature request to either the #show-and-tell:product-suggestions category, or by emailing support@airtable.com.

Last week, I used this script to process 322 google drive photo links in one click! But today it’s not working.

I actually don’t have the answer to solving this script problem for you, but I’ve setup my clients with Integromat in order to convert their Google Drive files into Airtable attachments.

1 Like

Thanks Scott!

The script just worked for me to process 105 photos. Thankfully, I am in a position where if the script doesn’t work, I can wait an hour or two and try again and it seems there’s a good chance it’ll work then.

Marty, that does sound more like a rate limit (number of photos processed per x amount of time) than an issue with your script – so that’s good news!

Google Drive and Dropbox are tricky. They intentionally don’t make it easy to discover their URL patterns to “hotlink” an image using their server. Their services are meant for personal file sharing, not web hosting, which would cost them more bandwidth to provide their services. I deal with this limitation a lot.

1 Like

I also believed this might be the suspected cause of such failures, but I was wrong.

The Test Climates

With more than three clients I have engaged in extreme measures to isolate issues concerning attachment inconsistencies and failures to process images as well as PDF documents. All of my tests and debugging activities are specific to the use of the Airtable API and Google Apps Script to perform mass extractions and attachment uploads into Airtable.

In 100% of the cases, we ultimately proved that document sources were not the cause. If the sources are properly configured for accessibility and the file URLs are constructed in a manner to reveal the actual artifacts (as opposed to rendering veneers), the Drive platform provided the climate for a successful extraction.

Rate Limiting

We also ruled out rate limiting and other quota breaches by performing mass high-velocity extractions into other systems such as local drives, network drives, and even Amazon S3 object stores. One test included the processing of 10,000 PDF documents from Google Drive into S3 over a five hour period without a single issue; roughly one TB of content.

It’s Airtable

As such, the failures that these three clients experience (even today) with their Airtable solutions are almost certainly in Airtable itself, not the document source platforms. To be clear, this causal assertion is with the Airtable as it pertains to the API only. There may or may not be other attachment issues with manual uploads or with other script processes in Blocks or automations but I can only speak to my experiences with the API.

The Pattern

If we’re talking about the same problem y’all are discussing here, the error condition is unmistakable - an API call into Airtable creates an attachment and for a few seconds it looks like the upload sticks because the UI indicates a document has arrived in the cell. The icon of the document is greyed out during this brief indication period which typically lasts a few seconds. Then the icon vanishes as if to suggest that Airtable tried to store the file but gave up.

Oddly, this behaviour is roughly the identical experience if you use the API to make an attachment using a URL that cannot be accessed by Airtable, a condition that should fail.

The Frequency of the Pattern

These observed failures are intermittent and from time-to-time there are periods where all attachment attempts via the API fail. And there are periods where all attachment attempts succeed. Largely, though, it seems they fail intermittently and randomly.

We also invested heavily in script processes that perform up to ten retries and capture upload success analytics when attachment attempts fail. The analytics from attachment attempts that are retired show equally random outcomes - sometimes a second retry succeeds; sometimes ten retries fail. And in almost every failure situation, a manual upload works.

I believe this issue is still creating problems from time-to-time and in very random cases and it has forced many of my clients to skirt the issue by avoiding attachments and referencing documents by URL only.

2 Likes