Re: Automatically get files from (protected) URLs using Scripting

Jonas_Kockx
4 - Data Explorer

Hey
I wanted to automate the process of fetching files from URLs and putting them in an Attachment field. I found a script to do so at:

However, the URLs I want to fetch require being logged in to Webflow. So I was wondering if anyone knows a way to make the script fetch the URLs ‘as if it were working in your own browser’. In my browser I can simply copy the URLs, as I’m automatically logged in to that website.
Kind regards
Jonas

18 Replies

Hi Jonas, and welcome to the community!

This is a noble challenge; however, it requires some deep thought to achieve the stated requirement. The deepest of that thought concerns what Airtable requires in an automated process, not so much what users require.

When Airtable is presented with a URL, it uses that endpoint to retrieve a document (like an image, or a PDF, etc). In that process, Airtable has no understanding of a security context unless, of course, it is already established in the context of your browser when performing attachment additions manually.

Making attachments automatically is a completely different security climate; there is no browser, and the automated process can only hand off to Airtable’s API URLs that are publicly accessible. The only option to make this work with attachment URLs that are private is a proxy that can establish a security context with the platform hosting the secure URLs (ergo, the platform holding the private URLs must also provide a modern API).

I have created this approach many times but it’s complicated. The proxy must mimic (for Airtable’s benefit) document URLs that are (in fact) public. It must do this for a few seconds; long enough for Airtable to make a copy of the target URL and then immediately thereafter, it must invalidate that URL.
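To make the timing idea concrete, here is a minimal sketch of the bookkeeping such a proxy might do, assuming an in-memory store. The `EphemeralLinks` class and its method names are hypothetical, not part of Airtable or any other platform, and a real proxy would still need an HTTP endpoint in front of it.

```python
import secrets
import time


class EphemeralLinks:
    """In-memory registry of short-lived download tokens (hypothetical helper)."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # token -> (expiry time, document bytes)

    def publish(self, document: bytes) -> str:
        """Register a document and return an unguessable token for its URL."""
        token = secrets.token_urlsafe(32)
        self._entries[token] = (self._clock() + self._ttl, document)
        return token

    def fetch(self, token: str):
        """Return the document while the token is live, otherwise None."""
        entry = self._entries.get(token)
        if entry is None:
            return None
        expires_at, document = entry
        if self._clock() >= expires_at:
            del self._entries[token]  # expired: forget the document
            return None
        return document

    def invalidate(self, token: str) -> None:
        """Revoke the token, e.g. once Airtable has made its copy."""
        self._entries.pop(token, None)
```

The proxy would serve `fetch(token)` at a public-looking URL, hand that URL to Airtable, and call `invalidate(token)` afterwards; the time-to-live is only a safety net in case the invalidation step never runs.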

The Cat is Out of the Bag

And doing this makes the documents once protected by private URLs suddenly publicly accessible, because Airtable attachments are open URLs; if you know the link, you can access the attachments.

@Jonas_Kockx I don’t know about WebFlow in particular, but to do anything like this, you would need to tap into the software’s API.

WebFlow has an API that lets you retrieve any items from your collections, which seems to include the URLs of your files (which is what Airtable needs to create a file in the attachment field).

So you would need to write a script in Airtable that taps into WebFlow’s API.

Alternatively, if you don’t want to (or don’t know how to) write a script, that’s where you would turn to Integromat’s WebFlow integrations and automations with Airtable.

Integromat is a no-code integration platform that lets you tap into APIs without writing any code. (Note that I am a Registered Integromat Partner, and that link contains my personal referral code.)

And how would that script hand off a public (non-private) URL such that it could be absorbed into an attachment?

In general, when accessing any API, the API requires the user to authenticate their API call using their API key which lets them access any private information that is only accessible to them. In other words, the process of authentication gives the API call full access to anything that they would normally have access to, as if they were currently logged into their system.
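As a rough illustration of what "authenticating the API call" means in practice, the sketch below builds (but does not send) a request carrying an API key. The base URL, the `/collections/{id}/items` path, and the bearer-token header scheme are assumptions for illustration; check the target platform's API documentation for the real endpoint and header format.

```python
import urllib.request


def build_items_request(base_url: str, collection_id: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for a collection's items.

    The URL layout and the Bearer scheme are assumed, not taken from any
    specific platform's documentation.
    """
    url = f"{base_url.rstrip('/')}/collections/{collection_id}/items"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",  # the key identifies the caller
            "Accept": "application/json",
        },
        method="GET",
    )


# Hypothetical values; sending would be urllib.request.urlopen(request).
request = build_items_request("https://api.example.com", "my-collection", "MY_API_KEY")
```

Whoever can read the script can read the key, so where this code runs determines who effectively holds those credentials.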

Ok, thank you for your answer! I’ll look into integrating WebFlow’s API in my script!

Hi @ScottWorld,

If I do that WITHOUT any PROXY, would I have to expose my API key in my Airtable scripting app?
I believe YES, and I don’t want to do that.
So would I have to use a PROXY anyway?
I believe YES.

Please tell me whether I’m right or wrong.

Second thought: maybe this kind of PROXY, whose only job is to hide an API key, is much easier to write, adapt, set up, or rent than the full PROXY solution. But I still don’t know whether Bill’s or Scott’s way would be best for me on API use cases that differ slightly or deeply from this one. (Maybe both are right for me, but I don’t use what Bill calls “glue factories”; that is only a personal initial choice.)
I don’t use WebFlow but some other API, so I will continue to follow this thread.

Best,

olπ

That’s actually another good reason to use a 3rd-party service like Integromat — because it shields your API key from your users.

As far as I know, using Airtable’s scripting app reveals your API key to all of your users, even read-only users.

Yes.

Thank you @ScottWorld

oLπ

Indeed, this is how APIs work. But you need to address the core requirement of Airtable attachments. It doesn’t matter if your script can authenticate with the target platform and access the data, including the private URLs. What matters is which URL you will hand off to Airtable for loading an attachment. If the API retrieves a secure URL and hands it off to Airtable, the attachment process will fail because Airtable itself requires ALL attachments to be publicly accessible.
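For reference, a minimal sketch of the hand-off itself: when a record is updated through Airtable's REST API, an attachment field is given a list of objects that each carry a `url`, which Airtable then downloads on its own, without your credentials. The field name `Attachments` and the example URL below are placeholders.

```python
import json
from urllib.parse import urlparse


def attachment_update_body(field_name: str, file_url: str, filename: str = "") -> str:
    """Build the JSON body for a record update that sets one attachment by URL."""
    parsed = urlparse(file_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {file_url!r}")
    attachment = {"url": file_url}  # Airtable fetches this URL itself
    if filename:
        attachment["filename"] = filename
    return json.dumps({"fields": {field_name: [attachment]}})


# Placeholder field name and URL, for illustration only.
body = attachment_update_body("Attachments", "https://files.example.com/report.pdf", "report.pdf")
```

Nothing in this body can carry a login session, which is why a URL that only works for a logged-in user fails at this step.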

Once again - my question - how will any process via script or any other means such as Integromat overcome this essential attachment requirement?

Am I missing something?

Yes @Bill.French and @ScottWorld,
this is the main problem to solve anyway !

Many Thanks,

olπ

@Bill.French @Olpy_Acaflo

Ahhh, I see, that’s a good question! And something I overlooked above. It definitely depends on the API. For example, Google Drive has the option to allow access to a private URL, as long as you know the private URL.

I’m not familiar with WebFlow, so if it doesn’t operate in a similar fashion, then some intermediate steps would be needed in Integromat to temporarily transfer that file into a publicly-accessible space (such as Google Drive) where it can temporarily gain a public URL.

So, an Integromat scenario could look like this:

Authenticate with WebFlow > download the file to Google Drive > send the Google Drive URL to Airtable > delete the file from Google Drive.
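The same sequence can be sketched in code. This is only an outline of the control flow, assuming the four steps are supplied as functions; `download`, `stage`, `attach`, and `unstage` are hypothetical names, and `attach` is assumed to return only after Airtable has finished copying the file.

```python
from typing import Callable


def relay_attachment(
    download: Callable[[], bytes],
    stage: Callable[[bytes], str],
    attach: Callable[[str], None],
    unstage: Callable[[str], None],
) -> None:
    """Copy a private file into Airtable via a temporary public URL."""
    data = download()            # authenticated fetch from the source platform
    staged_url = stage(data)     # upload to temporary storage, get a public URL
    try:
        attach(staged_url)       # hand the public URL to Airtable
    finally:
        unstage(staged_url)      # always remove the temporary copy, even on failure
```

The `try`/`finally` keeps a failed attachment from leaving the file publicly staged. If `attach` returns before Airtable has actually fetched the URL, the cleanup would run too early, so the timing of that step matters.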

LOL! Finally, I got through. 🙂

Actually, it depends on the visibility options of “private” URLs in the target document platform.

Ergo, it’s not really private nor is it secure (at least not in the sense or spirit of the target document platform). It is private and secure if your definition includes links to documents that are openly accessible by anyone in possession of said link. 😉

Ergo - you need a proxy. But this definition of a proxy likely violates the spirit of why the documents are not available as openly accessible URLs in the first place. Perhaps this is not really a requirement, but we have to assume it is because documents are typically secured for good reason.

Your suggested process is missing a very important step -

… it must also set the security settings for the downloaded file in Google to “Accessible to anyone with the link…”.

What you’ve described here is similar to what I described in my first response. Can Integromat perform all these steps? Perhaps it can. I’ve been fortunate enough to get hired many times to create proxies like this because it’s very difficult to achieve this process without precise timing in a no-code environment.

One thing to remember - any Google Drive document with the security setting “Accessible to anyone with the link…” is also discoverable by anyone in the G-Suite domain for which the account exists. And by “discoverable” I mean searchable, viewable, copyable - it (by definition) is a wide-open document to everyone in the domain even if they do not possess the URL.

To be clear, in my approach, I do not instantiate “intermediate” temporary documents in Google Drive that are exposed publicly or at the domain because it increases the security attack surfaces by a factor of 3 and also broadly publishes the document to the organization’s users.

The most secure way to do this while satisfying the spirit of the secure document URLs is to establish a signed URL that is accessible only through a webhook in Google Apps Script. This makes it possible to host (for a few seconds) a document that can be retrieved, but which doesn’t actually exist in Google Drive.
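For readers unfamiliar with signed URLs, here is one common way to build them, sketched in Python rather than Google Apps Script: an expiry time and an HMAC over the document ID are appended to the URL, and the webhook refuses any request whose signature or expiry does not check out. The query parameter names (`doc`, `expires`, `sig`) and the example host are assumptions, not a description of the actual implementation mentioned above.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse


def sign_url(base_url: str, doc_id: str, secret: bytes, ttl_seconds: int, now=None) -> str:
    """Return a URL for doc_id that is valid for ttl_seconds from now."""
    expires = int((time.time() if now is None else now) + ttl_seconds)
    message = f"{doc_id}:{expires}".encode()
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    query = urlencode({"doc": doc_id, "expires": expires, "sig": signature})
    return f"{base_url}?{query}"


def verify_url(url: str, secret: bytes, now=None) -> bool:
    """Check that the URL is unexpired and its signature matches."""
    query = parse_qs(urlparse(url).query)
    try:
        doc_id = query["doc"][0]
        expires = int(query["expires"][0])
        signature = query["sig"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters
    current = time.time() if now is None else now
    if current >= expires:
        return False  # link has expired
    expected = hmac.new(secret, f"{doc_id}:{expires}".encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

With a time-to-live of a few seconds, a leaked link stops working almost immediately, and changing the document ID or expiry in the URL invalidates the signature.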

I don’t think this is the case any longer. Indeed, use of any third-party API in a script requires that script has knowledge of a security context of some type. However, isn’t this how you defend against overtly making API tokens and other login credentials difficult to view?

Yes, Integromat has the ability to change permissions of Google Drive files.

Cool. And can it reverse the permissions and/or remove the file when (and only when) Airtable’s attachment process finishes?

Yes, it can do both things.

Excellent! Someone should publish that use case because it’s very costly to script.

Actually, even better, Integromat can temporarily download files into its own cache while running a scenario… so you could eliminate Google Drive altogether and just download the attachment into Integromat, upload it to Airtable, and it will be instantly deleted when the scenario is finished.