Standard way to prevent formula injections when using AirTable `select` and `filterByFormula`

Luciano_Mammino · ‎Aug 05, 2022

NOTE I originally wrote this in a rush hoping that the information provided here would be sufficient to start a conversation. It turned out it was not. Please refer to this comment below for a better explanation (and A DEMO) of the issue: Standard way to prevent formula injections when using AirTable `select` and `filterByFormula` - #15 ...

Hello,

When using your APIs to get data dynamically from a table, is there any secure way to handle formulas that need to contain user input data?

How can we avoid a user trying to alter the formula by crafting an injection (like a SQL injection but for your formula language)?

I couldn’t find anything in the docs, nor any utility in your (JavaScript) SDK…

So far I had to come up with my own escape function…

In case you need an example, this is my formula:

{code} = 'someCodePassedByUser'

If the user passes the following code

' >= 0 & '

I end up with

{code} = '' >= 0 & ''

which is always TRUE!
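To make the problem concrete, here is a minimal sketch of the naive string building that produces the formula above. The helper name `buildFormulaUnsafe` is hypothetical and only illustrates the pattern; it is not part of any SDK.

```javascript
// Hypothetical helper: builds the formula by plain string interpolation,
// with no escaping of the user-supplied value.
function buildFormulaUnsafe(code) {
  return `{code} = '${code}'`;
}

const benign = buildFormulaUnsafe("ABC123");
const injected = buildFormulaUnsafe("' >= 0 & '");

console.log(benign);   // {code} = 'ABC123'
console.log(injected); // {code} = '' >= 0 & ''
```

The second call shows how the user's quotes close the string literal early and turn the rest of the input into formula syntax.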

Considering that injection is the 3rd item in the OWASP Top 10, I would consider this a VERY BIG SECURITY FLAW for people using Airtable as a backend. :skull_and_crossbones: :skull_and_crossbones: :skull_and_crossbones:

You should mention this in the docs and provide a standard way to sanitize user input for formulas.

A function built in the JavaScript SDK would be ideal…
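For illustration, this is roughly the kind of escape function I mean. It is only a sketch: `escapeFormulaString` and `buildFormula` are hypothetical names, not SDK functions, and it assumes the formula language treats `\\` and `\'` as escaped characters inside a single-quoted string, which I have not seen confirmed in the docs.

```javascript
// Hypothetical helper, NOT part of the Airtable SDK.
// Assumption: inside a single-quoted formula string, a backslash
// escapes the character that follows it.
function escapeFormulaString(value) {
  return String(value)
    .replace(/\\/g, "\\\\") // escape backslashes first
    .replace(/'/g, "\\'");  // then escape single quotes
}

// Wraps the escaped value in quotes so it stays a single string literal.
function buildFormula(code) {
  return `{code} = '${escapeFormulaString(code)}'`;
}

console.log(buildFormula("' >= 0 & '")); // {code} = '\' >= 0 & \''
```

The resulting string would then be passed as the `filterByFormula` option of `select`. Escaping backslashes before quotes matters: doing it in the other order would double the backslashes that the quote escaping just added.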

kuovonne · ‎Aug 07, 2022

Oh dear. To be clear, I am not questioning this. I agree with Bill. I just have a different writing style and prefer to focus on different aspects of the issue.

Bill is not a gatekeeper, but he is an established, respected member of this community. He is also one of the most skilled users of the Airtable REST API and understands it better than any other frequent poster on this forum.

Bill was not offering a deal to allow you to keep posting. Rather, he was offering a deal where he would like your posts. That is a subtle but important distinction.

I also have the benefit of experience with Bill’s writing style, and I interpret his offer differently from you. It is his way of recognizing that you are capable of producing helpful, informative, balanced content and expressing a desire that you do so. I think that would be a better approach than repeated apologies.

Bill_French · ‎Aug 07, 2022

Since the original post that started this thread is basically reflected in your sentiment as well, and you apparently haven’t read the dialogue deeply enough to realize that query injection is not a security issue unique to, or centered on, Airtable, I’ll try to be brief. But people who know me know that this is fundamentally impossible. Buckle up!

Poorly designed web apps that use features described by @Luciano_Mammino in the manner that he presented can be risky.

Let’s be really clear by breaking the security risk into more manageable points. I think the use of predicates matters greatly in any conversations involving security. :winking_face:

0. Why #0, you ask? This point does not go without saying because it is central to the overall risk envelope that @Luciano_Mammino has raised. The risk exists if – AND ONLY IF – filterByFormula can be dynamically programmed through the web app’s UI with user input.
1. In many cases, the risk of an injection attempt designed to see other records is zero because the entire data table is intended to be publicly accessible.
2. In some cases, the developer has exposed to the open Internet a mix of record classes in a manner that could allow certain users (i.e., hackers) to access records not intended for consumption.
3. The attack vectors mentioned in this thread are framed without the benefit of a security context for the web app itself. @Luciano_Mammino failed to include this point in his scenario, perhaps to simplify his assertions. I cannot speak for his reasoning.

#0 narrows the breadth of this issue to a very small fraction of web apps built on Airtable.

If perhaps 3% of Airtable solutions are front-ended with a filterByFormula query, then perhaps less than three-tenths of one percent are instrumented with a dynamic filterByFormula capability that can actually be altered by users. Despite the rarity of this design choice, some developers use this technique to make it easy for users to intentionally “inject” filtering parameters. @Luciano_Mammino is correct in assuming that web apps can be much more useful with this approach, but it’s not the only way to achieve it.

@Luciano_Mammino’s warning is valid in this very narrow use case if you are doing this.

In my view, this approach is a bit dated because it requires new REST request/response interchanges with the API to effectuate each new query. Long-session HTTP gateways provide a much faster query response without the risk of injection and all while lessening the API load on the Airtable instance (it’s a thing, BTW).

#1 is no factor.

No one cares if filterByFormula is used in an unpredictable manner.

#2 is certainly worrisome.

Don’t do this [my opinion]. :winking_face: It’s important to point out that Airtable provides ways to easily overcome this risk by scoping [with precision] what data is made accessible to the web app and its intended users. And you can do this if you take the steps suggested by @Luciano_Mammino and a bazillion other web development articles. Doing it yourself doesn’t make it a zero-risk proposition.
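One of the generic steps those articles tend to recommend is rejecting any input that does not match a strict pattern before it ever reaches the formula. A minimal sketch, assuming (purely for illustration) that valid codes are short alphanumeric strings; the pattern and the name `formulaForCode` are hypothetical and would need to match your own data.

```javascript
// Assumption: a valid code is 1 to 64 letters, digits, underscores or hyphens.
const CODE_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

// Hypothetical helper: refuses anything outside the allowlist, so quotes
// and operators can never appear inside the formula string.
function formulaForCode(code) {
  if (typeof code !== "string" || !CODE_PATTERN.test(code)) {
    throw new Error("Invalid code");
  }
  return `{code} = '${code}'`;
}

console.log(formulaForCode("ABC-123")); // {code} = 'ABC-123'
```

An allowlist like this only works when the set of legitimate values is narrow; free-text input would still need escaping.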

#3 is strangely absent from this entire thread.

I raised this point in a few passages above, but I think it was swept aside in the heat of the debate. When it comes to building web apps with sensitive data, most developers wrap the app in a very solid security context. It’s a matter of developer preference and largely influenced by business requirements, but it typically exists in every web app that contains pathways to sensitive information. I tend to use Firebase for this security layer, but there are many ways to secure web apps in a manner that makes them almost impervious to access by unintended nefarious actors.

Bottom Line

To exploit #2, you have to get past #3.

If I’ve missed something or there are design patterns that I failed to expose that make this molehill into a mountain that deserves this many thread updates, please enlighten us all.

You are free to do that, but it is tantamount to advising your company to jettison a platform for reasons that have nothing to do with the platform itself. From Oracle to Airtable, these risks are generally the same.

Leaving Airtable, for this reason, is like jumping into the escape pods because your perfectly good spaceship has only downloaded the soundtrack for Guardians of the Galaxy #1, and you’re tired of Spirit in the Sky.

kuovonne · ‎Aug 07, 2022

I’ve also raised my tray table and returned my seat to its full upright position.

I tangentially referred to this when I pointed out that the API request comes with the credentials/API Key of a valid user. The API has done its job of ensuring that the request came from a valid user. It is not the API’s responsibility to second-guess an authenticated and authorized request. It is the responsibility of the API Key holder to make sure that any use of his/her key is surrounded by the proper security. Unfortunately, many API Key holders don’t understand this.

I think the number of posts in this thread has more to do with what you and I find interesting than anything else. Ben has not returned to the conversation. If neither of us had replied to this thread, it might have died an obscure death. Alas, here I am being entertained by your posts when I could be doing something more productive, but less fun.

Bill_French · ‎Aug 07, 2022

Is that true? I thought Airtable’s API was limited to a single key/token for the workspace that the developer would use in a web app, thus acting purely as a proxy for all users of the web app against all assets of the account’s workspace. Is there a way to build a web app that authenticates a specific Airtable user and, in so doing, grants that user access to Airtable data in that user’s context? I didn’t think there was, but I could be wrong.

What I’m referring to in #3 is an entirely different security layer that is wrapped around the API security layer.

If you’re going to expose Airtable data to the open interwebs, an authentication layer likely exists (or should exist) well above the API security layer. It is this aspect of usual and customary web apps that is absent from @Luciano_Mammino’s conversation. Only after getting past this higher layer are you in a position to exploit an injection feature that presumably already has a valid Airtable API key with which to execute the query.

You are safety-minded indeed. I’ll bet you’ve not once argued with a flight attendant about their process. :winking_face: I have and I’ll share the story over on the Opensiders Slack channel.

kuovonne · ‎Aug 07, 2022

I’m referring to that one proxy API key. Typically it would be either the API key of the developer, or of the developer’s client.

It is possible for someone to write a web app that has the user input their own API key and then use that API key. But that would be a rare case. One of my co-workers actually did this with a non-Airtable script that needed to access the Airtable REST API. The script asked the user for the API key as its first input.

Bill_French · ‎Aug 07, 2022

Ah, okay. And so sorry for drifting off topic here in @Luciano_Mammino’s thread.

This is what I thought the API security constraints were. I was thinking maybe they recently advanced toward a more granular authentication model (as rumoured). In any case, the diagram below is how I believe most developers create a secure web application layer when accessing Airtable with sensitive information. Of course, the Firebase features are interchangeable with many implementation approaches. In this diagram, the Airtable API is deep within the app. This doesn’t eliminate the risk of an injection attack (should the app include a dynamic filterByFormula design choice), but it does limit such attacks to known and authenticated users (i.e., an inside job), whose activity can be easily spotted in logs.
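In code terms, the layering might look something like the sketch below. Everything here is a stand-in: `session` represents whatever the auth layer (Firebase or otherwise) hands you, `runQuery` represents the server-side call that holds the Airtable API key, and `log` is any logger. None of these are real SDK names.

```javascript
// Hypothetical server-side handler illustrating the layering only.
function handleLookup(session, code, runQuery, log) {
  // Outer security layer: reject anyone who is not signed in.
  if (!session || !session.userId) {
    return { status: 401, body: null };
  }
  // Record who asked for what, so odd inputs can be traced to a user.
  log({ userId: session.userId, code });
  // Only now does the request reach the code that holds the API key.
  return { status: 200, body: runQuery(code) };
}
```

An unauthenticated caller never reaches `runQuery`, and every authenticated request leaves a log entry tied to a user ID. The query itself would still need its input escaped or validated.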

I tend to use Firebase for web apps because I prefer the real-time ability to cache-forward Airtable data, thus avoiding the REST API altogether. It has the benefit of employing Airtable in a support role as the backend without limiting scalability and all while eliminating any risk of an injection attack and offering <= 250ms response for any query.

While I doubt many Airtable low-coders are up for this type of design, it’s important to note that any low-coder with some HTML and JavaScript experience can build this and deploy 100% in the free tier, which I believe is up to 20,000 Firebase events per day.

BTW - I use this identical architecture with Coda (example of a map app driven by Coda data proxied into Firebase).

Justin_Barrett · ‎Aug 08, 2022

I’ve seen this exact behavior before, both here and elsewhere. Someone new joins, minutes (sometimes seconds) later posts a scathing rebuke of the company with a threat to leave, and never comes back. Either that or something spammy, possibly with a suspicious link. Regardless of the content, my gut feeling is that there are bots out there designed specifically for these behaviors, but why someone would go to the effort of doing this—either for real or via a bot—is beyond me. Some apparently like to sow seeds of discontent wherever possible. :person_shrugging: