Re: Best way to handle spam that comes in from embedded forms?

Shereen_Adel1
4 - Data Explorer

Does anyone have a recommended way to handle spam that comes in through publicly shared or embedded Airtable forms? We've had forms published for over a year that never got spam, and then between Dec 20, 2022 and Dec 23, 2022 we got 49,245 spam submissions. It disrupted our workflow and prompted messages that we are over our limit. It generally feels invasive and pretty horrible. Any tips are welcome!

17 Replies
Bill_French
17 - Neptune

My thoughts: read on...

Andres_at_Conne
7 - App Architect

How about a question bots can't answer? Or a hidden field that bots will fill out?

Bots now use AI to answer questions, and they're really getting good at it. They can also ferret out hidden fields using headless browsers - it's an arms race; you cannot win that war in this way. If you have so many bot entries in a database that it reaches the limits of the data table itself, the vendor is the problem, not your form or the bots.

Thank you!!

Ro_
4 - Data Explorer

I am experiencing the same thing currently. They keep coming every 5 seconds; it's stressful. How do I stop it?

 

As I said earlier, Airtable is the problem. It has no defenses against a bot army that has found your form and is determined to gain access to your system through relentless probes.

Your only out is to use a third-party forms provider that has features to defend against bots.

onar
5 - Automation Enthusiast

If you are using Zapier + Airtable, then you could add the OOPSpam app to your flow.

An example flow:
New Record Airtable -> OOPSpam -> Insert Record Airtable (or Send Outbound Email).

https://zapier.com/apps/airtable/integrations/email/1208750/spam-check-new-airtable-records-with-oop... 

Note: I work at OOPSpam 🙂

Yep - so, with this approach, if a form attracted ten million bot posts and only one legitimate post, the Airtable instance would have to process ten million and one new records to capture one legitimate record? Wouldn't that pretty much kill the Airtable service?

Furthermore, consider a scenario where 100 posts were made to the form and 80 of them were spam. With the Zapier process added, please tell me how many Airtable API requests would be required to capture the 20 legitimate records?

onar
5 - Automation Enthusiast

So the flow is triggered for each submitted form. In the example flow I linked above, there is only one Airtable API call (New Record) for each submission.

Screenshot 2023-10-11 at 1.49.51 PM.png

 

It is true that if you go with New Record -> OOPSpam -> Insert, then it will call the Airtable API twice. Looking back at this now, it doesn't make sense to trigger the flow on a new record and then insert it back into Airtable, because the record already exists there, unless we want to update it with new information such as a spam score.

We also use Airtable for our contact form. Our flow looks like this (simplified):

Webhook -> OOPSpam (check for spam) -> Only Continue If -> Insert Airtable (1 call for a legitimate record)
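The "check first, insert only if clean" shape of that flow can be sketched in a few lines of Python. This is only an illustration, not the actual Zapier or OOPSpam integration: `handle_submission`, `naive_is_spam`, and the `message` field are hypothetical names, and the link-based check is a stand-in for whatever spam service is really used.

```python
from typing import Callable, Mapping

Submission = Mapping[str, str]


def naive_is_spam(submission: Submission) -> bool:
    """Hypothetical stand-in for a real spam check: flags any message containing a link."""
    message = submission.get("message", "").lower()
    return "http://" in message or "https://" in message


def handle_submission(
    submission: Submission,
    is_spam: Callable[[Submission], bool],
    insert_record: Callable[[Submission], None],
) -> bool:
    """Run the spam check first; write to the table only if the check passes.

    Returns True when the submission was inserted, False when it was dropped.
    """
    if is_spam(submission):
        return False  # "Only Continue If" fails: the table is never touched
    insert_record(submission)  # the single write for a legitimate record
    return True


# A plain list stands in for the Airtable table in this sketch.
table: list = []
handle_submission({"message": "Hi, I have a question about pricing."}, naive_is_spam, table.append)
handle_submission({"message": "Cheap pills at http://spam.example"}, naive_is_spam, table.append)
print(len(table))
```

The point of the ordering is that rejected submissions never reach the table, so they cost no Airtable write.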

So it all depends on the workflow. If there is only one Airtable trigger, then it will take 100 calls for 100 submissions. If there is another call (to update/insert a record), then it would take an extra 20.
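To make the arithmetic concrete, here is a small Python sketch that counts Airtable calls for the 100-submission, 80-spam scenario. The function name and the two flow labels are made up for illustration, and the counts simply follow the assumptions stated in this thread (one call per Airtable trigger, one call per insert/update).

```python
def airtable_calls(total: int, spam: int, write_back: bool = False) -> dict:
    """Count Airtable calls under two flows, using the thread's assumptions.

    trigger_first: Airtable "New Record" trigger fires for every submission
                   (spam included); optionally one more call per legitimate
                   record to insert/update it.
    filter_first:  a webhook receives the submission and spam is dropped
                   before Airtable is touched, so only legitimate records
                   cost a call.
    """
    if not 0 <= spam <= total:
        raise ValueError("spam must be between 0 and total")
    legit = total - spam
    trigger_first = total + (legit if write_back else 0)
    filter_first = legit
    return {"trigger_first": trigger_first, "filter_first": filter_first}


print(airtable_calls(100, 80, write_back=True))
# {'trigger_first': 120, 'filter_first': 20}
```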

In a hypothetical situation where you get 1M bot requests, there should be other measures to prevent the attack, such as DNS/hosting-level security and rate limiting. For most cases this flow will work just fine, considering the Airtable API rate limit is 5 requests per second per base.

>>> For most cases this flow will work just fine, considering Airtable API rate limit is 5 requests per second per base.

The API rate is not the issue. Most Airtable users must abide by a new limit - calls per month.

It's unlikely that a single form will attract a million requests. But it is likely that for every legitimate form submission, some multiple more will be bots doing what bots do. This approach will eat into the new API quota, and as we all know, Airtable APIs are married to the user's instance, so they can also affect user performance.

It's no secret -- I'm not a fan of using external automation platforms unless absolutely necessary. As such, I would approach this differently.

One approach is internal automation: when a record is added, use AI to identify any records that were probably created by a bot. There are numerous ways to target nefarious submissions with a deterministic prompt, including learner-shot examples that examine certain fields. The learner shots could be dynamic, based on a known and vetted collection of legitimate records. This approach would be performant and would not require API calls or sending all your data to another platform.

KuboS
5 - Automation Enthusiast

Can you provide an example?
I really wonder why they (Airtable) don't at least try for a solution here.

onar
5 - Automation Enthusiast

Sorry for the late reply.

It is true! Sending your data to third parties is not preferable. That said, OOPSpam is privacy-friendly and doesn't require IP or email. Also, maintaining a spam detection infrastructure like the one you mentioned is a lot of work. Airtable may not be interested in building the system just for the forms.

To add to the internal automation discussion: internal automations allow a Run script step, so it is possible to bypass Zapier and make an API call directly to OOPSpam within the Airtable automation.

>... maintaining a spam detection infrastructure like the one you mentioned is a lot of work.

It's actually less work.

> Airtable may not be interested in building the system just for the forms.

This is not just about forms. An API process is just as capable of injecting bad data into your system. Airtable has proven over the years that it is not interested in building many good feature ideas. As such, no-codeists must look for solutions that involve things they can do to mitigate issues like this. I believe they will increasingly lean on generative AI to meet these requirements.

onar
5 - Automation Enthusiast

I see. I thought you meant that Airtable should build the system.

Could you please provide an example so others can benefit?

> ... please provide an example

Sure. Imagine an AI field that applies a generative AI prompt whenever a new record is added. If the AI inference returns false, an automation deletes the record. The prompt might look something like this.

You are an expert who can recognize records that are spam. By definition, spam records contain data values that are vastly unlike legitimate records. I will provide you with the field values of a record you will use to gauge legitimacy. I will also provide you with a small set of example records that are legitimate. You will assess the current record and output "true" if legitimate and "false" if illegitimate.

Examples: `
  <fieldNames>: <dataValues>
  [output]: <true|false>
`
Record: <dataValues>
[output]
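As a rough illustration of how such a prompt could be assembled dynamically from vetted records, here is a minimal Python sketch. It is not Airtable's AI field or any specific model API: `build_prompt`, `parse_verdict`, and the sample field names are hypothetical, and the sketch only builds the text and interprets a "true"/"false" reply.

```python
from typing import Dict, List, Optional, Tuple

# Instruction text taken from the prompt above.
INSTRUCTIONS = (
    "You are an expert who can recognize records that are spam. By definition, "
    "spam records contain data values that are vastly unlike legitimate records. "
    "I will provide you with the field values of a record you will use to gauge "
    "legitimacy. I will also provide you with a small set of example records that "
    'are legitimate. You will assess the current record and output "true" if '
    'legitimate and "false" if illegitimate.'
)


def format_record(record: Dict[str, str]) -> str:
    """Render one record as indented 'fieldName: value' lines."""
    return "\n".join(f"  {name}: {value}" for name, value in record.items())


def build_prompt(examples: List[Tuple[Dict[str, str], bool]], record: Dict[str, str]) -> str:
    """Combine the instructions, labeled example records, and the record to assess."""
    parts = [INSTRUCTIONS, "", "Examples:"]
    for fields, is_legit in examples:
        parts.append(format_record(fields))
        parts.append(f"  [output]: {'true' if is_legit else 'false'}")
    parts += ["Record:", format_record(record), "[output]:"]
    return "\n".join(parts)


def parse_verdict(reply: str) -> Optional[bool]:
    """Map the model's reply to True/False; None means the reply was unusable."""
    answer = reply.strip().lower()
    if answer == "true":
        return True
    if answer == "false":
        return False
    return None


# Hypothetical vetted example and incoming record.
vetted = [({"Name": "Ada", "Message": "Can I book a demo next week?"}, True)]
incoming = {"Name": "xx", "Message": "BUY NOW http://spam.example"}
print(build_prompt(vetted, incoming))
```

Returning `None` for anything other than a clean "true" or "false" lets the automation hold ambiguous records for review instead of deleting them.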

 

KuboS
5 - Automation Enthusiast

Go slowly with me, please. I can build a form and I'm basic with formulas.
I can build a hidden field or a field that unhides.
But you said conditioning is useless .... 2-5 ... if an incorrect value is inserted, the form record will be deleted.

But I am not understanding your condition.

The term "condition" was used twice in this thread by you, not me. So I don't understand what you are saying. You'll have to use more words to help me understand your question.