Re: Maximum base size

GREG_WEITZMAN · ‎Jul 31, 2020

Hi, we are just getting started, but before moving our business data over and relying on airtable, I have a question regarding the maximum size of a base.

From what see there is a max of 50,000 rows (records)
We are currently generating 6,000 rows per month in our existing excel sheets, which means we will run out of space in a bout 6 months (not idea)

Perhaps there’s a way to create a base for each month, or each quarter by duplicating the current base, (without data, or copy then delete the data)

But… from what I understand I will not then be able to report across bases, the reporting is based in a single base.

Here is some context…

We manage out bound email campaigns,
Each campaign we send can contain up to 8 links within the body of the email)
on average 3 links

We track the number of clicks received on each of the links
15 lists x 4 campaigns x 3 links x 31 days = 5,580

That’s how we get to approx 6k rows per month

I would love to hear any thoughts, comments or suggestions

Thanks for your help and support

Justin_Barrett · ‎Aug 06, 2020

Welcome to the community, @GREG_WEITZMAN! :grinning_face_with_big_eyes: Sorry that this hasn’t seen any responses yet. I don’t have experience dealing with such massive data collections, so I don’t have any immediate answers. However, my gut says that the multiple-base scenario might work if you’re open to using reporting tools that are outside of Airtable, which can read the data from multiple bases using Airtable’s API to generate the reports you need. I’m going to tag @Bill.French for his input, as I believe he’s got experience managing large data sets, including reporting.

Bill_French · ‎Aug 06, 2020

Thanks @Justin_Barrett.

Greg,

Indeed, but it’s important to assess the nature of the data and the data model before coming to any conclusions. For example, I have a client who has about 89,000 invoice line-items in an Airtable table for about 4,000 sales invoices. Each invoice is a record in the table and the line-items (averaging 20 per invoice) are stored as a JSON object in a single cell of the same record. Is this ideal for your situation? I can’t say, but it works well for my client. Under this design, they could have about 50,000 invoices and a total of 1 million line-items in a single table.

One must ask - should each click be a separate row in a table?

We can’t really answer this until we know the requirements for using this data. My sense is that a click - a discrete analytic in and of itself - is not really a user-chummy data object. As such, no user likely ever needs to look at one click transaction, at least not the way a user would look at a customer contact record for example.

Clicks are details that are best utilized as aggregate measures and this is a sign that they can be stored as abstract objects in the background, thus avoiding the consumption of a lot of individual record space. Imagine a single record containing a tracked link, and a blob object containing a structured compilation of all click transactions. That approach would provide the ability to track about 50,000 different links each with perhaps a thousand or two click events. (i.e., about a hundred million clicks depending on the attributes of each click event).

Worth a deeper look - Can Airtable handle a table with 50,000 rows and 25 columns?

Stephen_Cremin · ‎Aug 07, 2020

Airtable is where I want to put data for (collaborative) editing, not for storing logs, or for tracking behavior like email clicks. It’s a great tool but not every job requires a hammer.

For the greatest user experience, I want my bases self-contained. If I’m organizing an event for European-based companies, I don’t want them to be able to select “Paris, Texas” as their city just because I’m sharing the table with another event. I like that linking is restricted.

There is a need for cross-base analysis, etc, but I don’t think that’s best achieved by shoe-horning everything into one base. Yes, you’ll need to extract the data, but you can do so much more with it when it’s merged into a time-series database or a graph database.

I’m fascinated by the JSON experiments by @Bill.French. I often have a parent base that includes metadata for its children bases. In the base, I do actually store hashes of the JSON representations of the children, but I never considered actually storing the raw JSON there!

Bill_French · ‎Aug 07, 2020

It’s not necessarily a wise choice, but it works and pretty well. Plus, even Airtable has implied that this is a useful approach by adding the JSON editor block.

Agree,

kuovonne · ‎Aug 07, 2020

Scripting block and Custom Blocks both are good at handling JSON data stored in a long text field. It also helps discourage people from manually changing data that should only be maintained by code.

Bill_French · ‎Aug 07, 2020

Or, it allows them to completely decimate perfectly good data. :winking_face: Best to keep these fields hidden from view.