Hi, we are just getting started, but before moving our business data over and relying on airtable, I have a question regarding the maximum size of a base.
From what see there is a max of 50,000 rows (records)
We are currently generating 6,000 rows per month in our existing excel sheets, which means we will run out of space in a bout 6 months (not idea)
Perhaps there’s a way to create a base for each month, or each quarter by duplicating the current base, (without data, or copy then delete the data)
But… from what I understand I will not then be able to report across bases, the reporting is based in a single base.
Here is some context…
We manage out bound email campaigns,
Each campaign we send can contain up to 8 links within the body of the email)
on average 3 links
We track the number of clicks received on each of the links
15 lists x 4 campaigns x 3 links x 31 days = 5,580
That’s how we get to approx 6k rows per month
I would love to hear any thoughts, comments or suggestions
Thanks for your help and support
Welcome to the community, @GREG_WEITZMAN! :grinning_face_with_big_eyes: Sorry that this hasn’t seen any responses yet. I don’t have experience dealing with such massive data collections, so I don’t have any immediate answers. However, my gut says that the multiple-base scenario might work if you’re open to using reporting tools that are outside of Airtable, which can read the data from multiple bases using Airtable’s API to generate the reports you need. I’m going to tag @Bill.French for his input, as I believe he’s got experience managing large data sets, including reporting.
Indeed, but it’s important to assess the nature of the data and the data model before coming to any conclusions. For example, I have a client who has about 89,000 invoice line-items in an Airtable table for about 4,000 sales invoices. Each invoice is a record in the table and the line-items (averaging 20 per invoice) are stored as a JSON object in a single cell of the same record. Is this ideal for your situation? I can’t say, but it works well for my client. Under this design, they could have about 50,000 invoices and a total of 1 million line-items in a single table.
One must ask - should each click be a separate row in a table?
We can’t really answer this until we know the requirements for using this data. My sense is that a click - a discrete analytic in and of itself - is not really a user-chummy data object. As such, no user likely ever needs to look at one click transaction, at least not the way a user would look at a customer contact record for example.
Clicks are details that are best utilized as aggregate measures and this is a sign that they can be stored as abstract objects in the background, thus avoiding the consumption of a lot of individual record space. Imagine a single record containing a tracked link, and a blob object containing a structured compilation of all click transactions. That approach would provide the ability to track about 50,000 different links each with perhaps a thousand or two click events. (i.e., about a hundred million clicks depending on the attributes of each click event).
Worth a deeper look - Can Airtable handle a table with 50,000 rows and 25 columns?
Airtable is where I want to put data for (collaborative) editing, not for storing logs, or for tracking behavior like email clicks. It’s a great tool but not every job requires a hammer.
For the greatest user experience, I want my bases self-contained. If I’m organizing an event for European-based companies, I don’t want them to be able to select “Paris, Texas” as their city just because I’m sharing the table with another event. I like that linking is restricted.
There is a need for cross-base analysis, etc, but I don’t think that’s best achieved by shoe-horning everything into one base. Yes, you’ll need to extract the data, but you can do so much more with it when it’s merged into a time-series database or a graph database.
I’m fascinated by the JSON experiments by @Bill.French. I often have a parent base that includes metadata for its children bases. In the base, I do actually store hashes of the JSON representations of the children, but I never considered actually storing the raw JSON there!