Oct 04, 2019 09:11 AM
I’ve got a lot of data (~75k records) that won’t fit into a single table, so I need to separate it out into multiple tables. It’s basic contact info (name, address, city, state, zip, phone, etc). We’re doing a zip-code search on the front-end (via the API), so with a limited data set (smaller than the final batch of data), it’s easy enough to query the zip field for the passed value. I was figuring I could break it out into states as that’s an easy & logical breakdown.
However, now that I need to break this out into multiple tables, I don’t want to search ALL of the tables for that single value. Is there a way to create some kind of lookup table that would accomplish this? I imagine something where each record is a pointer to the other tables, along with say, min & max values from that table (eg. min zip & max zip)? This way I can search the lookup table for the row whose min is less than the desired zip AND whose max is greater than the desired zip. Then, I can take the resulting table and search that one for my actual data.
Or, if there’s a way to create a formula that references another table, so I could manually set up the lookup table to be State Name, min Zip => min(state.zip), max Zip => max(state.zip) or something.
Thanks!
-Rob
Oct 04, 2019 09:33 AM
FWIW, the limit on the number of records is per base, not per table.
See Airtable pricing.
With 75,000 records (regardless of how many small tables you create), you’ll need to upgrade to the Enterprise subscription.
Unless I’m missing something about how Airtable pricing works.
Oct 04, 2019 09:45 AM
We’re on Enterprise. We have a 50k row limit per table.
I’m considering just breaking them out by zip range in batches of 10k. This way I don’t need to look anything up and I can just determine which table to look at by the first digit of the zip code. But I’d still be curious to know if something like that is possible.
Oct 04, 2019 09:58 AM
I didn’t realize that the Enterprise plan had a limit per table.
The limit seems very low.
Thanks.
Oct 04, 2019 11:26 AM
Rob, I don’t have an answer to your question per-se, but I am curious about the ideal user experience. Does this sum it up?
If that is ideal, would this be even more ideal?
In this simplified search experience, the index itself is unified despite the fact that the data may be scattered across many bases and tables.
While the search solution must be aware of 75,000+ items (and likely to grow), it doesn’t have to possess ALL of the fields nor is there any requirement for a 1-to-1 relationship between a record in the index and a record in the data. Hashing algorithms can be used to compress the number of items in the index. As such, a 20,000 record Airtable index might possibly contain addressability for 100,000 data items.
Build Your Own Search Engine? (how hard can it be)
Using the Airtable API it is possible to create your own search index. It can be a little challenging because a unified index must hold the 75,000+ items. As such, building an index in Airtable itself is therefore difficult but it has one big advantage - your data remains private to the enterprise account. To overcome the index size issue, consider this architecture.
Another challenge is how to link the index to actual records to facilitate immediate access. Lookups from the index into the actual records perhaps? Perhaps dynamic links to the actual data - dunno.
Lastly, you need a crawler that is always updating the index. Also not a snap to build, but you mentioned proficiency with the API so I gather you have some skills in this regard.
I build enterprise-grade search engines for Airtable clients and other code-free platforms (shameless plug).
Jul 29, 2020 09:45 AM
Hey Rob,
Not sure if this is helpful or not, but I have a little experience from a project I’m working on in transportation. You can separate each batch of zip codes by the leading digit to create 10 separate zones, Zones 0-9. This gives you a group of states that neighbor each other (with the exceptions of Alaska and Hawaii, of course). I’m assuming you could apply some IFTTT logic on the front end to only search the specific Zone that is indicated by the first digit of the zip code provided by the user. Am I understanding your question correctly or did you find another solution?