Re: Best way to organize data where only some rows require additional notes?

Heirtable · ‎Oct 27, 2021

Let’s say I have a list of Artist / Track / Link and I want to store it in Airtable. I know that would be easy with a 3 column table in some base.

But let’s say I wanted to add notes to some of the rows. For example, for some rows I might want to add extra information about the track. This would be for some rows but not all rows.

I know I could use a table with 4 columns, like this:

Track | Artist | Link | Notes

But if most of the Notes fields are going to be empty that seems like an odd/inefficient way to do this.

What is the best way to do this?
Thanks much.

kuovonne · ‎Oct 28, 2021

I think we both know that Airtable puts effort into storing data efficiently, even if we don’t know the full, exact details.

On the other hand, there could be a different, simple explanation of why the REST API omits fields with no data. It could be omitting the data simply to reduce internet bandwidth.

Similarly, if you query the scripting API, null fields are returned in the result. This is not Airtable trying to expose or obfuscate how the underlying data is stored. Rather, this is Airtable trying to make things easier for the script writer, when you do not have the same internet bandwidth issues.

Bill_French1 · ‎Oct 28, 2021

Indeed, and this is likely because both of these SDKs are implemented at the client, not the server, right? Ergo, I suspect they had to manicure the results for their own client. API developers must do the same - we have to accommodate the fact that the values are undefined and interpret them as null (not empty) values. But, the lack of such values at the lowest level interface pretty much indicates there is no wasted space.
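For anyone consuming the REST API, here is a minimal sketch of that accommodation in Python: any field the response leaves out is filled in as `None`. The field list and the sample record are made up for illustration; only the general shape (a record with an `id` and a `fields` object that omits empty fields) comes from the discussion above.

```python
from typing import Any

# Hypothetical field list for the Track / Artist / Link / Notes table.
FIELDS = ["Track", "Artist", "Link", "Notes"]


def normalize_record(record: dict[str, Any], fields: list[str] = FIELDS) -> dict[str, Any]:
    """Return the record's field values, with every omitted field set to None."""
    present = record.get("fields", {})
    return {name: present.get(name) for name in fields}


# Illustrative record: "Notes" is empty, so it is simply absent from "fields".
sample = {
    "id": "rec0000000000001",
    "fields": {
        "Track": "Example Track",
        "Artist": "Example Artist",
        "Link": "https://example.com/track",
    },
}

row = normalize_record(sample)
print(row)
```

After normalizing, every row has the same keys, so downstream code can test `row["Notes"] is None` instead of checking whether the key exists.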

kuovonne · ‎Oct 28, 2021

I agree that the difference in the APIs is due to the client/server situation. And the REST API is certainly an older system. And I totally agree that there is no wasted space.

However, I don’t believe that the REST API is necessarily the best representation of how Airtable currently stores its underlying data. The field values for some of the more complex data types are very different between the APIs, and I suspect that the scripting API value is closer to what is actually stored.

But we are really getting off into a tangent here. I wonder if Heirtable is upset by us hijacking the thread. I hope not. Side chats like this, and getting to read everyone’s particular writing style is part of what makes this community so vibrant and fun.

Bill_French · ‎Oct 28, 2021

I feel exactly the opposite. Here’s why…

#1 There is only one publicly-available API (the REST API). The others that you refer to as “APIs” are actually SDKs, and SDKs are built on top of APIs - which API exactly is unknown because Airtable likely has an internal one that we aren’t privy to. However, whatever API it is, it is the most intimate one can get with the platform. SDKs are [by definition] abstraction layers.

#2 The SDKs exist in Airtable at the client; not the server. And since they are client-bound, there’s a strong likelihood that they were designed to be helpful in building clients. After all, this is precisely their intention and any such “helpers” baked into this abstraction layer are probably going to represent not the underlying data architecture, but a transformed data layer suitable for efficient rendering and other UI logic.

That said - whatever guesswork we may engage in is far more likely to be accurate if we let the result sets of the API guide us because the SDKs must be more abstract to achieve what they do. Ergo, we can safely surmise that they are augmented realities of the underlying database architecture.

Way Off Point?

To be clear, I don’t write for just the questioner; I try to be broadly informative to the 10,000 readers who will follow @Heirtable’s inquisitiveness. But, I think we’re still on point - the question was clear - aren’t empty fields a waste of resources? Pretty sure we can agree, they’re not.

However… (love that word)

To really address this fully, we need to circle back to one of your earlier points concerning the debate over flat tables vs relational models. I think there’s at least a small book we need to hammer out. You first. :winking_face:

kuovonne · ‎Oct 28, 2021

I see the phrase “API Reference” as a heading for the Scripting documentation and Custom Apps documentation so often that I tend to think of them as APIs. But SDK is a better term.

I don’t see why the public REST API is necessarily closer to what is actually stored than the SDKs are. The public REST API is not the API that Airtable itself uses. (I remember getting a public statement from Airtable to this effect when custom blocks came out.) There is a strong likelihood that the REST API was designed to be helpful for consumers of the API. For example, in the REST API single select fields are returned as strings, but in the SDKs they are returned as objects with an id, color, and name. The id of the choice is probably stored (or why bother having an id?), but that internal id is pretty useless to the consumer of the REST API, so only the name is returned.
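To make the single select difference concrete, here is a rough Python sketch of the two shapes and of how the richer one collapses into the plainer one. The id and color values are invented for illustration, and the `Choice` type and `choice_to_rest_value` helper are hypothetical names, not part of any Airtable library.

```python
from typing import Optional, TypedDict


class Choice(TypedDict):
    """SDK-style single select value: an object with id, name, and color."""
    id: str
    name: str
    color: str


def choice_to_rest_value(choice: Optional[Choice]) -> Optional[str]:
    """Collapse an SDK-style choice to the bare name, as the REST API returns it."""
    return None if choice is None else choice["name"]


# Illustrative values only; the id and color are made up.
sdk_value: Choice = {"id": "selExample0000001", "name": "Jazz", "color": "blue"}
rest_value = choice_to_rest_value(sdk_value)
print(rest_value)
```

The conversion only goes one way: given just the name, the id and color cannot be recovered without looking up the field's list of choices.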

I don’t write for just the questioner either. But I think you vastly overestimate the number of readers of an individual thread. But given the original poster’s response to earlier posts in this thread, I still wonder.

Oh no. I’m afraid to answer this. I currently firmly believe that when building an Airtable base, the base designer should take into account the inherent structure of the data, the abilities & limitations of the platform, and the workflows & preferences of the users. Sometimes this will mean a flat table. Other times it will mean linked tables.

Sometimes people use flat tables when a linked table would be better because they need to deal with record count limits. Other times people use linked records when the data could be flat, because having multiple tables/tabs makes more sense to them and fits better with their workflows or security models.

Michael_Lever · ‎Oct 29, 2021

I was about to compose an OP until, after reading yours, I realised I have the same or a similar issue.

I am a commercial property surveyor and currently I use Filemaker 12 as my main database for property information, but I want to transfer all the records, some 80,000, to AT. On an AT base I have fields that generally replicate the FM fields, but I shall not be importing records. Instead I shall copy and paste them one by one, because for me the advantage of a new system is to have a rethink. For example, in FM I have 5 fields for a particular aspect whereas in AT only one because I’ve customised the field to Multiple Select.

In my FM database I have created what appears to be a table (with label ‘Rent History’) consisting of 12 fields and 10 rows. Reading this thread I now realise that it is not a table as such but 120 fields that I have positioned on the FM layout to display the content in each field as if a table. I would guess that relatively few records have enough content to occupy most of the fields, the bulk are empty.

Where AT differs, it seems to me, is in layout. Unless I use the Page Designer app for a layout or have a view for each property (which might not be feasible, and would be inefficient), I am not going to be able to know until entering the content which fields should be hidden. Moreover, as content is not always fixed, it would be better not to hide any, as having to unhide would be a waste of time as and when.

After several days pondering my conclusion is to have 12 fields, each customised type to Long Text. Instead of ‘rows’ I shall enter the content in each field as if a row. For example:

Rent History 1
RR 2016-09-29 - £8000 Zone A £20 Memo ? ML

Rent History 2
RR 2019-09-29 - £10,000 Zone A £25. Memo 2019-10-06 ML

Rent History 3
RR 2021-09-29 - £10,000 Zone A £25. Memo 2021-10-29 ML

The downside of my conclusion is how to manipulate the data. That is resolved by having other fields for data that I want to manipulate. Not ideal but a work-around.

What do others think?

Bill_French · ‎Oct 29, 2021

Now we’re definitely off-topic. :winking_face: Even this relatively recent thread has more than four thousand known community views. Typically, for every named-user view, there are many circuitous views that happen in the context of other search systems and scrapers. One thing is always true about tracking analytics on the Interwebs - they’re vastly under-stated.

Bill_French · ‎Oct 29, 2021

I have used this technique for a variety of use cases where a large number of small transactions pertain to a single row. This is useful if you want to breach the record limit boundaries of Airtable and other no/low-code platforms.

The challenge, of course, is how to extract analytics from data living in a relatively unstructured cell. There are ways to overcome this (RegEx, text manipulation, etc), but it does create a complexity that can lead to database dysfunction. As long as you accept the implications and consequences, I see no reason to dissuade anyone from this approach.
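As a rough illustration of the RegEx route, here is a Python sketch that pulls structured values out of rent-history lines like the ones posted above. The meaning assigned to each part (event code, review date, rent, Zone A rate, memo date, initials) is my guess from the three sample lines, and the `parse_rent_line` helper is a hypothetical name; a real pattern would need to follow whatever convention is actually used when typing the entries.

```python
import re
from typing import Optional

# Assumed layout: "<code> <date> - £<rent> Zone A £<rate>[.] Memo <date or ?> <initials>"
PATTERN = re.compile(
    r"^(?P<event>[A-Z]+)\s+"
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+-\s+"
    r"£(?P<rent>[\d,]+)\s+"
    r"Zone A\s+£(?P<zone_a>\d+(?:\.\d+)?)\.?\s+"
    r"Memo\s+(?P<memo>\S+)\s+"
    r"(?P<initials>[A-Z]+)$"
)


def parse_rent_line(line: str) -> Optional[dict]:
    """Parse one rent-history line; return None if it does not fit the pattern."""
    m = PATTERN.match(line.strip())
    if m is None:
        return None
    memo = m.group("memo")
    return {
        "event": m.group("event"),
        "date": m.group("date"),
        "rent": int(m.group("rent").replace(",", "")),  # "10,000" -> 10000
        "zone_a": float(m.group("zone_a")),
        "memo_date": None if memo == "?" else memo,  # "?" means not yet known
        "initials": m.group("initials"),
    }


lines = [
    "RR 2016-09-29 - £8000 Zone A £20 Memo ? ML",
    "RR 2019-09-29 - £10,000 Zone A £25. Memo 2019-10-06 ML",
]
parsed = [parse_rent_line(line) for line in lines]
print(parsed)
```

The fragility is the point: one entry typed in a slightly different order returns `None`, which is the kind of complexity described above.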

Indeed, Airtable all but sanctioned this approach by introducing the JSON editor block and at least one use case here.

I see this new capability as a “green light” to utilize JSON payloads as data objects without fully driving our client/users off the cliff. Until now, embedding JSON in a text field was a rogue idea despite the advantages it presents in many use cases.
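A minimal sketch of the JSON-in-a-text-field idea, in Python, assuming the cell holds a JSON array of small transaction objects. The key names (`date`, `rent`, `zone_a`) and the `append_entry` helper are hypothetical, chosen only to mirror the rent-history example earlier in the thread.

```python
import json


def append_entry(cell_text: str, entry: dict) -> str:
    """Add one transaction to the JSON array stored in a text cell."""
    entries = json.loads(cell_text) if cell_text.strip() else []
    entries.append(entry)
    return json.dumps(entries, ensure_ascii=False)


# Start from an empty cell and add two transactions to the same row.
cell = ""
cell = append_entry(cell, {"date": "2016-09-29", "rent": 8000, "zone_a": 20})
cell = append_entry(cell, {"date": "2019-09-29", "rent": 10000, "zone_a": 25})

# Reading it back gives structured data without any RegEx.
history = json.loads(cell)
latest = max(history, key=lambda e: e["date"])  # ISO dates sort correctly as text
print(latest)
```

Compared with free-form text, the payload is harder for people to read in the grid but much easier for a script to query.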

kuovonne · ‎Oct 29, 2021

Airtable has a limit of 50,000 records per base, unless you have budget for an enterprise workspace. Even an enterprise workspace will get you only 100,000 records per base, and there may be a lower per table record limit.

One by one copy/paste of this number of records doesn’t make sense. If you can find some way of staying within Airtable record limits, do a CSV export from FM and then import into Airtable. Airtable’s CSV import app has a limit of 25,000 records, so you may need to import in chunks. You also might need to use scripting to import the data if it needs to be transformed during the import.
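For the chunking step, here is a rough Python sketch that splits one exported CSV into pieces of at most 25,000 data rows, repeating the header row in each piece. The function names and the sample data are hypothetical, and it assumes the export has a single header row.

```python
import csv
import io
from typing import Iterable, Iterator


def chunk_rows(rows: Iterable[list[str]], size: int) -> Iterator[list[list[str]]]:
    """Yield successive batches of at most `size` rows."""
    batch: list[list[str]] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def split_csv(text: str, chunk_size: int = 25_000) -> list[str]:
    """Split CSV text into chunks, each starting with the original header row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:  # empty input: nothing to split
        return []
    chunks = []
    for batch in chunk_rows(reader, chunk_size):
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(batch)
        chunks.append(buf.getvalue())
    return chunks


# Tiny made-up example: 5 data rows split into chunks of 2.
sample = "Track,Artist\n" + "".join(f"t{i},a{i}\n" for i in range(5))
parts = split_csv(sample, chunk_size=2)
print(len(parts))
```

With real data you would read the export from a file and write each chunk to its own file before importing the chunks one at a time.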

kuovonne · ‎Oct 29, 2021

You have much better statistics than I do, but that particular thread isn’t representative of this thread. That thread is about a much more highly desired feature, and isn’t particularly recent in my opinion. Plus many “views” are from people who don’t actually read the text and could be from repeat people. So even with 4K views of that thread being an undercount, I don’t see anywhere close to 10K people reading this thread.