Wide Tables - pros and cons

Michael_William · ‎Apr 18, 2018

One of my tables is “Accounts” and it has gotten really wide (40+ fields) and growing.

(Name, ID, Address Complete, Address1, Address2, City, State, Zip, Phone, Fax, Comments, Attachments, COD status, Credit Terms, Tax Exempt, Tax Exempt Number…)

I’ve backed up the table and started playing with linking sub-sets of account information in multiple tables (a 2nd table for Address information, a 2nd table for Credit/Billing information, a 2nd table for interests/category information… so each table can be limited to 15 or fewer fields)

What are the pros and cons of wide tables?

The way I see it, going wide is:
Pros:

Everything can be entered in one record (I don’t need to go to multiple tables and re-type the account name to continue entering information)
No confusion over whether the additional records have already been made in the additional tables
Can use custom views to make the data more visually manageable

Cons:

Googling DB vs Spreadsheet makes me think that DB’s are supposed to be tall and skinny
Without custom views, the table is getting awkwardly wide

After typing this up, it seems that the benefits of going wide exceed the costs… but I’ve only ever worked with spreadsheets before Airtable, so maybe I’m stuck in a 2D mentality. Are there wide table cons I’m not seeing?

Thanks,
Mike

Andrew_Johnson1 · ‎Apr 18, 2018

This just my opinion but you would generally like to use the “Linking of Records” of Airtable to keep your table nicely organized and not too wide.

The scrolling might drive you crazy in the long run (Once you have more than 1000 records) and keeping the different tables linked to each other allows easy retrieval and access of data.

In traditional databases we use the concept “Normalization” for this particular use case, and Airtable has nicely abstracted this feature by adding in the “Linking of Records” feature

Michael_William · ‎Apr 18, 2018

That’s similar to the advice I’ve seen from online searches, but what am I risking (performance, corruption, visual organization…) by keeping all of my fields within one record within one table? It seems like from a users perspective things are easier if I stay within one record and just make it super wide.

Thanks again,
Mike

Andrew_Johnson1 · ‎Apr 18, 2018

It will prevent data redundancy and ease of data updation.

Imagine this for your “Users” table you have each user associated with a “Company”.
Now the “Company” has its own set of attributes such as “Number Of Employees”, “Company Name”, “Company Registration Number” etc.

Now suppose for some reason you want to update a company’s data.
You would then have to go in and make changes for all the users associated with that Company.

But if you were to link all your Users to a “Company” Table having the attributes “Number Of Employees”, “Company Name”, “Company Registration Number”, you would only have to make the change in one place.

Also there is unnecessary wastage of precious storage space since you are storing the same “Company” data for all the users associated with that Company

Jeremy_Oglesby · ‎Apr 18, 2018

One of the important things that @Andrew_Johnson1 is getting at, but didn’t say quite as explicitly, is that any field who’s data gets repeated (duplicated) across multiple records should probably be a linked field to another table.

Among the fields you mentioned above, take the “state” part of the address for example - many entries will have the same state listed there. You could easily make another table with all the states and turn that into a linked records field. After doing that, instead of having hundreds or thousands of individual, identical data points called “New York”, you could have a single data point called “New York” that hundreds or thousands of records point to. This prevents data corruption because when hundreds of people are entering “New York” into your table (or one person is entering “New York” hundreds of times), you know for certain it will be the same every time because you are creating a link to an existing data point, not making a new one.

The “state” field is a pretty low benefit field to turn into a table - I’m sure you probably have other fields that are even greater candidates for doing this - but you see the point at least.

Jeremy_Oglesby · ‎Apr 18, 2018

Also, the process of making a linked record while making a new “Account” record isn’t so labored as you imagine.

Let’s say you start a new record in that table. You get to the state field and need to enter a state you do not have in your states table yet (ie, you need to create a new record in a different table than the one you are in now) - this will not interrupt your workflow that much. You’d type the name of the state you need to enter - the list searching for results on the state table will return that this record doesn’t exist, but at the bottom will be an option to create a new record. When you select this, a pop-over view of the new record you are making in the states table would appear over your accounts table and it will have a line from the top pointing to the table it originates from to let you know you are working in a table that is not open right now - after you are done entering any info you need to into the pop-over (the new record you are making) you can simply hit escape and you will be right back where you left off in making your account record. This is true no matter how involved your secondary table is.

These things are best understood by trying though, so I encourage you to test it out yourself.

W_Vann_Hall · ‎Apr 18, 2018

To my knowledge, no one has yet to create a ‘best practices’ document for Airtable — but when one does exist, I expect ‘data entry [for a given process] should be performed from a single table’ will be one of the suggestions.

Note that does not mean all data should be entered into a single table; instead, there should be a table and view that serves as one’s base of operation.

For instance, in the example you gave, my first inclination would be to break out what for lack of a better term I’ll call ‘static’ and ‘dynamic’ company information. Static information would be those things about a company that change only infrequently: name, address, phone numbers, D&B ID, and so on. Dynamic information, unsurprisingly, would be those things that do change frequently — which in most instances also correspond to your business intelligence, the data that power your company: orders, balance, line of credit, class of service, and the like.

Since those are what are essential to your business, I would make them the ‘subject’ of your primary table. (Many, of course, may well be itemized in other tables with a summary rolled up to the main table.) The static company info resides in a second, linked table. When you add a new company to your customer base, though, you do not go to the [Company] table to enter the data; instead, you select the plus sign in the linked {Company} field in the [MyBusiness] table and drill through to the [Company] table. There, you select '+ Add a new record' and enter the company’s static information. Once you’re done and close the popped-up [Company] record window, you’re returned to the [MyBusiness] record where you left off.

The one drosophila-sized fly in the ointment currently is Airtable’s disregard of default field values in records created as a linked record. This complicates certain tasks, but probably doesn’t impact most users.

Edit: I wanted to add, if it’s not obvious from my comments, the width of a table should be one of the last things to consider when deciding what data to retain and what to link. I’m probably not the guy to write the ‘best practices’ document, but I have tables with at least 120 fields.[1] I manage these through generous use of views, hiding those fields unimportant to the process or procedure using that view. I also usually create a grid view with no hidden fields labeled <Development - All Fields> and locate it at the bottom of the list.

Admittedly, these are probably edge cases. For instance, I may have a formula field that extracts a certain value from a record side-by-side with a single-select field containing the same information, created by copy-and-pasting values from the formula field into the single-select field. I never use the single-select field for, well, selecting; instead, i use it to drive behaviors for which Airtable demands a single-select: Kanban grouping, for instance, or certain features of the Charts Block.

Michael_William · ‎Apr 18, 2018

Thanks for all of the thoughtful responses.
3 follow-ups that I haven’t quite grasped:

If one of my main concerns in data corruption, is there a difference between:
a) making a new table (with US States, for example) and linking it
b) a single-select field type loaded with my state options
It seems like both would achieve the data validation protection.
Most of this data is very static (our interactions, contacts, and orders are being stored other tables), so what I’m gathering is:
a) If the data isn’t unique to the record (state name, process code, category, tax exempt (yes/no), credit terms (30, 60, 90, COD…)… it should be linked from another table or selected from a single-select field type (pending the answer of number 1) to prevent data corruption
b) If the fields within a record are generally entered at the same time, it isn’t a bad idea to enter them wide (even though creating a new-sub-table-record isn’t that that rough, as described above)

Other than using sub-tables to ensure data validity (no miss-typing states or using Caps sometimes and not other times) and object inheritance (to make sure batch changes can be made in one place (a company’s address change needs to flow down to all linked employees)… are there other fundamental concerns with 50 columns (fields) in a record?

Thanks again,
Mike

W_Vann_Hall · ‎Apr 18, 2018

I agree. Frankly — and remember, I’m Mr. Unbest Practices — if it was a base in which i personally had to enter data, I’d do neither. If i was concerned about erroneously entering the state, I’d write a formula that broke out the state abbreviation, checked to see if it could be found in a concatenated list of approved abbreviations, and displayed a '🔥🔥🔥 Danger, Will Robinson!' error message if I’d fat-fingered it. That’s based upon my rather risking an occasional muffed entry of a two-character field than having to choose the entry from either a single-select field or linked record.

I agree totally, though, if there’s no compelling reason to use a linked record, you’re probably better off using a single-select — in part because i find it more irritating to navigate a base with a large number of tables than I do one with very wide tables.

I’ll give you my tentative agreement, here; I can think of exceptions, but they may be of the kind that prove the rule. In my response to your first question, I referred to the UI/UX (user interface/user experience) tradeoff between data cleanliness and user aggravation, so I want to caution about going too far in either direction.

I think in many instances you’ll be using both single selects and linked records. For instance, I would see both tax exempt and credit terms existing as fields in your [Company] table, the former a checkbox and the latter a single-select. You would make use of those fields, though, in calculations performed in the [MyBusiness] table, accessing them through rollups or lookups from the linked record.

Well — and this isn’t a computer science, best practices sort of response, but more of a ‘you kids stay off my lawn!’ one — my gut tells me, if you have 50 data entry fields in a table, you’re doing something wrong: Your data model is too flat.

And now I think about it, part of my reason for feeling that way is because it’s too difficult to manage all that data entry given Airtable’s layout; that is to say, it makes the table too wide. :winking_face: More than that, though, not many applications deal with so many data points of comparable significance. Part of what you do when you organize a base is implicitly define a hierarchy of importance for your data: More-crucial information tends to rise to your primary (see?) table, with data of lower criticality drifting to linked tables. Somehow, 50 fields of comparable significance seems unusual.

Now, if many (most?) of those 50 fields are derived fields (formula, lookup, rollup, autonumber, etc.), I see no problem. True, I’d probably define views that hid as many as possible, just to lower stress on whoever is stuck entering data, but a 50-field table, while large, is by no means excessive. (For instance, in that 120-field table I mentioned, only 15 fields are for data entry. An additional 36 fields are linked-record fields — but that’s an unusually high number and results from the somewhat atypical nature of the base. All the rest [69, if my high school math hasn’t deserted me] are derived fields.)

Jeremy_Oglesby · ‎Apr 18, 2018

@Michael_Williams
This is a much more all-encompassing answer compared to mine, and I agree entirely. The “State” example I used really wasn’t the best, and as I said, there are probably much better candidate fields in your table for breaking out into a separate table. “States” could definitely be handled with a Single-Select field, as they probably have no other information attached to them (ie, a record in a “State” table probably wouldn’t have much more than the name of the state and then their linked record fields).

But at the same time, it can be very nice if all of your “Data Entry” fields in your accounts table could fit in a single screen without horizontal scrolling, or with minimal horizontal scrolling. That’s more or less what I aim for in tables that will be a heavy data-entry gateways, and linked records that break the data down into chunks can help with that. If there follow 10, 20, 30 formula derived fields, those can be isolated in separate Views, or their data modeled or aggregated in Blocks.

Well put, @W_Vann_Hall