Hi all –
I am building a production pipeline database for the nonprofit where I work. We currently track web publications, email, events, and social content in separate bases and that has made it difficult to visualize everything together and do cross-org and program-specific reporting. Our new base will unify all those content types into one pipeline for better reporting, more consistent data, and also tracking relationships between content types (ie, social post X is promoting web content Y). The different content types (Web, Email, Events, Social) have some shared fields, but each also has many fields which are unique to that type and our processes – tracking social content has different data than web content, for example.
With that context here’s the question: One table with all items regardless of type OR a table per item type and some kind of ‘unity’ table that brings things together for reporting and planning?
Option 1: One table to track all the content regardless of type.
Option 2: A pipeline table per content type (probably 4) and a “parent stubs” table. In other words, one All Content table with the fields that all content records have, with each record linking to another table with the content-type-specific fields. ie, for each row in ALL CONTENT where [Type]=Web, there’s a linked record in table WEB CONTENT with all the web-specific fields. Same for [Type]=Email, [Type]=Social, etc.
Option 3: Separate Tables per Content Type and a unified view created through automation. In other words, each current pipeline moves into the one DB and the common fields are synced into an All Content table.
In conclusion… I’m definitely leaning towards option1, but wanted to make sure there wasn’t a good method I’m missing! Your suggestions are very welcome.
i would recommend to read this:
The primary field
I did cleanup of some old base, with ~50 tables of different size, and joining some tables into one helped a lot. some had near 100 fields, but large part of it was redundant info, like links (sometimes joining the same two bases each other), lookups, ‘orphaned links’ etc.
you don’t need to use joins, when data stored in single table.
also, take attention at record limit per plan:
I would recommend to think in terms “Separate Tables per Content Type”, imagine these tables, but then join those having the same primary fields type, if other fields type and their number match at least 50%
Your narrative is excellent but it lacks enough data to make a realistic recommendation. What’s missing is scale and budget constraints - how much data for each entity are you anticipating? Sometimes the best approach may be out of reach solely because of constraints related to the size of your data.
Are you aware that the sync feature could allow you to roll up the existing data model architecture into a wholly new, high-level base containing the data necessary to extract key metrics and reports? It’s very possible you needn’t change a thing about the current system to achieve your objectives.
Hi Bill –
Thanks for the feedback. Always appreciate someone asking the step-back-big-picture questions :winking_face:
A few answers:
FYSA, the table structure I’m increasingly settling on:
Pipeline. Table for all tracked items.
Item Types. Table of the different types of things we produce and process requirements associated with each. (For example, a longer Report requires CEO review but a shorter commentary piece would not, etc.) Many of these columns are used as lookups to determine if a pipeline item is subject to particular automations.
Projects. Canon for the organization’s Program/Project hierarchy. We have ~15 policy programs, each of which has multiple projects. Includes links to Staff table to pull in the responsible approver for pipeline items.
Staff. Name, email, slack, status for team members. Lets us automate things like status change email notifications, internal publication notifications, etc.
Change Records. Records produced via automations that record changes to particular fields so that we have an easily email-able record of updates made as an item moves through the pipeline. (dont believe I can get record change history via the API.) For example, If someone changes Pipeline.Phase, an automation creates a new record linked to the modified Pipeline record that notes the date, field that changed, the new value, and the user.
Content Submissions. When programs complete writing a product, or need to send it for review, they submit it via a form. That form confirms they have completed the relevant publication requirements (inherited by the Pipeline item via Item Type table), whatever their final titles and summary are, and an attachment (word). Plan right now is that entries here get notified to my team and then we can accept a submission (probably a button that sets up an automation to copy the submitted content into the appropriate fields in the Pipeline record).
Email Distro lists. Data pulled from mailchimp of our various lists (groups and tags).
Date Submissions. Similar to content submissions but much simpler – a way for staff to update their expected completion dates so that my team can plan releases/launches.
Email Performance Record. Each item in Pipeline with ItemType.category=Email gets a record here. On a schedule (X days after the email is sent), we will pull performance numbers into Airtable from the Mailchimp API. The record name is matched against the incoming data and populated accordingly. (This process is imperfect, which is why this is a separate table - too easy to inadvertently mess up lots of records.) Note eventually we will also have a Social Performance Records table that pulls social item performance from the relevant APIs, but will be later.
Yes. There are a few reasons for doing it fresh.
Okay - and by “one place”, you mean “one base”, I think.
Given the scale and limited budget, I don’t think this is an irrational approach; it makes sense for the points you make obvious. And it seems you have a good grasp of the implications when tilting in favour of a flatter data model.
One of the big benefits of NoSQL platforms is the ability to manage variable record lengths to optimize performance, findability, and storage costs. However, in this case, the advantages of a unified base/table approach pay big dividends and the cost of many empty fields seem like a small price to pay for increased simplicity and reduction of “parts” and simplification in reporting.
People often forget that reducing parts in anything - from physical goods to data management, to process automation - is fundamentally a wise move. I recently touched on this topic here.
One aspect of Airtable to be cautious of is the expanded nature of formula-type fields to make stuff work. Unlike a spreadsheet which has the capacity to store formulas and values in the same cell, Airtable pushes its users to create what I refer to as “kludge fields” to overcome the lack of a unified ability to both store and compute dynamic values. This has the impact of creating some incentives to lean on formulas a bit too much and despite a relatively small number of data fields, there sometimes exists an equal number of supporting computation fields which drag down performance. Your annual data flow doesn’t seem like this should create any serious performance issues though.
Yes! I find that my “programmer brain” experience, limited as it is, has served me very well when thinking about organization and process design. I often find that people worry about creating single points of failure while forgetting that adding parts is also adding new, more complicated, failure modes.
Those hacks bother me, too – they always seem like covers for a missing feature. One of my goals with this rebuild is to minimize that kind of thing. I’m hoping that well designed views and very granular automations will cover most cases. (I expect I’ll be using lots of buttons to flip checkboxes.)
On the same page with you. I’ll add: “3.5. Change management comms and process documentation.”
Many thanks again, Bill.
Interesting read - I have the same problem, so maybe restating the problem in a generic way…
What structure (tables/links) should someone create, to enable one table to contain fields shared with entries in other tables, where the other tables “extend” the information in the common table.
Basically, its a MANDATORY one-to-one relationship that is needed?
But what you end up with, is the “shared” table having a field for each “extend” table, when it should really have 1 field called “child” that can be linked to any 1 of the sub tables…
But I don’t think thats possible in airtable right?