Help

Re: Best way to Go from Raw to Summary Data

1305 0
cancel
Showing results for 
Search instead for 
Did you mean: 
Anthony_Fiedler
4 - Data Explorer
4 - Data Explorer

I am trying to set this up.
I have a Table with a list of companies, years, and list links where I got them from.
I have another table with the list links, list names and years.
I am trying to get the companies to merge into one source with the list names and years that they appeared. Images below - the one with colored multiple select is the similar to what I’m looking ot get as a result.
Any help if all of the companies are in one table appended for each list.
image
image

2 Replies 2

Hey @Anthony_Fiedler!
Welcome in!

Okay, so I am curious to learn more about your use case.
In order to dig in and get you a solution, I need a bit more context.


Here’s what I understand so far (and what I’m filling the blanks with)…

  1. Table one contains records of companies referenced in different lists. I presume found in publications(?)

  2. Table two contains records for each list or publication(?)

You want to see a complete, consolidated view of companies, the publications they’ve been listed in, and the years they appeared.


Here is a link to the base I used for this post

Feel free to explore and copy it to play around with it.
If I make any changes to any feedback you provide, I’ll change things around in this base as well.


If those things are true, then here’s my solution.

I created my own base to experiment with your use case a bit. There are a few things that caught my eye from your screenshots.
The biggest one is that you have multiple records for the same type of data.
In your case, you have multiple records for the same companies.

While there isn’t anything explicitly wrong with doing this, you make it a bit harder for you to manage your data and effectively summarize it.

So…

Companies

The first thing I did was create a dedicated companies table.
This table contains a single record for every company.

image

This allows me to consolidate and focus my data.
See, instead of having to link multiple publications/lists to multiple company records, all I have to do now is worry about linking multiple records to a single record.

Next is…

Publishers

This table holds records for each publication.
I wasn’t too sure what your data looks like, so I just added in some placeholder publications and outlets, etc.

image

I just have a sample URL to the website of the publication as an example for data we can store for these records.

And finally…

Lists

It seems like list is a synonym for any sort of published article and such. Thus, I’m just gonna talk about it that way.

The table holds records for every individual published article and list
We have linked records for the given publication, as well as any associated companies that are listed and associated with the article.

The record has a field for the title of the post, as well as details like the URL, as well as the date it was published.

The primary field is a formula field that builds a tag to summarize key information.
It lists the post name, the publisher, as well as the year that it was published.

image

Here is the raw form formula I used in the field:

IF(
    {Post Name},
    {Post Name} & "\n",
    "⚠ Missing Name" & "\n"
) &
IF(
    {Publisher},
    "By: " & {Publisher} & "\n",
    "⚠ Missing Publisher" & "\n"
) &
IF(
    {Date Published},
    DATETIME_FORMAT(
        {Date Published},
        'YYYY'
    )
)

There are small additions for the tag that alert to missing information from key fields.
The {Date Published} field is referenced in the DATETIME_FORMAT function that is nested in the formula body to create the year value on the bottom of the tag.


How It All Comes Together

If you look at the first screenshot of the Companies table, you’ll notice that there are a few fields hidden.

These fields are rollup fields that provide summarized information about the publishers that have referenced a given company, as well as the years that any articles/lists were published.

image

The rollups use the ARRAYUNIQUE function to avoid duplicates.

The Publisher Linked Field

You’ll notice that there is a publisher linked record field.
The thing I needed to do was to find a way to establish a link between the companies and the publishers that have written about them.

The problem was that there might be new publishers and/or new companies, and that creates a lot of possible new values.

So I decided to use an automation here to solve this.
It triggers based on a record update. It specifically watches the publisher rollup field for updates.

When the trigger is fired, it will update the record’s linked field to the publishers’ table using the values of the rollup field.
This works due to the comma separation between rollup values, but it’s a technical thing. Don’t worry too much about it… it just works.

image

The effect is that the companies will only be linked to the publications that have written about them based on the articles/lists that are linked to them.

This allows you to get lookup/rollup data directly from the publisher records.

The Summary Field!

Okay, now that we’ve built the framework, we can begin the flow of data between our records.

In the Companies field, you’ll find the Summary field.
This compiles all of the key publisher/article data into a summary of a company’s appearances.

image

It’s conditionally formatted for either a single year or for multiple years.

Here is the raw form formula:

IF(
    {Lists},
    {Company} & " appears " & 
    IF(
        COUNTA(
            {Date Published Rollup}
        ) = 1,
        "once in " & {Date Published Rollup},
        IF(
            COUNTA(
                {Date Published Rollup}
            ) > 1,
            COUNTA(
                {Date Published Rollup}
            ) & " times in: " & "\n" &
            ARRAYJOIN(
                {Date Published Rollup},
                '\n'
            )
        )
    )
)

If the company has only appeared for one year, it will display a message like:
Airtable appears once in 2022.

If a company has multiple appearances across multiple years, it will display a message like:

Airtable appears 4 times in:
2018
2019
2021
2022

Again, let me know of any additional context you might have or think I might need and I will adapt and help you brainstorm more.

Thanks - this looks a lot closer to what I was going for.

Yeah, I’d be adding more and more data as I collect it. Right now, it is just a list where I can add to excel and copy over to airtable easily.
image
The second image here was an attempt at hacking together a summary list based on the publication links (end goal is to do what you and I had above ^ Publication link/url is for my own benefit (where I sourced data), Publication Name and Years would be what I wanted listed. But there are many lists still to scrape and sort through.
image

Let me know if you’re possibly open to chat about it in realtime or discuss via Loom. Primarily the capability/flexibility of what you’d done there as well as being able to add in more data.