Removing duplicates from a string

Hi all!!
I have a list of participants in a project, but as this list of participants is a lookup from several tasks linked to the project, it has duplicated items. I want to count the number of unique participants.

I’ve tried transforming this list of participants into a string, I know how to count the number of participants in that string but I need to remove the duplicates. Does anybody know how to remove duplicates from a string?

I know how to remove completely a participant from the string (using SUBSTITUTE), but I don’t know how to make it appear only once.

SUBSTITUTE({Participants}, Find, “”)

(Find is the name of the table with the name of the participant I want to remove)

Thanks in advance!!!

Use this formula:

ARRAYUNIQUE({Your Lookup Field Here})

Thanks ScottWorld, I’ve tried that one, but is not working. Each task has several participants attached to it, so the arrayunique formula reviews the complete list of participants for each task, and if the complete list is different from one task to another, all of them are included.

For instance, the project has two tasks. Task 1 has Person 1, Person 2 and Person 3. And task 2 has Person 1 and Person 2.
When I use arrayunique, it compares “Person 1, Person 2, Person 3” with “Person 1, Person 2”. As they are different the final string is “Person 1, Person 2, Person 3, Person 1, Person 2”.

And I don’t know how to avoid this…

I’m not understanding how your system is setup. Maybe include some screenshots or a video.

First of all, sorry as I’m new with Airtable and perhaps I’m not using the proper terminology. I’ll do my best to explain myself. This is not a screenshot from my actual system, but I hope it helps me explain my problem.

I have a table of employees, a table of tasks, a table of projects and a junction table to join a Project (single link) with a task (single link) that also allows me to assign several Workers (multiple link to Employee table).

In this example, Captain America is assigned to the task of collecting the infinity gems (with other colleagues), but also to the task of returning them.

I want to see which employees are assigned to each project. So I’ve created a Lookup field in the “Project” Table.
As the same worker can be included in several tasks in the same project, when I create the lookup field, they appear duplicated.

Either using Lookup or Rollup (with the formula “Arrayunique”), Captain America appears duplicated. And if I try to count the number of workers it shows 5 people, when only 4 different workers are involved.

Arrayunique is useful in case there are several tasks with the exact same workers assigned on them. But if there are several workers and only some of them are duplicated, Arrayunique considers they are two different arrays, and shows all the workers, including duplicated ones. Let me explain this better.

I’ve added a new task “Back to normality”. Only Nick Fury is assigned to this task.

As Nick Fury is also the only worker assigned to “Call Captain Marvel”, Rollup realizes is a duplicated record and doesn’t show it, although Lookup is still adding it.

So my question is, how can I eliminate duplicates when there are several workers assigned to a task?

I hope is more clear now :smiley:
Thanks!!!

I like the superhero example! :stuck_out_tongue_closed_eyes:

Try this:

Instead of using a Lookup field, change that lookup field to a rollup field and use this formula for the rollup field:

ARRAYUNIQUE(ARRAYFLATTEN(values))

That should give you an array of unique people.

To get a count of the unique people, you could modify that rollup field (or just create a brand new rollup field) using this formula:

COUNTA(ARRAYUNIQUE(ARRAYFLATTEN(values)))

Let me know if that works!

Sorry, still not working :frowning:

Captain America is still duplicated, and Counta() function is counting 3 :open_mouth:
I think is because Black Widow, Captain America and Hulk, as they are assigned to the same task “Collect Infinity Gems” are counting as 1.

Hmm, very strange. Yeah, I just tried it on my end too (I was just typing it up from memory before), and it doesn’t work at all.

The ARRAYFLATTEN function isn’t actually flattening the array at all!

And on my end, ARRAYFLATTEN doesn’t even show up as a choice in the Rollup formula box.

I’m not really sure how to solve this dilemma. Perhaps someone else knows the answer, like @Kamille_Parks or @Justin_Barrett or @kuovonne or @Bill.French.

I’m not skilled in this, but is it possible this should be reversed?

COUNTA(ARRAYFLATTEN(ARRAYUNIQUE(values)))

Using ARRAYFLATTEN can be tricky. When using it in a rollup to count unique items across lin with linked records, you need to use it in a rollup of a lookup, not a rollup of a linked record field.

image

ARRAYUNIQUE(ARRAYFLATTEN(values))

I tried to search these forums for the thread where this particular use case was explained to me to give credit, but I couldn’t find it.

Note that ARRAYUNIQUE(ARRAYFLATTEN(features)) works differently, depending on what type of field is being rolled up.

3 Likes

Oh wow, thanks so much, @kuovonne! What an incredible revelation!! :slight_smile: Yep, that fixed it for me! :slight_smile:

This is extremely unintuitive & unexpected behavior to me!

@Mar_Castro To make this work, you have to go back into your OTHER table, and create an additional lookup field. This is, in essence, a redundant field, but as @kuovonne has pointed out, this is required to make this work. You’ll be creating a lookup field that is based on your linked record field, but it is looking up the exact same field as your linked record field. Then, you can come back to your main table, and make your rollup point to that new lookup field that you just created.

Thank you, @kuovonne, for straightening out all of this craziness! :crazy_face:

ARRAYFLATTEN() only works with an array that contains other arrays, and a rollup of a linked record field doesn’t produce that. Linked record fields feel like they should produce arrays, but they produce comma-separated strings. My gut says that’s where a lot of the confusion comes from, not just for you but for a lot of others. Because links are a collection, the easy assumption is that Airtable gives you an array when using that link field anywhere else. The only place you do get an array from a linked record field is in JavaScript (i.e. the Scripting app, or a scripting action in an automation).

1 Like

Thanks for much for chiming in, @Justin_Barrett.

So linked records are a “collection” instead of an “array”?

You mentioned that we get linked records as an array via JavaScript. It also seems like we also get linked records as an array when using the API (via a simple CURL request or via an automation platform like Integromat or Zapier).

I wonder if Airtable could simplify this across the board, and always make linked records act like arrays — even within the Airtable interface itself. I have been using Airtable for years now, and didn’t realize any of this! :crazy_face: But that might be because I’ve been using Integromat & the API a lot.

My gut says that linked record fields are most likely stored as arrays internally. Why they’re not presented as arrays through Airtable formulas is a mystery. Sadly, we’re not likely to see it change because so many formulas exist that expect linked fields to output strings.

True. I was just focusing on internal scripts.

1 Like

Justin, thanks for chiming in on this thread, and this previous thread on the same topic, where you first explained this issue to me.

The read format for lookup fields is now published as an array in the documentation for scripting and custom apps. However, these arrays contain slightly different data. In scripting, you get an array of field values (along with some other data not mentioned in the documentation). In custom apps, the array is an array of objects that contain both the field value and the linked record id. This tells me that under the hood, the information is not stored exactly as it appears when using the api.

1 Like

I appreciate the breakdown on lookup fields, but I was talking about linked record fields in that comment you quoted. :slight_smile: Also, I was speculating on how they are stored internally by Airtable. In most cases, I imagine that the internal representation is very close to how Airtable presents the data via various APIs. Assuming that links are stored internally in arrays, and with many (if not most) users also assuming something similar when they first start interacting with linked record fields via formulas, it’s even more of a mystery why linked record fields are passed to formulas as strings and not arrays. Arrays are easily turned into strings, but Airtable’s formulas provide no ways of making arrays (cue the long-standing request for a SPLIT() function), so what benefit is there in pre-converting an array of links into a string that a) instantly limits what the user can do, and b) forces them to add a lookup field to create the array that a link field could have passed?

2 Likes

Oops. I must have gotten distracted.

I agree.

Notice that lookup fields have the exact same problem, only in reverse.

People expect lookup fields to behave like strings in formulas, but they don’t. We have to manually convert lookup fields to strings in the formula if we want to use them in functions that expect string inputs.

Maybe Airtable was trying to avoid this confusion, which is far more common than the issue of trying to flatten arrays of linked records.

I wish there were a way for formula fields to automagically do type conversion for linked record and lookup fields. If the value is inside an array function, convert it to an array of values, otherwise convert it to a string. But I guess that it is just too hard to tell with rollup formulas.

2 Likes

Wow!! It’s working!!
@kuovonne @Justin_Barrett and @ScottWorld Thank you so much for your help!!

Today I’ve learned a lot about Airtable, and yet I feel as there are so many things I still don’t know :smiley:

2 Likes

This is a great example of why I wish Airtable had some sort of a product manager who was paying attention to all of these things, and then addressing these flaws in the product.

Or at the very least, addressing these issues in the formula documentation.

Otherwise, we’re just left trying to to figure all of this out for ourselves.

This topic was solved and automatically closed 3 days after the last reply. New replies are no longer allowed.