Help

Re: ONLY detect duplicates in a column

3504 1
cancel
Showing results for 
Search instead for 
Did you mean: 
asja_fejzic
4 - Data Explorer
4 - Data Explorer

I tried the already existing topic about duplicates but was unable to find what I am looking for.
So basically, I have a list of applicants listed chronologically according to the day they applied, and each of the applicants has their own unique ID number. What I need is to be able to see automatically if someone with the same ID applies…I don’t need to remove duplicates from the list, nor to regroup them or anything. I just need them to be marked somehow in case the same ID occurs, so that I can now immediately if someone is applying again or for the first time.
I google sheets I use the =countif(A:A,A1)>1 …is there an equivalent in airtable?

Thanks and sorry if this was already asked.

13 Replies 13
AlliAlosa
10 - Mercury
10 - Mercury

Hi there,

This can get complicated pretty quickly, but the approach I would take is this:

  • Link each applicant to a single record on a new summary table (or existing if you have this already). Depending on how you load the applicant data, you may be able to include the link to the summary table automatically for incoming data (i.e. through Zapier).
  • Rollup all of the unique ID numbers into a new field on the summary table with the formula ARRAYJOIN(values)
  • Create another rollup or lookup field on the table with your applicants to pull the above rollup field into each record on your applicant table, let’s say you call it, {All IDs}
  • Last, create a formula field on the applicant table with the formula:

(LEN({All IDs}) - LEN(SUBSTITUTE({All IDs},{Applicant ID},"")))/LEN({Applicant ID})

The above formula should give you the amount of times the {Applicant ID} appears in the rollup field of {All IDs} :slightly_smiling_face:

As @AlliAlosa mentions, this may seem complicated at first. You may want to look at the Quickstart section to the guide I reference in this post.

I would simplify the previous instructions slightly by using an aggregation formula in the rollup field; if you have a large number of applicants, it should also speed up processing a little:

  1. The first step she mentions is essential: Link every record of your [Applicants] table to a single record in a new [Summary] table. You can find several ways to do this here; chose whichever one seems simplest and most logical to you.
  2. To quote the previous reply, for step 2, "Rollup all of the unique ID numbers into a new field in the [Summary] table with the formula ARRAYJOIN(values)" Call this field {All IDs}.
  3. Define a new rollup field in your [Applicants] table called something like {Number of Applications}. It should (a) follow the link to [Summary] and rollup {All IDs} using the following aggregation formula (basically, copy-and-paste this formula into the place where you would ordinarily enter an aggregation function.):
(LEN(
    values
    )-LEN(
        SUBSTITUTE(
            values,
            {Applicant ID},
            ''
            )
        )
    )/LEN(
        {Applicant ID}
        )

This is essentially the same formula @AlliAlosa provided, but it uses one fewer field and it doesn’t require keeping a copy of the entire {All IDs} field in each record — which seems to accelerate processing somewhat.

Thank you for this! I wouldn’t have thought to use the formula in the rollup field itself - Brilliant!

Vikas_Vimal
4 - Data Explorer
4 - Data Explorer

The answer above is Brilliant. Note that with some strings, you will have to convert to a text string or the formula LEN(Value) Fails. Create a new column next to the rolled up value - Concatenate(Value,"") and use this as column in the main Formula.

Thank you guys so much for this brilliant idea, @AlliAlosa @W_Vann_Hall @Vikas_Vimal. I followed your instructions and it’s working like a charm, wouldn’t work without the concatenate formula thing, The Only thing that I can’t automate now is how to create and add the Newly created records link to the Calc field. so they are included in the Len calculations , Can you please just guide me on how to implement this?

Thank you so much for your support.

Hi there @Ahmed_Elagami! So glad you’ve got it working!

I would suggest setting up a new automation, triggered “when a record matches conditions”.

Set the conditions to be where your linked record field which links to your summary table is empty.

Set the action step to be “update record”, and update the same record which triggered the automation. Choose your linked record field as the field to update and insert the name of your summary record.

When on, this automation should catch any records that aren’t linked appropriately and add them to your summary field.

I hope this helps!

Hello again @AlliAlosa, Thankyou so much for the instructions, I just followed your steps in here, I just added( ‘.’ ) to each empty cell in the Linked records field through the automation, so the rollup is updated right away. I can even see the green rectangle update sign in all the rollup lookup fields. It’s working like a charm now, I can’t thank you enough, Merry Christmas, Happy new year.

Megan_Smith1
4 - Data Explorer
4 - Data Explorer

Thanks so much for this! Is it possible to post a quick example in airtable that can be viewed? Looks like an elegant solution, but I’m having trouble getting it to work as described.

Hello all - I’ve been trying to implement, and have gotten to the Concatenate step.

However, when I try and create the “All IDs” lookup field back in the Applicants sheet, it says “unexpected error” and doesn’t let me create the field.

Anyone else have this issue?

Could you share some screenshots of what you’re setting up? My gut says there’s a hiccup somewhere in the configuration.

Janene_Pappas
5 - Automation Enthusiast
5 - Automation Enthusiast

Hi all, I’m attempting to set this up and somewhere I’ve not got something right. @Vikas_Vimal and @W_Vann_Hall do you see where I’ve gone wrong here?

Screen Shot 2022-03-28 at 12.40.20 PM
Screen Shot 2022-03-28 at 12.38.13 PM

I am stuck in this same space as @Janene_Pappas I have done all the steps but still show a #Error on my Rollup. Did you ever figure this out Janene? In my case I am trying to find if we have the same Applicant ID.@Justin
image
image

It’s kind of surprising that I actually caught this because I wasn’t tagged correctly (it should be @Justin_Barrett, not @Justin ).

Anyway, the first issue that I see is with your {All IDs} formula field in the [Summary] table. That field should be a rollup pulling in the {Applicant ID} values from the [Applicants] table using the ARRAYJOIN(values) aggregation formula. Your current set up just mashes together the primary field values of the linked records, which won’t give you the results that you need.

However, even with that issue fixed, this probably won’t work well because the applicant IDs aren’t a consistent length. For example, the ID 12 can be found inside all of the longer IDs. Even if the other issues were fixed, it would report more duplicates than actually exist.

If the actual IDs that you plan on using are all the same length, this won’t be an issue. Otherwise you’ll need to pad those numbers so that you have a consistent length. First determine the longest ID’s length, or the longest ID that you feel would ever be used. In this case I’ll go with 8 characters, and pad extra zeroes onto the front of the ID with this formula in a field named {Padded ID}:

RIGHT(REPT("0", 8) & {Applicant ID}, 8)

Screen Shot 2022-11-24 at 8.52.47 AM

With that done, change the rollup in the [Summary] table to pull those padded IDs instead of the original IDs.

The final issue is with the {Number of Applications} rollup field aggregation formula. It’s possible that older versions of Airtable would automatically convert a single-item string array into a string, but the current version does not. This is the ultimate reason behind the error message: values is an array, but the LEN() function is only designed to operate on a string, so it’s throwing an error because it doesn’t know what to do with an array. Thankfully the fix is pretty easy, and it doesn’t require an extra formula field as @Vikas_Vimal mentioned above. All that’s needed is to take each reference to values in that aggregation formula and turn it into values & "". The final aggregation formula becomes this:

(LEN(values & "") - LEN(SUBSTITUTE(values & "", {Padded ID}, ""))) / LEN({Padded ID})

With that, the count of repeated IDs works.

Screen Shot 2022-11-24 at 8.55.42 AM