May 18, 2020 02:03 PM
Hi Airtable!
I have a base that I created in which I will be adding Notes from Apple.
There are duplicate records, but there is no key, except for a formula to create different record ID’s. This ID is unique for all records, including duplicates.
My question is how do I create a unique key, when each of the fields are not unique in my base.
Fields:
DATE *these dates are not all different
SUBJECT *some are the same
NOTES *the titles are either too varied or the same
TIME CREATED *some of the same records will have different times
FORMULA *creates a different record ID, even if it’s a duplicate
Perhaps the formula can be adjusted to create the same ID for all duplicates, based on another field?
If so, I don’t quite know how to do that.
I have created a share link to my base, I thought it might be easier.
I would appreciate any help.
Thank you,
Mary Kay
P.S. For some reason my share link looks like this. If you go to the last record, there’s a link (View larger version) to view the whole base.
I set up this link correctly and it views fine when I select preview in my base, so I am not sure what’s going on.
Solved! Go to Solution.
May 19, 2020 04:22 AM
Hi @kuovonne
Yes! This would work.
I can use the field name “Duplicate?” as my key, to identify which records can be deleted in Integromat.
Thank you so much for your help.
Mary K
May 19, 2020 10:41 PM
Here’s the writeup. Note: I changed some field/table names from my original screen capture.
@M_k I created this solution using a copy of the base that you shared. Since it is your data, I didn’t want to share my copy of the base without your permission. Please let me know if I have your permission to share my copy of your base publicaly.
A base has duplicate records, and the user needs a way of identifying and deleting records without using the Dedupe block or writing code.
This method relies on a combination of techniques:
Once each record has a copy if all the records in the table, a formula field can tell if the record is a duplicate or not.
You need to create a new table with a single record. All of the records in the original table will need to be linked to this single record.
This table will have three fields:
You need to create several fields in the original table.
Non-formula fields:
Linked record field to the single record in the new [MasterLookup] table. You will need to create this link for every record in the table.
Lookup of the {Composite Key + Record Id} field in the new [MasterLookup] table
Formula Fields:
{Composite Key} formula field, a concatenation of all the fields that must match for a record to be a duplicate (used to identify if a record is a duplicate) example: DATE & SUBJECT & NOTES
{Composite Key + Record Id} formula field, {Composite Key} & RECORD_ID()
, a concatenation of all the fields must match, plus RECORD_ID()
at the end (used to identify if a record is the “first” in a set of duplicates)
{Duplicate?} field that determines if the record is a duplicate that can be deleted
In order for this method to work, you must link all the records in the original table to the single record in the [MasterLookup] table. Here is the fastest way to link existing records:
This field ties everything together.
First the formula checks if the record is linked to the master table. If it isn’t, then the formula gives a warning.
Then the formula searches the combined data for all the records for both its {Composite Key}, and its {Composite Key + Record ID}, and returns the position of each string. It will find both of them, because the {Composite Key} is a subset of the {Composite Key + Record ID}.
If the two positions are the same, it is the “first” record, and should not be deleted. If the two positions are different, some other record must have the exact same composite key, but appear “earlier” in the array. Thus, this record is a duplicate and can be deleted.
Note that the {ALL_COMPOSITE_KEY+RECORD_ID} lookup field needs to be concatenated with an empty string to convert it from an array to a string.
IF(MasterLookup,
IF(
FIND({COMPOSITE_KEY+RECORD_ID},
{ALL_COMPOSITE_KEY+RECORD_ID} & ""
)
=
FIND(COMPOSITE_KEY,
{ALL_COMPOSITE_KEY+RECORD_ID} & ""
),
BLANK(),
"duplicate"
),
"Record needs to be linked"
)
This method is very resource intensive and can take a long time, especially as the number of records increases. The amount of data to be processed is the square of the number of records. If there are 1,000 records, an array of all 1,000 records needs to be stored in each of them, resulting in 1,000,000 records worth of info in the lookups alone! If there are many records, be prepared to wait a long time for Airtable’s calculations to catch up.
To delete duplicate records, create a view with filters to show only records that have “duplicate” in the {Duplicate?} field. Then delete all the records in that view.
You can automate the duplication with a 3rd party integration, such as Integromat or Zapier by having it delete new records in the view.
If you will be adding new records to the table that might be duplicates, those new records will also need to be linked to the [MasterLookup] table. You can also have an Integromat or Zapier to create those links for you. Create a new view with a filter to only include records where the {MasterLookup} linked field is blank, and create the link.
May 20, 2020 12:44 AM
Hi @kuovonne
Thank you so for the writeup. I really appreciate it.
You can share the table publicly and this way I can visually see how everything is setup.
I was also wondering if you could provide an example of the first base option with the “Duplicate?” formula?
I have a little over 400 records, since I have the free plan which allows me to have 1,200 records and 5 megabytes of data, will I stay within the limitations?
Thank you,
Mary K
May 20, 2020 01:18 PM
@M_k Here’s a link to the shared base.
The formula for the {Duplicate?} field in my screen capture is the same as the formula that I included in my post. I just have the other fields hidden in that screen capture. (I had a bunch of other temporary fields that I was experimenting with when coming up with this solution, and I didn’t want to clutter up the screen shot.)
You don’t need to worry about the record limit, as this method only creates one new record.
I’m not sure about your “5 megabytes of data” comment. The Plus plan has a limit of 5 GB (not MB) of attachment space. But you said that you are on the free plan with a 1,200 record limit and 2 GB of attachment space. If you are referring to the attachment space limit, you don’t need to worry about that either–this base doesn’t have any attachments.
My comments about this method being resource intensive apply to the amount of data that needs to be processed by Airtable in all the lookups and formulas. When Airtable has to perform lots of calculations, it can take a while for the calculated fields to catch-up. When I tested this system with your table of only 76 records, it was reasonably fast, a few seconds at most. However, that’s only 76x76=5776 records worth of data to push through. With 400 records, there would be 400x400 copies of the records, which is 160,000 records worth of data. That’s actually not a lot of storage space, because each record is only text strings, which take an infinitesimal about of storage compared to things like video files. It is the processing of that data that takes time.
I’d appreciate it if you report back on how this method works for you, and how long it takes. (Of course, if you’re using Integromat, you might have spend more time waiting for Integromat than waiting for Airtable.)
By the way, for anyone else listening to this conversation, I want to warn them that this method completely ignores any fields that are not included in the {Composite Key} field. In this base, that doesn’t matter, because the {Composite Key} field includes all the user editable fields in the table.
May 21, 2020 01:49 PM
Hi @kuovonne
Thank you so much for your help.
I am just starting to get into the tables setup. I am noticing that when I link a record, I get the field name called “unknown record” instead of “Link All Records” for all the records.
When I manually enter a test record, in Table 1, I select “Link to a record in the MasterLookup table”, then I link it.
I am not sure why I am getting “unknown record”. Unless, I am not linking the records correctly.
I should mention that the table that this is occurring in, is a duplicate base, without records. I am not sure if that makes a difference. The reason I duplicated the base is the first base was just a test.
Thank you,
Mary
P.S. I mentioned the wrong attachment size capacity. You are correct, it should be 2 GB of space.
Mary K.
May 21, 2020 02:51 PM
Mary K., Can you share screen captures of both tables?
May 21, 2020 03:45 PM
Hi @kuovonne
I figured it out!
I just typed “Link All Records” in the first field, in the “MasterLookup” table and it worked.
Now I just need to do a test run with my Integromat scenario, so I will see how it everything works out.
Thank you!
Mary K.