I’ve been playing around some with embedding data-validation code inside tables; although this approach does not truly enforce good data, it can at least flag invalid data at the time of entry, allowing the user to correct the mistake at the time it was made, rather than be forced to locate it later. While it may not be an optimal solution, something similar could easily be added to your contacts base to ensure each new name doesn’t already exist.
You can find a demo base with example code here.
The only change to your workflow required is that when entering a new contact record, you’ll need to link it to the ‘DeDupe’ table. As this table contains but a single row, all you need do is select the plus sign (’+’) in the linked record field and, when the pop-up window providing drill-down access to ‘DeDupe’ records opens, select the first (and only) record. After a brief moment of cogitation (a second or two on the demo base, which contains 995 entries), one of two things will happen: One, nothing, meaning the contact you just entered is not a duplicate, or two, a flag will appear in the ‘Dupe?’ column, meaning the entry you just entered might be a duplicate.
I say ‘might’ because it’s up to you to vet the two similar entries – and because in this implementation I match (GivenName&Surname), ignoring middle initials. If you decide ‘John Q. Public’ and ‘John R. Public’ should both remain on the list, tick the ‘DupeOK’ checkbox; this will append RECORD_ID() to ‘MatchName’ and clear the ‘Dupe’ flag.
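To make the matching rule concrete, here is a minimal Python sketch of how the ‘MatchName’ key behaves as described above. It is not the Airtable formula from the demo base; the function name, its parameters, and the sample record ID are all made up for illustration.

```python
def match_name(given_name, surname, dupe_ok, record_id):
    """Build a comparison key from given name and surname only.

    Middle initials are never passed in, so 'John Q. Public' and
    'John R. Public' produce the same key. All names here are
    hypothetical stand-ins for the fields described in the post.
    """
    key = given_name + surname
    # Ticking 'DupeOK' appends the record's ID, which makes the key
    # unique and so stops it from matching any other record.
    if dupe_ok:
        key += record_id
    return key


# Two different people who collide on the key until one is marked OK.
print(match_name("John", "Public", False, "recA1"))
print(match_name("John", "Public", True, "recB2"))
```

The point of appending the record ID rather than, say, a fixed suffix is that every approved duplicate ends up with a key no other record can share.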
The routine works by creating a Rollup of ‘MatchName’ using the ARRAYUNIQUE() aggregation formula. It then compares COUNTA(Uniques) with a Count field for the linked records. If the numbers differ, it means ARRAYUNIQUE() removed at least one duplicate.
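The same comparison can be sketched outside Airtable. In this rough Python equivalent, a set stands in for ARRAYUNIQUE() and len() stands in for both COUNTA() and the Count field; the function name is hypothetical, and the sketch assumes every linked record has a non-empty ‘MatchName’.

```python
def has_duplicate(match_names):
    """Return True if de-duplicating the keys removed at least one entry."""
    uniques = set(match_names)        # plays the role of ARRAYUNIQUE()
    linked_count = len(match_names)   # plays the role of the Count field
    # If the two totals differ, at least one key appeared more than once.
    return len(uniques) != linked_count


names = ["JohnPublic", "JaneDoe", "JohnPublic"]
print(has_duplicate(names))
```

Note that this only tells you a duplicate exists somewhere in the list, not which records are involved.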
Since you know the most recently entered name is the one duplicated, it’s relatively easy to find the earlier matching record. If you’re converting an existing base that contains duplicates, though, it’s a slightly more complicated process. It turns out the 1,000 random names with which I seeded the base included both John A. [line 574, now with ‘DupeOK’ checked] and John M. [line 806] Jenkins, and it took a while to locate the duplication. (Actually, since I wasn’t positive the algorithm was correct, I opened the CSV in LibreCalc to double-check my work and ended up finding it there.)
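If you are hunting for pre-existing duplicates in an exported CSV, grouping rows by their match key is one way to find them without scanning by eye. This is only a sketch of that idea, not what I did in the spreadsheet; the function name and the (row number, key) input shape are assumptions, and the ‘JaneDoe’ row is invented for the example.

```python
from collections import defaultdict


def duplicate_groups(rows):
    """Group row numbers by match key and keep keys seen more than once.

    `rows` is an iterable of (row_number, match_key) pairs.
    """
    groups = defaultdict(list)
    for row_number, key in rows:
        groups[key].append(row_number)
    # Only keys shared by two or more rows are potential duplicates.
    return {key: nums for key, nums in groups.items() if len(nums) > 1}


rows = [(574, "JohnJenkins"), (700, "JaneDoe"), (806, "JohnJenkins")]
print(duplicate_groups(rows))
```

Each key in the result lists every row that shares it, so you can vet the whole group at once.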
To convert an existing base
First, create the ‘DeDupe’ table and add the ‘MatchName’, ‘DupeOK’, and ‘Check’ fields to your ‘Contacts’ table. ‘Check’ should be defined as a text field. Return to the ‘DeDupe’ table and enter any arbitrary name in the primary field of the first row. (I used the ‘White Heavy Check Mark’ emoticon: ✅.) Mark and copy this name, and return to the ‘Contacts’ table. Select the ‘Check’ field in the first row, navigate to the final row of the table, and, while holding down the shift key, select the last row’s ‘Check’ field and then paste (Ctrl-V). This should paste the ‘DeDupe’ record name into the ‘Check’ field of every ‘Contacts’ record. Finally, reconfigure ‘Check’ to be a Linked Record field linking to ‘DeDupe’. This will result in every ‘Contacts’ record being linked to the single entry in ‘DeDupe’. Add the ‘Dupe’ field to ‘Contacts’, and everything should be ready to go.
There: A workable, if not exactly elegant, solution to your problem, one built entirely in Airtable, without need for third-party middleware.
- For instance, in the demonstration base that accompanies this reply, whenever a Person’s record is updated to show what Size (e.g., S, M, MT) he or she wears of a certain clothing Type (e.g., Men’s Sweatshirt, Women’s Shorts), a quick sanity check is run to make sure the specified Size is valid for that Type.
- I was blessed with very clean data, since it consists of 1,000 uniform records created by fakenamegenerator.com (less five, that is, to stay below the 1,000-record limit on free accounts); I also didn’t have to contend with titles or suffixes. If yours isn’t so clean, doesn’t come prepackaged into orderly FirstName/M.I./LastName chunks, or occasionally and unpredictably includes titles, prefixes, or suffixes, you’ll need to scrub it a bit. You can find a set of routines designed for name processing here.
As originally posted, this implementation allowed each given name/surname pair to be duplicated only once; should your client list have contained, say, a suspicious number of ‘John Doe’s, you would have had to change ‘DupeOK’ to a text or number field and enter a unique deduplication value for each. I recently modified the demo base to support an unlimited number of permitted duplicates per given name/surname pair, so this limitation no longer applies.