Scripting to identify duplicate name records

Kongtho_Imarith · ‎Dec 25, 2023

Hello,

In my Airtable's "PEOPLE" Table, we stored many people's name records—more than 15,500 records. I aimed to identify those name records whose spelling is the same as any particular person's name. I developed the following scripting codes using the Scripting Extension in Airtable. When I tested it with another table with a small number of name records, less than 10, it worked well. However, when testing the code with a table with many people's names, the script's running was not complete. This issue could be related to the large number of records in the Table. Please note the following regarding my "PEOPLE" Table: 1) Table name: PEOPLE, 2. Primary field name: English Name, and 3. "Duplicate Name" is a checkbox field, which means if a tick is created for any name, that name has more than one record. Please see my scripting codes below:

//Read our table
var table = base.getTable('PEOPLE');
var query = await table.selectRecordsAsync();

//Identify duplicates
//loop through all of the records

I look forward to hearing from all of you who can help with this.

kuovonne · ‎Dec 25, 2023

What is your level of experience writing scripts and what is your level of interest in learning to write better scripts?

What error message are you seeing when you run the script on a larger data set?

When using table.updateRecordsAsync(), you can have a maximum of 50 records. This might be your issue.

This script selects the checkbox for the records that are duplicates, but you still have to find the duplicate of duplicates to do something with them.

If the checkbox is already selected for a record, the script does not clear the checkbox if there is no duplicate.

This script has nested looping, which means that as the record count increases, the run time dramatically increases. A more efficient version would not have nested looping.

Is there a reason you want to use this script instead of the dedupe extension?