Skip to main content
Solved

Delete duplicates (old records and not new ones)


  • Known Participant
  • 27 replies

Hello, the default Delete duplicates extension is awesome to remove with one click all duplicated records at once but have a little problem, its removing new records and keep first one, how i can invert the process and keep the last record instead of forst ? remove then all old records and not new.


here is the default code :


let settings = input.config({

    title: 'Delete duplicates',

    description: `This script will delete duplicate records in a given table according to the value of

two input fields. Duplicate records are detected when they contain the same cell value for each identifying

field. For any two records that are considered duplicates, it will use a third comparison field to determine

which of the two records should be deleted.`,

    items: [

        input.config.table('table', { label: 'Table' }),

        input.config.field('firstIdField', {

            parentTable: 'table',

            label: 'First identifying field',

        }),

        input.config.field('secondIdField', {

            parentTable: 'table',

            label: 'Second identifying field',

        }),

        input.config.field('comparisonField', { parentTable: 'table', label: 'Comparison field' }),

    ],

});



let { table, firstIdField, secondIdField, comparisonField } = settings;



// Airtable limits batch operations to 50 records or fewer.

let maxRecordsPerCall = 50;



function choose(recordA, recordB) {

    let valueA = recordA.getCellValueAsString(comparisonField);

    let valueB = recordB.getCellValueAsString(comparisonField);

    return valueA > valueB ? { keep: recordA, discard: recordB } : { keep: recordB, discard: recordA };

}



let existing = Object.create(null);

let toDelete = [];



// Part 1: Identify duplicate records in need of deletion

//

// We don't modify the table contents in this Part in the interest of

// efficiency. This script may trigger a large number of deletions, and it's

// much faster to request that they be done in batches. When we identify a

// record that should be deleted, we add it to an array so we can batch the

// operations in Part 3 of the script.

let query = await table.selectRecordsAsync({

    fields: [firstIdField, secondIdField, comparisonField],

});



for (let record of query.records) {

    let key = JSON.stringify([

        record.getCellValue(firstIdField),

        record.getCellValue(secondIdField),

    ]);



    // If we've already encountered a record with identical field values,

    // either that record or the current record need to be removed.

    if (key in existing) {

        let { keep, discard } = choose(record, existing[key]);

        toDelete.push(discard);

        existing[key] = keep;



        // If this is the first time we've observed a record with these

        // particular field values, make a note of it so we can recognize

        // duplicates as we iterate through the rest.

    } else {

        existing[key] = record;

    }

}



// Part 2: Verify

//

// Inform the script's user of the changes to be made and await their

// confirmation.

output.markdown(`Identified **${toDelete.length}** records in need of deletion.`);



let decision = await input.buttonsAsync('Proceed?', ['Yes', 'No']);



// Part 3: Execute the necessary operations



if (decision === 'No') {

    output.text('Operation cancelled.');

} else {

    output.text('Applying changes...');



    while (toDelete.length > 0) {

        await table.deleteRecordsAsync(toDelete.slice(0, maxRecordsPerCall));

        toDelete = toDelete.slice(maxRecordsPerCall);

    }



    output.text('Done');

}

Best answer by Alexey_Gusev

JAN11 wrote:

Thank you !


any tips to :




  1. Skip the validation question and execute the action directly


  2. execute the script from time to other to clear the base ?



first,

‘lite way’

change line let decision = await inpu.... to

let decision = 'Yes'

OR, ‘more brutal’, change it to

while (toDelete.length) await table.deleteRecordsAsync(toDelete.splice(0, maxRecordsPerCall))

and remove all code under this line


second,

set up automation with desired schedule

‘hardcode’ these 4 variables on the left side, in input parameters, like this



Omit let settings= part and start with

let { table, firstIdField, secondIdField, comparisonField } = input config()


change output.markdown to console.log (you can also just remove this line, same as any other output, automation supposed to work without user interaction)

note that astericks will be seen as usual characters, without effect of bold number of records to delete. but I would left it all ‘as is’ for logging purposes.


p.s. I would think twice before implement such solution - I mean delete records without confirmation and review. it feels like increasing risk of losing a piece of ‘good data’.

View original
Did this topic help you find an answer to your question?

4 replies

Alexey_Gusev
Forum|alt.badge.img+12
  • Inspiring
  • 1111 replies
  • September 12, 2022

Hi,


return valueA > valueB ? { keep: recordA, discard: recordB } : { keep: recordB, discard: recordA };


here, try to change the sign (or swap the order of brackets)


  • Author
  • Known Participant
  • 27 replies
  • September 12, 2022

Thank you !


any tips to :




  1. Skip the validation question and execute the action directly


  2. execute the script from time to other to clear the base ?


Alexey_Gusev
Forum|alt.badge.img+12
  • Inspiring
  • 1111 replies
  • Answer
  • September 12, 2022
JAN11 wrote:

Thank you !


any tips to :




  1. Skip the validation question and execute the action directly


  2. execute the script from time to other to clear the base ?



first,

‘lite way’

change line let decision = await inpu.... to

let decision = 'Yes'

OR, ‘more brutal’, change it to

while (toDelete.length) await table.deleteRecordsAsync(toDelete.splice(0, maxRecordsPerCall))

and remove all code under this line


second,

set up automation with desired schedule

‘hardcode’ these 4 variables on the left side, in input parameters, like this



Omit let settings= part and start with

let { table, firstIdField, secondIdField, comparisonField } = input config()


change output.markdown to console.log (you can also just remove this line, same as any other output, automation supposed to work without user interaction)

note that astericks will be seen as usual characters, without effect of bold number of records to delete. but I would left it all ‘as is’ for logging purposes.


p.s. I would think twice before implement such solution - I mean delete records without confirmation and review. it feels like increasing risk of losing a piece of ‘good data’.


  • Author
  • Known Participant
  • 27 replies
  • September 19, 2022

Thanks a lot @Alexey_Gusev


Reply