Help filtering through an array to merge duplicates

Hey there,
I would love a bit of help with an issue I have been struggling with:

I have large data set containing keywords / descriptive tags to describe/classify the designs on our webshop juniqe.com, they are stored on a table called Image Keywords in my base, whereas the designs they are related to are on a table called DESIGNS

I wrote the script below to merge the duplicate entries in the keyword table, since these records hold much information other than the linked record to DESIGNS. I’d like to believe that the script does the following

  • cycle through all the records and compare the current records main field to the others, using the field Name (EN) as comparison value (the keywords english translation)
  • If a duplicate has been found, combine the linked records of both the instances the script has found, and write them back to the first instance
  • Clear the linked record field of the second instance (the comparison record) and mark that record with a checkbox field called DUPE to indicate a duplicates. My intention was to clear the duplicates later manually after having reviewed and verified the script works.

The code does do in parts what it’s supposed to do - it certainly calculates and runs forever.
However, in the end, it produces some unexpected results with records that are marked as Duplicates but still contain records in the linked record field.
So, something doesn’t work as expected, I have a feeling I am missing a trivial but essential logical element of the loop, but by god, I can’t figure it out.

This is the code, any help appreciated, it seems to combine and reconstruct the arrays correctly but i feel the way the two loops are designed must be flawed somehow:


function onlyUnique(value, index, self) {
  return self.indexOf(value) === index;
}

// GET TABLES AND DEFINE VARIABLES

let table = await base.getTable("Image Keywords");
let query = await table.selectRecordsAsync({fields:["Designs - Vision AI","Name (EN)","DUPE"]})
let lower ="l";
let id = "123";


// OUTER LOOP -> LOOP THROUGH ALL KEYWORDS, GET CURRENT KEYWORD
for (let count = 0; count < query.records.length; count++) {

// GET THE TRIMMED LOWER CASE KEYWORD AND RECORD ID
lower = query.records[count].getCellValueAsString("Name (EN)").toLocaleLowerCase().trim();
id = query.records[count].id

// INNER LOOP -> COMPARE CURRENT KEYWORD TO THE REST OF THE ARRAY
for (let countlower = count+1; countlower < query.records.length; countlower++) {
    
// IF OF RECORD OF INNER LOOP MATCHES THE CURRENT KEYWORD, ISN'T THE SAME RECORD AND HASN'T PREVIOUSLY MARKED AS DUPE, ESTABLISH A DUPLICATE HAS BEEN FOUND
    if (lower==query.records[countlower].getCellValueAsString("Name (EN)").toLocaleLowerCase().trim() && id!=query.records[countlower].id && query.records[count].getCellValue("DUPE")!=true && query.records[countlower].getCellValue("DUPE")!=true) {
            
            console.log("dupe ",lower,"-",query.records[countlower].getCellValueAsString("Name (EN)").toLocaleLowerCase().trim())
            let designarray = query.records[countlower].getCellValue("Designs - Vision AI")
            let designarraysec = query.getRecord(id).getCellValue("Designs - Vision AI")
            console.log(designarray,designarraysec)

// COMBINE LINKED FIELDS OF BOTH RECORDS, DEPENDING ON WHICH ONE OF THEM MIGHT BE EMPTY
            if (designarraysec!=null && designarray!=null){designarray = designarray.concat(designarraysec)}
            if (designarray==null && designarraysec!=null){designarray = designarraysec}
            if (designarray!=null && designarraysec==null){}
            
// IF BOTH ARE EMPTY, MARK INNER ARRAY'S DUPLICATE RECORD AS DUPE=TRUE            
            if (designarray==null && designarraysec==null){
                        
                        console.log("no records")
                        await table.updateRecordAsync(query.records[countlower].id,{
                            
                            "DUPE":true
                        
                        });

                    }

                    else {

// IF EITHER ARRAYS ARE NOT EMPTY, RE-CONSTRUCT THE COMBINED ARRAY...

                        console.log (designarray)
                        
                        let newarray = new Array();
                        

                        let unique = [...new Set(designarray.map(keyword => keyword.id))];


                        for(let record of unique) {
                            newarray.push({"id":record})
                        }

// ...THEN UPDATE THE OUTER LOOP'S RECORD WITH THE COMBINED ARRAY ...
                        await table.updateRecordAsync(id,{
                                
                                "Designs - Vision AI":newarray
                            
                            });

// ...AND UPDATE THE INNER LOOP'S DUPLICATE TO SET DUPE=TRUE AND CLEAR ALL LINKED RECORDS
                        await table.updateRecordAsync(query.records[countlower].id,{
                            
                            "Designs - Vision AI":[],
                            "DUPE":true
                        
                        });

                    }

        }
    }
}```

Is it possible some of the primary fields in your records are empty? Unlike every other field, the primary one never returns null but ‘Unnamed record’. That’s exactly 14 characters, which I know because I’ve spent weeks hunting bugs related to this. Not sure if this behavior is documented but no native outputing method picks it up.

The most direct way to check for this is to use Object.getOwnProperty[Names||Descriptors] on your suspect records. This will reflect all properties of the object back to you. In theory, proxy traps would work even better for this kind of a bug hunt because you could configure them to only catch problematic behavior and report it, thus not really affecting your already problematic performance/runtime. But in practice, you probably have enough on your plate for now. :sweat_smile:

Also, generally speaking, when your comments start sounding like a tourist guide in terms of what’s left, right, upper, or inner, then it’s probably time to stop looping and start looking into alternative techniques. Loops are super versatile and accessible, but they 're like the hammer from that saying about every problem looking like a nail and whatnot.

The fact that you managed to get this to kind of work with just for…of loops confirms that you’ve been ready to learn about more approaches, beyond the imperative one (procedural,OOP, functional programming all have easier methods of dealin with this kind of problem).

If you want to take another jab at debugging this, though, I suggest cutting down on your variable reference names and giving code some lines to breathe, i.e., improving readability.

this was surprisingly helpful.

If anyone needs, this code now does what it’s supposed to do and merges duplicates records along with their linked record fields



function onlyUnique(value, index, self) {
  return self.indexOf(value) === index;
}

// GET TABLES AND DEFINE VARIABLES

let table = await base.getTable("Image Keywords");
let query = await table.selectRecordsAsync({fields:["Designs - Vision AI","Name (EN)","DUPE"]})

let designarray = new Array();
let dupearray = new Array();
let newarray = new Array();                    


// OUTER LOOP -> LOOP THROUGH ALL KEYWORDS
for (let count = 0; count < query.records.length; count++) {

    if (query.records[count].getCellValue("DUPE")!=true) {
// GET THE TRIMMED LOWER CASE KEYWORD AND RECORD ID
    let currentkeyword = query.records[count].getCellValueAsString("Name (EN)").toLocaleLowerCase().trim();
    let id = query.records[count].id

// ADD LINKED RECORDS TO DESIGN ARRAY
    if (query.records[count].getCellValue("Designs - Vision AI")!=null){designarray = query.records[count].getCellValue("Designs - Vision AI")}
    
    console.debug(currentkeyword)

// FILTER ARRAY TO FIND ALL DUPLICATES
    let duplicatefilteredRecords = query.records.filter(keyword => {
        return keyword.getCellValue('Name (EN)').toLocaleLowerCase().trim()==(currentkeyword) && keyword.id!=(id)
    })

// CONTINUE IF DUPLICATES COULD BE FOUND
    if (duplicatefilteredRecords.length!=0){
    
    console.debug(duplicatefilteredRecords);


//LOOP THROUGH ALL DUPLICATES
            for (let countduplicates = 0; countduplicates < duplicatefilteredRecords.length; countduplicates++) {

// ADD LINKED RECORDS TO DESIGN ARRAY
                if (duplicatefilteredRecords[countduplicates].getCellValue("Designs - Vision AI")!=null){designarray = designarray.concat(duplicatefilteredRecords[countduplicates].getCellValue("Designs - Vision AI"))}

// ADD DUPLICATE'S UD TO DUPE ARRAY
                dupearray.push({"id":duplicatefilteredRecords[countduplicates].id,"fields":{"DUPE": true}});

            }
    
    console.log(currentkeyword.toLocaleUpperCase(),"-> total duplicates:",dupearray.length,"total designs:",designarray.length)

    }
    
        
// UPDATE DUPLICATE RECORDS
    await table.updateRecordsAsync(dupearray);

// REMOVE DUPLICATE DESIGN ENTRIES FROM DESIGN ARRAY
    newarray = [];
    let unique = [...new Set(designarray.map(keyword => keyword.id))];
    for(let record of unique) {
        newarray.push({"id":record})
    }

// UPDATE MAIN KEYWORD WITH MERGED ARRAY

      console.log(newarray)  
      await table.updateRecordAsync(id,{
                                
        "Designs - Vision AI":newarray
                            
        });
        
    }

}

console.log("FINITO!")


This topic was solved and automatically closed 3 days after the last reply. New replies are no longer allowed.