Aug 27, 2020 03:11 PM
Is there a script to deduplicate content, then select the rows to keep based on a timestamp or another count field?
E.g., in a table of social media data the content is duplicated, but the dedupe block doesn’t let me group content and then keep only the most recent record.
Aug 27, 2020 06:52 PM
If you’re looking to keep the most recent record by creation time and you don’t need to do any complex merging, you can actually do this with the dedupe block by using the sort feature.
If you sort by “Newest first”, you can use keyboard shortcuts to quickly process your records:
The sort criteria will be preserved for each set of records, so you’ll always keep the latest record. This isn’t quite as efficient as a script that automatically deduplicates all records as a bulk operation, but it may be sufficient for your needs in the meantime.
Aug 27, 2020 07:12 PM
Unfortunately, the timestamp is unrelated to creation time; it’s based on tweet time. Is there a way to sort by fields other than those in the current Sort By dropdown?
Aug 27, 2020 07:24 PM
Unfortunately, this isn’t possible, so a script would be the way to go. What constitutes a “duplicate record” in your particular use case?
Aug 28, 2020 03:46 AM
Records are duplicates when the body of the text matches exactly. I’d like to keep only the most recent one based on a date field (this date doesn’t correspond to “newest”) or on another count field. The script should be similar in either case. Thanks for your help!
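The rule described above (group records by exact text match, keep the one with the latest date, delete the rest) could be sketched roughly as follows. This is only an illustration of the selection logic, not a finished Airtable script: the `Row` shape, the field names `text` and `postedAt`, and the function name `findStaleDuplicates` are all hypothetical, and dates are assumed to be ISO 8601 strings.

```typescript
// Hypothetical row shape; the field names are illustrative, not from the thread.
interface Row {
  id: string;       // record ID
  text: string;     // body text used to detect duplicates (exact match)
  postedAt: string; // ISO 8601 date string taken from the date field
}

// Returns the IDs of rows to delete: within each group of rows that share
// identical text, every row except the one with the latest postedAt.
// On a tie, the row seen first is kept.
function findStaleDuplicates(rows: Row[]): string[] {
  const newestByText = new Map<string, Row>();
  const toDelete: string[] = [];

  for (const row of rows) {
    const kept = newestByText.get(row.text);
    if (kept === undefined) {
      // First time this text is seen: keep it for now.
      newestByText.set(row.text, row);
    } else if (Date.parse(row.postedAt) > Date.parse(kept.postedAt)) {
      // This row is newer: the previously kept row becomes stale.
      toDelete.push(kept.id);
      newestByText.set(row.text, row);
    } else {
      // This row is older than (or tied with) the kept one.
      toDelete.push(row.id);
    }
  }
  return toDelete;
}

// Small made-up sample to show the behavior.
const sampleRows: Row[] = [
  { id: "rec1", text: "hello", postedAt: "2020-08-01T10:00:00Z" },
  { id: "rec2", text: "hello", postedAt: "2020-08-05T10:00:00Z" },
  { id: "rec3", text: "world", postedAt: "2020-08-02T10:00:00Z" },
  { id: "rec4", text: "hello", postedAt: "2020-08-03T10:00:00Z" },
];

const staleIds = findStaleDuplicates(sampleRows);
console.log(staleIds); // ["rec1", "rec4"]
```

In an Airtable script, the rows would presumably come from `table.selectRecordsAsync` with `record.getCellValue` for each field, and the returned IDs could be passed to `table.deleteRecordsAsync` in batches of at most 50. To keep the record with the highest count instead, the date comparison could be swapped for a numeric comparison on the count field. It may be worth running this against a copy of the table first, since deletion is destructive.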