Re: Get rows with unique values

Stephan_Strittm · ‎Aug 31, 2017

I have a table with several entries. One Column is an ID which could be used several times (not unique).

Now I would like to retrieve a list of all records with just one unique entry in column ID. Somehow like an inverted duplication-checker.

I wanted to solve this using a filerByFormula but have no idea how to do it (I miss the group by operator of SQL at this point).

How could I manage this using the API?

Chester_McLaugh · ‎Aug 31, 2017

While not exactly a “Group By” operator, here’s two approaches I’ve used:

A) Use your programming language to create a list of distinct IDs from a list of all the record IDs. Use the “fields” parameter to specify that you only want the API to return the ID field/column to speed things up and cache the results since, depending on your table size, this request could take 10s of seconds.

B) This requires creating a “reference” table with a “basket” field and a “distinct” field.

Create table “reference”
Create field “basket” that is a link back to your first table
Create field “distinct” that is a rollup that uses the ID field and ARRAY_UNIQUE(values)
To get the “distinct” ID’s from the field in the reference table, you’ll first need to fetch all records from your IDs table where the reciprocating linked field (called “reference”) is FALSE() and update them so they’re linked.
then you can simply query the value of the “distinct” field of the “reference” table!

Ug… And this is why Airtable will never compete with any other data source as a backend unless the API (and probably underlying data storage engine) is overhauled OR unless you develop your own API interface and caching layer that can make some of these basic scenarios, well… basic. The Airtable interface ROCKS! And the API is great, but it’s just not built or optimized to replace a relational database backend.

Tuur · ‎Sep 01, 2017

I don’t think there are other options either…

Stephan_Strittm · ‎Sep 01, 2017

Many thanks for your replies! I think I will fetch all records and filter then in the java backend. I had the little hope that I could reduce the amount of data to fetch.

W_Vann_Hall · ‎Sep 01, 2017

OK, I’m confused.

When you say

do you mean

All records with one or more entries in {ID} where all entries are identical

or

All records where the contents of {ID} are not found in any other record?

If the latter, I would think in Chester’s example, rows 1 and 4 should be distinct but not rows 2 and 3.

Of course, there’s always option

Something else entirely.

Which is the right interpretation?
. _____________

edit

Not an API solution, but it’s relatively easy to identify uniques by adding a couple of columns to the base, if that’s a possibility…

Stephan_Strittm · ‎Sep 04, 2017

Sorry for the confusion,

It is the option 2. I am looking for.
Your example is great. This could reduce my amount of rows to load. I will investigate in it to adopt it for my project.

W_Vann_Hall · ‎Sep 04, 2017

Thanks!

Something I failed to mention is that your IDs need to be entirely unique; for instance adding a fifth record, ‘bluegreen,’ will fail because record #4 invalidly matches ‘bluegreen’. In such a case, concatenating a comma to the beginning and end of both ‘ID’ and ‘{All IDs}’ should work – as long as there are no embedded commas in ‘ID’…