Table association (keys, lookups) question

I’m a new user, so apologies if this query comes off as ignorant.

I’m trying to set up an interaction between two tables:

  1. [Books], which has a list of book titles (with full ISBNs as the primary key), and

  2. [ISBN Digits-Vendor Reference List], which is a reference table that has the first 4-8 digits (e.g. a truncated pattern — ‘978-0-2280’ — out of a full ‘978-0-2280-xxxx-x’) of a ISBN as a primary key (the “Vendor/Publisher Code”), and a second column that’s just the name of the vendor/publisher.

Like:

‘978-0-2280’ | ‘Random House’

The truncated pattern can vary in length, for example:

‘978-3-2’ | ‘Scholastic’

In [Books], I want to “look up” the vendor/publisher from [ISBN Digits-Vendor Reference List] by matching the “substring” ISBN pattern in [ISBN Digits-Vendor Reference List] against [Books]'s “full” ISBN primary key.

So any full ISBN like ‘978-0-2280-1111-1’ in [Books] can look up and pull in ‘Random House’ from [ISBN Digits-Vendor Reference List] based on the pattern of the first 8 digits.

So far, I haven’t had any luck. Is this even possible?

Thanks,
Chris

Chris,
Will the “|” (pipebar character) always precede the publisher name in the ISBN-Vendor table?
Landeye

Welcome to the community, Dan! :smiley: I don’t think the pipe character is in the data. It’s just Chris’s way of separating the fields for the purpose of his example here. The number code is in the primary field, and the vendor/publisher name is in the next field.

Welcome to the community, Chris! :smiley: What you want should be possible (I’ve done some similar stuff), though it’s a little tricky to set up. Before going down the rabbit hole, though, I need to ask two questions:

  1. Are you looking to pull anything else besides just the vendor/publisher name into the [Books] table when matching those codes? More can probably be done, but if it’s just the name, it’s a whole heckuvalot easier. :wink:
  2. Why do the truncated codes vary in length? It would make the process less of a headache if the length were consistent. On a related note, is the next character immediately after the truncated portion another number, or a hyphen? Using the Scholastic code as an example, the truncated version is ‘978-3-2’, so is the longer version something like ‘978-3-2-xxx…’? Or is it more like ‘978-3-2xxx…’? If the next character is a hyphen, that might make the matching easier even if the length varies.

Once I hear back on these items, I’ll get into my solution (assuming your answers don’t make that solution impossible).

Hi Justin, thanks. This is Chris

  1. Yep, just the vendor/publisher name from the reference table into [Books].

  2. I’m not sure why they do either, but I believe it’s just the way ISBNs get assigned. I can’t seem to add links in a reply, but you can check out the Wikipedia article “List of group-0 ISBN publisher codes” for an example. There seem to be 2, 3, 4, 5, 6, and 7-digit publisher codes — though there could very well be something I’m missing, as I’m just a dabbler in these things. The hyphen is always fixed place like this: 978-x-xx-xxxxxx-x

Let me know if that helps or if there’s any other info you need.

That helps. Not needing more than the vendor/publisher name makes the setup slightly easier, and the varying length of the codes makes the setup only slightly more complex, so I guess it balances out in some odd way. :slight_smile: Even though the hyphen is in different places depending on how many digits are in the publisher code, the publisher code is always between the second and third hyphen, so we can use that to our advantage.

First, you’re going to need a “control” table, which will simply act as a place to collect information used by the system we’re building. I’ll often name these types of tables [CTRL]. After making the table, delete two of the three default records, and put a single period into the {Name} field of the one that remains.

30%20PM

In your [Books] table, add a link field pointing to the [CTRL] table, naming it {Link to CTRL}. Link the first record to that lone “period” record in [CTRL] manually, then drag-fill the rest. You should end up with something looking like this (minus the background color; this is copied from an existing table in one of my bases).

35%20PM

All records in [Books] will need this link in order for the system to operate. To ensure that all new records get linked, you could group this view by that link field. That way each new record will receive the link automatically.

Next, go to your [ISBN Digits-Vendor Reference List] table and create a similar link field setup pointing all vendor/publisher records to that lone record in [CTRL], and group by that field to ensure that all new publisher records will be auto-linked.

With that done, we need to give that [CTRL] table some data to collect, which we’ll create on the [ISBN Digits-Vendor Reference List] table. Make a new formula field named {Code Padded}, using the following formula (substituting your actual field name in place of {Code}):

{Code} & REPT("X", 15-LEN({Code})

This will take all codes and pad them out to 15 characters with Xs. For example, the code for Random House that you listed above would look like this:

Code:        978-0-2280
Code padded: 978-0-2280XXXXX

The number 15 is somewhat arbitrary, but it just needs to be long enough to fully encapsulate any code it might be fed; i.e. the code itself before padding can’t be longer than 15 characters. If you want to mess with this number, feel free, but I’m going to use 15 for the purpose of this demo.

Make another formula field named {Publisher Padded} that does the same thing to the publisher name. However, before doing that, we need to figure out the length, so use a quick temp formula for this field for starters (again replacing {Publisher} with your actual field name:

LEN({Publisher})

Look through the results and find the highest number. For now, let’s say it’s 20. I prefer to have some extra characters on the end just in case, so I’ll go with 25. Replace that temp formula with this:

{Publisher} & REPT(" ", 25-LEN({Publisher})

That will pad all publisher names with spaces to take them up to 25 characters. We won’t see the spaces, but they’ll come in handy later.

Now move to the [CTRL] table and add a rollup field named {Padded Codes}, rolling up the {Code Padded} field we made in the publisher table, using the aggregation function:

ARRAYJOIN(values, "")

This will collect all of those padded codes and mash them together into one massive string. Now make another rollup named {Padded Publishers} that does the same with the {Publisher Padded} field, and using the same aggregation function.

Now go to the [Books] table, and add two lookup fields to pull the {Padded Codes} and {Padded Publishers} rollups from the [CTRL] table, using the same field names for convenience. This will bring these two massive strings we just made and make them available to every record.

Add a formula field named {Publisher Index} to the [Books] table using the following formula (as always, replacing field names as needed):

SEARCH(
    LEFT(ISBN, FIND("-", ISBN, 7) - 1)
    & REPT("X", 15-LEN(LEFT(ISBN, FIND("-", ISBN, 7) - 1))),
    {Padded Codes} & ""
)

This uses what we know about the ISBN code design—that the publisher code is between the second and third hyphens, and that the second hyphen is always the sixth character—to determine how much of the left end of the full ISBN we want, then pads it with Xs to 15 characters and searches for that padded version in that massive mashed string of padded codes.

What should come back is a number representing the “index” of that code in the string, with the “index” being one larger than a multiple of 15. So say that the Random House padded code was found starting at position 31. Subtract 1 and we have 30. Divide that by 15 and we get 2. Using zero-based indexing (i.e. 0, 1, 2 vs 1, 2, 3), we know that it was the third code in the list.

Because of the way that rollups work with linked records from the same table, we can guarantee that the codes and publishers will be in the same order in their respective rollups. So now that we know that the third code is a match, we can extract the matching publisher name from the {Padded Publishers} mega-string by extracting its third item, with these items being 25 characters long.

Add a final formula field named {Publisher} using the following formula:

IF(
    {Publisher Index},
    TRIM(
        MID(
            {Padded Publishers} & "",
            ((({Publisher Index} - 1) / 15) * 25) + 1,
            25
        )
    )
    "❌ NOT FOUND ❌"
)

I wrapped the extracted text in a TRIM() function to get rid of the trailing spaces, leaving only the name. I also wrapped the whole thing in an IF to check for those rare cases where there is no match, alerting you via a colorful response that a match wasn’t found so that you can track down the missing code and add it to the code table.

NOTE: I put a good chunk of this together from memory, but have plenty of reference of similar setups in my own bases, so I’m pretty sure this will all work. If you run into any hiccups, or don’t understand something about the setup, just ask.

2 Likes

Hi, Justin. This solution looks exciting, thanks!

One more thing – would it be possible to implement this workaround with {Publisher} being a multi-select? I assume there would be additional array manipulation, if so. Thanks again.

Other than needing to jack up that 25-character {Publisher Padded} field length, I don’t think anything would change with {Publisher} as a multi-select. When referencing the contents of a multi-select via formula, Airtable passes the combined choices as a single string, not an array of items.

Hi Justin,

Thanks for your feedback. Sorry for keep piling on, but I’m being told that we’re stripping out dashes in Airtable (to integrate with an external database). Would there be a way to do this without the dashes?

The full ISBN will still always be 13 digits.

The main issue is I’m not sure how the search function could match patterns of varying lengths (e.g., 978000XXXXXXX and 978003467XXXX).

It should work just fine without the dashes.

They aren’t varying lengths. Because of the Xs used for padding, each one becomes 13 characters. I don’t see an issue. Is there something else I’m missing?

I’m unsure of how to construct a search function that would match the full {ISBN} in [Books] against the {Padded Codes} lookup field that contains long mash of padded codes.

I understand that the 13-digit string is padded with Xs (and so not of varying length) but the truncated ISBN pattern (that references the publisher name) is of variable length, so how would I cut up the full {ISBN}? I would send a screenshot of my base, but I can’t seem to attach images to these replies…

I’m new at this so I apologize for my slow uptake! Thanks again.

The full process is outlined in my earlier message, including all formulas and extra fields you’ll need. If it would help, I can build a sample base with this setup in place so you can see how everything ties together, but I’m pretty busy these days, so it might take me a day or two to put it together.

EDIT: Sorry, it just clicked that without the dashes in the ISBN codes, it’s going to be trickier setting it up. Give me a few days to mull this over.

Quick question: is it possible to still store the ISBN in Airtable, including dashes, and use a formula field to remove them so you’ve got a “clean” version for connecting to your external database? That would allow the system to remain as I outlined above. It’s kinda critical for those dashes to be there so that we know how to break up the full ISBN and search against the padded partials. With nothing but numbers, it pretty much breaks the whole thing.