I’m trying to make shortenings of company names using a regular expression, by taking taking the first letter every part of the name.
Fx ‘Great Garden Centre’ → GGC
I’m not sure if this is possible without a loop, but I’m also no buff at using regex.
I’ve sofar I can only extract the first letter using this:
Is this possible without looping through every word?
Hi there! What you did will only extract the letter G from the example you’ve provided. It’s because according to documentation REGEX_EXTRACT returns only the first occurence that matches your regex. I thought it’s not possible but I’ve played a while and I was able to achieve it by below formula:
So with REGEX_REPLACE I’m replacing every small letter with empty string first. This doesn’t seem to work as expected giving me “G G C” with spaces between letters. For that I’ve wrapped it with SUBSTITUTE that replaces spaces (" “) with nothing (”") so you get “GGC”.
This will delete all word characters that are not at the start of a word up to, and including, eventual subsequent non-word characters. “Great Garden Company” becomes “GGC” and “Not so great Company” becomes “NsgC”.
Note that, as of this writing, Airtable regular expressions are not Unicode aware, so that letters outside the English Latin subset will be considered non-word characters and thus will not be abbreviated.
Absolutely, you are right on all counts. There is only so much you can achieve with a regex considering the bewildering range of possible inputs, including incorrect forms (case in point: your regex will actually not match your first example the way you would expect it to, because that apostrophe is a typographically correct one while the regex only matches on ASCII single quotes), never mind the fact that any letter not accounted for by ASCII throws a spanner into the works, as mentioned. Personally, I’d recommend not relying on such an automatism at all and setting the abbreviation manually instead, but that is outside the scope of OP’s question, and I prefer to avoid answering “you shouldn’t be doing that” / “why would you want to do that” when I can .
One of the nicer aspects of Airtable is that you can actually combine both approaches for an existing data set. Define a formula column using a variant of my regex to get a first raw set of abbreviations, then convert that column into text. Airtable will keep the calculated values while converting, which means you can now fine tune the existing abbreviations, and any new entry will then add their abbreviation manually.
Or, if you prefer, you can keep the calculated column and add an override text column where users can manually tune (or deviate from) it, then base all display and operations on a third formula column that selects the override value if present, or the automatic one if not (I do that in some scenarios where the automatic value is right in at least 75 % of the cases). It all depends on factors like existing data set, frequency of new inputs, qualification of users inputting data etc.
On a more personal note, I wouldn’t call this “headless”: the allure of automatisms is great, and when they work, they feel like magic. It’s when you start hitting the edge cases you will come to the insight that doing the work, albeit annoying, is often superior to magic that goes wrong. At least that’s what happened to me: I have rolled back most automatisms in our system over time, sometimes falling back to the override pattern I described, but most often just saying “f**k it, not worth the hassle” (disclaimer: we have a very small, specialised team doing data entry and the volume of new inputs at any point in time is small enough for me to keep tabs on inconsistencies and lags in the data; YMMV).