Re: Parsing Text without Split()

Bill_French · ‎Dec 24, 2019

Recently, a member contacted me directly asking for help …

I am have been searching desperately to find a clean way to break apart an address like this…
123 anywhere st, San Diego, CA, 90001

Unfortunately, there is no “clean” way to do this without a Split() method as described in detail here.

And rather than jumping to the end by showing a unified working formula (that even I would be unable to read), I approached this question by providing a framework for tackling almost any parsing challenges.

Without Split() we are forced to fabricate rather complex formulas and like many of you, my brain is incapable of juggling a lot of string pointers to create that perfect [single] formula. And therein lay the issue - we often assume we must build a single formula from the beginning. This is neither necessary nor likely to be helpful. Sure, you want to use the fewest number of fields and create concise formulas for simplicity. However, there’s plenty of time to polish the code once you have something that works reliably.

So here’s how I tend to approach these tasks.

I reduce every step of the parsing process into the smallest unit of functionality. For example, in this challenge, we need to split the four entities of the address string into separate columns. But to do that, we have to sustain an always shrinking string so we can easily peel off the leading segment with each step toward parsing completion.

Step 1

Tease out the first entity (the street address).

Step 2

Eliminate the first entity from the full address string.

Step 3

Repeat steps one and two until all four entities are separated.

As you work through the steps, a common parsing pattern will emerge. Each of the patterns become[somewhat] reusable. It always helps to do this with real data and separate fields exposing each step and the remaining text to be parsed one entity at a time.

While I laid out the stages separately, you can combine these formulas pretty easily to eliminate the interim fields Remainder1 and Remainder2. Just replace each of the remainder field references with their respecting field formulas. Working in small chunks provides a manageable framework that also serves as a good learning exercise. But most important - when tackling a complex parsing project, I recommend you eat the elephant in small stages before trying to compile a unified parsing approach.

Street Address
Take the left-most characters from the beginning of the address string until the first character before the first comma.

LEFT(Address, FIND(",", Address) -1)

Remainder1
Take the full address string and remove the left-most characters from the beginning of the address string until the first comma. This serves as a waypoint to capture what’s remaining after we parse off the first entity in the string.

RIGHT(Address, (LEN(Address) - FIND(",", Address)))

City
Using the value of Remainder1 we are now in a position to parse just like we did for the street address.

LEFT(Remainder1, FIND(",", Remainder1) -1)

Remainder2
Once again, we reduce Remainder1 to exclude the City, thus setting the stage for two final splits to reveal the state, and the zip entities.

RIGHT(Remainder1, (LEN(Remainder1) - FIND(",", Remainder1)))

State

LEFT(Remainder2, FIND(",", Remainder2) -1)

Zip

RIGHT(Remainder2, (LEN(Remainder2) - FIND(",", Remainder2)))

Unfortunately, Airtable doesn’t support a Split() method for strings. If Airtable would provide a Split() method, this challenge would be delightfully fun. :winking_face:

address = split(Address, ", ")[0]
city    = split(Address, ", ")[1]
state   = split(Address, ", ")[2]
zip     = split(Address, ", ")[3]

Merry Christmas!

Bill_French · ‎Feb 12, 2020

@BobBannanas,

You must have some pull - I have access now. Coincidence?

Thanks!

Bill_French · ‎Feb 13, 2020

Do you have any idea how we would apply {typecast: true} in the context of updateRecordAsync?

JonathanBowen · ‎Feb 14, 2020

@Bill.French - Doesn’t look like that’s an option right now.

Bill_French · ‎Feb 14, 2020

Really unfortunate. This is why I come a little unglued when I learn that a vendor has deliberately chosen to not use their own API to build new functionality. Instead of hardening the core API so that it can support many use cases, they decided to implement this as an offshoot afterthought in an entirely separate code base.

C_Boyd · ‎Feb 26, 2020

I had a use case that required just extracting the city names from addresses, but they were sometimes 1-line addresses and sometimes 2-line (e.g., with an Apt.# or Suite#). I modified @Bill.French’s technique to do it, and thought I’d share in case it helps anyone else.

Caveats:

There’s a built-in assumption that cities are all in the same state (NC) (they were for me), so it would have to be tweaked for multiple states.
It requires the original addresses to have commas in between Street Address, Street Address Line 2, City, and State.

Here’s what it looks like:

Here are the formulas:

Start with {Address_Full} …then…

{FX_Street_Address} =
LEFT({Address_Full}, FIND(", ", {Address_Full}) -1)

{FX_Remainder} =
TRIM(RIGHT({Address_Full}, (LEN({Address_Full}) - FIND(",", {Address_Full}))))

{FX_Remainder_Split_a} =
TRIM(LEFT(FX_Remainder, FIND(", ", FX_Remainder) -1))

{FX_Remainder_Split_b} =
TRIM(RIGHT(FX_Remainder, (LEN(FX_Remainder) - FIND(",", FX_Remainder))))

{City_Extracted} =
IF(LEFT(FX_Remainder_Split_b,2)=‘NC’,{FX_Remainder_Split_a},LEFT(FX_Remainder_Split_b, FIND(", ", FX_Remainder_Split_b) -1))

Here’s a commenter link to the sample base (let me know if I need to share differently…I’m new to sharing examples).

Matt_Lawrence · ‎Aug 07, 2020

@Bill.French I owe you a beer! I’m guessing you just saved me hours of work, thank you!

Stefan_Baxter · ‎Feb 24, 2021

I’d really like to know why split is not simply added to Airtable.
Long discussions on workarounds seem like the wrong discussion to be having.

Justin_Barrett · ‎Mar 06, 2021

Welcome to the community, @Stefan_Baxter! :grinning_face_with_big_eyes:

My gut says that it’s because split() functions/methods create arrays. That means Airtable would need to support array parsing/indexing, which it currently doesn’t do (Airtable’s array support is bare-bones at the moment). In other words, the issue isn’t strictly with adding split() as much as it is with adding all of the other array support that the output of split() would require.

Unfortunately it’s the only type of discussion we as users can have at the moment. A split() function has been requested before, so all we can do in the meantime is work around its absence.

Ryan_Coyle · ‎Jul 14, 2021

Annoying this is needed, but very elegant solution. Now I just wish I could hide all the remainder Fields.

YourFriend · ‎Oct 22, 2021

How can this be done when the address is split by line breaks and not commas? May need to write a script or reformat everything to have commas.

I think my issue is what I want to split is created by a formula so it may not be recognized as a string. Bummer.

I was correct. I fixed this by adding & “” after where I referenced columns in the formula.

LETS GO TEAM

Re: Parsing Text without Split()

Split Text to Multiple Columns

splitting content in long form text in a formula

Splitting text containing hyphens and underscores ...

Split Formula

Parsing date from text into an exact Date with spe...