Mar 11, 2019 05:19 PM
I have a string of text that I want to break into two separate strings of text.
Here’s an example:
I want all of the words that precede the “/” to be one string of text. And then I want all of the words that come after the “/” to be a separate string of text. I want to completely eliminate the “/”.
So, to recap, we start with this:
And then we end up with this:
Dec 04, 2021 07:36 PM
This is working so well with First and Last Name, but I’m wondering… how about the “Middle2” in your table? I’m in a Latin American country, and we usually write names like this (example name):
Carlos Roberto Flores González
How can I extract “Flores” from it?
Thanks in advance.
Dec 04, 2021 07:46 PM
Welcome to the community, @Rodrigo_Flores! :grinning_face_with_big_eyes: This will extract the next-to-last item from a string (using spaces for separation):
IF(Name, REGEX_EXTRACT(Name, "([^ ]+) ([^ ]+)$"))
Dec 04, 2021 09:18 PM
Wow, it worked! :winking_face: Many thanks, Justin. Just so I understand, how can I study the logic of that pattern, “([^ ]+) ([^ ]+)$”? What’s the logic behind it? :smiling_face_with_sunglasses: You rock!
thanks!
Dec 04, 2021 09:54 PM
First off, the $ at the end anchors the pattern to the end of the string. Without that, the regex engine would return the first match it finds, scanning from the start of the string.
Parentheses are used to define search groups. In the expression that I wrote, there are two matching groups separated by a space, meaning that both groups and the separating space would need to be matched for anything to be returned.
Square brackets are wrapped around a set of individual characters that should be matched. However, the caret ^ character at the front tells the regex engine to match any character except the ones that follow. In this case, we want to match anything that isn’t a space.
The + token after the bracketed set says to match the preceding token (any non-space character) one or more times.
Taken altogether, the group ([^ ]+) will match any non-space character one or more times, effectively selecting a single word. (Note: There is a separate \w token for specifically selecting “word” characters, but it’s very narrowly focused in the characters that it matches: in RE2 it covers only ASCII letters, digits, and the underscore, so it would stop at the “á” in “González”. The version that I’m using is broader and literally matches anything that’s not a space.)
Taking everything into account, the full expression says, “Find the pattern WORD-SPACE-WORD at the end of the string.”
With that in mind, we now hit an interesting bit of idiosyncratic behavior: Airtable’s regex implementation only returns the contents of the first defined group. So even though we’re telling the regex engine to find two words separated by a space, only the first word is actually returned, giving us the second-to-last word in the string.
To learn more, I recommend bookmarking a site like regex101.com (where you’ll need to select the Golang “flavor” to most-closely match the regex engine that Airtable uses), and just playing around.
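For anyone who wants to poke at this outside Airtable, here is a minimal Python sketch of the same pattern. It assumes Python's `re` module behaves the same as Airtable's engine for a pattern this simple (Python does not use RE2), and `regex_extract` is a made-up helper that imitates the first-group behavior described above; it is not an Airtable function.

```python
import re

def regex_extract(text, pattern):
    """Hypothetical stand-in for REGEX_EXTRACT: return the first capture
    group if the pattern defines one, otherwise the whole match."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return match.group(1) if match.groups() else match.group(0)

name = "Carlos Roberto Flores González"

# Anchored with $: WORD-SPACE-WORD must sit at the end of the string,
# and only the first group (the next-to-last word) comes back.
second_to_last = regex_extract(name, r"([^ ]+) ([^ ]+)$")

# Without the anchor, the first WORD-SPACE-WORD from the left wins.
unanchored = regex_extract(name, r"([^ ]+) ([^ ]+)")

print(second_to_last)  # Flores
print(unanchored)      # Carlos
```

Dropping the `$` is the quickest way to see what the anchor is doing: the same two groups now latch onto the first two words instead of the last two.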
Dec 08, 2021 10:54 AM
Thank you Justin, do me a favor… take care, you’re awesome! :relaxed:
Mar 10, 2022 02:24 PM
This conversation seems related to my current issue, but I’m looking for something that splits text out to columns from a comma-separated list of kids’ first names in one field. Sometimes there are no kids, sometimes one or two, and in one instance seven. So I want to have at least seven new fields populated with the text between commas: Child FN 1, Child FN 2, Child FN 3, Child FN 4, Child FN 5, Child FN 6, Child FN 7. Any suggestions?
Mar 11, 2022 10:40 AM
@Justin_Barrett, I’m curious if you had some suggestions. Thanks.
Mar 11, 2022 08:33 PM
Welcome to the community, @ACRA_Data! :grinning_face_with_big_eyes: This can be solved using a variation of the multi-line extraction technique from above.
First, add a {Child Count} field. The method above uses a formula to count lines, but because the children in your base are in linked records, you can use a count field that counts the links.
For {Child 1}, the formula could be this:
IF({Child Count}, REGEX_EXTRACT({Children First Names} & "", "[^,]*"))
{Child 2} would be this:
IF({Child Count} > 1, REGEX_EXTRACT({Children First Names} & "", "(?:[^,]*,)([^,]*)"))
{Child 3} would be this:
IF({Child Count} > 2, REGEX_EXTRACT({Children First Names} & "", "(?:[^,]*,){2}([^,]*)"))
The formulas for {Child 4} through {Child 7} would continue the pattern begun in {Child 3}. With each successive formula, you would increase the following numbers by one: the number that {Child Count} is compared against, and the repetition count in curly braces inside the regex.
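The progression from {Child 1} to {Child 7} can be sketched in Python like this. It is only an illustration under a few assumptions: `nth_item` is a hypothetical helper, the item count is taken from the commas rather than from a count field, and Python's `re` is assumed to agree with Airtable's engine on these patterns.

```python
import re

def nth_item(text, n):
    """Return the n-th (1-based) comma-separated item, or None if the
    list has fewer than n items (the role {Child Count} plays above)."""
    count = text.count(",") + 1 if text else 0
    if n < 1 or n > count:
        return None
    if n == 1:
        # Same idea as the {Child 1} formula: everything before the first comma.
        return re.search(r"[^,]*", text).group(0)
    # Skip n-1 "item," chunks without capturing them, then capture the next item.
    pattern = r"(?:[^,]*,){%d}([^,]*)" % (n - 1)
    return re.search(pattern, text).group(1)

names = "Ana,Luis,Marta"
print(nth_item(names, 1))  # Ana
print(nth_item(names, 2))  # Luis
print(nth_item(names, 3))  # Marta
print(nth_item(names, 4))  # None
```

One thing to watch: if the list has a space after each comma ("Ana, Luis"), the captured item keeps its leading space (" Luis"), so in Airtable you would probably want to wrap the result in TRIM().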
Mar 13, 2022 08:18 PM
Thanks @Justin_Barrett! This is super helpful!
Apr 01, 2022 04:05 PM
Hello All,
I’m new to this community and appreciate all the discussion thus far. But I don’t see a solution to my particular problem quite yet.
I’m trying to do something similar to @ACRA_Data, but rather than having the first names in separate columns, I would like them to be in the same column as a multiple select field, rather than a single text field. I think I need to change something in the REGEX_EXTRACT function, but can’t figure it out.
The column on the left in the image below shows you what I have, and the column on the right is what I am working towards. The column on the left can have words separated by commas ranging from 1 to 10. Any help is very much appreciated.
Apr 04, 2022 04:40 PM
Welcome to the community, @Riddhi_Mehta-Neugeba! :grinning_face_with_big_eyes:
The only thing that the REGEX_EXTRACT() function can do is extract text. It can’t control how that text is presented. For that matter, there’s nothing that any formula function can do to make the formula output look like those colored “pills” that you see in single- and multiple-select fields.
You didn’t indicate whether this is a one-time need or something that will need to be repeated on a regular basis. If it’s a one-time thing, it’s pretty easy.
If this is a recurring conversion need, you could use an automation. Before setting this up, add a multiple-select field, but create it with no options. Those will be handled automatically by the automation. Assuming for this example that your original field is named {Options}, I’ll call the multiple-select field {Options Converted}.
Set the automation up as follows:
{Options}
field.
Notice how the helper text under “Options Converted” says, “Separate multiple options with commas”. By feeding it the name of the selection from the {Options} field, which contains comma-separated items, it will automatically create new entries based on those items. If the items already exist, it will select them and not add duplicates.
Here’s a demo of how that works. In this case, I’m manually choosing a selection from {Options}, but the same behavior would work if the records are being created via a form.
Apr 05, 2022 05:01 PM
It worked like a charm. Thank you so much @Justin_Barrett !
Apr 07, 2022 03:40 PM
Hello @kuovonne, can we meet for your services in order to get a solution like this? Thanks
Apr 07, 2022 06:33 PM
Which solution are you interested in? There have been several slightly different solutions in this thread. I also recently posted this YouTube video about a non-scripting method of creating linked records for each line in a long text field.
If you are interested in hiring me to write a script or explore a formula-based system, you can book a meeting. I currently do not have any open slots, but some should open up in a week or so.
Apr 11, 2022 06:14 AM
I will check the video, thanks!
Apr 15, 2022 03:26 AM
I love this example, it’s elegant and leverages Airtable strengths more than it tries to sidestep the limitations of regex formulas.
Speaking of which, anyone struggling with REGEX_MATCH or REGEX_EXTRACT implementations might want to try forgetting either exists and use REGEX_REPLACE instead. It’s by far the most powerful of the trio and offers almost the entire feature set of RE2, the Google-made engine powering Airtable regex formulas (disclaimer: regular expressions rarely scale well).
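As a rough illustration of the extract-by-replacing idea, here is a Python sketch that pulls the next-to-last word out of a name by matching the whole string and keeping only one captured group. It uses Python's `re.sub` as a stand-in, so treat the details as assumptions: as far as I know, Airtable's REGEX_REPLACE writes the group reference as $1 rather than \1, and Python's engine is not RE2.

```python
import re

name = "Carlos Roberto Flores González"

# Match the entire string, capture only the next-to-last word,
# and replace everything with that captured group.
second_to_last = re.sub(r"^.*?([^ ]+) [^ ]+$", r"\1", name)

# If the pattern does not match, re.sub returns the input untouched,
# so a one-word value passes through unchanged.
single = re.sub(r"^.*?([^ ]+) [^ ]+$", r"\1", "Madonna")

print(second_to_last)  # Flores
print(single)          # Madonna
```

The pass-through on a failed match is the main behavioral difference from an extract-style call, which would come back empty instead.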
Jul 17, 2022 09:37 AM
Hello,
This is my current formula:
TRIM(RIGHT(Name, (LEN(Name) - FIND(" ", Name))))
I just want the name of the country please.
Jul 20, 2022 09:29 PM
When trying to work out a REGEX solution, the first thing to look for is a repeatable pattern. In this case, there’s a colon and space immediately before the country name. With that, you can build an expression that finds—but doesn’t extract—that colon-space combo, then extracts everything else after it:
IF(Name, REGEX_EXTRACT(Name, "(?:\\: )(.*)"))
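The same find-but-don't-extract idea can be tried in Python. This is only a sketch: the sample value is invented because the real data from the post isn't shown, and the colon is written without backslashes here since it needs no escaping in Python.

```python
import re

# Hypothetical field value; the real data from the post is not shown.
name = "Destination: Peru"

# (?:: ) finds the colon-space pair without capturing it;
# (.*) captures everything after it.
match = re.search(r"(?:: )(.*)", name)
country = match.group(1) if match else None

print(country)  # Peru
```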
Breakdown…
The first group begins with the ?: combination, which means to find what’s in the group, but don’t actually extract it.
In the second group, the . token matches any single character, and the * after it says to match the previous token zero or more times, effectively grabbing everything else to the end of the string.
Jul 21, 2022 06:56 PM
Thank you Justin! I really wish I had waited for a response. I did it manually *sigh. I’m saving this though!
Jul 21, 2022 07:27 PM
I have another question, this time about IF. Currently I have (DATETIME_FORMAT({Date of donation}, 'M/D/YYYY') & "—") & Donor
But if the field “On behalf of corporation” is checked, I want it to be (DATETIME_FORMAT({Date of donation}, 'M/D/YYYY') & "—") & Donor Employer