Help

Re: Formula to break a string of text onto 2 separate strings of text

3484 1
cancel
Showing results for 
Search instead for 
Did you mean: 
J_B_S
5 - Automation Enthusiast
5 - Automation Enthusiast

I have a string of text that I want to break into two separate strings of text.
Here’s an example:

  • This is an example string of text / I want to break it into two separate strings of text

I want all of the words that precede the “/” to be one string of text. And then I want all of the words that come after the “/” to be a separate string of text. I want to completely eliminate the “/”.

So, to recap, we start with this:

  • This is an example string of text / I want to break it into two separate parts

And then we end-up with this:

  • This is an example string of text
  • I want to break it into two separate parts
73 Replies 73

This is working so good with First and Last Name, but, i’m wondering… how about the “Middle2” in your table. Im on a Latinamerican country and We Usually name like this (example name):
Carlos Roberto Flores González

How can i extract “Flores” from it?

Thanks in advance.

Welcome to the community, @Rodrigo_Flores! :grinning_face_with_big_eyes: This will extract the next-to-last item from a string (using spaces for separation):

IF(Name, REGEX_EXTRACT(Name, "([^ ]+) ([^ ]+)$"))

Screen Shot 2021-12-04 at 7.45.23 PM

Wow, it worked! :winking_face: many thanks Justin. Just to be aware, how can i study the logic of those functions “([^ ]+) ([^ ]+)$”, or, what’s the logic behind that… :smiling_face_with_sunglasses: you rock!

thanks!

First off, the $ at the end anchors the pattern to the end of the string. Without that, the regex engine would search for a match at the start of the string.

Parentheses are used to define search groups. In the expression that I wrote, there are two matching groups separated by a space, meaning that both groups and the separating space would need to be matched for anything to be returned.

Square braces are wrapped around collections of individual characters that should be matched. However, the caret ^ character at the front tells the regex engine to not match any of the following characters. In this case, we want to not match a space.

The + token after the square brace collection says to match the preceding token—any non-space character—one or more times.

Taken altogether, this collection—([^ ]+)—will match any non-space character one or more times, effectively selecting a single word. (Note: There is a separate \w token for specifically selecting “word” characters, but it’s very narrowly-focused in the characters that it matches. The version that I’m using is more broad, literally matches anything that’s not a space.)

Taking everything into account, the full expression says, “Find the pattern WORD-SPACE-WORD at the end of the string.”

With that in mind, we now hit an interesting bit of idiosyncratic behavior: Airtable’s regex implementation only returns the contents of the first defined group. So even though we’re telling the regex engine to find two words separated by a space, only the first word is actually returned, giving us the second-to-last word in the string.

To learn more, I recommend bookmarking a site like regex101.com (where you’ll need to select the Golang “flavor” to most-closely match the regex engine that Airtable uses), and just playing around.

Thank you Justin, do me a favor… take care, you’re awesome! :relaxed:

ACRA_Data
4 - Data Explorer
4 - Data Explorer

This conversation seems related to my current issue, but I’m looking for something that splits text out to columns from a comma separated list of kids’ first names in one field. Sometimes there are no kids sometime one, two, and even one instance of 7 kids. So I want to have at least seven new fields populated with the text between commas: Child FN 1, Child FN 2, Child FN 3, Child FN 4, Child FN 5, Child FN 6, Child FN 7. Any suggestions?
image

@Justin_Barrett, I’m curious if you had some suggestions. Thanks.

Welcome to the community, @ACRA_Data! :grinning_face_with_big_eyes: This can be solved using a variation of the multi-line extraction technique from above.

First, add a {Child Count} field. The method above uses a formula to count lines, but because the children in your base are in linked records, you can use a count field that counts the links.

For {Child 1}, the formula could be this:

IF({Child Count}, REGEX_EXTRACT({Children First Names} & "", "[^,]*"))

{Child 2} would be this:

IF({Child Count} > 1, REGEX_EXTRACT({Children First Names} & "", "(?:[^,]*,)([^,]*)"))

{Child 3} would be this:

IF({Child Count} > 2, REGEX_EXTRACT({Children First Names} & "", "(?:[^,]*,){2}([^,]*)"))

The formulas for {Child 4} through {Child 7} would continue the pattern begun in {Child 3}. With each successive formula, you would increase the following numbers by one:

  • The number in the comparison against {Child Count}
  • The number in curly brackets in the middle of the regular expression

Screen Shot 2022-03-11 at 8.34.25 PM

Thanks @Justin_Barrett! This is super helpful!

Riddhi_Mehta-Ne
4 - Data Explorer
4 - Data Explorer

Hello All,
I’m new to this community and appreciate all the discussion thus far. But I don’t see a solution to my particular problem quite yet.

I’m trying to do something similar to @ACRA_Data, but rather than having the first names in separate columns, I would like them to be in the same column as a multiple select field, rather than a single text field. I think I need to change something in the REGEX_EXTRACT function, but can’t figure it out.

The column on the left in the image below shows you what I have, and the column on the right is what I am working towards. The column on the left can have words separated by commas ranging from 1 to 10. Any help is very much appreciated.
airtable

Welcome to the community, @Riddhi_Mehta-Neugeba! :grinning_face_with_big_eyes:

The only thing that the REGEX_EXTRACT() function can do is extract text. It can’t control how that text is presented. For that matter, there’s nothing that any formula function can do to make the formula output look like those colored “pills” that you see in single- and multiple-select fields.

You didn’t indicate whether this is a one-time need or something that will need to be repeated on a regular basis. If it’s a one-time thing, it’s pretty easy.

  • Duplicate your field, making sure to choose “Duplicate Cells”
  • Change the field type of the duplicate field to single line text.
  • Change the field type to multiple-select. Airtable will automatically create new items based on the comma separation in the text.

If this is a recurring conversion need, you could use an automation. Before setting this up, add a multiple-select field, but create it with no options. Those will be handled automatically by the automation. Assuming for this example that your original field is named {Options}, I’ll call the multiple-select field {Options Converted}.

Set the automation up as follows:

  • Trigger: When a record is updated. Specifically, only look for changes in the {Options} field.
  • The only action you’ll need is an “Update record” action. Update the triggering record to copy the name of the selected “Options” choice into the “Options Converted” field.

Screen Shot 2022-04-04 at 4.28.24 PM

Notice how the helper text under “Options Converted” says, “Separate multiple options with commas”. By feeding it the name of the selection from the {Options} field—which contains comma-separated items—it will automatically create new entries based on those items. If the items already exist, it will select them and not add duplicates.

Here’s a demo of how that works. In this case, I’m manually choosing a selection from {Options}, but the same behavior would work if the records are being created via a form.

single-to-multiple

It worked like a charm. Thank you so much @Justin_Barrett !

Hello @kuovonne, can we meet for your services in order to get a solution like this? Thanks

Which solution are you interested in? There have been several slightly different solutions in this thread. I also recently posted this YouTube video about a non-scripting method of creating linked records for each line in a long text field.

If you are interested in hiring me to write a script or explore a formula-based system, you can book a meeting. I currently do not have any open slots, but some should open up in a week or so.

I will check the video, thanks!

I love this example, it’s elegant and leverages Airtable strengths more than it tries to sidestep the limitations of regex formulas.

Speaking of which, anyone struggling with REGEX_MATCH or REGEX_EXTRACT implementations might want to try forgetting either exists and use REGEX_REPLACE instead. It’s by far the most powerful of the trio and offers almost the entire feature set of RE2, the Google-made engine powering Airtable regex formulas (disclaimer: regular expressions rarely scale well).

Angelena_Zeiser
6 - Interface Innovator
6 - Interface Innovator

Hello,
Screen Shot 2022-07-17 at 9.35.02 AM

This is my current formula:
TRIM(RIGHT(Name, (LEN(Name) - FIND(" ", Name))))

I just want the name of the country please.

When trying to work out a REGEX solution, the first thing to look for is a repeatable pattern. In this case, there’s a colon and space immediately before the country name. With that, you can build an expression that finds—but doesn’t extract—that colon-space combo, then extracts everything else after it:

IF(Name, REGEX_EXTRACT(Name, "(?:\\: )(.*)"))

Breakdown…

  • Parentheses define groups. In this case, we have two groups defined slightly differently.
  • In the first group, we begin with the ?: combination, which means to find what’s in the group, but don’t actually extract it.
  • In that first group we’re looking for a colon-space combo. However, the colon is a special token in REGEX, so it must be escaped by putting a backslash before it. And because the backslash is also an escape character for strings and we need a backslash as an actual part of the expression string, we must escape that backslash with another backslash.
  • In the second group, the . token matches any single character, and the * after it says to match the previous token zero or more times, effectively grabbing everything else to the end of the string
Angelena_Zeiser
6 - Interface Innovator
6 - Interface Innovator

Thank you Justin! I really wish I had waited for a response. I did it manually *sigh. I’m saving this though!

Angelena_Zeiser
6 - Interface Innovator
6 - Interface Innovator

I have another question of IF. Currently I have (DATETIME_FORMAT({Date of donation}, ‘M/D/YYYY’) & “—”) & Donor

But if the field “On behalf of corporation” is checked, I want to be (DATETIME_FORMAT({Date of donation}, ‘M/D/YYYY’) & “—”) & Donor Employer