Skip to main content
Solved

Extract the first two words from a string?

  • October 15, 2024
  • 4 replies
  • 193 views

Forum|alt.badge.img+3

I have string fields and I want to extract the first two words (minus any punctuation or special symbols). Ideally, they would remove the white space and capitalize each word as well.

So a field with “Midnight: A Tale” = “MidnightA”. 

”What’s New in Books” = “WhatsNew”

 

I’ve looked all over to find a way to get two words, but can only find out how to get one. I got into RegEx a bit, but couldn’t figure out how to account for the punctuation/symbols. If someone has a slick way to extract the first two words of a string, I’d be eternally grateful!

Best answer by Pascal_Gallais-

Hello,

Someone may have an easier solution, but you could try the following.

Here is the result that I get (column "first 2 words"):

 

 

Formula "Word1" extracts the first word:

 

Formula "Remainder" extrats the rest of the sentence if it exists:

Formula "word2" extracts the second word from the remainder:

Formula "First 2 words" concatenates Word1 and Word2 and removes characters " ' "and " : " :

You will need to complete the REGEX_REPLACE function with all special characters that you wish to remove (as i did with the dot for example)

Regards,

Pascal

 

4 replies

Pascal_Gallais-
Forum|alt.badge.img+21

Hello,

Someone may have an easier solution, but you could try the following.

Here is the result that I get (column "first 2 words"):

 

 

Formula "Word1" extracts the first word:

 

Formula "Remainder" extrats the rest of the sentence if it exists:

Formula "word2" extracts the second word from the remainder:

Formula "First 2 words" concatenates Word1 and Word2 and removes characters " ' "and " : " :

You will need to complete the REGEX_REPLACE function with all special characters that you wish to remove (as i did with the dot for example)

Regards,

Pascal

 


Forum|alt.badge.img+6
  • Participating Frequently
  • October 15, 2024

Here’s a formula that accomplishes your requirements. Make sure to replace {YourField} with the actual name of your field containing the strings. 

 

 

UPPER( LEFT( REGEX_REPLACE( REGEX_EXTRACT({YourField}, '^(\\w+)[^\\w]*(\\w+)'), '[^A-Za-z0-9]', '' ), 1 ) ) & UPPER( MID( REGEX_REPLACE( REGEX_EXTRACT({YourField}, '^(\\w+)[^\\w]*(\\w+)'), '[^A-Za-z0-9]', '' ), 2, 1 ) )

 


Hope this helps. 


Forum|alt.badge.img+3
  • Author
  • New Participant
  • October 15, 2024

Here’s a formula that accomplishes your requirements. Make sure to replace {YourField} with the actual name of your field containing the strings. 

 

 

UPPER( LEFT( REGEX_REPLACE( REGEX_EXTRACT({YourField}, '^(\\w+)[^\\w]*(\\w+)'), '[^A-Za-z0-9]', '' ), 1 ) ) & UPPER( MID( REGEX_REPLACE( REGEX_EXTRACT({YourField}, '^(\\w+)[^\\w]*(\\w+)'), '[^A-Za-z0-9]', '' ), 2, 1 ) )

 


Hope this helps. 


Thanks for the help! This formula returns the first two letters of the field, rather than the first two words.


Forum|alt.badge.img+3
  • Author
  • New Participant
  • October 15, 2024

Hello,

Someone may have an easier solution, but you could try the following.

Here is the result that I get (column "first 2 words"):

 

 

Formula "Word1" extracts the first word:

 

Formula "Remainder" extrats the rest of the sentence if it exists:

Formula "word2" extracts the second word from the remainder:

Formula "First 2 words" concatenates Word1 and Word2 and removes characters " ' "and " : " :

You will need to complete the REGEX_REPLACE function with all special characters that you wish to remove (as i did with the dot for example)

Regards,

Pascal

 


Thank you, Pascal, this worked perfectly! For others trying it, I did include an additional field to capitalize the second word before concatenating them.

IF({File second word}, UPPER(LEFT({File second word}, 1)) & (REGEX_EXTRACT({File second word}, "(?:.)([^ ]*)")))