Feb 02, 2022 10:44 AM
Feb 02, 2022 11:07 AM
Hi Rick. I would just substitute in an empty string or a space, depending on your preference.
SUBSTITUTE({string field name}, '-','')
Sound good?
Feb 02, 2022 11:52 AM
what would removing the spaces get me.
ABCD - EFGH - IJKL - MNOP - QRST
Would like to create 5 columns from each data set between the dashes. The length of each data set is not fixed.
Feb 02, 2022 12:40 PM
This is a very similar use case to this thread:
That example was looking for numbers presented in the pattern of
(length)( x )(width)( x )(height)
Your pattern appears to be alphanumeric characters separated by spaces and dashes, or
(alphanumeric text 1)( - )(alphanumeric text 2)( - )(alphanumeric text 3)( - )(alphanumeric text 4)( - )(alphanumeric text 5)
The REGEX pattern for that is:
(\\w*)(\\-+)(\\w*)(\\-+)(\\w*)(\\-+)(\\w*)(\\-+)(\\w*)
^ if you want to explicitly define the separator as one or more consecutive dashes. (\\-+)
means one or more - characters
.
or
(\\w*)(\\W+)(\\w*)(\\W+)(\\w*)(\\W+)(\\w*)(\\W+)(\\w*)
^ if you want to explicitly define the separator as one or more consecutive non-alphanumeric characters (which would include both spaces and dashes). (\\W+)
means one or more non-alphanumeric characters
Therefore a full formula for the first of the five fields could be either:
(1)
REGEX_REPLACE(
SUBSTITUTE({string field name}, " ", ""),
"(\\w*)(\\-+)(\\w*)(\\-+)(\\w*)(\\-+)(\\w*)(\\-+)(\\w*)",
"$1"
)
Note: the SUBSTIUTE was included, assuming you want to remove the spaces.
(2)
REGEX_REPLACE(
{string field name},
"(\\w*)(\\W+)(\\w*)(\\W+)(\\w*)(\\W+)(\\w*)(\\W+)(\\w*)",
"$1"
)
^ this is the recommended formula. If there are not always 5 groups of text, then replace all the +
with *
to change the formula from “one or more” to “zero or more”.
For either method, the REGEX pattern has 9 “capture groups” (each group is what’s in-between the parentheses). Each of your five formula fields that extracts the alphanumeric text will have the exact same REGEX_REPLACE() formula apart from the last argument to reference the relevant capture group.
Your first field will have "$1"
as the last argument (as shown above), the second field will have "$3"
, the third will have "$5"
, etc.
Feb 02, 2022 01:06 PM
Someday I will “get” regex but until that day…thank you.
I am having an issue with the extraction, the first $1 work great, but after that, $3, $5 I am getting improper results.
I did realize my example was slightly incorrect the first part of the pattern is always a number, not text.