Formula has bizarre interaction with accented characters

I have a short text field with a record containing the text ‘Açaí Bowl.png’
I have another formula field that strips the file extension from text in this field, but instead of the expected result of ‘Açaí Bowl’, I get ‘Açaí Bong’. :innocent: Somehow the last two characters of ‘Bowl’ are transforming?
Here is my formula text, which usually works fine:

IF(
  OR(
    RIGHT({File name}, 4) = ".jpg",
    RIGHT({File name}, 4) = ".png",
    RIGHT({File name}, 4) = ".pdf"
  ),
    REPLACE({File name}, LEN({File name}) - 3, 4, ""), 
    IF(RIGHT({File name}, 5) = ".jpeg", 
      REPLACE({File name}, LEN({File name}) - 4, 5, ""),
      "..."
    )
)

Does it behave the same way on your base? Is there a way to prevent this? Not a huge deal since not a common case but it would be nice to know.

No, it does not behave the same way in my base. In my base I get the expected result.

However, it may be that the accented character is being encoded differently for you. Accented characters are sometimes stored under the hood as multiple characters, which may be affecting the way the RIGHT formula is dealing with the length. Notice that the formula is still removing four characters (wl.p), just not the characters you want.

You should send a report to support@airtable.com, including a screen shot.

If this is a problem, you can use regular expressions to remove the file extension instead.

IF(
    REGEX_MATCH({File Name}, "((\\.jpg)|(\\.png)|(\\.pdf)|(\\.jpeg))$"), 
    REGEX_REPLACE({File Name}, "((\\.jpg)|(\\.png)|(\\.pdf)|(\\.jpeg))$", ""), 
    {File Name}
)
1 Like

The following REGEX function will remove periods with any characters after it, and does not (at least in my base) mess up the special characters

REGEX_REPLACE({File Name}, "(.*)([.].*)", "$1")
1 Like

This topic was solved and automatically closed 15 days after the last reply. New replies are no longer allowed.