Function to count the number of words in a string or text field


#1

I haven’t been able to find an answer to this on the site, but there doesn’t seem to be a function to count the number of words in a text field. There’s a function to count the number of characters – LEN.

One workaround could be to the count the number of spaces in a text string, but I don’t see a function to do that, either. SEARCH and FIND return the first position of the search string inside the text, not the total number of times the search string occurs.

A word count function is very important for using AirTable as a content management system.


#2

OK, in case anyone else is searching for this – my workaround is to divide the length of a text by 5.5: LEN(text_field)/5.5. (The average word length in English is 4.5 characters, plus another character for the space between words.)

It gets somewhat close to the real number, though, of course, it counts punctuation marks and other special characters.


#3

You can count the number of spaces and sum 1 with this technique:


#4

Thanks – great advice!

I took out all the space characters, then subtracted the length of the new string form the old string, to get the number of spaces in the text.

LEN(Text)-LEN(SUBSTITUTE(Text," “,”"))

The result were a slight undercount, but within 5% of the word count that I got from Google docs

Then I realized I was also missing paragraph breaks.

Here’s the formula I used to count how many paragraph breaks there are in the text (which you should dive by two if you double-space between paragraphs):

LEN(Text)-LEN(SUBSTITUTE(Text,"\n",""))

I added both of these together into a single formula:

(LEN(Text)-LEN(SUBSTITUTE(Text," “,”"))) + (LEN(Text)-LEN(SUBSTITUTE(Text,"\n","")))

Now I’m a little tiny bit over – less than 1% compared to the Google Docs word count function.

Now I’m in business! Thanks!

I could shorten that formula a little bit with basic algebra:

2*LEN(Text)-LEN(SUBSTITUTE(Text," “,”")) -LEN(SUBSTITUTE(Text,"\n","")

Thanks, Elias!


#5

I realized that I could nest SUBSTITUTE functions to solve the double-spacing between paragraphs problem:

LEN(SUBSTITUTE(SUBSTITUTE(Text,"\n\n","\n"),"\n",""))


#6

Perfect! I know, a bit tricky and long formula but working in the end :smiley: