Extract tags from text


#1

Is there a single function that can extract whole words (either trailed by a space or the last word) from text fields?

We’re scheduling and tracking social media campaigns and have post text including hashtags in a single text field in Airtable. This makes the workflow easy for planning and reporting.

Nearest way I can think of achieving this is a combination of text functions (FIND for the start character “#”, another FIND for the first space character occurrence afterwards, and the MID to extract with the number of characters set to the difference between the “#” and space characters) described at https://support.airtable.com/hc/en-us/articles/203255215-Formula-field-reference

Is there a simpler way of achieving word extraction? Or recursive word extraction?


#2

I don’t think there is another way to achieve this using the default functions.

An alternative would be a custom API solution or something like Zapier.


#3

Hey @Tuur I found a way! :raised_hands:

To extract the first hashtag word, use the following function madness and replace {Post Text} with the name of your field.

IF(AND(FIND("#",{Post Text}),FIND(" “,{Post Text},(FIND(”#",{Post Text})))),
MID({Post Text},(FIND("#",{Post Text})),(FIND(" “,{Post Text},(FIND(”#",{Post Text})))-(FIND("#",{Post Text})))),
IF(FIND("#",{Post Text}),
MID({Post Text},(FIND("#",{Post Text})),(LEN({Post Text})+1-FIND("#",{Post Text}))),
BLANK()))

This handles:

  1. the first hashtag word when there’s a trailing space
  2. the first hashtag word when there is no trailing space (ie. at the end of the {Post Text})
  3. no hashtag word - blank output

I can’t imagine what function madness we’d need to try extracting multiple hashtags!


#4

Good job! Multiples are difficult with the default functions because you can’t use repeating or recursive logic (unless there is some predefined maximum of course).

By the way, in most cases I prefer multiple hidden (helper) fields with smaller formulas instead of one big formula, so other people can sort of follow it too. :wink:


#5

Agree on hidden helper fields to make it easier to follow. In Airtable, if new fields added to a table weren’t added to all views, I’d take this approach more often.

As well as First hashtag, I’ve used a similar approach for a First @mention formula field.

IF(AND(FIND("@",{Post Text}),FIND(" ",{Post Text},(FIND("@",{Post Text})))),

SUBSTITUTE(MID({Post Text},(FIND("@",{Post Text})),(FIND(" ",{Post Text},(FIND("@",{Post Text})))-(FIND("@",{Post Text})))),":",""),

IF(FIND("@",{Post Text}),

SUBSTITUTE(MID({Post Text},(FIND("@",{Post Text})),(LEN({Post Text})+1-FIND("@",{Post Text}))),":",""),

BLANK()))

In addition to the same 3 situations handled in the First hashtag extraction, the SUBSTITUTE() function removes any trailing “:” character.