Skip to main content

Soundex implementation

  • August 30, 2018
  • 5 replies
  • 33 views

Forum|alt.badge.img+5

Possibly of interest to some, the following is an implementation of Soundex pseudo-homophone encoding. As there are variations of the algorithm, care should be taken if using its output to search externally encoded databases. (I think this conforms to U.S. Census encoding, but I’ve found contradicting statements.) To use, copy and paste the following into the formula configuration window of a formula field. It expects to use an existing field in the table named {Word} as input, so adjust field names accordingly. (There are two instances of the field name {Word} in the formula.)

Yes, it’s far from a perfect algorithm; yes, it’s rooted somewhat intractably in American English; yes, there are alternative encodings that may be more appropriate for many uses: Feel free to modify, enhance, and improve as you see fit — preferably releasing your improvements to the community afterwards.

LEFT(
SUBSTITUTE(
 SUBSTITUTE(
  SUBSTITUTE(
   SUBSTITUTE(
    SUBSTITUTE(
     SUBSTITUTE(
      SUBSTITUTE(
       SUBSTITUTE(
        SUBSTITUTE(
         SUBSTITUTE(
          SUBSTITUTE(
           SUBSTITUTE(
            SUBSTITUTE(
             SUBSTITUTE(
              SUBSTITUTE(
               SUBSTITUTE(
                SUBSTITUTE(
                 SUBSTITUTE(
(
LEFT({Word},1)&
(
SUBSTITUTE(
 SUBSTITUTE(
  SUBSTITUTE(
   SUBSTITUTE(
    SUBSTITUTE(
     SUBSTITUTE(
      SUBSTITUTE(
       SUBSTITUTE(
        SUBSTITUTE(
         SUBSTITUTE(
          SUBSTITUTE(
           SUBSTITUTE(
            SUBSTITUTE(
             SUBSTITUTE(
              SUBSTITUTE(
               SUBSTITUTE(
                SUBSTITUTE(
                 SUBSTITUTE(
                  SUBSTITUTE(
                   SUBSTITUTE(
                    SUBSTITUTE(
                     SUBSTITUTE(
                      SUBSTITUTE(
                       SUBSTITUTE(
                        SUBSTITUTE(
                         SUBSTITUTE(
                          SUBSTITUTE( 
                           SUBSTITUTE(
                            SUBSTITUTE(
                             SUBSTITUTE(
                              SUBSTITUTE(
                               SUBSTITUTE(
                                SUBSTITUTE(
                                 SUBSTITUTE(
SUBSTITUTE(
 SUBSTITUTE(
  SUBSTITUTE(
   SUBSTITUTE(
    SUBSTITUTE(
     SUBSTITUTE(
      SUBSTITUTE(
       SUBSTITUTE(
        SUBSTITUTE(
         SUBSTITUTE(
          SUBSTITUTE(
UPPER({Word}),
           'A','0'
           ),
          'E','0'
          ),
         'I','0'
         ),
        'O','0'
        ),
       'U','0'
       ),
      'Y','0'
      ),
     'H','0'
     ),
    'W','0'
    ),
   'B','1'
   ),
  'F','1'
  ),
 'P','1'
 ),
                                  'V','1'
                                  ),
                                 'C','2'
                                 ),
                                'G','2'
                                ),
                               'J','2'
                               ),
                              'K','2'
                              ),
                             'Q','2'
                             ),
                            'S','2'
                            ),
                           'X','2'
                           ),
                          'Z','2'
                          ),
                         'D','3'
                         ),
                        'T','3'
                        ),
                       'L','4'
                       ),
                      'M','5'
                      ),
                     'N','5'
                     ),
                    'R','6'
                    ),
                   '1111','1'
                   ),
                  '111','1'
                  ),
                 '11','1'
                 ),
                '2222','2'
                ),
               '222','2'
               ),
              '22','2'
              ),
             '3333','3'
             ),
            '333','3'
            ),
           '33','3'
           ),
          '4444','4'
          ),
         '444','4'
         ),
        '44','4'
        ),
       '5555','5'
       ),
      '555','5'
      ),
     '55','5'
     ),
    '6666','6'
    ),
   '666','6'
   ),
  '66','6'
  ),
 '0',''
 )
)
),
                  'B1','B'
                  ),
                 'F1','F'
                 ),
                'P1','P'
                ),
               'V1','V'
               ),
              'C2','C'
              ),
             'G2','G'
             ),
            'J2','J'
            ),
           'K2','K'
           ),
          'Q2','Q'
          ),
         'S2','S'
         ),
        'X2','X'
        ),
       'Z2','Z'
       ),
      'D3','D'
      ),
     'T3','T'
     ),
    'L4','L'
    ),
   'M5','M'
   ),
  'N5','N'
  ),
 'R6','R'
 )&'000',4)

5 replies

Forum|alt.badge.img+19
  • Inspiring
  • 351 replies
  • August 30, 2018

That’s a lot of substitutes :winking_face:

Impressive


Forum|alt.badge.img+5
  • Author
  • Inspiring
  • 1386 replies
  • August 30, 2018

That’s a lot of substitutes :winking_face:

Impressive


Amazingly, I didn’t have this going through my head…

https://www.youtube.com/watch?v=eswQl-hcvU0

…until your reply. :winking_face:


Forum|alt.badge.img+5
  • New Participant
  • 2 replies
  • September 5, 2018

It would be great if they had a powerfull regex replace function.


Forum|alt.badge.img+5
  • Author
  • Inspiring
  • 1386 replies
  • September 5, 2018

It would be great if they had a powerfull regex replace function.


Yeah, but then I’d have to go and learn regex… :winking_face:


Forum|alt.badge.img+5
  • New Participant
  • 2 replies
  • September 6, 2018

Yeah, but then I’d have to go and learn regex… :winking_face:


If you manage to build soundex with all that formulas, then you’ll feel comfortable with regexp :-). Much less typing.