Skip to main content

Extracting Date From URL String

  • December 5, 2022
  • 2 replies
  • 61 views

Forum|alt.badge.img+1

Hi All

I'm currently scraping data from a few event listing websites into Airtable to create a database of events for the LGBTQ community.

On a few of the websites, I can't extract the full event date from the listings, this only appears in the URL.

An example of the URL is:

https://www.admiral-duncan.co.uk/soho/02-01-2023/karaoke-showdown-with-candy-heals

Is there any easy way to do this?

The other issue is that each site link has a different URL length, but the way they show the date is the same each time:

https://www.the2brewers.com/london/08-12-2022/panto

Any suggestions?

Thanks

2 replies

Matthew_Lanni
Forum|alt.badge.img+18

Hi there,

I'm no formula expert, but this should do the trick, as long as the dates are indeed formatted the same way:

REGEX_EXTRACT({URLField},"\d\d-\d\d-\d\d\d\d") 

Matthew_Lanni
Forum|alt.badge.img+18

You could even go a step further and change the formatting of the date to read "February 01, 2023" for example:

DATETIME_FORMAT(REGEX_EXTRACT(Notes,"\d\d-\d\d-\d\d\d\d"),'MMMM DD, YYYY')