Twitter profile clipper

Twitter’s web app was refactored, and I’ve re-jigged my web clipper as best I can. This web clipper saves a Twitter user’s profile information to a table with fields for:

Name
Profile from page URL - to be trimmed
Avatar
Bio from selected text

The refactored mark-up in Twitter’s web app means this clip action needs a little extra work.
The steps I follow:

  1. go to someone’s Twitter profile eg https://twitter.com/khoi
  2. select the text on their bio
  3. click their avatar to enlarge it (this exposes a CSS selector for clipping the image, and appends /photo to the URL eg https://twitter.com/khoi/photo)
  4. activate Airtable web clipper action
  5. trim off the /photo from the page URL
  6. tidy up any undesirable new lines in the Bio text produced by @mentions or #tags
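
Steps 5 and 6 are manual, but if you post-process clipped records in a script, they could be sketched like this in Python. This is only a sketch under assumptions: the function names `trim_photo_suffix` and `tidy_bio` are made up for illustration, and nothing here is part of the web clipper itself.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def trim_photo_suffix(page_url: str) -> str:
    """Drop a trailing /photo segment from a profile URL (step 5).

    Any query string or fragment is dropped and a trailing slash is removed.
    """
    parts = urlsplit(page_url)
    segments = [s for s in parts.path.split("/") if s]
    # Only trim when a handle precedes /photo, so an account that is
    # literally named "photo" keeps its URL.
    if len(segments) >= 2 and segments[-1] == "photo":
        segments.pop()
    path = "/" + "/".join(segments) if segments else ""
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def tidy_bio(bio: str) -> str:
    """Join lines that were split around @mentions or #tags (step 6)."""
    # Every line break (plus surrounding whitespace) becomes one space,
    # so intentional paragraph breaks are flattened as well.
    return re.sub(r"\s*\n\s*", " ", bio).strip()


if __name__ == "__main__":
    print(trim_photo_suffix("https://twitter.com/khoi/photo"))
    print(tidy_bio("Designer at\n@example\nand writer"))
```

Note that `tidy_bio` flattens every line break, so a bio that really does have separate paragraphs would need a gentler rule.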


Here’s the clip action recipe for Twitter profiles:

{
    "schemaVersion": 3,
    "fieldMappings": [
        {
            "fieldName": "Name",
            "fieldType": "singleLineText",
            "defaultValue": {
                "type": "cssSelector",
                "opts": {
                    "cssSelector": "h2[aria-level=\"2\"][role=\"heading\"] div div div span"
                }
            }
        },
        {
            "fieldName": "Profile",
            "fieldType": "url",
            "defaultValue": {
                "type": "pageUrl",
                "opts": null
            }
        },
        {
            "fieldName": "Avatar",
            "fieldType": "multipleAttachments",
            "defaultValue": {
                "type": "cssSelector",
                "opts": {
                    "cssSelector": "img[draggable=\"true\"]"
                }
            }
        },
        {
            "fieldName": "Bio",
            "fieldType": "multilineText",
            "defaultValue": {
                "type": "selectedText",
                "opts": null
            }
        }
    ]
}

Hey Vernon, I updated this slightly to auto-copy the description from the page. I tried a few ways but couldn’t pull the avatar without opening up the photo page. A bit of a pain, but this worked for my purposes. Thanks!

{
"schemaVersion": 3,
"fieldMappings": [
    {
        "fieldName": "Brand Name",
        "fieldType": "singleLineText",
        "defaultValue": {
            "type": "cssSelector",
            "opts": {
                "cssSelector": "h2[aria-level=\"2\"][role=\"heading\"] div div div span"
            }
        }
    },
    {
        "fieldName": "Twitter Handle",
        "fieldType": "multilineText",
        "defaultValue": {
            "type": "pageUrl",
            "opts": null
        }
    },
    {
        "fieldName": "Image Logo",
        "fieldType": "multipleAttachments",
        "defaultValue": {
            "type": "cssSelector",
            "opts": {
                "cssSelector": "img[draggable=\"true\"]"
            }
        }
    },
    {
        "fieldName": "Bio",
        "fieldType": "singleLineText",
        "defaultValue": {
            "type": "cssSelector",
            "opts": {
                "cssSelector": "[data-testid=\"UserDescription\"]"
            }
        }
    }
]

}


I was able to make some improvements by digging deep into the mark-up. The selectors lean on long chains of class names and nth-child positions, so they may prove brittle over time. Still, you can now pull the Profile Photo, Location, and URL just from a typical profile page:

{
"schemaVersion": 3,
"fieldMappings": [
    {
        "fieldName": "Name",
        "fieldType": "singleLineText",
        "defaultValue": {
            "type": "cssSelector",
            "opts": {
                "cssSelector": "h2[aria-level=\"2\"][role=\"heading\"] div div div span"
            }
        }
    },
    {
        "fieldName": "Twitter",
        "fieldType": "url",
        "defaultValue": {
            "type": "pageUrl",
            "opts": null
        }
    },
    {
        "fieldName": "Profile Photo",
        "fieldType": "multipleAttachments",
        "defaultValue": {
            "type": "cssSelector",
            "opts": {
                "cssSelector": "div.css-1dbjc4n.r-14lw9ot.r-1tlfku8.r-1ljd8xs.r-13l2t4g.r-1phboty.r-1jgb5lz.r-11wrixw.r-61z16t.r-1ye8kvj.r-13qz1uu.r-184en5c > div > div:nth-child(2) > div > div > div:nth-child(1) > div.css-1dbjc4n.r-ku1wi2.r-1j3t67a.r-m611by > div.css-1dbjc4n.r-obd0qt.r-18u37iz.r-1w6e6rj.r-1wtj0ep > a > div.css-1dbjc4n.r-1adg3ll.r-1udh08x > div.r-1p0dtai.r-1pi2tsx.r-1d2f490.r-u8s1d.r-ipm5af.r-13qz1uu > div > img"
            }
        }
    },
    {
        "fieldName": "Bio",
        "fieldType": "multilineText",
        "defaultValue": {
            "type": "cssSelector",
            "opts": {
                "cssSelector": "[data-testid=\"UserDescription\"]"
            }
        }
    },
    {
        "fieldName": "Location",
        "fieldType": "singleLineText",
        "defaultValue": {
            "type": "cssSelector",
            "opts": {
                "cssSelector": "div.css-1dbjc4n.r-14lw9ot.r-1tlfku8.r-1ljd8xs.r-13l2t4g.r-1phboty.r-1jgb5lz.r-11wrixw.r-61z16t.r-1ye8kvj.r-13qz1uu.r-184en5c > div > div:nth-child(2) > div > div > div:nth-child(1) > div.css-1dbjc4n.r-ku1wi2.r-1j3t67a.r-m611by > div:nth-child(4) > div > span:nth-child(1) > span"
            }
        }
    },
    {
        "fieldName": "Website",
        "fieldType": "url",
        "defaultValue": {
            "type": "cssSelector",
            "opts": {
                "cssSelector": "div.css-1dbjc4n.r-ku1wi2.r-1j3t67a.r-m611by > div:nth-child(4) > div > a"
            }
        }
    }
]

}


Thank you @Andrew_Wingrave for adding the cssSelector for the Bio field from [data-testid="UserDescription"]. I’m learning more about data-* custom data attributes. :wink:

Hi @chrismessina and welcome to our community :wave:

Wow, what an upgrade to the Twitter profile clipper. Nice digging! :clap:
I like the cssSelector that pulls in the avatar without needing the extra clicking workaround. Nice. :wink:
What a bonus to grab the location and Website fields too!

I like these upgrades so much I retro-fitted some of the records in my Airtable (clipping with the upgraded web clipper, then using a Dedupe block for matches on the profile URL).


I’ve extended this clipper to also scrape from LinkedIn profile pages. Each CSS selector now has a comma separator (Twitter, LinkedIn).

Here’s an annotated screenshot outlining where the CSS selectors draw from.


This example lacks a profile background image. The selector avoids targeting an image with alt="Background Image" anyway.

Lastly, I’m using the About paragraph to populate my Bio field. If you prefer the one-liner that appears after their name, switch the selector from main p[class*="about__summary"] to main section h2.

Clip action:

{
    "schemaVersion": 3,
    "fieldMappings": [
        {
            "fieldName": "Name",
            "fieldType": "singleLineText",
            "defaultValue": {
                "type": "cssSelector",
                "opts": {
                    "cssSelector": "h2[aria-level=\"2\"][role=\"heading\"] div div div span, main section li"
                }
            }
        },
        {
            "fieldName": "Profile",
            "fieldType": "url",
            "defaultValue": {
                "type": "pageUrl",
                "opts": null
            }
        },
        {
            "fieldName": "Avatar",
            "fieldType": "multipleAttachments",
            "defaultValue": {
                "type": "cssSelector",
                "opts": {
                    "cssSelector": "div.css-1dbjc4n.r-14lw9ot.r-1tlfku8.r-1ljd8xs.r-13l2t4g.r-1phboty.r-1jgb5lz.r-11wrixw.r-61z16t.r-1ye8kvj.r-13qz1uu.r-184en5c > div > div:nth-child(2) > div > div > div:nth-child(1) > div.css-1dbjc4n.r-ku1wi2.r-1j3t67a.r-m611by > div.css-1dbjc4n.r-obd0qt.r-18u37iz.r-1w6e6rj.r-1wtj0ep > a > div.css-1dbjc4n.r-1adg3ll.r-1udh08x > div.r-1p0dtai.r-1pi2tsx.r-1d2f490.r-u8s1d.r-ipm5af.r-13qz1uu > div > img, main section img:not([alt=\"Background Image\"])"
                }
            }
        },
        {
            "fieldName": "Bio",
            "fieldType": "multilineText",
            "defaultValue": {
                "type": "cssSelector",
                "opts": {
                    "cssSelector": "[data-testid=\"UserDescription\"], main p[class*=\"about__summary\"]"
                }
            }
        },
        {
            "fieldName": "Location",
            "fieldType": "singleLineText",
            "defaultValue": {
                "type": "cssSelector",
                "opts": {
                    "cssSelector": "div.css-1dbjc4n.r-14lw9ot.r-1tlfku8.r-1ljd8xs.r-13l2t4g.r-1phboty.r-1jgb5lz.r-11wrixw.r-61z16t.r-1ye8kvj.r-13qz1uu.r-184en5c > div > div:nth-child(2) > div > div > div:nth-child(1) > div.css-1dbjc4n.r-ku1wi2.r-1j3t67a.r-m611by > div:nth-child(4) > div > span:nth-child(1) > span, main section ul ~ ul li"
                }
            }
        },
        {
            "fieldName": "Website",
            "fieldType": "url",
            "defaultValue": {
                "type": "cssSelector",
                "opts": {
                    "cssSelector": "div.css-1dbjc4n.r-ku1wi2.r-1j3t67a.r-m611by > div:nth-child(4) > div > a"
                }
            }
        }
    ]
}

I am a beginner, so my question may seem stupid: using the web clipper, I “import clipaction” first, correct?
Then I pasted your script inside the “snippet clipper”,
but it returns “Incorrect”.
Could you tell me why?

Hi @Anne_Marie_Helwaser! Here are the steps to import any of the clip actions above:

  1. Open your web clipper block
  2. Click the “Import” button, which is underneath the list of fields in your web clipper block
  3. Paste the snippet above into the text area, then click "Import"
  4. At this next step, you’ll map fields from the snippet to fields in your own table! As long as you set up at least one field, you’ll be able to continue with the import. For example, in the snippet Vernon shared for clipping Twitter profiles, you would map the “Name” field to whichever column in your table you’d like to save names to. If your table has a field to map to everything defined in the snippet, you can map them all!
  5. Click “Save”, then get clipping!

There are some logical restrictions with field mapping - for example, you can only map a field that fetches images to an Attachment field. For the most part, though, you can define mappings as you wish!
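
If an import is rejected, one thing worth ruling out is that the snippet was damaged while copying (a missing brace, or straight quotes turned into curly ones). Here is a rough Python sketch that checks a pasted snippet is well-formed JSON with the keys that appear in the clip actions in this thread. It is not Airtable's own validation: `check_clip_action` is a made-up helper, and the list of expected keys is only inferred from the snippets above.

```python
import json
from typing import List

# Keys seen on every field mapping in the snippets in this thread.
# This is an observation, not an official schema.
EXPECTED_MAPPING_KEYS = {"fieldName", "fieldType", "defaultValue"}


def check_clip_action(snippet: str) -> List[str]:
    """Return a list of problems found in a pasted snippet (empty if none)."""
    try:
        action = json.loads(snippet)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg} (line {err.lineno})"]
    if not isinstance(action, dict):
        return ["top level is not a JSON object"]

    problems = []
    if "schemaVersion" not in action:
        problems.append("missing schemaVersion")
    mappings = action.get("fieldMappings")
    if not isinstance(mappings, list) or not mappings:
        problems.append("missing or empty fieldMappings")
        return problems
    for i, mapping in enumerate(mappings):
        if not isinstance(mapping, dict):
            problems.append(f"fieldMappings[{i}] is not an object")
            continue
        missing = EXPECTED_MAPPING_KEYS - mapping.keys()
        if missing:
            problems.append(
                f"fieldMappings[{i}] is missing: {', '.join(sorted(missing))}"
            )
    return problems


if __name__ == "__main__":
    # Curly quotes, as a word processor might produce, are not valid JSON.
    print(check_clip_action("{“schemaVersion”: 3}"))
```

An empty list only means the text parses and looks like the snippets here; the import can still fail for other reasons, such as field types that don't match your table.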

Let me know if you’re still running into issues with the clip action importing!

Cheers.


Thank you so much!
One more question, just to make sure (this was a while ago): can I clip from a LinkedIn profile with the snippet, or was it for Twitter only? (See Vernon Fowler’s post above mentioning LinkedIn, which is better for me.)

Hi @Anne_Marie_Helwaser

Great news: The latest iteration of the clip action works for both LinkedIn and Twitter. :wink:

The clip action from late December includes comma-separated CSS selectors that grab relevant text strings from either site. For example, the fieldName “Bio” has:

"cssSelector": "[data-testid=\"UserDescription\"], main p[class*=\"about__summary\"]"
The comma between selectors acts like a logical OR: an element is selected if it matches any selector in the list.
When on a Twitter profile, [data-testid="UserDescription"] will find a match.
When on a LinkedIn profile, main p[class*="about__summary"] will find a match.

When either Twitter or LinkedIn changes its site to use different HTML, these selectors may need refactoring.


Using commas in CSS selectors, a clip action can be extended to support multiple sites. This capability of using commas to specify multiple selectors doesn’t appear in the Airtable documentation: Creating CSS selectors for the web clipper block. However, there is a link at the end pointing to CSS Selectors Reference, which includes the CSS element,element Selector.
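
If you maintain the per-site selectors separately, a small Python sketch like the one below can join them into one comma-separated selector and emit a field mapping in the same shape as the clip actions in this thread. The helper names `combine_selectors` and `css_field` are my own for illustration, and the output shape is only copied from the snippets above, not from any official schema.

```python
import json


def combine_selectors(*selectors: str) -> str:
    """Join per-site selectors into one comma-separated selector list."""
    cleaned = [s.strip() for s in selectors if s and s.strip()]
    return ", ".join(cleaned)


def css_field(field_name: str, field_type: str, *selectors: str) -> dict:
    """Build one field mapping shaped like those in the clip actions above."""
    return {
        "fieldName": field_name,
        "fieldType": field_type,
        "defaultValue": {
            "type": "cssSelector",
            "opts": {"cssSelector": combine_selectors(*selectors)},
        },
    }


if __name__ == "__main__":
    bio = css_field(
        "Bio",
        "multilineText",
        '[data-testid="UserDescription"]',      # Twitter
        'main p[class*="about__summary"]',      # LinkedIn
    )
    # json.dumps escapes the inner double quotes as \" automatically.
    print(json.dumps(bio, indent=4))
```

Letting `json.dumps` do the escaping avoids hand-typing the `\"` sequences, which is an easy place to introduce a typo.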


Great! Thanks,
meaning for now I don’t have to change anything as long as I use it on LinkedIn and they don’t change their site to use different HTML, correct?

That’s correct, @Anne_Marie_Helwaser. That latest clip action works on LinkedIn. :slight_smile:

When they eventually change their HTML/CSS I’m sure someone will contribute an updated clip action right here in this thread. :wink: Thanks to all who have contributed / will contribute improvements and iterations. :+1:

That’s a great post and the code works great.

Is it possible to use the web clipper to select text (a profile) from LinkedIn search results and copy data such as first/last name, title, picture URL and profile URL to Airtable?

I’d like to do this without visiting the profile.

Hi @Boyan_One,

It may be possible to extract from search results but I’m struggling to figure out how. My assumption is you want to grab a different search result each time you perform a clip (6th result this time, 4th another time, and then the 9th on another occasion).

I think extracting fields automatically would only work if the CSS selectors targeted a specific search result index (eg the first result). I don’t know how it might work otherwise.
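
To make the index idea concrete, here is a tiny Python sketch that builds a selector for the Nth result using :nth-child. Everything site-specific in it is an assumption: `ul.search-results` is a placeholder I made up, not LinkedIn's real mark-up, and `selector_for_result` is a hypothetical helper. It also shows the limitation: each index needs its own selector, so one clip action can't follow whichever result you happen to want.

```python
def selector_for_result(
    index: int,
    field_selector: str,
    result_list_selector: str = "ul.search-results",  # placeholder, not real mark-up
) -> str:
    """Build a CSS selector for a field inside the Nth search result."""
    if index < 1:
        raise ValueError("CSS :nth-child() counts from 1")
    return f"{result_list_selector} > li:nth-child({index}) {field_selector}"


if __name__ == "__main__":
    # Selector for a link inside the 6th result, under the assumed list mark-up.
    print(selector_for_result(6, "a"))
```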

Anyone else have an idea how this might work?