This may indeed be the case today; you certainly know more about it than I do. But if, in your hypothetical, SimpleScraper were acquired by Airtable, are you certain there would be no advantage to running some aspect of the technology on servers? What would be the point of buying scraper technology if you didn't intend to integrate it in some way that creates competitive advantage?
But even so, how can you be so certain that the Chrome extension you are using doesn't interact with a server that performs scraping activity on your behalf?
To me, “scrape in the cloud” means something far different from “scraping the cloud”. It suggests processes that may indeed be configured and managed in your browser, but which are actually performed by proxy services elsewhere.
If you can schedule a scrape and then close your browser, how do you think that works? Surely it cannot run solely in your browser.
I’ve worked for a few companies that provide a variety of data harvesting technologies (import.io among them), and every one of them used servers to do the heavy lifting because browsers are (a) inefficient, (b) unable to pace themselves, (c) unable to spread requests across multiple IPs, user-agents, and domains, and (d) miserable tools wherever same-origin security policies come into play.
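To make points (b) and (c) concrete, here is a minimal Python sketch of the kind of request pacing and user-agent rotation a server-side worker can do trivially. Everything here (class name, agent strings, interval) is my own illustration, not how SimpleScraper or import.io actually implement it:

```python
import itertools
import time

# Hypothetical sketch: a server-side fetch scheduler that paces requests
# and rotates user-agents -- things a single browser tab cannot easily do.

class PacedFetcher:
    def __init__(self, user_agents, min_interval=1.0):
        self._agents = itertools.cycle(user_agents)  # round-robin rotation
        self._min_interval = min_interval            # seconds between requests
        self._last_request = 0.0

    def next_headers(self):
        """Return headers for the next request, rotating the user-agent."""
        return {"User-Agent": next(self._agents)}

    def wait_turn(self):
        """Sleep just long enough to honor the pacing interval."""
        elapsed = time.monotonic() - self._last_request
        if elapsed < self._min_interval:
            time.sleep(self._min_interval - elapsed)
        self._last_request = time.monotonic()

fetcher = PacedFetcher(["AgentA/1.0", "AgentB/1.0"], min_interval=0.1)
ua_sequence = [fetcher.next_headers()["User-Agent"] for _ in range(4)]
# ua_sequence alternates between the two agents
```

A real worker would also distribute requests across a pool of proxy IPs, which is exactly the part a browser extension alone cannot provide.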
I think scraping is a fine activity and it’s wonderful to utilize these tools to acquire data. My comment was simply to point out that …
- Airtable is not likely to get into the data scraping business;
- There are lots of potential issues with scraping (technical and otherwise);
- Airtable has some pretty big competitive gaps in their solution; scraping is not going to help to close those gaps.