Mar 08, 2022 02:45 AM
Hey everyone,
There’s a large open source intelligence (OSINT) community helping to document the invasion of Ukraine with many focused on geolocation: identifying exactly where an event took place or where certain media footage was recorded (here’s a recent article on the subject).
Given the volume of information, it’s sometimes difficult to keep up, so I’ve built a system that searches Twitter for tweets containing coordinates, extracts them, and plots the locations on a map. It’s all fully automated and built on top of Airtable.
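The extraction logic itself isn’t shown in this thread, so the following is only a rough sketch of how coordinates might be pulled out of tweet text. It assumes coordinates appear as decimal "lat, lon" pairs with at least three decimal places; the `extractCoordinates` name and the regular expression are illustrative assumptions, not the code used in the base.

```javascript
// Hypothetical helper (not from the original base): find decimal "lat, lon"
// pairs in free text. Assumes at least 3 decimal places so that ordinary
// numbers such as "3.5, 4.2" are less likely to be picked up.
const COORD_RE = /(?<![\d.])(-?\d{1,2}\.\d{3,})\s*,\s*(-?\d{1,3}\.\d{3,})/g;

function extractCoordinates(text) {
  const found = [];
  for (const m of text.matchAll(COORD_RE)) {
    const lat = Number(m[1]);
    const lon = Number(m[2]);
    // Keep only values that are valid latitudes and longitudes.
    if (Math.abs(lat) <= 90 && Math.abs(lon) <= 180) {
      found.push({ lat, lon });
    }
  }
  return found;
}
```

Each returned pair could then be written to a record and shown in a map view; other formats, such as degrees/minutes/seconds, would need additional patterns.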
You can view the base here: Airtable - Simplescraper OSINT.
I’ve shared it with some OSINT communities and it seems to be useful. At the same time, it’s a neat example of how Airtable makes it super easy to quickly prototype new ideas.
For those interested in how it works:
And that’s it. Let me know what you think.
To make it more user-friendly for people not familiar with Airtable it would have been neat to extend this using Interfaces, but shared links and apps are not available on Interfaces (yet :crossed_fingers: ).
Mar 13, 2022 03:31 PM
This is an amazing and useful example! Thanks for sharing!
Mar 14, 2022 11:21 AM
Great example for real-life tasks.
For my use case, I would add a view with a time filter such as ‘after 2-3 days ago’ or ‘this week’.
Sitting in Kyiv, it’s quite mentally draining to read all that news while working or doing volunteer work of a different kind. On the other hand, we can’t ignore the whole picture; it’s vitally important to monitor the current situation.
By grouping missile hits on a city, we may be able to detect the launch site(s), which are usually the same for a given target city. That lets us choose the safer rooms in a flat, or take cover in a shelter during an air-raid alarm if the flat is not safe overall.
The most valuable part is the map of the land battles near the city. Currently our army can repel them far enough, but in the worst case it really is a matter of life and death to retreat to a safer place before the roads are closed and it’s too late.
Mar 14, 2022 02:45 PM
You can actually filter on each field, including the time of the tweet.
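If the same "last few days" filter were needed in a script rather than in a view, it might look roughly like this. This is only a sketch: the `tweetedAt` property and the plain-object row shape are assumptions for illustration, not the field names used in the shared base.

```javascript
// Hypothetical sketch: keep only rows whose timestamp falls within the last
// `days` days. `tweetedAt` is an assumed property holding an ISO date string.
const DAY_MS = 24 * 60 * 60 * 1000;

function tweetedWithinDays(rows, days, now = new Date()) {
  const cutoff = now.getTime() - days * DAY_MS;
  return rows.filter(row => new Date(row.tweetedAt).getTime() >= cutoff);
}
```

Passing `now` explicitly makes the function easy to check against fixed dates; in normal use it can be omitted.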
Apr 30, 2022 07:16 AM
I believe this answer belongs here, in the replies, rather than in a new topic.
For me, without much cloud experience, the hardest part of such tasks is deciding where and how to host the scraper so that it runs from time to time.
Here is an example of my code that fetches the HTML from the oryxspioenkop.com site (Equipment Losses) and places the headlines into a new table. It will not work if the table already exists, but you can change the name in the first line.
// Name of the table to create; change it if a table with this name already exists.
const TNAME='Oryx';
const urlRus = 'https://www.oryxspioenkop.com/2022/02/attack-on-europe-documenting-equipment.html';
const urlUkr = 'https://www.oryxspioenkop.com/2022/02/attack-on-europe-documenting-ukrainian.html';
// Marker strings used to slice the page HTML; they depend on the site's markup.
const [S1,S2,SX]=[`id="Pistols">`,`</h3>`,`</span>`]; //single item start/end/clean
const [T1,T2]=[`<span style="color: red;">`,`<br /></span>`]; //totals start/end
const [TX,DIVIDER,WASTE]=[SX+T1,'mw-headline',`<span `]; //items divider/waste pieces marker
// Create the output table; createTableAsync resolves to the new table's id.
const tableId=await base.createTableAsync(TNAME,[
{name:'Side',type:'singleLineText'}, {name:'Loss',type:'richText'}]);
if (!tableId) throw new Error(`Cannot create table ${TNAME}`);
const table=base.getTable(tableId);
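// Remove every occurrence of `pattern` from `txt`.
const cutX=(txt,pattern)=>txt.split(pattern).join('');
// Take the text between the first `a` and the next `z`, then strip `x` from it.
const cut=(txt,a,z,x)=>cutX((txt.split(a,2).pop().split(z,2).shift()), x);
const bold=text=>`**${text}**`;
// Totals lines (marked by T1) are rendered bold; other chunks are treated as single items.
const total=text=>text.includes(T1)? bold(cut(text,T1,T2,TX)):cut(text,S1,S2,SX);
// Split the page into headline chunks and drop any that still contain leftover markup.
const parse=txt=>txt.split(DIVIDER).map(total).filter(n=>!n.includes(WASTE));
// Build record payloads in the shape expected by createRecordsAsync.
const create=(el,side)=>({fields:{'Side':side,'Loss':el }});
const rows=(arr,side)=>arr.map(el=>create(el,side));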
// Fetch both pages as raw HTML.
const queryRus = await remoteFetchAsync(urlRus);
const lostRus = await queryRus.text();
const queryUkr = await remoteFetchAsync(urlUkr);
const rawUkr = await queryUkr.text();
// The Ukrainian page has slightly different markup, so clean it up before parsing.
const CLN1 = ` </span></div><h3>`;
const CLN2 = `<span class="mw-headline" id="Pistols">`;
const CLN3 = CLN1+CLN2;
const CLNU = `Ukraine - `;
const lostUkr=cutX(rawUkr.replace(CLN3,'').replace(CLNU+TX,CLNU),CLN1);
const crt=[...rows(parse(lostRus),'Russia'),...rows(parse(lostUkr),'Ukraine')];
// createRecordsAsync accepts at most 50 records per call, so write in batches.
while (crt.length) await table.createRecordsAsync(crt.splice(0, 50));
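Since the script above stops when the table already exists, one possible workaround is to look the table up by name first and only create it when it is missing. This is a sketch rather than part of the original script: `getOrCreateTable` is a hypothetical helper, and it assumes that an existing table with that name already has the `Side` and `Loss` fields.

```javascript
// Hypothetical helper: reuse a table with the given name if the base already
// has one, otherwise create it. `base` is passed in explicitly so the
// function does not depend on the scripting global.
// Assumption: an existing table with this name already has the needed fields.
async function getOrCreateTable(base, name, fields) {
  const existing = base.tables.find(t => t.name === name);
  if (existing) return existing;
  const tableId = await base.createTableAsync(name, fields);
  return base.getTable(tableId);
}
```

In the script this would replace the `createTableAsync`/`getTable` pair, for example `const table = await getOrCreateTable(base, TNAME, [{name:'Side',type:'singleLineText'}, {name:'Loss',type:'richText'}]);`. Note that rerunning the script would then append new records to the existing ones rather than replace them.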