Does anyone have insight into, or experience with, (politely!) spidering and downloading the logs on ER?
I would like to run some textual analysis on a corpus of the logs, e.g. statistics and geospatial analysis of where fights take place, which races are most likely to die where, and so on.
However, I do not actually have an ER account (I've requested one for the past two years, but no luck), so I can't paginate. The URLs I can see have a rather cryptic double-integer parameter that seems unguessable:
The first int (62828 in the example above) looks like it might be an auto-incrementing ID across all logs, but I have no clue what the second param is.
I can probably spider them from the user profile pages (which have linear and guessable ids), but thought I'd ask here.
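For the profile-page route, this is roughly what I have in mind: a minimal Python sketch that pulls every link matching a pattern out of a page's HTML. The `LinkCollector` class and the `extract_links` helper are my own placeholder names, and I don't know the real profile or log URL layout, so the regex passed in is purely an assumption that would need adjusting after looking at an actual page.

```python
import re
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects <a href> targets whose raw href matches a regex (hypothetical helper)."""

    def __init__(self, base_url: str, pattern: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.pattern = re.compile(pattern)  # assumed pattern, supplied by the caller
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and self.pattern.search(value):
                # Resolve relative hrefs against the page they came from.
                absolute = urljoin(self.base_url, value)
                if absolute not in self.links:  # de-duplicate, keep first-seen order
                    self.links.append(absolute)


def extract_links(html: str, base_url: str, pattern: str) -> List[str]:
    """Return absolute URLs of all links in `html` whose href matches `pattern`."""
    parser = LinkCollector(base_url, pattern)
    parser.feed(html)
    parser.close()
    return parser.links
```

The idea would be to walk the profile ids in order, feed each page through `extract_links`, and queue whatever log links come back, rather than trying to guess the second parameter.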
Of course, I do not want to abuse anyone's server, and I would throttle my requests very aggressively (e.g. fetching one log every 10 seconds).
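To make the throttling concrete, here is a rough sketch of the fetch loop I'd use, standard library only. The names `polite_fetch` and `http_get` and the User-Agent string are placeholders of mine, and the 10-second default just mirrors the figure above; none of this reflects anything ER-specific.

```python
import time
import urllib.request
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical identifier so the site admins can see who is crawling and why.
USER_AGENT = "log-research-bot/0.1 (contact: you@example.com)"


def http_get(url: str, timeout: float = 30.0) -> bytes:
    """Fetch one URL with an identifying User-Agent header."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def polite_fetch(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = http_get,
    min_interval: float = 10.0,  # seconds between request starts (assumed value)
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[str, bytes]]:
    """Yield (url, body) pairs, never starting two requests closer than min_interval."""
    last_start = None
    for url in urls:
        if last_start is not None:
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        yield url, fetch(url)
```

Strictly one request at a time, no parallelism. I'd also check the site's robots.txt before starting and stop immediately if anyone asks me to.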