Sunday, June 16, 2013

Changing Tack

No more Newspapermapping

The person at got back to us with the database from his website. While I'm a little sad that 5+ hours of my time have been for naught, I'm glad I don't have to spend another 10+ hours pressing ctrl-c, ctrl-v...

I cleaned up the database by first eliminating all the non-English papers, and then adding "state" back in for about 50 entries that were lacking that field. I removed a handful of links whose connections timed out when I tried to visit their pages. As of right now, a crawl is running on the new URLs.
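For anyone wanting to repeat that cleanup, here is a minimal sketch in Python. It is not the exact process used here. It assumes each paper is a dict with hypothetical `name`, `language`, `state`, and `url` fields, and that the missing states come from a hand-built lookup table (`state_by_name`, also hypothetical). The reachability check is passed in as a function so it can be swapped out or stubbed.

```python
import urllib.error
import urllib.request


def url_responds(url, timeout=10):
    """Return False when the connection times out or cannot be made."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status, so keep it.
        return True
    except (OSError, ValueError):
        # Timeouts, refused connections, DNS failures, malformed URLs.
        return False


def clean_records(records, state_by_name, is_reachable=url_responds):
    """Apply the three cleanup steps and return a new list of records.

    Field names ("language", "state", "name", "url") are assumptions
    made for this sketch, not the real database schema.
    """
    cleaned = []
    for rec in records:
        # Step 1: drop non-English papers.
        if rec.get("language") != "English":
            continue
        rec = dict(rec)  # copy so the input is left untouched
        # Step 2: fill in a missing state from the lookup table.
        if not rec.get("state"):
            rec["state"] = state_by_name.get(rec["name"], "")
        # Step 3: drop papers whose site cannot be reached.
        if not is_reachable(rec["url"]):
            continue
        cleaned.append(rec)
    return cleaned
```

Checking language first means no time is spent waiting on connections to papers that would be dropped anyway.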
