SEC EDGAR Robots.txt

For a long time, a lot of the data in securities filings was hidden by obscurity. Sure, the SEC offered a full-text search of EDGAR filings, but it only covered the last four years. And if you actually tried to use it, the experience wasn’t very fruitful.

Here’s a search for “offer letter”:

[Screenshot: SEC EDGAR Full Text Search]

Not particularly illuminating.

However, something changed about a year ago. On May 1, 2012, the robots.txt file from the sec.gov website looked like this:

User-agent: *
Disallow: /Archives
Disallow: /Archives/bin
Disallow: /Archives/dev
Disallow: /Archives/etc
Disallow: /Archives/ftp
Disallow: /Archives/gopher
Disallow: /Archives/tmp
Disallow: /Archives/usr
Disallow: /cgi-bin
Disallow: /bin
Disallow: /oursite/previews
Disallow: /edgar/vprr

Source: Internet Archive

On the following day, the robots.txt file looked like this:

User-agent: *
Allow: /Archives/edgar/data
Allow: /Archives/edgar/vprr
Disallow: /Archives/bin
Disallow: /Archives/dev
Disallow: /Archives/etc
Disallow: /Archives/ftp
Disallow: /Archives/gopher
Disallow: /Archives/tmp
Disallow: /Archives/usr
Disallow: /cgi-bin
Disallow: /bin
Disallow: /oursite/previews
Disallow: /Archives/edgar/vprr/XXXX
Disallow: /Archives/edgar/vprr/vprr_removal
Disallow: /Archives/edgar/vprr/bin

Source: Internet Archive

The blanket Disallow: /Archives rule was dropped, and the big change was this new line, which allows search engines such as Google to index EDGAR data:
Allow: /Archives/edgar/data
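
To see the effect of that change, here is a minimal sketch using Python's standard urllib.robotparser module. The two rule sets are shortened excerpts of the files shown above, and the filing URL, the can_crawl helper, and the "Googlebot" agent string are made up for illustration. Python's parser applies the first matching rule, which may differ from how a particular search engine resolves conflicts, so treat this as an approximation rather than a description of Google's behavior.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the May 1, 2012 file: everything under /Archives is blocked.
OLD_RULES = """\
User-agent: *
Disallow: /Archives
Disallow: /cgi-bin
"""

# Shortened excerpt of the following day's file: EDGAR data is explicitly allowed.
NEW_RULES = """\
User-agent: *
Allow: /Archives/edgar/data
Disallow: /Archives/bin
Disallow: /cgi-bin
"""

# Hypothetical filing URL, used only to exercise the rules.
SAMPLE_URL = "https://www.sec.gov/Archives/edgar/data/0000000/sample-filing.txt"


def can_crawl(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print("old rules allow crawl:", can_crawl(OLD_RULES, SAMPLE_URL))
    print("new rules allow crawl:", can_crawl(NEW_RULES, SAMPLE_URL))
```

Under the old rules the sample URL is refused because it falls under /Archives. Under the new rules it is permitted because the Allow line matches first.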

Now you can use Google to search for an offer letter in the EDGAR archives. The results are much better than the SEC’s full-text search.
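
One way to run such a search is to restrict Google to the newly crawlable path with its site: operator. The sketch below only builds the search URL. The exact query string and the google_search_url helper are illustrative assumptions, not something the SEC or Google prescribes.

```python
from urllib.parse import urlencode

# Assumed query: an exact phrase, limited to the EDGAR data path.
QUERY = '"offer letter" site:sec.gov/Archives/edgar/data'


def google_search_url(query):
    """Build a Google search URL for the given query string."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Paste the printed URL into a browser to run the search.
    print(google_search_url(QUERY))
```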
