Found and liberated 2,151 missing DHS files

On January 18, 2017, the US Department of Homeland Security discontinued its Daily Open Source Infrastructure Report service, which it had run since October 2006. To enable researchers to study the content of these reports, I collected as many as I could find (2,151 PDF files) and released them to the Internet Archive. You can find them here: DHS Daily Open Source Infrastructure Reports 2006-2017

The PDF files came from the following URLs:

  • https://www.dhs.gov/sites/default/files/publications/
  • https://www.dhs.gov/sites/default/files/publications/nppd/ip/daily-report/
  • https://www.dhs.gov/xlibrary/assets/

And when these yielded 404 errors (which they did for most pre-2013 files), I used the Internet Archive itself, with the following URL base:

http://web.archive.org/web/20061101153326/https://www.dhs.gov/xlibrary/assets/[filename]
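For anyone who wants to repeat the collection, the lookup order described above can be sketched in Python. This is only an illustration, not the script I actually ran: the names `candidate_urls` and `fetch_first` are hypothetical, the order in which the three DHS locations are tried is an assumption, and the live DHS URLs may no longer respond at all.

```python
import urllib.error
import urllib.request

# The three DHS locations listed above (trial order is an assumption).
LIVE_BASES = [
    "https://www.dhs.gov/sites/default/files/publications/",
    "https://www.dhs.gov/sites/default/files/publications/nppd/ip/daily-report/",
    "https://www.dhs.gov/xlibrary/assets/",
]

# The Internet Archive URL base quoted above, used as the fallback.
WAYBACK_BASE = (
    "http://web.archive.org/web/20061101153326/"
    "https://www.dhs.gov/xlibrary/assets/"
)


def candidate_urls(filename):
    """Return the URLs to try for one file: DHS locations first, archive last."""
    return [base + filename for base in LIVE_BASES] + [WAYBACK_BASE + filename]


def fetch_first(filename, timeout=30):
    """Return (url, content) for the first candidate that is not a 404, else None."""
    for url in candidate_urls(filename):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return url, response.read()
        except urllib.error.HTTPError as err:
            if err.code != 404:
                raise  # only a 404 means "try the next location"
    return None
```

Calling `candidate_urls("DHS_Daily_Report_2006-10-11.pdf")` returns four URLs, with the Internet Archive one last; `fetch_first` performs the actual network requests.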

Files are named as they were upon download, in one of the following patterns:

  • DHS_Daily_Report_2006-10-11.pdf (most 2006-2012 files have this format)
  • DHS-Daily-Report-2012-12-06.pdf (a single December 2012 file has this format)
  • dhs-daily-report-2013-01-09 (most 2013-2017 files have this format)
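Because the three patterns differ only in case and separator, a single regular expression can recover the report date from any of them. The following Python sketch is one possible way to do that; the function name `report_date` is hypothetical, and treating the `.pdf` extension as optional is an assumption made so that the third pattern matches exactly as written above.

```python
import re
from datetime import date

# Matches the three naming patterns listed above.
# The ".pdf" suffix is optional (an assumption, see the lead-in).
_REPORT_NAME = re.compile(
    r"^(?:DHS_Daily_Report_|DHS-Daily-Report-|dhs-daily-report-)"
    r"(\d{4})-(\d{2})-(\d{2})(?:\.pdf)?$"
)


def report_date(filename):
    """Return the date encoded in a report filename, or None if it does not match."""
    match = _REPORT_NAME.match(filename)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Digits matched the pattern but do not form a real calendar date.
        return None
```

Sorting the collection by `report_date` gives a chronological listing regardless of which naming pattern a file uses.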

If you are interested in missing dates (for example, Archive.org was missing some dates and a few files were corrupted), this blog might be able to help fill in the gaps.