Al-Ahram Data

This is a full extract of the data in the Al-Ahram Digital Archive, with coverage from 1876 to 2020. Al-Ahram, founded in Egypt in 1875, is one of the most widely circulated Arabic newspapers in the Middle East.

Related: The Al-Ahram Digital Archive is the searchable web-based interface to the same information.

Data Set Details

Access Policy
Only current UW-Madison faculty, staff, students, and researchers in the United States can use the data.
Access Mode
Downloadable Files
Formats
Page images as Portable Document Format (PDF) files.
Optical Character Recognition (OCR) encoded Extensible Markup Language (XML) files.
Number of Records
648,515 records, each with a PDF image scan file and an XML OCR file.
Size of Data Set
1.1 terabytes of data
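
Because each record consists of a PDF page image and an XML OCR file, it can be useful to confirm that a downloaded copy is complete before working with it. The following is a minimal Python sketch, not a documented procedure for this data set. It assumes, without confirmation from the provider, that the two files of a record sit in the same folder and share a base name (for example, a hypothetical page1.pdf and page1.xml). The function name pair_records is likewise illustrative.

```python
from pathlib import Path


def pair_records(root):
    """Group the PDF and XML files under `root` into records.

    ASSUMPTION: the two files of a record share the same folder and
    base name and differ only in extension. Adjust the key if the
    downloaded files are organized differently.
    """
    records = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix not in (".pdf", ".xml"):
            continue  # ignore anything that is not a scan or an OCR file
        key = path.with_suffix("")  # full path minus the final extension
        records.setdefault(key, {})[suffix] = path

    # A complete record has both a .pdf and an .xml entry.
    complete = {k: v for k, v in records.items() if len(v) == 2}
    incomplete = {k: v for k, v in records.items() if len(v) != 2}
    return complete, incomplete
```

Calling pair_records on the download folder returns two dictionaries. If the naming assumption holds and the download is whole, the first should contain 648,515 entries and the second should be empty.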

Coverage

This data set includes over 600,000 pages of newsprint from the Egyptian newspaper Al-Ahram. Coverage includes all available published issues, beginning in 1876 and extending through November 2020. Content includes news articles, op-eds, features, entertainment and sports coverage, advertisements, photos, and illustrations.

Access Files Online