Data and Code Availability Policy
- Content and Scope
- Data and Software Citations
- Data Availability Statement
- Non-Public Data
- Version of Record
- Registration of Randomized Control Trials
- Ethics Approval
It is the policy of the American Economic Association to publish papers only if the data and code used in the analysis are clearly and precisely documented and access to the data and code is non-exclusive to the authors.
Authors of accepted papers that contain empirical work, simulations, or experimental work must provide, prior to acceptance, information about the data, programs, and other details of the computations sufficient to permit replication, as well as information about access to data and programs.
Data and programs should be archived in the AEA Data and Code Repository.* Authors will provide access to editors and reviewers, if requested, to both data and programs prior to acceptance. The Editor should be notified at the time of submission if access to the data used in a paper is restricted or limited, or if, for some other reason, the requirements above cannot be met.
If data or programs cannot be published in an openly accessible trusted data repository, authors must commit to preserving data and code for a period of no less than five years following publication of the manuscript, and to providing reasonable assistance to requests for clarification and replication.
The AEA Data Editor will assess compliance with this policy, and will verify the accuracy of the information prior to acceptance by the Editor.
For econometric, simulation, and experimental papers, the replication materials shall include (a) the data set(s), (b) a description sufficient to access all data at their original source location, (c) the programs used to create any final and analysis data sets from raw data, (d) the programs used to run the final models, and (e) a description sufficient to allow all programs to be run.
For papers collecting original data through surveys or experiments, the replication materials shall also include (f) survey instruments or experiment instructions, (g) computer code for experiment or survey collection mechanisms, and (h) original instructions and details on subject selection. See the supplementary Policy on Experimental and Survey Papers.
All source data used in the paper shall be cited, following the AEA Sample References. Citation of software packages is also encouraged.
A data availability statement covering both the source data and any derivative data shall be provided in the README file. It may additionally be provided as part of online appendices. The data availability statement shall provide detailed information on how, where, and under what conditions an independent researcher can access the original source data, as well as author-generated derivative data, and must be explicit and accurate about any restrictions, requirements, payments, and processing delays. The data availability statement shall provide information to assure the reader that the data are available for a sufficiently long period of time.
This policy, with the exception of item (a) above, also applies to papers that use data that cannot be published as part of a replication package or in an openly accessible trusted data repository. Examples include confidential data with identifying information of persons or businesses and data subject to data use agreements or copyrights that prohibit redistribution. When possible, a private (not to be published) version of the data should be provided to the AEA Data Editor and/or a designated third-party replicator who can provide a third-party reproducibility report.
Data: The data files may be provided in any format compatible with any commonly used statistical package or software. Authors are encouraged to provide data files in open, non-proprietary formats. Authors should ensure that a meaningful name or description (label) is available for every variable in the provided datasets. Codebooks or similar metadata should describe the allowed values and their meaning for each variable. It is acceptable to reference publicly available documentation for these items.
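As an illustration of the labeling requirement, the following minimal sketch checks that every variable in a CSV file has a non-empty description. The policy does not prescribe any tool or format for this check; the function name `unlabeled_variables` and the representation of a codebook as a plain dictionary are assumptions made for this example.

```python
import csv
import io


def unlabeled_variables(csv_text, codebook):
    """Return the variables in the CSV header that lack a description.

    csv_text : contents of a CSV data file, as a string
    codebook : dict mapping variable name -> label (hypothetical format)
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    # A label that is missing or only whitespace counts as unlabeled.
    return [name for name in header if not codebook.get(name, "").strip()]


if __name__ == "__main__":
    sample = "id,wage,educ\n1,10.5,12\n"
    labels = {"id": "Person identifier", "wage": "Hourly wage (USD)"}
    for name in unlabeled_variables(sample, labels):
        print(f"missing label: {name}")
```

Running a check of this kind before deposit is one way to confirm that no variable is left undocumented; labels stored inside a statistical package's own data format would need a different reader.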
Code: The programs may be provided in any format compatible with any commonly used statistical package or software. Should unusual or costly software be required, authors are required to notify the AEA Data Editor. A master script is strongly encouraged.
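The policy does not prescribe a language or layout for the master script. A minimal sketch in Python, assuming three hypothetical program files under a `code/` directory, might run every step in order and stop at the first failure:

```python
"""Hypothetical master script: runs each step of a replication package in order."""
import subprocess
import sys
from pathlib import Path

# Hypothetical file names; replace with the package's actual programs.
STEPS = [
    "code/01_build_analysis_data.py",  # raw data -> final and analysis data sets
    "code/02_estimate_models.py",      # final models
    "code/03_make_tables_figures.py",  # tables and figures
]


def build_command(step, python=sys.executable):
    """Return the command line used to run one step."""
    return [python, str(Path(step))]


def run_all(steps, runner=subprocess.run):
    """Run each step in order; raise at the first non-zero exit code."""
    completed = []
    for step in steps:
        result = runner(build_command(step), check=False)
        if result.returncode != 0:
            raise RuntimeError(f"step failed: {step}")
        completed.append(step)
    return completed


if __name__ == "__main__":
    run_all(STEPS)
```

A single entry point of this kind lets a replicator reproduce all results with one command. Packages written in Stata, R, or another language would use the equivalent construct in that language.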
As part of the archive, authors must provide a README file listing all included files and documenting the purpose, format, and provenance of each file provided, as well as instructing a user on how replication can be conducted. The README shall contain the data availability statement and proper citations for all data used.
The README shall follow the schema provided by the Social Science Data Editors' template README.
Common README formats are plain text (txt), PDF, and Markdown. The README file should not require proprietary software to view.
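Keeping the file listing in the README complete is easier when the listing is generated from the package itself. The following is a minimal sketch, not part of the template README: the helper names `list_package_files` and `manifest_lines`, and the one-line-per-file output format, are assumptions made for illustration.

```python
from pathlib import Path

# Placeholder used when the author has not yet described a file.
PLACEHOLDER = "TODO: describe purpose, format, and provenance"


def list_package_files(root):
    """Return all files under root as sorted, slash-separated relative paths."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


def manifest_lines(root, descriptions):
    """Build one 'path: description' line per file in the package.

    descriptions : dict mapping relative path -> text supplied by the authors
    """
    return [
        f"{name}: {descriptions.get(name, PLACEHOLDER)}"
        for name in list_package_files(root)
    ]


if __name__ == "__main__":
    # Print a draft listing for the current directory.
    for line in manifest_lines(".", {}):
        print(line)
```

Any line that still carries the placeholder marks a file whose purpose, format, or provenance remains undocumented.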
After the data and code deposit is accepted by the AEA Data Editor, it will become the version of record associated with the paper. Corrections and revisions are subject to the Policy on Data and Code Revisions.
It is the policy of the AEA that randomized control trials must be registered on the RCT Registry. All such registrations shall be cited in the title footnote and elsewhere in the paper as appropriate. Please see the RCT Registry policy.
If applicable, approval by ethics boards—the Institutional Review Board (IRB) in the United States and equivalent institutions elsewhere—should be demonstrated by including the name of the ethics board and any approval or exemption record number in the title footnote and the author disclosure statement(s). See the Disclosure Policy.
Detailed instructions for preparing and depositing replication packages are provided in the AEA Data Editor's step-by-step guide.
For more information, see Frequently Asked Questions.
*Other repositories and archives may be acceptable, as long as they are considered to be "trusted" archives or repositories; see guidance. The AEA Data Editor will assess the suitability of any such repositories and archives.
This version (September 2020) supplants all prior data policies.