Data and Code Availability Policy
The Econometric Society (ES) and its journals, Econometrica, Quantitative Economics, and Theoretical Economics, publish papers that include empirical, experimental, and/or simulation results only if the data and code used in the analysis are clearly and precisely documented and access to the data and code is not exclusive to the authors.
Authors of these papers must provide, prior to acceptance, raw data, codes, and sufficient documentation to permit the replication of all the results in the paper and in the appendices approved for publication online. They must also provide sufficient information to replicate the process of obtaining this raw data from the original sources and cite all the sources of data appropriately.
The journals of the ES will conduct reproducibility checks for the empirical, experimental, and/or simulation results included in the paper and in the approved online appendices prior to final acceptance.
Requests for an exemption from providing the materials described in this policy, or for restricting their usage, should be stated clearly when the paper is first submitted for review. It will be at the editors' discretion whether the paper can then be reviewed. Exemption requests will not be considered later in the review and publication process.
By submitting to any journal of the Econometric Society, authors indicate their acceptance of this Data and Code Availability Policy.
The Econometric Society and its journals Econometrica, Quantitative Economics, and Theoretical Economics endorse DCAS, the Data and Code Availability Standard [v1.0], and their data and code availability policy is compatible with DCAS.
The specific terms and implementation of the Econometric Society's Data and Code Availability Policy are as follows.
A Data Availability Statement (DAS) must be provided with sufficient detail for independent researchers to replicate the necessary steps to access the original data, including information on any limitations and the expected monetary and time costs associated with data access. When applicable, the DAS should also specify the version of the dataset and the original date of access by the authors. Similarly, the DAS should clearly indicate which datasets are included and excluded from the replication package. The DAS should be included as a section of the README file (refer to Rule 13 below).
The raw data utilized in the research, including primary data collected by the author and secondary data, must be included in the replication package. If the exact extract of the raw data used in the analysis is published in a trusted repository that satisfies the FAIR data principles (see the FAIR guidance), including the permanent identifier (e.g., DOI) linking to these raw data is considered sufficient to fulfill the obligation of including the raw data in the replication package.
When exemptions are granted, authors are required to comply with at least one of the following two procedures:
- Whenever possible, provide the ES with temporary access to the data affected by the exemption for the purpose of implementing reproducibility checks. It is the authors' responsibility to obtain permission from the data provider to confidentially share the data with the ES.
- Include in the replication package a synthetic or simulated dataset that enables users to execute the code and verify that it generates all outputs presented in the paper and appendices, even if the results differ from those in the paper. While including synthetic/simulated data is not required when the ES is given temporary access, it is still recommended, as it allows future users of the package to run the codes, increasing the publication's impact.
In either of these two procedures, the authors are expected to clarify the nature of the exemption in the DAS (see Rule 1 above).
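A minimal sketch of how a synthetic dataset of the kind described above might be produced, assuming the confidential data can be loaded as a list of records. The function names and the resampling scheme are illustrative assumptions, not a method prescribed by this policy. Each column is resampled independently from its own observed values, so the file keeps the variable names and value types the code expects while breaking the link between variables. Whether releasing resampled values is permitted depends on the data provider's terms, and this scheme does not by itself guarantee confidentiality.

```python
import csv
import random
from pathlib import Path


def make_synthetic(rows, seed=12345):
    """Return a synthetic copy of `rows` (a list of dicts with identical keys).

    Each column is resampled independently, with replacement, from its own
    observed values. Column names and value types survive; the joint
    distribution (and hence the paper's estimates) does not.
    NOTE: individual observed values still appear in the output, so this
    is an assumption-laden illustration, not a disclosure-control method.
    """
    rng = random.Random(seed)  # fixed seed so the synthetic file is reproducible
    if not rows:
        return []
    columns = list(rows[0].keys())
    pools = {col: [row[col] for row in rows] for col in columns}
    return [
        {col: rng.choice(pools[col]) for col in columns}
        for _ in range(len(rows))
    ]


def write_csv(rows, path):
    """Write `rows` to `path` as plain CSV with a header line."""
    if not rows:
        raise ValueError("no rows to write")
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```

A call such as write_csv(make_synthetic(rows), "data/synthetic/sample.csv"), where the path is hypothetical, would give users a file on which the full code can run, with results that are expected to differ from those in the paper.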
As a general rule, the analysis data does not need to be included in the replication package. However, if the data and codes provided with the package cannot fully reproduce these data within a reasonable time frame, the analysis data should be included in the replication package.
The data files must be in plain ASCII format, such as comma-separated value (CSV) format, or any other non-proprietary format so that they can be read by any researcher on any machine. Additionally, the authors may choose to submit data in a format that is read by specific programs, such as Matlab (.mat) files, Stata (.dta), or Excel (.xlsx) files, but a copy of these files in a non-proprietary format is required in every case.
A description of the variables included in the data and their allowed values must be made publicly accessible. A non-exhaustive list of instruments that can be used to fulfill this requirement includes labels in the dataset, comments in the code, easy-to-identify variable names, codebooks, and indications in the README file.
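One possible way to meet this requirement with a codebook is sketched below. The variables, labels, and allowed values are hypothetical, and the two helper functions are illustrative rather than required. The sketch stores each variable's description next to a rule for its allowed values, writes the codebook as a CSV file, and reports any values in the data that the codebook does not allow.

```python
import csv
from pathlib import Path

# Hypothetical variables; replace with the ones in the actual dataset.
CODEBOOK = {
    "age": {
        "label": "Age in years at interview",
        "allowed": "integer from 16 to 99",
        "is_valid": lambda v: isinstance(v, int) and 16 <= v <= 99,
    },
    "employed": {
        "label": "Employment status at interview",
        "allowed": "0 = not employed, 1 = employed",
        "is_valid": lambda v: v in (0, 1),
    },
}


def write_codebook(codebook, path):
    """Write one line per variable: name, label, and allowed values."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["variable", "label", "allowed_values"])
        for name, entry in codebook.items():
            writer.writerow([name, entry["label"], entry["allowed"]])


def find_violations(rows, codebook):
    """Return (row_index, variable, value) for every value the codebook rejects."""
    problems = []
    for index, row in enumerate(rows):
        for name, entry in codebook.items():
            value = row.get(name)  # a missing variable is reported as None
            if not entry["is_valid"](value):
                problems.append((index, name, value))
    return problems
```

Keeping the description and the validity rule in one place makes it less likely that the published codebook and the data drift apart.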
All data used in the paper and the approved online appendices must be appropriately cited in both the paper/appendices and in a dedicated references section of the README file. As a general guideline, citations of data employed in the paper should be included in the paper's references section, while citations exclusively pertaining to data used in the approved online appendices may be relegated to the appendix. However, in exceptional circumstances, such as when there is a large number of data sources to cite or when recommended by the handling co-editor, citations of data used in the paper may be included in a references section of the approved appendix. The citations included in the references section of the README file should follow the citation format specified by each journal, ensuring that references can be accurately indexed by bibliometric search engines.
All programs used to generate final and analysis data sets from raw data must be included, even if the raw datasets cannot be provided due to approved exemptions from Rule 2 above. When these codes generate simulated data, a seed must be specified to ensure that the code reproduces the sequence of random numbers used in the analysis.
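The seed requirement can be illustrated with a short sketch. The seed value and the function name are hypothetical. The generator is created locally from an explicit seed, so two runs with the same seed, under the same software version, return the same draws, and the result does not depend on any other code that uses random numbers.

```python
import random

SEED = 20230701  # hypothetical value; record the seed actually used in the README


def simulate_shocks(n_draws, seed=SEED):
    """Return `n_draws` standard normal draws from a generator seeded with `seed`."""
    rng = random.Random(seed)  # local generator: no hidden global state
    return [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
```

Because the sequence produced by a given seed can differ across versions of a language or library, reporting the software version alongside the seed is what makes the draws recoverable by others.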
Programs that produce computational results such as estimation, simulation, model solution, and visualization must be included. These programs should reproduce all the computational exhibits in the paper and approved online appendices with minimal human intervention.
In addition, if the programs require long execution times, authors are encouraged to provide simplified versions of the original codes to enable users to run and test manageable portions of the code, reproducing feasible results within a reasonable time frame. It is also recommended to include summary output files that contain the numbers generated by the analysis, which can be used by other readers to recreate the authors' figures, tables, or later-stage analyses. Authors of these papers are requested to collaborate with the Data Editor in implementing reproducibility checks on selected parts of these packages. The extent of these partial reproducibility checks must be documented in the README file.
Codes must be provided in source format that can be directly interpreted or compiled by appropriate software. Master files that run all the code from raw data to final results are strongly encouraged, and may be required when the number of scripts to run or human steps to implement is large. The codes must save all the paper's and authorized appendices' exhibits in a specified directory within the replication package. When the codes are written in compiled languages, precise instructions of all steps and compiling options must be included in the documentation. A make file that reproduces compilation steps is strongly encouraged. Software that does not allow generating output using scripts (e.g., ArcGIS) is discouraged. When this type of software is used, sufficient and very precise step-by-step instructions allowing users to exactly reproduce the generated outputs independently of the authors must be included in the README file.
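A master file of the kind encouraged above could look like the following sketch. The script names, the "code" and "output" directory names, and the convention of passing the output directory as the first command-line argument are all assumptions made for illustration. The function runs each step in order, stops at the first failure, and returns the names of the files saved in the output directory.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical script names, listed in the order they must run.
STEPS = ["01_clean_data.py", "02_estimate.py", "03_make_exhibits.py"]


def run_all(package_root, steps=STEPS, output_dir="output"):
    """Run every step from raw data to exhibits and list the files produced."""
    root = Path(package_root).resolve()
    out = root / output_dir
    out.mkdir(parents=True, exist_ok=True)
    for step in steps:
        script = root / "code" / step
        print(f"Running {script.name} ...")
        # check=True raises CalledProcessError at the first failing step,
        # so later steps never run on incomplete inputs.
        subprocess.run(
            [sys.executable, str(script), str(out)],
            cwd=root,
            check=True,
        )
    return sorted(p.name for p in out.iterdir())
```

Calling run_all(".") from the top directory of the package would then regenerate every exhibit with a single command, which is the "minimal human intervention" the rules above ask for.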
a. Replication package materials
If collecting original data through surveys or experiments, survey instruments or experiment instructions, as well as details on subject selection, must be included. Specifically, all the following documentation should be included in the replication package, regardless of whether or not some of this information is included in the paper or approved online appendices:
- The subject pool and recruiting procedures.
- The experimental technology – when and where the experiments were conducted; by computer or manually; online, and so forth.
- Any procedures to test for comprehension before running the experiment, including the use of practice trials and quizzes.
- Matching procedures, especially for game theory experiments.
- Subject payments, including whether artificial currency was used, the exchange rate, show-up fees, average earnings, lotteries, and/or grades.
- The number of subjects used in each session and, where relevant, their experience.
- Timing, such as how long a typical session lasted, and how much of that time was instructional.
- Any use of deception and/or any instructional inaccuracies.
- Detailed statement of protocols.
- Samples of permission forms and record sheets.
- Copies of instructions and slides/transparencies used to present instructions.
- Source code for computer programs used to conduct the experiment and to analyze the data. This does not include compilers (such as zTree) that are publicly available.
- Screenshots showing how the programs are used.
If any of these materials, such as experimental instructions, are not written in English, they should be provided in the original language and in English.
Reasonable judgment should be used. For example, if instructions for different sessions differ only slightly, then one sample of the instructions suffices, with the differences noted in a short accompanying document. These rules should also be understood to apply to surveys conducted by the authors. When the authors are not the primary source of the data, only the data, a statement of where it came from, and the programs used to process it are required – detailed documentation of the procedures used by the original data providers about how it was collected is not required.
The documentation provided in the replication package must be self-contained, regardless of the content included in the paper and approved online appendices, as the replication package is a different citable object than the paper. As a general rule, the replication package must be at least as exhaustive as the paper and approved appendices since there are no space constraints in the documentation provided in the replication package.
b. Initial submission of experimental papers
Information about experimental procedures is relevant to the decision of whether or not to publish a paper reporting results from laboratory and field experiments and researcher-conducted surveys. Therefore, the authors of such manuscripts should include with their submission sufficient material on the procedures to enable review. All the materials listed above are desirable and typically expected, but further detail about what is needed in each case can be obtained from the co-editor handling the paper. If, during the review process, the editor or referees feel additional information is needed, requests for that material will be made and may naturally cause a delay in processing. Hence, we encourage as complete a submission as feasible.
During the submission process, all these materials will be password-protected and available only to the editor and referees evaluating the manuscript with the understanding that the material will be used for the sole purpose of evaluating the submitted paper (and not, e.g., for research purposes). For manuscripts that are rejected, supplementary material will be removed upon request from the authors.
Authors who collect primary data, whether through experiments or surveys, must include evidence of ethics approval (e.g., from an institutional review board, IRB).
If applicable, pre-registration of the research must be identified and cited in the README file.
A README document in PDF format must be included in the replication package. The README must be a single file named README.pdf and must be included in the upper-most directory of the replication package, immediately accessible by users of the package. The README file must include the following information:
- A DAS with the information required in Rule 1 above.
- A description of the content of the replication package.
- Precise instructions on all the steps needed to run the codes and reproduce all the results.
- Detailed indications on where the outputs produced by the code are saved and on how to map each of these outputs to the exhibits included in the paper and approved online appendices.
- An indication of the software and hardware used in the package, including expected running time and specific requirements needed to successfully reproduce the results (e.g., software versions, libraries to be installed, etc.). When the requirements and execution time are heterogeneous across significant portions of the package, specific requirements and running times for each of the different parts must be indicated.
- Data citations, following the indications of Rule 6 above.
While not required, the use of the Social Sciences Data Editors' Template of README file is strongly encouraged.
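The software information requested in the README can be collected programmatically. The sketch below is one way to do so for a Python-based package; the function name is hypothetical and the list of libraries to report is supplied by the authors. It records the interpreter version, operating system, machine type, and the installed version of each named library.

```python
import platform
from importlib import metadata


def environment_report(packages=()):
    """Return a plain-text summary of the software environment.

    `packages` is a list of installed distribution names to report.
    """
    lines = [
        f"Python: {platform.python_version()}",
        f"Operating system: {platform.system()} {platform.release()}",
        f"Machine: {platform.machine()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)
```

The returned text can be pasted into the requirements section of the README. Expected running times and hardware details such as memory still need to be measured and reported by the authors.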
Replication packages of papers conditionally accepted after July 1st, 2023, in any of the ES journals must be deposited by the authors at the Econometric Society Journals' Community at Zenodo after the reproducibility checks are completed. Other repositories and archives may be acceptable for all or part of the replication package, as long as they are considered to be "trusted" archives or repositories, as noted in Rule 2 above. The ES Data Editor will assess the suitability of any such repositories and archives.
If data or programs cannot be published in an openly accessible trusted data repository, authors who requested an exemption from publishing them at the time of first submission must commit to preserving the data and code for a period of no less than five years following the publication of the manuscript. They should also provide reasonable assistance with requests for clarification and replication.
The README must clearly indicate any omission of the required parts of the package as a result of a granted exemption. The README must also indicate the reasons for such omission, such as legal requirements, limitations, or other approved agreements. In cases where the extent of the reproducibility checks impedes the exact reproduction of all the results in the paper and approved appendices, such as when synthetic datasets are provided (Rule 2) or partial checks have been implemented (Rule 8), the README must clarify which results have not been checked for reproducibility.