Requirements for Data Availability Policies to enable Replications
Posted: June 21st, 2012 | Author: Sven | Filed under: Data Policy, EDaWaX | Tags: criteria, Replication, WP2
In our analyses for work package 2 we collected some criteria to evaluate the quality of the data policies we found in our sample.
It was important to identify some core requirements that aim to ensure the replicability of economic research. This was not an easy task, because we had to find some criteria that are suitable for many fields of research in economics.
Therefore we consulted several research papers and used the recommendations we found in the papers as a basis for analysing and assessing the suitability of data availability policies of economics journals in our study.
We’d like to discuss these criteria with our readers. Feel free to submit comments or send me an e-mail.
We suggest dividing the recommendations into two groups according to their importance for facilitating replications. The first group comprises requirements that, in our opinion, are crucial for successful replication attempts:
- A data availability policy has to be mandatory.
- Besides obliging authors to provide datasets, a policy must also require code, programs and detailed descriptions of the data (e.g. in the form of a data dictionary). Authors have to submit the original data from which the final dataset is derived and all instructions/code necessary to reproduce the final results. A README file should list all submitted files with a description of each and indicate which file/dataset/program corresponds to which results in the paper.
- All required files have to be provided to the journal’s editors prior to the publication of an article.
- All submitted data and files (if not confidential or proprietary) must be made publicly available to interested researchers.
- A data policy has to have a procedure in place that allows other researchers, in principle, to obtain proprietary or confidential datasets.
Other requirements that may be treated as important, but not crucial, for the success of replication attempts are:
- All data has to be submitted in ASCII format or at least in open formats that facilitate long-term preservation as well as the interoperability of data and code. The submitted code should call these ASCII files.
- The versions of the operating system and the software used to obtain the results have to be stated, because results may differ seriously depending on the version of the operating system and software package used.
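To illustrate the last point, here is a minimal sketch of how authors working in Python might record the operating system and software versions alongside their results. The function name `describe_environment` and the layout of the returned dictionary are our own assumptions for illustration; none of the journals in our sample prescribes such a format.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages=()):
    """Collect OS and software versions to ship with a replication archive.

    `packages` is a hypothetical list of installed distribution names
    whose versions the authors want to document.
    """
    info = {
        "os": platform.system(),            # e.g. "Linux", "Windows", "Darwin"
        "os_release": platform.release(),   # release string reported by the OS
        "machine": platform.machine(),      # hardware architecture
        "python": platform.python_version(),
        "packages": {},
    }
    for name in packages:
        try:
            info["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap explicitly instead of silently skipping it.
            info["packages"][name] = "not installed"
    return info


# Print the record as JSON so it can be saved next to the README file.
print(json.dumps(describe_environment(), indent=2))
```

Saving this output together with the submitted files gives replicators a starting point when their results differ from the published ones.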
These seven recommendations served as the theoretical background for the analysis of the data availability policies in our sample. We checked every data policy we found for compliance with these requirements.
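The seven criteria can also be written down as a simple checklist. The following is only a sketch of how such a check could be coded, not the procedure we actually used in the study; the criterion identifiers and the function `assess` are hypothetical names chosen for this example.

```python
# Hypothetical short names for the five crucial criteria listed above.
CRUCIAL = (
    "mandatory",
    "data_code_and_documentation",
    "submitted_before_publication",
    "publicly_available",
    "procedure_for_restricted_data",
)

# Hypothetical short names for the two important (not crucial) criteria.
IMPORTANT = (
    "open_formats",
    "software_versions_stated",
)


def assess(policy):
    """Check one data policy against the seven criteria.

    `policy` maps a criterion name to True/False; a criterion that is
    not mentioned counts as not fulfilled.
    """
    missing_crucial = [c for c in CRUCIAL if not policy.get(c, False)]
    missing_important = [c for c in IMPORTANT if not policy.get(c, False)]
    return {
        "meets_all_crucial": not missing_crucial,
        "missing_crucial": missing_crucial,
        "missing_important": missing_important,
    }


# Invented example: a mandatory policy that lacks a procedure for
# restricted data and says nothing about file formats or versions.
example_policy = {
    "mandatory": True,
    "data_code_and_documentation": True,
    "submitted_before_publication": True,
    "publicly_available": True,
}
result = assess(example_policy)
```

Keeping the two groups separate mirrors the distinction made above: a policy that fails a crucial criterion is flagged differently from one that only lacks an important one.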
References:
Dewald, William G. / Thursby, Jerry G. / Anderson, Richard G.: Replication in Empirical Economics: The Journal of Money, Credit and Banking Project. In: The American Economic Review, Vol. 76, No. 4 (1986), pp. 587-603
McCullough, B.D.: Got Replicability? The Journal of Money, Credit and Banking Archive. In: Econ Journal Watch, Vol. 4 (2007), pp. 326-337
McCullough, B.D. / McGeary, Kerry Anne / Harrison, Teresa D.: Do economics journal archives promote replicable research? In: Canadian Journal of Economics, Vol. 41, No. 4 (2008), pp. 1406-1420
King, Gary: Replication, Replication. In: PS: Political Science and Politics, Vol. 28, No. 3 (1995), pp. 444-452
Graphic: http://www.cyberseraphic.com/