When managing your data, it is useful to keep detailed documentation in appropriate folders. A few examples of documentation are:
You do not need to archive all of these materials; however, as a researcher you will determine which documents are essential for re-use and secondary analysis. If you keep these documents clearly labeled in their folders, you will be able to assess and archive them quickly at the end of the project. Make sure to check with your archive or repository to determine which documentation is also required.
For long-term preservation, access, and reuse, consider formats that are open, platform-independent, and widely used in your field of research (see Directory of Metadata Standards). Whenever possible, consult your selected repository for its preferred formats for archiving data and documentation.
Examples of commonly used formats:
At the beginning of a research project, it is best to determine where and how your files and folders will be organized on any project-related drives. You may create a basic README file or graphic to inform project team members about this organizational framework.
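As a minimal sketch of this idea, the Python snippet below creates a folder structure and writes a plain-text README that describes it. The folder names, their descriptions, and the `create_project` function are hypothetical placeholders, not a recommended standard; substitute the layout your team agrees on.

```python
from pathlib import Path

# Hypothetical layout: folder path -> short description for the README.
LAYOUT = {
    "data/raw": "Original, unmodified data files",
    "data/processed": "Cleaned or derived data files",
    "documentation": "Codebooks, protocols, and other documentation",
    "scripts": "Code used to process or analyze the data",
}


def create_project(root):
    """Create the folders in LAYOUT under root and describe them in README.txt."""
    root = Path(root)
    lines = ["Project folder structure", ""]
    for folder, purpose in LAYOUT.items():
        # exist_ok=True makes the function safe to run more than once.
        (root / folder).mkdir(parents=True, exist_ok=True)
        lines.append(f"{folder}/ : {purpose}")
    readme = root / "README.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

Calling `create_project("my_project")` would create the folders and return the path to the README, which team members can open to see how the drive is organized.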
It is also essential to determine which team members should have access to specific files or folders and to set those permission levels. You may restrict access to users with admin status, or lock folders and files with password-only access. This will keep unauthorized staff out of specific files or folders. If you need assistance with this, please contact your project's or school's IT personnel. If you need help identifying whom to contact for support, contact a Data Services Librarian.
You can keep track of the research project data files and documents by creating naming conventions and versioning policies. These policies will make sure that each file is unique so no one has to wonder which “final_version” data file is actually the final version. Before the project starts, it would be best to clearly specify naming conventions and versioning policies in a README file or spreadsheet. Inform all research project team members of the policies and show them where the document is in case they need to refer to it in the future.
Naming conventions are simply a set way for labeling each document and data file. Different methods for labeling can include:
Versioning policies specify how you document changes to data files or documentation. Versioning can be specified in many ways, such as:
An example of a naming convention with versioning could be as simple as:
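One way to make such a convention concrete is to generate and check file names in code. The sketch below assumes a hypothetical pattern, `<project>_<description>_<YYYYMMDD>_v<NN>.<ext>`; the pattern and the helper names `build_name` and `next_version` are illustrations only, and your project may choose different elements.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<description>_<YYYYMMDD>_v<NN>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<description>[a-z0-9-]+)_"
    r"(?P<date>\d{8})_v(?P<version>\d{2,})\.(?P<ext>[a-z0-9]+)$"
)


def build_name(project, description, day, version, ext):
    """Assemble a file name and verify that it follows the convention."""
    name = f"{project}_{description}_{day:%Y%m%d}_v{version:02d}.{ext}"
    if not NAME_PATTERN.match(name):
        raise ValueError(f"name does not follow the convention: {name}")
    return name


def next_version(name):
    """Return the same file name with its version number increased by one."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name}")
    version = int(match["version"]) + 1
    return (
        f"{match['project']}_{match['description']}_"
        f"{match['date']}_v{version:02d}.{match['ext']}"
    )
```

For instance, `build_name("surveyx", "interviews", date(2017, 11, 22), 1, "csv")` returns `surveyx_interviews_20171122_v01.csv`, and passing that name to `next_version` returns `surveyx_interviews_20171122_v02.csv`. Because every name is checked against the pattern, a file called "final_version" would be rejected rather than silently accepted.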
Quality assurance looks different in different fields, but following good practices in your discipline will improve data management and save your team time.
Quality assurance may include access control procedures, activity logging, "training activities, instrument calibration and verification tests, double-blind data entry, and statistical and visualization approaches to error detection" (Ten Simple Rules for Creating a Good Data Management Plan).
For example, it might be necessary to manually track who worked with specific data (electronic or even physical samples), when they touched the data, and what they did with it. In this case, creating and updating a tracking sheet like this may be necessary so that everyone will be able to see changes made to the data throughout the project’s lifetime:
Date | Name | File Manipulation
11/22/2017 | John Smith | Added missing value labels to variable A, B, and C data
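If the tracking sheet is kept as a CSV file, entries can also be appended by a short script so that every row has the same columns and date format. This is only a sketch under that assumption: the column names come from the example above, while the file name and the `log_change` and `read_log` helpers are hypothetical.

```python
import csv
from datetime import date
from pathlib import Path

# Column headings taken from the example tracking sheet.
FIELDS = ["Date", "Name", "File Manipulation"]


def log_change(log_path, name, action, day=None):
    """Append one row to the tracking sheet, writing the header on first use."""
    log_path = Path(log_path)
    is_new = not log_path.exists() or log_path.stat().st_size == 0
    day = day or date.today()
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(
            {"Date": f"{day:%m/%d/%Y}", "Name": name, "File Manipulation": action}
        )


def read_log(log_path):
    """Return every logged change as a list of dictionaries, oldest first."""
    with Path(log_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

A call such as `log_change("tracking_sheet.csv", "John Smith", "Added missing value labels to variable A, B, and C data", date(2017, 11, 22))` would record the example row shown above. Appending rather than rewriting the file keeps earlier entries intact throughout the project's lifetime.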
Metadata is documentation or information that describes the structure, content, and layout of a data file. It can be presented in the form of a codebook that includes the data types, column locations, and coded values of each variable. It may also contain a frequency listing of variables, the questionnaire, and a description of the study design, methodology, sampling, data collection, and data quality. Robust metadata is essential for making raw data meaningful and reusable.
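To illustrate what a basic codebook contains, the sketch below reads a CSV data file and records, for each variable, its column position, a rough data type, any coded values you supply, and a frequency listing. It is a simplified illustration using only the Python standard library; the `build_codebook` and `guess_type` functions and the dictionary keys are hypothetical, and a real codebook would normally follow the metadata standard your repository requires.

```python
import csv
from collections import Counter


def guess_type(values):
    """Very rough type guess: integer, float, or string; blanks are ignored."""
    present = [v for v in values if v != ""]
    if not present:
        return "empty"
    for label, caster in (("integer", int), ("float", float)):
        try:
            for v in present:
                caster(v)
            return label
        except ValueError:
            continue
    return "string"


def build_codebook(csv_path, value_labels=None):
    """Summarize each column of a CSV file as one codebook entry."""
    value_labels = value_labels or {}
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        rows = list(reader)
    codebook = []
    for position, column in enumerate(columns, start=1):
        # Treat cells missing from short rows as blank.
        values = [row[column] or "" for row in rows]
        codebook.append(
            {
                "variable": column,
                "column_position": position,
                "data_type": guess_type(values),
                "coded_values": value_labels.get(column, {}),
                "frequencies": dict(Counter(values)),
            }
        )
    return codebook
```

Coded values cannot be inferred from the data alone, so they are passed in explicitly, for example `build_codebook("survey.csv", {"sex": {"1": "female", "2": "male"}})`, where the file name and labels are placeholders.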
A metadata standard is a metadata structure organized in a consistent, machine-readable format, which makes information about the data searchable and retrievable. Repositories usually specify the metadata standards they require for archived data documentation.