PhD Data Management Guide: How to Organize, Label, and Store Research Data Effectively
Written by Sara Bitar, PhD researcher at Johannes Gutenberg University of Mainz, Germany
When I started my Master's degree several years ago, I was taught that organizing the experiment is more important than running it. This has proved to be one of the most valuable pieces of advice I have been given. During your PhD, you will generate a vast amount of data, and knowing how to properly store and access that data ensures its integrity and reproducibility and makes collaboration easier. Here are some top tips for proper data management that will help you during your PhD.
Organization starts in the lab
1. Lab Book Notes
When performing an experiment, it is important to write down all the experimental information in your lab notebook. The following information needs to be included:
- Date: Record the exact date of the experiment.
- Sample Details: Log the specific samples being quantified, including origin and any relevant background, number of replicates, plate layout, etc.
- Reagent Information: Document all reagents used along with the supplier details (company name, catalog number, lot number).
- Protocol: Clearly reference the exact protocol followed, including any modifications or important procedural steps.
Note: Record anything that occurred in the experiment that could lead to abnormal results. This could include making an error, running short of material, or using a new reagent lot number.
2. Sample Labeling
During the experiment, it is important to properly label your samples. Ensure labels are clear and consistent for easy cross-referencing.
Carefully label each tube with:
- Date
- Initials (to identify the experimenter)
- Sample content (e.g., type of lysate - DNA / RNA / protein)
Record keeping beyond the experimental day:
Once you have completed the experiment, remember to keep a detailed record of where your samples and data are kept.
- Final Storage: After completing the quantification, place the samples into a designated storage box.
- Location Documentation: Write down the exact location of the stored samples in your lab notebook, noting the box number or storage coordinates.
- Results: Record the quantification measurements, ensuring they are linked to the corresponding samples and replicates.
Thorough record keeping such as this will ensure traceability and support reproducibility of your experiments.

Metadata
Metadata, often described as “data about the data,” encompasses all the contextual details that make your raw results interpretable and reproducible. This includes documenting reagents (catalogue and lot numbers), recording all calculations, and noting procedural steps directly alongside the experiment descriptions in your lab notebook. It’s also important to record technical details such as the instrument model, software version used for any analyses, and who performed each part of the experiment. These metadata elements will need to be included in the methods section of your thesis and publications, and will also prove invaluable during troubleshooting or collaborative work.
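If you also keep digital records, one way to hold on to these details is a small "sidecar" file saved next to each data file. The following is only a minimal Python sketch of that idea, not a prescribed method: the function name `write_metadata_sidecar`, the `.meta.json` suffix, and the field names are all assumptions chosen for illustration.

```python
import json
from pathlib import Path


def write_metadata_sidecar(data_file, instrument_model, software_version,
                           performed_by, reagents):
    """Save the metadata for one data file as JSON right next to it.

    The field names and the ".meta.json" suffix are illustrative
    assumptions, not an established standard.
    """
    data_file = Path(data_file)
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    record = {
        "data_file": data_file.name,
        "instrument_model": instrument_model,
        "software_version": software_version,
        "performed_by": performed_by,
        # One entry per reagent: name, supplier, catalogue and lot number
        "reagents": reagents,
    }
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Each reagent can be passed as a dictionary, for example with the keys `name`, `supplier`, `catalogue_number`, and `lot_number`, so that the file mirrors what is written in the lab notebook.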

Figure: Metadata of an immunofluorescence image
Digital data documentation
Labeling files consistently and descriptively ensures that you will be able to identify their contents later. For this reason, post-experiment documentation should cover all aspects of the experiment: the model you are working with, a proper date stamp, the type of assay, the samples, the target, and the replicates (the more details, the merrier). Apply this at every level, from the title of the main project folder down to the single data file from one replicate of an experiment. A file name then looks something like this:
Model_date_assay_samples_target_replicates
E.g., Mouse embryonic fibroblasts_20240101_qPCR_WT_vs_KO_mRNA_rp49_n_4
You might think this is too much detail, but this will help you locate the correct file when you need it.
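If you name many files, a small helper can keep the pattern consistent and catch missing fields. This is a minimal Python sketch under my own assumptions: the function `build_name` and its parameter names are hypothetical, and it simply joins the fields in the order Model_date_assay_samples_target_replicates.

```python
import re
from datetime import date


def build_name(model, run_date, assay, samples, target, replicates):
    """Join the fields as Model_date_assay_samples_target_replicates."""
    fields = [
        model,
        run_date.strftime("%Y%m%d"),  # date stamp in YYYYMMDD form
        assay,
        samples,
        target,
        f"n_{replicates}",
    ]
    cleaned = []
    for field in fields:
        # Collapse repeated whitespace and trim the ends so no stray
        # spaces slip in next to the underscores
        text = re.sub(r"\s+", " ", str(field)).strip()
        if not text:
            raise ValueError("every field must be filled in")
        cleaned.append(text)
    return "_".join(cleaned)


example_name = build_name(
    "Mouse embryonic fibroblasts", date(2024, 1, 1),
    "qPCR", "WT_vs_KO", "mRNA_rp49", 4,
)
print(example_name)
# prints: Mouse embryonic fibroblasts_20240101_qPCR_WT_vs_KO_mRNA_rp49_n_4
```

Writing the date as YYYYMMDD has the side benefit that files sort chronologically when sorted by name.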
Labeling data as soon as you get the results
Another tip for proper data storage is to label your data immediately. As soon as your data is generated (e.g., from qPCR or Western blot), take time to label it clearly and accurately.
Another point is to create a backup: always make a duplicate of your data. Label one file as “Raw” - this should remain untouched to preserve the original results. The second file can be labeled as “Labeled,” where you can annotate and process the data as needed.
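The duplication step can be scripted so that it happens the same way every time. Below is a minimal Python sketch, assuming you want the two copies saved next to the original file; the function name `make_raw_and_labeled` and the `_Raw` / `_Labeled` suffixes are illustrative choices. Marking the raw copy as read-only is an extra safeguard I have added, not a requirement.

```python
import shutil
import stat
from pathlib import Path


def make_raw_and_labeled(source):
    """Copy one data file into a "Raw" and a "Labeled" duplicate."""
    source = Path(source)
    raw = source.with_name(f"{source.stem}_Raw{source.suffix}")
    labeled = source.with_name(f"{source.stem}_Labeled{source.suffix}")
    for target in (raw, labeled):
        if target.exists():
            # Never overwrite an existing copy, especially not the raw one
            raise FileExistsError(f"{target} already exists")
    shutil.copy2(source, raw)      # copy2 also keeps the file timestamps
    shutil.copy2(source, labeled)  # this copy is the one to annotate
    # Assumption: remove write permission so the raw copy stays untouched
    raw.chmod(stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)
    return raw, labeled
```

Calling it on a file such as `blot.tif` would give `blot_Raw.tif` and `blot_Labeled.tif` in the same folder. Keep in mind that two copies on the same disk do not replace a proper backup elsewhere.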
For experiments like Western blots, use editing software to clearly label:
- each lane and corresponding sample;
- the ladder sizes;
- what each band represents (antibodies);
- what the membrane was probed with.
The labeled image can then be cross-referenced with your lab book, which makes it much easier to recall what each experiment was.
It is important to note that these habits have to be kept up daily for your data to stay consistent and reliable. If you take it day by day, it won’t feel like a chore, and your data will be easy to find and reproduce years later.
Top Tips:
- Back up everything and make sure the data remains on the research faculty premises.
- Always delete files from shared equipment as soon as possible.
- Create README files within each folder that contain extra comments on the experimental design, how the data was processed, and which researchers were responsible for each section.
- Avoid naming files “FINAL version”; use Version 1, Version 2, etc. instead. However sure you are that this really is the final one, you will end up with names like “final final final version.”
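The README and version-numbering tips can also be supported by a short script. This is a rough Python sketch under my own assumptions: the helper names `next_version_path` and `write_readme`, the `README.txt` file name, and the `name_Version N` pattern are hypothetical choices that you should adapt to your group's conventions.

```python
import re
from pathlib import Path


def next_version_path(folder, base, suffix):
    """Return the path for the next "base_Version N" file in a folder."""
    pattern = re.compile(
        rf"^{re.escape(base)}_Version (\d+){re.escape(suffix)}$"
    )
    versions = [
        int(match.group(1))
        for path in Path(folder).iterdir()
        if (match := pattern.match(path.name))
    ]
    # Start at Version 1 when no earlier version exists
    number = max(versions, default=0) + 1
    return Path(folder) / f"{base}_Version {number}{suffix}"


def write_readme(folder, design, processing, responsibilities):
    """Write a plain-text README covering the three points listed above."""
    lines = [
        "Experimental design:", design, "",
        "Data processing:", processing, "",
        "Responsible researchers:",
    ]
    # responsibilities maps each section of the work to a researcher
    lines += [f"- {section}: {person}"
              for section, person in responsibilities.items()]
    readme = Path(folder) / "README.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

For example, in a folder that already holds `figure1_Version 1.png` and `figure1_Version 2.png`, `next_version_path(folder, "figure1", ".png")` points to `figure1_Version 3.png`.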