background-image: url("data:image/png;base64,#")
background-position: 80% 80%
background-size: 50%

## Open Science: Data sharing

Slides by [**Esther Plomp**]( @ TU Delft, Faculty of Applied Sciences

License: CC-BY

---
class: center, middle

# Slides are available

###

---
class: inverse, center, middle

# Research Data Management

---

# Research data

Any type of information that is collected, observed, or created, in the context of research

--

- **Raw/primary data**: The data measured/recorded without manipulation

--

- **Processed/secondary data**: Data that has been modified/processed for analysis

--

**Data Management Plan (DMP)** to plan how to manage and share the data (see [The Turing Way for more information](

---

# Data Organisation

Image by [Allison Horst](

---

## File naming

- 20220113-PRES-Data-V001
- [8 step guide]( on how to set up your file naming convention
- [Presentation on file naming](
- [Stanford’s best practices](

--

## Folder structure

- Templates by [Colomb et al.](, [Nikola]( and [Barbara Vreede]( for [cookiecutter](
- [Find Files Faster]( How to Organize Files and Folders
- [Data Management: File organisation]( by Christine Malinowski
- [Videos on project structure]( by Danielle Navarro

---

# Data Organisation

.pull-left[
]

.pull-right[

## Spreadsheets

- [Spreadsheet organisation tips](
- [Broman and Woo 2018](
- [Wickham 2014](
- Use tools for data validation like [OpenRefine](

## Why? What could possibly go wrong?

- [a lot](
- See the [Twitterpost source](]
]

---

# Data Documentation

.pull-left[
]

.pull-right[

- (electronic) Labnotes
- Readme files ([template](
- [Guide for data documentation](
- [Data Dictionary](
- [Code Book](

#### More information

- Book: Data Management for Researchers by Kristin Briney
- [A Quick Guide to Organizing Computational Biology Projects]( by William Noble
- [Some Simple Guidelines for Effective Data Management]( by Borer et al.
]

.footnote[[Twitterpost source](]

---
class: inverse, middle, center

# Data Sharing

---

# Open Data

made freely available for use and re-use by anyone and everyone

--

✅ **Access** : Available (on the internet) to all on demand

--

♻️ **Reuse/distribution** : Provided under terms that permit reuse and redistribution

--

ℹ️ **Transparency** : Providing information about data generation/collection

--

🌎 **Interoperability** : Interoperability with other data, machine readable formats

--

🤲 **Participation** : Everyone must be able to use, reuse and redistribute

--

➕ **Equity** : Data is not truly open if the research process is not open to all

.footnote[
[#bropenscience is broken science]( by Kirstie Whitaker and Olivia Guest

[Open Science Beyond Open Access: For and with communities](]

---

# Not Open Data

.pull-left[
]

.pull-right[
'[odds of obtaining the dataset] fell by 17% per year'

‘research data cannot be reliably preserved by individual researchers’

- [Vines et al. 2014](

"We received no response to 41.3% of our data requests"

- [Tedersoo et al. 2021](
]

.footnote[[Meme explanation]( [Twitterpost source](]

---

# Open Data

.pull-left[
]

.pull-right[

## data repository

online archive that curates research datasets and provides long-term access

- Finalised datasets
- ~10-15 years (Long term preservation)
- Access
- DOI = more citations/visibility
- File format support

[How can you make research data accessible?]( by Esther Plomp]

.footnote[Image by [Errant Science](]

---

# Why not supplemental materials?

🛂 **Data control**: cannot be updated

--

🌐 **Interoperability**: not available in all formats which makes it difficult to integrate and interact with the data

--

✅ **Availability**: Difficult to access if the article is behind the paywall (supplemental materials are not included in the DOI and therefore the links can also break!)
--

🏆 **Impact**: Data should be a primary research output

--

🏮 **Publisher requirements**: Some publishers recommend using a data repository instead

--

[The Push to Replace Journal Supplements with Repositories](

---

# How to find a repository?

- Check publications in your field
- [FAIRsharing](
- [re3data](

General repositories:

- [4TU.ResearchData](
- [Zenodo](

---

# Data Licenses

.pull-left[
]

.pull-right[

## Data

[Creative Commons License Chooser](

## Software

[Choose an open source license](
]

.footnote[Image [Source]( CC-BY-SA]

---

# How to link your publication and data/code/protocols?

- Publish the output before you publish the article

OR

- Reserve the DOI

## Use the DOI/citation in your publication

Reference your data in the **Data Availability Statement** and the **References**

.footnote[
[The Turing Way: Linking Research Objects](
]

---

# Publishing or reserving a DOI

### [Zenodo]( -> [Upload]( -> [New Upload](

---

# Linking with publication

### Data accessibility/within table (descriptions)

### Data availability statements (at the end)

.footnote[
[Data accessibility source](; [Data availability statement source](
]

---

# Linking with publication

Always check the dataset's readme file or metadata to see how the contributors prefer to be cited!

See [this document]( for more information about data/software citation.

---
background-image: url("data:image/png;base64,#")
background-position: 50% 50%
background-size: 70%

# Linking data/code/publication

.footnote[
[Publication]( // [Data & Code](]

---

# Questions?

--

# Discussion?

--

### "publication of data and codes should be mandatory"

--

### "All data (also raw data!) and code underlying the publications should be shared"

---
class: center, middle

# Thanks!

Slides created via the R package [**xaringan**](