FAQ & Glossary
Learn more about various topics in the field of research data in our FAQs. You can find explanations for key terms in our glossary.
FAQs
General questions:
Research Data Services (RDS) is a collaborative working group made up of staff from the University Library and IT.SERVICES at RUB. We are responsible for evaluating and establishing a sustainable infrastructure and service for research data management.
We provide support in research data management and data handling for researchers at all career stages and in every stage of their research projects. We offer various services and tools for this purpose, including training sessions, consultations, and our research data repository.
On this page, you can find our contact information and get to know our team:
RDM stands for research data management. Research data management encompasses measures for organizing, documenting, and preserving all data used or generated in a research process. RDM is applicable to all disciplines and involves various types of research data defined by the disciplinary context. It ensures access, reuse, reproducibility, and the quality of all research data that forms the basis of scientific results.
For more information on research data management, you can refer to our glossary.
Research data encompasses all data generated, developed, or analysed in scientific work. It is defined by the disciplinary context and varies from one field to another. Examples of research data include measurement data, laboratory values, audiovisual information, texts, survey data, objects from collections, or samples. The task of research data management is to handle the collected data systematically.
For more information on research data, please refer to the entry for research data in our glossary and to the following sources:
The FAIR principles are designed to ensure sustainable research data management by preparing and storing data and associated metadata in a way that allows others to reuse them. FAIR data is, therefore, findable, accessible, interoperable, and reusable.
For more information on the FAIR principles and FAIR Data, refer to the glossary under the entry "FAIR Data".
DMP stands for data management plan. A data management plan accompanies the entire research process: it supports the planning and proposal stages and also governs processes such as archiving and data publication.
A detailed explanation of the DMP can be found in the glossary under the entry "DMP".
Practical Questions
The specific requirements for the content and structure of a proposal vary significantly depending on the funding agency, format, and discipline. Many funding agencies expect information on the handling of research data to be included in the proposal. For instance, a data management plan may be required.
A useful reference for the information to be included in proposals regarding research data management is the DFG Checklist. We provide a completion guide for this checklist.
On our page "Projects and Proposals" you can find additional information on grant applications. If you have any questions on this topic, please feel free to contact us. We are happy to assist you.
Is it a DFG proposal? Check the DFG Checklist for Handling of Research Data and our associated completion guide.
Many other funding agencies also provide similar guidelines. Information on the handling of research data includes management of data during the project stage (storage, documentation, responsibilities, etc.) as well as the publication, reuse, and archiving of the data. A data management plan includes details on these aspects of research data management and should be created during the planning process before the start of any research project.
To create a data management plan, you can use the RDMO tool. If you have further questions regarding data management plans and funding applications for research data, we are here to provide guidance.
Long-term preservation of research data is an important part of research data management. RUB offers various services for storing research data, such as our repository, Sciebo, the Fileservice/Network Drive, and the Backup Service. In a backup, all files are secured for emergencies, but only for a relatively short period of time. Archiving is suitable for the longer-term preservation of data.
If you have further questions regarding the storage and archiving of research data, we are happy to provide guidance.
Consistent naming and structuring of research data make it easier to keep track of data collections and experiments, and help prevent different versions of a dataset from being used by accident.
Various methods can be used for structuring data:
You can employ flat folder hierarchies (up to 3-4 levels), use descriptive names, and ensure clarity in the terms used. Maintaining file extensions (e.g., csv, tiff) is important. Additionally, the naming system and structure should be documented in a readme.txt file. If you regularly edit or supplement your data, versioning is recommended. Adequate methods include proper file naming or the use of version control software like Git. An example of versioning through naming is the three-level versioning: Major.Minor.Revision (e.g., 1.0.0).
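As an illustration of versioning through naming, the following minimal Python sketch builds a descriptive file name that carries a Major.Minor.Revision version and keeps the file extension. The naming pattern (project, description, version, extension) and the function name `versioned_name` are assumptions chosen for this example, not a prescribed convention. Whatever pattern you choose should be documented in your readme.txt.

```python
import re


def versioned_name(project: str, description: str,
                   version: tuple[int, int, int], extension: str) -> str:
    """Build a file name of the (hypothetical) form
    <project>_<description>_v<Major>.<Minor>.<Revision>.<extension>."""

    def clean(part: str) -> str:
        # Lower-case the text and replace every run of characters
        # other than a-z and 0-9 with a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")

    major, minor, revision = version
    return (f"{clean(project)}_{clean(description)}"
            f"_v{major}.{minor}.{revision}.{extension.lstrip('.')}")


print(versioned_name("Soil Study", "pH measurements", (1, 0, 0), "csv"))
# soil-study_ph-measurements_v1.0.0.csv
```

Avoiding spaces and special characters in this way is one option among several. The important point is that the scheme is applied consistently.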
Research Data Services offer training on using Git in the research process. If you have further questions, feel free to contact us.
Metadata is data used to describe research data. This form of documentation has the advantage of making data discoverable and comprehensible. The choice of metadata to be assigned during the research process depends on the subject area, research project, and funding source.
However, your metadata should at least answer the following questions:
Our services
Feel free to contact us! Together, we can find a suitable consultation appointment for you.
Each research community and project is unique, with its own specific needs. Funders also have individual expectations and may require information on the handling of research data as early as the application stage. For this reason, personalized advice in the field of research data management (RDM) is particularly important.
We can assist you with various topics, including:
Here you can find the schedule for our upcoming events and training sessions on research data management. If you wish to have a tailored training session on one or more RDM topics for your research project, please feel free to contact us. Additionally, we offer an introduction to various data management topics through a Moodle course (German).
Glossary
Long-term archiving (LTA) is intended to ensure the long-term usability of data over an undefined period. However, in many academic disciplines, a ten-year retention period for research data has become established as the standard. Because this period is characterised by constant technical and socio-cultural change, it is necessary to regularly review the data with regard to the preservation of its usability.
LTA aims to preserve the:
To prevent data loss, it is advisable to regularly create backups, preferably at a predetermined time.
Source:
RUB, in collaboration with IT.SERVICES, provides a backup service. Additional offerings, such as storage infrastructure and organizational tools for research data, are currently under development.
For more information on data security, please visit:
In order to ensure maximum reusability of scientific research data, which may in principle be subject to copyright, the granting of additional rights of use may be considered, e.g. by licensing the data accordingly. The use of liberal licensing models, in particular the globally recognised Creative Commons (CC) licences, is one way of defining conditions for the subsequent use of published research data in a comprehensible manner.
Source: https://forschungsdaten.info/praxis-kompakt/english-pages/glossary/#c403985
The file format (sometimes also referred to as file type) is generated during the storage of a file and includes information about the structure of the data within the file, its purpose, and affiliation. Using the information available in the file format, application programs can interpret the data and make the contents accessible. Typically, the format of a file can be identified by its corresponding extension appended to the actual file name, consisting of a dot and two to four letters.
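As a small illustration, the following Python sketch reads the extension from a file name using the standard library. The file names and the function name `file_extension` are made up for this example. Note that the extension is only a hint: it can be missing or wrong, so it does not prove what a file actually contains.

```python
from pathlib import Path


def file_extension(filename: str) -> str:
    """Return the extension without the dot, in lower case.

    Returns an empty string if the file name has no extension.
    """
    return Path(filename).suffix.lstrip(".").lower()


# Hypothetical file names, used only for illustration
for name in ["measurements.csv", "scan_001.TIFF", "readme.txt", "notes"]:
    print(name, "->", file_extension(name) or "(no extension)")
```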
Most file formats are designed for specific purposes and can be grouped based on certain criteria:
With file formats, a further distinction is made between proprietary and open formats. Proprietary formats allow files to be opened, edited, and saved only with the corresponding application, utility, or system programs (e.g., .doc/.docx, .xls/.xlsx). Open formats (e.g., .html, .jpg, .mp3, .gif), on the other hand, allow files to be opened and edited with software from various manufacturers.
File formats can actively be changed through conversion during the saving process, but this may result in data loss. In the scientific domain, attention should be paid to compatibility, suitability for long-term archiving, and lossless conversion to alternative formats.
For more information, please visit: forschungsdaten.info (German).
Source:
A digital object identifier (DOI) is a permanently valid identifier that uniquely identifies digital objects, allowing them to be referenced. DOIs are particularly useful for citing, for example, articles or datasets published in a repository. They remain constant over the entire lifetime of the designated object.
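To show how a DOI can be used for referencing, here is a minimal Python sketch that turns a DOI into a link via the public resolver at https://doi.org/. The DOI `10.1234/example-dataset` is a made-up placeholder, and the function name `doi_to_url` is an assumption for this example. The check that a DOI starts with `10.` and contains a slash is a rough plausibility test, not a full validation.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a DOI such as '10.1234/example-dataset' into a resolver link."""
    doi = doi.strip()
    # Accept DOIs that were copied with a common prefix
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Rough plausibility check only: prefix "10." and a "/" before the suffix
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"this does not look like a DOI: {doi!r}")
    # Percent-encode characters that are not safe in a URL path
    return DOI_RESOLVER + quote(doi, safe="/")


# Placeholder DOI, not a real dataset
print(doi_to_url("doi:10.1234/example-dataset"))
# https://doi.org/10.1234/example-dataset
```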
More information about DOI registration of research data: website of the University of Magdeburg.
Source:
- https://forschungsdaten.info/praxis-kompakt/glossar/
- http://www.fdm.ovgu.de/home/Kurz+erkl%C3%A4rt/Glossar.html#D
A data management plan (DMP) structures the handling of research data in a scientific project. It describes how to deal with the data used during and after the end of the project. Many third-party funding institutions (DFG, FWF, SNF, Horizon Europe, Volkswagen Foundation) expect information on the handling of research data to be included as part of a funding proposal for the allocation of funds from certain funding lines. A formal DMP is only required in the rarest of cases, especially by the EU. Nevertheless, a DMP is useful for working on a research project. In particular, the current status and special features can be recorded in a DMP throughout the entire research data life cycle. It is therefore helpful for administration and for maintaining an overview.
Source: https://forschungsdaten.info/themen/informieren-und-planen/datenmanagementplan/
The Research Data Management Organizer (RDMO) is a tool for research data management and supports you in the creation of data management plans.
FAIR stands for Findable, Accessible, Interoperable, Reusable. The FAIR principles are designed to ensure sustainable research data management by preparing and storing data and associated metadata in a way that allows third parties to reuse them. The principles apply to both data storage itself and to infrastructures and services, aiming to make research more transparent and efficient.
Key to implementation is the provision of comprehensive metadata, persistent identifiers, and clear usage licenses, ensuring that the data is well-prepared for both human and machine use. For additional information and tips on implementation, see here (in German).
This results in a discipline- and project-specific understanding of research data, with varying requirements for data preparation, processing, and management—referred to as research data management.
For more detailed information, please refer to Forschungsdaten.info and Digitale Zukunft.
Source:
- https://forschungsdaten.info/themen/informieren-und-planen/was-sind-forschungsdaten/#c502524
- https://forschungsdaten.info/praxis-kompakt/glossar/#c269824
- http://www.fdm.ovgu.de/home/Kurz+erkl%C3%A4rt/Glossar.html#F
Research data management involves a range of measures for organizing, documenting, and storing all data used or generated in a research process. Structured measures can be taken at all stages of the data lifecycle to maintain the scientific validity of research data, preserve their accessibility for third-party analysis, and secure the chain of evidence. Research data management is applicable to all disciplines and encompasses various types of research data defined by the disciplinary context.
In addition to increasing the visibility of one's own data and associated research, research data management enables:
Furthermore, sustainable research data management ensures compliance with requirements and standards from different disciplines, research funding, publishing bodies, and research ethics guidelines.
Source:
- https://forschungsdaten.info/praxis-kompakt/glossar/#c269824
- http://www.fdm.ovgu.de/home/Kurz+erkl%C3%A4rt/Glossar.html#F
- https://www.fu-berlin.de/sites/forschungsdatenmanagement/glossar/index.html#section_f
It is important to find a license that is appropriate for the type of material being published. The requirement to properly attribute authors when reusing an article, poem, or essay is deeply embedded in the norms of scholarly practice and serves as a means for users to appreciate and understand, in context, which parts of a work are original.
However, with data, there are often very good reasons to waive the obligation of attribution. Several prominent data portals for cultural heritage, such as Europeana, only accept data that is made available under the Creative Commons Zero license (CC0). The better a work's metadata can be combined with other data (linked open data), the more useful it is. Therefore, it is advisable to use the CC0 license for metadata, as otherwise, among other reasons, the chain of attributions can become very long.
It is crucial to consider licensing early in each step of the scholarly process.
The following points should be taken into account:
Source: https://forschungslizenzen.de/#lizenzen
Metadata refers to all additional information that is necessary or useful for interpreting the actual data, such as research data, and that enables the (automatic) processing of research data by technical systems. Metadata is often referred to as 'data about data'.
It serves to categorize and characterize different information about digital objects:
- Technical metadata, for example, includes details about data volume and data format and is crucial for long-term data storage.
- Descriptive metadata (also known as content metadata) provides information about (e.g. scientific) data within digital objects, influencing their findability, referencing, and reusability.
Descriptive explanations stored alongside (research) data, for example in the form of an abstract, are also valuable. This includes information on usage rights, equipment used, and applied standards, especially when no associated publication is available.
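The distinction between technical and descriptive metadata can be sketched as a small, machine-readable record. The following Python example is only an illustration: all field names and values are hypothetical and do not follow a particular metadata standard. In practice, the fields depend on your subject area and on the repository you use.

```python
import json

# All field names and values below are hypothetical examples.
record = {
    "descriptive": {
        "title": "Example soil pH measurements",
        "abstract": "Short description of what was measured and why.",
        "usage_rights": "CC0",
        "equipment_used": "Example pH meter",
        "applied_standards": ["Example standard"],
    },
    "technical": {
        "file_name": "soil-study_ph-measurements_v1.0.0.csv",
        "data_format": "csv",
        "data_volume_bytes": 20480,
    },
}

# Writing the record as JSON makes it readable for people and machines alike
print(json.dumps(record, indent=2))
```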
Source: https://www.forschungsdaten.org/index.php/Metadaten
"The aim of the national research data infrastructure (NFDI) is to systematically manage scientific and research data, provide long-term data storage, backup and accessibility, and network the data both nationally and internationally.
The NFDI will bring multiple stakeholders together in a coordinated network of consortia tasked with providing science-driven data services to research communities."
Source: https://www.dfg.de/en/research-funding/funding-initiative/nfdi
Open Data refers to data that can be used, disseminated, and reused by third parties for any purpose, including information, analysis, or even further economic use. Restrictions of use are only allowed to preserve the origin and openness of knowledge; for example, CC-BY may be used to ensure attribution to the creator. The concept of open data is based on the idea that allowing free reuse promotes greater transparency and encourages collaboration.
More information about Open Data can be found here.
Source:
Open Science is a frequently used, but rarely specified term. It describes the approach of making transparent not only the results of scientific work but the entire process. The results of publicly funded research should, whenever possible, be made available worldwide for free, without legal or technical barriers on the internet, and made reusable. The easier research results can be found and accessed, the better they can serve as a foundation for further research activities. Under the umbrella term Open Science, strategies and practices are encompassed that describe the transformation in research methodology, organizational and content-related aspects of teaching, publishing, information and literature supply, as well as the preservation of research data.
This includes the following subpoints:
The open access to knowledge and science, including publications, data, and software, aims to achieve greater transparency, efficiency, visibility, and, consequently, an improvement in quality and an increase in trust in science and research. The establishment of Citizen Science projects at the university promotes a participatory research approach.
Source: TU Berlin
The Open Researcher and Contributor ID (ORCID iD) is an internationally recognised persistent identifier that can be used to uniquely identify researchers. The ID is publisher-independent and can be used permanently by researchers for their scientific output, regardless of their affiliated institution. It consists of 16 digits, which are represented in four blocks of four (e.g. 0000-0002-2792-2625). The ORCID iD is established as an ID at numerous publishers, universities and science-related institutions and is integrated into workflows, e.g. when reviewing journal articles.
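The block structure described above can be checked automatically. The following Python sketch tests the format and the check character of an ORCID iD. Please note that the checksum rule (ISO 7064 MOD 11-2, where the last character may also be an "X") is not described in this glossary and is taken from ORCID's public documentation. The function names are assumptions for this example, and a valid checksum does not mean that the iD is actually registered.

```python
import re

# Four blocks of four characters; the last character may be a digit or "X"
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_character(first_15_digits: str) -> str:
    """Compute the check character with the ISO 7064 MOD 11-2 scheme."""
    total = 0
    for digit in first_15_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the block format and the check character of an ORCID iD."""
    if not ORCID_PATTERN.match(orcid):
        return False
    characters = orcid.replace("-", "")
    return orcid_check_character(characters[:15]) == characters[15]


print(is_valid_orcid("0000-0002-2792-2625"))  # True
print(is_valid_orcid("0000-0002-2792-2626"))  # False (wrong check character)
```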
Source: https://forschungsdaten.info/praxis-kompakt/english-pages/glossary/#c403990
To be able to find, identify, and cite electronically published data of any kind reliably and permanently, you need a persistent identifier. A persistent identifier is a unique label for a specific digital object and remains constant, even if the name or location of a publication changes.
Examples of persistent identifiers include URN (Uniform Resource Name), DOI (Digital Object Identifier), ORCID, ResearcherID (Thomson Reuters), Scopus Author ID, GND number (Gemeinsame Normdatei number), Google Scholar Citations Profile, ISNI (International Standard Name Identifier), ISBN and ISSN.
Digital Object Identifiers (DOIs) make online publications citable.
In contrast to the more transient URL address, the DOI serves as a persistent identifier. The university library assigns DOIs to digital objects (e.g., research materials, articles, digitized content, images) belonging to university members.
Source: Uni Bamberg
RDMO is a tool for creating data management plans. The aim of RDMO is to plan, control and document the handling of data in your scientific project in a structured way. In addition, the collected information can be provided in the form of a report or data management plan. RDMO thus simplifies the submission of applications to research funding organisations such as the EU, DFG and BMBF.
If you have any questions or problems with RDMO UA Ruhr, please refer to the instructions or contact us directly.
A repository is a storage location for digital objects. In addition to repositories for software and text documents, there are also repositories for research data specifically. These repositories are used for publishing and usually for archiving data as well. Most data repositories collect metadata in a searchable database and offer the option of generating a permanent identifier (e.g. a DOI) and issuing a license when uploading a file. Repositories are either public or accessible to a restricted group of users only.
Source:
- https://www.fdm.ovgu.de/en/
- https://forschungsdaten.info/praxis-kompakt/english-pages/glossary/#c403894
Versioning allows tracking changes and, if necessary, reverting them. This is particularly advantageous when files are regularly edited or updated. Recommended methods for versioning include proper file naming conventions or the use of version control software such as Git. An example of versioning through naming is three-tier versioning: Major.Minor.Revision (e.g., 1.0.0). Versioning should be applied both during the research process itself, for marking different working versions of data, and after subsequent modifications to already published research datasets. This allows users to cite the correct version of a research dataset.
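The three-tier scheme can be handled programmatically, for example when a new version of a dataset is published. The following Python sketch parses a version string and raises one of the three levels. It assumes a common convention in which raising a level resets the lower levels to zero. This convention and the function names are assumptions for this example and are not prescribed by the scheme itself.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Split a string such as '1.0.0' into (major, minor, revision)."""
    major, minor, revision = (int(part) for part in text.split("."))
    return major, minor, revision


def bump(version: str, level: str) -> str:
    """Raise one level of a Major.Minor.Revision version.

    Assumed convention: lower levels are reset to zero.
    """
    major, minor, revision = parse_version(version)
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "revision":
        return f"{major}.{minor}.{revision + 1}"
    raise ValueError(f"unknown level: {level!r}")


print(bump("1.0.0", "revision"))  # 1.0.1
print(bump("1.0.1", "minor"))     # 1.1.0
print(bump("1.1.0", "major"))     # 2.0.0

# Parsed versions compare numerically, so 1.10.0 is newer than 1.9.3
print(parse_version("1.10.0") > parse_version("1.9.3"))  # True
```

Comparing the parsed numbers rather than the raw text avoids sorting "1.10.0" before "1.9.3".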
Source: