This section is aimed at those ASSEMBLE Plus marine stations that have, or want to have, datasets catalogued in the ASSEMBLE Plus collection in the Integrated Marine Information System (IMIS), and/or want to archive their data in the Marine Data Archive (MDA).
As part of WP 4 of the ASSEMBLE Plus project, it has been agreed that long-term ecological datasets (LTEDS) and omics datasets from our marine stations will be catalogued and made accessible via a single access point: the ASSEMBLE Plus collection in IMIS. We stress that these datasets do not belong to ASSEMBLE Plus; rather, we are advertising them and, ideally, providing direct access to them for the marine scientific community. Additionally, the subset of these datasets that are interoperable (i.e. have a standard format for the data and metadata) will be accessible and analysable via a set of ASSEMBLE Plus VREs (virtual research environments).
To kick-start this process, the datasets that were already catalogued in IMIS and came from the ASSEMBLE Plus marine stations became the ASSEMBLE Plus collection in IMIS. These records are searchable via the data catalogue search page. These records were the subject of the workshop on FAIR Data Management for Long-term Biological and Omics datasets from Marine Stations that is mentioned near the top of this page. Working together, VLIZ (WP 4 lead) and the ASSEMBLE Plus marine stations are in the process of making these data records FAIR. Those datasets that are, or can become, interoperable will be a particular focus of this work.
In this section we give advice about the FAIR data management of these datasets with these particular purposes in mind. Each marine station will have its own data management procedures, and what is stated here is not intended to replace those procedures.
Data management advice
- Read the ASSEMBLE Plus Data Management Plan (the latest version will always be on the work package reporting page) – this describes the principles that have been agreed upon for data management in ASSEMBLE Plus, and which will therefore be useful for you to know about.
Note: the requirement that ASSEMBLE Plus data are Open Access does not apply to the datasets being discussed here, i.e. those which the marine stations produce outside of the ASSEMBLE Plus project. While we would like to encourage everyone to follow this policy, we will not enforce it.
- Ensure that as you create your data (raw, processed, final), you follow the data management principles of your station/institute, and in particular those concerning: the metadata you include with the data; the vocabularies you use in the (meta)data; the formatting of the data in your datafiles; the types of datafiles to provide.
- Archiving your data: it is good policy to archive data in a public, community archive. This is the very first step towards making your data FAIR and ensuring they are never lost to science. ASSEMBLE Plus is using the Marine Data Archive (MDA) as a common archive to ensure the long-term preservation of our marine data, although archiving elsewhere is also allowed (especially where that is more suitable for a discipline). In principle, the raw data should always be archived – the raw data are the foundation of your scientific research – as should the final version of the data: cleaned, rich with metadata, and ready for (re)use. However, as the formats, sizes, and usability of raw and processed data vary from project to project, this is something that you will need to decide on a case-by-case basis.
- Creating a metadata record in a data catalogue for your archived data is a must if you want your data to be Findable. Note: findable does not mean immediately accessible, since you can set the use licence in the metadata record. ASSEMBLE Plus is cataloguing its data in the ASSEMBLE Plus collection of the Integrated Marine Information System (IMIS). An IMIS metadata record can point to whichever version of the data you wish to make public (raw, final, data in the MDA, data archived elsewhere...).
To add data to the MDA
Anyone can archive their data in the MDA. If you wish to do so for your institute's (or your own) datasets, please read the advice here. For technical questions, send an email to mda@vliz.be; for other questions, send an email to data@embrc.eu.
First, however, who should not read this section?
- If you do not want to archive your data in the MDA, skip to "To create an IMIS record"
- If you do want to archive your data in the MDA, but it is a very small number of datasets and you won't do this again, then you can instead create an IMIS record and at the same time upload a dataset that we can archive in the MDA for you. Also skip to "To create an IMIS record"
- It is nonetheless useful to read this section, so you know what it is that we will do for you. The sections "General advice about organising files" and "Making data public" will be especially useful.
The steps to follow when adding data to the MDA yourself
- To add your data to the MDA, you first need an account: register here. If you wish to add your data to the ASSEMBLE Plus collection, you will need to request access to the "ASSEMBLEPLUS" and "ASSEMBLEPLUS public" folders in your registration form. If you wish to archive your data elsewhere (e.g. a folder for your institute), then say that instead. Which you choose depends on how heavily you will use the MDA as an archive. It is in any case always possible to move, copy, or link files across folders.
- Read the policy document and the user manual, both of which can be found in the main menu of the website. The MDA is not particularly difficult to use.
- When you log on, you will see the folders you have access to: your personal folder (a default) and e.g. the ASSEMBLEPLUS folder. There is also an "ASSEMBLEPLUS public" folder, and its purpose is explained below. If you have an institute folder, you will see that one instead (or as well).
- If you want to store your data in ASSEMBLEPLUS, please use its MarineStations subfolder. If you are storing your data in your institute's folder, note down where you have put the data, for later reference.
- If you have many datafiles to archive, you may consider making additional subfolders to hold them in.
- Upon initial upload of your datafile, it will go into a "quarantine" folder.
- After uploading a file you will be required to fill in a metadata form. If you need to upload multiple files that can have the same metadata, you can "apply a template", as described in the manual, to avoid having to create the same metadata form multiple times.
- Once you have filled in all the mandatory parts of this metadata form, your file will move to the folder you have indicated for it.
- Note: the metadata you add in the MDA are not added to the IMIS metadata catalogue. These MDA-metadata are rather used to help the search system of the MDA itself, so users can find data within the MDA (i.e. when logged on). These metadata are not visible outside the MDA, and so are not public.
- Your files are now in the MDA but they are not public. To make them public, see below.
General advice about organising files
- Please keep your datafiles understandable, e.g. give the files meaningful names (what is in the file, whether it is raw or processed, what version it is, what institute or project owns it, etc.; some file-naming advice can be found in the DMP [go here]), and keep related files together in a subdirectory or zipped up (it is acceptable to zip up files that belong together and upload the zip file).
- Consider including a README file in each (sub)directory/zip file to describe what is in there, and what each file contains – think of this as a user guide to your files.
- When a user downloads your data, they do not get the metadata record together with the data: neither the one you created in the MDA, nor the one you will later create in IMIS (although clearly they can also print the IMIS record). Therefore, it is good practice to also include metadata with your files, either inside the datafiles or as a separate file. In the latter case, the ideal is a CSV file in which the necessary information is tabulated, but a text file (.txt) is acceptable. PDF or Word documents are not acceptable.
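The advice above (descriptive file names, a README, and a separate metadata CSV, zipped together) can be sketched in Python. This is only an illustration under stated assumptions: the name pattern (institute, dataset, raw/processed, version), the two-column metadata layout, and the function names `build_filename` and `package_dataset` are all hypothetical and are not prescribed by the DMP or the MDA.

```python
import csv
import zipfile
from pathlib import Path


def build_filename(institute, dataset, stage, version, ext="csv"):
    """Compose a descriptive name: owner, content, raw/processed, version.

    The pattern is an assumption made for this sketch, not an official rule.
    """
    parts = [institute, dataset, stage, f"v{version}"]
    # Replace spaces with hyphens so the name is safe on any file system.
    return "_".join(p.strip().replace(" ", "-") for p in parts) + f".{ext}"


def package_dataset(data_file, metadata, readme_text, out_dir):
    """Zip a data file together with a metadata CSV and a README.txt."""
    data_file = Path(data_file)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Write the metadata as a simple two-column table (field, value).
    meta_path = out_dir / (data_file.stem + "_metadata.csv")
    with meta_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["field", "value"])
        writer.writerows(metadata.items())

    # The README acts as a short user guide to the files in the archive.
    readme_path = out_dir / "README.txt"
    readme_path.write_text(readme_text, encoding="utf-8")

    # Bundle everything so that a single download is self-standing.
    zip_path = out_dir / (data_file.stem + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in (data_file, meta_path, readme_path):
            zf.write(path, arcname=path.name)
    return zip_path
```

A call such as `build_filename("StationX", "plankton counts", "raw", 1)` gives `StationX_plankton-counts_raw_v1.csv`; the resulting zip file can then be uploaded to the MDA as a single file.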
Making data public
In order for datafiles in the MDA to be open access, i.e. so that they can be downloaded directly from the IMIS record (see below for instructions on creating this record), it is necessary that the files are placed in a "public" folder. The ASSEMBLEPLUS folder is not public, and normally if your institute has a folder it will also not be public. For data in ASSEMBLEPLUS, the public folder is "ASSEMBLEPLUS public". The data that you placed in "ASSEMBLEPLUS" or your institute's folder and which you wish to make public, should now be moved or copied to "ASSEMBLEPLUS public/MarineStations", unless your institute does indeed have a public folder and you are allowed to use that.
Bearing in mind that the next step is to create a metadata record that will link to the data, you need to consider whether you want to create a metadata record for a single file, or for multiple files. Which to do depends on your datafiles, and this can only be decided by you. Consider the general advice that a single data download should be self-standing, i.e. all the information that a user requires, to understand what is in the data and how (far) they can be used, should be present. If you have a data file plus a separate metadata file and a README file, then clearly you have multiple files that are associated with this metadata record. (You can, of course, zip these up into a single file.)
- Single: Once you have placed your single file in the public folder, note down its link: do this by clicking on the file and then the "Show metadata" icon above the file listing. On the webpage you then see, copy the value in the field "Direct link"
- Multiple: Multiple files can be attached to a single metadata record in two ways:
- Place all your files in a single folder that contains only those files, and note down the name of that folder and the number of files therein, so you can include this information when creating your metadata record
- Link your files together using a "fileset" (this is explained in the MDA manual). This fileset will sit in the folder you made it in, and when you click on the file (in the right part of the MDA view) you will be able to select the icon "Show metadata" (located in the icon bar above the file listing) to see its metadata. A new page opens: from there copy what is written in the "Direct link" as this will need to be specified when you create your metadata record
- Raw vs processed: do you offer the raw or the processed data, or both, to be downloaded? There is no general advice here, as it depends on what the difference between the two is, how useful the raw data are, how big they are, and so on. It is possible to "fileset" link raw and processed files in the MDA, and so offer both to be downloaded, and it is also possible to specify more than one URL for the data download link if the data are archived elsewhere.
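For the multiple-files option in which all files sit in one folder, the folder name and the number of files have to be noted for the metadata record. A minimal Python sketch, assuming you keep a local copy of the folder with the same contents as the one in the MDA (the function name `summarise_folder` is hypothetical, and nothing here talks to the MDA itself):

```python
from pathlib import Path


def summarise_folder(folder):
    """Return the folder name, file count, and sorted file names.

    Subfolders are ignored: only files directly inside `folder` are counted.
    """
    folder = Path(folder)
    files = sorted(p.name for p in folder.iterdir() if p.is_file())
    return {"folder": folder.name, "n_files": len(files), "files": files}
```

The returned folder name and count are the two values to carry over into the metadata record; the file list is a convenient check that no stray files are present.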
To create an IMIS record
An IMIS metadata record should be created for all data that you want to make public. If the data you wish to create the record for are located in the MDA, follow the steps explained above. But it is also possible to create a metadata record for data archived elsewhere, if that is more appropriate.
Creating an IMIS record is done via this webform. It is pretty straightforward, and each field has an "i" button which you can click for more information and examples. A few comments specifically for ASSEMBLE Plus:
- Although formally not all the fields in the webform are mandatory, we request that you consider them to be, and only leave blank those that cannot apply to your type of data. Use the picklists offered rather than entering values free-form.
- In the description sections, please describe the dataset(s) the record is for, not the overall project it was created under: the circumstances of collection, creation, collation of the data should be described, and if not otherwise explained in a publication, the provenance should also be described.
- Please remember to select a licence – CC BY is for open access – and to include full contact information. If you are offering access upon request, make sure you have indicated a name and email address for the dataset contact (in the "People" section of the webform).
- To deal with the issue that people move on with time but data remain in an archive forever, consider using an alias as your contact email, and ensure that there is always someone on the receiving end of that alias.
- In "The dataset" section of the webform, you have a choice of: giving the URL(s) to the data, saying where they are located in the MDA, or uploading a dataset via the webform itself. This final option is intended for those who are archiving very little data and just the once. If your data are archived in the MDA, see the advice given above as to what MDA file or folder names or URLs you need to include here. If your data are located in multiple other archives (e.g. in different formats, or if you have one link for a data downloader and another link for a data explorer), you can list all of them in the space on the form. If you are uploading data via the webform, note the 10 Mb limit. Larger files can be downloaded by us if you make them available (e.g. via Dropbox or WeTransfer). Please zip files before transfer.
- If the data you wish to make public are already published in an IPT, IMIS can collect the necessary information from that IPT. However, ideally you would still fill out this webform, including the minimum information, and add a note that we should use "this IPT [URL]" to harvest the necessary information.
- In the "Projects" section, indicate at a minimum "ASSEMBLE Plus".
- In the "Data Integration" section: if you want your dataset to contribute to any of the data systems listed there (in which case, the data will need to be in an interoperable format), click on its button and you will be contacted to start the integration process.
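The upload limit mentioned for the webform can be checked before you start filling in the form. The sketch below is an illustration only: it assumes that "10 Mb" means 10 megabytes (10 × 1000 × 1000 bytes), and the constant `UPLOAD_LIMIT_BYTES` and the function `transfer_route` are hypothetical names, not part of IMIS or the webform.

```python
from pathlib import Path

# Assumption: the webform's "10 Mb" limit is read as 10 megabytes.
UPLOAD_LIMIT_BYTES = 10 * 1000 * 1000


def transfer_route(path, limit=UPLOAD_LIMIT_BYTES):
    """Suggest how to hand over a (zipped) file, based on its size on disk."""
    size = Path(path).stat().st_size
    # Files at or under the limit can go through the webform; larger ones
    # need to be made available by another route (e.g. a file-transfer service).
    return "webform" if size <= limit else "external transfer"
```

Running this on the zip file you intend to submit tells you whether the webform upload is an option, or whether you should instead make the file available for download.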
Once you have submitted the webform, you will get a confirmation email with a copy of the webform. After some potential back-and-forth over email, your record will be created and you will be sent the "dasid" of your record. Please then check your record to ensure that it is as complete as you wanted it to be!
IMIS also offers the option to create a DOI for your data. This can be requested via a check box on the webform. Please note that more rigorous checking of the metadata and the data formats is necessary to receive a DOI.
If you wish your data to be incorporated in EurOBIS or WoRMS, check the relevant box at the bottom of the webform. For this incorporation, your data need to be in a standard and interoperable format, and hence this may require additional work on your data on your side. However, VLIZ can offer advice and tools to help with this process.