Frequently-asked questions about EMPIAR

What is the difference between EMDB and EMPIAR?

The final 3D reconstruction from a single-particle, electron tomography or electron diffraction experiment can be deposited to EMDB. Image data related to EMDB entries, e.g., micrographs, particle-stacks and tilt-series can be deposited to EMPIAR. EMPIAR also supports the following experimental methods (taking image data and final reconstructions where available):

  • 3D Scanning Electron Microscopy (3DSEM)
  • Soft X-ray Tomography (SXT)
  • 2D EM image data from integrative/hybrid modelling experiments

For these experimental methods we require that the data set is directly related to a publication or is a part of a community initiative, e.g. as benchmarking data.

How do I download?

  • You can either download data using a web interface or via a command line interface
  • To download via the web interface:
    • Install Aspera Connect and its web browser plugin. If you have not already done so, you will be presented with a link that takes you to the Aspera Connect installation site
    • Once you have installed Aspera Connect and the browser plugin, you can download any of the datasets shown in the table by clicking on the corresponding download icon shown in column 1
  • Instructions on downloading via a command line interface can be found here
  • Instructions on downloading via Globus can be found here
  • Instructions on downloading via ftp can be found here

How do I install the Aspera Connect client?

  • Download Aspera Connect for the correct operating system
  • On a Linux machine, you will download either a shell script (.sh) or a .tar.gz file. If you downloaded the .sh file, execute it. If you downloaded a .tar.gz file, unpack the archive and follow the installation instructions.

How do I download data using a command line interface?

  • The Aspera copy command is installed by default in the ~/.aspera/connect/bin directory on Linux machines and in the ~/Applications/Aspera\ Connect.app/Contents/Resources directory on Macs. You will find the ascp program in this directory
  • Download and install the Aspera Connect client if you have not done so already
  • As an example, to download EMPIAR-10009, use:
    Linux: ./ascp -QT -l 200M -P33001 -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh emp_ext3@fasp.ebi.ac.uk:/10009 ~/Destination/path/10009
    Mac: ./ascp -QT -l 200M -P33001 -i "/Users/<replace_your_username_here>/Applications/Aspera Connect.app/Contents/Resources/asperaweb_id_dsa.openssh" emp_ext3@fasp.ebi.ac.uk:/10009 ~/Destination/path/10009
  • By using the flag -k3 you can ensure that the source and destination files are compared based on full checksum when resuming partially transferred (incomplete) files. Note that computing full checksums of large files takes time and uses the CPU heavily, so while this increases the reliability of the transfer, it can also reduce its overall speed. An example of the command would then be:
    Linux: ./ascp -QT -k3 -l 200M -P33001 -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh emp_ext3@fasp.ebi.ac.uk:/10009 ~/Destination/path/10009
    Mac: ./ascp -QT -k3 -l 200M -P33001 -i "/Users/<replace_your_username_here>/Applications/Aspera Connect.app/Contents/Resources/asperaweb_id_dsa.openssh" emp_ext3@fasp.ebi.ac.uk:/10009 ~/Destination/path/10009
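
The commands above differ only in the entry number, the destination, the target rate and the optional -k3 flag, so they can be generated by a small helper. The following is a minimal sketch, not an official EMPIAR tool: the function name empiar_ascp_cmd and the ASCP and ASCP_KEY variables are hypothetical, it assumes ascp is on your PATH (otherwise set ASCP to its full path), and it defaults to the Linux key location given above. It only prints the command so that you can inspect it before running it.

```shell
#!/usr/bin/env bash
# Sketch: print the ascp command for an EMPIAR entry (does not run it).
# ASCP and ASCP_KEY are hypothetical helper variables, not Aspera settings.
ASCP="${ASCP:-ascp}"
ASCP_KEY="${ASCP_KEY:-$HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh}"

# empiar_ascp_cmd ENTRY DEST [RATE] [RESUME]
#   ENTRY  - entry path on the server, e.g. 10009
#   DEST   - local destination directory
#   RATE   - target rate passed to -l (default 200M)
#   RESUME - set to "k3" to add -k3 (full-checksum comparison on resume)
empiar_ascp_cmd() {
    local entry="$1" dest="$2" rate="${3:-200M}" resume="${4:-}"
    local flags="-QT"
    if [ "$resume" = "k3" ]; then
        flags="$flags -k3"
    fi
    printf '%s %s -l %s -P33001 -i "%s" emp_ext3@fasp.ebi.ac.uk:/%s %s\n' \
        "$ASCP" "$flags" "$rate" "$ASCP_KEY" "$entry" "$dest"
}
```

For example, empiar_ascp_cmd 10009 ~/Destination/path/10009 200M k3 prints the -k3 variant for EMPIAR-10009. On a Mac, set ASCP_KEY to the key path inside the Aspera Connect.app bundle shown above.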

Do you have a small data set for testing downloads?

  • The 58MB data set "test" is available for this purpose
  • You can use the

How are ORCIDs used in EMPIAR?

ORCID iDs are shown on entry pages for every author who provided one during the deposition process. This gives quick and easy access to the list of the author's publications, either through a Europe PMC search or directly from the author's ORCID profile page.

EMPIAR has now integrated ORCID into the deposition system. Users can sign in to the system with their ORCID account, which automatically populates user profile fields such as ORCID iD, first and last names, and email.

Can I claim EMPIAR entries to my ORCID profile?

Yes! More information is available here.

What formats does EMPIAR support?

We provide image data in the formats in which they are uploaded. We recommend the use of common formats in the field including MRC, MRCS, TIFF, DM4, IMAGIC, SPIDER, MRC FEI, RAW FEI and BIG DATA VIEWER HDF5.

Under what license is EMPIAR data available?

All data in EMPIAR is freely and publicly available to the global community under the CC0 license (https://creativecommons.org/share-your-work/public-domain/cc0/).

What cryo-EM data should I deposit?

We accept data from a range of imaging modalities in EMPIAR, including cryo-EM, cryo-ET, and various volume EM, X-ray microscopy and correlative imaging studies. In the case of cryo-EM/ET, the recommendation is to deposit the “rawest” form of your data to EMPIAR (e.g., for single-particle studies, unaligned multi-frame micrographs). This enables wide data re-use and re-processing, e.g., for validation, methods development, teaching, and community challenges. Providing additional downstream intermediate files (e.g., motion-corrected micrographs, particle coordinates, particle stacks, particle STAR files with assigned Euler angles and alignments) is also encouraged, as these will aid users interested in specific steps of the EM workflow.

Ideally, as much as possible should be provided. For example:
  • motion-corrected micrographs are most useful for developers of an auto-picking algorithm
  • raw movies give the best chance of improving the resolution and/or deriving more information from the dataset in future
  • particle stacks allow closest reproduction and are the most convenient format for those working on flexibility analysis
  • STAR files for all reasonable particles, not just the ones that end up in the final model, are also helpful for flexibility analysis
  • final STAR files for each class allow “close” reproduction of the published structure, and they implicitly record which movies are good and which are bad, because bad movies do not contribute final particles

In terms of metadata, EMPIAR’s deposition forms already collect information on numbers of images, frames per image, image format, dimensions, pixel spacing and type. Further information, for example, on parameters essential for reprocessing such as Cs, electron dose per frame and gain-reference orientation (for any deposited gain-reference images) can be provided in the “Details” fields. Additionally, workflows from Scipion (and in the future hopefully from other packages as well) can be uploaded as part of your submissions.

The full list of EMPIAR requirements is available online, including data and file types, formats, and extensions. For information on EMPIAR’s policies and processing procedures, please see our policies page. For guidance on EMPIAR’s deposition process, please refer to our deposition manual.

Note: for non-cryo data, there are different requirements and recommendations.

How do I deposit data to EMPIAR?

Use the online deposition system. Start by checking the manual.

What happens to my data after submission and when will it be released?

Pressing the "Submit" button sends the deposition for review by the EMPIAR annotation team. They will communicate with the depositor regarding any issues and may choose to unlock the deposition if complementary details or data are required. Once this is complete, the depositor will be sent a request to approve the curation by the EMPIAR annotation team. Upon receiving the depositor's approval the entry will enter the release phase following the instructions provided by the depositor. The time it takes to get the entry released depends on the size of the entry, the number of files, the current load on EMBL-EBI resources and the availability of the annotation staff. The depositor will be notified by email once the entry is publicly available.

How do I cite EMPIAR entries?

Please cite the original publication and cite the EMPIAR entry using the guidelines provided here

How do I cite EMPIAR?

We recommend that you follow the guidelines provided here

Can I selectively download files and directories for an entry?

Yes, EMPIAR entry pages offer three options: Aspera, download of a tarball via HTTP, and download of individual files via FTP. The file browsers on the EMPIAR entry pages allow you to select files and sub-directories to be downloaded via Aspera or HTTP. Aspera is the recommended option and can be used for any size of download. Please note that an Aspera download reproduces the selected files and directories as-is and is not an archive (tarball). The HTTP download option creates a tarball that can be downloaded; this option only works for tarballs smaller than 1.5 GB. You can also download individual files (but not sub-directories) using FTP (select "Browse Ftp").

Can I upload/download data via Globus?

Yes! To download data, use the endpoint "Shared EMBL-EBI public endpoint" and the directory /gridftp/empiar/world_availability/ . For instructions on uploading data via Globus, please see the deposition manual.

What is the EMPIAR data model?

The EMPIAR schema is described here. It consists of the main XML schema file, empiar.xsd, and additional requirements expressed in Schematron format in empiar.sch.

Having trouble with transferring data?

Make sure that your firewall does not block the transfer.

For Globus ensure that hx-gridftp-*.ebi.ac.uk addresses are whitelisted.

For Aspera you might need to modify the rules to accommodate the following:

permit TCP outbound to 193.62.197.71 on port 33001

permit UDP outbound to 193.62.197.71 on port 33001

permit UDP inbound from 193.62.197.71 on port 33001
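
As a quick check of the first rule, you can test whether an outbound TCP connection to port 33001 can be opened from your machine. The following is a minimal sketch under stated assumptions: check_tcp is a hypothetical helper name, and it relies on bash's /dev/tcp feature and the coreutils timeout command being available. It only covers the TCP rule; the two UDP rules cannot be verified this way.

```shell
#!/usr/bin/env bash
# Sketch: test whether an outbound TCP connection can be opened.
# check_tcp HOST PORT -> exit status 0 if the connection opens within 5 s.
check_tcp() {
    host="$1"
    port="$2"
    # bash's /dev/tcp pseudo-device opens a TCP socket; timeout caps the wait.
    timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Example (address and port taken from the rules above):
#   check_tcp 193.62.197.71 33001 && echo "TCP 33001 reachable" || echo "TCP 33001 blocked"
```

A failure here suggests that the outbound TCP rule is missing. A success does not guarantee that the UDP traffic used for the actual transfer is permitted.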

Our web transfer interface has the target rate set to 200 000 Kbps and the policy set to adaptive. The command-line examples shown above use the same target rate (-l 200M). If you are having issues with transferring data, it might help to reduce the target rate.

To do so, please use the command line transfer and change the value of '-l' parameter. For example: ascp -QT -l 10M -P33001 -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh emp_ext3@fasp.ebi.ac.uk:/test testset
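
If you are unsure which rate works on your network, one option is to try a series of decreasing rates and stop at the first that succeeds. The following is a rough sketch only: try_rates and fetch_test_set are hypothetical helper names, the intermediate rates 100M and 50M are illustrative choices, and it assumes ascp is on your PATH with the key in the Linux location used above.

```shell
#!/usr/bin/env bash
# try_rates RUNNER RATE...
# Calls RUNNER with each rate in turn and stops at the first success.
try_rates() {
    local runner="$1"
    shift
    local rate
    for rate in "$@"; do
        if "$runner" "$rate"; then
            echo "succeeded at $rate"
            return 0
        fi
        echo "failed at $rate, trying a lower rate" >&2
    done
    return 1
}

# Runner for the test data set; mirrors the example command above,
# with the target rate (-l) taken from the first argument.
fetch_test_set() {
    ascp -QT -l "$1" -P33001 \
        -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh \
        emp_ext3@fasp.ebi.ac.uk:/test testset
}

# Usage:
#   try_rates fetch_test_set 200M 100M 50M 10M
```

Keeping the transfer command in a separate runner function means the same loop can be reused for any entry by swapping in a different runner.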

Please get in touch if the issue persists. If possible, provide us with the Aspera logs.

How do I contact the EMPIAR team?

Send us a message!
