
Website and Data Usage Policies

General

The Canadian VirusSeq Data Portal (CVDP, also referred to as "the Portal") is an open-access data portal intended to facilitate access to Canadian SARS-CoV-2 sequences and associated non-sensitive metadata adhering to the FAIR Data principles. The Portal will manage and share limited contextual metadata and viral genome sequences among Canadian public health labs, researchers and other groups interested in accessing the data for surveillance, research, and innovation purposes. In doing so, it will complement the controlled-access platform developed by the National Microbiology Laboratory (NML).

The Portal will harmonize, validate, and automate submission to international databases and enable the creation of real-time dashboards that summarize the Canadian data contributions while facilitating exploration and access. The Data Portal is funded by Genome Canada.

The CVDP is not providing access to any personal data at the current time. If you are submitting sequences or metadata to the CVDP, do not include any data that could reveal the personal identity of the source.

Disclaimer

COVID-19 is an emerging and rapidly evolving situation. To promote responsive, collaborative research and public health surveillance, virus sequences and minimal metadata are released on the CVDP rapidly and should be considered draft and subject to change.

Beyond limited editorial and quality controls and some internal integrity checks, the quality and accuracy of the record are the responsibility of submitters, not of the CVDP. It is also the responsibility of submitters to ascertain that they have the right to submit the data. The CVDP team will work with submitters to provide feedback on metadata and sequence data to improve the overall quality and consistency of the data submitted.

Data available

Currently, the CVDP exclusively focuses on providing access to data types that do not constitute "personal data/information". These include:

  • Consensus viral sequence
  • Raw de-hosted viral sequences
  • Contextual Metadata:
    1. Study id - a unique identifier for each data provider
    2. Specimen collector sample ID - a unique identifier for each sequenced specimen
    3. GISAID accession - the GISAID accession number assigned to the sequence
    4. Sample collected by - the name of the agency that collected the original sample
    5. Sequence submitted by - the name of the agency that generated the sequence
    6. Sample collection date - the date on which the sample was collected
    7. Geo_loc_name (country) - the country where the sample was collected
    8. Geo_loc_name (state/province/territory) - the province/territory where the sample was collected
    9. Organism - Taxonomic name of the organism
    10. Isolate - Identifier of the specific isolate
    11. Fasta header name - fasta file identifier of the isolate
    12. Purpose of sampling - the reason that the sample was collected
    13. Purpose of sampling details - the description of why the sample was collected, providing specific details
    14. Anatomical material - A substance obtained from an anatomical part of an organism e.g. tissue, blood
    15. Anatomical part - An anatomical part of an organism e.g. oropharynx
    16. Body product - A substance excreted/secreted from an organism e.g. feces, urine, sweat
    17. Environmental material - A substance obtained from the natural or man-made environment e.g. soil, water, sewage
    18. Environmental site - An environmental location may describe a site in the natural or built environment e.g. metal can, hospital
    19. Collection device - The instrument or container used to collect the sample e.g. swab
    20. Collection method - The process used to collect the sample e.g. phlebotomy, necropsy
    21. Host (scientific name) - The taxonomic, or scientific name of the host
    22. Host disease - The name of the disease experienced by the host
    23. Host age - Age of host at the time of sampling
    24. Host age unit - The unit used to measure the host age, in either months or years
    25. Host age bin - Age of host at the time of sampling, expressed as an age group
    26. Host gender - The gender of the host at the time of sample collection
    27. Purpose of sequencing - The reason that the sample was sequenced
    28. Purpose of sequencing details - The description of why the sample was sequenced providing specific details
    29. Sequencing instrument - The model of the sequencing instrument used
    30. Sequencing protocol - The protocol used to generate the sequence
    31. Raw sequence data processing method - The names of the software and version number used for raw data processing e.g. removing barcodes, filtering etc
    32. Dehosting method - The method used to remove host reads from the pathogen sequence
    33. Consensus sequence software name - The name of software used to generate the consensus sequence
    34. Consensus sequence software version - The version of the software used to generate the consensus sequence
    35. Breadth of coverage value - The percentage of the reference genome covered by the sequenced data, to a prescribed depth
    36. Depth of coverage value - The average number of reads representing a given nucleotide in the reconstructed sequence
    37. Reference genome accession - A persistent, unique identifier of a genome database entry
    38. Bioinformatics protocol - A description of the overall bioinformatics strategy used
    39. Gene name - The name of the gene used in the diagnostic RT-PCR test
    40. Diagnostic_pcr_ct_value - The Ct value result from a diagnostic SARS-CoV-2 RT-PCR test
    41. Lineage name - The name of the lineage assigned to a sequenced sample
    42. Lineage analysis software name - The name of the software used to determine the lineage
    43. Lineage analysis software version - The version of the software used to determine the lineage
    44. Lineage analysis software data version - A version number that represents the pangolin-data version
    45. Scorpio call - The call made by Scorpio, a software that performs SNP-based calling of variants of concern (VOCs); the name stands for "serious constellations of reoccurring phylogenetically-independent origin"
    46. Scorpio version - The version of the Scorpio software used to determine the lineage
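
As an illustration of how a submitter might sanity-check a record against a few of the field definitions above before submission, here is a minimal sketch in Python. It is not the Portal's validation code. The function name check_record, the dictionary keys (lower-cased field labels), the 0 to 100 range for breadth of coverage, and the ISO 8601 date format are all assumptions made for this example; only the "months or years" rule for host age unit comes directly from the list above.

```python
from datetime import date
from typing import List

# From the "Host age unit" definition: either months or years.
ALLOWED_AGE_UNITS = {"months", "years"}


def check_record(record: dict) -> List[str]:
    """Return a list of problems found in one contextual-metadata record.

    Hypothetical helper: keys are lower-cased field labels chosen for
    this sketch, not an official schema.
    """
    problems = []

    # Host age unit must be months or years (stated in the field list).
    unit = record.get("host age unit")
    if unit is not None and str(unit).lower() not in ALLOWED_AGE_UNITS:
        problems.append(f"host age unit must be months or years, got {unit!r}")

    # Breadth of coverage is a percentage; the 0-100 range is an assumption.
    breadth = record.get("breadth of coverage value")
    if breadth is not None:
        try:
            if not 0 <= float(breadth) <= 100:
                problems.append(f"breadth of coverage out of range: {breadth!r}")
        except (TypeError, ValueError):
            problems.append(f"breadth of coverage is not a number: {breadth!r}")

    # Assumption: collection dates are written as ISO 8601 (YYYY-MM-DD).
    collected = record.get("sample collection date")
    if collected is not None:
        try:
            date.fromisoformat(str(collected))
        except ValueError:
            problems.append(f"sample collection date is not ISO 8601: {collected!r}")

    return problems


if __name__ == "__main__":
    example = {
        "host age unit": "weeks",
        "breadth of coverage value": 98.5,
        "sample collection date": "2021-03-15",
    }
    for problem in check_record(example):
        print(problem)
```

A check like this only catches formatting problems. It does not replace the submitter's responsibility for the accuracy of the record or for keeping personal information out of submissions.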

Data users guidelines

Data within the CVDP is provided in a completely open manner and at no cost to members of the scientific community and other interested parties. Nevertheless, users are expected to follow the CVDP policy on Recognition of the work of data submitters. Users should not use the portal data to attempt to re-identify specific individuals. In the unlikely case you come across identifying data, please swiftly report the event, indicating the problematic dataset, at info@virusseq-dataportal.ca.

Data standards guidelines

As data needs change over time, the data standard implemented by the Data Portal evolves (additional fields and terms may be added, requirements may be updated, etc). This may alter the database schema as well as the types of information provided by data stewards. For more information, please contact Dr. Emma Griffiths at ega12@sfu.ca.

Recognition of the work of the data submitters

You may publish results obtained from your analyses of data from the CVDP, provided that your published results acknowledge CanCOGeN-VirusSeq, CPLHN and its members as the original source of the data.

Proposed sentence: “The authors of the manuscript would like to acknowledge the original source of the data CanCOGeN-VirusSeq, CPLHN and its members.”

Please note that the data that is being shared is the work of many individuals and should be treated as unpublished data. If you wish to publish research using the data, contact us at info@virusseq-dataportal.ca first to ensure that those who have generated the data can be involved in its analysis. You are responsible for making the best efforts to collaborate with representatives of the Originating Laboratory responsible for obtaining the specimen(s) and involve them in such analyses and further research using such Data.

Intellectual Property

The CVDP is designed to provide and encourage access within the scientific community to the most up-to-date and comprehensive COVID-19 viral sequencing data. Therefore, there are no restrictions on the use or distribution of the CVDP data. While we do not encourage this practice, some submitters may claim patent, copyright, or other intellectual property rights in all or a portion of the data they have submitted. The CVDP is not in a position to assess the validity of such claims and therefore cannot provide comment or unrestricted permission concerning the use, copying, or distribution of the information it contains.

Privacy Policy

The CVDP is committed to protecting the privacy and security of the personal information and data of its users to the greatest extent possible subject to Canada’s provincial/territorial and federal laws. Personal information is defined as information that can reasonably be used to identify an individual either alone or in combination with other available information. The CVDP will only use your personal information for specific and consented purposes. This policy will be maintained except in circumstances required by law to provide access, or in response to subpoenas or other legal instruments to authorize access to personal information. Except for these scenarios, personal information will not be shared outside of the CVDP and its associated personnel or contractors without your explicit consent. When collected, personal information will only be retained for as long as necessary to fulfil its purposes subject to the applicable Canadian legal requirements.

Purpose, use, and collection of information

The central purpose of this website and the CVDP is to facilitate scientific research by providing researchers and other interested parties a central access point to Canada-based COVID-19-related genomic data and certain associated contextual metadata.

Personal information is not required to view this website, although certain key features will require some personal information to function optimally. We may collect this personal information through webforms in connection with your account and the services provided to you. The level of information collected will depend on the type of services and account used. Providing us with your information where required is strictly voluntary. Any personal information collected will be appropriately protected through physical and electronic means such as password protection and encryption.

By providing your personal information, you are consenting to its use for the purposes listed below:

  • Communicate with you regarding our services such as a newsletter, event notification, or a change of policy,
  • Provide you with a service (e.g., help data submission, data access),
  • Communicate and troubleshoot with you regarding any CVDP website functionalities and/or services.

The CVDP website and its associated servers also collect the following analytics for the purposes of web presentation, troubleshooting, and web functionality. In accordance with this privacy policy, this information will not be associated with individual user identities and will not be used to re-identify any users:

  • Internet Protocol (IP) address of the computer being used,
  • Web pages requested,
  • Referring web page,
  • Browser used,
  • Date and time of activities.

Distribution of information to third parties

Third-party contractors and/or agents may be involved in maintaining and improving the functions (e.g., IT services) of the CVDP website. In these scenarios, if any associated third party is provided access to any personal information, that information will be kept secure, private, and confidential in accordance with Canadian provincial/territorial and federal legislation and the CVDP Privacy Policy. Such parties are only permitted to use such personal information for lawful purposes authorized by the CVDP.

Cookies

This website uses ‘cookies’. Cookies may collect information such as your email address or username, or keep track of pages visited and documents downloaded. The CVDP may use cookies to deliver web content specific to a user’s interests or to keep users logged in when such a feature is enabled. You may choose to enable or disable cookies on this website; if you disable them, such information will not be collected. Disabling cookies will not restrict your access to the CVDP website but may affect the normal functioning of various features.

Hyperlinks and other privacy policies

If you follow a hyperlink from the CVDP website onto the website(s) of another entity, that entity may have/uphold a different privacy policy. The CVDP bears no responsibility for the privacy of the user in such a scenario, and we advise you to appropriately consult the privacy policies of these other entities.

Right to be “forgotten”

Users may request the erasure/deletion of any personal information they have provided to the CVDP website. Where possible, the CVDP will promptly erase/delete this personal information, except where retention is required by law (e.g., records kept for auditing purposes). To request the erasure/deletion of your personal information, you may contact us at info@virusseq-dataportal.ca.

Privacy policy revisions

This privacy policy was last revised on May 31, 2021. These policies are subject to change and we encourage you to review this Privacy Policy each time you visit the portal. If any significant changes are made to this policy, a notice will be posted on the homepage for a reasonable period of time after the change is implemented, so that the user may be fully aware of any changes before using the CVDP website.

Contact us

Your privacy and concerns are important to us. We welcome you to contact us with your comments, questions, complaints, and/or suggestions about our policy or a privacy-related issue. Please contact us at info@virusseq-dataportal.ca.

Data Submitters Guidelines

Registration

Individuals and organizations interested in submitting data to the CVDP must first apply for data submission authorization through info@virusseq-dataportal.ca. Registration can be completed with an email of choice. Afterwards, a verification email will be sent to the user. Once authorized, users can upload data via an account provided to them. Through this account, users can review the status of submission and review or reattempt any failed submissions. More detailed instructions for data submission can be found post-registration.

With registration, the CVDP will collect the user’s email address, first and last name, user name, and password. Please refer to the CVDP Privacy Policy for more details on the collection, storage, and processing of any user data.

Sensitive and/or Identifiable Data

Submitters are responsible for not submitting any sensitive or personal information to the CVDP. “Dehosting” and other relevant risk mitigation procedures are recommended to minimize the risk of submitting any sensitive or identifiable data.

Helpdesk

The CVDP operates a helpdesk based on a ticketing system. You may contact the helpdesk at info@virusseq-dataportal.ca.

© 2024 Canadian VirusSeq Data Portal