Website and Data Usage Policies
The Canadian VirusSeq Data Portal (CVDP, also referred to as "the Portal") is an open-access data portal intended to facilitate access to Canadian SARS-CoV-2 sequences and associated non-sensitive metadata adhering to the FAIR Data principles. The Portal will manage and share limited contextual metadata and viral genome sequences among Canadian public health labs, researchers and other groups interested in accessing the data for surveillance, research, and innovation purposes. In doing so, it will complement the controlled-access platform developed by the National Microbiology Laboratory (NML).
The Portal will harmonize, validate, and automate submission to international databases and enable the creation of real-time dashboards that summarize the Canadian data contributions while facilitating exploration and access. The Data Portal is funded by Genome Canada.
The CVDP is not providing access to any personal data at the current time. If you are submitting sequences or metadata to the CVDP, do not include any data that could reveal the personal identity of the source.
COVID-19 is an emerging and rapidly evolving situation. To promote responsive, collaborative research and public health surveillance, virus sequences and minimal metadata are released on the CVDP rapidly and should be considered draft and subject to change.
Beyond limited editorial and quality controls and some internal integrity checks, the quality and accuracy of the record are the responsibility of submitters, not of the CVDP. It is also the responsibility of submitters to ascertain that they have the right to submit the data. The CVDP team will work with submitters to provide feedback on metadata and sequence data to improve the overall quality and consistency of the data submitted.
Currently, the CVDP exclusively focuses on providing access to data types that do not constitute "personal data/information". These include:
- Consensus viral sequence
- Raw de-hosted viral sequences
- Contextual Metadata:
- Study id - a unique identifier for each data provider
- Specimen collector sample ID - a unique identifier for each sequenced specimen
- GISAID accession - the GISAID accession number assigned to the sequence
- Sample collected by - the name of the agency that collected the original sample
- Sequence submitted by - the name of the agency that generated the sequence
- Sample collection date - the date on which the sample was collected
- Geo_loc_name (country) - the country where the sample was collected
- Geo_loc_name (state/province/territory) - the province/territory where the sample was collected
- Organism - Taxonomic name of the organism
- Isolate - Identifier of the specific isolate
- Fasta header name - fasta file identifier of the isolate
- Purpose of sampling - the reason that the sample was collected
- Purpose of sampling details - the description of why the sample was collected, providing specific details
- Anatomical material - A substance obtained from an anatomical part of an organism e.g. tissue, blood
- Anatomical part - An anatomical part of an organism e.g. oropharynx
- Body product - A substance excreted/secreted from an organism e.g. feces, urine, sweat
- Environmental material - A substance obtained from the natural or man-made environment e.g. soil, water, sewage
- Environmental site - An environmental location, which may describe a site in the natural or built environment e.g. metal can, hospital
- Collection device - The instrument or container used to collect the sample e.g. swab
- Collection method - The process used to collect the sample e.g. phlebotomy, necropsy
- Host (scientific name) - The taxonomic, or scientific name of the host
- Host disease - The name of the disease experienced by the host
- Host age - Age of host at the time of sampling
- Host age unit - The unit used to measure the host age, in either months or years
- Host age bin - Age of host at the time of sampling, expressed as an age group
- Host gender - The gender of the host at the time of sample collection
- Purpose of sequencing - The reason that the sample was sequenced
- Purpose of sequencing details - The description of why the sample was sequenced providing specific details
- Sequencing instrument - The model of the sequencing instrument used
- Sequencing protocol - The protocol used to generate the sequence
- Raw sequence data processing method - The names of the software and version number used for raw data processing e.g. removing barcodes, filtering etc
- Dehosting method - The method used to remove host reads from the pathogen sequence
- Consensus sequence software name - The name of software used to generate the consensus sequence
- Consensus sequence software version - The version of the software used to generate the consensus sequence
- Breadth of coverage value - The percentage of the reference genome covered by the sequenced data, to a prescribed depth
- Depth of coverage value - The average number of reads representing a given nucleotide in the reconstructed sequence
- Reference genome accession - A persistent, unique identifier of a genome database entry
- Bioinformatics protocol - A description of the overall bioinformatics strategy used
- Gene name - The name of the gene used in the diagnostic RT-PCR test
- Diagnostic_pcr_ct_value - The Ct value result from a diagnostic SARS-CoV-2 RT-PCR test
- Lineage name - The name of the lineage assigned to a sequenced sample
- Lineage analysis software name - The name of the software used to determine the lineage
- Lineage analysis software version - The version of the software used to determine the lineage
- Lineage analysis software data version - The version number of the pangolin-data release used by the lineage analysis software
- Scorpio call - The call made by scorpio (serious constellations of reoccurring phylogenetically-independent origin), a software tool that performs SNP-based calling of variants of concern (VOCs)
- Scorpio version - The version of the scorpio software used to determine the lineage
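The two coverage fields in the list above can be illustrated with a minimal sketch. It assumes a list of per-position read depths is already available (for example, exported from an alignment tool); the function names and the default depth threshold of 10 are hypothetical choices for illustration and are not prescribed by the CVDP.

```python
from typing import Sequence


def breadth_of_coverage(depths: Sequence[int], min_depth: int = 10) -> float:
    """Percentage of reference positions covered by at least `min_depth` reads.

    `min_depth` stands in for the "prescribed depth"; 10 is an assumed default.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def depth_of_coverage(depths: Sequence[int]) -> float:
    """Average number of reads per position."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(depths) / len(depths)


# Toy example: five reference positions with their read depths.
example_depths = [0, 5, 10, 20, 15]
print(breadth_of_coverage(example_depths, min_depth=10))
print(depth_of_coverage(example_depths))
```

In this toy input, three of the five positions reach the threshold, so the breadth is reported as a percentage of positions, while the depth is a plain mean over all positions.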
Data users guidelines
Access to the data within the CVDP is provided in a completely open manner, and at no cost, to members of the scientific community and other interested parties. Nevertheless, users are expected to follow the CVDP policy on Recognition of the work of data submitters. Users should not use the Portal data to attempt to re-identify specific individuals. In the unlikely case you come across identifying data, please swiftly report the event, indicating the problematic dataset, at firstname.lastname@example.org.
Data standards guidelines
As data needs change over time, the data standard implemented by the Data Portal evolves (additional fields and terms may be added, requirements may be updated, etc). This may alter the database schema as well as the types of information provided by data stewards. For more information, please contact Dr. Emma Griffiths at email@example.com.
Recognition of the work of the data submitters
You may use the data from the CVDP to publish results obtained from your analyses of relevant data, provided that your published results acknowledge, as the original source of the data, CanCOGeN-VirusSeq, CPLHN and its members.
Proposed sentence: “The authors of the manuscript would like to acknowledge the original source of the data CanCOGeN-VirusSeq, CPLHN and its members.”
Please note that the data that is being shared is the work of many individuals and should be treated as unpublished data. If you wish to publish research using the data, contact us at firstname.lastname@example.org first to ensure that those who have generated the data can be involved in its analysis. You are responsible for making the best efforts to collaborate with representatives of the Originating Laboratory responsible for obtaining the specimen(s) and involve them in such analyses and further research using such Data.
The CVDP is designed to provide and encourage access within the scientific community to the most up-to-date and comprehensive COVID-19 viral sequencing data. Therefore, there are no restrictions on the use or distribution of the CVDP data. While we do not encourage this practice, some submitters may claim patent, copyright, or other intellectual property rights in all or a portion of the data they have submitted. The CVDP is not in a position to assess the validity of such claims and therefore cannot provide comment or unrestricted permission concerning the use, copying, or distribution of the information it contains.
The CVDP is committed to protecting the privacy and security of the personal information and data of its users to the greatest extent possible, subject to Canada’s provincial/territorial and federal laws. Personal information is defined as information that can reasonably be used to identify an individual either alone or in combination with other available information. The CVDP will only use your personal information for specific and consented purposes. This policy will be maintained except where the CVDP is required by law to provide access, or in response to subpoenas or other legal instruments that authorize access to personal information. Except for these scenarios, personal information will not be shared outside of the CVDP and its associated personnel or contractors without your explicit consent. When collected, personal information will only be retained for as long as necessary to fulfil its purposes, subject to the applicable Canadian legal requirements.
Purpose, use, and collection of information
The central purpose of this website and the CVDP is to facilitate scientific research by providing researchers and other interested parties a central access point to Canada-based COVID-19-related genomic data and certain associated contextual metadata.
Personal information is not required to view this website, although certain key features will require some personal information to function optimally. We may collect this personal information through web forms in connection with your account and the services provided to you. The level of information collected will depend on the type of services and account used. Providing us with your information where required is strictly voluntary. Any personal information collected will be appropriately protected through physical and electronic means such as password protection and encryption.
By providing your personal information, you are consenting to its use for the purposes listed below:
- Communicate with you regarding our services such as a newsletter, event notification, or a change of policy,
- Provide you with a service (e.g., help data submission, data access),
- Communicate and troubleshoot with you regarding any CVDP website functionalities and/or services.
- Internet Protocol (IP) address of the computer being used,
- Web pages requested,
- Referring web page,
- Browser used,
- Date and time of activities.
Distribution of information to third parties
This website uses ‘cookies’. Cookies may collect information such as your email address or username, or keep track of pages visited and documents downloaded. The CVDP may use cookies to deliver web content specific to a user’s interests or to keep users logged in when such a feature is enabled. You may choose to enable or disable cookies on this website; if cookies are disabled, such information will not be collected. Disabling cookies will not restrict your access to the CVDP website but may affect the normal functioning of various features.
Hyperlinks and other privacy policies
Right to be “forgotten”
Users may request the erasure/deletion of any personal information they have provided to the CVDP website. Where possible, the CVDP will work to promptly erase/delete this personal information, except where its retention is required by law (e.g. for auditing records). To request the erasure/deletion of your personal information, you may contact us at email@example.com.
Your privacy and concerns are important to us. We welcome you to contact us with your comments, questions, complaints, and/or suggestions about our policy or a privacy-related issue. Please contact us at firstname.lastname@example.org.
Data Submitters Guidelines
Individuals and organizations interested in submitting data to the CVDP must first apply for data submission authorization through email@example.com. Registration can be completed with an email address of choice. Afterwards, a verification email will be sent to the user. Once authorized, users can upload data via an account provided to them. Through this account, users can review the status of their submissions and review or reattempt any failed submissions. More detailed instructions for data submission can be found post-registration.
Sensitive and/or Identifiable Data
Submitters are responsible for not submitting any sensitive or personal information to the CVDP. “Dehosting” and other relevant risk mitigation procedures are recommended to minimize the risk of submitting any sensitive or identifiable data.
The CVDP operates a helpdesk based on a ticketing system. You may contact the helpdesk at firstname.lastname@example.org.