# Data Upload
With the SEQ Platform, you can directly upload your files or use the cloud browser to browse and select the files pre-uploaded to your account.
# Direct Data Upload
Click the "Upload" button on your homepage and select the "Germline" option. Here, you can select the FASTQ or VCF option.
# FASTQ Upload
# Select the files to upload
Click the "Browse" button under “File” and select all the files you want to analyze. For paired-end reads, make sure that you upload both read files. See the table below for the supported input file types:
Technology | File Types | Batch Sample Upload |
---|---|---|
Illumina | .fastq.gz, .fq.gz | Supported |
Ion Torrent | .bam | Supported |
MGI | .fastq.gz, .fq.gz | Supported |
PacBio | .bam | Supported |
ONT | .fastq.gz, .fq.gz | Supported |
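A quick local check before uploading can catch files with the wrong extension. The following is a minimal sketch that simply encodes the table above; the function name `has_supported_extension` is hypothetical and is not part of the SEQ Platform.

```python
# Supported input extensions per technology, copied from the table above.
SUPPORTED_EXTENSIONS = {
    "Illumina": (".fastq.gz", ".fq.gz"),
    "Ion Torrent": (".bam",),
    "MGI": (".fastq.gz", ".fq.gz"),
    "PacBio": (".bam",),
    "ONT": (".fastq.gz", ".fq.gz"),
}


def has_supported_extension(filename: str, technology: str) -> bool:
    """Return True if `filename` ends with an extension listed for `technology`."""
    # str.endswith accepts a tuple and returns True if any suffix matches.
    return filename.endswith(SUPPORTED_EXTENSIONS[technology])


if __name__ == "__main__":
    print(has_supported_extension("NA10831_ATCACG_L002_R1_001.fastq.gz", "Illumina"))
```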
# Filename formatting for batch upload
# Illumina
You can upload multiple samples with two or more files. For batch uploads, we only support filenames following the “Illumina naming convention” e.g.
NA10831_ATCACG_L002_R1_001.fastq.gz
NA10831_ATCACG_L002_R2_001.fastq.gz
Filenames from the Illumina platform are parsed as follows:
<name_field1>_<name_field2>_<lane_#>_<read_#>_<always_001>.fastq.gz
Name field 1, name field 2, lane #, and read # are used to match the corresponding files correctly. You can change name fields 1 and 2, but they must not contain spaces or underscore (_) characters. Following this naming convention, you can upload multiple samples (each with multiple fastq.gz files).
Please check the number of samples and matched files on the confirmation screen.
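To check filenames locally before a batch upload, the naming convention above can be expressed as a regular expression. This is only a sketch: the lane format (`L` followed by digits) and read format (`R1`/`R2`) are inferred from the example filenames, accepting `.fq.gz` alongside `.fastq.gz` is an assumption based on the supported file types, and `pair_illumina_reads` is a hypothetical helper, not the matching logic SEQ itself uses.

```python
import re
from collections import defaultdict

# <name_field1>_<name_field2>_<lane_#>_<read_#>_001.fastq.gz
# Name fields may not contain spaces or underscores.
ILLUMINA_RE = re.compile(
    r"^(?P<name1>[^_\s]+)_(?P<name2>[^_\s]+)_(?P<lane>L\d+)_(?P<read>R[12])"
    r"_001\.(?:fastq|fq)\.gz$"
)


def pair_illumina_reads(filenames):
    """Group read files that share name field 1, name field 2, and lane."""
    pairs = defaultdict(dict)
    for filename in filenames:
        match = ILLUMINA_RE.match(filename)
        if match is None:
            raise ValueError(f"not an Illumina-style filename: {filename!r}")
        key = (match["name1"], match["name2"], match["lane"])
        pairs[key][match["read"]] = filename
    return dict(pairs)


if __name__ == "__main__":
    files = [
        "NA10831_ATCACG_L002_R1_001.fastq.gz",
        "NA10831_ATCACG_L002_R2_001.fastq.gz",
    ]
    for key, reads in pair_illumina_reads(files).items():
        # A paired-end group should contain both R1 and R2.
        print(key, sorted(reads))
```

A group that contains only `R1` or only `R2` points to a missing mate file, which is worth fixing before the confirmation screen.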
# MGI
You can upload multiple samples with two or more files. For batch uploads, we only support filenames following the “MGI naming convention” e.g.
V12345678_L01_16_1.fastq.gz
V12345678_L01_16_2.fastq.gz
Filenames from the MGI platform are parsed as follows:
<flowcell_id>_<lane_#>_<barcode>_<read_#>.fq.gz
Flowcell ID, lane #, barcode, and read # are used to match the corresponding files correctly. You can change the flowcell ID field, but it must not contain spaces or underscore (_) characters.
If you have used more than one barcode for the same sample, you need to rename the files as follows:
Original file names:
sample1234_L01_16_1.fastq.gz
sample1234_L01_16_2.fastq.gz
sample1234_L01_17_1.fastq.gz
sample1234_L01_17_2.fastq.gz
Altered file names:
sample1234_L01_16_1.fastq.gz
sample1234_L01_16_2.fastq.gz
sample1234_L02_16_1.fastq.gz
sample1234_L02_16_2.fastq.gz
In this example, the barcode numbers of the last two files are changed from 17 to 16, and their lane numbers are increased by 1 (from L01 to L02).
Please check the number of samples and matched files on the confirmation screen.
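The renaming rule for samples sequenced with more than one barcode can be sketched as a small script. It generalizes the example above under stated assumptions: all input files belong to one sample, the first barcode encountered is kept, and each additional lane/barcode combination is moved to the next unused lane number. `merge_barcodes` is a hypothetical helper, and it only computes the new names; it does not rename anything on disk.

```python
import re

# <flowcell_id>_<lane_#>_<barcode>_<read_#>.fq.gz (or .fastq.gz)
MGI_RE = re.compile(
    r"^(?P<flowcell>[^_\s]+)_L(?P<lane>\d+)_(?P<barcode>[^_\s]+)_(?P<read>[12])"
    r"\.(?P<ext>fastq\.gz|fq\.gz)$"
)


def merge_barcodes(filenames):
    """Return {original_name: new_name} so that all files share one barcode."""
    parsed = []
    for filename in filenames:
        match = MGI_RE.match(filename)
        if match is None:
            raise ValueError(f"not an MGI-style filename: {filename!r}")
        parsed.append((filename, match))

    target = parsed[0][1]["barcode"]  # barcode to keep (assumption: the first one)
    used_lanes = {int(m["lane"]) for _, m in parsed if m["barcode"] == target}
    lane_for = {}  # (old lane, old barcode) -> newly assigned lane number
    renamed = {}
    for filename, m in parsed:
        if m["barcode"] == target:
            renamed[filename] = filename
            continue
        key = (m["lane"], m["barcode"])
        if key not in lane_for:
            lane_for[key] = max(used_lanes) + 1
            used_lanes.add(lane_for[key])
        width = len(m["lane"])  # preserve zero padding, e.g. "01" -> "02"
        lane = f"L{lane_for[key]:0{width}d}"
        flowcell, read, ext = m["flowcell"], m["read"], m["ext"]
        renamed[filename] = f"{flowcell}_{lane}_{target}_{read}.{ext}"
    return renamed


if __name__ == "__main__":
    originals = [
        "sample1234_L01_16_1.fastq.gz",
        "sample1234_L01_16_2.fastq.gz",
        "sample1234_L01_17_1.fastq.gz",
        "sample1234_L01_17_2.fastq.gz",
    ]
    for old, new in merge_barcodes(originals).items():
        print(f"{old} -> {new}")
```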
# Ion Torrent
Ion Torrent files need to be unaligned BAM files with the .bam extension.
# Pacific Biosciences
Pacific Biosciences files need to be BAM files with the .bam extension. You can upload multiple samples in a single batch. One file for each sample is expected.
# Oxford Nanopore Technologies
Oxford Nanopore Technologies files need to be in fastq.gz or fq.gz format. You can upload multiple samples in a single batch. One file for each sample is expected.
# Select a previous Run or Create a New Run
You can upload your samples to a new run by selecting “Create New Run” under “Run Name” and entering a name in the “Name For New Run” field. You can also upload your samples to an existing run by selecting the previous run from the dropdown menu.
Run information is critical for Copy Number Variation analyses. Therefore, please make sure you organize the samples under runs in the same way as you process your sample materials. Ideally, a run is a set of samples coming from the same wet-lab and run process (flow cell, etc.).
# Choose the technology type
Choose the next-generation sequencing machine associated with the samples. Mixing different technologies in one run is not permitted.
# Choose the kit type
The SEQ platform has hundreds of different kits predefined in the system. A new kit can be defined with a set of target coordinates and a list of targeted genes. Adding a new kit typically takes one business day. For kit requests, please contact us at support@genomize.com.
Every kit is associated with a standardized analysis version in SEQ. Probe-based kits, primer-based kits, Illumina & MGI technology, Ion Torrent technology, germline analysis, and somatic analysis each have a preset analysis version.
# Hotspot VCF file upload
A predefined VCF file can be uploaded to SEQ. The VCF4.2 standard is supported. If the user does not upload a VCF as a hotspot, SEQ automatically subsets Pathogenic or Likely Pathogenic variants from ClinVar as the default hotspot for panels. For Whole Exome Sequencing, the default hotspot assignment is currently not supported.
# Advanced options
# Variant Calling parameters
A set of parameters is used to assess the quality of every variant called in a sample. Two parameters, the primary coverage threshold and the minimum alternative fraction threshold, can cause the classification of the variant as “FAILED”. The “FAILED” variant calls will not be displayed.
Variant calls with an alternative allele count less than the primary coverage threshold will be classified as “FAILED” and will not be displayed.
Variant calls with an alternative allele frequency less than the minimum alternative fraction threshold will be classified as “FAILED” and will not be displayed.
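The two rules above amount to a simple check per variant call. The sketch below is illustrative only: the function name, the threshold values used in the example, and the computation of the alternative allele frequency as alternative count divided by total depth are assumptions, not SEQ's actual implementation.

```python
def is_failed(alt_count, total_depth, primary_coverage_threshold, min_alt_fraction):
    """Return True if a variant call would be classified as FAILED.

    A call fails when its alternative allele count is below the primary
    coverage threshold, or when its alternative allele frequency is below
    the minimum alternative fraction threshold.
    """
    # Assumption: frequency = alternative allele count / total read depth.
    alt_fraction = alt_count / total_depth if total_depth > 0 else 0.0
    return alt_count < primary_coverage_threshold or alt_fraction < min_alt_fraction


if __name__ == "__main__":
    # Hypothetical thresholds: at least 5 alternative reads and a 10% fraction.
    print(is_failed(alt_count=3, total_depth=100,
                    primary_coverage_threshold=5, min_alt_fraction=0.10))
    print(is_failed(alt_count=20, total_depth=100,
                    primary_coverage_threshold=5, min_alt_fraction=0.10))
```

Note that both comparisons are strict ("less than"), so a call that sits exactly on a threshold is not failed by that rule.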
# Other parameters
When calculating coverage metrics for the gene coverage and the kit’s on-target coverage percentages, SEQ uses four different thresholds. 1X and 5X are the preset values. The other two values may be customized by the user per upload.
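As an illustration of how such thresholds are typically applied, the sketch below computes the percentage of positions covered at or above each threshold. The interpretation ("depth greater than or equal to the threshold"), the function name, and the two custom values (20X and 50X) are assumptions chosen for the example; only 1X and 5X are stated above as presets.

```python
def coverage_percentages(depths, thresholds=(1, 5, 20, 50)):
    """Return {threshold: percent of positions with depth >= threshold}."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    total = len(depths)
    return {
        threshold: 100.0 * sum(depth >= threshold for depth in depths) / total
        for threshold in thresholds
    }


if __name__ == "__main__":
    # Per-base read depths over a small, made-up target region.
    depths = [0, 1, 4, 5, 20, 60, 30, 10, 2, 100]
    for threshold, percent in coverage_percentages(depths).items():
        print(f"{threshold}X: {percent:.1f}%")
```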
The default values of the advanced options are set under “Site Settings” in the Settings menu.
# Submit your data
As the last step, you can upload your data by clicking the “Continue” button to start the upload process. After clicking "Continue", you will see the "Case Information" screen. Please refer to the "Genomize's AI-Assisted Variant Prioritization" section for more information on entering the case information. You can then click the "Upload" button to see the number of analyses and the list of files matched for each analysis. Please be sure that both of these pieces of information are correct and hit "Approve" to start the upload process or "Cancel" to make changes.
When you start the upload, you will see the progress for each file. Transferred samples will immediately begin processing without waiting for the entire batch to finish uploading.
SEQ Platform's upload process is secure and performs a checksum to ensure the files are transferred correctly. Please do not close the browser tab or shut down your computer. Also, please ensure that your computer will not go into sleep/hibernation mode during the upload. Otherwise, the upload process will be aborted. Our upload process is resistant to intermittent loss of internet connection.
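If you want your own record of file integrity before uploading, you can compute a checksum locally. The sketch below uses SHA-256 from Python's standard library; the algorithm SEQ uses internally is not stated here, so this is an independent check rather than a reproduction of the platform's verification, and `file_sha256` is a hypothetical helper name.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large FASTQ/BAM files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        print(file_sha256(path), path)
```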
When the upload process is completed, you will be redirected to the corresponding Run's page, and your samples will be queued for analysis. Refresh the corresponding Run page to see the latest status of the analysis.
# VCF Upload
# Select a Previous Run or Create a New Run
You can upload your samples to a new run by selecting “Create New Run” under “Run Name” and entering a name in the “Name For New Run” field. You can also upload your samples to an existing run by selecting the previous run from the dropdown menu.
# Choose the Technology Type
Choose the next-generation sequencing machine associated with the samples. If you do not know which sequencing platform is used, you can select the "Unknown" option. Mixing different technologies in one run is not permitted.
# Select the VCF Files to be Uploaded
You can upload VCF files for SNV, CNV, SV, and STR calls from DRAGEN or similar tools for the same sample in any combination you choose. VCF files must have the .vcf.gz extension. Files are matched using the sample info field, not the file names. Multisample VCFs are not supported.
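Because matching relies on the sample information inside the file rather than on the filename, it can help to inspect each VCF before uploading. The sketch below assumes that the "sample info field" corresponds to the sample column name on the `#CHROM` header line, as defined by the VCF specification; `vcf_sample_name` is a hypothetical helper that also rejects multisample files.

```python
import gzip


def vcf_sample_name(path):
    """Return the single sample name from the #CHROM header of a .vcf.gz file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                columns = line.rstrip("\n").split("\t")
                # Sample columns follow the 9 fixed columns (CHROM .. FORMAT).
                samples = columns[9:]
                if len(samples) != 1:
                    raise ValueError(
                        f"expected exactly one sample, found {len(samples)}"
                    )
                return samples[0]
            if not line.startswith("#"):
                break  # reached the records without seeing a header line
    raise ValueError("no #CHROM header line found")


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        print(path, vcf_sample_name(path))
```

Files for the same sample (for example, an SNV file and a CNV file) should report the same name here.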
# Submit your data
As the last step, you can upload your data by clicking the “Continue” button to start the upload process. After clicking "Continue", you will see the "Case Information" screen. Please refer to the "Genomize's AI-Assisted Variant Prioritization" section for more information on entering the case information. You can then click the "Upload" button to see the number of analyses and the list of files matched for each analysis. Please be sure that both of these pieces of information are correct and hit "Approve" to start the upload process or "Cancel" to make changes.
When you start the upload, you will see the progress for each file. Transferred samples will immediately begin processing without waiting for the entire batch to finish uploading.
SEQ Platform's upload process is secure and performs a checksum to ensure the files are transferred correctly. Please do not close the browser tab or shut down your computer. Also, please ensure that your computer will not go into sleep/hibernation mode during the upload. Otherwise, the upload process will be aborted. Our upload process is resistant to intermittent loss of internet connection.
When the upload process is completed, you will be redirected to the corresponding Run's page, and your samples will be queued for analysis. Refresh the corresponding Run page to see the latest status of the analysis.
# Cloud Browser
To use the cloud browser, select or create your run, select the sequencing platform and the kit by following the directions above. After the kit selection, you will see the option to select either your “COMPUTER” or the “CLOUD BROWSER” as the data source.
When you select the "CLOUD BROWSER" option, click the PLUS (➕) button to open the cloud browser interface. Using the cloud browser, choose the files with which you want to start the analysis and click “DONE”. The rest of the process is the same as described above. Please note that there is a 3-minute interval between consecutive cloud upload processes, and files will be removed from your cloud account once the analysis starts.