# Data Upload

With the SEQ Platform, you can directly upload your files or use the cloud browser to browse and select the files pre-uploaded to your account.

# Direct Data Upload

Click the "Upload" button on your homepage and select the "Germline" option. Here, you can select the FASTQ or VCF option.

# FASTQ Upload

Upload page

# Select the files to upload

Click the "Browse" button under “File” and select all the files you want to analyze. For paired-end reads, make sure that you upload both read files. See the table below for the supported input file types:

| Platform | File Types | Batch Sample Upload |
|---|---|---|
| Illumina | .fastq.gz, .fq.gz | Supported |
| Ion Torrent | .bam | Supported |
| MGI | .fastq.gz, .fq.gz | Supported |
| PacBio | .bam | Supported |
| ONT | .fastq.gz, .fq.gz | Supported |

# Filename formatting for batch upload

# Illumina

You can upload multiple samples with two or more files. For batch uploads, we only support filenames following the “Illumina naming convention” e.g.

NA10831_ATCACG_L002_R1_001.fastq.gz
NA10831_ATCACG_L002_R2_001.fastq.gz

The filenames from Illumina platform are handled as below:

<name_field1>_<name_field2>_<lane_#>_<read_#>_<always_001>.fastq.gz

Name field 1, name field 2, lane #, and read # are used to match the corresponding files correctly. You can change name fields 1 and 2, provided they contain no spaces or underscore (_) characters. Following this naming convention, you can upload multiple samples (each with multiple fastq.gz files).

Please check the number of samples and matched files on the confirmation screen.
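Before uploading, it can help to check locally that your filenames follow the convention. The following is a minimal sketch, not the platform's actual matching logic: the regular expression, the function name `group_illumina_files`, and the choice to group files by the two name fields are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed pattern for <name_field1>_<name_field2>_<lane_#>_<read_#>_001.fastq.gz
# (.fq.gz is accepted too, since both extensions are listed as supported).
ILLUMINA_RE = re.compile(
    r"^(?P<name1>[^_\s]+)_(?P<name2>[^_\s]+)_(?P<lane>L\d+)_(?P<read>R[12])_001"
    r"\.(?:fastq|fq)\.gz$"
)

def group_illumina_files(filenames):
    """Group files by (name_field1, name_field2); reject non-matching names."""
    samples = defaultdict(list)
    for name in filenames:
        match = ILLUMINA_RE.match(name)
        if match is None:
            raise ValueError(f"not an Illumina-style filename: {name}")
        key = (match["name1"], match["name2"])
        samples[key].append((match["lane"], match["read"], name))
    # Sort so that files are ordered by lane, then by read number.
    return {key: sorted(files) for key, files in samples.items()}

groups = group_illumina_files([
    "NA10831_ATCACG_L002_R2_001.fastq.gz",
    "NA10831_ATCACG_L002_R1_001.fastq.gz",
])
```

Here `groups` holds one hypothetical sample keyed by `("NA10831", "ATCACG")` with its R1 and R2 files in order. The confirmation screen remains the authoritative check.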

# MGI

You can upload multiple samples with two or more files. For batch uploads, we only support filenames following the “MGI naming convention” e.g.

V12345678_L01_16_1.fastq.gz
V12345678_L01_16_2.fastq.gz

The filenames from MGI platform are handled as below:

<flowcell_id>_<lane_#>_<barcode>_<read_#>.fq.gz

Flowcell ID, lane #, barcode, and read # are used to match the corresponding files correctly. You can change the flowcell ID field, provided it contains no spaces or underscore (_) characters. If you have used more than one barcode for the same sample, you need to rename the files as follows:

Original file names:
sample1234_L01_16_1.fastq.gz
sample1234_L01_16_2.fastq.gz
sample1234_L01_17_1.fastq.gz
sample1234_L01_17_2.fastq.gz

Altered file names:
sample1234_L01_16_1.fastq.gz
sample1234_L01_16_2.fastq.gz
sample1234_L02_16_1.fastq.gz
sample1234_L02_16_2.fastq.gz

In this example, barcode numbers of the last two files are changed to 16, and their lane numbers are increased by 1.

Please check the number of samples and matched files on the confirmation screen.
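The multi-barcode renaming described above can be planned with a short script. This is only a sketch under stated assumptions: all input files belong to one sample, the lowest barcode is the one kept, and each extra lane/barcode pair is moved to the next unused lane number. The function name `plan_mgi_renames` and the regular expression are hypothetical and not part of the platform.

```python
import re

# Assumed pattern for <flowcell_id>_<lane_#>_<barcode>_<read_#>.fastq.gz or .fq.gz
MGI_RE = re.compile(
    r"^(?P<flowcell>[^_\s]+)_L(?P<lane>\d+)_(?P<barcode>\d+)_(?P<read>[12])"
    r"\.(?P<ext>(?:fastq|fq)\.gz)$"
)

def plan_mgi_renames(filenames):
    """Map each original filename of ONE sample to a single-barcode filename."""
    parsed = []
    for name in filenames:
        match = MGI_RE.match(name)
        if match is None:
            raise ValueError(f"not an MGI-style filename: {name}")
        parsed.append((name, match))
    # Assumption: keep the numerically lowest barcode for the whole sample.
    keep = min((m["barcode"] for _, m in parsed), key=int)
    used_lanes = {int(m["lane"]) for _, m in parsed if m["barcode"] == keep}
    lane_map = {}  # (original lane, original barcode) -> new lane number
    renames = {}
    for name, m in parsed:
        if m["barcode"] == keep:
            renames[name] = name
            continue
        key = (m["lane"], m["barcode"])
        if key not in lane_map:
            lane_map[key] = max(used_lanes) + 1
            used_lanes.add(lane_map[key])
        lane = str(lane_map[key]).zfill(len(m["lane"]))
        renames[name] = f"{m['flowcell']}_L{lane}_{keep}_{m['read']}.{m['ext']}"
    return renames

renames = plan_mgi_renames([
    "sample1234_L01_16_1.fastq.gz",
    "sample1234_L01_16_2.fastq.gz",
    "sample1234_L01_17_1.fastq.gz",
    "sample1234_L01_17_2.fastq.gz",
])
```

For the four files in the example, the sketch leaves the barcode 16 files unchanged and maps the barcode 17 files to lane `L02` with barcode 16, which matches the altered file names listed above.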

# IonTorrent

IonTorrent files need to be unaligned BAM files with .bam extension.

# Pacific Biosciences

Pacific Biosciences files need to be BAM files with .bam extension. You can upload multiple samples in a single batch. One file for each sample is expected.

# Oxford Nanopore Technologies

Oxford Nanopore Technologies files need to be in fastq.gz or fq.gz format. You can upload multiple samples in a single batch. One file for each sample is expected.

# Select a previous Run or Create a New Run

You can upload your samples to a new run by selecting “Create New Run” under “Run Name” and giving it a name on the “Name For New Run” field. You can also upload your samples to an existing run by selecting the previous run from the dropdown menu.

Run information is critical for Copy Number Variation analyses. Therefore, please make sure you organize the samples under runs in the same way as you process your sample materials. Ideally, a run is a set of samples coming from the same wet-lab and run process (flow cell, etc.).

# Choose the technology type

Choose the next-generation sequencing machine associated with the samples. Mixing different technologies in one run is not permitted.

# Choose the kit type

The SEQ platform has hundreds of different kits predefined in the system. A new kit can be defined with a set of target coordinates and a list of targeted genes. Adding a new kit typically takes one business day. For kit requests, please contact us through support@genomize.com.

Every kit is associated with a standardized analysis version in SEQ. Probe-based kits, primer-based kits, Illumina & MGI technology, Ion Torrent technology, germline analysis, and somatic analysis all have a preset analysis version.

# Hotspot VCF file upload

A predefined VCF file can be uploaded to SEQ. The VCF4.2 standard is supported. If the user does not upload a VCF as a hotspot, SEQ automatically subsets Pathogenic or Likely Pathogenic variants from ClinVar as the default hotspot for panels. For Whole Exome Sequencing, the default hotspot assignment is currently not supported.

# Advanced options

# Variant Calling parameters

A set of parameters is used to assess the quality of every variant called in a sample. Two parameters, the primary coverage threshold and the minimum alternative fraction threshold, can cause the classification of the variant as “FAILED”. The “FAILED” variant calls will not be displayed.

The variant calls with an alternative allele count less than the primary coverage threshold will be classified as “FAILED” and not be displayed.

The variant calls with an alternative allele frequency less than the minimum alternative fraction threshold will be classified as “FAILED” and not be displayed.
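The two rules above can be written as a small check. This is a minimal sketch, not the platform's implementation: the `VariantCall` fields, the function name `is_failed`, and the threshold values used in the example (5 reads and 0.2) are hypothetical; the actual defaults are configured in the platform.

```python
from dataclasses import dataclass

@dataclass
class VariantCall:
    alt_count: int    # reads supporting the alternative allele
    total_depth: int  # all reads covering the position

def is_failed(call, primary_coverage_threshold, min_alt_fraction):
    """Return True if the call would be classified as FAILED by either rule."""
    # Rule 1: alternative allele count below the primary coverage threshold.
    if call.alt_count < primary_coverage_threshold:
        return True
    # Rule 2: alternative allele frequency below the minimum fraction threshold.
    alt_fraction = call.alt_count / call.total_depth if call.total_depth else 0.0
    return alt_fraction < min_alt_fraction

# Hypothetical thresholds, chosen only for this example.
low_count = is_failed(VariantCall(alt_count=3, total_depth=10), 5, 0.2)
low_fraction = is_failed(VariantCall(alt_count=8, total_depth=100), 5, 0.2)
passing = is_failed(VariantCall(alt_count=40, total_depth=90), 5, 0.2)
```

In this example the first call fails on count, the second fails on fraction (8/100 = 0.08), and the third passes both rules.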

# Other parameters

When calculating coverage metrics for the gene coverage and the kit’s on-target coverage percentages, SEQ uses four different thresholds. 1X and 5X are the preset values. The other two values may be customized by the user per upload.
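As an illustration of how four thresholds translate into coverage percentages, the sketch below computes the share of target bases covered at or above each threshold. It assumes that a coverage percentage means "percentage of bases with depth at or above the threshold", which the text does not state explicitly. The function name and the custom values 20 and 50 are hypothetical.

```python
def coverage_percentages(depths, custom_thresholds):
    """Percentage of bases with depth >= each threshold (1X, 5X, two custom)."""
    if not depths:
        raise ValueError("depths must not be empty")
    if len(custom_thresholds) != 2:
        raise ValueError("exactly two custom thresholds are expected")
    thresholds = [1, 5, *custom_thresholds]  # 1X and 5X are the preset values
    total = len(depths)
    return {
        f"{t}X": 100.0 * sum(depth >= t for depth in depths) / total
        for t in thresholds
    }

# Per-base depths for a tiny, made-up target region.
metrics = coverage_percentages([0, 3, 10, 60], custom_thresholds=(20, 50))
```

For these four made-up depths, `metrics` maps `1X` to 75.0, `5X` to 50.0, and both `20X` and `50X` to 25.0.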

Advanced options

The default values of the advanced options are set under “Site Settings” in the Settings menu.

# Submit your data

As the last step, you can upload your data by clicking the “Continue” button to start the upload process. After clicking "Continue", you will see the "Case Information" screen. Please refer to the "Genomize's AI-Assisted Variant Prioritization" section for more information on entering the case information. You can then click the "Upload" button to see the number of analyses and the list of files matched for each analysis. Please be sure that both of these pieces of information are correct and hit "Approve" to start the upload process or "Cancel" to make changes.

When you start the upload, you will see the progress for each file. Transferred samples will immediately begin processing without waiting for the entire batch to finish uploading.

SEQ Platform's upload process is secure and performs a checksum to ensure the files are transferred correctly. Please do not close the browser tab or shut down your computer. Also, please ensure that your computer will not go into sleep/hibernation mode during the upload. Otherwise, the upload process will be aborted. Our upload process is resistant to intermittent loss of internet connection.

When the upload process is completed, you will be redirected to the corresponding Run's page, and your samples will be queued for analysis. Refresh the corresponding Run page to see the latest status of the analysis.

# Small Variant Detection

SEQ has standard analysis versions pre-setup for every kit defined in the system. Data processing and variant calling are handled differently based on the sample type, sequencing platform, and selected analysis pipeline.

# Analysis versions for long-read WGS

| Name | Explanation | Alignment/ variant calling | CMRG Support\* | SV Calling | Phasing | Available Genome Versions | Available Platforms |
|---|---|---|---|---|---|---|---|
| Sentieon minimap2-DNAScope-SV Calling-Phasing-germline | Optimized for ONT long-read WGS samples | Sentieon minimap2 / DNAScope | No | Sentieon LongReadSV | VariantPhaser | hg38 | Oxford Nanopore |
| PacBio pbmm2-DeepVariant-SV calling-Phasing-germline | Optimized for PacBio long-read WGS samples | PacBio pbmm2 / DeepVariant | Paraphase\*\* | PBSV, HifiCNV, TRGT | HiPhase | hg38 | PacBio |

* CMRG: Challenging Medically Relevant Genes (Wagner et al., 2022)

** Paraphase: PacBio's recommended tool for detecting segmental duplication regions and medically relevant genes, such as SMN1/SMN2 and HBA1/HBA2 (Chen et al., 2024). See Targeted Variant Calling for more details.

# Analysis versions for short-read WGS

| Name | Explanation | Alignment/ variant calling | BAM processing | SV Calling | Available Genome Versions | Available Platforms |
|---|---|---|---|---|---|---|
| Sentieon BWA-DNAScope-SV Calling-germline | Optimized for WGS samples prepared with a PCR enrichment step | Sentieon BWA / DNAScope | MarkDuplicate | Delly, Manta, Tiddit, ExpansionHunter | hg19, hg38 | Illumina, MGI |
| Sentieon BWA (PCRfree)-DNAScope-SV Calling-germline | Optimized for WGS samples prepared without a PCR enrichment step | Sentieon BWA / DNAScope | MarkDuplicate | Delly, Manta, Tiddit, ExpansionHunter | hg19, hg38 | Illumina, MGI |
| BWA-Freebayes-SV Calling-germline | Optimized for WGS samples | BWA / Freebayes | PCR Dedup + Indel Realignment | Delly, Manta, Tiddit, ExpansionHunter | hg19, hg38 | Illumina, MGI |
| BWA-GATK-SV Calling-germline | Optimized for WGS samples | BWA / GATK | PCR Dedup + Indel Realignment | Delly, Manta, Tiddit, ExpansionHunter | hg19, hg38 | Illumina, MGI |

# Analysis versions for capture based targeted panels, including WES

| Name | Explanation | Alignment/ variant calling\*\* | BAM processing | CNV Calling (Cohort Mode) | Available Genome Versions | Available Platforms |
|---|---|---|---|---|---|---|
| Sentieon BWA-DNAScope- germline | Optimized for capture-based germline kits. | Sentieon BWA / DNAScope | MarkDuplicate | GATK-CNV + delly\* | hg19, hg38 | Illumina, MGI |
| BWA-Freebayes-PCR dedup - germline | Optimized for capture-based germline kits. | BWA / Freebayes | PCR Dedup | GATK-CNV + delly\* | hg19, hg38 | Illumina, MGI |
| BWA-Freebayes-PCR dedup-Indel Realignment - germline | Optimized for capture-based germline kits. Default analysis for most kits. | BWA / Freebayes | PCR Dedup + Indel Realignment | GATK-CNV + delly\* | hg19, hg38 | Illumina, MGI |
| BWA-GATK-PCR dedup-Indel Realignment - germline | Optimized for capture-based germline kits. Uses a GATK variant caller. | BWA / GATK | PCR Dedup + Indel Realignment | GATK-CNV + delly\* | hg19, hg38 | Illumina, MGI |
| BWA-Freebayes High Sensitivity-PCR dedup-Indel Realignment - germline | Optimized for capture-based germline kits to call variants with a low fraction (<20%). | BWA / Freebayes | PCR Dedup + Indel Realignment | GATK-CNV + delly\* | hg19, hg38 | Illumina, MGI |

* Delly is utilized only for panels with more than 100 genes.

** Mitochondrial analysis is performed following GATK best practices, and gnomAD filters are applied (for details, see Mitochondrial Calling).

# Analysis versions for amplicon based panels

| Name | Explanation | Alignment/ variant calling | BAM processing | Primer Trimming | Available Genome Versions | Available Platforms |
|---|---|---|---|---|---|---|
| BWA-Freebayes-BamKeser-Indel Realignment - germline | Optimized for amplicon-based germline kits. | BWA / Freebayes | Indel Realignment | BamKeser | hg19, hg38 | Illumina, MGI |
| BWA-Freebayes-BamKeser - germline | Optimized for amplicon-based germline kits. Does not perform indel realignment step. | BWA / Freebayes | NA | BamKeser | hg19, hg38 | Illumina, MGI |
| CVD Specific-BWA-Bowtie2-Freebayes-Bamkeser - germline | Optimized for amplicon-based CVD kits. | BWA + Bowtie2 / Freebayes | NA | BamKeser | hg19 | Illumina, MGI |
| Thalassemia Specific-BWA-Freebayes-Bamkeser - germline | Optimized for amplicon-based thalassemia kits. | BWA / Freebayes | NA | BamKeser | hg19 | Illumina, MGI |
| CAH Specific v2-BWA-Freebayes-Bamkeser - germline | Optimized for amplicon-based CAH kits. | BWA / Freebayes | NA | BamKeser | hg19 | Illumina, MGI |

**BamKeser is our in-house designed and precisely working primer trimming tool.

# Analysis versions for IonTorrent uBAM samples

| Name | Explanation | Alignment/ variant calling | BAM processing | Primer Trimming | Available Genome Versions | Available Platforms |
|---|---|---|---|---|---|---|
| Torrent Suite 5.8 - No Trimming - Default Parameters v2 - germline | Optimized for IonTorrent samples. Does not perform primer trimming. | Torrent Suite 5.8 | NA | NA | hg19 | IonTorrent |
| Torrent Suite 5.8 - No Trimming - BRCA specific - germline | Optimized for IonTorrent BRCA samples. | Torrent Suite 5.8 | NA | NA | hg19 | IonTorrent |
| Torrent Suite 5.8 - No Trimming - CFTR specific - germline | Optimized for IonTorrent CFTR samples. | Torrent Suite 5.8 | NA | NA | hg19 | IonTorrent |


After variant calling, Genomize-SEQ processes the resulting VCF file into a Genomize standard VCF file, which can be downloaded through the platform. Each line of the Genomize standard VCF has gstd=1 in the INFO field. Standardization of the VCF file includes the following important steps:

  • Minimal variant representation: Some callers produce redundant bases at the left-hand or right-hand side of either alternative or reference allele. This redundancy has to be removed to obtain the correct annotation of variants in the subsequent steps.
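A common way to obtain a minimal representation is to trim shared bases from the right, then from the left, while keeping at least one base in each allele. The sketch below illustrates that general technique; it is not taken from the platform's code, and the function name `minimal_representation` is hypothetical. It does not perform left-alignment in repeats, which is a separate normalization step.

```python
def minimal_representation(pos, ref, alt):
    """Trim redundant bases shared by REF and ALT; return (pos, ref, alt)."""
    # Trim shared trailing bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Trim shared leading bases; each removed base shifts the position right.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt

deletion = minimal_representation(100, "CTCC", "CCC")  # redundant right-hand bases
snv = minimal_representation(10, "TAC", "TGC")         # redundant bases on both sides
```

In the first call the two trailing `C` bases are redundant, so the variant reduces to `CT` > `C` at position 100. In the second call the flanking `T` and `C` are removed, leaving `A` > `G` at position 11.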

# Mitochondrial Variant Detection

The mitochondrial variant calling pipeline follows GATK best practices, using the revised Cambridge Reference Sequence (NC_012920.1) as the reference mitochondrial genome and Mutect2 as the variant caller.

Filtering and genotype assignment for mitochondrial variants are performed in line with the recommendations from gnomAD. Briefly, SNVs with a Variant Allele Frequency (VAF) below 0.01 are removed. Variants with VAF equal to or higher than 0.95 are classified as homoplasmic. Variants with VAF lower than 0.95 are classified as heteroplasmic. Any variants in previously reported artifact-prone sites (positions 301, 302, 310, 316, 3107, and 16182) are ignored.
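The thresholds above can be summarized in a few lines. This is a sketch of the stated rules only, not the pipeline's code: the function name, the `is_snv` parameter, and the use of `None` for removed or ignored variants are assumptions made for the example.

```python
# Artifact-prone positions listed in the text above.
ARTIFACT_PRONE_SITES = {301, 302, 310, 316, 3107, 16182}

def classify_mito_variant(position, vaf, is_snv=True):
    """Return 'homoplasmic', 'heteroplasmic', or None if the variant is dropped."""
    if position in ARTIFACT_PRONE_SITES:
        return None  # ignored: previously reported artifact-prone site
    if is_snv and vaf < 0.01:
        return None  # removed: SNV below the minimum VAF
    return "homoplasmic" if vaf >= 0.95 else "heteroplasmic"

high = classify_mito_variant(1000, 0.97)
mixed = classify_mito_variant(1000, 0.40)
artifact = classify_mito_variant(310, 0.40)
```

With these inputs, `high` is classified as homoplasmic, `mixed` as heteroplasmic, and `artifact` is dropped because position 310 is on the artifact-prone list.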

In targeted panel analyses (including WES), mitochondrial variant detection is only available for panels with distinct chrM targets in their bed files.

In WGS analyses, mitochondrial variant detection uses the same variant callers and filtering steps as those applied to other chromosomal regions. If your WGS sample preparation is optimized for mitochondrial DNA isolation and enrichment, please contact support for further assistance.

# Structural Variant Detection

SEQ Platform can detect and report the structural variants listed below. For single samples, the maximum SV size is limited to 100,000 bp.

For cohort CNV analysis in WGS samples, please refer to the Copy Number Variations section.

# Supported Structural Variants

| SV-Group | Abbreviation | Supporting callers |
|---|---|---|
| Deletion | DEL | Manta, Tiddit, Delly, PBSV, HifiCNV, Paraphase |
| Duplication | DUP | Manta, Tiddit, Delly, PBSV, HifiCNV, Paraphase |
| Insertion | INS | Manta, Delly, PBSV |
| Inversion | INV | Tiddit, Delly, PBSV |
| Breakend (Unresolved)\* | BND | Manta, Tiddit, Delly, PBSV |
| Short Tandem Repeat | STR | ExpansionHunter, TRGT |
| Complex\*\* | CPX | Manta, Tiddit, Delly |

* Structural variants that cannot be classified into any other type are listed as BND.

** If more than one type of SV is detected in combination, they are classified as a CPX variant, e.g. DEL:INS, DUP:INV, etc. Currently only DEL:INS variants are supported.

# Variant callers used for SV detection

| Tool | Algorithm | Supported SV-types |
|---|---|---|
| Manta | Manta divides the SV and indel discovery process into two primary steps: (1) scanning the genome to find SV-associated regions; (2) analysis, scoring, and output of SVs found in these regions. | Deletions, Duplications, Deletion-Insertions, Insertions, Breakends |
| Delly | In DELLY, short-range and long-range paired-end libraries are analyzed for discordantly mapped read pairs. Paired-end predicted structural variants are then refined using split reads and reported at single-nucleotide breakpoint resolution. In addition to the general parameters applied to SVs, two filters are used for DELLY: an insert size cutoff for split reads of ≥ 15 bp and a minimum paired-end MAPQ of ≥ 20. | Deletions, Duplications, Deletion-Insertions, Insertions, Inversions, Breakends |
| Tiddit | TIDDIT detects structural variants by examining sequences for discordant pairs, split reads, and supplementary alignments, which must exceed a specified quality threshold. It uses a clustering method similar to DBSCAN, where a cluster forms if sufficient signals are within a designated distance. Clusters lacking enough signals are discarded; otherwise, they are included in the output regardless of other quality filters. | Deletions, Duplications, Inversions, Breakends |
| ExpansionHunter | ExpansionHunter is a tool designed for targeted genotyping of short tandem repeats (STRs) and flanking variants. It operates by analyzing BAM files to find reads that span, flank, or are fully contained within each targeted repeat. | Short Tandem Repeats |
| PBSV | PBSV is a suite of tools to call and analyze structural variants in diploid genomes from PacBio single molecule real-time sequencing (SMRT) reads. | Deletions, Duplications, Insertions, Inversions, Breakends |
| HifiCNV | HifiCNV is a tool designed for calling copy number variants (CNVs) using high-fidelity (HiFi) sequencing reads. It offers optimized segmentation and calling for germline whole genome sequencing (WGS) using HiFi reads, and it automatically estimates and corrects GC-bias. | Deletions, Duplications |
| TRGT | TRGT is a tool for targeted genotyping of tandem repeats from PacBio HiFi data. In addition to the basic size genotyping, TRGT profiles sequence composition, mosaicism, and CpG methylation of each analyzed repeat and visualizes reads overlapping the repeats. | Short Tandem Repeats |
| Paraphase | Paraphase is a Python tool that takes HiFi aligned BAMs as input (whole-genome or enrichment), phases haplotypes for genes of the same family, determines copy numbers, and makes phased variant calls. Paraphase supports 160 segmental duplication regions. | Deletions, Duplications |

# Filtering Parameters Applied to SVs

  1. Allele Fraction (AF) Filter: SVs with fractions lower than 0.2 are filtered out.

  2. Pass Filter: SVs without the “PASS” flag assigned by their respective callers are filtered out.

  3. Depth of Coverage (DP) Filter: SVs with fewer than 10 supporting reads are filtered out.

  4. No Call Filter: SVs that have a 'no call' status in tandem repeat VCFs are filtered out, ensuring that only fully determined genotypes are analyzed.

  5. The Same Gene and Same Oriented Breakpoint (BND) Filter: Structural variants that involve the same gene and are oriented in the same direction are filtered out to reduce complexity and focus on more relevant genomic rearrangements.

  6. Chromosome Filter: SVs that are not on chromosomes 1-22, X are filtered out.

  7. Genomic Region Filter: Variants overlapping with predefined blacklisted regions (Amemiya et al., 2019) are filtered out. The complete list can be accessed here.
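Four of the filters above (AF, PASS, DP, and chromosome) depend only on values carried by the SV record itself and can be sketched as below. The record layout (a dictionary with `chrom`, `af`, `filter`, and `supporting_reads` keys) and the function name are hypothetical. The no-call, same-gene BND, and blacklist filters need genotype, gene, and region data and are left out of this sketch.

```python
# Chromosomes 1-22 and X, as stated in the Chromosome Filter.
ALLOWED_CHROMS = {str(n) for n in range(1, 23)} | {"X"}

def sv_filter_reasons(sv):
    """Return the names of the filters that would remove this SV (empty if kept)."""
    reasons = []
    if sv["af"] < 0.2:
        reasons.append("AF")
    if sv["filter"] != "PASS":
        reasons.append("PASS")
    if sv["supporting_reads"] < 10:
        reasons.append("DP")
    chrom = sv["chrom"]
    if chrom.startswith("chr"):
        chrom = chrom[3:]  # accept both "chr7" and "7" style names
    if chrom not in ALLOWED_CHROMS:
        reasons.append("CHROM")
    return reasons

kept = sv_filter_reasons(
    {"chrom": "chr7", "af": 0.35, "filter": "PASS", "supporting_reads": 14}
)
dropped = sv_filter_reasons(
    {"chrom": "chrY", "af": 0.10, "filter": "LowQual", "supporting_reads": 4}
)
```

Returning the list of reasons rather than a single boolean makes it easier to see why a particular SV is missing from the results.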

# Special Note on Repeat Finding

The repeat catalog focuses exclusively on tandem repeat regions known to cause diseases. We employ gnomAD's algorithm for detecting repeat unit motifs and then use ExpansionHunter on these de novo tandem repeat units to identify repeat sequences.

Occasionally, short-read sequencing technology falls short in accurately genotyping tandem repeats. In particular, tools like ExpansionHunter are not designed to genotype multiallelic repeats where different motifs might vary from each other. As a solution, we run ExpansionHunter for different motifs at the same locus to provide information for all repeat units present in gnomAD. This approach helps capture the variability and complexity of tandem repeats in genomic studies.

# Targeted Variant Calling

# Targeted Variant callers used for Challenging Medically Relevant Genes

| Tool | Supported Region Names (Genes) | Variant Types\*\* |
|---|---|---|
| Paraphase\* | SMN1 (SMN1, SMN2); HBA (HBA1, HBA2); PMS2 (PMS2); RCCX (CYP21A2, C4A, C4B, TNXB); STRC (STRC); NCF1 (NCF1); IKBKG (IKBKG); OPN1LW (OPN1LW, OPN1MW, OPN1MW2, OPN1MW3, TEX28) | Copy Number Variant; Small Variant (Coming Soon); Structural Variant (Coming Soon) |

* Paraphase: More regions will be added soon.

# gVCF Upload

# Select a Previous Run or Create a New Run

You can upload your samples to a new run by selecting “Create New Run” under “Run Name” and giving it a name on the “Name For New Run” field. You can also upload your samples to an existing run by selecting the previous run from the dropdown menu.

# Choose the Technology Type

Choose the next-generation sequencing machine associated with the samples. If you do not know which sequencing platform is used, you can select the "Unknown" option. Mixing different technologies in one run is not permitted.

# Select the VCF Files to be Uploaded

You can upload one small-variant gVCF file (for SNVs) per sample. Additionally, you may upload VCF files for CNVs, SVs, and STRs generated by DRAGEN or similar tools for the same sample, in any combination. Files are matched based on the sample information field, not the file names. Multisample VCFs are not supported.
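Because files are matched by sample rather than by file name, it can be useful to check the sample name in each file before uploading. The sketch below assumes that the "sample information field" corresponds to the sample column of the `#CHROM` header line, which is where the VCF format stores sample names. The function names are hypothetical.

```python
def vcf_sample_names(header_lines):
    """Return the sample names from the #CHROM header line of a VCF."""
    for line in header_lines:
        if line.startswith("#CHROM"):
            columns = line.rstrip("\n").split("\t")
            # Columns 0-8 are CHROM..FORMAT; sample names start at column 9.
            return columns[9:]
    raise ValueError("no #CHROM header line found")

def single_sample_name(header_lines):
    """Return the only sample name, rejecting multisample or sample-less VCFs."""
    samples = vcf_sample_names(header_lines)
    if len(samples) != 1:
        raise ValueError(f"expected exactly one sample, found {len(samples)}")
    return samples[0]

example_header = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA12878\n",
]
sample = single_sample_name(example_header)
```

For a compressed file, the same functions can be fed from `gzip.open(path, "rt")`, which yields decoded text lines. Files that report the same sample name would then be expected to end up in the same analysis.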

# Supported Variant Callers

| Upload Type | Variant Types | Supported Callers | File Format (Extension) | Multisample Support |
|---|---|---|---|---|
| Small Variant | SNV, INDEL | Dragen (v4.1, v4.2, v4.3, v4.4) | gVCF (.vcf.gz, .g.vcf.gz, .genome.vcf.gz) | No |
| Copy Number Variant | CNV | Dragen-CNV, HifiCNV | VCF (.vcf.gz) | No |
| Structural Variant | DEL (Deletion), DUP (Duplication), INV (Inversion), INS (Insertion), BND (Breakends)\*, CPX (Complex)\*\* | Delly, Dragen-SV, Dragen-Targeted Callers (Coming Soon), Manta, PBSV, Sentieon Long Read SV, Sniffles2, TIDDIT | VCF (.vcf.gz) | No |
| Short Tandem Repeat | STR\*\*\* | ExpansionHunter, Dragen-CNV, TRGT | VCF (.vcf.gz) | No |

* BND (Breakends): 5'–3' fusion transcript events are supported, with potential formation of chimeric transcripts.

** CPX (Complex): Combination of structural variant types. Currently only DEL:INS variants are supported.

*** STR (Short Tandem Repeat): Only disease-associated STR variants in the gnomAD STR catalog are supported. For details, see the gnomAD STR catalog.

# Unsupported Variant Callers and VCF Version Compatibility

Custom integration may be required to fully support unlisted or unsupported variant callers. Please contact support for assistance. Only VCF version 4.1 or newer is supported for Copy Number, Structural, and Short Tandem Repeat (STR) VCF files.
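The VCF version is declared on the first line of the file as `##fileformat=VCFv<major>.<minor>`, so the version requirement can be checked before uploading. This is a minimal sketch with hypothetical function names; how the platform itself validates the version is not described here.

```python
import re

def vcf_version(first_line):
    """Parse '##fileformat=VCFv4.2' into a (major, minor) tuple."""
    match = re.match(r"^##fileformat=VCFv(\d+)\.(\d+)", first_line)
    if match is None:
        raise ValueError("missing or malformed ##fileformat line")
    return int(match[1]), int(match[2])

def is_supported_version(first_line):
    """True for VCF 4.1 or newer, the stated minimum for CNV, SV, and STR files."""
    return vcf_version(first_line) >= (4, 1)

ok = is_supported_version("##fileformat=VCFv4.2")
too_old = is_supported_version("##fileformat=VCFv4.0")
```

Comparing `(major, minor)` tuples avoids the pitfalls of comparing version strings or floats directly.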

# Targeted Caller Variants

Small variants in gVCF files generated by Dragen targeted callers are supported. If the gVCF file already includes these variants, there is no need for a separate VCF file. However, if the variants are not included, they should be merged into the gVCF file prior to uploading. For assistance, please contact support.

Structural variants generated by Dragen targeted callers can be uploaded as an additional JSON file. VCF files generated by Dragen targeted callers are currently not supported.

| Upload Type | Variant Types | Supported Callers | File Format (Extension) | Multisample Support |
|---|---|---|---|---|
| Structural Variant | DEL (Deletion), DUP (Duplication) | Dragen (Coming Soon) | VCF (.targeted.vcf.gz) | No |
| Targeted Caller Variant | DEL (Deletion), DUP (Duplication) | Dragen | JSON (.targeted.json) | No |

# Mitochondrial Variants

The Revised Cambridge Reference Sequence (rCRS, NC_012920.1), which is the recommended sequence for clinical use (McCormick et al., 2020), is used as the reference for the mitochondrial genome regardless of the genome version used in VCF generation. If the VCF file contains variants called against the older Yoruban (YRI) mitochondrial reference genome, errors may result due to incompatibility with our annotation sources. Unsupported chrM variants should also be removed before upload to prevent genome compatibility issues. For assistance, or if issues arise, please contact support.

# Submit your data

As the last step, you can upload your data by clicking the “Continue” button to start the upload process. After clicking "Continue", you will see the "Case Information" screen. Please refer to the "Genomize's AI-Assisted Variant Prioritization" section for more information on entering the case information. You can then click the "Upload" button to see the number of analyses and the list of files matched for each analysis. Please be sure that both of these pieces of information are correct and hit "Approve" to start the upload process or "Cancel" to make changes.

When you start the upload, you will see the progress for each file. Transferred samples will immediately begin processing without waiting for the entire batch to finish uploading.

SEQ Platform's upload process is secure and performs a checksum to ensure the files are transferred correctly. Please do not close the browser tab or shut down your computer. Also, please ensure that your computer will not go into sleep/hibernation mode during the upload. Otherwise, the upload process will be aborted. Our upload process is resistant to intermittent loss of internet connection.

When the upload process is completed, you will be redirected to the corresponding Run's page, and your samples will be queued for analysis. Refresh the corresponding Run page to see the latest status of the analysis.

# Filtering Parameters Applied to Small Variants

  1. Pass Filter: Small variants without the “PASS” flag assigned by their respective callers are filtered out.

  2. No Call Filter: Small variants with a 'no call' status are excluded, ensuring that only fully determined genotypes are included in the analysis.

  3. Chromosome Filter: Small variants that are not on chromosomes 1-22, X, or M are filtered out.

# VCF Upload

# Select a Previous Run or Create a New Run

You can upload your samples to a new run by selecting “Create New Run” under “Run Name” and giving it a name on the “Name For New Run” field. You can also upload your samples to an existing run by selecting the previous run from the dropdown menu.

# Choose the Technology Type

Choose the next-generation sequencing machine associated with the samples. If you do not know which sequencing platform is used, you can select the "Unknown" option. Mixing different technologies in one run is not permitted.

# Select the VCF Files to be Uploaded

You can upload VCF files for SNVs, CNVs, SVs, and STRs from DRAGEN or other similar tools for the same sample, in any combination. VCF files should have the .vcf.gz extension. Files are matched using the sample information field, not the file names. Multisample VCFs are not supported.

# Supported Variant Callers

| Upload Type | Variant Types | Supported Callers | File Format (Extension) | Multisample Support |
|---|---|---|---|---|
| Small Variant | SNV, INDEL | DeepVariant, Dragen, Freebayes, GATK - Haplotype Caller, Ion Torrent Variant Caller, Isaac Variant Caller, Mutect2, Pivat, Sentieon DNAscope, TNScope, VarDict | VCF (.vcf.gz) | No |
| Copy Number Variant | CNV | Dragen, HifiCNV | VCF (.vcf.gz) | No |
| Structural Variant | DEL (Deletion), DUP (Duplication), INV (Inversion), INS (Insertion), BND (Breakends)\*, CPX (Complex)\*\* | Delly, Dragen, Manta, PBSV, Sentieon Long Read SV, Sniffles2, TIDDIT | VCF (.vcf.gz) | No |
| Short Tandem Repeat | STR\*\*\* | ExpansionHunter, DRAGEN, TRGT | VCF (.vcf.gz) | No |

* BND (Breakends): 5'–3' fusion transcript events are supported, with potential formation of chimeric transcripts.

** CPX (Complex): Combination of structural variant types. Currently only DEL:INS variants are supported.

*** STR (Short Tandem Repeat): Only disease-associated STR variants in the gnomAD STR catalog are supported. For details, see the gnomAD STR catalog.

# Unsupported Variant Callers and VCF Version Compatibility

Please note that using unlisted or unsupported variant callers may result in inaccurate VCF metrics. Unrecognized callers are categorized as "other," which may limit the capture of certain metrics. In some cases, custom integration may be required to support these callers fully. Only VCF version 4.1 or newer is supported for Copy Number, Structural, and Short Tandem Repeat (STR) VCF files. If issues persist or critical data appears missing, please contact support for assistance.

# Mitochondrial Variants

The Revised Cambridge Reference Sequence (rCRS, NC_012920.1), which is the recommended sequence for clinical use (McCormick et al., 2020), is used as the reference for the mitochondrial genome regardless of the genome version used in VCF generation. If the VCF file contains variants called against the older Yoruban (YRI) mitochondrial reference genome, errors may result due to incompatibility with our annotation sources. Unsupported chrM variants should also be removed before upload to prevent genome compatibility issues. For assistance, or if issues arise, please contact support.

# Submit your data

As the last step, you can upload your data by clicking the “Continue” button to start the upload process. After clicking "Continue", you will see the "Case Information" screen. Please refer to the "Genomize's AI-Assisted Variant Prioritization" section for more information on entering the case information. You can then click the "Upload" button to see the number of analyses and the list of files matched for each analysis. Please be sure that both of these pieces of information are correct and hit "Approve" to start the upload process or "Cancel" to make changes.

When you start the upload, you will see the progress for each file. Transferred samples will immediately begin processing without waiting for the entire batch to finish uploading.

SEQ Platform's upload process is secure and performs a checksum to ensure the files are transferred correctly. Please do not close the browser tab or shut down your computer. Also, please ensure that your computer will not go into sleep/hibernation mode during the upload. Otherwise, the upload process will be aborted. Our upload process is resistant to intermittent loss of internet connection.

When the upload process is completed, you will be redirected to the corresponding Run's page, and your samples will be queued for analysis. Refresh the corresponding Run page to see the latest status of the analysis.

# Cloud Browser

To use the cloud browser, select or create your run, select the sequencing platform and the kit by following the directions above. After the kit selection, you will see the option to select either your “COMPUTER” or the “CLOUD BROWSER” as the data source.

Data source selection

When you select the "CLOUD BROWSER" option, click the PLUS (➕) button to open the cloud browser interface. Using the cloud browser, you can choose the files with which you want to start the analysis and click “DONE”. The rest of the process is the same as described above. Please note that there is a 3-minute duration between each cloud upload process, and files will be removed from your cloud account upon starting the analysis.