Omics Data Format
The following data formats can be analyzed online:
Paired-end FASTQ files, with the suffixes _R1.fastq.gz and _R2.fastq.gz, or _R1.fq.gz and _R2.fq.gz.
unzipped VCF files, with suffixes of
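As a rough illustration of the paired-end naming rule, the Python sketch below checks that every R1 file has its R2 mate, using only the FASTQ suffixes listed above. The function names and the example file names are hypothetical and are not part of the platform.

```python
from typing import List, Optional

# Paired-end suffixes listed in this document (R1 suffix -> matching R2 suffix).
PAIRED_SUFFIXES = {
    "_R1.fastq.gz": "_R2.fastq.gz",
    "_R1.fq.gz": "_R2.fq.gz",
}


def expected_mate(r1_name: str) -> Optional[str]:
    """Return the R2 file name expected for an R1 file.

    Returns None when the name does not end with an accepted R1 suffix.
    """
    for r1_suffix, r2_suffix in PAIRED_SUFFIXES.items():
        if r1_name.endswith(r1_suffix):
            return r1_name[: -len(r1_suffix)] + r2_suffix
    return None


def find_unpaired(file_names: List[str]) -> List[str]:
    """List the R1 files whose R2 mate is missing from file_names."""
    present = set(file_names)
    missing = []
    for name in file_names:
        mate = expected_mate(name)
        if mate is not None and mate not in present:
            missing.append(name)
    return missing
```

Running `find_unpaired` on the list of files before submitting them is one way to catch a missing or misnamed R2 file early.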
We suggest standardizing the file names of updated data files as follows:
Here are some examples of file names:
Fields in the file name are explained as follows:
| Omics type | Character |
| --- | --- |
| DNA WGS | WGS |
| DNA WES | WES |
| Platform | Character |
| --- | --- |
| illumina | ILM |
| BGI | BGI |
| Pacbio | Pacbio |
| Nanopore | ONT |
Please note that the table lists commonly used representative sequencing platforms and is not an exhaustive compilation of all existing sequencing platforms. If you use a sequencing platform that is not listed in the table, please use a 2 to 6 letter abbreviation to represent it.
| Platform | Machine | Character |
| --- | --- | --- |
| illumina | XTen | XTen |
| illumina | Novasq | Nova |
| illumina | Hiseq4000 | Hiseq4000 |
| illumina | Hiseq2500 | Hiseq2500 |
| illumina | 10x | 10x |
| BGI | SEQ500 | SEQ500 |
| BGI | SEQ2000 | SEQ2000 |
| BGI | DNBSEQ-T7 | T7 |
| Pacbio | Sequel | Sequel |
| Pacbio | Sequel II | Sequel2 |
| Nanopore | PromethION 24 | P24 |
| Nanopore | MinION | MinION |
Please note that the table lists commonly used representative sequencing machines and is not an exhaustive compilation of all existing sequencing machines. If you use a sequencing machine that is not listed in the table, please use a 2 to 6 letter abbreviation to represent it.
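The lookup described above can be sketched in Python as follows. The mapping is copied from the machine table, while the function name `machine_code`, the `custom` parameter, and the strict reading of "2 to 6 letters" as ASCII letters only are assumptions made for illustration.

```python
import re
from typing import Optional

# Machine codes copied from the table above: (platform, machine) -> code.
MACHINE_CODES = {
    ("illumina", "XTen"): "XTen",
    ("illumina", "Novasq"): "Nova",
    ("illumina", "Hiseq4000"): "Hiseq4000",
    ("illumina", "Hiseq2500"): "Hiseq2500",
    ("illumina", "10x"): "10x",
    ("BGI", "SEQ500"): "SEQ500",
    ("BGI", "SEQ2000"): "SEQ2000",
    ("BGI", "DNBSEQ-T7"): "T7",
    ("Pacbio", "Sequel"): "Sequel",
    ("Pacbio", "Sequel II"): "Sequel2",
    ("Nanopore", "PromethION 24"): "P24",
    ("Nanopore", "MinION"): "MinION",
}

# Assumption: "2 to 6 letters" is read as 2 to 6 ASCII letters, no digits.
CUSTOM_CODE = re.compile(r"[A-Za-z]{2,6}")


def machine_code(platform: str, machine: str, custom: Optional[str] = None) -> str:
    """Return the listed code for a machine, or a valid custom abbreviation."""
    key = (platform, machine)
    if key in MACHINE_CODES:
        return MACHINE_CODES[key]
    # Unlisted machine: fall back to a user-supplied abbreviation.
    if custom is not None and CUSTOM_CODE.fullmatch(custom):
        return custom
    raise ValueError(
        f"{machine!r} is not listed; supply a custom abbreviation of 2 to 6 letters"
    )
```

For example, `machine_code("BGI", "DNBSEQ-T7")` returns `"T7"`, while an unlisted machine needs a `custom` abbreviation that passes the length check.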
| LibraryPrep | Character |
| --- | --- |
| PCR | PCR |
| PCR-free | PCRfree |
Please use a few uppercase letters as an abbreviation to represent the sequencing center or lab. For example, abbreviate Fudan as FD.
| Sample | Character |
| --- | --- |
| Quartet-D5 | D5 |
| Quartet-D6 | D6 |
| Quartet-F7 | F7 |
| Quartet-M8 | M8 |
If you sequenced multiple replicates for the same reference material, use numbers such as 1, 2, 3, ... to represent technical replicates.
The format of Date is yyyymmdd. For example, June 25, 2023 should be written as 20230625.
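A minimal Python sketch of the yyyymmdd convention, using the standard `datetime` module. The helper names `format_date_field` and `parse_date_field` are hypothetical.

```python
from datetime import date, datetime


def format_date_field(d: date) -> str:
    """Format a date as yyyymmdd, e.g. 20230625."""
    return d.strftime("%Y%m%d")


def parse_date_field(text: str) -> date:
    """Parse a yyyymmdd string, rejecting anything that is not 8 digits."""
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f"expected 8 digits in yyyymmdd form, got {text!r}")
    # strptime also rejects impossible dates such as 20230231.
    return datetime.strptime(text, "%Y%m%d").date()


print(format_date_field(date(2023, 6, 25)))  # prints 20230625
```

Parsing the field back with `parse_date_field` is a simple way to confirm that a Date value is a real calendar date in the expected form.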
In addition, note the following when using the Quartet DNAseq APP: the "Sample ID" field on the parameter setting page refers to the number of data sets, where one technical replicate of each sample is considered as one set (i.e. D5-D6-F7-M8).
For example, if you provide data for 8 samples, that means two sets of data, so fill in "2".
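The arithmetic behind this value can be sketched in Python as below. The function name `count_data_sets` is hypothetical, and rejecting incomplete sets (an unequal number of replicates per sample) is an assumption of this sketch rather than documented behavior of the APP.

```python
from collections import Counter
from typing import Iterable

# The four sample codes that make up one data set.
QUARTET_SAMPLES = ("D5", "D6", "F7", "M8")


def count_data_sets(sample_codes: Iterable[str]) -> int:
    """Return the number of complete D5-D6-F7-M8 sets among the given samples."""
    counts = Counter(sample_codes)
    unknown = set(counts) - set(QUARTET_SAMPLES)
    if unknown:
        raise ValueError(f"unexpected sample codes: {sorted(unknown)}")
    # Assumption: every sample must have the same number of replicates.
    per_sample = {counts[code] for code in QUARTET_SAMPLES}
    if len(per_sample) != 1:
        raise ValueError("each of D5, D6, F7 and M8 needs the same number of replicates")
    return per_sample.pop()


# Eight samples, two replicates of each of the four codes.
print(count_data_sets(["D5", "D6", "F7", "M8"] * 2))  # prints 2
```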