Illumina Sequence

Overview of the illumina_sequence instruction and parameters

Illumina sequencing instruments perform massively parallel sequencing using "sequencing by synthesis" (SBS) chemistry, which detects single bases as they are incorporated into growing DNA strands.

Instruction and Parameters

The illumina_sequence instruction specifies next-generation sequencing of DNA or RNA using Illumina sequencing technology. Transcriptic currently uses outside providers for Illumina sequencing. Samples are sequencing libraries that have been normalized and pooled. They are presented as aliquots, each loaded onto the indicated lane of a flowcell on one of the available Illumina sequencers: miseq, nextseq, or hiseq. The availability of these devices is vendor specific. Sequencing data is uploaded under the given dataref and made available on an S3 server for use with third-party tools.

{
  "op": "illumina_sequence",
  "flowcell": "SR" | "PE",
  "lanes": [
    {
      "object": aliquot,
      "library_concentration": decimal  // ng/uL
    }  // one entry per lane
  ],
  "sequencer": "miseq" | "hiseq" | "nextseq",
  "mode": "rapid" | "mid" | "high",
  "index": "single" | "dual" | "none",
  "cycles": {  // optional, vendor default
    "read_1": integer,   // required if cycles is set
    "read_2": integer,   // optional
    "index_1": integer,  // optional, defaults to 0
    "index_2": integer   // optional, defaults to 0
  },
  "library_size": integer,  // in bp
  "dataref": string
}

Illumina flowcells contain billions of nanowells arranged in an array for precise control of the cluster generation process. Each sequencing run uses either a single-read (SR) flowcell or a paired-end (PE) flowcell. Single-read means sequencing in one direction from Read Primer 1. Paired-end means sequencing in both directions, from Read Primer 1 and Read Primer 2. Both single-read and paired-end runs use standard sequencing primers compatible with Illumina instruments. The flowcell parameter applies to all Illumina sequencers.
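Putting the schema and the flowcell choice together, a complete instruction for a paired-end run might look like the sketch below. It builds the instruction as a Python dictionary and serializes it to JSON. The aliquot reference "library_plate/A1", the concentration, the cycle counts, and the dataref name are made-up values for illustration only; only the field names come from the schema.

```python
import json

# Hypothetical example values; only the field names come from the schema.
instruction = {
    "op": "illumina_sequence",
    "flowcell": "PE",
    "lanes": [
        # One entry per lane: the aliquot to load and its concentration in ng/uL.
        {"object": "library_plate/A1", "library_concentration": 2.5},
    ],
    "sequencer": "miseq",
    "mode": "high",
    "index": "dual",
    "cycles": {"read_1": 150, "read_2": 150, "index_1": 8, "index_2": 8},
    "library_size": 500,  # bp
    "dataref": "my_illumina_run",
}

# Serialize to JSON text, as it would appear inside a protocol.
print(json.dumps(instruction, indent=2))
```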

lanes is a list in which each item corresponds to one lane run on the sequencer. Each lane has an object, which is the aliquot to load, and a library_concentration, which is the concentration of that aliquot in ng/uL. The number of lanes available per sequencing run depends on the sequencer chosen and the run mode. The lane maximum is 1 for miseq, 8 for hiseq, and 4 for nextseq. Customers can use one lane, a few lanes, or all lanes (i.e., the whole flowcell). When pooling libraries, it is important that they were prepared and amplified with the same kit or method, that they have similar concentrations and size distributions, and that their indices are compatible with Illumina instruments.
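The lane limits above can be checked before submitting a run. The following is a minimal sketch, not part of any official client library; the function name check_lanes and the aliquot reference "pool_tube/0" are hypothetical, and limits that vary with run mode are not modeled.

```python
# Lane maxima per sequencer, as stated in the text above.
# Limits that depend on the run mode are not modeled here.
MAX_LANES = {"miseq": 1, "hiseq": 8, "nextseq": 4}


def check_lanes(sequencer, lanes):
    """Return the number of lanes, or raise ValueError if the list is invalid."""
    if sequencer not in MAX_LANES:
        raise ValueError(f"unknown sequencer: {sequencer!r}")
    limit = MAX_LANES[sequencer]
    if not 1 <= len(lanes) <= limit:
        raise ValueError(f"{sequencer} accepts 1 to {limit} lanes, got {len(lanes)}")
    for i, lane in enumerate(lanes):
        # Each lane needs the aliquot to load and its concentration in ng/uL.
        if "object" not in lane or "library_concentration" not in lane:
            raise ValueError(f"lane {i} needs 'object' and 'library_concentration'")
        if lane["library_concentration"] <= 0:
            raise ValueError(f"lane {i} has a non-positive concentration")
    return len(lanes)


lanes = [{"object": "pool_tube/0", "library_concentration": 2.0}]
print(check_lanes("miseq", lanes))
```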

library_concentration:
This parameter is the concentration of the object (library pool) in ng/uL. Sequencing libraries are quantified either by a fluorometric method (Qubit, PicoGreen) or by qPCR before the library pools are loaded onto the sequencers. Libraries should also be analyzed on a Bioanalyzer or Fragment Analyzer, which provides information on other library parameters such as size distribution.
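Because the instruction takes the concentration in ng/uL while library pools are often compared in molar units, a conversion can be useful. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA together with the average library size. The function name is hypothetical, and the 660 g/mol figure is an assumption of this example, not something the instruction itself specifies.

```python
def ng_per_ul_to_nanomolar(conc_ng_per_ul, library_size_bp, g_per_mol_per_bp=660.0):
    """Convert a double-stranded DNA library concentration from ng/uL to nM.

    1 ng/uL equals 1e-3 g/L, so
    nM = (ng/uL) * 1e6 / (g_per_mol_per_bp * library_size_bp).
    """
    if conc_ng_per_ul < 0 or library_size_bp <= 0:
        raise ValueError("concentration must be >= 0 and library size must be > 0")
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * library_size_bp)


# A 2.0 ng/uL pool with an average library size of 500 bp.
print(round(ng_per_ul_to_nanomolar(2.0, 500), 2))
```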

The mode parameter is sequencer and vendor specific, and details will be provided by each vendor. Generally, the high mode generates the maximum amount of data at normal speed on the nextseq and hiseq sequencers. The rapid mode runs faster but produces fewer reads. miseq has a single mode, high output. hiseq has rapid and high output modes. nextseq has mid and high output modes.
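The sequencer and mode combinations listed above can be captured in a small lookup. This is a sketch under the assumption that no vendor-specific restrictions apply beyond those combinations; the function name check_mode is hypothetical.

```python
# Modes per sequencer, as listed in the text above.
MODES = {
    "miseq": {"high"},
    "hiseq": {"rapid", "high"},
    "nextseq": {"mid", "high"},
}


def check_mode(sequencer, mode):
    """Return True if the mode is listed for the sequencer, else raise ValueError."""
    if sequencer not in MODES:
        raise ValueError(f"unknown sequencer: {sequencer!r}")
    if mode not in MODES[sequencer]:
        allowed = ", ".join(sorted(MODES[sequencer]))
        raise ValueError(f"{sequencer} supports mode(s) {allowed}; got {mode!r}")
    return True


print(check_mode("hiseq", "rapid"))
```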

The library_size parameter is the average length of the fragments that are sequenced. It is determined by the insert size (the DNA fragment between the adapter sequences), because the adapter sequence lengths are constant. Common library sizes are 200-900 bp. Anything above 1000 bp may give sub-optimal clustering efficiency on Illumina sequencers. library_size depends on the experiment design, application, chosen sequencer, and library preparation method. For example, Illumina TruSeq libraries (HiSeq) are about 500 bp, whereas MiSeq Nextera libraries may be 800 bp or less. For paired-end reads of 100 bp, a 500 bp library works fine. If the read length is 300 bp, however, read 1 and read 2 will overlap, which may be ideal for some applications but not others.
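The overlap described above follows from simple arithmetic on the insert size and the two read lengths. The sketch below is an illustration only: the function name and the adapter_bp parameter (total adapter length to subtract from library_size) are hypothetical, and adapter_bp defaults to 0 because the actual adapter length depends on the library preparation kit.

```python
def read_overlap(library_size, read_1, read_2, adapter_bp=0):
    """Return the number of insert bases covered by both reads of a pair."""
    insert = library_size - adapter_bp
    if insert <= 0:
        raise ValueError("library_size must exceed the total adapter length")
    # A read cannot cover more of the insert than the insert itself.
    covered_1 = min(read_1, insert)
    covered_2 = min(read_2, insert)
    # The reads start at opposite ends, so they overlap once their
    # combined coverage exceeds the insert length.
    return max(0, covered_1 + covered_2 - insert)


print(read_overlap(500, 100, 100))  # 2 x 100 bp on a 500 bp library
print(read_overlap(500, 300, 300))  # 2 x 300 bp on a 500 bp library
```

With the default adapter_bp of 0, the first call reports no overlap and the second reports 100 overlapping bases, which matches the two cases in the paragraph above.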

Index sequences are unique identifiers added to DNA samples during library preparation that allow libraries to be multiplexed for sequencing on Illumina SR or PE flowcells. single index sequencing adds a separate read, the Index 1 (i7) read, which is performed after read 1. When libraries are dual indexed, the sequencing run includes two additional reads, the Index 1 (i7) read and the Index 2 (i5) read. If the libraries are not multiplexed, the none option applies. Index sequences are specific to the library prep kit, and compatible sequencers are specific to the vendor. Customers who require custom sequencing primers must submit a request, which is evaluated to make sure that the primer design fits the chosen Illumina platform and that the vendor has that instrument available.
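The relationship between the flowcell type, the index option, and the reads performed in a run can be sketched as a lookup. The function name reads_in_run is hypothetical, and the order of the returned list is not meant to model the instrument's actual read order beyond index_1 following read_1.

```python
# Index reads added by each index option, as described above.
INDEX_READS = {
    "none": [],
    "single": ["index_1"],             # Index 1 (i7) read only
    "dual": ["index_1", "index_2"],    # Index 1 (i7) and Index 2 (i5) reads
}


def reads_in_run(flowcell, index):
    """List the reads performed for a flowcell type and an index option."""
    if flowcell not in ("SR", "PE"):
        raise ValueError(f"unknown flowcell: {flowcell!r}")
    if index not in INDEX_READS:
        raise ValueError(f"unknown index option: {index!r}")
    reads = ["read_1"] + INDEX_READS[index]
    if flowcell == "PE":
        reads.append("read_2")  # only paired-end flowcells have a second read
    return reads


print(reads_in_run("PE", "dual"))
print(reads_in_run("SR", "none"))
```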

cycles refers to the read length, that is, the number of sequenced bases. Each base is sequenced in one cycle. The cycles can be split into two reads, providing paired reads of the same DNA fragment. The read_1 and read_2 cycles are the sequencing cycles of the read 1 and read 2 primers. A single-read (SR) flowcell allows only read_1 cycles, whereas a paired-end (PE) flowcell allows both read_1 and read_2 cycles. The index_1 and index_2 cycles can each be at most 8. Every read_1 and read_2 must contain more than 25 cycles to generate FASTQ files.

Depending on the available instruments, their compatible reagents, and the type of flowcell (SR vs. PE), up to a maximum of 300-600 cycles of sequence data can be obtained. The cycles required for the index reads do not count against this maximum. Most common NGS applications for RNA/DNA use 50, 75, 100, or 150 cycles for both single-read and paired-end flowcells.

cycles is an optional, vendor-specific parameter that depends on the sequencer and its compatible versions of cluster kits, SBS reagents, and sequencer control software for a specific flowcell. Please check the vendor's documentation for the cycle numbers available for each sequencer and its associated reagents and software. If cycles is omitted, the illumina_sequence instruction uses the defaults determined by the vendor's available instrument and default flowcell choice.
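The constraints on cycles described above can be sketched as a small check. The function name check_cycles and the max_read_cycles parameter are hypothetical; the actual maximum depends on the vendor's instrument and reagent kit, so it is left as an optional argument rather than hard-coded.

```python
def check_cycles(flowcell, cycles, max_read_cycles=None):
    """Validate a cycles object and return the total read_1 + read_2 cycles."""
    if "read_1" not in cycles:
        raise ValueError("read_1 is required when cycles is set")
    reads = {"read_1": cycles["read_1"]}
    # A missing or zero read_2 is treated as "no second read".
    read_2 = cycles.get("read_2")
    if read_2:
        if flowcell == "SR":
            raise ValueError("a single-read (SR) flowcell allows only read_1")
        reads["read_2"] = read_2
    for name, n in reads.items():
        if n <= 25:
            raise ValueError(f"{name} must have more than 25 cycles, got {n}")
    for name in ("index_1", "index_2"):
        n = cycles.get(name, 0)  # index cycles default to 0
        if not 0 <= n <= 8:
            raise ValueError(f"{name} must be between 0 and 8, got {n}")
    # Index cycles do not count against the maximum.
    total = sum(reads.values())
    if max_read_cycles is not None and total > max_read_cycles:
        raise ValueError(f"{total} read cycles exceed the maximum of {max_read_cycles}")
    return total


print(check_cycles("PE", {"read_1": 150, "read_2": 150, "index_1": 8}, max_read_cycles=300))
```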

The table below lists the maximum cycle numbers specified by currently available kits for Illumina sequencers.


[Table values not recovered. Sections: Read 1 and Read 2; Index 1; Index 2]

Information about the standard sequencing workflows compatible with Illumina sequencers, and about their compatible reagents, is available on the Illumina site.


Transcriptic currently uses an outside service for its Illumina sequencing needs. Three types of Illumina sequencers are available: miseq, nextseq, and hiseq. The availability of different versions of these sequencers is vendor specific. The needs of the NGS application, along with the project budget and turnaround time, determine which sequencer is best suited for a given run.

Refer to the Illumina site for an overview of each device and for information on applications, kits, workflows, specifications, and support.

All Illumina sequencers process the flowcell imaging data and output sequencing files (FASTQ), which are labelled with the dataref and well ID.

More information can be found on the Illumina site.