Short Tandem Repeats

Short tandem repeats (STRs) are found in many prokaryotic and eukaryotic genomes and are commonly used as genetic markers, in particular for identity and parentage testing in DNA forensics. The unstable expansion of some STRs has been associated with various genetic disorders (e.g., Huntington's disease), and these STRs are therefore used in genetic testing to screen individuals at high risk. Traditional STR analyses were based on PCR amplification of STR loci followed by gel electrophoresis. With the availability of massive whole-genome sequencing data, it has become practical to mine STR profiles in silico from genome sequences. Software tools such as lobSTR and STR-FM have been developed to meet this demand; however, they are built upon whole-genome read-mapping tools and thus may not be sensitive enough.

About STRScan

STRScan is a standalone software tool that uses a greedy algorithm for targeted STR profiling in next-generation sequencing (NGS) data. STRScan was tested on whole-genome sequencing data from the Venter genome and the 1000 Genomes Project. The results showed that STRScan can profile 20% more STRs in the target set than lobSTR and STR-FM, which miss these loci. STRScan is particularly useful for NGS-based targeted STR profiling, e.g., in genetic and human identity testing.
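
To make the idea of targeted STR profiling concrete, the following is a minimal Python sketch. It is not STRScan's greedy algorithm, which is not described on this page. It is a naive exact-match illustration that counts how many copies of a repeat unit lie between two known flanking sequences in a single read. The function name, the flanks, and the example read are all hypothetical, and real reads would need tolerance for sequencing errors.

```python
import re
from typing import Optional


def count_repeat_units(read: str, left_flank: str, right_flank: str,
                       motif: str) -> Optional[int]:
    """Count consecutive copies of `motif` between the two flanks.

    Returns None when the read does not contain
    left_flank + motif*k + right_flank (k >= 1) as an exact substring.
    """
    pattern = (
        re.escape(left_flank)
        + "((?:" + re.escape(motif) + ")+)"
        + re.escape(right_flank)
    )
    match = re.search(pattern, read)
    if match is None:
        return None
    # The captured group consists only of whole motif copies.
    return len(match.group(1)) // len(motif)


# Hypothetical read: 11 copies of GATA between two made-up flanks.
read = "ACGTTG" + "GATA" * 11 + "CCTAGG"
print(count_repeat_units(read, "ACGTTG", "CCTAGG", "GATA"))  # 11
```

A read that spans the whole locus, as above, supports one allele call (here, allele 11); counting such reads per allele would give supporting-read numbers of the kind reported in the tables on this page.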

Latest profiling of STRs in NGS datasets with STRScan vs lobSTR

Detected Y-STR markers of the Y-STR 20plex panel

Marker Ref.Allele Venter/STRScan Venter/lobSTR HG00145/STRScan HG00145/lobSTR HG00140/STRScan HG00140/lobSTR
DYS19 15 14(1) - - - - -
DYS385* 11,14 11(2),14(1) 11(1),14(1) 11(3) - 12(1) -
DYS388 12 12(2) 12(1) - - - -
DYS389I 12 13(3) 13(1) - - - -
DYS389II 29 29(1) 29(2) - - - -
DYS390 24 23(1) 23(1) 15(1) - - -
DYS391 11 10(1) 10(1) - - 10(2) 10(2)
DYS392 13 13(2) 13(2) - - - -
DYS393 12 13(2) - - - - -
DYS426 12 12(1) 12(1) - - - -
DYS437 16 - - - - 16(2) -
DYS438 10 12(1) 12(1) - - 10(1) 10(1)
DYS439 13 12(1) 12(1) - - 11(1) 11(1)
DYS447 23 25(1) - - - - -
DYS448 19 - - - - - 8(1)
DYS460 10 12(2) - - - 11(1) -
H4 12 - - 12(1) 12(1) 11(2) -
YCAII* 23 19(3),23(5) 19(3),23(4) 19(1) 19(2) - -

CODIS markers

Marker Ref.Allele Venter/STRScan Venter/lobSTR HG00145/STRScan HG00145/lobSTR HG00140/STRScan HG00140/lobSTR
CSF1PO 13 11(7) 11(5) - - 11(1) 11(1)
D13S317* 11 12(1),13(2) 11(1) - - - -
D16S539 11 12(2) - 13(1) - 11(2) 11(1)
D18S51 18 14(2) 14(2) - - 15(1) -
D21S11 29 - - - - - -
D3S1358* 16 16(3) 16(3) - - - -
D5S818 11 - - - - - -
D7S820 13 10(3) 10(2) - - 8(3) -
D8S1179 13 12(1) 12(1) 8(1) 6(2) - 13(1)
FGA* 22 26(1),21(1) 26(1),21(1) - - - -
PentaD 13 13(2) - 9(1) 9(1) - -
PentaE 5 12(2) 12(1) - - 13(1) 13(1)
TH01 7 6(2) - - - 5(1),10(2) 10(2)
TPOX 8 8(5) 8(4) - - 8(1) 8(1)

(*) Multi-allelic STR markers

(n): The number in parentheses after each allele, e.g. 11(2), is the number of supporting reads
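
The cell notation used in the tables above can be read programmatically. The sketch below is a small Python helper, not part of STRScan or lobSTR, that parses one table row into per-sample, per-tool calls. It assumes the column order shown in the table headers (Venter, HG00145, HG00140, each with STRScan then lobSTR) and treats "-" as an empty set of calls; the function names are hypothetical.

```python
import re
from typing import Dict, Tuple

SAMPLES = ("Venter", "HG00145", "HG00140")
TOOLS = ("STRScan", "lobSTR")
CALL_RE = re.compile(r"(\d+)\((\d+)\)")  # allele(supporting reads)


def parse_calls(cell: str) -> Dict[int, int]:
    """Parse a cell such as '11(2),14(1)' into {allele: supporting_reads}."""
    if cell == "-":
        return {}
    calls: Dict[int, int] = {}
    for part in cell.split(","):
        match = CALL_RE.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognised call: {part!r}")
        calls[int(match.group(1))] = int(match.group(2))
    return calls


def parse_row(line: str) -> Tuple[str, Dict[Tuple[str, str], Dict[int, int]]]:
    """Split one table row into its marker name and (sample, tool) calls."""
    fields = line.split()
    marker = fields[0].rstrip("*")  # drop the multi-allelic flag
    cells = fields[2:]              # fields[1] is the reference allele
    columns = [(s, t) for s in SAMPLES for t in TOOLS]
    if len(cells) != len(columns):
        raise ValueError(f"expected {len(columns)} call columns, got {len(cells)}")
    return marker, {col: parse_calls(cell) for col, cell in zip(columns, cells)}


marker, calls = parse_row("DYS385* 11,14 11(2),14(1) 11(1),14(1) 11(3) - 12(1) -")
print(marker, calls[("Venter", "STRScan")])  # DYS385 {11: 2, 14: 1}
```

With rows parsed this way, the two tools can be compared per marker, for example by checking which (sample, tool) entries are empty.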