Sample status

This table gives the status of each sample: either completed successfully, or the reason the assembly failed. Where applicable, the mean quality for the whole construct is also given; Medaka derives it by aligning the reads to the consensus.
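As a reading aid for the Mean Quality column, the sketch below converts a quality score to an approximate error rate. It assumes the column is on the Phred scale (Q = -10 log10 p), which this report does not state explicitly; the function names are illustrative and not part of the workflow.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred-scaled quality Q to an error probability p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Inverse conversion: Q = -10 * log10(p)."""
    return -10 * math.log10(p)


# Mean quality values copied from the status table (assumed to be Phred-scaled).
for sample, q in [("sample02", 52.79), ("sample06", 39.08)]:
    p = phred_to_error_probability(q)
    print(f"{sample}: Q{q} is roughly 1 error in {1 / p:,.0f} bases")
```

Under that assumption, every completed sample in the table sits well above Q30, which corresponds to one error in 1,000 bases.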

The assembly was generated using Flye. The assemblies and/or inserts were aligned with the provided references and marked as expected if they meet both acceptance criteria, which are defined by the expected_coverage and expected_identity parameters (set to 95.0% and 99.0% respectively).
Sample    Assembly completed / failed reason  Length  Mean Quality  Expected insert  Expected Assembly
sample01  Failed due to insufficient reads    N/A     N/A
sample02  Completed successfully              2871    52.79
sample03  Completed successfully              3096    48.08
sample04  Completed successfully              3097    41.88
sample05  Completed successfully              3036    52.23
sample06  Completed successfully              3037    39.08
sample07  Completed successfully              3096    46.94
sample08  Completed successfully              3096    46.46
sample09  Completed successfully              3036    47.16
sample10  Completed successfully              3036    50.27
sample11  Completed successfully              3096    44.41
sample12  Completed successfully              3096    47.78
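
The acceptance rule described above (both thresholds must be met) can be sketched as follows. This is a minimal illustration, not the workflow's own code: the AlignmentSummary type and is_expected function are hypothetical names, and treating the thresholds as inclusive (>=) is an assumption.

```python
from dataclasses import dataclass

# Thresholds as stated in the report for this run.
EXPECTED_COVERAGE = 95.0  # percent of the reference covered by the alignment
EXPECTED_IDENTITY = 99.0  # percent identity of the aligned region


@dataclass
class AlignmentSummary:
    """Hypothetical container for one assembly-vs-reference alignment."""
    coverage_pct: float
    identity_pct: float


def is_expected(aln: AlignmentSummary,
                min_coverage: float = EXPECTED_COVERAGE,
                min_identity: float = EXPECTED_IDENTITY) -> bool:
    """Return True only if BOTH criteria are met (inclusive comparison assumed)."""
    return aln.coverage_pct >= min_coverage and aln.identity_pct >= min_identity


# Illustrative values only; they are not taken from this report.
print(is_expected(AlignmentSummary(coverage_pct=100.0, identity_pct=99.5)))
print(is_expected(AlignmentSummary(coverage_pct=94.9, identity_pct=100.0)))
```

Because the two conditions are combined with a logical AND, a perfect identity score cannot compensate for coverage below the threshold, and vice versa.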

pLannotate

The plasmid annotation plot and feature table are produced using pLannotate.

A pLannotate plot is shown for each assembly. A feature table provides descriptions of the annotated sequence. Unfilled features on the pLannotate plots are incomplete features: the sequence match in the plasmid covers less than 95% of the full length of the feature in the database. These elements may be leftover fragments from earlier cloning steps used to create the plasmid. If they include only a small fraction of the feature, they likely no longer have the annotated function. However, even small feature fragments may affect plasmid function if they result in cryptic gene expression or are inadvertently combined with other elements during later cloning steps. The pLannotate plot may have overlapping annotation labels; use the zoom and hover tools to decipher them.
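
The 95% rule for incomplete features can be applied directly to the Match Length column of a feature table. The sketch below is an illustration under stated assumptions: the is_incomplete helper is a hypothetical name, and the four example rows are copied from the first feature table in this report.

```python
# Features whose match covers less than this percentage of the database
# entry are drawn unfilled (incomplete), per the description above.
COMPLETE_THRESHOLD = 95.0


def is_incomplete(match_length_pct: float,
                  threshold: float = COMPLETE_THRESHOLD) -> bool:
    """Return True if the feature would be shown as an unfilled (partial) feature."""
    return match_length_pct < threshold


# (feature name, Match Length %) pairs from the first feature table.
features = [
    ("csgG", 100.0),
    ("CloDF13 ori", 92.3),
    ("KanR", 65.7),
    ("RNAI", 99.0),
]

incomplete = [name for name, pct in features if is_incomplete(pct)]
print(incomplete)
```

Filtering a table this way gives a quick list of fragments that deserve a closer look before further cloning.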

Feature Database Identity Match Length Description Start Location End Location Length Strand
csgG swissprot 100.0% 100.0% CSGG_ECO57 - Experimental evidence at protein level: Swiss-Prot protein existence level 1. May be involved in the biogenesis of curli organelles. From Escherichia coli O157:H7. 765 1596 831 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1659 1778 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 1874 1965 91 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2743 554 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 687 735 48 -
KanR snapgene 99.3% 65.7% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 1965 2501 536 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 1842 1874 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 560 590 30 -
kanMX snapgene 99.0% 37.3% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 1995 2501 506 +
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 2741 2844 103 -
KanR snapgene 99.1% 14.2% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 2501 2616 115 +
CloDF13 ori snapgene 99.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2622 2719 98 +
kanMX snapgene 99.1% 8.5% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 2501 2616 115 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 1884 1966 82 +
Feature Database Identity Match Length Description Start Location End Location Length Strand
TXL_EISFE swissprot 100.0% 100.0% Experimental evidence at protein level: Swiss-Prot protein existence level 1. Pore-forming toxin that defensively acts against parasitic microorganisms by forming pores in sphingomyelin-containing membranes. Has hemolytic activity and is also cytotoxic to spermatozoa of some species of invertebrates and many species of vertebrates and to amphibian larvae, guinea pig polymorphonuclear leukocytes, chicken fibroblasts, normal spleen cells and various tumor cells. Is lethal for various species of reptiles, amphibian, birds and mammals. Induces smooth muscle contraction. It binds sphingomyelin and induces hemolysis in the same manner as lysenin-related protein 2, and is 10-fold more effective than lysenin-related protein 1. 909 1800 891 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1863 1982 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 2078 2169 91 +
KanR snapgene 99.4% 100.0% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 2169 2985 816 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 16 698 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 831 879 48 -
kanMX snapgene 99.2% 57.9% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 2199 2985 786 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 2046 2078 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 704 734 30 -
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 14 117 103 -
CloDF13 ori snapgene 99.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2991 3088 98 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 2088 2170 82 +
Feature Database Identity Match Length Description Start Location End Location Length Strand
TXL_EISFE swissprot 100.0% 100.0% Experimental evidence at protein level: Swiss-Prot protein existence level 1. Pore-forming toxin that defensively acts against parasitic microorganisms by forming pores in sphingomyelin-containing membranes. Has hemolytic activity and is also cytotoxic to spermatozoa of some species of invertebrates and many species of vertebrates and to amphibian larvae, guinea pig polymorphonuclear leukocytes, chicken fibroblasts, normal spleen cells and various tumor cells. Is lethal for various species of reptiles, amphibian, birds and mammals. Induces smooth muscle contraction. It binds sphingomyelin and induces hemolysis in the same manner as lysenin-related protein 2, and is 10-fold more effective than lysenin-related protein 1. 909 1800 891 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1863 1982 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 2078 2169 91 +
KanR snapgene 99.4% 100.0% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 2169 2985 816 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 16 698 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 831 879 48 -
kanMX snapgene 99.2% 57.9% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 2199 2985 786 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 2046 2078 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 704 734 30 -
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 14 117 103 -
CloDF13 ori snapgene 100.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2991 3089 98 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 2088 2170 82 +
Feature Database Identity Match Length Description Start Location End Location Length Strand
csgG swissprot 100.0% 100.0% CSGG_ECO57 - Experimental evidence at protein level: Swiss-Prot protein existence level 1. May be involved in the biogenesis of curli organelles. From Escherichia coli O157:H7. 765 1596 831 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1659 1778 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 1874 1965 91 +
KanR snapgene 99.4% 100.0% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 1965 2781 816 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2908 554 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 687 735 48 -
kanMX snapgene 99.2% 57.9% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 1995 2781 786 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 1842 1874 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 560 590 30 -
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 2906 3009 103 -
CloDF13 ori snapgene 99.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2787 2884 98 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 1884 1966 82 +
Feature Database Identity Match Length Description Start Location End Location Length Strand
csgG swissprot 100.0% 100.0% CSGG_ECO57 - Experimental evidence at protein level: Swiss-Prot protein existence level 1. May be involved in the biogenesis of curli organelles. From Escherichia coli O157:H7. 765 1596 831 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1659 1778 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 1874 1965 91 +
KanR snapgene 99.4% 100.0% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 1965 2781 816 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2909 554 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 687 735 48 -
kanMX snapgene 99.2% 57.9% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 1995 2781 786 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 1842 1874 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 560 590 30 -
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 2907 3010 103 -
CloDF13 ori snapgene 100.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2787 2885 98 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 1884 1966 82 +
Feature Database Identity Match Length Description Start Location End Location Length Strand
TXL_EISFE swissprot 100.0% 100.0% Experimental evidence at protein level: Swiss-Prot protein existence level 1. Pore-forming toxin that defensively acts against parasitic microorganisms by forming pores in sphingomyelin-containing membranes. Has hemolytic activity and is also cytotoxic to spermatozoa of some species of invertebrates and many species of vertebrates and to amphibian larvae, guinea pig polymorphonuclear leukocytes, chicken fibroblasts, normal spleen cells and various tumor cells. Is lethal for various species of reptiles, amphibian, birds and mammals. Induces smooth muscle contraction. It binds sphingomyelin and induces hemolysis in the same manner as lysenin-related protein 2, and is 10-fold more effective than lysenin-related protein 1. 909 1800 891 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1863 1982 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 2078 2169 91 +
KanR snapgene 99.4% 100.0% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 2169 2985 816 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 16 698 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 831 879 48 -
kanMX snapgene 99.2% 57.9% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 2199 2985 786 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 2046 2078 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 704 734 30 -
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 14 117 103 -
CloDF13 ori snapgene 99.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2991 3088 98 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 2088 2170 82 +
Feature Database Identity Match Length Description Start Location End Location Length Strand
TXL_EISFE swissprot 100.0% 100.0% Experimental evidence at protein level: Swiss-Prot protein existence level 1. Pore-forming toxin that defensively acts against parasitic microorganisms by forming pores in sphingomyelin-containing membranes. Has hemolytic activity and is also cytotoxic to spermatozoa of some species of invertebrates and many species of vertebrates and to amphibian larvae, guinea pig polymorphonuclear leukocytes, chicken fibroblasts, normal spleen cells and various tumor cells. Is lethal for various species of reptiles, amphibian, birds and mammals. Induces smooth muscle contraction. It binds sphingomyelin and induces hemolysis in the same manner as lysenin-related protein 2, and is 10-fold more effective than lysenin-related protein 1. 909 1800 891 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1863 1982 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 2078 2169 91 +
KanR snapgene 99.4% 100.0% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 2169 2985 816 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 16 698 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 831 879 48 -
kanMX snapgene 99.2% 57.9% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 2199 2985 786 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 2046 2078 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 704 734 30 -
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 14 117 103 -
CloDF13 ori snapgene 99.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2991 3088 98 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 2088 2170 82 +
Feature Database Identity Match Length Description Start Location End Location Length Strand
csgG swissprot 100.0% 100.0% CSGG_ECO57 - Experimental evidence at protein level: Swiss-Prot protein existence level 1. May be involved in the biogenesis of curli organelles. From Escherichia coli O157:H7. 765 1596 831 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1659 1778 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 1874 1965 91 +
KanR snapgene 99.4% 100.0% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 1965 2781 816 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2908 554 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 687 735 48 -
kanMX snapgene 99.2% 57.9% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 1995 2781 786 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 1842 1874 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 560 590 30 -
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 2906 3009 103 -
CloDF13 ori snapgene 99.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2787 2884 98 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 1884 1966 82 +
Feature Database Identity Match Length Description Start Location End Location Length Strand
csgG swissprot 100.0% 100.0% CSGG_ECO57 - Experimental evidence at protein level: Swiss-Prot protein existence level 1. May be involved in the biogenesis of curli organelles. From Escherichia coli O157:H7. 765 1596 831 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1659 1778 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 1874 1965 91 +
KanR snapgene 99.4% 100.0% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 1965 2781 816 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2908 554 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 687 735 48 -
kanMX snapgene 99.2% 57.9% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 1995 2781 786 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 1842 1874 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 560 590 30 -
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 2906 3009 103 -
CloDF13 ori snapgene 99.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 2787 2884 98 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 1884 1966 82 +
Feature Database Identity Match Length Description Start Location End Location Length Strand
TXL_EISFE swissprot 100.0% 100.0% Experimental evidence at protein level: Swiss-Prot protein existence level 1. Pore-forming toxin that defensively acts against parasitic microorganisms by forming pores in sphingomyelin-containing membranes. Has hemolytic activity and is also cytotoxic to spermatozoa of some species of invertebrates and many species of vertebrates and to amphibian larvae, guinea pig polymorphonuclear leukocytes, chicken fibroblasts, normal spleen cells and various tumor cells. Is lethal for various species of reptiles, amphibian, birds and mammals. Induces smooth muscle contraction. It binds sphingomyelin and induces hemolysis in the same manner as lysenin-related protein 2, and is 10-fold more effective than lysenin-related protein 1. 928 1819 891 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1882 2001 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 2097 2188 91 +
KanR snapgene 99.4% 100.0% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 2188 3004 816 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 35 717 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 850 898 48 -
kanMX snapgene 99.2% 57.9% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 2218 3004 786 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 2065 2097 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 723 753 30 -
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 33 136 103 -
CloDF13 ori snapgene 99.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 3010 11 98 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 2107 2189 82 +
Feature Database Identity Match Length Description Start Location End Location Length Strand
TXL_EISFE swissprot 100.0% 100.0% Experimental evidence at protein level: Swiss-Prot protein existence level 1. Pore-forming toxin that defensively acts against parasitic microorganisms by forming pores in sphingomyelin-containing membranes. Has hemolytic activity and is also cytotoxic to spermatozoa of some species of invertebrates and many species of vertebrates and to amphibian larvae, guinea pig polymorphonuclear leukocytes, chicken fibroblasts, normal spleen cells and various tumor cells. Is lethal for various species of reptiles, amphibian, birds and mammals. Induces smooth muscle contraction. It binds sphingomyelin and induces hemolysis in the same manner as lysenin-related protein 2, and is 10-fold more effective than lysenin-related protein 1. 928 1819 891 -
rhaB promoter snapgene 100.0% 100.0% promoter of the E. coli rhaBAD operon, conferring tight induction with L-rhamnose and repression with D-glucose in the presence of RhaR and RhaS (Giacalone et al., 2006) 1882 2001 119 -
cat promoter snapgene 100.0% 100.0% promoter of the E. coli cat gene encoding chloramphenicol acetyltransferase 2097 2188 91 +
KanR snapgene 99.4% 100.0% aminoglycoside phosphotransferase; aph(3')-Ia; confers resistance to kanamycin in bacteria or G418 (Geneticin®) in eukaryotes 2188 3004 816 +
CloDF13 ori snapgene 100.0% 92.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 35 717 682 +
T7 terminator snapgene 100.0% 100.0% transcription terminator for bacteriophage T7 RNA polymerase 850 898 48 -
kanMX snapgene 99.2% 57.9% yeast selectable marker conferring kanamycin resistance (Wach et al., 1994) 2218 3004 786 +
tonB terminator snapgene 100.0% 100.0% bidirectional E. coli tonB-P14 transcription terminator 2065 2097 32 +
T3Te terminator snapgene 100.0% 100.0% phage T3 early transcription terminator 723 753 30 -
RNAI Rfam 100.0% 99.0% Accession: RF00106 - RNAI 33 136 103 -
CloDF13 ori snapgene 99.0% 13.3% Plasmids containing the CloDF13 (CDF) origin of replication can be propagated in E. coli cells that contain additional plasmids with compatible origins. 3010 11 98 +
PDK intron snapgene 100.0% 5.1% pdk; modified pyruvate orthophosphate dikinase intron from Flaveria trinervia 3.0 2107 2189 82 +
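
Some features in the tables above have an End Location smaller than their Start Location because they wrap around the origin of the circular plasmid. The sketch below shows one way to recover the span in that case. The circular_span helper is a hypothetical name, and pairing the first feature table with the 2871 bp assembly is an assumption, since the tables are not labelled by sample here.

```python
def circular_span(start: int, end: int, plasmid_length: int) -> int:
    """Span from start to end on a circular sequence.

    When end < start the feature crosses the origin, so the difference is
    taken modulo the plasmid length.
    """
    return (end - start) % plasmid_length


# CloDF13 ori in the first feature table: start 2743, end 554, reported length 682.
print(circular_span(2743, 554, 2871))

# A feature that does not wrap (csgG: 765 to 1596, reported length 831).
print(circular_span(765, 1596, 2871))
```

A few rows report a Length that is one greater than end minus start (for example 2622 to 2719 with Length 98), so this helper is best treated as a consistency check rather than a reimplementation of how the Length column is computed.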

Read Counts

Number of reads per sample.

Read stats

For each assembly, this section shows read length statistics and quality plots before and after host filtering, and after downsampling if a host reference was provided.

Failed due to insufficient reads. Read count: 149; 149
Completed successfully. Read count: 1,334; 1,332; 206
Completed successfully. Read count: 1,334; 1,333; 200
Completed successfully. Read count: 1,334; 1,334; 206
Completed successfully. Read count: 1,334; 1,332; 226
Completed successfully. Read count: 1,334; 1,330; 220
Completed successfully. Read count: 1,334; 1,331; 208
Completed successfully. Read count: 1,334; 1,330; 206
Completed successfully. Read count: 1,334; 1,331; 206
Completed successfully. Read count: 1,334; 1,332; 210
Completed successfully. Read count: 1,334; 1,333; 211
Completed successfully. Read count: 1,334; 1,332; 203

Insert sequences

This table shows which primers were found in the consensus sequence of each sample and where the inserts were found.

Sample start end primer strand Insert length
sample02 526 1655 pRham - 1129
sample05 526 1655 pRham - 1129
sample06 526 1655 pRham - 1129
sample09 526 1655 pRham - 1129
sample10 526 1655 pRham - 1129

This table shows which primers were found in the consensus sequence of each sample and where the inserts were found.

Sample start end primer strand Insert length
sample03 670 1859 pRham - 1189
sample04 670 1859 pRham - 1189
sample07 670 1859 pRham - 1189
sample08 670 1859 pRham - 1189
sample11 689 1878 pRham - 1189
sample12 689 1878 pRham - 1189
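
The Insert length column in both tables is consistent with end minus start. The short check below illustrates this on three rows copied from the tables; the insert_length helper is a hypothetical name, and the subtraction is inferred from the reported numbers rather than from the workflow's documentation.

```python
def insert_length(start: int, end: int) -> int:
    """Insert length as the difference between the end and start coordinates."""
    return end - start


# (sample, start, end, reported insert length) copied from the insert tables.
rows = [
    ("sample02", 526, 1655, 1129),
    ("sample03", 670, 1859, 1189),
    ("sample11", 689, 1878, 1189),
]

for sample, start, end, reported in rows:
    status = "ok" if insert_length(start, end) == reported else "MISMATCH"
    print(f"{sample}: computed {insert_length(start, end)}, reported {reported} ({status})")
```

Note that sample11 and sample12 have the same insert length as sample03 even though their coordinates are shifted by 19 bases.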

Multiple Sequence Alignment

This section shows the inserts aligned with each other, or with a reference sequence if one was provided.
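
When scanning a long alignment by eye, differences are easy to miss. The sketch below lists the columns where an aligned sample row differs from the reference row. It assumes both rows come from the same alignment block and therefore have equal length; the mismatch_columns helper is a hypothetical name, and the example strings are the first 16 bases of the first alignment block.

```python
from typing import List


def mismatch_columns(reference: str, sample: str) -> List[int]:
    """Return 0-based column indices where two aligned rows differ."""
    if len(reference) != len(sample):
        raise ValueError("aligned rows must have the same length")
    return [i for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


# First 16 columns of the Reference and sample02 rows (identical in this report).
reference = "GACCACAACGGTTTCC"
sample02 = "GACCACAACGGTTTCC"
print(mismatch_columns(reference, sample02))

# A deliberately altered copy, to show what a difference looks like.
altered = "GACCACTACGGTTTCC"
print(mismatch_columns(reference, altered))
```

An empty list for every sample row in a block means the inserts match the reference across those columns.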

Reference GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGCAGCGTCTGTTTCTGCTG
sample02  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGCAGCGTCTGTTTCTGCTG
sample05  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGCAGCGTCTGTTTCTGCTG
sample06  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGCAGCGTCTGTTTCTGCTG
sample09  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGCAGCGTCTGTTTCTGCTG
sample10  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGCAGCGTCTGTTTCTGCTG

Reference GTCGCGGTGATGCTGCTGAGTGGTTGTCTGACCGCACCGCCGAAAGAAGCGGCACGTCCGACCCTGATGCCGCGTGCACA
sample02  GTCGCGGTGATGCTGCTGAGTGGTTGTCTGACCGCACCGCCGAAAGAAGCGGCACGTCCGACCCTGATGCCGCGTGCACA
sample05  GTCGCGGTGATGCTGCTGAGTGGTTGTCTGACCGCACCGCCGAAAGAAGCGGCACGTCCGACCCTGATGCCGCGTGCACA
sample06  GTCGCGGTGATGCTGCTGAGTGGTTGTCTGACCGCACCGCCGAAAGAAGCGGCACGTCCGACCCTGATGCCGCGTGCACA
sample09  GTCGCGGTGATGCTGCTGAGTGGTTGTCTGACCGCACCGCCGAAAGAAGCGGCACGTCCGACCCTGATGCCGCGTGCACA
sample10  GTCGCGGTGATGCTGCTGAGTGGTTGTCTGACCGCACCGCCGAAAGAAGCGGCACGTCCGACCCTGATGCCGCGTGCACA

Reference GTCTTATAAAGATCTGACCCATCTGCCGGCTCCGACGGGCAAAATTTTTGTTAGCGTCTATAACATCCAGGACGAAACCG
sample02  GTCTTATAAAGATCTGACCCATCTGCCGGCTCCGACGGGCAAAATTTTTGTTAGCGTCTATAACATCCAGGACGAAACCG
sample05  GTCTTATAAAGATCTGACCCATCTGCCGGCTCCGACGGGCAAAATTTTTGTTAGCGTCTATAACATCCAGGACGAAACCG
sample06  GTCTTATAAAGATCTGACCCATCTGCCGGCTCCGACGGGCAAAATTTTTGTTAGCGTCTATAACATCCAGGACGAAACCG
sample09  GTCTTATAAAGATCTGACCCATCTGCCGGCTCCGACGGGCAAAATTTTTGTTAGCGTCTATAACATCCAGGACGAAACCG
sample10  GTCTTATAAAGATCTGACCCATCTGCCGGCTCCGACGGGCAAAATTTTTGTTAGCGTCTATAACATCCAGGACGAAACCG

Reference GTCAATTTAAACCGTACCCGGCGAGTAATTTCTCCACGGCCGTTCCGCAGAGTGCAACCGCTATGCTGGTCACGGCACTG
sample02  GTCAATTTAAACCGTACCCGGCGAGTAATTTCTCCACGGCCGTTCCGCAGAGTGCAACCGCTATGCTGGTCACGGCACTG
sample05  GTCAATTTAAACCGTACCCGGCGAGTAATTTCTCCACGGCCGTTCCGCAGAGTGCAACCGCTATGCTGGTCACGGCACTG
sample06  GTCAATTTAAACCGTACCCGGCGAGTAATTTCTCCACGGCCGTTCCGCAGAGTGCAACCGCTATGCTGGTCACGGCACTG
sample09  GTCAATTTAAACCGTACCCGGCGAGTAATTTCTCCACGGCCGTTCCGCAGAGTGCAACCGCTATGCTGGTCACGGCACTG
sample10  GTCAATTTAAACCGTACCCGGCGAGTAATTTCTCCACGGCCGTTCCGCAGAGTGCAACCGCTATGCTGGTCACGGCACTG

Reference AAAGATTCCCGTTGGTTCATTCCGCTGGAACGCCAGGGCCTGCAAAACCTGCTGAATGAACGTAAAATTATCCGCGCAGC
sample02  AAAGATTCCCGTTGGTTCATTCCGCTGGAACGCCAGGGCCTGCAAAACCTGCTGAATGAACGTAAAATTATCCGCGCAGC
sample05  AAAGATTCCCGTTGGTTCATTCCGCTGGAACGCCAGGGCCTGCAAAACCTGCTGAATGAACGTAAAATTATCCGCGCAGC
sample06  AAAGATTCCCGTTGGTTCATTCCGCTGGAACGCCAGGGCCTGCAAAACCTGCTGAATGAACGTAAAATTATCCGCGCAGC
sample09  AAAGATTCCCGTTGGTTCATTCCGCTGGAACGCCAGGGCCTGCAAAACCTGCTGAATGAACGTAAAATTATCCGCGCAGC
sample10  AAAGATTCCCGTTGGTTCATTCCGCTGGAACGCCAGGGCCTGCAAAACCTGCTGAATGAACGTAAAATTATCCGCGCAGC

Reference TCAGGAAAACGGTACCGTGGCCATTAACAATCGTATTCCGCTGCAAAGCCTGACCGCCGCAAACATCATGGTTGAAGGCT
sample02  TCAGGAAAACGGTACCGTGGCCATTAACAATCGTATTCCGCTGCAAAGCCTGACCGCCGCAAACATCATGGTTGAAGGCT
sample05  TCAGGAAAACGGTACCGTGGCCATTAACAATCGTATTCCGCTGCAAAGCCTGACCGCCGCAAACATCATGGTTGAAGGCT
sample06  TCAGGAAAACGGTACCGTGGCCATTAACAATCGTATTCCGCTGCAAAGCCTGACCGCCGCAAACATCATGGTTGAAGGCT
sample09  TCAGGAAAACGGTACCGTGGCCATTAACAATCGTATTCCGCTGCAAAGCCTGACCGCCGCAAACATCATGGTTGAAGGCT
sample10  TCAGGAAAACGGTACCGTGGCCATTAACAATCGTATTCCGCTGCAAAGCCTGACCGCCGCAAACATCATGGTTGAAGGCT

Reference CTATCATCGGTTACGAATCAAACGTCAAATCGGGCGGTGTGGGCGCACGTTATTTTGGCATTGGTGCTGATACCCAGTAC
sample02  CTATCATCGGTTACGAATCAAACGTCAAATCGGGCGGTGTGGGCGCACGTTATTTTGGCATTGGTGCTGATACCCAGTAC
sample05  CTATCATCGGTTACGAATCAAACGTCAAATCGGGCGGTGTGGGCGCACGTTATTTTGGCATTGGTGCTGATACCCAGTAC
sample06  CTATCATCGGTTACGAATCAAACGTCAAATCGGGCGGTGTGGGCGCACGTTATTTTGGCATTGGTGCTGATACCCAGTAC
sample09  CTATCATCGGTTACGAATCAAACGTCAAATCGGGCGGTGTGGGCGCACGTTATTTTGGCATTGGTGCTGATACCCAGTAC
sample10  CTATCATCGGTTACGAATCAAACGTCAAATCGGGCGGTGTGGGCGCACGTTATTTTGGCATTGGTGCTGATACCCAGTAC

Reference CAACTGGACCAGATCGCAGTTAACCTGCGCGTGGTTAATGTCAGCACCGGCGAAATTCTGAGCTCTGTGAATACCAGCAA
sample02  CAACTGGACCAGATCGCAGTTAACCTGCGCGTGGTTAATGTCAGCACCGGCGAAATTCTGAGCTCTGTGAATACCAGCAA
sample05  CAACTGGACCAGATCGCAGTTAACCTGCGCGTGGTTAATGTCAGCACCGGCGAAATTCTGAGCTCTGTGAATACCAGCAA
sample06  CAACTGGACCAGATCGCAGTTAACCTGCGCGTGGTTAATGTCAGCACCGGCGAAATTCTGAGCTCTGTGAATACCAGCAA
sample09  CAACTGGACCAGATCGCAGTTAACCTGCGCGTGGTTAATGTCAGCACCGGCGAAATTCTGAGCTCTGTGAATACCAGCAA
sample10  CAACTGGACCAGATCGCAGTTAACCTGCGCGTGGTTAATGTCAGCACCGGCGAAATTCTGAGCTCTGTGAATACCAGCAA

Reference AACGATCCTGTCTTACGAAGTGCAGGCTGGTGTTTTTCGTTTCATTGATTATCAACGCCTGCTGGAAGGCGAAGTCGGTT
sample02  AACGATCCTGTCTTACGAAGTGCAGGCTGGTGTTTTTCGTTTCATTGATTATCAACGCCTGCTGGAAGGCGAAGTCGGTT
sample05  AACGATCCTGTCTTACGAAGTGCAGGCTGGTGTTTTTCGTTTCATTGATTATCAACGCCTGCTGGAAGGCGAAGTCGGTT
sample06  AACGATCCTGTCTTACGAAGTGCAGGCTGGTGTTTTTCGTTTCATTGATTATCAACGCCTGCTGGAAGGCGAAGTCGGTT
sample09  AACGATCCTGTCTTACGAAGTGCAGGCTGGTGTTTTTCGTTTCATTGATTATCAACGCCTGCTGGAAGGCGAAGTCGGTT
sample10  AACGATCCTGTCTTACGAAGTGCAGGCTGGTGTTTTTCGTTTCATTGATTATCAACGCCTGCTGGAAGGCGAAGTCGGTT

Reference ACACCTCAAACGAACCGGTGATGCTGTGTCTGATGTCGGCGATTGAAACGGGTGTTATTTTCCTGATCAATGATGGCATC
sample02  ACACCTCAAACGAACCGGTGATGCTGTGTCTGATGTCGGCGATTGAAACGGGTGTTATTTTCCTGATCAATGATGGCATC
sample05  ACACCTCAAACGAACCGGTGATGCTGTGTCTGATGTCGGCGATTGAAACGGGTGTTATTTTCCTGATCAATGATGGCATC
sample06  ACACCTCAAACGAACCGGTGATGCTGTGTCTGATGTCGGCGATTGAAACGGGTGTTATTTTCCTGATCAATGATGGCATC
sample09  ACACCTCAAACGAACCGGTGATGCTGTGTCTGATGTCGGCGATTGAAACGGGTGTTATTTTCCTGATCAATGATGGCATC
sample10  ACACCTCAAACGAACCGGTGATGCTGTGTCTGATGTCGGCGATTGAAACGGGTGTTATTTTCCTGATCAATGATGGCATC

Reference GACCGTGGTCTGTGGGATCTGCAGAACAAAGCCGAACGTCAAAATGACATTCTGGTGAAATACCGCCACATGAGTGTTCC
sample02  GACCGTGGTCTGTGGGATCTGCAGAACAAAGCCGAACGTCAAAATGACATTCTGGTGAAATACCGCCACATGAGTGTTCC
sample05  GACCGTGGTCTGTGGGATCTGCAGAACAAAGCCGAACGTCAAAATGACATTCTGGTGAAATACCGCCACATGAGTGTTCC
sample06  GACCGTGGTCTGTGGGATCTGCAGAACAAAGCCGAACGTCAAAATGACATTCTGGTGAAATACCGCCACATGAGTGTTCC
sample09  GACCGTGGTCTGTGGGATCTGCAGAACAAAGCCGAACGTCAAAATGACATTCTGGTGAAATACCGCCACATGAGTGTTCC
sample10  GACCGTGGTCTGTGGGATCTGCAGAACAAAGCCGAACGTCAAAATGACATTCTGGTGAAATACCGCCACATGAGTGTTCC

Reference ACCAGAATCCTAATGAGCAGCAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGG
sample02  ACCAGAATCCTAATGAGCAGCAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGG
sample05  ACCAGAATCCTAATGAGCAGCAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGG
sample06  ACCAGAATCCTAATGAGCAGCAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGG
sample09  ACCAGAATCCTAATGAGCAGCAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGG
sample10  ACCAGAATCCTAATGAGCAGCAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGG

Reference GTTTTTTGCTGAAAGGAGGAACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGC
sample02  GTTTTTTGCTGAAAGGAGGAACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGC
sample05  GTTTTTTGCTGAAAGGAGGAACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGC
sample06  GTTTTTTGCTGAAAGGAGGAACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGC
sample09  GTTTTTTGCTGAAAGGAGGAACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGC
sample10  GTTTTTTGCTGAAAGGAGGAACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGC

Reference GTAACTGGACTGCAATCAACTCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAA
sample02  GTAACTGGACTGCAATCAACTCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAA
sample05  GTAACTGGACTGCAATCAACTCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAA
sample06  GTAACTGGACTGCAATCAACTCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAA
sample09  GTAACTGGACTGCAATCAACTCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAA
sample10  GTAACTGGACTGCAATCAACTCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAA

Reference AGTTACCCA
sample02  AGTTACCCA
sample05  AGTTACCCA
sample06  AGTTACCCA
sample09  AGTTACCCA
sample10  AGTTACCCA

Reference GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGAGTGCGAAGGCTGCTGAA
sample03  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGAGTGCGAAGGCTGCTGAA
sample04  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGAGTGCGAAGGCTGCTGAA
sample07  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGAGTGCGAAGGCTGCTGAA
sample08  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGAGTGCGAAGGCTGCTGAA
sample11  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGAGTGCGAAGGCTGCTGAA
sample12  GACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTATAAGAAGGAGATATACATATGAGTGCGAAGGCTGCTGAA

Reference GGTTATGAACAAATCGAAGTTGATGTGGTTGCTGTGTGGAAGGAAGGTTATGTGTATGAAAATCGTGGTAGTACCTCCGT
sample03  GGTTATGAACAAATCGAAGTTGATGTGGTTGCTGTGTGGAAGGAAGGTTATGTGTATGAAAATCGTGGTAGTACCTCCGT
sample04  GGTTATGAACAAATCGAAGTTGATGTGGTTGCTGTGTGGAAGGAAGGTTATGTGTATGAAAATCGTGGTAGTACCTCCGT
sample07  GGTTATGAACAAATCGAAGTTGATGTGGTTGCTGTGTGGAAGGAAGGTTATGTGTATGAAAATCGTGGTAGTACCTCCGT
sample08  GGTTATGAACAAATCGAAGTTGATGTGGTTGCTGTGTGGAAGGAAGGTTATGTGTATGAAAATCGTGGTAGTACCTCCGT
sample11  GGTTATGAACAAATCGAAGTTGATGTGGTTGCTGTGTGGAAGGAAGGTTATGTGTATGAAAATCGTGGTAGTACCTCCGT
sample12  GGTTATGAACAAATCGAAGTTGATGTGGTTGCTGTGTGGAAGGAAGGTTATGTGTATGAAAATCGTGGTAGTACCTCCGT

Reference GGATCAAAAAATTACCATCACGAAAGGCATGAAGAACGTTAATAGCGAAACCCGTACGGTCACCGCGACGCATTCTATTG
sample03  GGATCAAAAAATTACCATCACGAAAGGCATGAAGAACGTTAATAGCGAAACCCGTACGGTCACCGCGACGCATTCTATTG
sample04  GGATCAAAAAATTACCATCACGAAAGGCATGAAGAACGTTAATAGCGAAACCCGTACGGTCACCGCGACGCATTCTATTG
sample07  GGATCAAAAAATTACCATCACGAAAGGCATGAAGAACGTTAATAGCGAAACCCGTACGGTCACCGCGACGCATTCTATTG
sample08  GGATCAAAAAATTACCATCACGAAAGGCATGAAGAACGTTAATAGCGAAACCCGTACGGTCACCGCGACGCATTCTATTG
sample11  GGATCAAAAAATTACCATCACGAAAGGCATGAAGAACGTTAATAGCGAAACCCGTACGGTCACCGCGACGCATTCTATTG
sample12  GGATCAAAAAATTACCATCACGAAAGGCATGAAGAACGTTAATAGCGAAACCCGTACGGTCACCGCGACGCATTCTATTG

Reference GCAGTACCATCTCCACGGGTGACGCCTTTGAAATCGGCTCCGTGGAAGTTTCATATTCGCATAGCCACGAAGAATCACAA
sample03  GCAGTACCATCTCCACGGGTGACGCCTTTGAAATCGGCTCCGTGGAAGTTTCATATTCGCATAGCCACGAAGAATCACAA
sample04  GCAGTACCATCTCCACGGGTGACGCCTTTGAAATCGGCTCCGTGGAAGTTTCATATTCGCATAGCCACGAAGAATCACAA
sample07  GCAGTACCATCTCCACGGGTGACGCCTTTGAAATCGGCTCCGTGGAAGTTTCATATTCGCATAGCCACGAAGAATCACAA
sample08  GCAGTACCATCTCCACGGGTGACGCCTTTGAAATCGGCTCCGTGGAAGTTTCATATTCGCATAGCCACGAAGAATCACAA
sample11  GCAGTACCATCTCCACGGGTGACGCCTTTGAAATCGGCTCCGTGGAAGTTTCATATTCGCATAGCCACGAAGAATCACAA
sample12  GCAGTACCATCTCCACGGGTGACGCCTTTGAAATCGGCTCCGTGGAAGTTTCATATTCGCATAGCCACGAAGAATCACAA

Reference GTTTCGATGACCGAAACGGAAGTCTACGAATCAAAAGTGATTGAACACACCATTACGATCCCGCCGACCTCGAAGTTCAC
sample03  GTTTCGATGACCGAAACGGAAGTCTACGAATCAAAAGTGATTGAACACACCATTACGATCCCGCCGACCTCGAAGTTCAC
sample04  GTTTCGATGACCGAAACGGAAGTCTACGAATCAAAAGTGATTGAACACACCATTACGATCCCGCCGACCTCGAAGTTCAC
sample07  GTTTCGATGACCGAAACGGAAGTCTACGAATCAAAAGTGATTGAACACACCATTACGATCCCGCCGACCTCGAAGTTCAC
sample08  GTTTCGATGACCGAAACGGAAGTCTACGAATCAAAAGTGATTGAACACACCATTACGATCCCGCCGACCTCGAAGTTCAC
sample11  GTTTCGATGACCGAAACGGAAGTCTACGAATCAAAAGTGATTGAACACACCATTACGATCCCGCCGACCTCGAAGTTCAC
sample12  GTTTCGATGACCGAAACGGAAGTCTACGAATCAAAAGTGATTGAACACACCATTACGATCCCGCCGACCTCGAAGTTCAC

Reference GCGCTGGCAGCTGAACGCAGATGTCGGCGGTGCTGACATTGAATATATGTACCTGATCGATGAAGTTACCCCGATTGGCG
sample03  GCGCTGGCAGCTGAACGCAGATGTCGGCGGTGCTGACATTGAATATATGTACCTGATCGATGAAGTTACCCCGATTGGCG
sample04  GCGCTGGCAGCTGAACGCAGATGTCGGCGGTGCTGACATTGAATATATGTACCTGATCGATGAAGTTACCCCGATTGGCG
sample07  GCGCTGGCAGCTGAACGCAGATGTCGGCGGTGCTGACATTGAATATATGTACCTGATCGATGAAGTTACCCCGATTGGCG
sample08  GCGCTGGCAGCTGAACGCAGATGTCGGCGGTGCTGACATTGAATATATGTACCTGATCGATGAAGTTACCCCGATTGGCG
sample11  GCGCTGGCAGCTGAACGCAGATGTCGGCGGTGCTGACATTGAATATATGTACCTGATCGATGAAGTTACCCCGATTGGCG
sample12  GCGCTGGCAGCTGAACGCAGATGTCGGCGGTGCTGACATTGAATATATGTACCTGATCGATGAAGTTACCCCGATTGGCG

Reference GTACGCAGAGTATTCCGCAAGTGATCACCTCCCGTGCAAAAATTATCGTTGGTCGCCAGATTATCCTGGGCAAGACCGAA
sample03  GTACGCAGAGTATTCCGCAAGTGATCACCTCCCGTGCAAAAATTATCGTTGGTCGCCAGATTATCCTGGGCAAGACCGAA
sample04  GTACGCAGAGTATTCCGCAAGTGATCACCTCCCGTGCAAAAATTATCGTTGGTCGCCAGATTATCCTGGGCAAGACCGAA
sample07  GTACGCAGAGTATTCCGCAAGTGATCACCTCCCGTGCAAAAATTATCGTTGGTCGCCAGATTATCCTGGGCAAGACCGAA
sample08  GTACGCAGAGTATTCCGCAAGTGATCACCTCCCGTGCAAAAATTATCGTTGGTCGCCAGATTATCCTGGGCAAGACCGAA
sample11  GTACGCAGAGTATTCCGCAAGTGATCACCTCCCGTGCAAAAATTATCGTTGGTCGCCAGATTATCCTGGGCAAGACCGAA
sample12  GTACGCAGAGTATTCCGCAAGTGATCACCTCCCGTGCAAAAATTATCGTTGGTCGCCAGATTATCCTGGGCAAGACCGAA

Reference ATTCGTATCAAACATGCTGAACGCAAGGAATATATGACCGTGGTTAGCCGTAAATCTTGGCCGGCGGCCACGCTGGGTCA
sample03  ATTCGTATCAAACATGCTGAACGCAAGGAATATATGACCGTGGTTAGCCGTAAATCTTGGCCGGCGGCCACGCTGGGTCA
sample04  ATTCGTATCAAACATGCTGAACGCAAGGAATATATGACCGTGGTTAGCCGTAAATCTTGGCCGGCGGCCACGCTGGGTCA
sample07  ATTCGTATCAAACATGCTGAACGCAAGGAATATATGACCGTGGTTAGCCGTAAATCTTGGCCGGCGGCCACGCTGGGTCA
sample08  ATTCGTATCAAACATGCTGAACGCAAGGAATATATGACCGTGGTTAGCCGTAAATCTTGGCCGGCGGCCACGCTGGGTCA
sample11  ATTCGTATCAAACATGCTGAACGCAAGGAATATATGACCGTGGTTAGCCGTAAATCTTGGCCGGCGGCCACGCTGGGTCA
sample12  ATTCGTATCAAACATGCTGAACGCAAGGAATATATGACCGTGGTTAGCCGTAAATCTTGGCCGGCGGCCACGCTGGGTCA

Reference CAGTAAACTGTTTAAGTTCGTGCTGTACGAAGATTGGGGCGGTTTTCGCATCAAAACCCTGAATACGATGTATTCTGGTT
sample03  CAGTAAACTGTTTAAGTTCGTGCTGTACGAAGATTGGGGCGGTTTTCGCATCAAAACCCTGAATACGATGTATTCTGGTT
sample04  CAGTAAACTGTTTAAGTTCGTGCTGTACGAAGATTGGGGCGGTTTTCGCATCAAAACCCTGAATACGATGTATTCTGGTT
sample07  CAGTAAACTGTTTAAGTTCGTGCTGTACGAAGATTGGGGCGGTTTTCGCATCAAAACCCTGAATACGATGTATTCTGGTT
sample08  CAGTAAACTGTTTAAGTTCGTGCTGTACGAAGATTGGGGCGGTTTTCGCATCAAAACCCTGAATACGATGTATTCTGGTT
sample11  CAGTAAACTGTTTAAGTTCGTGCTGTACGAAGATTGGGGCGGTTTTCGCATCAAAACCCTGAATACGATGTATTCTGGTT
sample12  CAGTAAACTGTTTAAGTTCGTGCTGTACGAAGATTGGGGCGGTTTTCGCATCAAAACCCTGAATACGATGTATTCTGGTT

Reference ATGAATACGCGTATAGCTCTGACCAGGGCGGTATCTACTTCGATCAAGGCACCGACAACCCGAAACAGCGTTGGGCCATT
sample03  ATGAATACGCGTATAGCTCTGACCAGGGCGGTATCTACTTCGATCAAGGCACCGACAACCCGAAACAGCGTTGGGCCATT
sample04  ATGAATACGCGTATAGCTCTGACCAGGGCGGTATCTACTTCGATCAAGGCACCGACAACCCGAAACAGCGTTGGGCCATT
sample07  ATGAATACGCGTATAGCTCTGACCAGGGCGGTATCTACTTCGATCAAGGCACCGACAACCCGAAACAGCGTTGGGCCATT
sample08  ATGAATACGCGTATAGCTCTGACCAGGGCGGTATCTACTTCGATCAAGGCACCGACAACCCGAAACAGCGTTGGGCCATT
sample11  ATGAATACGCGTATAGCTCTGACCAGGGCGGTATCTACTTCGATCAAGGCACCGACAACCCGAAACAGCGTTGGGCCATT
sample12  ATGAATACGCGTATAGCTCTGACCAGGGCGGTATCTACTTCGATCAAGGCACCGACAACCCGAAACAGCGTTGGGCCATT

Reference AATAAGAGCCTGCCGCTGCGCCATGGTGATGTCGTGACCTTTATGAACAAATACTTCACGCGTTCTGGTCTGTGCTATGA
sample03  AATAAGAGCCTGCCGCTGCGCCATGGTGATGTCGTGACCTTTATGAACAAATACTTCACGCGTTCTGGTCTGTGCTATGA
sample04  AATAAGAGCCTGCCGCTGCGCCATGGTGATGTCGTGACCTTTATGAACAAATACTTCACGCGTTCTGGTCTGTGCTATGA
sample07  AATAAGAGCCTGCCGCTGCGCCATGGTGATGTCGTGACCTTTATGAACAAATACTTCACGCGTTCTGGTCTGTGCTATGA
sample08  AATAAGAGCCTGCCGCTGCGCCATGGTGATGTCGTGACCTTTATGAACAAATACTTCACGCGTTCTGGTCTGTGCTATGA
sample11  AATAAGAGCCTGCCGCTGCGCCATGGTGATGTCGTGACCTTTATGAACAAATACTTCACGCGTTCTGGTCTGTGCTATGA
sample12  AATAAGAGCCTGCCGCTGCGCCATGGTGATGTCGTGACCTTTATGAACAAATACTTCACGCGTTCTGGTCTGTGCTATGA

Reference TGACGGCCCGGCGACCAATGTGTATTGTCTGGATAAACGCGAAGACAAGTGGATTCTGGAAGTTGTCGGATAATGAGCAG
sample03  TGACGGCCCGGCGACCAATGTGTATTGTCTGGATAAACGCGAAGACAAGTGGATTCTGGAAGTTGTCGGATAATGAGCAG
sample04  TGACGGCCCGGCGACCAATGTGTATTGTCTGGATAAACGCGAAGACAAGTGGATTCTGGAAGTTGTCGGATAATGAGCAG
sample07  TGACGGCCCGGCGACCAATGTGTATTGTCTGGATAAACGCGAAGACAAGTGGATTCTGGAAGTTGTCGGATAATGAGCAG
sample08  TGACGGCCCGGCGACCAATGTGTATTGTCTGGATAAACGCGAAGACAAGTGGATTCTGGAAGTTGTCGGATAATGAGCAG
sample11  TGACGGCCCGGCGACCAATGTGTATTGTCTGGATAAACGCGAAGACAAGTGGATTCTGGAAGTTGTCGGATAATGAGCAG
sample12  TGACGGCCCGGCGACCAATGTGTATTGTCTGGATAAACGCGAAGACAAGTGGATTCTGGAAGTTGTCGGATAATGAGCAG

Reference CAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGA
sample03  CAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGA
sample04  CAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGA
sample07  CAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGA
sample08  CAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGA
sample11  CAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGA
sample12  CAGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGA

Reference ACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGCGTAACTGGACTGCAATCAAC
sample03  ACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGCGTAACTGGACTGCAATCAAC
sample04  ACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGCGTAACTGGACTGCAATCAAC
sample07  ACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGCGTAACTGGACTGCAATCAAC
sample08  ACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGCGTAACTGGACTGCAATCAAC
sample11  ACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGCGTAACTGGACTGCAATCAAC
sample12  ACTATATCCGGGTAACGAATTCAAGCTTGATATCATTCAGGACGAGCCTCAGACTCCAGCGTAACTGGACTGCAATCAAC

Reference TCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAAAGTTACCCA
sample03  TCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAAAGTTACCCA
sample04  TCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAAAGTTACCCA
sample07  TCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAAAGTTACCCA
sample08  TCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAAAGTTACCCA
sample11  TCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAAAGTTACCCA
sample12  TCACTGGCTCACCTTCACGGGTGGGCCTTTCTTCGGGATCCGCGCTGCGGACACATACAAAGTTACCCA

Insert QC

This section can be used to check that the provided insert reference matches the insert found in the assembly. The table below shows coverage and BLAST identity between the two. Reference coverage is the percentage of the provided insert reference sequence covered in the alignment with the assembled construct. Assembly coverage is the percentage of the assembled insert sequence covered in the alignment with the provided insert reference. BLAST identity is calculated as: (length - ins - del - sub) / length. If both coverage and identity are 0, the assembled insert did not align with the provided insert reference.

Sample name Reference coverage Assembly coverage BLAST Identity
sample02 100.0 100.0 100.0
sample03 100.0 100.0 100.0
sample04 100.0 100.0 100.0
sample05 100.0 100.0 100.0
sample06 100.0 100.0 100.0
sample07 100.0 100.0 100.0
sample08 100.0 100.0 100.0
sample09 100.0 100.0 100.0
sample10 100.0 100.0 100.0
sample11 100.0 100.0 100.0
sample12 100.0 100.0 100.0
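
The identity and coverage definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's own code: the function names are hypothetical, `length` is assumed to be the alignment length, and the results are scaled to percentages to match the table.

```python
def blast_identity(length: int, ins: int, dels: int, sub: int) -> float:
    """BLAST identity as a percentage: (length - ins - del - sub) / length.

    `length` is assumed here to be the alignment length; `ins`, `dels` and
    `sub` are the counts of inserted, deleted and substituted bases.
    """
    if length <= 0:
        # No alignment: report 0, matching the "did not align" convention.
        return 0.0
    return 100.0 * (length - ins - dels - sub) / length


def coverage(aligned_bases: int, sequence_length: int) -> float:
    """Percentage of a sequence (reference or assembly) covered by the alignment."""
    if sequence_length <= 0:
        return 0.0
    return 100.0 * aligned_bases / sequence_length
```

For example, an alignment of 1000 positions with one insertion, one deletion and one substitution gives `blast_identity(1000, 1, 1, 1)`, roughly 99.7.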

Insert variants

The following tables and figures are output by bcftools and report any variants between the consensus insert and the provided reference insert.

Variant counts per sample. See output bcf file for info on individual variants.

filename id MNPs SNPs indels multiallelic SNP sites multiallelic sites no-ALTs others records
sample02.insert.stats sample02.insert.stats 0 0 0 0 0 0 0 0
sample03.insert.stats sample03.insert.stats 0 0 0 0 0 0 0 0
sample04.insert.stats sample04.insert.stats 0 0 0 0 0 0 0 0
sample05.insert.stats sample05.insert.stats 0 0 0 0 0 0 0 0
sample06.insert.stats sample06.insert.stats 0 0 0 0 0 0 0 0
sample07.insert.stats sample07.insert.stats 0 0 0 0 0 0 0 0
sample08.insert.stats sample08.insert.stats 0 0 0 0 0 0 0 0
sample09.insert.stats sample09.insert.stats 0 0 0 0 0 0 0 0
sample10.insert.stats sample10.insert.stats 0 0 0 0 0 0 0 0
sample11.insert.stats sample11.insert.stats 0 0 0 0 0 0 0 0
sample12.insert.stats sample12.insert.stats 0 0 0 0 0 0 0 0

Transition (ts) and transversion (tv) counts per sample. See output bcf file for info on individual transitions and transversions.

filename id ts tv ts/tv ts (1st ALT) tv (1st ALT) ts/tv (1st ALT)
sample02.insert.stats sample02.insert.stats 0 0 0.0 0 0 0.0
sample03.insert.stats sample03.insert.stats 0 0 0.0 0 0 0.0
sample04.insert.stats sample04.insert.stats 0 0 0.0 0 0 0.0
sample05.insert.stats sample05.insert.stats 0 0 0.0 0 0 0.0
sample06.insert.stats sample06.insert.stats 0 0 0.0 0 0 0.0
sample07.insert.stats sample07.insert.stats 0 0 0.0 0 0 0.0
sample08.insert.stats sample08.insert.stats 0 0 0.0 0 0 0.0
sample09.insert.stats sample09.insert.stats 0 0 0.0 0 0 0.0
sample10.insert.stats sample10.insert.stats 0 0 0.0 0 0 0.0
sample11.insert.stats sample11.insert.stats 0 0 0.0 0 0 0.0
sample12.insert.stats sample12.insert.stats 0 0 0.0 0 0 0.0
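
As a reminder of what the ts and tv columns count, the sketch below classifies single-base substitutions and computes the ratio. It is an illustration only, not how bcftools is implemented; the function names are hypothetical, and reporting a ratio of 0.0 when there are no transversions is an assumption chosen to mirror the tables above.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref: str, alt: str) -> str:
    """Return 'ts' for a transition or 'tv' for a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or {ref, alt} - (PURINES | PYRIMIDINES):
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "ts" if same_class else "tv"


def ts_tv_counts(snps):
    """Count transitions and transversions in an iterable of (ref, alt) pairs."""
    counts = {"ts": 0, "tv": 0}
    for ref, alt in snps:
        counts[classify_snp(ref, alt)] += 1
    # Assumption: report 0.0 when no transversions were seen.
    ratio = counts["ts"] / counts["tv"] if counts["tv"] else 0.0
    return counts["ts"], counts["tv"], ratio
```

With no variants, as for every sample here, `ts_tv_counts([])` returns zero counts and a ratio of 0.0.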

Full construct QC

This section can be used to check that the provided reference matches the assembly. The table below shows coverage and BLAST identity between the provided reference and assembly. Reference coverage is the percentage of the provided reference sequence covered in the alignment with the assembled construct. Assembly coverage is the percentage of the assembled construct sequence covered in the alignment with the provided reference. BLAST identity is calculated as: (length - ins - del - sub) / length. If both coverage and identity are 0, the assembly did not align with the provided reference.

Sample name Reference coverage Assembly coverage BLAST Identity
sample02 100.0 100.0 94.57
sample03 99.3825 98.7726 100.0
sample04 99.3825 98.773 99.97
sample05 100.0 100.0 100.0
sample06 100.0 100.0 99.97
sample07 99.3825 98.7726 100.0
sample08 99.3825 98.7726 100.0
sample09 100.0 100.0 100.0
sample10 100.0 100.0 100.0
sample11 99.3825 98.7726 100.0
sample12 99.3825 98.7726 100.0
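
The expected_coverage (95) and expected_identity (99) parameters define the acceptance criteria applied to values like these. The sketch below shows one way such a check could look. It is not the workflow's code: the function name is hypothetical, and it assumes that the reference coverage column is the one tested and that a value equal to the threshold passes.

```python
EXPECTED_COVERAGE = 95.0  # expected_coverage parameter, in percent
EXPECTED_IDENTITY = 99.0  # expected_identity parameter, in percent


def is_expected(reference_coverage: float, identity: float,
                min_coverage: float = EXPECTED_COVERAGE,
                min_identity: float = EXPECTED_IDENTITY) -> bool:
    """True only if both acceptance criteria are met.

    Assumptions: reference coverage is the coverage value tested, and
    the comparison is inclusive (>=).
    """
    return reference_coverage >= min_coverage and identity >= min_identity


# (reference coverage, BLAST identity) for three rows of the table above.
rows = {
    "sample02": (100.0, 94.57),
    "sample03": (99.3825, 100.0),
    "sample04": (99.3825, 99.97),
}
status = {name: is_expected(cov, ident) for name, (cov, ident) in rows.items()}
```

Under these assumptions, sample02 would not meet the identity criterion (94.57 is below 99), while sample03 and sample04 would meet both.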

Additionally, BCFtools was used to report any variants between the provided reference and assembly.

Variant counts per sample. See output bcf file for info on individual variants.

filename id MNPs SNPs indels multiallelic SNP sites multiallelic sites no-ALTs others records
sample02.full_construct.stats sample02.full_construct.stats 0 0 0 0 0 0 0 0
sample03.full_construct.stats sample03.full_construct.stats 0 0 0 0 0 0 0 0
sample04.full_construct.stats sample04.full_construct.stats 0 0 0 0 0 0 0 0
sample05.full_construct.stats sample05.full_construct.stats 0 0 0 0 0 0 0 0
sample06.full_construct.stats sample06.full_construct.stats 0 0 0 0 0 0 0 0
sample07.full_construct.stats sample07.full_construct.stats 0 0 0 0 0 0 0 0
sample08.full_construct.stats sample08.full_construct.stats 0 0 0 0 0 0 0 0
sample09.full_construct.stats sample09.full_construct.stats 0 0 0 0 0 0 0 0
sample10.full_construct.stats sample10.full_construct.stats 0 0 0 0 0 0 0 0
sample11.full_construct.stats sample11.full_construct.stats 0 0 0 0 0 0 0 0
sample12.full_construct.stats sample12.full_construct.stats 0 0 0 0 0 0 0 0

Transition (ts) and transversion (tv) counts per sample. See output bcf file for info on individual transitions and transversions.

filename id ts tv ts/tv ts (1st ALT) tv (1st ALT) ts/tv (1st ALT)
sample02.full_construct.stats sample02.full_construct.stats 0 0 0.0 0 0 0.0
sample03.full_construct.stats sample03.full_construct.stats 0 0 0.0 0 0 0.0
sample04.full_construct.stats sample04.full_construct.stats 0 0 0.0 0 0 0.0
sample05.full_construct.stats sample05.full_construct.stats 0 0 0.0 0 0 0.0
sample06.full_construct.stats sample06.full_construct.stats 0 0 0.0 0 0 0.0
sample07.full_construct.stats sample07.full_construct.stats 0 0 0.0 0 0 0.0
sample08.full_construct.stats sample08.full_construct.stats 0 0 0.0 0 0 0.0
sample09.full_construct.stats sample09.full_construct.stats 0 0 0.0 0 0 0.0
sample10.full_construct.stats sample10.full_construct.stats 0 0 0.0 0 0 0.0
sample11.full_construct.stats sample11.full_construct.stats 0 0 0.0 0 0 0.0
sample12.full_construct.stats sample12.full_construct.stats 0 0 0.0 0 0 0.0

Dot plots

These dot plots have been created by aligning the assembly to itself to reveal any repeats or repetitive regions. Black is for repeats found in the forward strand and red for repeats found in the reverse complement. This was done using LAST.
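
To illustrate the idea behind a self dot plot, the sketch below finds exact k-mer matches of a sequence against itself, on the forward strand and against the reverse complement. This is a deliberately naive stand-in and not what LAST does (LAST is a seed-and-extend aligner); the function names and the default k are hypothetical choices for illustration.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (non-ACGT characters are kept as-is)."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, k: int = 8):
    """Return (forward, reverse) lists of (i, j) start positions.

    forward: the k-mer at i also occurs at a different position j.
    reverse: the reverse complement of the k-mer at i occurs at j.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    forward, reverse = [], []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip the trivial main diagonal (a k-mer matching itself).
        forward.extend((i, j) for j in positions.get(kmer, []) if j != i)
        reverse.extend((i, j) for j in positions.get(reverse_complement(kmer), []))
    return forward, reverse
```

Plotting the forward points in black and the reverse points in red would give a picture in the same spirit as the dot plots in this section, with off-diagonal runs marking repeats.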

Software versions

Name Version
medaka 2.0.0
flye 2.9.4-b1799
minimap2 2.28-r1209
samtools 1.19.2
seqkit v2.4.0
Trycycler v0.5.5
bedtools v2.31.0
fastcat 0.18.6
rasusa 0.7.1
spoa 0.0.10
pandas 1.3.5
plannotate 1.2.0
bokeh 3.1.1

Workflow parameters

Key Value
out_dir wf-clone-validation
store_dir wf-clone-validation/store_dir
fastq wf-clone-validation/data/clone_val_test/fastq
primers wf-clone-validation/data/clone_val_test/primers.tsv
host_reference wf-clone-validation/data/clone_val_test/host_reference.fa.gz
regions_bedfile wf-clone-validation/data/clone_val_test/reference.bed
sample_sheet wf-clone-validation/data/clone_val_test/sample_sheet.csv
override_basecaller_cfg dna_r10.4.1_e8.2_400bps_hac@v4.2.0
bam None
db_directory None
threads 4
approx_size 7000
assm_coverage 60
trim_length 150
prefix None
insert_reference None
sample None
analyse_unclassified False
medaka_model_path None
flye_quality nano-hq
non_uniform_coverage False
large_construct False
full_reference None
cutsite_mismatch 1
primer_mismatch 2
expected_coverage 95
expected_identity 99
assembly_tool flye
canu_fast False
client_fields None