Questions about V3–V4 primers for 16S rRNA amplicon sequencing, and calculating overlap

User Support (tags: 16s, primers, dada2)

KQUB July 20, 2021, 5:06pm #1

Hi there,

I've got some questions about primers and about calculating overlap for paired-end reads. I hope I'm posting in an appropriate place. Sorry, if not. This is my first post.

Background:

We've done some 16S rRNA amplicon sequencing with primers that Illumina says are used to sequence the V3 and V4 variable regions of the 16S rRNA gene. The Illumina guide can be found here.

Forward primer: 5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3'
Reverse primer: 5'-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3'

I've noticed that the last 17 nt of the forward Illumina primer and the last 21 nt of the reverse Illumina primer can be found in the following three papers (and maybe others):

Herlemann, D. P., Labrenz, M., Jürgens, K., Bertilsson, S., Waniek, J. J., & Andersson, A. F. (2011). Transitions in bacterial communities along the 2000 km salinity gradient of the Baltic Sea. The ISME Journal, 5(10), 1571-1579.

Klindworth, A., Pruesse, E., Schweer, T., Peplies, J., Quast, C., Horn, M., & Glöckner, F. O. (2013). Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Research, 41(1), e1-e1.

Thijs, S., Op De Beeck, M., Beckers, B., Truyens, S., Stevens, V., Van Hamme, J. D., ... & Vangronsveld, J. (2017). Comparative evaluation of four bacteria-specific primer pairs for 16S rRNA gene surveys. Frontiers in Microbiology, 8, 494.

My interpretation:

As far as I can tell, the Herlemann et al. paper is the original source of these sequences. That paper reports primers called Bakt_341F (CCTACGGGNGGCWGCAG) and Bakt_805R (GACTACHVGGGTATCTAATCC). The amplicon size isn't given, but the targeted variable regions of the 16S rRNA are given as V3 and V4.
The Klindworth et al. paper is cited by the Illumina guide referenced above. That paper reports primers called S-D-Bact-0341-b-S-17 (5'-CCTACGGGNGGCWGCAG-3') and S-D-Bact-0785-a-A-21 (5'-GACTACHVGGGTATCTAATCC-3'), which are said to have an amplicon size of 464 bp. No details are given as to which variable regions of the 16S rRNA gene are targeted.

The Thijs et al. paper reports primers called 341f (CCTACGGGNGGCWGCAG) and 785r (GACTACHVGGGTATCTAATCC). 341f is said to target positions 341–357 in E. coli, and 785r targets positions 785–805. The amplicon size is given as 444 bp. The targeted variable regions of the 16S rRNA gene are given as V3–V4.

My questions:

My understanding is that the first 33 nt of the forward Illumina primer and the first 34 nt of the reverse Illumina primer each comprise a Nextera adapter, which overhangs (i.e. does not bind to the 16S rRNA gene). In that case, are the last 17 nt of the forward primer and the last 21 nt of the reverse primer the regions that bind to the 16S rRNA gene? Figure 1 in the Illumina guide seems to suggest that they are.

In a forum post about these primers, @benjjneb recommended removing "the entire primers" in the DADA2 denoising step because "the primers aren't sequences from the sample". Does he mean that although the 17-nt and 21-nt regions are designed to bind to the 16S rRNA gene, they should be removed in the DADA2 denoising step because they may not always 100% exactly match each 16S rRNA gene sequence that they bind to? That would make sense to me.

The primer pairs in the three papers I mentioned are all identical, yet they all have different names, which is confusing. What do the numbers 341, 785 and 805 mean? Do these refer to the positions where the primers bind to the 16S rRNA gene in the E. coli genome? I guess that would make sense, because presumably 785 and 805 would then refer to opposite ends of the reverse primer.
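The primer anatomy asked about in the questions above can be checked directly from the sequences quoted in the post. The following is a minimal Python sketch, not part of the original thread: the helper names `split_primer` and `span_length` are hypothetical, and the E. coli positions are the ones the post attributes to Thijs et al.

```python
# Full Illumina primers as quoted in the post (5' to 3').
ILLUMINA_FWD = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG"
ILLUMINA_REV = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC"

# Gene-specific primers as quoted from the three papers.
LOCUS_FWD = "CCTACGGGNGGCWGCAG"      # Bakt_341F / S-D-Bact-0341-b-S-17 / 341f
LOCUS_REV = "GACTACHVGGGTATCTAATCC"  # Bakt_805R / S-D-Bact-0785-a-A-21 / 785r


def split_primer(full_primer, locus_specific):
    """Split a full primer into (overhang, locus-specific part).

    Hypothetical helper: raises ValueError if the full primer does not
    end with the locus-specific sequence.
    """
    if not full_primer.endswith(locus_specific):
        raise ValueError("locus-specific sequence is not the 3' end of the primer")
    return full_primer[: -len(locus_specific)], locus_specific


def span_length(first_pos, last_pos):
    """Number of positions covered by an inclusive coordinate range."""
    return last_pos - first_pos + 1


fwd_overhang, fwd_locus = split_primer(ILLUMINA_FWD, LOCUS_FWD)
rev_overhang, rev_locus = split_primer(ILLUMINA_REV, LOCUS_REV)

print(len(fwd_overhang), len(fwd_locus))  # overhang and gene-specific lengths, forward
print(len(rev_overhang), len(rev_locus))  # overhang and gene-specific lengths, reverse

# E. coli positions quoted from Thijs et al. in the post.
print(span_length(341, 357), span_length(785, 805))
```

If the reading in the post is right, the inclusive spans 341–357 and 785–805 should match the lengths of the two gene-specific primers, which would explain why the same reverse primer is named 785 in some papers and 805 in others.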
The Klindworth et al. paper gives an amplicon size of 464 bp for these primers, whereas the Thijs et al. paper gives an amplicon size of 444 bp. If the primers, plus the sequence they amplify, span from positions 341 to 805, I guess the correct amplicon size should be 805 − 341 = 464 bp, right? Also, what does this number mean exactly? Is it the amplicon size expected if the primers are used to amplify from the E. coli 16S rRNA gene? And, if using these primers to sequence an environmental sample, would we then get a range of slightly different amplicon sizes (because the 16S rRNA gene varies slightly in different species)? In other words, is 464 bp kind of an approximate number?

In this forum post, @colinbrislawn gave this kind of formula for calculating overlap: (length of forward read) + (length of reverse read) − (length of amplicon) = length of overlap. So, in the case of 2x300 bp paired-end Illumina MiSeq sequencing, we'd have 300 + 300 − 464 = 136 bp overlap, to start with? And then, am I right in thinking that the --p-trim-left-f and --p-trim-left-r parameters of qiime dada2 denoise-paired basically don't impact the overlap (because they act at the 5' end of the reads), but --p-trunc-len-f and --p-trunc-len-r do (because they act at the 3' end of the reads)? If so, then I guess the formula for the final overlap would be: (length of forward read) + (length of reverse read) − (length of amplicon) − (length of forward read − --p-trunc-len-f value) − (length of reverse read − --p-trunc-len-r value) = overlap. So, for example, if we picked --p-trunc-len-f = 280 and --p-trunc-len-r = 250, we'd have 300 + 300 − 464 − 20 − 50 = 66 bp overlap?

Apologies for the ungodly length of this post, and the potentially very noob questions. Perhaps this kind of 'thinking out loud' post will be somehow useful to other noobs in the future. Anyway, big thanks to the team for developing this amazing tool and for maintaining this very useful forum. You are appreciated, and we are grateful! Keep up the great work!

Cheers,
Kevin
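The overlap arithmetic in the post can be written out as a small Python sketch. This is only an illustration of the formula quoted above, not code from QIIME 2 or DADA2: the function name `expected_overlap` is hypothetical, the amplicon length is assumed to include both primers (reads start at the primers), and a truncation value of 0 is taken to mean "no truncation".

```python
def expected_overlap(read_len_f, read_len_r, amplicon_len,
                     trunc_len_f=0, trunc_len_r=0):
    """Expected overlap (in nt) between forward and reverse reads.

    Follows the formula in the post:
        read_f + read_r - amplicon - (read_f - trunc_f) - (read_r - trunc_r)
    A truncation length of 0 is treated as "keep the full read"
    (an assumption of this sketch).
    """
    # Bases removed from the 3' end of each read by truncation.
    cut_f = read_len_f - trunc_len_f if trunc_len_f else 0
    cut_r = read_len_r - trunc_len_r if trunc_len_r else 0
    return read_len_f + read_len_r - amplicon_len - cut_f - cut_r


# 2x300 bp reads on a 464 bp amplicon, without truncation.
print(expected_overlap(300, 300, 464))

# Same run, truncated to 280 (forward) and 250 (reverse).
print(expected_overlap(300, 300, 464, trunc_len_f=280, trunc_len_r=250))
```

Note that once both reads are truncated, the original read length cancels out and the formula reduces to (forward truncation length) + (reverse truncation length) − (amplicon length). Trimming at the 5' end does not enter the calculation, which matches the reasoning in the post.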
benjjneb (Ben Callahan) July 20, 2021, 7:56pm #2

Yes to basically all of that.

One addition of actual substance:

KQUB:
The Klindworth et al. paper gives an amplicon size of 464 bp for these primers, whereas the Thijs et al. paper gives an amplicon size of 444 bp. If the primers, plus the sequence they amplify, span from positions 341 to 805, I guess the correct amplicon size should be 805 − 341 = 464 bp, right? Also, what does this number mean exactly? Is it the amplicon size expected if the primers are used to amplify from the E. coli 16S rRNA gene? And, if using these primers to sequence an environmental sample, would we then get a range of slightly different amplicon sizes (because the 16S rRNA gene varies slightly in different species)? In other words, is 464 bp kind of an approximate number?

Yes, there is variation in lengths of 16S segments. It isn't large, but it exists. In particular, there are two modes of V3V4 length in nature, one at 460 nts and another ~440 nts (using these primers). So you'll want to make sure even the longer natural amplicons will sufficiently overlap after truncation.

KQUB July 21, 2021, 8:19am #3

Cool, good to know. Thanks a lot for taking the time to read my ramblings, @benjjneb!

system (system) closed August 21, 2021, 2:20pm #4

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.
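The advice in reply #2 about the two length modes can be turned into a quick sanity check on a pair of truncation lengths. The following Python sketch rests on several assumptions: the mode lengths (440 and 460 nt) are taken from the reply and assumed to include the primers, the 12 nt minimum overlap is assumed here as the merging threshold (to my knowledge it is the default of DADA2's `mergePairs`, but verify this for your version), and the function names are hypothetical.

```python
# Approximate V3-V4 amplicon length modes mentioned in reply #2 (assumed to include primers).
AMPLICON_MODES = {"shorter mode": 440, "longer mode": 460}

# Assumed minimum overlap needed for read merging; check your DADA2/QIIME 2 settings.
MIN_OVERLAP = 12


def overlaps_by_mode(trunc_len_f, trunc_len_r, modes=AMPLICON_MODES):
    """Expected overlap (nt) for each amplicon length mode after truncation."""
    return {name: trunc_len_f + trunc_len_r - length
            for name, length in modes.items()}


def truncation_is_safe(trunc_len_f, trunc_len_r,
                       min_overlap=MIN_OVERLAP, modes=AMPLICON_MODES):
    """True if every mode, including the longest, still overlaps enough."""
    overlaps = overlaps_by_mode(trunc_len_f, trunc_len_r, modes)
    return all(value >= min_overlap for value in overlaps.values())


for trunc_f, trunc_r in [(280, 250), (240, 220)]:
    print(trunc_f, trunc_r,
          overlaps_by_mode(trunc_f, trunc_r),
          truncation_is_safe(trunc_f, trunc_r))
```

The check is driven by the longest expected amplicon, since it always has the smallest overlap. In practice some extra margin beyond the minimum is sensible, because real amplicon lengths scatter around the two modes.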