Knowledge Graphs in Natural Language Processing @ ACL 2021


Your guide to the KG-related NLP research, ACL edition.

State Of The Art, Summer '21

Welcome to the third iteration of our regular overview of NLP papers around Knowledge Graphs, this time published at ACL 2021! What will be (or has been) trending this year that you wouldn't want to miss? 👀

Background photo by Milkovi on Unsplash, adapted by Author

ACL'21 remains one of the largest NLP venues: 700+ full papers and 500+ Findings of ACL papers were accepted this year 📈. Besides, don't forget the often impactful short papers and a wide selection of workshops and tutorials. I tried to distill some KG papers from all those tracks into one post. Here are today's topics:

- Neural Databases & Retrieval
- KG-augmented Language Models
- KG Embeddings & Link Prediction
- Entity Alignment
- KG Construction, Entity Linking, Relation Extraction
- KGQA: Temporal, Conversational, and AMR
- tl;dr

For dataset aficionados, I marked every new dataset with 💾, so you can search and navigate a bit more easily. Having said that, you'd probably want some navigation in this ocean of high-quality content 🧭. I tried, I promise.

Meme by Author

Neural Databases & Retrieval

Neural retrieval continues to be one of the fastest-growing and hottest 🔥 topics in NLP: it now works with billions of vectors and indices of the scale of 100+ GB. If the NLP stack is mature enough, can we approach the holy grail of database research, well, databases, from the neural side? 🤨

Meme by Author

Yes! Thorne et al introduce the concept of natural language databases (denoted as NeuralDB): there is no pre-defined rigid schema; instead, you can store facts right as text utterances, as you write them. NB: if you are more of a database person and rank "proper DB venues" higher, the foundational principles were also laid in the recent VLDB'21 paper by the same team of authors. How does it work? What is the query engine? Are there any joins? (No joins, not a database!)

An introduced NLDB consists of K textual facts (25–1000 in this study). Essentially, the query answering task over textual facts is framed as retrieval 🔎 + extractive QA 📑 + aggregation 🧹 (to support min/max/count queries). Given a natural language question, we first want to retrieve several relevant facts (supporting facts). Then, having a query and m supporting sets, we perform a join (select-project-join, the SPJ operator; okay, now it qualifies as a database 😀) against each pair (query, support) to find an answer or confirm its absence (extractive QA). Finally, join results are aggregated with simple post-processing.

🧱 Can we just concatenate all K facts and put them in one big transformer? Technically, yes, but the authors show it is rather inefficient when DB size grows beyond 25 facts. In contrast, a multi-staged approach allows for parallel processing and better scaling. It seems that, currently, the crux of NLDBs is in the retrieval mechanism: we don't want to create a power set of all possible combinations but extract only relevant ones. So far it is done via DPR-like dense retrieval (the Support Set Generator) trained on annotated supporting sets for each query. Speaking of annotation and training, the authors support NLDBs with a new collection of datasets 💾 WikiNLDB: KG triples from Wikidata were verbalized into sentences (and you can generate your own DBs varying the number of facts).

🧪 Experimentally, T5 and Longformer (with bigger context windows) can only compete with the Neural SPJ operator on the smallest graphs when given golden retrieval results. Otherwise, on bigger DBs of 25+ facts their performance quickly deteriorates while SPJ + SSG is far more stable. The paper is very accessible to the general NLP audience, definitely one of my favorites this year 👍!

Neural DB architecture. Source: Thorne et al
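To make the retrieve, extract, aggregate flow above a bit more concrete, here is a minimal Python sketch of that three-stage pattern. The off-the-shelf sentence-transformers and Hugging Face QA models are purely stand-ins: the actual Support Set Generator and Neural SPJ operator in Thorne et al are trained components, and the toy facts and model names below are my own illustrative choices.

```python
# Minimal sketch of the NLDB-style flow: dense retrieval of supporting facts,
# extractive QA per (query, fact) pair, then a simple aggregation step.
# The models below are generic stand-ins, not the components from the paper.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

facts = [
    "Alice joined Acme Corp in 2015.",
    "Bob joined Acme Corp in 2018.",
    "Carol lives in Berlin.",
]
question = "When did Bob join Acme Corp?"

# 1) Retrieval: embed facts and the question, keep the top-k supporting facts.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
fact_emb = encoder.encode(facts, convert_to_tensor=True)
q_emb = encoder.encode(question, convert_to_tensor=True)
hits = util.semantic_search(q_emb, fact_emb, top_k=2)[0]
support = [facts[hit["corpus_id"]] for hit in hits]

# 2) Extractive QA against each supporting fact (the "SPJ"-like step).
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
answers = [qa(question=question, context=fact) for fact in support]

# 3) Aggregation: here we simply take the highest-scoring span;
#    min/max/count queries would post-process the whole span set instead.
best = max(answers, key=lambda a: a["score"])
print(best["answer"])
```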
As retrieval becomes more important (even outside the context of neural databases), ACL'21 has a rich collection of new methods around the seminal Dense Passage Retrieval (DPR) and its family of related retrievers.

Source: Chen et al

Chen et al tackle a common and important IR problem of entity disambiguation, i.e., you have many entities that share the same name (surface form) but have different properties (🖼👈 Abe Lincoln the politician and Abe Lincoln the musician). To evaluate retrievers more systematically, the authors design a new dataset 💾 AmbER (ambiguous entity retrieval) collected from Wikipedia-Wikidata page alignment. Specifically, the dataset emphasizes the 🌟 "popularity gap" 🌟: in most cases, retrievers fall back to the most prominent entities (for example, most viewed pages with more content) in their index, and we want to quantify that shift. Wikidata entities and predicates are used as a reference KG collection to generate new complex disambiguation tasks (called AmbER sets). AmbER consists of two parts: AmbER-H (disambiguating humans) and AmbER-N (non-humans like films, music bands, companies), and measures performance on 3 tasks: QA, slot filling, and fact checking. 🧪 In the experiments, the authors show that current SOTA retrievers do suffer from inefficient disambiguation: performance on tasks involving rare entities drops 15–20 points 📉. That is, there is still a lot to be done to improve retrievers' precision.

Source: Yamada et al

A common computational issue of modern retrievers is their index size: DPR with 21M items takes ~65 GB of memory. Yamada et al propose an elegant solution: using the learning-to-hash idea, let's train a hash layer that approximates a sign function, such that continuous vectors become binary vectors of +1/-1. Then, instead of a costly dot product (MIPS over the index), we can use highly efficient CPU implementations of Hamming distance to compute rough top-K candidates (1000 in the paper). Then, we can easily compute a dot product between the question and those 1000 candidates. The BPR approach (Binary Passage Retriever) enjoys several wins: 1️⃣ index size is reduced to ~2 GB (down from 66 GB!) without big performance drops (only Top-1 accuracy is expectedly affected); 2️⃣ BPR is among the top performers of the EfficientQA NeurIPS Challenge 👏. Overall, this is a very good example of an impactful short paper!
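As a rough illustration of the two-stage idea, here is a small NumPy sketch with random vectors standing in for real question/passage encoders: binarize embeddings with a sign function, pick candidates by Hamming distance over packed bits, then rerank only those candidates with the exact dot product. Note that in BPR the hash layer is learned end-to-end with a differentiable sign approximation; the hard sign below is just for illustration.

```python
# Sketch of BPR-style two-stage search: Hamming-distance candidate generation
# over binarized vectors, then exact dot-product reranking of a small set.
import numpy as np

rng = np.random.default_rng(0)
dim, n_passages = 768, 100_000
passages = rng.normal(size=(n_passages, dim)).astype(np.float32)
question = rng.normal(size=dim).astype(np.float32)

# Binarize with sign (+1/-1) and pack into bits for a compact index.
passage_bits = np.packbits(passages > 0, axis=1)   # n_passages x (dim / 8) bytes
question_bits = np.packbits(question > 0)

# Stage 1: Hamming distance via XOR + bit counting, keep top-1000 candidates.
xor = np.bitwise_xor(passage_bits, question_bits)
hamming = np.unpackbits(xor, axis=1).sum(axis=1)
candidates = np.argsort(hamming)[:1000]

# Stage 2: rerank the small candidate set with the continuous dot product.
scores = passages[candidates] @ question
top10 = candidates[np.argsort(-scores)[:10]]
print(top10)
```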
📬 Finally, I'd outline a few more retrieval-centric works from the conference: Sachan et al examine how pre-training on the Inverse Cloze task and masked salient spans improves DPR performance on open-domain QA tasks. Maillard, Karpukhin et al design a 🌐 universal retriever 🌐, a multi-task trained retriever suitable for many NLP tasks and evaluated on the KILT benchmark combining QA, entity linking, slot filling, and dialogue tasks. Ravfogel et al present a cool demo (over Covid-19 data) of a neural extractive search system available for everybody to play around with.

KG-augmented Language Models: 🪴🚿

One of the major trends in BERTology is to probe factual knowledge of large LMs, e.g., feeding a query "Steve Jobs was born in [MASK]" to predict "California". We can then quantify those probes using various benchmarks like LAMA. In other words, can we treat language models as knowledge bases? So far, we have the evidence that LMs can correctly predict a few simple facts. But really, can they? 🤔

"Our findings strongly question the conclusions of previous literatures, and demonstrate that current MLMs cannot serve as reliable knowledge bases when using prompt-based retrieval paradigm." — Cao et al

Source: Cao et al

The work of Cao et al is pretty much a cold shower for the whole area: they find that most of the reported performance can be attributed to spurious correlations 🥴 rather than actual "knowledge". The authors study 3 types of probing (illustrated 👈): prompts, cases (aka few-shot learning), and contexts. In all scenarios, LMs exhibit numerous flaws, e.g., cases can only help to identify the answer type (person, city, etc.) but cannot point to a particular entity within this class. The paper is very easy to read and follow, and has lots of illustrative examples 🖌, so I'd recommend giving it a proper read even for those who do not actively work in this area. Interestingly, a similar result in open-domain QA is reported by Wang et al here at ACL'21, too. They analyze BART and GPT-2 performance and arrive at pretty much the same conclusions. Time to rethink how we pack explicit knowledge into LMs? 🤔

When you realized LMs were cheating all the time. Source: gfycat

From the previous posts, we know there exists quite a number of Transformer language models enriched with facts from knowledge graphs. Let's welcome two new family members! 👨‍👩‍👦‍👦

Wang et al propose K-Adapters, a knowledge infusion mechanism on top of pre-trained LMs. With K-Adapters, you don't need to train a large Transformer stack from scratch. Instead, the authors suggest placing a few adapter layers in between the layers of already pre-trained frozen models (they experiment with BERT and RoBERTa), for example, after layers 0, 12, and 23. The frozen LM features are concatenated with learnable adapter features and trained on a set of new tasks: here, it is 1️⃣ relation prediction based on the T-REx dataset of aligned Wikipedia-Wikidata text-triples; 2️⃣ dependency-tree relation prediction. Experimentally, this approach improves performance on entity typing, commonsense QA, and relation classification tasks.

Source: Wang et al

ERICA pre-training task. Source: Qin et al

Qin et al design ERICA, a contrastive mechanism to enrich LMs with entity and relational information. Specifically, they add two more losses to the standard MLM objective: entity discrimination (🖼👈) and relation discrimination. Taking entity discrimination as an example, pre-training documents have pairwise annotations 🍏🍏 of entity spans. The model is asked to yield higher cosine similarities for true pairs 🍏🍏 than for negative ones 🍏🍅 through a contrastive loss term. ERICA performs particularly well in low-resource fine-tuning scenarios (1–10% of training data) on relation prediction and multi-hop QA tasks.
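To give a flavor of what such an entity-discrimination loss can look like, here is a toy PyTorch sketch, not ERICA's exact formulation: an anchor entity representation is scored against a set of candidate spans by cosine similarity, and a cross-entropy term pushes the true pair above the negatives. The shapes, the temperature, and the random tensors are illustrative placeholders for real encoder outputs.

```python
# Toy contrastive entity-discrimination loss in the spirit of ERICA:
# score candidate entity spans against an anchor by cosine similarity and
# apply an InfoNCE-style cross-entropy so the true pair outranks negatives.
import torch
import torch.nn.functional as F

batch, dim, n_candidates = 4, 256, 8
anchor = torch.randn(batch, dim, requires_grad=True)           # anchor entity span
candidates = torch.randn(batch, n_candidates, dim, requires_grad=True)
positive_idx = torch.zeros(batch, dtype=torch.long)            # positives at index 0 here

# Cosine similarity between each anchor and its candidate spans: batch x n_candidates.
sims = F.cosine_similarity(anchor.unsqueeze(1), candidates, dim=-1)

temperature = 0.05
loss = F.cross_entropy(sims / temperature, positive_idx)
loss.backward()  # in a real setup gradients would flow into the text encoder
print(loss.item())
```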
KG Embeddings and Link Prediction

Can the strengths of multi-relational KG embedding models be their own weaknesses, prone to adversarial attacks? A zoo of algorithms is often compared by their abilities to capture certain relational patterns like symmetry, inversion, composition, and more. A short answer is yes :/

Source: Bhardwaj et al

An insightful work of Bhardwaj et al studies various types and directions of poisoning 🔫 embedding models by adding adversarial triples (check the example illustration 👈). It does assume we have full access to pre-trained weights and can perform forward calls (a white-box setup). After suggesting several ways of searching for adversarial relations and potential decoy entities, experiments show that the most effective attack leverages symmetry 🦋 patterns (at least on the standard FB15k-237 and WN18RR graphs). Interestingly, a convolutional model, ConvE, without geometric or translational priors looks most resilient 🛡 to the designed attacks, i.e., vanilla TransE or DistMult get poisoned more severely.

⚖️ I'd also outline a long-anticipated study of Kamigaito and Hayashi on the theoretical similarities of two popular families of loss functions for training KG embedding models: softmax cross-entropy and negative sampling, and in particular, self-adversarial negative sampling. In numerous studies (e.g., shameless plug, or Ruffinelli et al from ICLR'20) we've seen that models trained with one or another loss exhibit similar performance. And finally, in this work, the authors study their theoretical properties through the lens of Bregman divergence. Two important messages you want to take home after reading this article: 1️⃣ Self-adversarial negative sampling is very similar to cross-entropy with label smoothing. 2️⃣ Cross-entropy models might fit better than negative sampling ones. Pro tip: you can now cite this paper if you forgot to run experiments with more loss functions 😉

Source: Cao et al

🚨 New LP Dataset Alert 🚨 Freebase and WordNet graphs as benchmarks have been there for too long, and we, as a community, should finally adopt new datasets with fewer biases and larger scale as 2021–2022 testing suites. Cao et al explore the test sets of FB15k-237 and WN18RR and find (like in the picture 👈) that, often, test triples are either unpredictable even for humans, or do not make much practical sense. Motivated by that, they created a new set of datasets 💾 InferWiki16K & InferWiki64K (based on Wikidata 😍) where test cases do have grounding in the train set. They also created a set of unknown triples for the triple classification task (in addition to true/false). 🧪 The main hypothesis is confirmed in the experiments: embedding models indeed operate much better on non-random splits where test triples have grounding in train.

Let's welcome 👋 a few new approaches for link prediction. 1️⃣ BERT-ResNet by Lovelace et al encodes entity names and descriptions through BERT and passes triples through a ResNet-style deep CNN with subsequent re-ranking and distillation (quite a bit of everything put in there!). The model yields large improvements 📈 on commonsense-style graphs like SNOMED CT Core and ConceptNet with lots of knowledge encoded in textual descriptions. 2️⃣ Next up, Chao et al propose PairRE, an extension of RotatE where relation embeddings are split into head-specific and tail-specific parts (a minimal scoring sketch follows below). PairRE showed quite competitive results on the OGB datasets. By the way, the model is already available in the PyKEEN library for training and evaluating KG embedding models. 😉 3️⃣ Li et al design CluSTeR, a model for temporal KG link prediction. CluSTeR employs RL at the first, clue-search stage and runs an R-GCN on top of that at the second stage. 4️⃣ Finally, I am excited to see more research on hyper-relational KGs! 🎇 (find my review article here). Wang et al build their GRAN model on top of a Transformer with a modified attention mechanism that includes qualifier interactions. I'd be interested to see its performance on our new WD50K hyper-relational benchmarks!
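For intuition, here is a minimal PyTorch sketch of a PairRE-style scoring function: the relation contributes a head projection r_h and a tail projection r_t, and a triple is scored by how closely the two projected, normalized entity vectors match. The choice of L1 norm and the normalization details are my simplification rather than the paper's exact constraints, and the random vectors stand in for trained embeddings.

```python
# Sketch of the PairRE scoring idea: each relation has a head-specific part r_h
# and a tail-specific part r_t; a triple (h, r, t) scores high when the two
# element-wise projections of the (normalized) entity vectors are close.
import torch
import torch.nn.functional as F

def pairre_score(h, r_h, r_t, t):
    h = F.normalize(h, dim=-1)
    t = F.normalize(t, dim=-1)
    return -torch.norm(h * r_h - t * r_t, p=1, dim=-1)

dim = 200
h, t = torch.randn(dim), torch.randn(dim)       # entity embeddings
r_h, r_t = torch.randn(dim), torch.randn(dim)   # the two halves of the relation
print(pairre_score(h, r_h, r_t, t))
```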
Entity Alignment: 2 New Datasets 💾

In the task of entity alignment (EA), you have two graphs (possibly sharing the same set of relations) with two disjoint sets of entities, like entities from the English and Chinese DBpedia, and you have to identify which entities from one graph can be mapped onto the other. For years ⏳⌛️, entity alignment datasets implied there is a perfect 1-1 mapping between the two graphs, but this is quite an artificial assumption for real-world tasks. Finally, Sun et al study this setup more formally through the notion of dangling entities (those that don't have respective mappings).

Source: Sun et al

The authors build a new dataset 💾, DBP2.0, where only 30–50% of entities are "mappable", with the rest being dangling. It therefore means that your alignment model has to learn a way to decide whether a node can be mapped or not; the authors explore 3 possible approaches for doing that. As most EA benchmarks are already saturated around very high values, it's intriguing to see that adding "noisy" entities drastically drops 📉 the overall performance. One more step towards more practical setups!

Source: Pahuja et al

Often, some edges of a graph are contained implicitly in some text; then we talk about KG-Text alignment. Particularly, we are interested in whether there is any way to enrich graph embeddings with text embeddings and vice versa. Pahuja et al provide a large-scale study of this problem by designing a novel dataset 💾 derived from the whole English Wikipedia and Wikidata: 15M entities and 261M facts 🏋. The authors analyze 4 alignment methods (e.g., projecting KG embeddings into the text embedding space) and train KG/text embeddings jointly. 🧪 Task-wise, the authors measure performance in few-shot link prediction (over the KG triples) and analogical reasoning (over the textual part). Indeed, all 4 alignment methods do improve the quality on both tasks compared to a single modality only, e.g., in analogical reasoning the best method of fusing KG information brings 16% Hits@1 of absolute improvement over the Wikipedia2Vec baseline 💪. On the link prediction task, fusion can yield up to a 10% improvement. It's worth noting that the approach assumes joint training of two separate models. It would be definitely interesting to probe KG-augmented LMs (one model pre-trained on KGs) on this new task, bypassing the alignment issue.

KG Construction, Entity Linking, Relation Extraction

🧩 Automatic KG construction from text is a highly non-trivial and sought-after task suitable for many industrial applications. Mondal et al propose a workflow for KG construction of NLP papers from the ACL Anthology (to which belong, for example, all papers reviewed for this article). The resulting graph is called SciNLP-KG. It's not exactly end-to-end as stated in the title (the authors justify it by error propagation in Section 5) and consists of 3 stages (🖼👇) around relation extraction. SciNLP-KG builds upon a line of previous research (NAACL'21) on extracting mentions of Tasks, Datasets, and Metrics (TDM). The KG schema has 4 distinct predicates, evaluatedOn, evaluatedBy, coreferent, and related, to capture links among TDM entities. The authors build two versions of SciKG: a small MVP and a fully-fledged one with 5K nodes and 15K edges. A solid plus of the approach is that the automatically built big SciKG has a large overlap (about 50% of entities) with Papers With Code! Yes, it's just 4 relations on a restricted domain, but it's a good start; surely, more scalable and end-to-end approaches will follow.

Source: Mondal et al

An unorthodox approach to a known task. Source: Reddit

In the era of neural entity linkers like BLINK and ELQ, a work by Jiang, Gurajada et al takes an unorthodox view on the problem: let's combine textual heuristics together with neural features in a weighted rule-based framework (Logical Neural Nets). In fact, LNN-EL is a component of the bigger neuro-symbolic NSQA system, but more on that in the following KGQA section. 📝 The approach, LNN-EL, requires entity mentions to already be there along with top-K candidates from a target KG. Some textual features can be, for instance, the Jaro-Winkler distance between a mention and candidates, or node centrality in the underlying KG. BLINK and other neural methods can be plugged in as features, too. Then, an expert creates a set of rules with weights, e.g., assign w1 for Jaro-Winkler and w2 for BLINK, and the weights are learned with a margin loss and negative sampling. 🧪 LNN-EL performs on par with BLINK and returns an explainable tree of weighted rules. Moreover, it can generalize onto other datasets that use the same underlying KG 👏 ➖ There exist some drawbacks, too: it seems that BLINK is actually the crucial factor in the overall performance, responsible for 70–80% of the total weights in the rules. So the natural question is: is it practically worth it to come up with sophisticated expert-heavy rules? 🤔 Second, the authors use DBpedia lookup for retrieving top-K candidates and "assume that similar services exist or can be implemented on top of other KGs". Unfortunately, this is often not the case; in fact, such candidate retrieval systems exist only for DBpedia and (partially) Wikidata, while for the rest of the large KGs it's highly non-trivial to create such a mechanism. Nevertheless, LNN-EL lays a strong foundation for neuro-symbolic entity linking for KGQA.

When box embedding dimension is too small. Source: gifsboom.net

Entity linking often goes hand in hand 🤝 with entity typing. Onoe et al tackle the problem of fine-grained entity typing (when you have hundreds or thousands of types) with box embeddings (Box4Types). Usually, fine-grained entities are modeled as vectors, with a dot product between the encoded mention+context vector and a matrix of all type vectors. Instead, the authors propose to move from vectors to 📦 boxes (d-dimensional hyper-rectangles). Moreover, not "just boxes" but Gumbel (soft) boxes (NeurIPS'20) that allow for backprop in corner cases when "just boxes" do not intersect. The 🖼👇 gives a nice intuition: essentially, we model all interactions as geometric operations on 📦, and norming their volume to 1 gives an additional bonus of probabilistic interpretation. 🧪 Experimentally, boxes work at least as well as heavier vector-based models, and in some cases outperform them by a good margin of 5–7 F1 points 👏. Besides, there are numerous qualitative experiments with insightful figures. Overall, I enjoyed reading this paper a lot; I highly recommend it as an example of a strong paper.

Source: Onoe et al
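A toy sketch of that geometric intuition, assuming a simplified hard-box formulation rather than the Gumbel boxes used in the paper: a mention and a type are each a d-dimensional box, and the probability of the type is approximated by the volume of their intersection relative to the mention's volume. All dimensions and box corners below are made up for illustration.

```python
# Toy box-embedding style type scoring: P(type | mention) is approximated by
# vol(intersection) / vol(mention box). A softplus "soft side length" stands in
# for the Gumbel-box machinery that keeps gradients alive for barely- or
# non-overlapping boxes.
import torch
import torch.nn.functional as F

def box_volume(lower, upper):
    # softplus keeps side lengths positive and differentiable near zero overlap
    return F.softplus(upper - lower).prod(dim=-1)

def type_probability(mention_box, type_box):
    (m_lo, m_hi), (t_lo, t_hi) = mention_box, type_box
    inter_lo = torch.maximum(m_lo, t_lo)   # intersection box corners
    inter_hi = torch.minimum(m_hi, t_hi)
    return box_volume(inter_lo, inter_hi) / box_volume(m_lo, m_hi)

d = 16
mention = (torch.zeros(d), torch.ones(d))                   # unit box
person = (torch.full((d,), -0.2), torch.full((d,), 0.8))    # a "type" box
print(type_probability(mention, person))
```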
Let's add a few words on Relation Extraction papers that slightly improve SOTA on several benchmarks. Hu et al investigate how pre-trained KG entity embeddings can help in bag-level relation extraction (in fact, just a bit), and create a new dataset BagRel-Wiki73K 💾 based on entities and relations from Wikidata! Tian et al present a stereoscopic 🧊 perspective, StereoRel, on the RE task, i.e., entities, relations, and words in a paragraph can be modeled as a 3D cube. A BERT encoding of a passage is sent to several decoders to reconstruct a correct relational triple. Finally, Nadgeri et al present KGPool, where known entities from a sentence induce a local neighborhood with subsequent GCN layers and pooling.

Question Answering over KGs: Temporal, Conversational, AMR

Contemporary KGQA focuses predominantly on classical static graphs, i.e., when you have a fixed set of entities and edges, and questions do not have any temporal dimension. ⏳ But the time has come! Saxena et al introduce a large-scale task of QA over Temporal KGs, those that have a timestamp on each edge indicating its validity, like (Barack Obama, position held, POTUS, 2008, 2016). It opens up a whole new variety of simple and complex questions around the time dimension: "Who was POTUS before/after Obama?", "Who portrayed Iron Man when Obama was POTUS?", and so on. The authors created a new dataset 💾 CronQuestions (based on Wikidata 😍) with 410K questions over a KG with 123K entities, ~200 relations, and 300K triples enriched with timestamps. 🧐 Not surprisingly, BERT and T5 cannot handle such questions with any decent accuracy, so the authors combine EmbedKGQA (an approach from ACL'20 that we highlighted in the previous ACL review) with pre-trained temporal KG embeddings TNT-ComplEx (from the ICLR'20 review; see, with this series you can stay up-to-date with most of the recent goodies 😉) in a new model, CronKGQA. Essentially, we take a BERT embedding of a sentence as a relation embedding and pass it into static & temporal scoring functions, as depicted below. 🧪 Experimentally, CronKGQA yields around 99% Hits@1 for simple questions but still has room for improvement on more complex ones. For the rest of the KGQA community: look, there is a new non-saturated benchmark 👀!

Source: Saxena et al
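As a rough illustration of plugging a question embedding into a temporal scoring function, here is a tiny PyTorch sketch of a ComplEx-style temporal score, Re(<e_subj, q * t, conj(e_obj)>), where the question embedding plays the role of the relation. The random complex vectors stand in for pre-trained TNT-ComplEx embeddings and a projected BERT question encoding; this is my simplified reading of the idea, not the paper's exact model.

```python
# Toy CronKGQA-flavoured scoring: a (projected) question embedding replaces the
# relation embedding inside a temporal ComplEx-style score.
import torch

dim = 100
e_subj = torch.randn(dim, dtype=torch.cfloat)   # subject entity embedding
e_obj = torch.randn(dim, dtype=torch.cfloat)    # candidate answer entity
t_emb = torch.randn(dim, dtype=torch.cfloat)    # timestamp embedding
q_emb = torch.randn(dim, dtype=torch.cfloat)    # question acts as the relation

def temporal_score(subj, question, timestamp, obj):
    # Re(<subj, question * timestamp, conj(obj)>)
    return torch.real((subj * question * timestamp * obj.conj()).sum())

print(temporal_score(e_subj, q_emb, t_emb, e_obj))
```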
🗣 Conversational KGQA deals with sequential question-answer steps where context and dialogue history are of higher importance when generating queries to an underlying KG and forming predictions. In conversational KGQA, follow-up questions are often the hardest to deal with. Traditionally, dialogue history is encoded as one vector and there is no special treatment of recent entities. Furthermore, explicit entity naming in follow-up questions is often omitted (since humans are generally good at coreference resolution), so the natural question is: how can we keep track of the most relevant entity in the current conversation?

🎯 Lan and Jiang propose a concept of focal entities, i.e., the entity being discussed in a conversation, about which we'll most probably ask follow-up questions. The approach assumes we have access to a SPARQL endpoint to query the KG on the fly (obviously, it's not end-to-end neural, but instead we can operate on much bigger graphs, of the scale of the whole Wikidata). The main idea is that we can dynamically change the focus of an ongoing conversation by computing a distribution over entities in an Entity Transition Graph. 1️⃣ First, we build such an ETG by expanding the graph around the starting node of a conversation (by 1–2 hops). 2️⃣ Then, the ETG is passed through a GCN encoder to get updated entity states. 3️⃣ Updated entity states are aggregated with the dialogue history in the Focal Entity Predictor (see the illustration below), which builds a distribution over entities as to being the focal entity. 4️⃣ Finally, the updated distribution is sent to an off-the-shelf Answer Predictor that returns an answer to the current utterance. 🧪 The idea of changing focal entities yields significant gains (10 points on average over strong baselines) on ConvQuestions and the conversational version of CSQA 💾! The biggest error source stems from incorrect relation prediction, so there is room to improve for sure.

Source: Lan and Jiang

Finally, let's talk vanilla KGQA with one question and one answer given a graph. Kapanipathi and 29 more folks from IBM Research present a huge neuro-symbolic KGQA system, NSQA, built around AMR parsing. NSQA is a pipelined system with specifically tailored components 🧩. That is, an input question is first parsed into an AMR tree (pre-trained component 1️⃣), then entities in the tree are linked to a background KG (2️⃣ that's the LNN-EL described above!). A query graph is constructed via rule-based BFS traversal of the AMR tree. And Relation Linking is a separate component, SemRel (3️⃣ presented in another ACL'21 paper by Naseem et al). NSQA heavily relies on AMR frames and their interconnections for better parsing of a tree into a SPARQL query, e.g., an "amr-unknown" node will become a variable in the query. For sure, a lot of work was put into meticulously created rules to process the AMR output 👏 On the other hand, all other components employ Transformers in one way or another. Looks pretty neuro-symbolic indeed! 🧪 Experimentally, AMR parsing is ~84% accurate (compared to parses created by human experts) on the LC-QuAD 1.0 benchmark, while the overall NSQA improves the F1 measure by ~11 points. Some openly available source code would be quite handy, dear IBM 😉

Source: Kapanipathi et al

tl;dr

You made it to the final section! From the table of contents or after reading some relevant sections, either way, thank you for your time and interest in this area 😍. Let me know in the comments what you think about this whole endeavor and the format in general! From neural databases to question answering, KGs are being applied in more tasks than ever. Overall, I think it's a great time to do KG research: you can always find a niche and tackle both theoretical and practical challenges that might be used by (tens of) thousands of folks in the community. Looking forward to what we'll see at the next conference!

Next bunch of papers! Source: tenor.com


