KGCNs: Machine Learning over Knowledge Graphs with TensorFlow


Published in Towards Data Science

This project introduces a novel model: the Knowledge Graph Convolutional Network (KGCN), available free to use from the GitHub repo under Apache licensing. It's written in Python, and available to install via pip from PyPi. The principal idea of this work is to forge a bridge between knowledge graphs, automated logical reasoning, and machine learning, using TypeDB as the knowledge graph.

Summary

- A KGCN can be used to create vector representations, embeddings, of any labelled set of TypeDB Things via supervised learning.
- A KGCN can be trained directly for the classification or regression of Things stored in TypeDB.
- Future work will include building embeddings via unsupervised learning.

What is it used for?

Often, data doesn't fit well into a tabular format. There are many benefits to storing complex and interrelated data in a knowledge graph, not least that the context of each data point can be stored in full. However, many existing machine learning techniques rely upon the existence of an input vector for each example. Creating such a vector to represent a node in a knowledge graph is non-trivial. In order to make use of the wealth of existing ideas, tools and pipelines in machine learning, we need a method of building these vectors. In this way we can leverage contextual information from a knowledge graph for machine learning.

This is what a KGCN can achieve. Given an example node in a knowledge graph, it can examine the nodes in the vicinity of that example, its context. Based on this context it can determine a vector representation, an embedding, for that example.

There are two broad learning tasks a KGCN is suitable for:

- Supervised learning from a knowledge graph for prediction, e.g. multi-class classification (implemented), regression, link prediction
- Unsupervised creation of Knowledge Graph Embeddings, e.g. for clustering and node comparison tasks

In order to build a useful representation, a KGCN needs to perform some learning. To do that it needs a function to optimise. Revisiting the broad tasks we can perform, we have different cases to configure the learning:

- In the supervised case, we can optimise for the exact task we want to perform. In this case, embeddings are interim tensors in a learning pipeline.
- To build unsupervised embeddings as the output, we optimise to minimise some similarity metrics across the graph.

Methodology

The ideology behind this project is described here, along with a video of the presentation. The principles of the implementation are based on GraphSAGE, from the Stanford SNAP group, heavily adapted to work over a knowledge graph. Instead of working on a typical property graph, a KGCN learns from contextual data stored in a typed hypergraph, TypeDB. Additionally, it learns from facts deduced by TypeDB's automated logical reasoner. From this point onwards some understanding of TypeDB's docs is assumed.

Now we introduce the key components and how they interact.

KGCN

A KGCN is responsible for deriving embeddings for a set of Things (and thereby directly learning to classify them). We start by querying TypeDB to find a set of labelled examples. Following that, we gather data about the context of each example Thing. We do this by considering their neighbours, and their neighbours' neighbours, recursively, up to K hops away.

We retrieve the data concerning this neighbourhood from TypeDB (diagram above). This information includes the type hierarchy, roles, and attribute value of each neighbouring Thing encountered, and any inferred neighbours (represented above by dotted lines). This data is compiled into arrays to be ingested by a neural network.

Via the operations Aggregate and Combine, a single vector representation is built for a Thing. This process can be chained recursively over K hops of neighbouring Things. This builds a representation for a Thing of interest that contains information extracted from a wide context.

In supervised learning, these embeddings are directly optimised to perform the task at hand. For multi-class classification this is achieved by passing the embeddings to a single subsequent dense layer and determining loss via softmax cross entropy (against the example Things' labels), then optimising to minimise that loss.

A KGCN object brings together a number of sub-components: a Context Builder, Neighbour Finder, Encoder, and an Embedder. The input pipeline components are less interesting, so we'll skip to the fun stuff. You can read about the rest in the KGCN readme.

Embedder

To create embeddings, we build a network in TensorFlow that successively aggregates and combines features from the K hops until a 'summary' representation remains, an embedding (diagram below).

To create the pipeline, the Embedder chains Aggregate and Combine operations for the K hops of neighbours considered, e.g. for the 2-hop case this means Aggregate-Combine-Aggregate-Combine. The diagram above shows how this chaining works in the case of supervised classification. The Embedder is responsible for chaining the sub-components Aggregator and Combiner, explained below.

Aggregator

An Aggregator (pictured below) takes in a vector representation of a sub-sample of a Thing's neighbours. It produces one vector that is representative of all of those inputs. It must do this in a way that is order-agnostic, since the neighbours are unordered. To achieve this we use one densely connected layer, and maxpool the outputs (maxpool is order-agnostic).

Combiner

Once we have Aggregated the neighbours of a Thing into a single vector representation, we need to combine this with the vector representation of that Thing itself. A Combiner achieves this by concatenating the two vectors, and reduces the dimensionality using a single densely connected layer.

Supervised KGCN Classifier

A SupervisedKGCNClassifier is responsible for orchestrating the actual learning. It takes in a KGCN instance and, as for any learner making use of a KGCN, it provides:

- Methods for training/evaluation/prediction
- A pipeline from embedding tensors to predictions
- A loss function that takes in predictions and labels
- An optimiser
- The backpropagation training loop

It must be the class that provides these behaviours, since a KGCN is not coupled to any particular learning task. This class, therefore, provides all of the specialisations required for a supervised learning framework. Below is a slightly simplified UML activity diagram of the program flow.

Build with KGCNs

To start building with KGCNs, take a look at the readme's quickstart. Ensure that you have all of the requirements and follow the sample usage instructions. This will get you on your way to building a multi-class classifier for your own knowledge graph! There's also an example in the repository with real data that should fill in any gaps the template usage misses.
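To make the Aggregator concrete, here is a minimal NumPy sketch of "one densely connected layer, then maxpool". The shared layer transforms each neighbour independently, and the element-wise max makes the result order-agnostic. The weights, shapes and names here are invented for illustration; the actual kglib implementation is a TensorFlow network.

```python
import numpy as np

rng = np.random.default_rng(0)

def aggregate(neighbours, W, b):
    """One shared densely connected layer applied to each neighbour,
    followed by an element-wise max-pool over the neighbour axis.
    Max-pooling is order-agnostic, so neighbour ordering cannot
    affect the result."""
    transformed = np.maximum(neighbours @ W + b, 0.0)  # (n_neighbours, out_dim)
    return transformed.max(axis=0)                     # (out_dim,)

# Toy check: shuffling the (unordered) neighbours leaves the output unchanged.
feat_dim, out_dim = 4, 8
W = rng.standard_normal((feat_dim, out_dim))
b = np.zeros(out_dim)
neighbours = rng.standard_normal((5, feat_dim))
shuffled = neighbours[rng.permutation(5)]
print(np.allclose(aggregate(neighbours, W, b), aggregate(shuffled, W, b)))  # True
```

Any permutation-invariant pooling (mean, sum) would also work here; maxpool is what the article's Aggregator uses.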
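The Combiner, and the Embedder's Aggregate-Combine chaining for the 2-hop case, can be sketched the same way. Again, this is an illustrative NumPy sketch with invented shapes and names, not the actual TensorFlow code:

```python
import numpy as np

rng = np.random.default_rng(1)

def aggregate(neighbours, W, b):
    """Aggregator: shared dense layer per neighbour, then order-agnostic max-pool."""
    return np.maximum(neighbours @ W + b, 0.0).max(axis=0)

def combine(self_vec, neighbour_summary, W, b):
    """Combiner: concatenate a Thing's own representation with its
    aggregated neighbourhood, then reduce the dimensionality with a
    single densely connected layer."""
    concat = np.concatenate([self_vec, neighbour_summary])
    return np.maximum(concat @ W + b, 0.0)

# 2-hop chain (Aggregate-Combine-Aggregate-Combine): first refine each
# 1-hop neighbour using its own neighbours, then build the final
# embedding for the Thing of interest.
dim = 4
W_agg, b_agg = rng.standard_normal((dim, dim)), np.zeros(dim)
W_com, b_com = rng.standard_normal((2 * dim, dim)), np.zeros(dim)

thing = rng.standard_normal(dim)          # the example Thing's own features
hop1 = rng.standard_normal((3, dim))      # its 1-hop neighbours
hop2 = rng.standard_normal((3, 5, dim))   # each 1-hop neighbour's neighbours

refined_hop1 = np.stack([
    combine(n, aggregate(nn, W_agg, b_agg), W_com, b_com)
    for n, nn in zip(hop1, hop2)
])
embedding = combine(thing, aggregate(refined_hop1, W_agg, b_agg), W_com, b_com)
print(embedding.shape)  # (4,)
```

Note that the concatenation doubles the width, which is why the Combiner's dense layer is the step that reduces dimensionality back down.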
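For the supervised case, the "single subsequent dense layer plus softmax cross entropy" step described above looks roughly like this. This is a NumPy sketch of the loss computation only; kglib builds this in TensorFlow, where the framework's built-in loss and optimiser handle the backpropagation.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy between class scores and integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

rng = np.random.default_rng(2)
embeddings = rng.standard_normal((4, 16))  # one embedding per example Thing
W = rng.standard_normal((16, 3))           # dense layer: embedding -> 3 class scores
b = np.zeros(3)
logits = embeddings @ W + b
labels = np.array([0, 2, 1, 0])            # the example Things' labels
loss = softmax_cross_entropy(logits, labels)
# Training would now minimise `loss` with respect to W, b and the
# network weights that produced the embeddings.
```

Because the loss is taken on the task output, the embeddings themselves are just interim tensors here, exactly as described for the supervised configuration.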
If you like what we're up to, and you use or are interested in KGCNs, there are several things you can do:

- Submit an issue for any problems you encounter installing or using KGCNs
- Star the repo if you're inclined to help us raise the profile of this work :)
- Ask questions, propose ideas or have a conversation with us on the Vaticle Discord channel

This post has been written for the third kglib pre-release, using TypeDB commit 20750ca0a46b4bc252ad81edccdfd8d8b7c46caa, and may subsequently fall out of line with the repo. Check there for the latest!

James Fletcher: Principal Scientist at Vaticle. Researching the intelligent systems of the future, enabled by Knowledge Graphs.


