What is Big Data Analytics and Why is it Important? - TechTarget


Learn how big data analytics works, the importance it can have for the businesses that use it, and how it can help increase revenues and improve business ...

By Wesley Chai, Technical Writer; Mark Labbe; Craig Stedman, Industry Editor

What is big data analytics?

Big data analytics is the often complex process of examining big data to uncover information -- such as hidden patterns, correlations, market trends and customer preferences -- that can help organizations make informed business decisions.

On a broad scale, data analytics technologies and techniques give organizations a way to analyze data sets and gather new information. Business intelligence (BI) queries answer basic questions about business operations and performance.

Big data analytics is a form of advanced analytics, which involves complex applications with elements such as predictive models, statistical algorithms and what-if analysis powered by analytics systems.

Why is big data analytics important?

Organizations can use big data analytics systems and software to make data-driven decisions that can improve business-related outcomes. The benefits may include more effective marketing, new revenue opportunities, customer personalization and improved operational efficiency. With an effective strategy, these benefits can provide competitive advantages over rivals.

How does big data analytics work?

Data analysts, data scientists, predictive modelers, statisticians and other analytics professionals collect, process, clean and analyze growing volumes of structured transaction data as well as other forms of data not used by conventional BI and analytics programs.

Here is an overview of the four steps of the big data analytics process:

1. Data professionals collect data from a variety of different sources. Often, it is a mix of semistructured and unstructured data. While each organization will use different data streams, some common sources include:
   - internet clickstream data;
   - web server logs;
   - cloud applications;
   - mobile applications;
   - social media content;
   - text from customer emails and survey responses;
   - mobile phone records; and
   - machine data captured by sensors connected to the internet of things (IoT).
2. Data is prepared and processed. After data is collected and stored in a data warehouse or data lake, data professionals must organize, configure and partition the data properly for analytical queries. Thorough data preparation and processing makes for higher performance from analytical queries.
3. Data is cleansed to improve its quality. Data professionals scrub the data using scripting tools or data quality software. They look for any errors or inconsistencies, such as duplications or formatting mistakes, and organize and tidy up the data.
4. The collected, processed and cleaned data is analyzed with analytics software. This includes tools for:
   - data mining, which sifts through data sets in search of patterns and relationships
   - predictive analytics, which builds models to forecast customer behavior and other future actions, scenarios and trends
   - machine learning, which taps various algorithms to analyze large data sets
   - deep learning, which is a more advanced offshoot of machine learning
   - text mining and statistical analysis software
   - artificial intelligence (AI)
   - mainstream business intelligence software
   - data visualization tools

Key big data analytics technologies and tools

Many different types of tools and technologies are used to support big data analytics processes. Common technologies and tools used to enable big data analytics processes include:

- Hadoop, which is an open source framework for storing and processing big data sets. Hadoop can handle large amounts of structured and unstructured data.
- Predictive analytics hardware and software, which process large amounts of complex data, and use machine learning and statistical algorithms to make predictions about future event outcomes. Organizations use predictive analytics tools for fraud detection, marketing, risk assessment and operations.
- Stream analytics tools, which are used to filter, aggregate and analyze big data that may be stored in many different formats or platforms.
- Distributed storage data, which is replicated, generally on a non-relational database. Replication can serve as a safeguard against independent node failures and lost or corrupted big data, or provide low-latency access.
- NoSQL databases, which are non-relational data management systems that are useful when working with large sets of distributed data. They do not require a fixed schema, which makes them ideal for raw and unstructured data.
- A data lake, which is a large storage repository that holds native-format raw data until it is needed. Data lakes use a flat architecture.
- A data warehouse, which is a repository that stores large amounts of data collected by different sources. Data warehouses typically store data using predefined schemas.
- Knowledge discovery/big data mining tools, which enable businesses to mine large amounts of structured and unstructured big data.
- In-memory data fabric, which distributes large amounts of data across system memory resources. This helps provide low latency for data access and processing.
- Data virtualization, which enables data access without technical restrictions.
- Data integration software, which enables big data to be streamlined across different platforms, including Apache, Hadoop, MongoDB and Amazon EMR.
- Data quality software, which cleanses and enriches large data sets.
- Data preprocessing software, which prepares data for further analysis. Data is formatted and unstructured data is cleansed.
- Spark, which is an open source cluster computing framework used for batch and stream data processing.

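The four-step process described earlier (collect, prepare, cleanse, analyze) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not how any of the tools listed above implement it: the sample events, the field names (`user`, `page`, `ts`) and the function names (`prepare`, `cleanse`, `analyze`, `run_pipeline`) are all hypothetical, and in-memory lists stand in for a data lake or data warehouse.

```python
from collections import Counter

# Step 1: collect. Hypothetical raw events standing in for clickstream data
# and web server logs; real sources would be files, queues or APIs.
RAW_EVENTS = [
    {"user": "u1", "page": "/home ", "ts": "2021-12-01T10:00:00"},
    {"user": "u1", "page": "/home ", "ts": "2021-12-01T10:00:00"},  # duplicate
    {"user": "u2", "page": "/Pricing", "ts": "2021-12-01T10:05:00"},
    {"user": "u3", "page": "/pricing", "ts": "2021-12-02T09:30:00"},
    {"user": None, "page": "/home", "ts": "2021-12-02T09:31:00"},  # no user
]


def prepare(events):
    """Step 2: organize records and partition them by calendar day."""
    partitions = {}
    for event in events:
        day = event["ts"][:10]  # "YYYY-MM-DD" prefix of the ISO timestamp
        partitions.setdefault(day, []).append(event)
    return partitions


def cleanse(events):
    """Step 3: drop incomplete rows, fix formatting, remove duplicates."""
    seen = set()
    clean = []
    for event in events:
        if not event["user"]:
            continue  # incomplete record
        page = event["page"].strip().lower()  # normalize formatting
        key = (event["user"], page, event["ts"])
        if key in seen:
            continue  # exact duplicate
        seen.add(key)
        clean.append({"user": event["user"], "page": page, "ts": event["ts"]})
    return clean


def analyze(events):
    """Step 4: a deliberately trivial analysis, counting views per page."""
    return Counter(event["page"] for event in events)


def run_pipeline(raw_events):
    """Run cleanse and analyze on each daily partition from prepare."""
    return {
        day: analyze(cleanse(rows))
        for day, rows in prepare(raw_events).items()
    }


if __name__ == "__main__":
    for day, counts in sorted(run_pipeline(RAW_EVENTS).items()):
        print(day, dict(counts))
```

At real volumes, the same steps would typically run on a distributed engine such as Spark rather than in a single Python process; the sketch is meant to show the structure of the process, not its scale.
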
Big data analytics applications often include data from both internal systems and external sources, such as weather data or demographic data on consumers compiled by third-party information services providers. In addition, streaming analytics applications are becoming common in big data environments as users look to perform real-time analytics on data fed into Hadoop systems through stream processing engines, such as Spark, Flink and Storm.

Early big data systems were mostly deployed on premises, particularly in large organizations that collected, organized and analyzed massive amounts of data. But cloud platform vendors, such as Amazon Web Services (AWS), Google and Microsoft, have made it easier to set up and manage Hadoop clusters in the cloud. The same goes for Hadoop suppliers such as Cloudera, which supports the distribution of the big data framework on the AWS, Google and Microsoft Azure clouds. Users can now spin up clusters in the cloud, run them for as long as they need and then take them offline with usage-based pricing that doesn't require ongoing software licenses.

Big data has become increasingly beneficial in supply chain analytics. Big supply chain analytics utilizes big data and quantitative methods to enhance decision-making processes across the supply chain. Specifically, big supply chain analytics expands data sets for increased analysis that goes beyond the traditional internal data found on enterprise resource planning (ERP) and supply chain management (SCM) systems. Also, big supply chain analytics implements highly effective statistical methods on new and existing data sources.

Big data analytics is a form of advanced analytics, which has marked differences compared to traditional BI.

Big data analytics uses and examples

Here are some examples of how big data analytics can be used to help organizations:

- Customer acquisition and retention. Consumer data can help the marketing efforts of companies, which can act on trends to increase customer satisfaction. For example, personalization engines for Amazon, Netflix and Spotify can provide improved customer experiences and create customer loyalty.
- Targeted ads. Personalization data from sources such as past purchases, interaction patterns and product page viewing histories can help generate compelling targeted ad campaigns for users on the individual level and on a larger scale.
- Product development. Big data analytics can provide insights that inform product viability, development decisions and progress measurement, and steer improvements toward what fits a business's customers.
- Price optimization. Retailers may opt for pricing models that use and model data from a variety of data sources to maximize revenues.
- Supply chain and channel analytics. Predictive analytical models can help with preemptive replenishment, B2B supplier networks, inventory management, route optimizations and the notification of potential delays to deliveries.
- Risk management. Big data analytics can identify new risks from data patterns for effective risk management strategies.
- Improved decision-making. Insights business users extract from relevant data can help organizations make quicker and better decisions.

Big data analytics benefits

The benefits of using big data analytics include:

- Quickly analyzing large amounts of data from different sources, in many different formats and types.
- Rapidly making better-informed decisions for effective strategizing, which can benefit and improve the supply chain, operations and other areas of strategic decision-making.
- Cost savings, which can result from new business process efficiencies and optimizations.
- A better understanding of customer needs, behavior and sentiment, which can lead to better marketing insights, as well as provide information for product development.
- Improved, better-informed risk management strategies that draw from large sample sizes of data.

Big data analytics involves analyzing structured and unstructured data.

Big data analytics challenges

Despite the wide-reaching benefits that come with using big data analytics, its use also comes with challenges:

- Accessibility of data. With larger amounts of data, storage and processing become more complicated. Big data should be stored and maintained properly to ensure it can be used by less experienced data scientists and analysts.
- Data quality maintenance. With high volumes of data coming in from a variety of sources and in different formats, data quality management for big data requires significant time, effort and resources to properly maintain it.
- Data security. The complexity of big data systems presents unique security challenges. Properly addressing security concerns within such a complicated big data ecosystem can be a complex undertaking.
- Choosing the right tools. Selecting from the vast array of big data analytics tools and platforms available on the market can be confusing, so organizations must know how to pick the best tool that aligns with users' needs and infrastructure.
- With a potential lack of internal analytics skills and the high cost of hiring experienced data scientists and engineers, some organizations are finding it hard to fill the gaps.

History and growth of big data analytics

The term big data was first used to refer to increasing data volumes in the mid-1990s. In 2001, Doug Laney, then an analyst at consultancy Meta Group Inc., expanded the definition of big data. This expansion described the increasing:

- Volume of data being stored and used by organizations;
- Variety of data being generated by organizations; and
- Velocity, or speed, in which that data was being created and updated.

Those three factors became known as the 3Vs of big data. Gartner popularized this concept after acquiring Meta Group and hiring Laney in 2005.

Another significant development in the history of big data was the launch of the Hadoop distributed processing framework. Hadoop was launched as an Apache open source project in 2006. This planted the seeds for a clustered platform built on top of commodity hardware and that could run big data applications. The Hadoop framework of software tools is widely used for managing big data.

By 2011, big data analytics began to take a firm hold in organizations and the public eye, along with Hadoop and various related big data technologies.

Initially, as the Hadoop ecosystem took shape and started to mature, big data applications were primarily used by large internet and e-commerce companies such as Yahoo, Google and Facebook, as well as analytics and marketing services providers.

More recently, a broader variety of users have embraced big data analytics as a key technology driving digital transformation. Users include retailers, financial services firms, insurers, healthcare organizations, manufacturers, energy companies and other enterprises.

This was last updated in December 2021