Building The Enterprise Data Lake With Cloudera & Cisco

Upload by : Dahlia Ryals

Building the Enterprise Data Lake with Cloudera & Cisco
Prepared by:
Marilyn Tan, Country Manager Singapore
Xue Daming, Senior Systems Engineer
Cloudera, Inc. All rights reserved.

Digital Transformation with Data

DATA is Transforming the World!
Themes: CONNECTED WORLD, MORE DATA, INTERNET OF THINGS, INDUSTRY 4.0 (4th), DATA AS PRODUCTION FACTOR, SMART APPs, DIGITAL DEMOCRACY & SECURITY, THE NEW UX, RISE OF OPEN SOURCE
- Smart Things / Devices
- New Analytics
- Connected Experience
- New Use Cases
- Machine Learning & AI
- New Architectures
- New data sources
- By 2020: 20.8b devices
- Data Virtualization
- Ubiquitous computing: everywhere and on every device (Voice, VR, AR, mobile, Wearables)
- IoT: 1.7 trillion in 2020
- Data Science
- Digital Sharing Economy: Open Data & Algorithms
- Enterprise-ready Open Source (e.g. Apache)
- Digital (distributed) Trust (esp. Blockchain)

The 9 year Cloudera journey
Timeline years: 2008, 2011, 2012, 2013, 2014, 2016
- Cloudera founded by Mike Olson, Amr Awadallah, Jeff Hammerbacher and Christophe Bisciglia; joined by Doug Cutting (2009)
- Cloudera reaches 100 production customers
- Cloudera Enterprise 4: the standard for Hadoop in the enterprise
- Cloudera expands beyond MR and HBase, introducing Impala, Solr and Spark
- Cloudera focuses on security and governance with Navigator 2, and on cloud with Director
- Navigator Optimizer general availability; improved cloud coverage with AWS, Azure and GCP clouds
Timeline years: 2009, 2011, 2012, 2014, 2015
- CDH: first commercial Apache Hadoop distribution, and Cloudera Manager
- Cloudera University expands to 140 countries
- Support implements follow-the-sun model
- Cloudera Connect reaches 300 partners across SI, hardware, and software partners
- Cloudera introduces the Enterprise Data Hub and Cloudera Enterprise 5
- Cloudera includes Kafka, Kudu and Record Service within Cloudera Enterprise
Products shown: Cloudera Enterprise 4, CDH / CM, Enterprise Data Hub, Altus, CDSW
2017: Cloudera acquired Fast Forward Labs, announced PaaS Altus, Data Science Workbench, Shared Data Experience (SDX), and more to come!

What Happens Next: A Decade of Hadoop
18 projects and beyond

Gartner Analytics Ascendancy Model
[Figure: analytics questions plotted by Value against Difficulty: "Why did it happen?", "What will happen?", "What will make it happen?"]

Cloudera & Cisco Enterprise Data Lake Innovation

Cisco UCS Integrated Infrastructure with Cloudera for IoT
- Data Analytics
- Real-Time Data Inject (CoAP/MQTT/XMPP)
- Fog: C800 / UCS Mini / UCS C240; ISR 8x9 with 4G LTE and Dual 802.11na/g/n (WiFi) Radios; managed by Cisco Fog Director
- Real-Time Data Store: UCS C220/C240
- Data Processing
- Kafka: Cisco UCS C240
- Data Aggregator: Cisco UCS C240
- Speed Layer
- Batch Layer
- Batch Big Data Store: UCS C240/C3160
- Serving Layer
Cisco UCS at all layers, fully validated architectures with all major players

Fabric Centric Design
- High Performance: 40 GB/s Ethernet; 320 GB/s per chassis
- Unified Fabric: single cable for network, storage, and management traffic (UCS Manager: Management, Ethernet, Storage)
- Easy to Scale: single point of management; add cables for bandwidth vs. fabric type

Management Simplicity
- Big Data: Management Consistency. Hundreds of servers, thousands of management points
- Simplified Scalability: easily scale your infrastructure from a few servers to thousands of servers with a fully integrated infrastructure
- Centralized Management:
  - UCS Service Profile: Service Profiles for servers, manage all servers centrally
  - Cisco ACI Application Profile: Application Profiles for network, manage all network centrally

The enterprise platform for machine learning
500 CUSTOMERS RUN ON
PATTERN RECOGNITION: DRIVE CUSTOMER INSIGHTS
- Market segmentation
- Customer 360
- Next best offer
- Churn analysis & prevention
DETECTION: PROTECT BUSINESS
- Cybersecurity
- Fraud
- Anti-money laundering
- Risk modeling & assessment
- SPAM detection
PREDICTION: CONNECT PRODUCTS & SERVICES (IoT)
- Predictive maintenance
- Genomics & personalized medicine
- Predicting and preventing disease
- Natural language

Machine learning requires a complete stack.
Business Integration
Prepare
- Load external data
- Process structured data
- Process unstructured data
- Process streaming data
- Cleanse data
- Vectorize data
Tools: Batch Processing, Stream Processing, Interactive SQL, Search Tools, Text/Image Processing
Analyze Data
- Diagnose/treat data issues
- Design experiments
- Partition data
- Engineer features
- Train and validate models
- Evaluate and assess models
Tools: Analytic Languages, ML Libraries
Deploy
- Publish to BI/Viz
- Deploy to batch scoring
- Deploy to real-time scoring
- Deploy to scoring API
- Manage models
- Monitor model performance
Tools: BI/Viz, Interactive SQL, Batch Processing, Stream Processing, Operational DB
Administration, Governance and Security

A complete, integrated enterprise platform
- Cloudera Enterprise Data Hub
- Cloudera Distribution for Hadoop

Cloudera Data Science Workbench
Supports data science end-to-end
- Full access to data
- Secure self-service provisioning
- Containerized environments
- Supports Python, R, and Scala
- Automates: workflow, version control, collaboration, sharing

CDSW Benefits
Data Scientists
- Web browser, no desktop footprint
- Use R, Python, or Scala
- Install any library or framework
- Isolated project environments
- Direct access to data in secure clusters
- Share insights with team
- Reproducible, collaborative research
- Automate and monitor data pipelines
- Built-in job scheduling
IT
- Support self-service data science
- Full platform security
- Kerberos authentication
- Run on-premises or in the cloud

Deep learning in Cloudera with Apache Spark
Spark Packages
- Two packages: CaffeOnSpark, TensorFlowOnSpark
- Developed by Yahoo
- Python and Scala APIs
- All DL architectures
- Integrated pipeline
- Run on existing clusters
- Training and inference
DL4J
- Open source DL library
- Developed by Skymind
- Built on JVMs
- Supports CPUs and GPUs
- Java, Scala, Python APIs
- Training and inference
- Imports models from: TensorFlow, Caffe, Torch, Theano
- Runs on existing clusters
BigDL
- Deep learning framework
- Developed by Intel
- Supports CPUs only
- Leverages Intel MKL
- Scala, Python APIs
- Imports models from: TensorFlow, Caffe, Torch
- Runs on existing clusters

New! Accelerated deep learning on-demand with GPUs
“Our data scientists want GPUs, but we can’t find a way to deliver multi-tenancy. If they go to the cloud on their own, it’s expensive and we lose governance.”
- Extend existing CDSW benefits to GPU-optimized deep learning tools
- Schedule & share GPU resources
- Train on GPUs, deploy on CPUs
- Works on-premises or cloud
Multi-tenant GPU support on-premises or cloud
[Figure: Data Science Workbench with GPU for single-node training; CDH with CPUs for distributed training and scoring]

Enterprise Data Lake Architecture

Canonical Ingestion & Spark Streaming Analytics with Cisco Big Data Analytics Platform
- Integrate with Apache Spark Streaming for real-time analysis of data
- Write back to Kafka for further processing or to send to an application layer
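The Kafka-in, Spark, Kafka-out flow on this slide can be sketched roughly as below. This is an illustration only, not the validated design: it uses Spark's Structured Streaming API rather than the older DStream-based Spark Streaming, and the function name `build_stream`, the topic names, the one-minute count per key, and the checkpoint directory are all assumptions chosen for the example. It also assumes the Spark Kafka connector package is available on the cluster.

```python
def build_stream(spark, bootstrap_servers, source_topic, sink_topic, checkpoint_dir):
    """Read events from Kafka, count them per key per minute, write counts back to Kafka.

    All arguments are hypothetical placeholders supplied by the caller;
    `spark` is an existing SparkSession.
    """
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.sql import functions as F

    # Ingest: subscribe to the source topic as an unbounded stream.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", source_topic)
        .load()
    )

    # Analyze: a simple real-time aggregate (events per key per one-minute window).
    counts = (
        events.select(F.col("key").cast("string").alias("key"), F.col("timestamp"))
        .withWatermark("timestamp", "1 minute")
        .groupBy(F.window("timestamp", "1 minute"), "key")
        .count()
    )

    # Write back: the Kafka sink expects a string or binary `value` column.
    out = counts.select(
        F.col("key"),
        F.to_json(F.struct("window", "count")).alias("value"),
    )
    return (
        out.writeStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("topic", sink_topic)
        .option("checkpointLocation", checkpoint_dir)
        .outputMode("update")
        .start()
    )
```

A downstream application layer would then consume `sink_topic` like any other Kafka topic.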

Proposed Architecture for Enterprise Data Platform
[Figure: architecture diagram with the following labels]
- AI Platform
- Data Sources: Meteorological Data, Sensors, Geospatial
- Advanced Analytics
- Data Visualization
- Data Warehouse: BW, HANA, BPC, any SAP NW
- Enterprise Data Warehouse

Big Data Blueprints: Cisco Validated Designs
Cisco Validated Designs with Cloudera
What you get
- Industry-leading partnerships
- Tested and validated reference architectures to meet performance, capacity, and scale
- Joint engineering lab
- Extensive options for data management (Hadoop, MPP, and NoSQL) to meet your business needs
- Solution bundles optimized for cost of ownership and ease of ordering
Solution designed, tested, and documented to facilitate faster, more reliable, and more predictable customer deployments.

Our Customers’ Success Stories

CASE STUDY: DATA-DRIVEN PRODUCTS
TRANSPORTATION » PREDICTIVE MAINTENANCE » IMPROVED SERVICE » DATA DRIVEN PRODUCTS
Using Predictive Maintenance to Improve Performance and Reduce Fleet Downtime
- OnCommand Connection is collecting telematics and geolocation data across the fleet
- Reduced maintenance costs to .03 per mile from .12-.15 per mile
- Centralizing data from 13 systems with varying frequency and semantic definitions
- Real-time visibility of 250,000 trucks in order to improve uptime and vehicle performance
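As a quick check on the per-mile figures quoted in this case study, the relative saving can be worked out as follows. This is a small sketch: the helper name `savings_fraction` is made up for illustration, and the slide does not state the currency unit, so none is assumed.

```python
def savings_fraction(before: float, after: float) -> float:
    """Return the fractional cost reduction going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before

# Per-mile maintenance cost figures quoted on the slide (currency unit not stated).
COST_BEFORE_LOW, COST_BEFORE_HIGH = 0.12, 0.15
COST_AFTER = 0.03

low = savings_fraction(COST_BEFORE_LOW, COST_AFTER)
high = savings_fraction(COST_BEFORE_HIGH, COST_AFTER)
print(f"Reduction: {low:.0%} to {high:.0%}")  # prints "Reduction: 75% to 80%"
```

In other words, the quoted drop corresponds to maintenance costs falling by roughly three quarters to four fifths per mile.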

CASE STUDY: DATA-DRIVEN PRODUCTS, PROCESS
TRAVEL & TRANSPORTATION » SMART BUILDINGS » PREDICTIVE MAINTENANCE » ADVANCED ANALYTICS
Smart Buildings - Preventative Maintenance
Using Sensors & IoT to Improve Passenger Safety and Airport Efficiency
Challenge:
- Improve traveler satisfaction and safety by reducing downtime for critical operational machinery
Solution:
- Cloudera on Azure to capture, secure, and correlate sensor (IoT) data collected from escalators, elevators, and baggage carousels
- Provide necessary fixes to prevent unplanned downtime

CASE STUDY: 2016 Data Impact Award Winner
State of Kentucky Department of Transportation
Smart Cities
Enabling the State of Kentucky to manage snow and ice events in real time
Challenge:
- Needed a more efficient approach to inclement weather road management
Solution:
- Real-time weather response system that incorporates real-time data from Waze, HERE, ESRI’s GeoEvent processor, and Automatic Vehicle Locations (providing sensor data from salt trucks)
- KYTC aggregates 15-20 million records every day and processes more than a million records per second
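To put the two volume figures in this case study side by side, the daily record count can be converted to an average per-second rate. This is a back-of-the-envelope sketch; the helper name `average_rate_per_second` is hypothetical and the calculation assumes records arrive evenly across the day, which the slide does not claim.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def average_rate_per_second(records_per_day: float) -> float:
    """Average records per second if the daily volume were spread evenly."""
    return records_per_day / SECONDS_PER_DAY

low = average_rate_per_second(15_000_000)
high = average_rate_per_second(20_000_000)
print(f"{low:.0f} to {high:.0f} records per second on average")
# prints "174 to 231 records per second on average"
```

Under that assumption the daily aggregate averages only a few hundred records per second, so the figure of more than a million records per second is best read as processing throughput rather than as the sustained arrival rate.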

Data Protection & Governance
Stages:
- 0 – Data in Isolation
- 1 – Behavior and Transaction Fusion
- 2 – Expanded Data Surface Area
- 3 – EDH: Secure Data Vault
Fully Compliance Ready; Audit-Ready & Protected
- Data in application silos; limited insights; summarized
- Basic Security Controls: Authorization, Authentication, Comprehensive Auditing
- Data Security & Governance: Lineage Visibility, Metadata Discovery, Encryption & Key Management
- Audit Ready For: PCI, PII
- Full encryption, key management, transparency, and enforcement for all data at-rest and data-in-motion
Security Compliance & Risk Mitigation

Why Cloudera
The Platform for Next-Generation Analytics
Cloudera Enterprise delivers the capabilities required by the largest enterprises, spanning analytics, security, governance, and management. We make Hadoop fast, easy, and secure.
The Experience to Help You Succeed
No one knows Hadoop like Cloudera. As the first Hadoop company, Cloudera is the world’s leading contributor to and provider of enterprise Hadoop, with experience you can rely on to help you succeed.
Open Innovation
Our unique hybrid open source strategy enables us to lead the enterprise expansion of the Hadoop ecosystem, driving innovative new capabilities and open standards in the community.

Thank you
marilyn@cloudera.com 65 9822 2338
daming@cloudera.com 65 9368 2316
