Accepted Manuscript
A novel big data analytics framework for smart cities
Ahmed M. Shahat Osman
PII: S0167-739X(17)30744-6
DOI: https://doi.org/10.1016/j.future.2018.06.046
Reference: FUTURE 4308
To appear in: Future Generation Computer Systems
Received date: 27 April 2017
Revised date: 18 March 2018
Accepted date: 25 June 2018
Please cite this article as: A.M.S. Osman, A novel big data analytics framework for smart cities, Future Generation Computer Systems (2018), https://doi.org/10.1016/j.future.2018.06.046
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
A NOVEL BIG DATA ANALYTICS FRAMEWORK FOR SMART CITIES
Ahmed M. Shahat Osman
Luleå Technical University - Department of Computer Science, Electrical and Space Engineering, 971 87 Luleå - SWEDEN
Abstract
The emergence of smart cities aims at mitigating the challenges raised by continuous urbanization and increasing population density in cities. To face these challenges, governments and decision makers undertake smart city projects targeting sustainable economic growth and better quality of life for both inhabitants and visitors. Information and Communication Technology (ICT) is a key enabling technology for city smartening. However, ICT artifacts and applications yield massive volumes of data known as big data. Extracting insights and hidden correlations from big data is a growing trend in information systems to provide better services to citizens and support decision-making processes. However, to extract valuable insights for developing city-level smart information services, the datasets generated from various city domains need to be integrated and analyzed. This process is usually referred to as big data analytics or the big data value chain. Surveying the literature reveals an increasing interest in harnessing big data analytics applications in general and in the area of smart cities in particular. Yet, comprehensive discussions on the essential characteristics of big data analytics frameworks fitting smart cities requirements are still needed. This paper presents a novel big data analytics framework for smart cities called “Smart City Data Analytics Panel – SCDAP”. The design of SCDAP is based on answering the following research questions: what are the characteristics of big data analytics frameworks applied in smart cities in the literature, and what are the essential design principles that should guide the design of big data analytics frameworks to serve smart cities purposes?
In answering these questions, we adopted a systematic literature review on big data analytics frameworks in smart cities. The proposed framework introduces new functionalities to big data analytics frameworks, represented in data model management and aggregation. The value of the proposed framework is discussed in comparison to traditional knowledge discovery approaches.
Keywords: Analytics framework; Apache Hadoop; Apache Spark; Big data; Smart cities
1. Introduction
The concept of smart cities emerged as a strategy to mitigate the unprecedented challenges of continuous urbanization and increasing population density, and at the same time provide better quality of life to citizens and visitors [1]. A smart city is composed of smart components such as smart buildings, smart farms and smart hospitals, which constitute various city domains where the label “smart” has different connotations in each domain [2]. ICT applications and intensive use of digital artifacts such as sensors, actuators and mobiles are essential means for realizing smartness in any smart city domain [3]. However, “smartening” of various city domains is not enough for a city to be smart; the interrelationship between the underlying city domains should also be taken into account to realize city smartness [3, 4]. As such, a smart city is viewed as a whole body of systems or a system of systems. This integrated view of a smart city implies cross-domain sharing of information [5]. This holistic view distinguishes the meaning of “smart” in the context of a “smart city” from the smartening of a particular city domain.
On the other hand, the extensive use of digital technologies in various city domains and the diffusion of digital technologies in people’s daily life have boosted human-to-human, human-to-machine, and machine-to-machine interactions, which yield massive volumes of data commonly known as big data: a mixture of complex data characterized by large and fast-growing datasets that go beyond what commonly known data management systems can accommodate. By analyzing these big data volumes, valuable insights and correlations can be extracted [6]. The process of analyzing big data to extract useful information and insights is usually referred to as big data analytics or the big data value chain [6], which is considered one of the key enabling technologies of smart cities [7, 8, 9]. However, big data complexities pose non-trivial challenges for the processes of big data analytics [3]. Although the literature is replete with articles addressing big data analytics frameworks and their applications in different smart domains, detailed discussions on the characteristics of big data analytics frameworks fitting smart city requirements are still needed. The lack of such articles is the essential motive for this research. The main contribution of this paper is a proposal of a novel big data analytics framework for smart cities. To identify the necessary characteristics of big data analytics frameworks for smart cities, we adopted a systematic literature review approach on big data analytics frameworks in smart cities to answer two basic research questions. RQ1: what are the characteristics of big data analytics frameworks applied in smart cities in the literature? RQ2: what are the essential design principles that should characterize big data analytics frameworks to serve smart cities purposes? To achieve this objective, 30 articles addressing big data analytics frameworks and applications in smart cities are analyzed.
This paper is organized as follows. This first section introduces the subjects of and motive for this paper. The second section presents fundamental concepts about big data and smart cities and how the two subjects are related. The scope of the review is defined in the third section. In the fourth section, the 30 articles selected for review are analyzed with respect to the value chain operators and the functional requirements that fit smart cities. Findings are discussed in the fifth section. The sixth section presents the main contribution of this paper: a proposal for a novel big data analytics framework for smart cities and its Hadoop-based prototype implementation. In the same section, the SCDAP design principles are discussed, and the value of the SCDAP approach is demonstrated in comparison to traditional knowledge discovery approaches. The seventh section is the conclusion. Finally, in the eighth section, the limitations of the SCDAP architecture are discussed and a list of recommended directions for future research is presented. This paper includes three appendices: Appendix A: Details of the search process, Appendix B: Results with respect to big data value chain operators and Appendix C: Results with respect to Functional Requirements. Appendices are available on the following URL: [Appendices]
2. Topic Conceptualization
In this section, fundamental concepts of the two main subjects of this article, big data and smart cities, are presented to uncover the challenges of harnessing big data analytics in smart cities. Uncovering these challenges helps determine, at a high level, essential functional and non-functional requirements that should be considered in the design of big data analytics frameworks for smart city purposes.
2.1 Big data
Big data is a natural crop of the advanced digital artifacts and their applications. Mobiles, sensors and Social Media Networks are examples of modern digital technologies that have permeated our daily lives. Prevalence of these technologies in everyday life boosted human-to-human, human-to-machine and machine-to-machine interaction to unprecedented levels, yielding massive volumes of data known as big data. However, volume of data is not the only characteristic of big data. Big data is commonly characterized by four Vs: Volume, Velocity, Variety and Veracity (Figure 1) [10].
Figure (1) - Big data 4Vs
Volume: as the name indicates, big data volumes go far beyond the size of traditional operational databases or data warehouses. Traditional databases usually grow to the order of gigabytes (10^9 bytes) or even terabytes (10^12 bytes). Big data volumes are big enough to the extent that new measuring units are required, such as the petabyte (10^15 bytes) and the exabyte (10^18 bytes).
Velocity refers to the high rate of data streaming into hosting platforms. For example, how many mouse clicks per second can be captured from Social Media Network applications such as Facebook or LinkedIn? In addition to the high rate of incoming data, velocity raises an important concern on data aging i.e.
“for how long these data will be valuable?” In some cases, real-time/online analysis of streaming data is critical. For instance, real-time analysis of video streams captured by traffic surveillance cameras is critical to predict traffic jams and prevent bottlenecks within limited time brackets.
Variety refers to the complexity of big data formats. Big data is mostly composed of semi-structured data (e.g. IoT sensed data files), unstructured data (e.g. text data files and images) and stream data (e.g. geospatial data streams), in addition to traditional structured data. It is usually estimated that the ratio of structured data to other data types is 20% to 80%, respectively.
Veracity refers to the trustworthiness of the data. False data will definitely lead to misleading results. Therefore, there is a necessity to ensure that data sources are trustworthy and data are correct, especially in the case of automated decision-making where no human intervention is involved.
However, we expect the adjective “big” will fade over time and the self-evident meaning of data will intuitively be extended to include all types of data mentioned above, i.e. semi-structured, unstructured and stream data in addition to classically known structured data.
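The unit scale quoted under Volume, and its link to Velocity, can be made concrete with a short arithmetic sketch. This is only an illustration: the helper names `describe_volume` and `daily_volume` are hypothetical, and the event rate and event size in the usage lines are assumed values, not measurements from this paper.

```python
# Decimal byte units named in the Volume description, largest first.
UNITS = [
    ("exabyte", 10**18),
    ("petabyte", 10**15),
    ("terabyte", 10**12),
    ("gigabyte", 10**9),
]


def describe_volume(num_bytes: int) -> str:
    """Express a byte count in the largest listed unit that does not exceed it."""
    for name, size in UNITS:
        if num_bytes >= size:
            return f"{num_bytes / size:.1f} {name}s"
    return f"{num_bytes} bytes"


def daily_volume(events_per_second: int, bytes_per_event: int) -> int:
    """Bytes accumulated in one day at a constant ingest rate (Velocity)."""
    seconds_per_day = 24 * 60 * 60
    return events_per_second * bytes_per_event * seconds_per_day


# Assumed, illustrative stream: 1,000 events per second, 200 bytes each.
per_day = daily_volume(events_per_second=1000, bytes_per_event=200)
print(describe_volume(per_day))
print(describe_volume(per_day * 365))
```

Even a modest constant stream moves from the gigabyte range per day into the terabyte range per year, which is why Volume and Velocity are usually considered together.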
2.1.1 Big data analytics platforms
Big data analytics refers to the entire processes and tools required for knowledge discovery, including data extraction, transformation, loading and analysis; specific tools, techniques, and methods; and how to successfully provide results to decision makers. Although developing big data analytics platforms encounters non-trivial challenges due to the complex nature of big data, big data analytics holds an unprecedented opportunity to shift the traditional methods of information extraction into new dimensions. This opportunity induced researchers and technology providers to develop sophisticated platforms, frameworks and algorithms to cope with the challenges of big data [6, 11, 12].
To deal with the challenging nature of big data, platform scalability represents the logical solution. There are two common scaling approaches: vertical and horizontal scaling, also known as scale up and scale out respectively [6]. Vertical scaling means empowering the processing platform with additional computing power (memory, CPUs etc.) to accommodate the incremental volumes of data. This approach involves execution of a single operating system. On the other side, horizontal scaling is a divide-and-conquer approach. The workload is distributed and processed in parallel across multiple independent computing machines. More machines can be added as needed to improve the overall system performance. This approach involves execution of multiple instances of different autonomous operating systems running on independent machines, which adds further complexity (Figure 2).
Figure (2) - Vertical vs Horizontal Scaling
Each approach has its advantages and disadvantages.
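The divide-and-conquer idea behind horizontal scaling can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a distributed implementation: worker threads on one machine stand in for independent computing nodes, and the function names (`partition`, `count_events`, `scale_out_count`) and the sample events are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_workers):
    """Split the workload into num_workers roughly equal chunks."""
    return [records[i::num_workers] for i in range(num_workers)]


def count_events(chunk):
    """Work done independently on one node: count events per type."""
    return Counter(chunk)


def scale_out_count(records, num_workers):
    """Distribute chunks, process them in parallel, then merge partial results."""
    chunks = partition(records, num_workers)
    # Threads are an assumption standing in for separate machines.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(count_events, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # merge step: add the per-node counts
    return total


# Hypothetical city events; adding workers changes throughput, not the answer.
events = ["traffic", "energy", "traffic", "crime", "energy", "traffic"]
print(scale_out_count(events, num_workers=3))
```

The merged result does not depend on the number of workers, which is the property that allows machines to be added as needed; the coordination and merge steps are where the extra complexity of scaling out appears.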
Although vertical scaling shows resilience in platform upgrading, it is restricted by the platform upper limits (e.g. maximum memory, number of CPUs etc.). On the other side, in horizontal scaling more computing nodes can be added as needed. However, horizontal scaling involves maintaining multiple instances of different operating systems, which is a complicated challenge. High Performance Computing (HPC) clusters and Apache Hadoop [13] are two examples of vertical and horizontal scaling platforms, respectively.
However, the cost of vertical scalability and its upper ceiling limitations are major drawbacks of vertical scaling. These two drawbacks favor horizontal scaling when it comes to smart city projects: considering the dynamics of the multi-domain nature of smart city projects and the prospects for future expansion, it is more rational to rely on horizontal scaling than on vertical scaling. This explains why most research and designs of big data analytics frameworks and platforms are built using horizontally scalable platforms such as the Apache Hadoop platform.
2.1.2 Big data value chain operators
In [6], the authors adopted an approach analogous to the one used for the Knowledge Discovery in Databases (KDD) model to study the corresponding bottlenecks in the case of big data analytics. In the KDD model, analytics is divided into three operators: input, analysis and output (Figure 3). In the case of big data, the data input operator ingests stream volumes of noisy, incomplete raw data from heterogeneous sources. The processes of the input operator have a crucial impact in mitigating the effect of data
volume on the overall analysis processes. Picking up relevant data, data cleansing and compression will influence the efficiency of the data analysis execution performance. Despite the importance of the input operator processes, the authors in [6] noted an important observation: the number of research articles and technical reports that focus on the data analysis operator is typically larger than the number focusing on other operators.
Figure (3) - Knowledge Discovery in Databases
As for the data analysis, the survey in [6] revealed that parallel computing and cloud computing technologies have a strong impact on the research in this area, where most of the big data analytics frameworks and platforms were developed based on these technologies. However, reliance on parallel distributed processing platforms compelled researchers to modify traditional machine learning and data mining algorithms to adapt to the new environment and programming paradigm (e.g. Map and Reduce) [13].
2.2 Smart cities
Although the term smart city seems to be simple and intuitively understandable, there is no globally recognized definition of the term; this is due to different perspectives about how the label “smart” is loaded in the context of smart cities [1]. There are several surveys in the literature addressing the definition of “smart city” from different perspectives [1, 2, 3, 5]. For example, IBM defines a smart city as: “the city that makes optimal use of all the interconnected information available today to better understand and control its operations and optimize the use of limited resources” [5].
Also, YIN ChuanTao et al. [3] defined a smart city as “a system of technological infrastructure that relies on advanced data processing with the goal of making city governance more efficient, citizens happier, business more prosperous and the environment more sustainable”. The authors in [2] analyzed and categorized several definitions of the term “smart city” in the literature, setting three fundamental factors that make a city smart. These factors are technology (infrastructures of hardware and software), people (creativity, diversity, and education) and institution (governance and policy), given the connection between these factors. However, from these definitions we can draw out three essential characteristics of smart cities:
- The vital role of ICT as a key enabling technology in developing smart cities, where ICT represents the essential backbone for connecting city core systems together, infusing data and information between different city systems. Digital artifacts and ICT applications generate raw data that can be blended and analyzed to extract useful information and insights about the city using artificial intelligence and data analysis applications.
- The integral view of a smart city, where the interrelationship between the core systems of the city should be considered. Scholars and relevant organizations define different domains that comprise a smart city; for example, the authors in [4] defined six smart domains for a city to be smart. In this regard, a smart city is viewed as a whole body of systems (i.e. system of systems) where no system in the city domains functions in isolation [14, 15, 16].
- Sustainability, which means in its broad meaning the ability to continue and grow without significant deterioration.
In the smart city context, the concept of sustainability applies to different aspects of life in a smart city, such as economics and environment.
These characteristics are not exclusive to smart cities; there are many other related labels in the literature, such as “digital city”, “intelligent city”, “virtual city” and “ubiquitous city”, which address different aspects that characterize modern cities and share some of these characteristics with smart cities to different degrees of depth.
2.2.1 Modeling of smart cities - an ICT perspective
From an ICT perspective, industrialists and scholars adopted the layered approach to model smart cities. For example, IBM adopted a three-layer model including instrumented layer, interconnected layer and intelligent layer [5]. Similarly, YIN ChuanTao et al. proposed a four-layer model including: data acquisition and transmission layer; data vitalization layer; common data and service layer; and the applications layer [3]. Regardless of the number of layers, these models project the journey of big data from its birth in its raw form until valuable information and insights are extracted for the benefit of decision makers and citizens (Figure 4). Although there is no one-to-one correspondence between big data value chain operators (Figure 3) and the layers of smart city models, they share a common objective of projecting the processes of extracting insights and useful information out of raw big data to support smart services and decision-making processes [17].
Figure (4) - Big Data in Smart Cities
2.3 Big data analytics in smart cities
The role of big data in developing smart cities is undeniable [7]. There are many applications of big data analytics in different domains of smart cities, such as planning [8], traffic control and transportation [18, 19], crime analysis [20], energy [21] and environment [22].
To design software platforms and architectures fitting smart city purposes, specific non-functional and functional requirements linked to the smart city’s nature of data sources and applications should be clearly identified first. For example, the heterogeneity of data sources (e.g. sensed IoT data, Social Media Networks and Electronic Medical Records) entails considerable design factors such as data integration, system scalability, privacy and security [9, 23]. Also, the dynamic nature of smart city life necessitates considerable attention to stream data analytics to enable online and real-time services [8, 9]. Additionally, “historical data” or “batch data” analytics is an essential requirement for planning (short and long term) and decision-making purposes in smart cities.
In [17], the authors compiled a comprehensive list of eleven essential requirements that should be considered in designing smart city software architectures. Typically, these requirements could be classified into two categories. The first category is the functional requirements: object interoperability, real-time monitoring, historical data, mobility, service composition and integrated urban management. The second category is the non-functional requirements: sustainability, availability, privacy, social aspects and flexibility/extensibility (scalability).
Similarly, the authors in [24] introduced a comprehensive study based on examining 23 smart city platforms to specify the fundamental functional and non-functional requirements to develop a
software platform to enable construction of scalable integrated smart city applications. The study concluded with eight functional requirements: data management, application run-time, Wireless Sensor Network (WSN) management, data processing, external data access, service management, software engineering tools and definition of city model. This is in addition to eight non-functional requirements: interoperability, scalability, security, privacy, context awareness, adaption, extensibility and configurability. The authors in [23] specified six characteristics a big data analytics platform should maintain to accommodate the V challenges of big data, specifically: scalability, I/O performance, fault tolerance, real-time processing, data size and support for iterative tasks.
In summary, we can recap the functional requirements as interoperability, real-time analysis, historical data analysis, mobility, iterative processing, data integration and model aggregation. The non-functional requirements are scalability, security, privacy, context awareness, adaption, extensibility, sustainability, availability, and configurability.
3. The Scope of Literature Review
In order to define the scope of this literature review, the author followed a widely known taxonomy scheme proposed by [25] and adapted by [26] that includes six characteristics for literature review: (1) focus, (2) goal, (3) organization, (4) perspective, (5) audience and (6) coverage.
3.1 Focus
Focus is the central area of interest of the review process. According to [25], it could be research outcomes, research methods, theories, practices or applications. The candidate articles for review are analyzed in two dimensions. The first is the analysis with respect to the big data value chain operators presented in subsection 2.1.2 (appendix B). The second is the analysis with respect to the functional requirements mentioned in section 2.3 (appendix C).
The reason for choosing these two dimensions for analysis is that they complement each other. This approach gives a broader picture of the traits of big data analytics frameworks in smart cities, which in turn gives a pointer to answer the first research question. In this regard, it is worth mentioning that non-functional requirements will not be addressed in this article, as they are considered general requirements applicable to any analytics system, which are also required in smart cities with different levels of complexity.
3.2 Goal
Goal refers to the objectives to be fulfilled by the review, which could be integration, criticism or central issue. As the objective of this research is to study the traits of the available big data analytics frameworks applied in smart cities and identify the essential functionalities that should characterize big data analytics frameworks to serve smart cities purposes, the goal of this research is to integrate and criticize the findings of the past literature.
3.3 Organization
Organization refers to how the literature review is organized. The literature could be organized in chronological order, conceptual order (sharing of the same ideas) or methodological order (sharing of the same methods of work). In this article, literature is organized and discussed in a conceptual order.
3.4 Perspective
Perspective refers to the reviewer’s point of view in discussing the literature. Perspective could be a neutral position (impartial role as an honest “judge”) or an espousal position (advocating certain
idea(s) or methodology). In this research, the author adopted a neutral perspective since there is no need to foster a specific position.
3.5 Audience
Audience refers to the beneficiaries whom the review addresses (specialized researchers, general researchers, practitioners, policy makers). Since the second research question concerns identifying the essential functionalities that big data analytics frameworks should have to serve smart cities purposes, the audience of this literature review is specialized scholars, practitioners and smart city planners.
3.6 Coverage
Coverage refers to how the reviewer searches the literature and makes decisions about the suitability and quality of documents. According to Cooper, there are four categories of coverage: exhaustive (including the entirety of literature on a topic or at least most of it), exhaustive with selective citation (considering all the relevant sources, but describing only a sample), representative (including only a sample that typifies larger groups of articles), and central (reviewing the literature pivotal to a topic). In this literature review, the author adopted exhaustive coverage with selective citation, since it is not realistic to claim exhaustive coverage. Additionally, adoption of representative or central coverage does not serve the objectives of this review. Table (1) summarizes the choices made by the author regarding Cooper’s taxonomy of the review scope.
Table (1) Taxonomy of literature review
Characteristics | Categories
Focus | Research outcomes; Research methods; Theories; Applications
Goal | Integration; Criticism; Central issue
Organization | Historical; Conceptual; Methodological
Perspective | Neutral representation; Espousal of position
Audience | Specialized scholars; General scholars; Practitioners/Policy-makers; General public
Coverage | Exhaustive; Exhaustive with selective citation; Representative; Central or pivotal
4.
Literature analysis and synthesis
According to the reference review scheme [26], the search process involves four steps: (1) identifying search databases; (2) search keywords; (3) forward and backward search; and (4) evaluation of articles. To collect quality scholarly articles, six information systems online databases were searched using the two search keywords “big data” and “smart city”; a total of 247 articles were retrieved. After the filtration and evaluation process, only 30 articles were shortlisted for analysis. Details of the search process and sources of shortlisted articles are listed in appendix A. The process of evaluating the articles in terms of addressing the two dimensions of analysis in section 3.1 involved reading and evaluating each article’s abstract and conclusion sections to decide which of the focus points (section 3.1) are addressed. If the article’s abstract and conclusion do not
lead to a clear decision, the article’s full body is reviewed. In the following two subsections, the results of the analysis with respect to big data value chain operators and functional requirements are demonstrated. For a more detailed analysis, some proposed end-to-end architectural frameworks, namely BASIS [27], SWIFT [28] and RADICAL [16], are reviewed in detail in subsections 4.3, 4.4 and 4.5 respectively. Points of strength and weakness of each architectural framework are listed after each review.
4.1 Analysis with respect to value chain operators
Results of the evaluation process with respect to big data value chain operators are shown in the table presented in appendix B. From these results, we can see that most of the analyzed articles focused on data gathering (21 out of 30). This ratio reflects the high interest of researchers in finding efficient solutions for data gathering from different sources, while relatively fewer studies addressed the rest of the data input functionalities (selection, preparation, transformation) despite their significant impact on the efficiency of the analysis processes. In the following three subsections, we will review how each of these operators is addressed.
4.1.1 Data Input
The central challenge in this operator is related to the ability to acquire timely raw data about city events from a large number of heterogeneous sources. Also, through this operator data are prepared to the following operator for