Business Intelligence, 4e (Sharda/Delen/Turban)
Chapter 7 Big Data Concepts and Tools
Copyright © 2018 Pearson Education, Inc.

1) In the opening vignette, Access Telecom (AT) built a system to better visualize customers who were unhappy before they canceled their service.
Answer: TRUE
Diff: 2 Page Ref: 372

2) The term "Big Data" is relative as it depends on the size of the using organization.
Answer: TRUE
Diff: 2 Page Ref: 373

3) Satellite data can be used to evaluate the activity at retail locations as a source of alternative data.
Answer: TRUE
Diff: 2 Page Ref: 377

4) Big Data is being driven by the exponential growth, availability, and use of information.
Answer: TRUE
Diff: 2 Page Ref: 373

5) The quality and objectivity of information disseminated by influential users of Twitter is higher than that disseminated by noninfluential users.
Answer: TRUE
Diff: 2 Page Ref: 392

6) Big Data uses commodity hardware, which is expensive, specialized hardware that is custom built for a client or application.
Answer: FALSE
Diff: 2 Page Ref: 375

7) MapReduce can be easily understood by skilled programmers due to its procedural nature.
Answer: TRUE
Diff: 2 Page Ref: 385

8) Hadoop was designed to handle petabytes and exabytes of data distributed over multiple nodes in parallel.
Answer: TRUE
Diff: 2 Page Ref: 385

9) Hadoop and MapReduce require each other to work.
Answer: FALSE
Diff: 2 Page Ref: 386

10) In most cases, Hadoop is used to replace data warehouses.
Answer: FALSE
Diff: 2 Page Ref: 389

11) Despite their potential, many current NoSQL tools lack mature management and monitoring tools.
Answer: TRUE
Diff: 2 Page Ref: 389

12) There is a clear difference between the type of information support provided by influential users versus the others on Twitter.
Answer: TRUE
Diff: 2 Page Ref: 392

13) Social media mentions can be used to chart and predict flu outbreaks.
Answer: TRUE
Diff: 2 Page Ref: 400

14) In Application Case 7.6, Analyzing Disease Patterns from an Electronic Medical Records Data Warehouse, it was found that urban individuals have a higher number of diagnosed disease conditions.
Answer: TRUE
Diff: 2 Page Ref: 403

15) For low latency, interactive reports, a data warehouse is preferable to Hadoop.
Answer: TRUE
Diff: 2 Page Ref: 396

16) If you have many flexible programming languages running in parallel, Hadoop is preferable to a data warehouse.
Answer: TRUE
Diff: 2 Page Ref: 396

17) In the Salesforce case study, streaming data is used to identify services that customers use most.
Answer: FALSE
Diff: 2 Page Ref: 410

18) It is important for Big Data and self-service business intelligence to go hand in hand to get maximum value from analytics.
Answer: TRUE
Diff: 1 Page Ref: 395

19) Big Data simplifies data governance issues, especially for global firms.
Answer: FALSE
Diff: 2 Page Ref: 406

20) Current total storage capacity lags behind the digital information being generated in the world.
Answer: TRUE
Diff: 2 Page Ref: 406

21) Using data to understand customers/clients and business operations to sustain and foster growth and profitability is
A) easier with the advent of BI and Big Data.
B) essentially the same now as it has always been.
C) an increasingly challenging task for today's enterprises.
D) now completely automated with no human intervention required.
Answer: C
Diff: 2 Page Ref: 373

22) A newly popular unit of data in the Big Data era is the petabyte (PB), which is
A) 10^9 bytes.
B) 10^12 bytes.
C) 10^15 bytes.
D) 10^18 bytes.
Answer: C
Diff: 2 Page Ref: 375

23) Which of the following sources is likely to produce Big Data the fastest?
A) order entry clerks
B) cashiers
C) RFID tags
D) online customers
Answer: C
Diff: 2 Page Ref: 374

24) Data flows can be highly inconsistent, with periodic peaks, making data loads hard to manage. What is this feature of Big Data called?
A) volatility
B) periodicity
C) inconsistency
D) variability
Answer: D
Diff: 2 Page Ref: 376

25) In the Twitter case study, how did influential users support their tweets?
A) opinion
B) objective data
C) multiple posts
D) references to other users
Answer: B
Diff: 2 Page Ref: 392
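
Study aid for question 22: the decimal byte units can be checked with a few lines of Python. This is a minimal sketch that assumes decimal (power-of-ten) prefixes; the names UNIT_EXPONENTS, bytes_in, and to_unit are illustrative and do not come from the textbook.

```python
# Decimal (SI) byte units expressed as powers of ten.
# Assumption: decimal prefixes, not binary ones (such as 2**50 for a pebibyte).
UNIT_EXPONENTS = {"KB": 3, "MB": 6, "GB": 9, "TB": 12, "PB": 15, "EB": 18}


def bytes_in(unit):
    """Return the number of bytes in one decimal unit, e.g. 'PB'."""
    return 10 ** UNIT_EXPONENTS[unit]


def to_unit(n_bytes, unit):
    """Convert a raw byte count into the given decimal unit."""
    return n_bytes / bytes_in(unit)


if __name__ == "__main__":
    for name in UNIT_EXPONENTS:
        print(name, "= 10^%d bytes" % UNIT_EXPONENTS[name])
```

Under these assumptions, one petabyte is a thousand terabytes and one exabyte is a thousand petabytes.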

26) Allowing Big Data to be processed in memory and distributed across a dedicated set of nodes can solve complex problems in near-real time with highly accurate insights. What is this process called?
A) in-memory analytics
B) in-database analytics
C) grid computing
D) appliances
Answer: A
Diff: 2 Page Ref: 380

27) Which Big Data approach promotes efficiency, lower cost, and better performance by processing jobs in a shared, centrally managed pool of IT resources?
A) in-memory analytics
B) in-database analytics
C) grid computing
D) appliances
Answer: C
Diff: 2 Page Ref: 380

28) How does Hadoop work?
A) It integrates Big Data into a whole so large data elements can be processed as a whole on one computer.
B) It integrates Big Data into a whole so large data elements can be processed as a whole on multiple computers.
C) It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on one computer.
D) It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on multiple computers.
Answer: D
Diff: 3 Page Ref: 386

29) What is the Hadoop Distributed File System (HDFS) designed to handle?
A) unstructured and semistructured relational data
B) unstructured and semistructured non-relational data
C) structured and semistructured relational data
D) structured and semistructured non-relational data
Answer: B
Diff: 2 Page Ref: 385

30) In a Hadoop "stack," what is a slave node?
A) a node where bits of programs are stored
B) a node where metadata is stored and used to organize data processing
C) a node where data is stored and processed
D) a node responsible for holding all the source programs
Answer: C
Diff: 2 Page Ref: 386
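
Study aid for question 28: the idea of breaking data into parts, processing each part at the same time, and then combining the results can be sketched in plain Python. This is not Hadoop code and uses no Hadoop API. Worker threads stand in for cluster nodes, and the function names (split_into_parts, map_part, reduce_counts, word_count) are hypothetical, chosen only for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(records, n_parts):
    """Split the input into n_parts slices, one per simulated node."""
    return [records[i::n_parts] for i in range(n_parts)]


def map_part(part):
    """Map step: count the words found in a single part."""
    counts = Counter()
    for line in part:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-part counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(records, n_parts=3):
    """Run the map step on every part concurrently, then reduce."""
    parts = split_into_parts(records, n_parts)
    # Assumption: threads simulate nodes; a real cluster uses separate machines.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(map_part, parts))
    return reduce_counts(partials)


if __name__ == "__main__":
    lines = ["big data", "big hadoop", "data data"]
    print(word_count(lines))
```

The split, map, and reduce steps are kept as separate functions so that each stage of the pattern can be inspected on its own.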

31) In a Hadoop "stack," what node periodically replicates and stores data from the Name Node should it fail?
A) backup node
B) secondary node
C) substitute node
D) slave node
Answer: B
Diff: 2 Page Ref: 386

32) All of the following statements about MapReduce are true EXCEPT
A) MapReduce is a general-purpose execution engine.
B) MapReduce handles the complexities of network communication.
C) MapReduce handles parallel programming.
D) MapReduce runs without fault tolerance.
Answer: D
Diff: 2 Page Ref: 389

33) In a network analysis, what connects nodes?
A) edges
B) metrics
C) paths
D) visualizations
Answer: A
Diff: 2 Page Ref: 403

34) In the Analyzing Disease Patterns from an Electronic Medical Records Data Warehouse case study, what was the analytic goal?
A) determine if diseases are accurately diagnosed
B) determine probabilities of diseases that are comorbid
C) determine differences in rates of disease in urban and rural populations
D) determine differences in rates of disease in males v. females
Answer: C
Diff: 2 Page Ref: 402

35) Traditional data warehouses have not been able to keep up with
A) the evolution of the SQL language.
B) the variety and complexity of data.
C) expert systems that run on them.
D) OLAP.
Answer: B
Diff: 2 Page Ref: 393
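
Study aid for question 33: in a network, edges are the links that connect nodes. The sketch below builds a small undirected network from a list of edges and counts how many edges touch a node. The node labels and the function names build_graph and degree are made up for illustration and are not taken from the case study.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) edge pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degree(graph, node):
    """Return the number of distinct neighbors connected to a node."""
    return len(graph.get(node, set()))


if __name__ == "__main__":
    # Hypothetical edge list: each pair is one edge joining two nodes.
    network = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
    for name in sorted(network):
        print(name, degree(network, name))
```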

36) Under which of the following requirements would it be more appropriate to use Hadoop over a data warehouse?
A) ANSI 2003 SQL compliance is required
B) online archives alternative to tape
C) unrestricted, ungoverned sandbox explorations
D) analysis of provisional data
Answer: C
Diff: 2 Page Ref: 396

37) What is Big Data's relationship to the cloud?
A) Hadoop cannot be deployed effectively in the cloud just yet.
B) Amazon and Google have working Hadoop cloud offerings.
C) IBM's homegrown Hadoop platform is the only option.
D) Only MapReduce works in the cloud; Hadoop does not.
Answer: B
Diff: 2 Page Ref: 403

38) Companies with the largest revenues from Big Data tend to be
A) the largest computer and IT services firms.
B) small computer and IT services firms.
C) pure open source Big Data firms.
D) non-U.S. Big Data firms.
Answer: A
Diff: 2 Page Ref: 405

39) In the financial services industry, Big Data can be used to improve
A) regulatory oversight.
B) decision making.
C) customer service.
D) both A & B.
Answer: D
Diff: 2 Page Ref: 411

40) In the Alternative Data for Market Analysis or Forecasts case study, satellite data was NOT used for
A) evaluating retail traffic.
B) monitoring activity at factories.
C) tracking agricultural estimates.
D) monitoring individual customer patterns.
Answer: D
Diff: 2 Page Ref: 377

41) Big Data comes from ________.
Answer: everywhere
Diff: 2 Page Ref: 373

42) ________ refers to the conformity to facts: accuracy, quality, truthfulness, or trustworthiness of the data.
Answer: Veracity
Diff: 2 Page Ref: 376

43) In-motion ________ is often overlooked today in the world of BI and Big Data.
Answer: analytics
Diff: 2 Page Ref: 376-377

44) The ________ of Big Data is its potential to contain more useful patterns and interesting anomalies than "small" data.
Answer: value proposition
Diff: 2 Page Ref: 376

45) As the size and the complexity of analytical systems increase, the need for more ________ analytical systems is also increasing to obtain the best performance.
Answer: efficient
Diff: 2 Page Ref: 380

46) ________ speeds time to insights and enables better data governance by performing data integration and analytic functions inside the database.
Answer: In-database analytics
Diff: 2 Page Ref: 380

47) ________ bring together hardware and software in a physical unit that is not only fast but also scalable on an as-needed basis.
Answer: Appliances
Diff: 2 Page Ref: 380

48) Big Data employs ________ processing techniques and nonrelational data storage capabilities in order to process unstructured and semistructured data.
Answer: parallel
Diff: 2 Page Ref: 383

49) In the world of Big Data, ________ aids organizations in processing and analyzing large volumes of multistructured data. Examples include indexing and search, graph analysis, etc.
Answer: MapReduce
Diff: 2 Page Ref: 385

50) The ________ Node in a Hadoop cluster provides client information on where in the cluster particular data is stored and if any nodes fail.
Answer: Name
Diff: 2 Page Ref: 385

51) A job ________ is a node in a Hadoop cluster that initiates and coordinates MapReduce jobs, or the processing of the data.
Answer: tracker
Diff: 2 Page Ref: 386

52) HBase is a nonrelational ________ that allows for low-latency, quick lookups in Hadoop.
Answer: database
Diff: 2 Page Ref: 387

53) Hadoop is primarily a(n) ________ file system and lacks capabilities we'd associate with a DBMS, such as indexing, random access to data, and support for SQL.
Answer: distributed
Diff: 2 Page Ref: 388

54) HBase, Cassandra, MongoDB, and Accumulo are examples of ________ databases.
Answer: NoSQL
Diff: 2 Page Ref: 389

55) The problem of forecasting economic activity or microclimates based on a variety of data beyond the usual retail data is a very recent phenomenon and has led to another buzzword: ________.
Answer: alternative data
Diff: 2 Page Ref: 377

56) As volumes of Big Data arrive from multiple sources such as sensors, machines, social media, and clickstream interactions, the first step is to ________ all the data reliably and cost effectively.
Answer: capture
Diff: 2 Page Ref: 393

57) In open-source databases, the most important performance enhancement to date is the cost-based ________.
Answer: optimizer
Diff: 2 Page Ref: 395

58) ________ of data provides business value; pulling of data from multiple subject areas and numerous applications into one repository is the raison d'être for data warehouses.
Answer: Integration
Diff: 2 Page Ref: 395

59) In the energy industry, ________ grids are one of the most impactful applications of stream analytics.
Answer: smart
Diff: 2 Page Ref: 407

60) Organizations are working with data that meets the three V's: variety, volume, and ________ characterizations.
Answer: velocity
Diff: 2 Page Ref: 374

61) In the opening vignette, why was the Telecom company so concerned about the loss of customers, if customer churn is common in that industry?
Answer: The company was concerned about its loss of customers, because the loss was at such a high rate. The company was losing customers faster than it was gaining them. Additionally, the company had identified that the loss of these customers could be traced back to customer service interactions. Because of this, the company felt that the loss of customers is something that could be analyzed and hopefully controlled.
Diff: 2 Page Ref: 370-371

62) List and describe the three main "V"s that characterize Big Data.
Answer:
• Volume: This is obviously the most common trait of Big Data. Many factors contributed to the exponential increase in data volume, such as transaction-based data stored through the years, text data constantly streaming in from social media, increasing amounts of sensor data being collected, automatically generated RFID and GPS data, and so forth.
• Variety: Data today comes in all types of formats, ranging from traditional databases to hierarchical data stores created by the end users and OLAP systems, to text documents, e-mail, XML, meter-collected, sensor-captured data, to video, audio, and stock ticker data. By some estimates, 80 to 85 percent of all organizations' data is in some sort of unstructured or semistructured format.
• Velocity: This refers to both how fast data is being produced and how fast the data must be processed (i.e., captured, stored, and analyzed) to meet the need or demand. RFID tags, automated sensors, GPS devices, and smart meters are driving an increasing need to deal with torrents of data in near-real time.
Diff: 2 Page Ref: 374-375

63) List and describe four of the most critical success factors for Big Data analytics.
Answer:
• A clear business need (alignment with the vision and the strategy). Business investments ought to be made for the good of the business, not for the sake of mere technology advancements. Therefore, the main driver for Big Data analytics should be the needs of the business at any level: strategic, tactical, and operations.
• Strong, committed sponsorship (executive champion). It is a well-known fact that if you don't have strong, committed executive sponsorship, it is difficult (if not impossible) to succeed. If the scope is a single or a few analytical applications, the sponsorship can be at the departmental level. However, if the target is enterprise-wide organizational transformation, which is often the case for Big Data initiatives, sponsorship needs to be at the highest levels and organization-wide.
• Alignment between the business and IT strategy. It is essential to make sure that the analytics work is always supporting the business strategy, and not the other way around. Analytics should play the enabling role in successful execution of the business strategy.
• A fact-based decision-making culture. In a fact-based decision-making culture, the numbers rather than intuition, gut feeling, or supposition drive decision making. There is also a culture of experimentation to see what works and what doesn't. To create a fact-based decision-making culture, senior management needs to do the following: recognize that some people can't or won't adjust; be a vocal supporter; stress that outdated methods must be discontinued; ask to see what analytics went into decisions; link incentives and compensation to desired behaviors.
• A strong data infrastructure. Data warehouses have provided the data infrastructure for analytics. This infrastructure is changing and being enhanced in the Big Data era with new technologies. Success requires marrying the old with the new for a holistic infrastructure that works synergistically.
Diff: 2 Page Ref: 379-380