Trends in big data analytics
by Kambatla, Karthik; Kollias, Giorgos; Kumar, Vipin; Grama, Ananth
One of the major applications of future generation parallel and distributed systems is in big-data analytics. Data repositories for such applications currently exceed exabytes and are rapidly increasing in size. Beyond their sheer magnitude, these datasets and the applications built on them pose significant challenges for method and software development. Datasets are often distributed, and their size and privacy considerations warrant distributed techniques. Data often resides on platforms with widely varying computational and network capabilities. Considerations of fault-tolerance, security, and access control are critical in many applications (Dean and Ghemawat, 2004; Apache Hadoop). Analysis tasks often have hard deadlines, and data quality is a major concern in yet other applications. For most emerging applications, data-driven models and methods capable of operating at scale are as yet unknown. Even when known methods can be scaled, validation of results is a major issue. Characteristics of hardware platforms and the software stack fundamentally impact data analytics. In this article, we provide an overview of the state-of-the-art and focus on emerging trends to highlight the hardware, software, and application landscape of big-data analytics.
Big data analytics with applications
by Bi, Zhuming; Cochran, David
This paper surveys recent developments in the Internet of Things (IoT) and its applications, with particular attention to the impact of Big Data (BD) on manufacturing information systems. Big Data analytics (BDA) has been identified as a critical technology to support data acquisition, storage, and analytics in data management systems in modern manufacturing. The purpose of the presented work is to clarify the requirements of predictive systems, and to identify research challenges and opportunities in BDA to support cloud-based information systems.
Big data analytics and business analytics
by Duan, Lian; Xiong, Ye
Over the past few decades, with the development of automatic identification, data capture and storage technologies, people have been generating data much faster and collecting far more of it than ever before in business, science, engineering, education and other areas. Big data has emerged as an important area of study for both practitioners and researchers. It has huge impacts on data-related problems. In this paper, we identify the key issues related to big data analytics and then investigate its applications specifically related to business problems.
Big Data Analytics for Concurrent Data Processing
by A Samydurai; C Vijayakumaran; G Kumaresan; B Muthusenthil
Big data describes an immense volume of structured and unstructured data that is difficult to process using traditional database techniques. The tremendous growth in data arrival rates, together with the need to support a large number of user queries, creates complex problems for traditional structured databases. In this paper, the input file is assigned to a master, which splits the work and controls the workflow across different workers. This reduces the fault-tolerance issues that arise with individual nodes. The workers evaluate the intermediate files and data items. The processed data is then merged, and the required output is returned to the user, either immediately or as an intermediate file. The paper also gives the first solution for efficiently processing continuous text queries under these challenges. The solution indexes the streamed documents in main memory with a structure based on the principles of the inverted file, and processes document arrival and expiration events with an incremental threshold-based method.
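The master/worker flow in this abstract (split the input, process the pieces on separate workers, combine the intermediate results) can be illustrated with a minimal sketch. The abstract does not say what the workers compute, so word counting is an assumption chosen only for illustration. A local thread pool stands in for distributed worker nodes, and the function names `split_input`, `count_words`, `merge_counts`, and `run_master` are hypothetical, not taken from the paper.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_input(lines, n_chunks):
    """Master step: divide the input into at most n_chunks contiguous chunks."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def count_words(chunk):
    """Worker step (assumed task): build an intermediate word count for one chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(partials):
    """Master step: combine the workers' intermediate results into one output."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_master(lines, n_workers=4):
    """Split the input, hand each chunk to a worker, then merge the results."""
    chunks = split_input(lines, n_workers)
    # Threads stand in for worker nodes; a real deployment would distribute chunks.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return merge_counts(partials)


if __name__ == "__main__":
    sample = ["big data big", "data analytics"]
    print(run_master(sample, n_workers=2))
```

Because each worker returns an independent intermediate result, a failed chunk could in principle be reassigned and recomputed without restarting the whole job. The sketch does not implement such retries, and it omits the continuous text query indexing described at the end of the abstract.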
Big data and management
by George, Gerry; Haas, Martine; Pentland, Alex
Big data is everywhere. In recent years, there has been an increasing emphasis on big data, business analytics, and "smart" living and work environments. Though these conversations are predominantly practice driven, organizations are exploring how large-volume data can usefully be deployed to create and capture value for individuals, businesses, communities, and governments. Whether it is machine learning and web analytics to predict individual action, consumer choice, search behavior, traffic patterns, or disease outbreaks, big data is fast becoming a tool that not only analyzes patterns, but can also provide the predictive likelihood of an event. Organizations have jumped on this bandwagon of using ever-increasing volumes of data, often in tera- or petabytes' worth of storage capacity, to better predict outcomes with greater precision. In current information technology infrastructures, the provision of services such as network connectivity is usually associated with a Service Level Agreement defining the nature and quality of the service to be provided.