Microsoft Azure HDInsight is Microsoft’s 100 percent compliant distribution of Apache Hadoop on Microsoft Azure. This means that standard Hadoop concepts and technologies apply, so learning the Hadoop stack helps you learn the HDInsight service. At the time of this writing, HDInsight (version 3.0) uses Hadoop version 2.2 and Hortonworks Data Platform 2.0.
In Introducing Microsoft Azure HDInsight, we cover what big data really means, how you can use it to your advantage in your company or organization, and one service that lets you do that quickly: Microsoft’s HDInsight. We start with an overview of big data and Hadoop, but this book is not only about concepts. We want you to jump in and get your hands dirty working with HDInsight in a practical way. To help you learn and even implement HDInsight right away, we focus on a specific use case that applies to almost any organization and walk through a process that you can follow step by step.
We also help you learn more. In the last chapter, we look ahead at the future of HDInsight and give you recommendations for self-learning so that you can dive deeper into important concepts and round out your education on working with big data.
Who should read this book
This book is intended to help database and business intelligence (BI) professionals, programmers, Hadoop administrators, researchers, technical architects, operations engineers, data analysts, and data scientists understand the core concepts of HDInsight and related technologies. It is especially useful for those looking to deploy their first data cluster and run MapReduce jobs to discover insights and for those trying to figure out how HDInsight fits into their technology infrastructure.
Many readers will have no prior experience with HDInsight, but some familiarity with earlier versions of HDInsight, or with Apache Hadoop and the MapReduce framework, will provide a solid base for using this book. Introducing Microsoft Azure HDInsight assumes that you have experience with web technology, programming on Windows machines, and basic data analysis principles and practices, as well as an understanding of Microsoft Azure cloud technology.
Who should not read this book
Not every book is aimed at every possible audience. This book is not intended for data mining engineers.