Microsoft Azure Data Fundamentals Exam Ref DP-900 (Microsoft Press)
Chapters 3 & 4; Conclusion

 

Chapter 3: Describe how to work with non-relational data on Azure

This chapter follows the same general structure as the previous one, but now in terms of non-relational (or NoSQL) systems. Non-relational is an umbrella term for various disparate systems that are only united by not being relational.

The chapter opens with a look at non-relational data workloads, briefly examining non-adherence to the ACID principles, complexity of data, scalability, and the lack of a standard interface across this disparate grouping. Next, some of the more popular non-relational data systems are examined, including: 

  • Key-value store – a hash table or dictionary

  • Document store – similar to key-value but the value is a document

  • Columnar data store – like a giant spreadsheet in which related columns are grouped together and different rows can hold different sets of columns

  • Graph store – used to represent relationships (edges) between entities (nodes) 

In each case, the data store is briefly described, together with a simple example usage. This is followed by a useful, but much too brief, section on how to choose the correct type of data store.
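To make the four store types listed above a little more concrete, here is a minimal sketch that models each one with plain Python structures. It is not taken from the book and is not how any Azure service is implemented; all names and sample data (`documents`, `customers_in`, `neighbours`, and so on) are hypothetical and chosen purely for illustration.

```python
# Illustrative only: plain Python structures standing in for each store type.
# All names and sample data are hypothetical.

# Key-value store: an opaque value looked up by a unique key.
key_value = {"session:42": b"\x01\x02\x03"}

# Document store: the value is a structured document that can be queried by field.
documents = {
    "cust-1": {"name": "Ada", "city": "London", "orders": [101, 102]},
    "cust-2": {"name": "Lin", "city": "Oslo", "orders": []},
}

# Columnar (column-family) store: each row key holds named groups of columns,
# and rows need not all have the same groups.
column_families = {
    "cust-1": {"identity": {"name": "Ada"}, "contact": {"email": "ada@example.com"}},
    "cust-2": {"identity": {"name": "Lin"}},
}

# Graph store: entities (nodes) plus labelled relationships (edges).
nodes = {"ada", "lin", "acme"}
edges = [
    ("ada", "works_for", "acme"),
    ("lin", "works_for", "acme"),
    ("ada", "knows", "lin"),
]


def customers_in(city):
    """Document-style query: filter documents on a field inside the value."""
    return sorted(key for key, doc in documents.items() if doc["city"] == city)


def neighbours(node, label):
    """Graph-style query: follow edges with a given label out of a node."""
    return sorted(dst for src, lbl, dst in edges if src == node and lbl == label)
```

The point of the sketch is the difference in what you can ask: a key-value store only answers "what is stored under this key?", whereas the document and graph models let you query on fields and relationships respectively.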

Having looked at non-relational systems in general, the chapter now moves on to look at what Azure can provide, including: 

  • Azure Cosmos DB – this can host any of the non-relational store types listed above 

  • Azure Blob Storage – useful for high volume unstructured data

  • Azure File Storage – useful as network shares distributed and replicated to different locations   

In each case, the storage is described with useful examples, discussions, and screenshots/tables.

The next section takes a look at various basic management tasks you’re likely to want to perform on your non-relational database systems, including: 

  • Provisioning and deploying your non-relational databases 

  • Deployment using ARM templates (for scalability and consistency), Azure portal, CLI, etc.

  • Security components – including encryption, authentication, firewalls

  • Identifying and fixing connectivity issues – e.g. on-premises, VNets, firewalls

  • Useful tools – e.g. Azure Data Explorer, AzCopy, Cosmos Explorer 

This chapter covers a VERY big topic, and I think a book such as this one can only cover the very basics of these disparate systems.

Chapter 4: Describe an analytics workload on Azure

Having looked at relational and non-relational databases/storage, the book next looks at analytics workloads. These typically involve very large amounts of data (think data warehouse or bigger). The chapter takes a look at what analytics workloads are, and covers: 

  • Differences between transactional and analytics workloads 

  • Batch and real-time processing 

  • Data warehouse workload  

  • When is a data warehouse solution appropriate  

Some of these topics build on or repeat what’s been said in previous chapters.

Next, there’s a look at what constitutes a modern data warehouse. This covers:  

  • Azure HDInsight – cloud distribution of Hadoop, Spark, etc.

  • Azure Databricks – essentially Spark optimized for Azure

  • Azure Synapse Analytics – Microsoft’s Azure data warehouse 

In discussing these components, there’s a brief explanation of Hadoop (and its many components), HDFS, MapReduce (batch) processing, in-memory processing, head and worker nodes, and various tools (e.g. IntelliJ). Some practical walkthroughs are provided (e.g. setting up an HDInsight Hadoop cluster).
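For readers who haven't met MapReduce before, the idea can be sketched in a few lines of single-process Python. This is an assumption-laden toy, not Hadoop's API and not an example from the book: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would spread the map and reduce work across worker nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over a batch of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling `word_count(["big data", "Big cluster"])` counts "big" twice and the other words once. The batch nature is visible here: nothing is returned until the whole input has been mapped, grouped, and reduced.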

The chapter then looks at the various ways of loading and processing data, covering: 

  • Azure Data Factory components – (i.e. pipeline, activity, dataset, linked service, runtime)

  • Processing options – using HDInsight, Databricks, and Azure Synapse Analytics

  • Common practices – examples provided based on PolyBase and Azure Synapse Analytics 

If you’re familiar with Microsoft’s SSIS tool for data loading and processing, then you’ll find Azure Data Factory (ADF) is very similar. Useful discussions and examples are provided.

The chapter ends with a look at data visualization using Power BI. This is Microsoft’s preferred tool for quickly creating impressive interactive/static reports and dashboards. It can be used with various data sources, and examples are provided.

This is a big topic with many components. The book makes a valiant attempt at explaining them, but it really is just a brief introduction to these many disparate parts. Useful links are provided for further information.

Conclusion 

This book aims to introduce Azure data services and their use with different types of data and workloads, and mostly succeeds.

The book is generally easy to read, with useful discussions, diagrams, tables, and helpful exam tips throughout. There’s a good flow between the topics. The chapter summaries are very useful.

The book covers the exam syllabus, giving suitable examples for each subtopic. However, this is a big topic, especially when considering the non-relational systems – here only a glancing introduction can be provided. Luckily, links are provided for further information.

While the book is ‘introductory’, it covers a wide area of differing technologies. The more you already know about the topics (e.g. Big Data) the easier it is to understand the book.

It might be argued that some of the chapters assume too much prior knowledge. For example, there’s an implied understanding that you’re aware of Spark, Hadoop, subnets, etc.

Will the book help you pass the exam? Well, it does cover all the expected topics. However, I think you will need to explore the included links, and gain some practical experience, to pass the exam.

If you visit Microsoft's Virtual Training Days (https://www.microsoft.com/en-us/trainingdays) and register to watch the free Data Fundamentals course, a few days after completing it, you’ll get a voucher to take this exam for free. Additionally, there is a free online course provided by Microsoft: 
https://docs.microsoft.com/en-us/learn/certifications/exams/dp-900 

My top tip for the exam: if you see a question about graph databases, think Gremlin API.

Overall, a useful book for helping you understand the exam’s topics.




Last Updated ( Tuesday, 07 September 2021 )