Microsoft Azure Data Fundamentals Exam Ref DP-900 (Microsoft Press)

Chapter 3: Describe how to work with non-relational data on Azure

This chapter follows the same general structure as the previous one, but now in terms of non-relational (or NoSQL) systems. Non-relational is an umbrella term for various disparate systems that are only united by not being relational.

The chapter opens with a look at non-relational data workloads, briefly examining non-adherence to the ACID principles, complexity of data, scalability, and the lack of a standard interface across this disparate grouping. Next, some of the more popular non-relational data systems are examined, including:

Key-value store – a hash table or dictionary
Document store – similar to key-value but the value is a document
Columnar data store – like having a giant spreadsheet where different rows and columns store different tables of data
Graph store – used to represent relationships (edges) between entities (nodes)

In each case, the data store is briefly described, together with a simple example usage. This is followed by a useful, but much too brief, section on how to choose the correct type of data store.
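To make the four models concrete, here is a minimal in-memory sketch using plain Python structures. It is not taken from the book and does not use any Azure API; the sample data and the helper names `find_by_field` and `neighbours` are assumptions chosen purely for illustration.

```python
# Key-value store: an opaque value looked up by a unique key.
key_value = {"user:42": b"serialized-session-data"}

# Document store: like key-value, but the value is a structured document
# whose fields can be queried. Documents need not share a schema.
documents = {
    "user:42": {"name": "Ada", "city": "London", "orders": [{"id": 1}]},
    "user:43": {"name": "Grace", "city": "New York"},
}

# Columnar (column-family) store: each row key maps to named groups of
# columns, and different rows may populate different columns.
column_family = {
    "user:42": {"profile": {"name": "Ada"}, "activity": {"last_login": "2021-09-01"}},
    "user:43": {"profile": {"name": "Grace"}},
}

# Graph store: entities (nodes) and the relationships (edges) between them.
nodes = {"ada": "person", "grace": "person", "azure": "product"}
edges = [("ada", "knows", "grace"), ("ada", "uses", "azure")]


def find_by_field(docs, field, value):
    """Return the keys of documents whose given field equals value."""
    return sorted(key for key, doc in docs.items() if doc.get(field) == value)


def neighbours(edge_list, node, relation):
    """Return the nodes reachable from node over edges with the given relation."""
    return [dst for src, rel, dst in edge_list if src == node and rel == relation]
```

The practical difference shows up in the queries: the key-value store can only be read by key, the document store can be filtered by field (`find_by_field(documents, "city", "London")`), and the graph store is traversed along relationships (`neighbours(edges, "ada", "knows")`).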

Having looked at non-relational systems in general, the chapter now moves on to look at what Azure can provide, including:

Azure Cosmos DB – this can act as any of the non-relational store types given above
Azure Blob Storage – useful for high volume unstructured data
Azure File Storage – useful as network shares distributed and replicated to different locations

In each case, the storage is described with useful examples, discussions, and screenshots/tables.

The next section takes a look at various basic management tasks you’re likely to want to perform on your non-relational database systems, including:

Provisioning and deploying your non-relational databases
Deployment using ARM templates (for scalability and consistency), Azure portal, CLI, etc.
Security components – including encryption, authentication, firewalls
Identifying and fixing connectivity issues – e.g. on-premises, VNets, firewalls
Useful tools – e.g. Azure Data Explorer, AzCopy, Cosmos Explorer

This chapter covers a VERY big topic, and I think a book such as this one can only cover the very basics of these disparate systems.

Chapter 4: Describe an analytics workload on Azure

Having looked at relational and non-relational databases/storage, the book next looks at analytics workloads. These typically involve very large amounts of data (think data warehouse or bigger). The chapter takes a look at what analytics workloads are, and covers:

Differences between transactional and analytics workloads
Batch and real-time processing
Data warehouse workload
When a data warehouse solution is appropriate

Some of these topics build on or repeat what’s been said in previous chapters.

Next, there’s a look at what constitutes a modern data warehouse. This covers:

Azure HDInsight – cloud distribution of Hadoop, Spark, etc
Azure Databricks – essentially Spark optimized for Azure
Azure Synapse Analytics – Microsoft’s Azure data warehouse

In discussing these components, there’s a brief explanation of Hadoop (and its many components), HDFS, MapReduce (batch) processing, in-memory processing, head and worker nodes, and various tools (e.g. IntelliJ). Some practical walkthroughs are provided (e.g. setting up an HDInsight Hadoop cluster).
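For readers new to MapReduce, the idea can be sketched in a few lines of single-process Python. This is the classic word-count illustration, not code from the book and not how Hadoop is invoked; the function names are assumptions, and a real cluster would run the map and reduce steps in parallel across worker nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a batch of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Because each line is mapped independently and each key is reduced independently, the work can be split across many machines, which is what makes the model suit large batch workloads.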

The chapter then looks at the various ways of loading and processing data, covering:

Azure Data Factory components – (i.e. pipeline, activity, dataset, linked service, runtime)
Processing options – using HDInsight, Databricks, and Azure Synapse Analytics
Common practices – examples provided based on PolyBase and Azure Synapse Analytics

If you’re familiar with Microsoft’s SSIS tool for data loading and processing, then you’ll find Azure Data Factory (ADF) is very similar. Useful discussions and examples are provided.
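How the Data Factory components relate to each other can be sketched as a small conceptual model. This is deliberately not the Azure SDK: every class, field, and value below is a hypothetical stand-in, intended only to show that a pipeline groups activities, an activity reads and writes datasets, a dataset points at data through a linked service, and a runtime supplies the compute.

```python
from dataclasses import dataclass, field


@dataclass
class LinkedService:
    """Connection details for an external store (hypothetical model)."""
    name: str
    connection: str


@dataclass
class Dataset:
    """A named reference to data that lives behind a linked service."""
    name: str
    linked_service: LinkedService
    path: str


@dataclass
class CopyActivity:
    """One processing step: move data from a source dataset to a sink dataset."""
    name: str
    source: Dataset
    sink: Dataset


@dataclass
class Pipeline:
    """A logical grouping of activities that are run together."""
    name: str
    activities: list = field(default_factory=list)

    def run(self, runtime="example-runtime"):
        # The runtime stands in for the compute that executes each activity;
        # here we only describe what would happen rather than move any data.
        return [
            f"[{runtime}] {a.name}: {a.source.linked_service.name}/{a.source.path}"
            f" -> {a.sink.linked_service.name}/{a.sink.path}"
            for a in self.activities
        ]
```

A usage sketch under the same assumptions:

```python
blob = LinkedService("blob", "hypothetical-blob-connection")
warehouse = LinkedService("warehouse", "hypothetical-warehouse-connection")
copy = CopyActivity(
    "copy_sales",
    Dataset("raw_sales", blob, "sales.csv"),
    Dataset("sales_table", warehouse, "dbo.Sales"),
)
print(Pipeline("load_sales", [copy]).run())
```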

The chapter ends with a look at data visualization using Power BI. This is Microsoft’s preferred tool for quickly creating impressive interactive/static reports and dashboards. It can be used with various data sources, and examples are provided.

This is a big topic with many components. The book makes a valiant attempt at explaining them, but it really is just a brief introduction to these many disparate parts. Useful links are provided for further information.

Conclusion

This book aims to introduce Azure data services and their use with different types of data and workloads, and mostly succeeds.

The book is generally easy to read, with useful discussions, diagrams, tables, and helpful exam tips throughout. There’s a good flow between the topics. The chapter summaries are very useful.

The book covers the exam syllabus, giving suitable examples for each subtopic. However, this is a big topic, especially when considering the non-relational systems – here only a glancing introduction can be provided. Luckily, links are provided for further information.

While the book is ‘introductory’, it covers a wide area of differing technologies. The more you already know about the topics (e.g. Big Data) the easier it is to understand the book.

It might be argued that some of the chapters assume too much prior knowledge. For example, there’s an implied understanding that you’re aware of Spark, Hadoop, subnets, etc.

Will the book help you pass the exam? Well, it does cover all the expected topics. However, I think you will need to explore the included links, and gain some practical experience, to pass the exam.

If you visit Microsoft's Virtual Training Days (https://www.microsoft.com/en-us/trainingdays) and register to watch the free Data Fundamentals course, a few days after completing it, you’ll get a voucher to take this exam for free. Additionally, there is a free online course provided by Microsoft:
https://docs.microsoft.com/en-us/learn/certifications/exams/dp-900

My top tip for the exam: if you see a question about graph databases, think Gremlin API.

Overall, a useful book for helping you understand the exam’s topics.

Last Updated ( Tuesday, 07 September 2021 )