I passed the AWS Certified Data Analytics Specialty exam in July 2021. This is Amazon's data engineering certification, and it's the closest thing they have to the Google Cloud Professional Data Engineer Certification Exam. This is an excellent exam for those who work in AWS data engineering or want to become data engineers.
Compared with more popular certifications, such as the two AWS solutions architect exams, this one is harder to prepare for because study resources are limited. Here's how I dealt with it.
Tip #1: The AWS Maniac Guide
I suggest starting your preparation here:
The AWS Certified Data Analytics Specialty Exam: An Unofficial Guide
Wojciech Gawronski deserves credit for summarizing the exam content and compiling materials. If anything, the guide encourages you to over-prepare. That isn't necessarily a bad thing, because what you learn while preparing may prove more valuable in the long run than the certification itself.
Tip #2: Do You Need to Know Hadoop?
The AWS Certified Data Analytics Specialty evolved from the earlier AWS Certified Big Data Specialty. According to Gawronski and others, one big shift is that the new exam places much less emphasis on detailed Hadoop engineering. The previous exam had questions on hdfs CLI commands and other low-level topics. Those questions are no longer asked. Test takers must still be familiar with the Hadoop ecosystem, particularly as it pertains to Amazon EMR (Elastic MapReduce). You should know the differences between HDFS and EMRFS and how to choose between the two in a given situation. You should also be familiar with the Hive Metastore and how it differs from the AWS Glue Data Catalog.
I didn't come across any questions about Apache Impala or Pig. You'll benefit from knowing Hive and its capabilities, but you won't need the low-level setup and architecture details. Unsurprisingly, there is an emphasis on Spark. Fundamental Spark abstractions, such as DataFrames, are useful to know, as you'll see in the official AWS practice materials.
Tip #3: AWS Official Practice Questions
The official AWS practice questions and practice exam were invaluable to me. On the AWS exam site, you can find a basic set of practice questions. When you register for the certification exam, you can purchase a practice test for $40 through your AWS CertMetrics account. If you have already passed an AWS certification exam, you may also have a coupon for a free practice exam. Don't skimp on the practice exam; it's well worth the price. The questions are very similar to the ones you can get for free, but more is better, especially because these are the closest you'll get to the real thing.
Work through the free questions and the practice exam under timed conditions, then go over each question again to make sure you understand it completely. I recommend reading up on everything mentioned in each question and its answer choices. In both topic and style, these questions are fairly representative of what will be on the actual exam. Walk through the logical process of eliminating choices and selecting a final answer several times.
You can also use AWS Certified Data Analytics Specialty practice questions from Study4Exam to evaluate your preparation. These questions often follow the pattern of the official exam, they emphasize the material that matters most, and their answers are accurate. They are useful for guiding your study and are best used after your preparation to assess yourself.
Tip #4: How to Pick the Right Answers
Here are some techniques I use to find the correct answer, given the style of the questions. Instead of reading the multiple-choice answers in order, learn to jump around and look for similarities. Frequently, a substantial piece of text in two answers will be identical word for word. Skim quickly between answers to spot the similarities, then look at the differences. Is there anything in one of the answers that is technically incorrect? Is it a question of following best practices? Pay close attention to the requirements in the question statement. Is it evident from the question that you should prioritize low cost? Low operational costs? Latency?
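The compare-and-contrast approach described above can be illustrated with a short Python sketch. It is only a rough illustration, not a study tool: the answer choices are made up for this example, and the helper names most_similar_pair and word_differences are hypothetical. It uses the standard library's difflib to find the two answers that share the most text and then prints the words that differ between them.

```python
from difflib import SequenceMatcher
from itertools import combinations


def most_similar_pair(options):
    """Return (key_a, key_b, ratio) for the two options sharing the most words."""
    best = None
    for (key_a, text_a), (key_b, text_b) in combinations(options.items(), 2):
        ratio = SequenceMatcher(None, text_a.split(), text_b.split()).ratio()
        if best is None or ratio > best[2]:
            best = (key_a, key_b, ratio)
    return best


def word_differences(text_a, text_b):
    """Return the runs of words that differ between two answer texts."""
    a, b = text_a.split(), text_b.split()
    diffs = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag != "equal":
            diffs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return diffs


# Made-up answer choices, written only for this illustration.
options = {
    "A": "Use AWS Glue to convert the data to Parquet and store it in Amazon S3",
    "B": "Use AWS Glue to convert the data to CSV and store it in Amazon S3",
    "C": "Run a Spark job on Amazon EMR and keep the output in HDFS",
    "D": "Load the raw files into Amazon Redshift with the COPY command",
}

first, second, _ = most_similar_pair(options)
differences = word_differences(options[first], options[second])
print(first, second, differences)
```

Once the near-identical pair is isolated, the decision usually comes down to the one or two words that differ, which is where the question's requirements (cost, operations, latency) come in.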
Tip #5: Be Aware of AWS-Specific Cloud Best Practices
Also, while the questions may appear to be subjective, they are subjective in a very specific way. If you're familiar with the AWS culture, you'll be able to spot the recommended solution. When choosing between AWS Glue and Spark on EMR, for example, Glue is almost always the better option unless a specific requirement rules it out; in general, choose managed services. Other aspects of AWS best practices and culture can be learned from blog posts, the AWS Well-Architected Framework, whitepapers, and general AWS exposure.
Furthermore, you should be aware that best practices and culture vary widely between cloud providers. Following GCP best practices may lead you to answer questions incorrectly on the AWS exam. If you routinely work across several clouds, this can be a problem; make sure you're in the right mindset before taking the exam.
I should also mention that while you are not required to obtain another AWS certification before taking this exam, it may be beneficial. Passing the AWS Certified Solutions Architect Associate exam requires a reasonable understanding of fundamental AWS services used in data engineering, such as S3. You should also be familiar with basic cloud concepts such as Availability Zones and Regions. However, I found the exam light on several key architecture topics, such as networking.
Tip #6: Read the Frequently Asked Questions
This piece of advice comes from the AWS Maniac guide, but it's worth mentioning separately because it's so useful: read the AWS service FAQs listed in the guide. They are a wealth of targeted technical material in a highly distilled form; they pull out the details that help solve architecture problems, which are exactly the kinds of problems you'll encounter on the exam. The FAQs also include specification details, which you should memorize. (The Amazon Kinesis Data Streams FAQs, for example, state that messages are limited to 1 MB. Can you think of a practice question where this might be useful?)
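To make that Kinesis limit concrete, here is a minimal Python sketch of the kind of size check a producer might run before sending a message. It is based on assumptions: the constant uses the 1 MB figure quoted above, read conservatively as 1,000,000 bytes, and the helper names fits_in_one_record and split_payload are hypothetical. Nothing here calls AWS.

```python
from typing import List

# Assumption: the 1 MB figure from the FAQ, read conservatively as 10**6 bytes.
MAX_RECORD_BYTES = 1_000_000


def fits_in_one_record(payload: bytes, limit: int = MAX_RECORD_BYTES) -> bool:
    """Return True if the payload is within the per-record size limit."""
    return len(payload) <= limit


def split_payload(payload: bytes, limit: int = MAX_RECORD_BYTES) -> List[bytes]:
    """Split a payload into consecutive chunks of at most `limit` bytes."""
    return [payload[i:i + limit] for i in range(0, len(payload), limit)]


# A made-up 2.5 MB message, too large for a single record.
message = b"x" * 2_500_000
chunks = split_payload(message)
print(fits_in_one_record(message), [len(chunk) for chunk in chunks])
```

Splitting like this only helps if the consumer knows how to reassemble the chunks in order, so a question that mentions large messages is usually probing whether you noticed the limit at all.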
One word of caution: these FAQs appear to be almost append-only. Information about recent developments (EMR Studio, EMR on EKS) is added regularly, but some of the content is badly out of date. (Did you know that Twitter Storm is a new framework? Really?) You won't be misled by the answers, but you can skim over information that isn't as relevant to the current ecosystem.
Tip #7: Advice for Remote Testing
I also have a few comments about remote testing, which Gawronski covers as well. The environment at testing centers, I've discovered, can be dreadful. When I was taking an AWS certification exam on a sweltering summer day, the air conditioning broke down. It was not enjoyable.
I like the convenience of remote testing at home and the option to test in a relaxed setting. However, I had to worry constantly that noise from my corridor would cause an exam failure, and the Pearson app occasionally reported a bad internet connection, which is one of the most common causes of exam revocation. Furthermore, when working through problems, I have a habit of quietly talking to myself, and the proctor threatened to fail me for it. I've never had a similar reprimand at a testing center.
Finally, You've Got This!
Okay, thanks for taking the time to read this! You've got this! If you take the exam, please send me an email to tell me about your experience, your thoughts on the content and its utility, and so on.