Introduction
The AWS Certified Big Data – Specialty (BDS-C00) examination is intended for individuals who perform complex Big Data analyses. This exam validates an examinee’s technical skills and experience in designing and implementing AWS services to derive value from data.

It validates an examinee’s ability to:
* Implement core AWS Big Data services according to basic architectural best practices
* Design and maintain Big Data
* Leverage tools to automate Data Analysis

Examination Prerequisite

In order to take this examination, you must hold an AWS Associate-level certification (AWS Certified Solutions Architect – Associate, AWS Certified Developer – Associate, or AWS Certified SysOps Administrator – Associate) or a valid AWS Certified Cloud Practitioner certification in good standing.

Recommended AWS Knowledge
* A minimum of 2 years’ experience using AWS technology
* Familiarity with AWS security best practices
* The ability to independently define AWS architecture and services, and to understand how they integrate with each other
* The ability to define and architect AWS big data services and explain how they fit into the data lifecycle of collection, ingestion, storage, processing, and visualization

Recommended General IT Knowledge
* At least 5 years’ experience in a data analytics field
* Understand how to control access to secure data
* Understand the frameworks that underpin large scale distributed systems like Hadoop/Spark and MPP data warehouses
* Understand the tools and design platforms that allow processing of data from multiple heterogeneous sources with different frequencies (batch/real-time)
* Capable of designing a scalable and cost-effective architecture to process data

Exam Preparation
These training courses and materials may be helpful for examination preparation:

AWS Training (aws.amazon.com/training)
* Big Data Technology Fundamentals
* Big Data on AWS

AWS Whitepapers (aws.amazon.com/whitepapers), available in Kindle and PDF formats
* AWS Cloud Computing Whitepapers, specifically the Database and Analytics whitepapers
* AWS Documentation (aws.amazon.com/documentation)

Exam Content
Response Types

There are two types of questions on the examination:
* Multiple-choice: Has one correct response and three or four incorrect responses (distractors).
* Multiple-response: Has two or more correct responses out of five or more options.

Select one or more responses that best complete the statement or answer the question. Distractors, or incorrect answers, are response options that an examinee with incomplete knowledge or skill would likely choose. However, they are generally plausible responses that fit in the content area defined by the test objective.

Unanswered questions are scored as incorrect; there is no penalty for guessing.

Unscored Content
Your examination may include unscored items that are placed on the test to gather statistical information. These items are not identified on the form and do not affect your score.
Exam Results

The AWS Certified Big Data – Specialty (BDS-C00) examination is a pass or fail exam. The examination is scored against a minimum standard established by AWS professionals who are guided by certification industry best practices and guidelines.

Your score report contains a table of classifications of your performance at each section level. This information is designed to provide general feedback concerning your examination performance. The examination uses a compensatory scoring model, which means that you do not need to “pass” the individual sections, only the overall examination. Each section of the examination has a specific weighting, so some sections have more questions than others. The table contains general information, highlighting your strengths and weaknesses. Exercise caution when interpreting section-level feedback.

Content Outline
This exam guide includes weightings, test domains, and objectives only. It is not a comprehensive listing of the content on this examination. The table below lists the main content domains and their weightings.
Domain 1: Collection 17%
Domain 2: Storage 17%
Domain 3: Processing 17%
Domain 4: Analysis 17%
Domain 5: Visualization 12%
Domain 6: Data Security 20%

Domain 1: Collection
1.1 Determine the operational characteristics of the collection system
1.2 Select a collection system that handles the frequency of data change and type of data being ingested
1.3 Identify the properties that need to be enforced by the collection system: order, data structure, metadata, etc.
1.4 Explain the durability and availability characteristics for the collection approach

Domain 2: Storage

2.1 Determine and optimize the operational characteristics of the storage solution
2.2 Determine data access and retrieval patterns
2.3 Evaluate mechanisms for capture, update, and retrieval of catalog entries
2.4 Determine appropriate data structure and storage format

Domain 3: Processing
3.1 Identify the appropriate data processing technology for a given scenario
3.2 Determine how to design and architect the data processing solution
3.3 Determine the operational characteristics of the solution implemented

Domain 4: Analysis

4.1 Determine the tools and techniques required for analysis
4.2 Determine how to design and architect the analytical solution
4.3 Determine and optimize the operational characteristics of the Analysis

Domain 5: Visualization
5.1 Determine the appropriate techniques for delivering the results/output
5.2 Determine how to design and create the Visualization platform
5.3 Determine and optimize the operational characteristics of the Visualization system

Domain 6: Data Security
6.1 Determine encryption requirements and/or implementation technologies
6.2 Choose the appropriate technology to enforce data governance
6.3 Identify how to ensure data integrity
6.4 Evaluate regulatory requirements

QUESTION 1
A company collects temperature, humidity, and atmospheric pressure data in cities across multiple continents. The average volume of data collected per site each day is 500 GB. Each site has a high-speed internet connection. The company’s weather forecasting applications are based in a single Region and analyze the data daily.
What is the FASTEST way to aggregate data from all of these global sites?

A. Enable Amazon S3 Transfer Acceleration on the destination bucket. Use multipart uploads to directly upload site data to the destination bucket.
B. Upload site data to an Amazon S3 bucket in the closest AWS Region. Use S3 cross-Region replication to copy objects to the destination bucket.
C. Schedule AWS Snowball jobs daily to transfer data to the closest AWS Region. Use S3 cross-Region replication to copy objects to the destination bucket.
D. Upload the data to an Amazon EC2 instance in the closest Region. Store the data in an Amazon Elastic Block Store (Amazon EBS) volume. Once a day take an EBS snapshot and copy it to the centralized Region. Restore the EBS volume in the centralized Region and run an analysis on the data daily.

Answer: A

Explanation:
You might want to use Transfer Acceleration on a bucket for various reasons, including the following:
You have customers that upload to a centralized bucket from all over the world.
You transfer gigabytes to terabytes of data on a regular basis across continents.
You are unable to utilize all of your available bandwidth over the Internet when uploading to Amazon S3.

“Amazon S3 Transfer Acceleration can speed up content transfers to and from Amazon S3 by as much
as 50-500% for long-distance transfer of larger objects. Customers who have either web or mobile
applications with widespread users or applications hosted far away from their S3 bucket can
experience long and variable upload and download speeds over the Internet”
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html
“Improved throughput – You can upload parts in parallel to improve throughput.”
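The multipart sizing in option A has a constraint worth knowing: S3 allows at most 10,000 parts per multipart upload, and every part except the last must be at least 5 MiB. A plain-Python sketch of the part-size arithmetic for one site's 500 GB daily upload (no AWS calls; the 500 GB figure comes from the question):

```python
import math

# S3 multipart upload limits (per the Amazon S3 documentation):
MAX_PARTS = 10_000            # at most 10,000 parts per upload
MIN_PART_SIZE = 5 * 1024**2   # 5 MiB minimum for every part except the last

def choose_part_size(object_size: int) -> int:
    """Smallest part size (in bytes) that keeps an upload within MAX_PARTS parts."""
    return max(math.ceil(object_size / MAX_PARTS), MIN_PART_SIZE)

site_daily = 500 * 1000**3    # one site's 500 GB daily volume
part_size = choose_part_size(site_daily)
parts = math.ceil(site_daily / part_size)
print(part_size, parts)       # part size keeps the count at or under 10,000
```

In practice the AWS SDKs pick a part size automatically through their transfer configuration, but the 10,000-part ceiling is why very large objects need proportionally larger parts.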

QUESTION 2
A company needs the ability to analyze the log files of its proprietary application. The logs are stored in JSON format in an Amazon S3 bucket. Queries will be simple and will run on demand. A solutions architect needs to perform the analysis with minimal changes to the existing architecture.
What should the solutions architect do to meet these requirements with the LEAST amount of operational overhead?

A. Use Amazon Redshift to load all the content into one place and run the SQL queries as needed
B. Use Amazon CloudWatch Logs to store the logs Run SQL queries as needed from the Amazon CloudWatch console
C. Use Amazon Athena directly with Amazon S3 to run the queries as needed
D. Use AWS Glue to catalog the logs Use a transient Apache Spark cluster on Amazon EMR to run the SQL queries as needed

Answer: C

Explanation:
Amazon Athena is a serverless, interactive query service that runs standard SQL directly against data in Amazon S3, including JSON. Because the logs stay in S3 and there is no infrastructure to provision or manage, it meets the requirement with the least operational overhead.
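To make the Athena approach concrete, here is a sketch that builds the DDL for an external table over JSON logs in S3. The table name, bucket, and field names are hypothetical examples, not from the question; the SerDe shown is one of the JSON SerDes Athena documents:

```python
def athena_json_table_ddl(table: str, bucket: str, prefix: str) -> str:
    """Build CREATE EXTERNAL TABLE DDL for querying JSON logs in place in S3.

    The field names below are hypothetical; adjust them to the real log schema.
    """
    return f"""
CREATE EXTERNAL TABLE IF NOT EXISTS {table} (
  request_id string,
  level      string,
  message    string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://{bucket}/{prefix}/'
""".strip()

ddl = athena_json_table_ddl("app_logs", "my-log-bucket", "logs")
print(ddl)
```

Once the table exists, ad hoc queries such as `SELECT level, count(*) FROM app_logs GROUP BY level` run on demand with no cluster to manage.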

QUESTION 3
A company uses AWS Organizations to manage multiple AWS accounts for different departments.
The management account has an Amazon S3 bucket that contains project reports. The company
wants to limit access to this S3 bucket to only users of accounts within the organization in AWS Organizations.
Which solution meets these requirements with the LEAST amount of operational overhead?

A. Add the aws:PrincipalOrgID global condition key with a reference to the organization ID to the S3 bucket policy.
B. Create an organizational unit (OU) for each department. Add the aws:PrincipalOrgPaths global condition key to the S3 bucket policy.
C. Use AWS CloudTrail to monitor the CreateAccount, InviteAccountToOrganization, LeaveOrganization, and RemoveAccountFromOrganization events. Update the S3 bucket policy accordingly.
D. Tag each user that needs access to the S3 bucket. Add the aws:PrincipalTag global condition key to the S3 bucket policy.

Answer: A
Explanation:
The aws:PrincipalOrgID global condition key provides an alternative to listing all the account IDs for all AWS accounts in an organization. For example, the following Amazon S3 bucket policy allows members of any account in the XXX organization to add an object to the examtopics bucket.

{
  "Version": "2012-10-17",
  "Statement": {
    "Sid": "AllowPutObject",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::examtopics/*",
    "Condition": {
      "StringEquals": {"aws:PrincipalOrgID": ["XXX"]}
    }
  }
}
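The same policy can also be generated programmatically, which is handy when applying it to many buckets. A sketch (the organization ID below is a placeholder, not a real value):

```python
import json

def org_restricted_put_policy(bucket: str, org_id: str) -> dict:
    """S3 bucket policy allowing s3:PutObject only to principals in one organization."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowPutObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # aws:PrincipalOrgID matches any principal from any account in the org
            "Condition": {"StringEquals": {"aws:PrincipalOrgID": [org_id]}},
        }],
    }

policy = org_restricted_put_policy("examtopics", "o-exampleorgid")  # placeholder org ID
print(json.dumps(policy, indent=2))
```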

QUESTION 4
An application runs on an Amazon EC2 instance in a VPC. The application processes logs that are
stored in an Amazon S3 bucket. The EC2 instance needs to access the S3 bucket without connectivity to the internet.
Which solution will provide private network connectivity to Amazon S3?

A. Create a gateway VPC endpoint to the S3 bucket.
B. Stream the logs to Amazon CloudWatch Logs. Export the logs to the S3 bucket.
C. Create an instance profile on Amazon EC2 to allow S3 access.
D. Create an Amazon API Gateway API with a private link to access the S3 endpoint.

Answer: A

Explanation:
A gateway VPC endpoint for Amazon S3 lets the EC2 instance reach the bucket over the AWS private network, with no internet gateway, NAT device, or public IP address required.

QUESTION 5
A company is hosting a web application on AWS using a single Amazon EC2 instance that stores user-uploaded documents in an Amazon EBS volume. For better scalability and availability, the company duplicated the architecture and created a second EC2 instance and EBS volume in another Availability Zone, placing both behind an Application Load Balancer. After completing this change, users reported that, each time they refreshed the website, they could see one subset of their documents or the other, but never all of the documents at the same time.
What should a solutions architect propose to ensure users see all of their documents at once?

A. Copy the data so both EBS volumes contain all the documents.
B. Configure the Application Load Balancer to direct a user to the server with the documents.
C. Copy the data from both EBS volumes to Amazon EFS. Modify the application to save new documents to Amazon EFS.
D. Configure the Application Load Balancer to send the request to both servers. Return each document from the correct server.

Answer: C

Explanation:
Amazon EFS is a shared file system that instances in multiple Availability Zones can mount concurrently, so both web servers read and write the same set of documents.

QUESTION 6
A company uses NFS to store large video files in on-premises network-attached storage. Each video file ranges in size from 1 MB to 500 GB. The total storage is 70 TB and is no longer growing. The company decides to migrate the video files to Amazon S3. The company must migrate the video files as soon as possible while using the least possible network bandwidth.
Which solution will meet these requirements?

A. Create an S3 bucket. Create an IAM role that has permissions to write to the S3 bucket. Use the AWS CLI to copy all files locally to the S3 bucket.
B. Create an AWS Snowball Edge job. Receive a Snowball Edge device on premises. Use the Snowball Edge client to transfer data to the device. Return the device so that AWS can import the data into Amazon S3.
C. Deploy an S3 File Gateway on premises. Create a public service endpoint to connect to the S3 File Gateway. Create an S3 bucket. Create a new NFS file share on the S3 File Gateway. Point the new file share to the S3 bucket. Transfer the data from the existing NFS file share to the S3 File Gateway.
D. Set up an AWS Direct Connect connection between the on-premises network and AWS. Deploy an S3 File Gateway on premises. Create a public virtual interface (VIF) to connect to the S3 File Gateway. Create an S3 bucket. Create a new NFS file share on the S3 File Gateway. Point the new file share to the S3 bucket. Transfer the data from the existing NFS file share to the S3 File Gateway.

Answer: B

Explanation:
Migrating 70 TB over the internet would take days to weeks of saturated bandwidth, whereas shipping the data on a Snowball Edge device uses almost no network bandwidth. The basic difference between Snowball and Snowball Edge is the capacity they provide: Snowball provides a total of 50 TB or 80 TB, of which 42 TB or 72 TB is available, while Snowball Edge provides 100 TB, of which 83 TB is available, enough to hold the entire 70 TB data set on a single device.
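The "least possible network bandwidth" reasoning is easy to quantify. Assuming, purely for illustration, a 1 Gbps link at 80% sustained utilization (neither figure is given in the question), pushing 70 TB over the network ties up the link for over a week, which a shipped Snowball Edge device avoids entirely:

```python
def transfer_days(total_bytes: float, link_bps: float, utilization: float = 0.8) -> float:
    """Days needed to push total_bytes over a link of link_bps bits per second."""
    seconds = (total_bytes * 8) / (link_bps * utilization)
    return seconds / 86_400  # seconds per day

total = 70 * 10**12                        # the 70 TB data set from the question
days_1gbps = transfer_days(total, 10**9)   # assumed 1 Gbps link
days_100mbps = transfer_days(total, 10**8) # assumed 100 Mbps link
print(round(days_1gbps, 1), round(days_100mbps, 1))
```

Even at a full gigabit per second, the transfer takes about eight days of saturated bandwidth; at 100 Mbps it stretches to roughly eighty.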

QUESTION 7
A company has an application that ingests incoming messages. These messages are then quickly
consumed by dozens of other applications and microservices.
The number of messages varies drastically and sometimes spikes as high as 100,000 each second.
The company wants to decouple the solution and increase scalability.
Which solution meets these requirements?

A. Persist the messages to Amazon Kinesis Data Analytics. All the applications will read and process the messages.
B. Deploy the application on Amazon EC2 instances in an Auto Scaling group, which scales the number of EC2 instances based on CPU metrics.
C. Write the messages to Amazon Kinesis Data Streams with a single shard. All applications will read from the stream and process the messages.
D. Publish the messages to an Amazon Simple Notification Service (Amazon SNS) topic with one or more Amazon Simple Queue Service (Amazon SQS) subscriptions. All applications then process the messages from the queues.

Answer: D

Explanation:
Publishing the messages to an Amazon SNS topic decouples the producer from the consumers. Each consuming application or microservice subscribes to the topic through its own SQS queue, so every message is fanned out to all subscribers, and each consumer drains its queue at its own pace. Both SNS and SQS are fully managed and scale automatically, so spikes of 100,000 messages per second are absorbed without coupling the producer to any of the dozens of consumers.
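One practical detail of the SNS-to-SQS fan-out in option D: unless raw message delivery is enabled on the subscription, SNS wraps each published payload in a JSON envelope, and every consumer must unwrap it. A sketch of a consumer-side helper (the topic ARN and payload below are made up for illustration):

```python
import json

def unwrap_sns(sqs_body: str) -> str:
    """Return the original published payload from an SNS notification envelope."""
    envelope = json.loads(sqs_body)
    return envelope["Message"]  # SNS stores the published payload under "Message"

# Simulated body of a message an SQS queue receives from the topic:
body = json.dumps({
    "Type": "Notification",
    "MessageId": "11111111-2222-3333-4444-555555555555",
    "TopicArn": "arn:aws:sns:us-east-1:123456789012:ingest-topic",  # hypothetical
    "Message": '{"sensor": "temperature", "value": 21.4}',
})
payload = unwrap_sns(body)
print(payload)
```

With raw message delivery enabled instead, the queue receives the payload as-is and no unwrapping is needed.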
