Length: Two hours
Registration fee: $ (plus tax where applicable)
Language: English
Exam format: 50-60 multiple choice and multiple select questions
Exam delivery method:
a. Take the online-proctored exam from a remote location; review the online testing requirements.
b. Take the onsite-proctored exam at a testing center; locate a test center near you.
Prerequisites: None
Recommended experience: 3+ years of industry experience including 1 or more years designing and managing solutions using Google Cloud.
Certification Renewal / Recertification: Candidates must recertify in order to maintain their certification status. Unless explicitly stated in the detailed exam descriptions, all Google Cloud certifications are valid for two years from the date of certification. Recertification is accomplished by retaking the exam during the recertification eligibility time period and achieving a passing score. You may attempt recertification starting 60 days prior to your certification expiration date.
Exam overview
Step 1: Get real world experience
Before attempting the Machine Learning Engineer exam, it’s recommended that you have 3+ years of hands-on experience with Google Cloud products and solutions. Ready to start building? Explore the Google Cloud Free Tier for free usage (up to monthly limits) of select products.
Try the Google Cloud Free Tier
Step 2: Understand what’s on the exam
The exam guide contains a complete list of topics that may be included on the exam. Review the exam guide to determine if your skills align with the topics on the exam.
See current exam guide
Step 3: Review the sample questions
Familiarize yourself with the format of questions and example content that may be covered on the Machine Learning Engineer exam.
Review sample questions
Step 4: Round out your skills with training
Prepare for the exam by following the Machine Learning Engineer learning path. Explore online training, in-person classes, hands-on labs, and other resources from Google Cloud.
Start preparing
Prepare for the exam with Googlers and certified experts. Get valuable exam tips and tricks, as well as insights from industry experts.
Explore Google Cloud documentation for in-depth discussions on the concepts and critical components of Google Cloud.
Learn about designing, training, building, deploying, and operationalizing secure ML applications on Google Cloud using the Official Google Cloud Certified Professional Machine Learning Engineer Study Guide. This guide uses real-world scenarios to demonstrate how to use the Vertex AI platform and technologies such as TensorFlow, Kubeflow, and AutoML, as well as best practices on when to choose a pretrained or a custom model.
Step 5: Schedule an exam
Register and select the option to take the exam remotely or at a nearby testing center.
Review exam terms and conditions and data sharing policies.
A Professional Machine Learning Engineer builds, evaluates, productionizes, and optimizes ML models by using Google Cloud technologies and knowledge of proven models and techniques. The ML Engineer handles large, complex datasets and creates repeatable, reusable code. The ML Engineer considers responsible AI and fairness throughout the ML model development process, and collaborates closely with other job roles to ensure the long-term success of ML-based applications.
The ML Engineer has strong programming skills and experience with data platforms and distributed data processing tools. The ML Engineer is proficient in the areas of model architecture, data and ML pipeline creation, and metrics interpretation, and is familiar with foundational concepts of MLOps, application development, infrastructure management, data engineering, and data governance.
The ML Engineer makes ML accessible and enables teams across the organization. By training, retraining, deploying, scheduling, monitoring, and improving models, the ML Engineer designs and creates scalable, performant solutions.
* Note: The exam does not directly assess coding skill. If you have a minimum proficiency in Python and Cloud SQL, you should be able to interpret any questions with code snippets.
The Professional Machine Learning Engineer exam assesses your ability to:
Architect low-code ML solutions
Collaborate within and across teams to manage data and models
Scale prototypes into ML models
Serve and scale models
Automate and orchestrate ML pipelines
Monitor ML solutions
Sample Question and Answers
QUESTION 1
As the lead ML Engineer for your company, you are responsible for building ML models to digitize
scanned customer forms. You have developed a TensorFlow model that converts the scanned images
into text and stores the results in Cloud Storage. You need to use your ML model on the aggregated data
collected at the end of each day with minimal manual intervention. What should you do?
A. Use the batch prediction functionality of AI Platform.
B. Create a serving pipeline in Compute Engine for prediction.
C. Use Cloud Functions for prediction each time a new data point is ingested.
D. Deploy the model on AI Platform and create a version of it for online inference.
Answer: A
Explanation:
Batch prediction is the process of using an ML model to make predictions on a large set of data
points. Batch prediction is suitable for scenarios where the predictions are not time-sensitive and can
be done in batches, such as digitizing scanned customer forms at the end of each day. Batch
prediction can also handle large volumes of data and scale up or down the resources as needed. AI
Platform provides a batch prediction service that allows users to submit a job with their TensorFlow
model and input data stored in Cloud Storage, and receive the output predictions in Cloud Storage as
well. This service requires minimal manual intervention and can be automated with Cloud Scheduler
or Cloud Functions. Therefore, using the batch prediction functionality of AI Platform is the best
option for this use case.
Reference:
Batch prediction overview
Using batch prediction
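As a concrete sketch of this automation, the request body for a batch prediction job can be assembled in Python and submitted on a schedule by Cloud Scheduler or a Cloud Function. The field names below mirror the legacy AI Platform Training and Prediction REST API, and the project, model, and bucket names are hypothetical; verify the schema against the current API reference before use.

```python
def batch_prediction_job(job_id, model_name, input_paths, output_path,
                         region="us-central1"):
    """Build the request body for an AI Platform batch prediction job.

    Sketch of the legacy AI Platform projects.jobs.create request body;
    field names should be checked against the current API reference.
    """
    return {
        "jobId": job_id,
        "predictionInput": {
            "dataFormat": "JSON",        # newline-delimited JSON instances
            "inputPaths": input_paths,   # Cloud Storage URIs of the day's data
            "outputPath": output_path,   # Cloud Storage prefix for results
            "region": region,
            "modelName": model_name,
        },
    }

# Hypothetical project, model, and bucket names for illustration only.
job = batch_prediction_job(
    job_id="daily_forms_20240101",
    model_name="projects/my-project/models/form_ocr",
    input_paths=["gs://my-bucket/forms/2024-01-01/*"],
    output_path="gs://my-bucket/predictions/2024-01-01/",
)
```

A Cloud Scheduler trigger or a Cloud Function can build this body and submit it once the day's aggregated data has landed in Cloud Storage, which keeps manual intervention to a minimum.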
QUESTION 2
You work for a global footwear retailer and need to predict when an item will be out of stock based
on historical inventory data. Customer behavior is highly dynamic since footwear demand is influenced by many different
factors. You want to serve models that are trained on all available data, but track your performance
on specific subsets of data before pushing to production. What is the most streamlined and reliable
way to perform this validation?
A. Use the TFX ModelValidator tools to specify performance metrics for production readiness.
B. Use k-fold cross-validation as a validation strategy to ensure that your model is ready for production.
C. Use the last relevant week of data as a validation set to ensure that your model is performing accurately on current data.
D. Use the entire dataset and treat the area under the receiver operating characteristic curve (AUC ROC) as the main metric.
Answer: A
Explanation:
TFX ModelValidator is a tool that allows you to compare new models against a baseline model and
evaluate their performance on different metrics and data slices. You can use this tool to validate
your models before deploying them to production and ensure that they meet your expectations and requirements.
k-fold cross-validation is a technique that splits the data into k subsets and trains the model on k-1
subsets while testing it on the remaining subset. This is repeated k times and the average
performance is reported. This technique is useful for estimating the generalization error of a model,
but it does not account for the dynamic nature of customer behavior or the potential changes in data distribution over time.
Using the last relevant week of data as a validation set is a simple way to check the model's performance on recent data, but it may not be representative of the entire dataset or capture the long-term trends and patterns. It also does not allow you to compare the model with a baseline or evaluate it on different data slices.
Using the entire dataset and treating the AUC ROC as the main metric is not a good practice because
it does not leave any data for validation or testing. It also assumes that the AUC ROC is the only
metric that matters, which may not be true for your business problem. You may want to consider
other metrics such as precision, recall, or revenue.
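The slice-based evaluation that makes option A attractive can be sketched in plain NumPy. The slice names and toy data below are made up for illustration; in practice TFX computes such sliced metrics for you.

```python
import numpy as np

def accuracy_by_slice(y_true, y_pred, slice_labels):
    """Compute accuracy separately on each subset (slice) of the data.

    This mirrors the idea behind TFX model validation: a candidate model
    should meet metric thresholds both overall and on specific slices
    before being pushed to production.
    """
    out = {}
    for s in np.unique(slice_labels):
        mask = slice_labels == s
        out[str(s)] = float(np.mean(y_true[mask] == y_pred[mask]))
    return out

# Toy labels and predictions, sliced by an invented "region" attribute.
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1])
region = np.array(["EU", "EU", "EU", "US", "US", "US"])

per_slice = accuracy_by_slice(y_true, y_pred, region)
print(per_slice)  # EU is ~0.67 while US is 1.0: a gap overall accuracy hides
```

Overall accuracy here is 5/6, which hides the weaker EU slice; slice-level thresholds catch exactly this kind of regression before a push to production.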
QUESTION 3
You work on a growing team of more than 50 data scientists who all use AI Platform. You are
designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?
A. Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or group can access a given instance.
B. Separate each data scientist’s work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.
C. Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources.
D. Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about AI Platform resource usage. In BigQuery, create a SQL view that maps users to the resources they are using.
Answer: C
Explanation:
Labels are key-value pairs that can be attached to any AI Platform resource, such as jobs, models,
versions, or endpoints. Labels can help you organize your resources into descriptive categories, such
as project, team, environment, or purpose. You can use labels to filter the results when you list or
monitor your resources, or to group them for billing or quota purposes. Using labels is a simple and
scalable way to manage your AI Platform resources without creating unnecessary complexity or overhead.
Therefore, using labels to organize resources is the best strategy for this use case.
Reference:
Using labels
Filtering and grouping by labels
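As a toy illustration of why labels scale well, resources can be modeled as dicts and filtered by label pairs; in practice the same filtering is done in the console or with `gcloud` `--filter` expressions. The model names and label values below are invented.

```python
def filter_by_labels(resources, **wanted):
    """Return the resources whose labels contain all wanted key-value pairs."""
    return [
        r for r in resources
        if all(r.get("labels", {}).get(k) == v for k, v in wanted.items())
    ]

# Invented entries standing in for AI Platform models created by the team.
models = [
    {"name": "churn_v3",  "labels": {"team": "pricing", "env": "prod"}},
    {"name": "ranker_v1", "labels": {"team": "search",  "env": "prod"}},
    {"name": "churn_dev", "labels": {"team": "pricing", "env": "dev"}},
]

prod_pricing = filter_by_labels(models, team="pricing", env="prod")
print([r["name"] for r in prod_pricing])  # ['churn_v3']
```

With 50+ data scientists, a consistent label scheme (for example `team`, `env`, `purpose`) keeps one shared project navigable without per-user projects or restrictive permissions.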
QUESTION 4
During batch training of a neural network, you notice that there is an oscillation in the loss. How should you adjust your model to ensure that it converges?
A. Increase the size of the training batch
B. Decrease the size of the training batch
C. Increase the learning rate hyperparameter
D. Decrease the learning rate hyperparameter
Answer: D
Explanation:
Oscillation in the loss during batch training of a neural network means that the model is
overshooting the optimal point of the loss function and bouncing back and forth. This can prevent
the model from converging to the minimum loss value. One of the main reasons for this
phenomenon is that the learning rate hyperparameter, which controls the size of the steps that the
model takes along the gradient, is too high. Therefore, decreasing the learning rate hyperparameter
can help the model take smaller and more precise steps and avoid oscillation. This is a common
technique to improve the stability and performance of neural network training.
Reference:
Interpreting Loss Curves
Is learning rate the only reason for training loss oscillation after few epochs?
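The effect is easy to reproduce on a toy quadratic loss f(x) = x², whose gradient is 2x. The learning-rate values below are arbitrary but show the two regimes described above.

```python
def gradient_descent(lr, steps=50, x0=5.0):
    """Minimize f(x) = x^2 (gradient 2x) and return every visited point."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - lr * 2 * xs[-1])
    return xs

# Learning rate too high: each step overshoots the minimum at 0, so the
# iterate flips sign and its magnitude grows -- the loss oscillates.
high = gradient_descent(lr=1.1)

# Learning rate decreased: steps shrink smoothly toward the minimum.
low = gradient_descent(lr=0.1)

print("high lr:", [round(x, 2) for x in high[:4]])  # signs alternate
print("low lr: ", [round(x, 2) for x in low[:4]])   # monotone decay
```

The same dynamic drives oscillating batch-training loss in a neural network, which is why decreasing the learning rate (option D) restores convergence.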
QUESTION 5
You are building a linear model with over 100 input features, all with values between -1 and 1.
You suspect that many features are non-informative. You want to remove the non-informative features
from your model while keeping the informative ones in their original form. Which technique should you use?
A. Use Principal Component Analysis to eliminate the least informative features.
B. Use L1 regularization to reduce the coefficients of uninformative features to 0.
C. After building your model, use Shapley values to determine which features are the most informative.
D. Use an iterative dropout technique to identify which features do not degrade the model when removed.
Answer: B
Explanation:
L1 regularization, also known as Lasso regularization, adds the sum of the absolute values of the model's coefficients to the loss function. It encourages sparsity in the model by shrinking some coefficients to exactly zero. This way, L1 regularization can perform feature selection and remove the non-informative features from the model while keeping the informative ones in their original form. Therefore, using L1 regularization is the best technique for this use case.
Reference:
Regularization in Machine Learning – GeeksforGeeks
Regularization in Machine Learning (with Code Examples) – Dataquest
L1 And L2 Regularization Explained & Practical How To Examples
L1 and L2 as Regularization for a Linear Model
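A small scikit-learn sketch shows the selection effect on synthetic data matching the question's setup (features in [-1, 1], only a few informative). The dataset and the `alpha` value are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, n_features, n_informative = 200, 20, 3

# Features in [-1, 1]; only the first three actually drive the target.
X = rng.uniform(-1, 1, size=(n, n_features))
true_coef = np.zeros(n_features)
true_coef[:n_informative] = [3.0, -2.0, 1.5]
y = X @ true_coef + 0.05 * rng.normal(size=n)

# The L1 penalty drives the uninformative coefficients exactly to zero
# while the informative ones survive (slightly shrunk).
model = Lasso(alpha=0.05).fit(X, y)
zeroed = int(np.sum(model.coef_ == 0))
print(f"{zeroed} of {n_features} coefficients are exactly zero")
```

By contrast, PCA (option A) would replace the features with linear combinations rather than keeping the informative ones in their original form, which is why L1 is the better fit here.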
QUESTION 6
Your team has been tasked with creating an ML solution in Google Cloud to classify support requests
for one of your platforms. You analyzed the requirements and decided to use TensorFlow to build the
classifier so that you have full control of the model’s code, serving, and deployment. You will use
Kubeflow pipelines for the ML platform. To save time, you want to build on existing resources and
use managed services instead of building a completely new model. How should you build the classifier?
A. Use the Natural Language API to classify support requests
B. Use AutoML Natural Language to build the support requests classifier
C. Use an established text classification model on AI Platform to perform transfer learning.
D. Use an established text classification model on AI Platform as-is to classify support requests.
Answer: C
Explanation:
Transfer learning is a technique that leverages the knowledge and weights of a pre-trained model and adapts them to a new task or domain. Transfer learning can save time and resources by avoiding training a model from scratch, and can also improve the performance and generalization of the model by using a larger and more diverse dataset. AI Platform provides several established text classification models that can be used for transfer learning, such as BERT, ALBERT, or XLNet. These models are based on state-of-the-art natural language processing techniques and can handle various text classification tasks, such as sentiment analysis, topic classification, or spam detection. By using one of these models on AI Platform, you can customize the model's code, serving, and deployment, and use Kubeflow pipelines for the ML platform. Therefore, using an established text classification model on AI Platform to perform transfer learning is the best option for this use case.
Reference:
Transfer Learning – Machine Learnings Next Frontier
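The transfer-learning pattern (frozen pretrained layers, trainable task head) can be sketched without any framework. The "encoder" below is a random stand-in for a real pretrained model such as BERT; everything here is synthetic and for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained encoder: a frozen layer whose weights we reuse
# instead of training from scratch -- the core idea of transfer learning.
W_frozen = rng.normal(size=(50, 8)) / np.sqrt(50)
def encode(X):
    return np.maximum(X @ W_frozen, 0.0)   # frozen ReLU features

# New-task data: labels are a deterministic function of the frozen
# features, so training only a small head on top can fit the task.
X = rng.normal(size=(400, 50))
F = encode(X)                              # encoder output, never updated
w_true = rng.normal(size=8)
y = (F @ w_true > 0).astype(float)

# Train a logistic-regression "head" by gradient descent; only w changes,
# while the encoder weights stay fixed throughout.
w = np.zeros(8)
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(F @ w)))
    w -= 0.5 * F.T @ (p - y) / len(y)

acc = float(np.mean((F @ w > 0) == (y == 1)))
print(f"head-only training accuracy: {acc:.2f}")
```

Fine-tuning an established model on AI Platform works the same way at scale: the expensive pretrained representation is reused, and only a comparatively small amount of task-specific training remains.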