The DP-100 certification (Designing and Implementing a Data Science Solution on Azure) recognizes expertise in Data Science and Machine Learning (ML) on Azure. Certified candidates can train and deploy solutions on Azure and, among other things, run machine learning workloads with the Azure ML service. This involves knowing how to plan and create a suitable working environment for data science workloads on Azure.
Prerequisites for the DP-100 Exam
Before taking the DP-100 exam, candidates should have:
- Fundamental knowledge of Azure services.
- Experience with Python for working with data, using libraries such as NumPy, pandas, and Matplotlib.
- An understanding of data science, including preparing data and training machine learning models with popular ML libraries such as scikit-learn, PyTorch, or TensorFlow (a short example of this baseline follows the list).
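To gauge that baseline, here is a minimal sketch of the kind of Python the exam assumes: preparing data with pandas and training a model with scikit-learn. The CSV file and column names are hypothetical.

```python
# Hypothetical example: prepare data with pandas, train with scikit-learn.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("diabetes.csv")           # hypothetical dataset
X = df.drop(columns=["Diabetic"])          # features
y = df["Diabetic"]                         # label
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```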
DP-100 Exam Topics
The knowledge required for the Microsoft DP-100 certificate is measured across the following four areas.
1. Setting up an Azure Machine Learning workspace
- Create an Azure Machine Learning workspace.
- Create and manage datastores and datasets (register datastores and create datasets).
- Create compute targets for experimentation and for running ML workloads (a minimal SDK sketch follows this list).
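As a concrete illustration of these three tasks, here is a minimal sketch using the Azure ML Python SDK v1 (azureml-core). The subscription ID, resource group, storage account, and all resource names are placeholders.

```python
# Sketch: workspace, datastore/dataset, and compute target with the
# Azure ML Python SDK v1. All IDs and names below are placeholders.
from azureml.core import Workspace, Dataset, Datastore
from azureml.core.compute import AmlCompute, ComputeTarget

# Create (or connect to) a workspace.
ws = Workspace.create(name="my-workspace",
                      subscription_id="<subscription-id>",
                      resource_group="my-rg",
                      location="eastus")

# Register a blob container as a datastore, then define a tabular dataset.
datastore = Datastore.register_azure_blob_container(
    workspace=ws, datastore_name="training_data",
    container_name="data", account_name="<storage-account>",
    account_key="<key>")
dataset = Dataset.Tabular.from_delimited_files(path=(datastore, "diabetes.csv"))
dataset = dataset.register(workspace=ws, name="diabetes-data")

# Provision a managed compute cluster for experimentation.
config = AmlCompute.provisioning_configuration(vm_size="STANDARD_DS2_V2",
                                               min_nodes=0, max_nodes=4)
cluster = ComputeTarget.create(ws, "cpu-cluster", config)
cluster.wait_for_completion(show_output=True)
```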
2. Running experiments and training models
- Create ML models using the Azure ML designer (build a training pipeline with the designer).
- Run training scripts in an Azure ML workspace (create and run an experiment using the Azure ML SDK, consume data from a registered dataset, configure script run settings, etc.).
- Generate metrics, retrieve experiment results, and troubleshoot experiment runs (log metrics from an experiment run and retrieve and view experiment outputs; see the sketch after this list).
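For the SDK part, a minimal sketch of submitting a training script as an experiment might look as follows. The script name, environment file, and cluster name are assumptions carried over from the previous sketch.

```python
# Sketch: submit a training script as an experiment run (SDK v1).
# 'train.py', 'environment.yml', and 'cpu-cluster' are assumptions.
from azureml.core import Environment, Experiment, ScriptRunConfig, Workspace

ws = Workspace.from_config()                        # reads config.json
env = Environment.from_conda_specification(
    name="train-env", file_path="environment.yml")  # hypothetical env file

src = ScriptRunConfig(source_directory=".",
                      script="train.py",
                      arguments=["--reg-rate", "0.01"],
                      compute_target="cpu-cluster",
                      environment=env)

run = Experiment(workspace=ws, name="train-diabetes").submit(src)
run.wait_for_completion(show_output=True)
print(run.get_metrics())                            # metrics logged by train.py
```

Inside train.py, metrics are logged with Run.get_context().log("Accuracy", value), which makes them visible in the studio and retrievable through run.get_metrics().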
3. Optimization and model management
- Create an ML model using automated machine learning (in Azure ML Studio or with the Azure ML SDK).
- Tune hyperparameters with HyperDrive (define hyperparameter values for the model, select a sampling method, etc.; a sketch follows this list).
- Use model explainers to interpret models (select a model interpreter and generate feature importance data).
- Manage models (register the trained model and monitor it for data drift).
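A HyperDrive run wraps an ordinary script run with a sampling strategy, an early-termination policy, and a primary metric. Below is a minimal sketch; the parameter names and the "Accuracy" metric are assumptions and must match what train.py accepts and logs.

```python
# Sketch: hyperparameter tuning with HyperDrive (SDK v1).
from azureml.core import Experiment, ScriptRunConfig, Workspace
from azureml.train.hyperdrive import (BanditPolicy, HyperDriveConfig,
                                      PrimaryMetricGoal,
                                      RandomParameterSampling, choice, uniform)

ws = Workspace.from_config()
src = ScriptRunConfig(source_directory=".", script="train.py",
                      compute_target="cpu-cluster")

# Define the search space and a random sampling strategy.
sampling = RandomParameterSampling({
    "--learning-rate": uniform(0.001, 0.1),
    "--batch-size": choice(16, 32, 64),
})

hd_config = HyperDriveConfig(
    run_config=src,
    hyperparameter_sampling=sampling,
    policy=BanditPolicy(evaluation_interval=2, slack_factor=0.1),
    primary_metric_name="Accuracy",         # must match run.log() in train.py
    primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
    max_total_runs=20)

run = Experiment(ws, "tune-diabetes").submit(hd_config)
run.wait_for_completion(show_output=True)

# Register the best model found by the sweep.
best = run.get_best_run_by_primary_metric()
best.register_model(model_name="diabetes-model", model_path="outputs/model.pkl")
```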
4. Deploying and using ML models
- Create production compute targets (consider security for deployed services and evaluate compute options).
- Deploy the ML model as a service (configure deployment settings and consume the deployed service; see the sketch after this list).
- Create and run a batch inference pipeline (build the pipeline and obtain its outputs).
- Deploy a designer pipeline as a web service (create a target compute resource and consume the deployed endpoint).
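As an illustration of real-time deployment through the SDK, here is a minimal sketch that deploys a registered model to Azure Container Instances. The model name, scoring script, and environment file are assumptions from the earlier sketches.

```python
# Sketch: deploy a registered model as a real-time web service on
# Azure Container Instances (SDK v1). Names are assumptions.
from azureml.core import Environment, Workspace
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()
model = ws.models["diabetes-model"]                  # registered earlier

inference_config = InferenceConfig(
    entry_script="score.py",                         # defines init() and run()
    environment=Environment.from_conda_specification(
        name="infer-env", file_path="environment.yml"))

deployment_config = AciWebservice.deploy_configuration(
    cpu_cores=1, memory_gb=1, auth_enabled=True)     # key-based auth

service = Model.deploy(ws, "diabetes-service", [model],
                       inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)                           # REST endpoint to call
```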