Google Certified Professional Data Engineer Set 2

Author: CloudVikas
Published Date: 18 June 2021

Welcome to Google Certified Professional Data Engineer Set 2.

1. How can you import training data into an AutoML Natural Language dataset?
   - As TXT, PDF, TIF or ZIP files, from a local upload or Cloud Storage bucket.
   - As TXT, PDF, TIF or ZIP files, or as a CSV containing references to documents and training labels, from a local upload or Cloud Storage bucket.
   - From existing datasets in BigQuery or Cloud Bigtable.

2. What are some of the properties that can be returned by the Cloud Vision API when requesting the FACE_DETECTION feature?
   - Positions of the features of the face (e.g. eyes and nose), likelihood of certain emotions, likelihood of headwear.
   - Positions of the features of the face (e.g. eyes and nose), likelihood of certain emotions, estimation of age.
   - Bounding boxes for all faces in the image, as well as best-guess matches of faces from social media and Google Images.

3. Which features could you include in your request to the Cloud Vision API to detect text in images? (Choose 2)
   - TEXT_DETECTION
   - DOCUMENT_TEXT_DETECTION
   - LOGO_DETECTION
   - LABEL_DETECTION

4. What would be the best visualization to present a small amount of data in a dashboard, where you wish to show the proportions of distribution in the data as percentages?
   - Pie chart
   - Score card
   - Heat map

5. What categories are returned in the safeSearchAnnotation response from the Cloud Vision API?
   - Safe for Work, Not Safe for Work
   - Adult, Spoof, Medical, Violence, Racy
   - General Audiences, PG-13, Mature

6. What would be the best approach to transcribing a conversation from a recorded telephone call?
   - Use synchronous speech recognition. Set `enableSpeakerDiarization` to false and the model to 'phone_call' in the RecognitionConfig.
   - Use asynchronous speech recognition. Set `enableSpeakerDiarization` to false and the model to 'phone_call' in the RecognitionConfig.
   - Use asynchronous speech recognition.
     Set `enableSpeakerDiarization` to true and the model to 'phone_call' in the RecognitionConfig.

7. Your model is performing very well on training data, but very poorly on new or unknown data. What could be the problem?
   - There is not enough data in the evaluation set - the model has been overfitted.
   - There is not enough data in the training set - the loss function cannot be minimized.
   - There are insufficient infrastructure resources assigned to the model.

8. Which AutoML Vision feature would you use to detect multiple objects in an image?
   - Object localization
   - Confidence threshold curves
   - Human labeling

9. What are hyperparameters used for in machine learning?
   - Hyperparameters are the variables that your chosen machine learning technique uses to adjust data, such as weight values.
   - Hyperparameters represent the data used during training to configure a model to make accurate predictions.
   - Hyperparameters are the variables that govern the training process itself.

10. Which of these problems would require a regression model?
    - Identifying changes in a series of images.
    - Estimating the price of a house in 5 years' time.
    - Recognizing faces in a picture.

11. If you make a request to the documents:analyzeEntities endpoint of the Natural Language API, what sort of information can you expect in the response?
    - An array of Entity objects that will each contain metadata including a Wikipedia URL and Knowledge Graph MID, if available.
    - An array of Entity objects that will each contain the name of the object as a string.
    - An array of strings for each object detected in the document.

12. What is the primary purpose of a business intelligence dashboard?
    - To provide links to various other reporting and visualization tools.
    - To view high-level key performance indicators (KPIs) at a glance.
    - To present detailed analysis that can be studied over time.

13.
    What data sources are supported by Cloud Data Studio?
    - Multiple database formats from supported GCP services.
    - Files and databases from GCP, as well as other Google services and third-party products.
    - Multiple file formats uploaded via Cloud Storage.

14. You have a large number of images that you wish to process through a custom AutoML Vision model. Time is not a factor, but cost is. Which approach should you take?
    - Select a random subset of the images and send each as a synchronous online prediction request.
    - Sort the images into small batches, and make an asynchronous prediction request for each batch using the batchPredict method.
    - Make an asynchronous prediction request for the entire batch of images using the batchPredict method.

15. You're developing a mobile application that allows a food processing business to detect fruit that has gone bad. Staff at warehouses will use mobile devices to take pictures of fruit to determine whether it should be discarded. Which GCP services could you use to accomplish this?
    - Train an AutoML Vision model using unlabeled images of fruit that has gone bad. Export the model with TensorFlow Lite. Deploy an API on App Engine to provide a backend for a mobile scanning application.
    - Train an AutoML Vision model using labeled images of fruit that has gone bad. Use AutoML Vision Edge in ML Kit to deploy the custom model to mobile devices using ML Kit client libraries.
    - Train an AutoML Vision model using unlabeled images of fruit that has gone bad. Use AutoML Vision Edge in ML Kit to deploy the custom model to mobile devices using ML Kit client libraries.

16. How do you provide GCP authentication credentials when making a request to Cloud ML APIs from the command line with curl?
    - Call curl from inside a bash script with hard-coded credentials.
    - Use the credentials in a service account key JSON file by encoding these in the HTTP request.
    - Configure authentication with gcloud, then use 'print-access-token' to include an authorization bearer in the HTTP request.

17.
    How should you interpret the documentSentiment response from the Natural Language API?
    - 'score' represents an overall emotional leaning of a text from -1.0 (negative) to 1.0 (positive), and 'magnitude' indicates the overall strength of emotion. 'magnitude' is not normalized so longer text blocks may have greater magnitudes.
    - 'score' represents an overall emotional leaning of a text from -1.0 (negative) to 1.0 (positive), and 'magnitude' indicates the overall strength of emotion. 'magnitude' is normalized so the length of the text is not a factor.
    - 'magnitude' represents an overall emotional leaning of a text from -1.0 (negative) to 1.0 (positive), and 'score' indicates the overall strength of emotion. 'score' is normalized so the length of the text is not a factor.

18. Which statement correctly describes the relationship between Keras and TensorFlow?
    - Keras is a high-level deep-learning Python library that has nothing to do with TensorFlow.
    - Keras is a high-level deep-learning Python library that includes support for TensorFlow functionality.
    - Keras is a high-level deep-learning Python library that can only function with TensorFlow.

19. You need to share a Data Studio dashboard that presents sales data with a non-technical member of the marketing team. This person needs to be able to quickly see how sales are performing in specific regions of the world over specific date ranges. What is the best way to approach this?
    - Use a Geo map to show how data varies across geographical areas and add a date-range control to the report.
    - Create multiple bar charts to show sales from different regions and different months of the last financial year.
    - Show the marketing person how to edit the Data Studio dashboard to configure the regions and date ranges they require.

20. You wish to build an AutoML Natural Language model for classifying some documents with user-defined labels.
    How can you ensure you are providing quality training data for the model?
    - Ensure you provide at minimum 100 training documents per label, but ideally 10 times more documents for the most common label than for the least common label.
    - Aim to provide over 1,000,000 documents in total.
    - Ensure you provide at minimum 10 training documents per label, but ideally 100 times more documents for the most common label than for the least common label.
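
For readers who want to experiment with the two fields named in question 17, here is a minimal Python sketch that turns a `score` and `magnitude` pair into a short description. It makes no API call and only works on numbers you pass in. The function name `describe_sentiment` and the thresholds `neutral_band` and `strong_magnitude` are hypothetical choices for illustration, not values defined by the Natural Language API; sensible cut-offs depend on your own texts.

```python
def describe_sentiment(score: float, magnitude: float,
                       neutral_band: float = 0.25,
                       strong_magnitude: float = 2.0) -> str:
    """Describe a (score, magnitude) pair in words.

    neutral_band and strong_magnitude are illustrative assumptions,
    not thresholds taken from the API.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1.0 and 1.0")
    if magnitude < 0:
        raise ValueError("magnitude must be non-negative")

    # score gives the direction of the overall emotional leaning.
    if score >= neutral_band:
        leaning = "positive"
    elif score <= -neutral_band:
        leaning = "negative"
    elif magnitude >= strong_magnitude:
        # Near-zero score with a lot of emotion: positives and
        # negatives are likely cancelling each other out.
        leaning = "mixed"
    else:
        leaning = "neutral"

    # magnitude gives the amount of emotion, regardless of direction.
    strength = "strong" if magnitude >= strong_magnitude else "mild"
    return f"{leaning} ({strength})"


if __name__ == "__main__":
    print(describe_sentiment(0.8, 3.2))
    print(describe_sentiment(0.0, 4.0))
```

Keeping the thresholds as parameters is deliberate: a cut-off that suits short reviews may be too low for long documents, so it is worth tuning them on your own data.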