AWS Certified Big Data Specialty Set1
Author: CloudVikas | Published: 13 March 2020

Welcome to AWS Certified Big Data Specialty Set1.

1. Data delivery from your Kinesis Firehose delivery stream to the destination is falling behind. When this happens, you need to manually change the buffer size to catch up and ensure that the data is delivered to the destination.
- True
- False

2. For which of the following AWS services can you not create a rule action in AWS IoT? (Choose 2)
- Kinesis Firehose
- Redshift
- CloudWatch
- Aurora

3. Which service does Kinesis Firehose not load streaming data into?
- DynamoDB
- Redshift
- Elasticsearch

4. Kinesis Firehose buffers incoming data before delivering the data to your S3 bucket. What is the buffer size range?
- 2 MB to 128 MB
- 4 MB to 256 MB
- 1 MB to 128 MB

5. Regarding SQS, which of the following are true? (Choose 3)
- A queue can only be created in limited regions, and you should check the SQS website to see which are supported.
- Messages can be sent and read simultaneously.
- A queue can be created in any region.
- Messages can be retained in queues for up to 7 days.
- Messages can be retained in queues for up to 14 days.

6. True or False: Data Pipeline does not integrate with on-premises servers.
- True
- False

7. Which of the following AWS IoT components transforms messages and routes them to different AWS services?
- Device Gateway
- Rules Engine
- Device Shadow

8. What are the main uses of Kinesis Data Streams? (Choose 2)
- They can undertake the loading of streamed data directly into data stores.
- They can provide long-term storage of data.
- They can carry out real-time reporting and analysis of streamed data.
- They can accept data as soon as it has been produced, without the need for batching.

9. For an unknown reason, data delivery from Kinesis Firehose to your Redshift cluster has failed. Kinesis Firehose retries the data delivery every 5 minutes for a maximum period of 60 minutes; however, none of the retries deliver the data to Redshift. Kinesis Firehose skips the files and moves on to the next batch of files in S3. How can you ensure that the undelivered data is eventually loaded into Redshift?
- Check the STL_LOAD_ERRORS table in Redshift, find the files that failed to load, and manually load the data in those files using the COPY command.
- Create a Lambda function that automatically loads these files into Redshift by reading the manifest after the retries have completed and the COPY command has been run.
- Skipped files are delivered to your S3 bucket as a manifest file in an errors folder. Run the COPY command manually to load the skipped files after you have determined why they failed to load.
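Note on questions 1, 4, and 9: Firehose buffering and retry behavior is configured on the delivery stream itself rather than adjusted manually after the fact. The following is a minimal, illustrative boto3 sketch showing where the S3 buffering hints and the Redshift retry duration are set; the stream name, role ARN, bucket, cluster URL, table, and credentials are placeholder assumptions, not values from this quiz.

```python
import boto3

firehose = boto3.client("firehose")

# All names and ARNs below are hypothetical, for illustration only.
firehose.create_delivery_stream(
    DeliveryStreamName="example-to-redshift",
    DeliveryStreamType="DirectPut",
    RedshiftDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
        "ClusterJDBCURL": "jdbc:redshift://example.abc123.us-east-1.redshift.amazonaws.com:5439/dev",
        "CopyCommand": {
            "DataTableName": "public.events",
            "CopyOptions": "json 'auto'",
        },
        "Username": "firehose_user",
        "Password": "example-password",
        # How long Firehose keeps retrying failed COPY loads into Redshift
        # (seconds). Objects it gives up on are listed in an errors manifest
        # in the intermediate S3 bucket, which you can COPY manually later.
        "RetryOptions": {"DurationInSeconds": 3600},
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
            "BucketARN": "arn:aws:s3:::example-firehose-staging",
            # Buffering hints for the staged data: size is 1-128 MB,
            # interval is 60-900 seconds.
            "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
        },
    },
)
```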
10. Your company is launching an IoT device that will send data to AWS. All the data generated by the millions of devices your company is going to sell will be stored in DynamoDB for use by the Engineering team. Each customer's data, however, will only be stored in DynamoDB for 30 days. A mobile application will be used to control the IoT device, and easy user sign-up and sign-in to the mobile application are requirements. The engineering team is designing the application to scale to millions of users. Their preference is to not have to worry about building, securing, and scaling authentication for the mobile application. They also want to use their own identity provider. Which option would be the best choice for their mobile application?
- Use an Amazon Cognito identity pool.
- Since everyone uses Facebook, Amazon, and Google, keep it simple and use all three.
- Use a SAML identity provider.

11. Your team has successfully migrated the corporate data warehouse to Redshift. So far, all the data coming into the ETL pipeline for the data warehouse has been from other corporate systems also running on AWS. However, after signing some new business deals with a 3rd party, they will be securely sending files directly to S3. The data in these files needs to be ingested into Redshift. Members of your team are debating the most efficient and best-automated way to introduce this change into the ETL pipeline. Which of the following options would you suggest? (Choose 2)
- Use Lambda (AWS Redshift Database Loader).
- Procure a new 3rd-party tool that integrates with S3 and Redshift and provides powerful scheduling capabilities.
- Use Data Pipeline.
- Run a cron job on a t2.micro instance that will execute Linux shell scripts.

12. If Kinesis Firehose experiences data delivery issues to S3, it will retry delivery to S3 for a period of ________.
- 7 hours
- 7 days
- 24 hours
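Note on question 11: one common automated pattern for ingesting third-party files landing in S3 into Redshift is a Lambda function triggered by S3 object-created events that issues a COPY statement. The sketch below is an illustrative example using the Redshift Data API; the cluster identifier, database, user, table, IAM role, and file format are placeholder assumptions, and the AWS Lambda Redshift Database Loader mentioned in the question has its own packaged configuration.

```python
import urllib.parse

import boto3

redshift_data = boto3.client("redshift-data")

# Placeholder values for illustration only.
CLUSTER_ID = "example-cluster"
DATABASE = "dev"
DB_USER = "loader_user"
TARGET_TABLE = "public.partner_events"
COPY_ROLE_ARN = "arn:aws:iam::123456789012:role/redshift-copy-role"


def handler(event, context):
    """Triggered by S3 ObjectCreated events; COPY each new object into Redshift."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Assumes CSV files with a header row; adjust the format options
        # to whatever the 3rd party actually sends.
        copy_sql = (
            f"COPY {TARGET_TABLE} "
            f"FROM 's3://{bucket}/{key}' "
            f"IAM_ROLE '{COPY_ROLE_ARN}' "
            "FORMAT AS CSV IGNOREHEADER 1;"
        )

        # The Redshift Data API runs the statement asynchronously, so the
        # Lambda function does not need a persistent database connection.
        redshift_data.execute_statement(
            ClusterIdentifier=CLUSTER_ID,
            Database=DATABASE,
            DbUser=DB_USER,
            Sql=copy_sql,
        )
```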