AWS Solution Architect Associate Part 2:
Question 21
A company is developing a two-tier web application on AWS, and its developers have deployed the application on an Amazon EC2 instance that connects directly to a backend Amazon RDS database. The following solution stores and rotates the database credentials with the least operational overhead:
Option C: Store the database credentials as a secret in AWS Secrets Manager. Turn on automatic rotation for the secret. Attach the required permission to the EC2 role to grant access to the secret.
Explanation
- Using AWS Secrets Manager to store credentials: This provides a secure way to manage and store sensitive information such as database credentials. It supports direct integration with Amazon RDS, simplifying setup.
- Enabling automatic rotation: AWS Secrets Manager supports automatic rotation, allowing you to set a schedule to regularly update credentials without manual intervention.
- Granting access via IAM role: Attach necessary permissions to the EC2 instance role, enabling the application to access secrets stored in AWS Secrets Manager without hardcoding credentials in the code or configuration files.
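To illustrate how the application could read the rotated credentials at runtime through the instance's IAM role, here is a minimal boto3 sketch; the secret name `prod/app/rds` and the JSON field names are illustrative assumptions, not values from the question.

```python
import json

import boto3

# The EC2 instance profile supplies credentials, so nothing is hardcoded here.
secrets_client = boto3.client("secretsmanager", region_name="us-east-1")

def get_db_credentials(secret_id: str = "prod/app/rds") -> dict:
    """Fetch the current version of the RDS secret from Secrets Manager."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    # RDS secrets managed by Secrets Manager typically store a JSON document
    # with fields such as username, password, host, port, and dbname.
    return json.loads(response["SecretString"])

if __name__ == "__main__":
    creds = get_db_credentials()
    print(f"Connecting to {creds['host']} as {creds['username']}")
```

Because the secret is fetched on each connection attempt (or cached briefly), the application always picks up new credentials after automatic rotation without a redeploy.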
Why Choose Option C
- Minimal Operational Overhead: Compared to other options, C offers built-in support and minimal manual configuration, reducing operational complexity.
- Security: Avoids the risk of hardcoding credentials and uses IAM role security mechanisms to control access to sensitive information.
- Ease of Implementation: Leverages existing AWS services and features, reducing the need for custom scripts or additional components, making deployment and maintenance simpler.
Analysis of Other Options
- Option A: While functional, it introduces more operational complexity, including managing instance metadata and writing Lambda function logic.
- Option B: Adds management of S3 buckets and version control, requiring handling Lambda function scheduling and execution, which increases unnecessary complexity and potential failure points.
- Option D: Although AWS Systems Manager Parameter Store is viable, it is not specifically designed for managing secrets like AWS Secrets Manager and does not directly support automatic RDS credential rotation.
Question 22
A hospital needs to modify its Lambda code to identify protected health information (PHI) in reports provided in PDF and JPEG format. The following solution meets this requirement with the least operational overhead:
Option C: Use Amazon Textract to extract the text from the reports. Use Amazon Comprehend Medical to identify the PHI from the extracted text.
Explanation
Using Amazon Textract to Extract Text
- Amazon Textract is an AWS service designed to automatically extract text and data from scanned documents. It can accurately handle both simple text extraction and complex document structures like tables and forms, making it suitable for processing PDFs and image files (such as JPEGs).
- Compared to using Python libraries for manual text extraction, Amazon Textract reduces development time and maintenance costs while providing more accurate results.
Using Amazon Comprehend Medical to Identify PHI
- Amazon Comprehend Medical is a specialized natural language processing (NLP) service tailored for handling medical text. It can identify and classify medical information, including PHI, directly from the text extracted by Amazon Textract.
- This method avoids the complexity and overhead associated with training or managing custom models, offering highly accurate PHI identification without additional effort.
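A minimal sketch of how the Lambda handler could chain the two services, assuming the report arrives as an object in Amazon S3; the event shape and field names are illustrative assumptions:

```python
import boto3

textract = boto3.client("textract")
comprehend_medical = boto3.client("comprehendmedical")

def lambda_handler(event, context):
    # Assumes the event carries the S3 location of the uploaded report.
    bucket = event["bucket"]
    key = event["key"]

    # Synchronous text detection handles images and single-page documents;
    # multi-page PDFs would use the asynchronous start_document_text_detection API.
    extraction = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    text = "\n".join(
        block["Text"] for block in extraction["Blocks"] if block["BlockType"] == "LINE"
    )

    # DetectPHI accepts up to 20,000 characters per call; larger reports
    # would need to be chunked before submission.
    phi = comprehend_medical.detect_phi(Text=text)
    return [
        {"type": entity["Type"], "text": entity["Text"], "score": entity["Score"]}
        for entity in phi["Entities"]
    ]
```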
Why Choose Option C
- Minimal Operational Overhead: Both Amazon Textract and Amazon Comprehend Medical are managed services that can be quickly integrated into the existing architecture, reducing development and operational workloads.
- High Accuracy: These services are optimized for high-quality text extraction and PHI identification, ensuring compliance with the stringent requirements of the healthcare industry.
- Ease of Implementation: Leveraging existing AWS services simplifies the deployment process, with good integration support between the services.
Analysis of Other Options
- Option A: Using Python libraries for text extraction and PHI identification requires more development effort and may not match the accuracy or efficiency of AWS's dedicated services.
- Option B: Utilizing Amazon SageMaker for PHI identification introduces additional complexity due to the need to build, train, and manage custom models, increasing operational burden.
- Option D: While Amazon Rekognition excels at image recognition, it is not optimized for text extraction from complex document layouts. Combining it with Amazon Comprehend Medical would add unnecessary complexity.
Question 23
To ensure high availability with minimum downtime and minimal data loss for a business-critical web application running on Amazon EC2 instances behind an Application Load Balancer, the following solution meets these requirements with the least operational effort:
Option B: Configure the Auto Scaling group to use multiple Availability Zones. Configure the database as Multi-AZ. Configure an Amazon RDS Proxy instance for the database.
Explanation
Using Multiple Availability Zones for Auto Scaling Group
- High Availability and Resilience: Distributing EC2 instances across multiple AZs enhances application availability and fault tolerance. If one AZ fails, the Auto Scaling group can launch new instances in other AZs to maintain service.
- Automatic Scaling: Auto Scaling adjusts the number of instances automatically based on traffic, ensuring efficient resource utilization and cost-effectiveness.
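A hedged sketch of what spreading the Auto Scaling group across AZs might look like with boto3; the launch template ID, subnet IDs (one per AZ), and target group ARN are placeholders:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Subnets in different Availability Zones; placeholder IDs for illustration.
multi_az_subnets = ["subnet-aaa111", "subnet-bbb222", "subnet-ccc333"]

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-app-asg",
    LaunchTemplate={"LaunchTemplateId": "lt-0123456789abcdef0", "Version": "$Latest"},
    MinSize=2,
    MaxSize=6,
    DesiredCapacity=2,
    # Comma-separated subnet IDs spanning multiple AZs enable cross-AZ placement.
    VPCZoneIdentifier=",".join(multi_az_subnets),
    # Register instances with the ALB target group so traffic follows capacity.
    TargetGroupARNs=[
        "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/abc123"
    ],
    HealthCheckType="ELB",
    HealthCheckGracePeriod=120,
)
```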
Configuring the Database as Multi-AZ Deployment
- High Availability and Disaster Recovery: A Multi-AZ Aurora PostgreSQL deployment runs one or more Aurora Replicas in different AZs that serve as automatic failover targets, improving database availability and reducing downtime.
- Data Durability: Aurora replicates data across multiple AZs at the storage layer, minimizing the risk of data loss if the primary instance or its AZ fails.
Configuring an Amazon RDS Proxy Instance
- Connection Management: RDS Proxy helps manage and optimize connections between the application and the database, especially under high concurrency, significantly improving performance and reducing latency.
- Simplified Database Connection Logic: RDS Proxy handles connection pooling and failover, reducing complexity in application code.
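As a rough illustration of the proxy layer, the boto3 sketch below creates an RDS Proxy that authenticates through a Secrets Manager secret and registers the database cluster as its target; every name, ARN, and subnet ID is a placeholder assumption:

```python
import boto3

rds = boto3.client("rds")

# Placeholder identifiers; a real deployment would reference existing resources.
rds.create_db_proxy(
    DBProxyName="web-app-proxy",
    EngineFamily="POSTGRESQL",
    Auth=[
        {
            "AuthScheme": "SECRETS",
            "SecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/app/rds-AbCdEf",
            "IAMAuth": "DISABLED",
        }
    ],
    RoleArn="arn:aws:iam::123456789012:role/rds-proxy-secrets-role",
    VpcSubnetIds=["subnet-aaa111", "subnet-bbb222"],
    RequireTLS=True,
)

# Point the proxy at the database cluster so connection pooling and failover
# handling happen in front of the database rather than in application code.
rds.register_db_proxy_targets(
    DBProxyName="web-app-proxy",
    DBClusterIdentifiers=["aurora-postgresql-cluster"],
)
```

The application then connects to the proxy endpoint instead of the cluster endpoint, so pooled connections survive failover with minimal disruption.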
Why Choose Option B
- Minimal Operational Effort: This solution leverages AWS managed services and features, reducing manual configuration and management work.
- High Availability and Fault Tolerance: Multi-AZ deployment for both the application and database layers ensures high availability.
- Data Security and Continuity: Multi-AZ database deployments provide better data protection and rapid recovery capabilities.
Analysis of Other Options
- Option A: While cross-region replication and Route 53 health checks offer higher availability, they add complexity and cost, introducing more potential failure points.
- Option C: Configuring the Auto Scaling group in a single AZ and relying on periodic snapshots does not provide sufficient fault tolerance, and snapshot-based recovery can take longer, leading to unacceptable service interruptions.
- Option D: Configuring the Auto Scaling group across multiple regions and using S3 with Lambda to write to the database adds unnecessary complexity and may cause data consistency and latency issues.
Question 24
To meet the recovery point objective (RPO) of 15 minutes and the recovery time objective (RTO) of 1 hour for a shopping application using Amazon DynamoDB to store customer information, the following solution is recommended:
Option B: Configure DynamoDB point-in-time recovery. For RPO recovery, restore to the desired point in time.
Explanation
- DynamoDB Point-in-Time Recovery
- Continuous Backups: DynamoDB’s point-in-time recovery feature automatically backs up your table data continuously, allowing you to restore your table to any point within the last 35 days, with an RPO down to seconds.
- Granular Restoration: You can restore your table to any second within the backup window, which meets the requirement of an RPO of 15 minutes.
- Fast Recovery: Restoring from a point-in-time backup typically takes less than an hour, depending on the size of the table and the amount of data being restored, meeting the RTO of 1 hour.
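A short boto3 sketch, under the assumption that the table is named `customer-data`, showing how point-in-time recovery is enabled and how a restore to a chosen timestamp is started:

```python
from datetime import datetime, timedelta, timezone

import boto3

dynamodb = boto3.client("dynamodb")

# Enable continuous backups with point-in-time recovery on the table.
dynamodb.update_continuous_backups(
    TableName="customer-data",
    PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
)

# Restore to a timestamp within the 35-day window (here: 10 minutes ago),
# which comfortably satisfies a 15-minute RPO.
restore_time = datetime.now(timezone.utc) - timedelta(minutes=10)
dynamodb.restore_table_to_point_in_time(
    SourceTableName="customer-data",
    TargetTableName="customer-data-restored",
    RestoreDateTime=restore_time,
)
```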
Why Choose Option B
- Meets RPO Requirements: Because point-in-time recovery lets you restore to any second within the retention window, potential data loss stays well within the 15-minute RPO.
- Meets RTO Requirements: The restoration process from a point-in-time backup is designed to be fast, often completing within the specified RTO of 1 hour.
- Operational Simplicity: This option requires minimal operational effort compared to other solutions because it leverages AWS-managed services, reducing the need for manual intervention or complex configurations.
Analysis of Other Options
- Option A: Configuring DynamoDB global tables would provide cross-region replication, enhancing availability and disaster recovery capabilities but does not directly address the RPO of 15 minutes. Additionally, switching to a different AWS Region for recovery might not meet the RTO of 1 hour due to potential delays in region failover and traffic redirection.
- Option C: Exporting DynamoDB data to Amazon S3 Glacier daily would not meet the RPO of 15 minutes since the export frequency is too low. Moreover, restoring data from S3 Glacier is a slow process, involving retrieval requests that could take several hours, failing to meet the RTO.
- Option D: DynamoDB does not use Amazon Elastic Block Store (Amazon EBS) volumes, so scheduling EBS snapshots for DynamoDB tables is not applicable. This option is incorrect and would not meet either the RPO or RTO requirements.
Question 25
To configure a real-time data ingestion architecture for an application that requires an API, a process to transform data as it is streamed, and a storage solution for the data, with the least operational overhead, the following solution is recommended:
Option C: Configure an Amazon API Gateway API to send data to an Amazon Kinesis Data Stream. Create an Amazon Kinesis Data Firehose delivery stream that uses the Kinesis Data Stream as a data source. Use AWS Lambda functions to transform the data. Use the Kinesis Data Firehose delivery stream to send the data to Amazon S3.
Explanation
Amazon API Gateway API
- Serverless API Endpoint: Provides a managed service for creating, deploying, and managing APIs without the need to manage servers.
Amazon Kinesis Data Stream
- Real-Time Data Streaming: Captures and processes large streams of real-time data such as clickstreams, application logs, and IoT telemetry.
Amazon Kinesis Data Firehose Delivery Stream
- Data Transformation & Loading: Automatically delivers streaming data to destinations like Amazon S3, applying transformations with AWS Lambda if necessary.
AWS Lambda Functions
- Event-Driven Processing: Allows you to run code in response to triggers from other AWS services, including Kinesis Data Streams, without provisioning or managing servers.
Amazon S3 Storage
- Durable Object Storage: Provides scalable and secure object storage suitable for storing transformed data for further processing or analysis.
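For the transformation step, Kinesis Data Firehose invokes a Lambda function with a batch of base64-encoded records and expects each record back with a result status. A minimal sketch follows, assuming the incoming records are JSON events (the field names are illustrative):

```python
import base64
import json

def lambda_handler(event, context):
    """Firehose data-transformation Lambda: decode, enrich, and re-encode each record."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))

        # Illustrative transformation: normalize a field and tag the record.
        payload["event_type"] = payload.get("event_type", "unknown").lower()
        payload["processed"] = True

        transformed = json.dumps(payload) + "\n"  # newline-delimited JSON for S3
        output.append(
            {
                "recordId": record["recordId"],
                "result": "Ok",  # "Dropped" or "ProcessingFailed" are also valid
                "data": base64.b64encode(transformed.encode("utf-8")).decode("utf-8"),
            }
        )
    return {"records": output}
```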
Why Choose Option C
- Minimal Operational Overhead: Utilizes fully managed AWS services (API Gateway, Kinesis, Lambda, Firehose, and S3) that reduce the need for manual server management and scaling.
- Automatic Scaling: All components automatically scale to handle varying amounts of traffic and data volume.
- Cost-Effective: Pay only for what you use, avoiding the cost of idle resources.
- Event-Driven Architecture: Enables real-time processing by triggering Lambda functions on data arrival, ensuring timely data transformation.
- Simplified Integration: Seamless integration between AWS services simplifies setup and reduces complexity.
Analysis of Other Options
- Option A: This option also involves Kinesis and Lambda but requires deploying and managing an Amazon EC2 instance to host the API, which increases operational overhead compared to using API Gateway.
- Option B: Using AWS Glue for data transformation and sending data to S3 introduces unnecessary complexity and latency. AWS Glue is primarily designed for ETL jobs rather than real-time data processing. Additionally, stopping source/destination checking on the EC2 instance does not provide any benefit for this use case.
- Option D: Configuring API Gateway to send data directly to AWS Glue bypasses the advantages of using Kinesis for real-time streaming. AWS Glue is not suited for low-latency, real-time data processing, making this option less efficient for the given requirements.
Question 26
To meet the requirement of retaining user transaction data in an Amazon DynamoDB table for 7 years, the following solution is recommended:
Option B: Use AWS Backup to create backup schedules and retention policies for the table.
Explanation
AWS Backup
- Automated Backup Management: AWS Backup provides a fully managed service that enables centralized management of backups across AWS services.
- Backup Schedules and Retention Policies: You can define backup schedules and set retention policies to ensure that your DynamoDB table data is retained for 7 years automatically.
- Compliance and Auditing: AWS Backup integrates with AWS Organizations, AWS CloudTrail, and AWS Key Management Service (AWS KMS) to help meet compliance requirements and provide detailed audit trails.
Why Choose Option B
- Operational Efficiency: AWS Backup reduces the operational burden by automating the backup process, ensuring that backups are performed consistently without manual intervention.
- Long-Term Retention: By setting up retention policies, you can guarantee that backups are kept for the required 7-year period.
- Scalability and Reliability: AWS Backup is designed to handle large volumes of data reliably and efficiently, scaling as your data grows.
- Cost-Effective: Pay only for the storage used by your backups, with no upfront costs or commitments.
Analysis of Other Options
- Option A: DynamoDB point-in-time recovery provides continuous backups but does not support retention beyond 35 days, so it is not suitable for retaining data for 7 years.
- Option C: Creating on-demand backups manually and storing them in an Amazon S3 bucket with an S3 Lifecycle configuration requires manual intervention each time a backup is needed. This approach increases operational overhead and does not automate retention policy enforcement.
- Option D: Setting up an Amazon EventBridge (formerly Amazon CloudWatch Events) rule to invoke an AWS Lambda function for backing up the table introduces additional complexity. While this method can automate backups, it requires custom development and maintenance of the Lambda function, and managing S3 Lifecycle configurations adds another layer of operational complexity.