AWS Solution Architect Associate Part 1:
Question1
An application runs on an Amazon EC2 instance in a VPC. The application processes logs that are stored in an Amazon S3 bucket. The EC2 instance needs to access the S3 bucket without connectivity to the internet.
Which solution will provide private network connectivity to Amazon S3?
- A. Create a gateway VPC endpoint to the S3 bucket.
- B. Stream the logs to Amazon CloudWatch Logs. Export the logs to the S3 bucket.
- C. Create an instance profile on Amazon EC2 to allow S3 access.
- D. Create an Amazon API Gateway API with a private link to access the S3 endpoint.
Answer:
A. Create a gateway VPC endpoint to the S3 bucket.
Explanation:
To allow an EC2 instance in a VPC to access an S3 bucket without needing internet connectivity, you can use a VPC endpoint. Specifically, a gateway VPC endpoint is designed to provide private network connectivity between your VPC and Amazon S3 (and DynamoDB) without using the internet.
B. Stream the logs to Amazon CloudWatch Logs. Export the logs to the S3 bucket.
- This option does not directly address the need for private connectivity between the EC2 instance and S3. It involves routing logs to CloudWatch, which may not be necessary if direct access to S3 is required.
C. Create an instance profile on Amazon EC2 to allow S3 access.
- An instance profile allows EC2 instances to assume an IAM role for permissions to access S3, but this does not provide private network connectivity. You still need an endpoint like a VPC endpoint for private access to S3.
D. Create an Amazon API Gateway API with a private link to access the S3 endpoint.
- API Gateway and PrivateLink are typically used for private access to services within VPCs, but creating an API Gateway for accessing S3 is unnecessary and complex compared to simply using a VPC endpoint.
Thus, creating a gateway VPC endpoint is the simplest and most efficient solution for providing private network connectivity to S3.
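To make this concrete, here is a minimal boto3 sketch of Option A that creates a gateway endpoint for S3 and associates it with a route table; the Region, VPC ID, and route table ID are placeholders for illustration only.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create a gateway VPC endpoint for S3 and attach it to the route tables
# used by the EC2 instance's subnets (IDs are placeholders).
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],
)
print(response["VpcEndpoint"]["VpcEndpointId"])
```

Once the endpoint is in place, S3 requests from the instance are routed privately through the endpoint instead of the internet, and an endpoint policy can further restrict which buckets are reachable.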
Question2
A company collects data for temperature, humidity, and atmospheric pressure in cities across multiple continents. The average volume of data that the company collects from each site daily is 500 GB. Each site has a high-speed Internet connection.
The company wants to aggregate the data from all these global sites as quickly as possible in a single Amazon S3 bucket. The solution must minimize operational complexity.
Which solution meets these requirements?
- A. Turn on S3 Transfer Acceleration on the destination S3 bucket. Use multipart uploads to directly upload site data to the destination S3 bucket.
- B. Upload the data from each site to an S3 bucket in the closest Region. Use S3 Cross-Region Replication to copy objects to the destination S3 bucket. Then remove the data from the origin S3 bucket.
- C. Schedule AWS Snowball Edge Storage Optimized device jobs daily to transfer data from each site to the closest Region. Use S3 Cross-Region Replication to copy objects to the destination S3 bucket.
- D. Upload the data from each site to an Amazon EC2 instance in the closest Region. Store the data in an Amazon Elastic Block Store (Amazon EBS) volume. At regular intervals, take an EBS snapshot and copy it to the Region that contains the destination S3 bucket. Restore the EBS volume in that Region.
Answer:
A. Turn on S3 Transfer Acceleration on the destination S3 bucket. Use multipart uploads to directly upload site data to the destination S3 bucket.
Explanation:
Option A is the best solution because S3 Transfer Acceleration speeds up the upload process for large amounts of data over long distances. By using multipart uploads, you can upload large files in parallel, improving throughput and efficiency. This solution minimizes operational complexity as it directly uploads to S3 without additional services or manual processes.
Option B involves using S3 Cross-Region Replication, which introduces unnecessary complexity by replicating data across regions. While it can be useful for disaster recovery, it's not as efficient for a simple data aggregation task.
Option C involves using AWS Snowball Edge, which is a physical data transfer solution. This is more complex and would be slower than using direct uploads for daily data aggregation, especially since the sites have high-speed Internet connections.
Option D is overly complex, as it involves setting up EC2 instances, EBS volumes, and taking snapshots. This is far more complicated than directly uploading the data to S3 using transfer acceleration.
Thus, Option A is the most suitable solution for aggregating data with minimal complexity and high speed.
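As a rough sketch of Option A, the following boto3 snippet enables Transfer Acceleration on the destination bucket and performs a multipart upload through the accelerate endpoint; the bucket name, file name, and tuning values are assumptions for illustration.

```python
import boto3
from boto3.s3.transfer import TransferConfig
from botocore.config import Config

BUCKET = "example-destination-bucket"  # placeholder bucket name

# One-time setup: enable Transfer Acceleration on the destination bucket.
boto3.client("s3").put_bucket_accelerate_configuration(
    Bucket=BUCKET,
    AccelerateConfiguration={"Status": "Enabled"},
)

# Upload through the accelerate endpoint, switching to multipart for large objects.
s3 = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
transfer_config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # use multipart above 64 MB
    multipart_chunksize=64 * 1024 * 1024,
    max_concurrency=10,                    # upload parts in parallel
)
s3.upload_file("site-data.tar.gz", BUCKET, "site-42/site-data.tar.gz", Config=transfer_config)
```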
Question3
A company needs the ability to analyze the log files of its proprietary application. The logs are stored in JSON format in an Amazon S3 bucket. Queries will be simple and will run on-demand. A solutions architect needs to perform the analysis with minimal changes to the existing architecture.
What should the solutions architect do to meet these requirements with the LEAST amount of operational overhead?
- A. Use Amazon Redshift to load all the content into one place and run the SQL queries as needed.
- B. Use Amazon CloudWatch Logs to store the logs. Run SQL queries as needed from the Amazon CloudWatch console.
- C. Use Amazon Athena directly with Amazon S3 to run the queries as needed.
- D. Use AWS Glue to catalog the logs. Use a transient Apache Spark cluster on Amazon EMR to run the SQL queries as needed.
Answer:
C. Use Amazon Athena directly with Amazon S3 to run the queries as needed.
Explanation:
Option C is the best choice as Amazon Athena allows you to run SQL queries directly on data stored in Amazon S3. Since the logs are already stored in S3 in JSON format, Athena can query this data without needing to move it or change the existing architecture. Athena is fully managed and requires minimal operational overhead. It is ideal for simple, on-demand queries and provides a straightforward solution for this use case.
Option A (Amazon Redshift) would require loading the logs into Redshift, which introduces additional complexity and overhead to set up and manage the Redshift cluster. It is more suited for complex analytics and data warehousing, not for simple log analysis.
Option B (Amazon CloudWatch Logs) would require moving the logs to CloudWatch, which introduces additional complexity for log management and is more suited for logs that need to be actively monitored, rather than for on-demand query-based analysis.
Option D (AWS Glue + Apache Spark on Amazon EMR) involves setting up AWS Glue for cataloging and running queries on an EMR cluster, which introduces significant operational overhead for simple log queries. This solution would be much more complex than necessary for on-demand queries on S3 data.
Thus, Option C is the most efficient and cost-effective solution for running simple, on-demand queries on JSON log data stored in Amazon S3.
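For illustration, a query against the JSON logs could be started with a single boto3 call like the sketch below; it assumes a table (here called app_logs) has already been defined over the S3 objects, for example with a CREATE EXTERNAL TABLE statement or a Glue crawler, and the database and result locations are placeholders.

```python
import boto3

athena = boto3.client("athena")

# Run an on-demand SQL query over the JSON logs already stored in S3.
response = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) AS hits FROM app_logs GROUP BY status",
    QueryExecutionContext={"Database": "logs_db"},          # placeholder database
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print(response["QueryExecutionId"])
```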
Question4
A company uses AWS Organizations to manage multiple AWS accounts for different departments. The management account has an Amazon S3 bucket that contains project reports. The company wants to limit access to this S3 bucket to only users of accounts within the organization in AWS Organizations.
Which solution meets these requirements with the LEAST amount of operational overhead?
- A. Add the aws:PrincipalOrgID global condition key with a reference to the organization ID to the S3 bucket policy.
- B. Create an organizational unit (OU) for each department. Add the aws:PrincipalOrgPaths global condition key to the S3 bucket policy.
- C. Use AWS CloudTrail to monitor the CreateAccount, InviteAccountToOrganization, LeaveOrganization, and RemoveAccountFromOrganization events. Update the S3 bucket policy accordingly.
- D. Tag each user that needs access to the S3 bucket. Add the aws:PrincipalTag global condition key to the S3 bucket policy.
Answer:
A. Add the aws:PrincipalOrgID global condition key with a reference to the organization ID to the S3 bucket policy.
Explanation:
Option A is the most efficient solution because the aws:PrincipalOrgID condition key in the S3 bucket policy allows you to restrict access based on the AWS organization ID. This ensures that only users from accounts within the specified organization can access the S3 bucket, without needing to manually manage tags, monitor account events, or create organizational units. This is the least operationally complex solution.
Option B introduces unnecessary complexity by requiring the creation of organizational units (OUs) for each department and adding additional conditions. While this could work, it is more complicated than using the organization ID directly.
Option C involves using AWS CloudTrail to monitor account-related events. While this could be used to track account changes, it does not directly address the requirement of restricting access to the S3 bucket, and would require additional operational overhead to monitor and update the bucket policy based on those events.
Option D requires tagging each user and adding a condition in the S3 bucket policy to check those tags. While this can work, it introduces additional manual overhead in managing tags for each user and does not directly leverage the organizational structure for access control.
Thus, Option A provides the simplest and most operationally efficient way to restrict access to the S3 bucket based on the AWS Organization ID.
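A minimal sketch of the bucket policy for Option A is shown below; the bucket name and organization ID are placeholders, and the allowed actions would be adjusted to what the reports actually require.

```python
import json
import boto3

# Bucket policy that only allows principals from accounts inside the
# organization (bucket name and organization ID are placeholders).
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowOrgAccountsOnly",
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-project-reports",
                "arn:aws:s3:::example-project-reports/*",
            ],
            "Condition": {"StringEquals": {"aws:PrincipalOrgID": "o-exampleorgid"}},
        }
    ],
}

boto3.client("s3").put_bucket_policy(
    Bucket="example-project-reports",
    Policy=json.dumps(bucket_policy),
)
```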
Question5
A company is hosting a web application on AWS using a single Amazon EC2 instance that stores user-uploaded documents in an Amazon EBS volume. For better scalability and availability, the company duplicated the architecture and created a second EC2 instance and EBS volume in another Availability Zone, placing both behind an Application Load Balancer. After completing this change, users reported that, each time they refreshed the website, they could see one subset of their documents or the other, but never all of the documents at the same time.
What should a solutions architect propose to ensure users see all of their documents at once?
- A. Copy the data so both EBS volumes contain all the documents
- B. Configure the Application Load Balancer to direct a user to the server with the documents
- C. Copy the data from both EBS volumes to Amazon EFS. Modify the application to save new documents to Amazon EFS
- D. Configure the Application Load Balancer to send the request to both servers. Return each document from the correct server
Answer:
C. Copy the data from both EBS volumes to Amazon EFS. Modify the application to save new documents to Amazon EFS.
Explanation:
Option C is the most suitable solution because Amazon EFS (Elastic File System) provides a scalable, shared file system that can be mounted simultaneously by multiple EC2 instances. By using Amazon EFS, both EC2 instances in different Availability Zones will have access to the same set of documents, ensuring that users can see all of their documents no matter which EC2 instance handles their request. This solution also removes the need for complex data synchronization between EBS volumes.
Option A suggests copying the data to both EBS volumes, which would introduce complexity and might not be scalable, as each EC2 instance would need to maintain its own copy of the data. Additionally, this could lead to data inconsistency issues.
Option B suggests configuring the Application Load Balancer to direct users to the server with the required documents, but this doesn’t address the core issue of sharing data between the EC2 instances. It could lead to users being directed to different instances where their documents are not available.
Option D suggests configuring the Application Load Balancer to send requests to both servers, which could cause confusion and performance issues. It does not ensure that both servers have access to all the documents, and users may still see incomplete data.
Thus, Option C is the most efficient and scalable approach to ensure users can access all of their documents simultaneously.
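As a rough sketch of the shared-storage side of Option C, the snippet below creates an EFS file system and a mount target in each Availability Zone; subnet and security group IDs are placeholders, and the security group must allow NFS (TCP 2049) from the instances.

```python
import boto3

efs = boto3.client("efs")

# Create a shared EFS file system for the user-uploaded documents.
fs = efs.create_file_system(CreationToken="shared-documents", Encrypted=True)
fs_id = fs["FileSystemId"]

# One mount target per Availability Zone so each EC2 instance mounts locally
# (subnet and security group IDs are placeholders).
for subnet_id in ["subnet-0123456789aaaaaaa", "subnet-0123456789bbbbbbb"]:
    efs.create_mount_target(
        FileSystemId=fs_id,
        SubnetId=subnet_id,
        SecurityGroups=["sg-0123456789ccccccc"],
    )

# Each instance then mounts the file system (e.g. with the EFS mount helper:
#   sudo mount -t efs <fs_id>:/ /mnt/documents
# ) and the application is updated to read and write documents under /mnt/documents.
```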
Question6
A company uses NFS to store large video files in on-premises network attached storage. Each video file ranges in size from 1 MB to 500 GB. The total storage is 70 TB and is no longer growing. The company decides to migrate the video files to Amazon S3. The company must migrate the video files as soon as possible while using the least possible network bandwidth.
Which solution will meet these requirements?
- A. Create an S3 bucket. Create an IAM role that has permissions to write to the S3 bucket. Use the AWS CLI to copy all files locally to the S3 bucket.
- B. Create an AWS Snowball Edge job. Receive a Snowball Edge device on premises. Use the Snowball Edge client to transfer data to the device. Return the device so that AWS can import the data into Amazon S3.
- C. Deploy an S3 File Gateway on premises. Create a public service endpoint to connect to the S3 File Gateway. Create an S3 bucket. Create a new NFS file share on the S3 File Gateway. Point the new file share to the S3 bucket. Transfer the data from the existing NFS file share to the S3 File Gateway.
- D. Set up an AWS Direct Connect connection between the on-premises network and AWS. Deploy an S3 File Gateway on premises. Create a public virtual interface (VIF) to connect to the S3 File Gateway. Create an S3 bucket. Create a new NFS file share on the S3 File Gateway. Point the new file share to the S3 bucket. Transfer the data from the existing NFS file share to the S3 File Gateway.
Answer:
B. Create an AWS Snowball Edge job. Receive a Snowball Edge device on premises. Use the Snowball Edge client to transfer data to the device. Return the device so that AWS can import the data into Amazon S3.
Explanation:
Option B is the most suitable solution because AWS Snowball Edge is designed for large-scale data transfer with minimal network usage. A Snowball Edge Storage Optimized device provides roughly 80 TB of usable storage, so the company's 70 TB fits on a single device. Because the data moves by physical transport, the migration consumes almost no network bandwidth, which is ideal for migrating large datasets quickly and efficiently.
Option A requires using the AWS CLI to copy the files directly to S3, which could consume a significant amount of network bandwidth, especially considering the large size and number of files. This would not meet the requirement of minimizing bandwidth usage.
Option C and Option D both involve setting up an S3 File Gateway, which allows transferring data to S3 as if it were an NFS share. While this could work, it still requires using the network to transfer all the files, and it might not be as bandwidth-efficient as using a physical device like Snowball Edge. Option D also introduces the complexity of setting up an AWS Direct Connect connection, which adds unnecessary operational overhead for this use case.
Thus, Option B offers the least bandwidth consumption while meeting the speed and scale requirements for the migration.
Question7
A company is running an SMB file server in its data center. The file server stores large files that are accessed frequently for the first few days after the files are created. After 7 days the files are rarely accessed.
The total data size is increasing and is close to the company's total storage capacity. A solutions architect must increase the company's available storage space without losing low-latency access to the most recently accessed files. The solutions architect must also provide file lifecycle management to avoid future storage issues.
Which solution will meet these requirements?
- A. Use AWS DataSync to copy data that is older than 7 days from the SMB file server to AWS.
- B. Create an Amazon S3 File Gateway to extend the company's storage space. Create an S3 Lifecycle policy to transition the data to S3 Glacier Deep Archive after 7 days.
- C. Create an Amazon FSx for Windows File Server file system to extend the company's storage space.
- D. Install a utility on each user's computer to access Amazon S3. Create an S3 Lifecycle policy to transition the data to S3 Glacier Flexible Retrieval after 7 days.
Answer:
B. Create an Amazon S3 File Gateway to extend the company's storage space. Create an S3 Lifecycle policy to transition the data to S3 Glacier Deep Archive after 7 days.
Explanation:
Option B is the best solution because Amazon S3 File Gateway provides a way to extend on-premises storage to the cloud while allowing access to files over the SMB protocol. The S3 Lifecycle policy can be used to transition data to S3 Glacier Deep Archive after 7 days, which is the ideal solution for infrequently accessed data. This meets the requirement of increasing available storage while still allowing low-latency access to recent files.
Option A involves using AWS DataSync to copy data older than 7 days to AWS, but it does not provide a seamless solution for accessing the data on an ongoing basis, nor does it handle lifecycle management effectively.
Option C involves Amazon FSx for Windows File Server, which is a fully managed Windows file system, but it doesn't address the need for lifecycle management (i.e., transitioning data to lower-cost storage after 7 days). FSx is more suited for continuous file access, but it can be more expensive compared to transitioning data to S3 Glacier.
Option D requires installing a utility on each user’s computer to access Amazon S3 directly, which would complicate access and management, and is not ideal for seamless file server access over SMB.
Thus, Option B offers a comprehensive solution by extending storage with an S3 File Gateway, providing lifecycle management, and ensuring low-latency access to recently used files.
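The lifecycle half of Option B can be expressed with a single API call; the sketch below applies a transition to S3 Glacier Deep Archive after 7 days on the bucket backing the File Gateway share (bucket name is a placeholder).

```python
import boto3

s3 = boto3.client("s3")

# Lifecycle rule on the bucket behind the S3 File Gateway file share:
# move objects to Glacier Deep Archive once they are 7 days old.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-file-gateway-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-after-7-days",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to all objects
                "Transitions": [{"Days": 7, "StorageClass": "DEEP_ARCHIVE"}],
            }
        ]
    },
)
```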
Question8
A global company hosts its web application on Amazon EC2 instances behind an Application Load Balancer (ALB). The web application has static data and dynamic data. The company stores its static data in an Amazon S3 bucket. The company wants to improve performance and reduce latency for the static data and dynamic data. The company is using its own domain name registered with Amazon Route 53. What should a solutions architect do to meet these requirements?
- A. Create an Amazon CloudFront distribution that has the S3 bucket and the ALB as origins. Configure Route 53 to route traffic to the CloudFront distribution.
- B. Create an Amazon CloudFront distribution that has the ALB as an origin. Create an AWS Global Accelerator standard accelerator that has the S3 bucket as an endpoint. Configure Route 53 to route traffic to the CloudFront distribution.
- C. Create an Amazon CloudFront distribution that has the S3 bucket as an origin. Create an AWS Global Accelerator standard accelerator that has the ALB and the CloudFront distribution as endpoints. Create a custom domain name that points to the accelerator DNS name. Use the custom domain name as an endpoint for the web application.
- D. Create an Amazon CloudFront distribution that has the ALB as an origin. Create an AWS Global Accelerator standard accelerator that has the S3 bucket as an endpoint. Create two domain names. Point one domain name to the CloudFront DNS name for dynamic content. Point the other domain name to the accelerator DNS name for static content. Use the domain names as endpoints for the web application.
Answer:
A. Create an Amazon CloudFront distribution that has the S3 bucket and the ALB as origins. Configure Route 53 to route traffic to the CloudFront distribution.
Explanation:
- Amazon CloudFront is a content delivery network (CDN) service that can cache content at edge locations around the world, reducing latency by serving content from the location closest to the user. By creating a CloudFront distribution, you can configure it to use both the S3 bucket (for static content) and the Application Load Balancer (ALB) (for dynamic content) as origins.
- S3 Bucket Origin: For static data, CloudFront will cache the content in its edge locations, which means that when users request static files, they will be served directly from the nearest edge location, significantly reducing latency.
- ALB Origin: For dynamic data, CloudFront will forward requests to the ALB, which will then route the requests to the appropriate EC2 instances. This ensures that dynamic content is served with low latency, as the ALB is designed to distribute traffic efficiently.
- Route 53 Configuration: You can configure Route 53 to point the company's domain name to the CloudFront distribution. This way, all traffic (both static and dynamic) will go through CloudFront, which will handle the routing based on the request type.
Why Other Options Are Less Suitable:
- Option B: Creating a CloudFront distribution with the ALB as an origin and using AWS Global Accelerator for the S3 bucket adds unnecessary complexity. Global Accelerator accelerates TCP and UDP traffic over the AWS global network but does not cache content, and an S3 bucket is not a supported Global Accelerator endpoint type, so it cannot provide the caching benefits that CloudFront offers for static content.
- Option C: Using AWS Global Accelerator with both the ALB and the CloudFront distribution as endpoints introduces additional complexity and cost. It also requires managing two different services for what could be handled more simply with a single CloudFront distribution.
- Option D: Creating two separate domain names, one for static content and one for dynamic content, would complicate the architecture and potentially lead to a less seamless user experience. Users would need to be directed to different domains based on the type of content, which is not ideal.
Therefore, Option A is the most efficient and straightforward solution, providing the best balance between performance, simplicity, and ease of management.
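Once the distribution has the S3 bucket and the ALB as origins, the Route 53 piece of Option A is just an alias record pointing the domain at the distribution. A minimal boto3 sketch follows; the hosted zone ID, domain, and distribution domain name are placeholders (Z2FDTNDATAQYW2 is the fixed hosted zone ID used for CloudFront alias targets).

```python
import boto3

route53 = boto3.client("route53")

# Alias the company's domain to the CloudFront distribution.
route53.change_resource_record_sets(
    HostedZoneId="Z0123456789EXAMPLE",          # placeholder hosted zone
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "www.example.com",
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": "Z2FDTNDATAQYW2",          # CloudFront alias zone
                        "DNSName": "d111111abcdef8.cloudfront.net",  # placeholder distribution
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    },
)
```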
Question9
A company runs an ecommerce application on Amazon EC2 instances behind an Application Load Balancer. The instances run in an Amazon EC2 Auto Scaling group across multiple Availability Zones. The Auto Scaling group scales based on CPU utilization metrics. The ecommerce application stores the transaction data in a MySQL 8.0 database that is hosted on a large EC2 instance. The database's performance degrades quickly as application load increases. The application handles more read requests than write transactions. The company wants a solution that will automatically scale the database to meet the demand of unpredictable read workloads while maintaining high availability. Which solution will meet these requirements?
- A. Use Amazon Redshift with a single node for leader and compute functionality.
- B. Use Amazon RDS with a Single-AZ deployment. Configure Amazon RDS to add reader instances in a different Availability Zone.
- C. Use Amazon Aurora with a Multi-AZ deployment. Configure Aurora Auto Scaling with Aurora Replicas.
- D. Use Amazon ElastiCache for Memcached with EC2 Spot Instances.
Answer
C. Use Amazon Aurora with a Multi-AZ deployment. Configure Aurora Auto Scaling with Aurora Replicas.
Explanation
Option C is the best solution because Amazon Aurora is a highly scalable and resilient managed relational database service that is compatible with MySQL and PostgreSQL. It provides several features that align well with the company's requirements:
- Multi-AZ Deployment: This ensures high availability and fault tolerance by replicating the database across multiple Availability Zones. If one AZ fails, the database can failover to another AZ with minimal downtime.
- Aurora Auto Scaling: This feature allows you to automatically add or remove Aurora Replicas based on the read workload. This ensures that the database can handle increased read traffic without manual intervention, thereby maintaining performance and scalability.
- Aurora Replicas: These replicas can be added to distribute read traffic, improving read performance. They can be placed in different Availability Zones to enhance availability and reduce latency.
Why Other Options Are Less Suitable
Option A (Amazon Redshift with a single node for leader and compute functionality):
- Redshift is a data warehousing service designed for complex queries and large-scale data analysis, not for transactional workloads. It is not suitable for an e-commerce application that requires low-latency read operations and frequent write transactions.
- A single-node deployment does not provide high availability or automatic scaling.
Option B (Amazon RDS with a Single-AZ deployment. Configure Amazon RDS to add reader instances in a different Availability Zone):
- Single-AZ Deployment: This does not provide high availability. If the primary instance fails, there will be downtime until a new instance is manually provisioned or the existing instance is restored.
- While adding reader instances can help with read scalability, it does not provide the same level of automated scaling and high availability as Aurora.
Option D (Amazon ElastiCache for Memcached with EC2 Spot Instances):
- ElastiCache is a caching service that can improve read performance by caching frequently accessed data in memory. However, it is not a replacement for a relational database and does not provide the same level of durability and consistency required for transactional data.
- EC2 Spot Instances can be interrupted at any time when EC2 needs the capacity back, which can lead to data loss or service interruptions. This makes them unsuitable for critical e-commerce applications.
Therefore, Option C is the most appropriate solution as it provides a managed, highly available, and scalable database service that can handle the read-heavy workload of the e-commerce application while maintaining high availability.
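To illustrate the Aurora Auto Scaling part of Option C, the sketch below registers the cluster's replica count with Application Auto Scaling and adds a target-tracking policy on reader CPU; the cluster identifier, capacity limits, and target value are assumptions for illustration.

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Placeholder Aurora cluster identifier.
resource_id = "cluster:ecommerce-aurora-cluster"

# Allow the cluster to scale between 1 and 8 Aurora Replicas.
autoscaling.register_scalable_target(
    ServiceNamespace="rds",
    ResourceId=resource_id,
    ScalableDimension="rds:cluster:ReadReplicaCount",
    MinCapacity=1,
    MaxCapacity=8,
)

# Target-tracking policy: add or remove replicas to keep average reader CPU near 60%.
autoscaling.put_scaling_policy(
    PolicyName="aurora-replica-cpu-tracking",
    ServiceNamespace="rds",
    ResourceId=resource_id,
    ScalableDimension="rds:cluster:ReadReplicaCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "RDSReaderAverageCPUUtilization"
        },
    },
)
```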
Question10
A company recently migrated to AWS and wants to implement a solution to protect the traffic that flows in and out of the production VPC. The company had an inspection server in its on-premises data center. The inspection server performed specific operations such as traffic flow inspection and traffic filtering. The company wants to have the same functionalities in the AWS Cloud. Which solution will meet these requirements?
- A. Use Amazon GuardDuty for traffic inspection and traffic filtering in the production VPC.
- B. Use Traffic Mirroring to mirror traffic from the production VPC for traffic inspection and filtering.
- C. Use AWS Network Firewall to create the required rules for traffic inspection and traffic filtering for the production VPC.
- D. Use AWS Firewall Manager to create the required rules for traffic inspection and traffic filtering for the production VPC.
Answer
C. Use AWS Network Firewall to create the required rules for traffic inspection and traffic filtering for the production VPC.
Explanation
- Option C (AWS Network Firewall):
- AWS Network Firewall is a managed service that provides a stateful firewall and intrusion detection/prevention system (IDS/IPS) for your VPCs. It allows you to create and enforce security rules to inspect and filter traffic at the network layer. This service can perform deep packet inspection and apply rules to block or allow traffic based on predefined policies. It is well-suited for scenarios where you need to replicate the functionality of an on-premises inspection server, including traffic flow inspection and filtering.
Why Other Options Are Less Suitable
Option A (Amazon GuardDuty):
- Amazon GuardDuty is a threat detection service that continuously monitors for malicious activity and unauthorized behavior. While it can detect potential threats, it does not provide the ability to inspect and filter traffic in real-time. It is more focused on alerting and forensic analysis rather than active traffic filtering.
Option B (Traffic Mirroring):
- Traffic Mirroring allows you to copy network traffic from elastic network interfaces and send it to a traffic mirror target (another network interface, a Network Load Balancer, or a Gateway Load Balancer endpoint fronting an inspection appliance) for analysis. While this can be useful for traffic analysis and monitoring, it does not provide built-in capabilities for traffic filtering and enforcement. You would need to implement additional logic to filter and act on the mirrored traffic, which adds complexity and operational overhead.
Option D (AWS Firewall Manager):
- AWS Firewall Manager is a service that helps you manage and automate the creation and maintenance of security rules across multiple AWS accounts and resources. While it can be used to manage security policies, it relies on underlying services like AWS Network Firewall or AWS WAF (Web Application Firewall) to enforce those policies. It does not provide the actual traffic inspection and filtering capabilities itself.
Therefore, Option C (AWS Network Firewall) is the most appropriate solution because it provides a comprehensive set of features for traffic inspection and filtering, similar to what the company had with its on-premises inspection server. It is a managed service that can be easily integrated into the VPC and provides the necessary stateful inspection and rule-based filtering to meet the company's requirements.
Question11
An application development team is designing a microservice that will convert large images to smaller, compressed images. When a user uploads an image through the web interface, the microservice should store the image in an Amazon S3 bucket, process and compress the image with an AWS Lambda function, and store the image in its compressed form in a different S3 bucket. A solutions architect needs to design a solution that uses durable, stateless components to process the images automatically. Which combination of actions will meet these requirements? (Choose two.)
- A. Create an Amazon Simple Queue Service (Amazon SQS) queue. Configure the S3 bucket to send a notification to the SQS queue when an image is uploaded to the S3 bucket.
- B. Configure the Lambda function to use the Amazon Simple Queue Service (Amazon SQS) queue as the invocation source. When the SQS message is successfully processed, delete the message in the queue.
- C. Configure the Lambda function to monitor the S3 bucket for new uploads. When an uploaded image is detected, write the file name to a text file in memory and use the text file to keep track of the images that were processed.
- D. Launch an Amazon EC2 instance to monitor an Amazon Simple Queue Service (Amazon SQS) queue. When items are added to the queue, log the file name in a text file on the EC2 instance and invoke the Lambda function.
- E. Configure an Amazon EventBridge (Amazon CloudWatch Events) event to monitor the S3 bucket. When an image is uploaded, send an alert to an Amazon Simple Notification Service (Amazon SNS) topic with the application owner's email address for further processing.
Answer
A. Create an Amazon Simple Queue Service (Amazon SQS) queue. Configure the S3 bucket to send a notification to the SQS queue when an image is uploaded to the S3 bucket.
B. Configure the Lambda function to use the Amazon Simple Queue Service (Amazon SQS) queue as the invocation source. When the SQS message is successfully processed, delete the message in the queue.
Explanation
Option A:
- Amazon S3 Notification to SQS: By configuring the S3 bucket to send a notification to an SQS queue when an image is uploaded, you ensure that the upload event triggers the processing workflow. This decouples the upload event from the processing, making the system more resilient and scalable.
Option B:
- Lambda Function Invoked by SQS: Configuring the Lambda function to use the SQS queue as the invocation source allows the Lambda function to process messages from the queue. This ensures that the Lambda function is only invoked when there is work to do, and it can handle multiple messages concurrently. Deleting the message from the queue after successful processing ensures that the message is not processed again, maintaining idempotency.
Why Other Options Are Less Suitable:
Option C:
- Lambda Function Monitoring S3 Bucket: While this approach can work, it is less efficient and more complex compared to using an SQS queue. Writing file names to a text file in memory is not a durable or scalable solution, and it adds unnecessary complexity to the Lambda function.
Option D:
- EC2 Instance Monitoring SQS Queue: Using an EC2 instance to monitor the SQS queue and invoke the Lambda function adds unnecessary complexity and increases costs. The EC2 instance also introduces a single point of failure, which goes against the requirement for durable, stateless components.
Option E:
- EventBridge (CloudWatch Events) to SNS: While this approach can be used for alerting and notification, it does not provide a mechanism for automatically processing the images. Sending an alert to an SNS topic with the application owner's email address for further processing is not an automated solution and does not meet the requirement for automatic image processing.
Therefore, Options A and B together provide a robust, scalable, and automated solution for processing and compressing images using durable, stateless components. They leverage AWS managed services to handle the event-driven workflow efficiently.
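A minimal sketch of wiring Options A and B together is shown below; the bucket name, queue ARN, and function name are placeholders, and the queue policy is assumed to already allow s3.amazonaws.com to send messages.

```python
import boto3

s3 = boto3.client("s3")
lambda_client = boto3.client("lambda")

queue_arn = "arn:aws:sqs:us-east-1:111122223333:image-upload-queue"  # placeholder

# 1. S3 sends an event to the SQS queue for every new upload.
s3.put_bucket_notification_configuration(
    Bucket="example-upload-bucket",
    NotificationConfiguration={
        "QueueConfigurations": [
            {"QueueArn": queue_arn, "Events": ["s3:ObjectCreated:*"]}
        ]
    },
)

# 2. The SQS queue invokes the compression Lambda function; messages are
#    deleted automatically when the function completes successfully.
lambda_client.create_event_source_mapping(
    EventSourceArn=queue_arn,
    FunctionName="compress-image",  # placeholder function name
    BatchSize=10,
)
```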
Question12
A company provides a Voice over Internet Protocol (VoIP) service that uses UDP connections. The service consists of Amazon EC2 instances running in an Auto Scaling group, with deployments across multiple AWS Regions. The company needs to route users to the Region with the lowest latency and also requires automated failover between Regions.
Solution
A. Deploy a Network Load Balancer (NLB) and an associated target group. Associate the target group with the Auto Scaling group. Use the NLB as an AWS Global Accelerator endpoint in each Region.
Explanation
- Network Load Balancer (NLB): The NLB is designed to handle TCP and UDP traffic, making it suitable for VoIP services that use UDP. It provides high performance and low latency, which is crucial for real-time communication.
- Auto Scaling Group: By associating the NLB with an Auto Scaling group, you ensure that the EC2 instances can scale automatically based on demand, maintaining the availability and performance of the VoIP service.
- AWS Global Accelerator: This service uses the AWS global network infrastructure to route traffic to the nearest and most available endpoint, which in this case would be the NLB in the Region with the lowest latency. AWS Global Accelerator also provides automatic failover to another healthy endpoint if the primary one becomes unavailable, ensuring high availability and reliability.
Why Other Options Are Less Suitable
- Option B (Application Load Balancer (ALB)): ALBs do not support UDP traffic, so they are not suitable for a VoIP service that uses UDP.
- Option C (NLB with Route 53 Latency Record and CloudFront): While this option uses an NLB, which is appropriate for UDP, it introduces unnecessary complexity by using CloudFront. CloudFront is designed for web content delivery and does not natively support UDP. Additionally, using Route 53 latency records for DNS-based routing is less efficient and less reliable than using AWS Global Accelerator for real-time traffic.
- Option D (ALB with Route 53 Weighted Record and CloudFront): Similar to Option B, ALBs do not support UDP, and using CloudFront for UDP traffic is not ideal. Route 53 weighted records can be used for load balancing, but they do not provide the same level of performance and failover capabilities as AWS Global Accelerator.
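To sketch Option A end to end, the snippet below creates a Global Accelerator, adds a UDP listener, and attaches a Regional NLB as an endpoint; the accelerator name, ports, and NLB ARN are placeholders, and one endpoint group would be created per Region. (The Global Accelerator API is served from us-west-2 regardless of where the endpoints live.)

```python
import boto3

ga = boto3.client("globalaccelerator", region_name="us-west-2")

accelerator = ga.create_accelerator(Name="voip-accelerator", Enabled=True)
accelerator_arn = accelerator["Accelerator"]["AcceleratorArn"]

# UDP listener for the VoIP signaling ports (placeholder port range).
listener = ga.create_listener(
    AcceleratorArn=accelerator_arn,
    Protocol="UDP",
    PortRanges=[{"FromPort": 5060, "ToPort": 5061}],
)

# One endpoint group per Region, each pointing at that Region's NLB; Global
# Accelerator routes users to the lowest-latency healthy group and fails over
# automatically.
ga.create_endpoint_group(
    ListenerArn=listener["Listener"]["ListenerArn"],
    EndpointGroupRegion="us-east-1",
    EndpointConfigurations=[
        {
            "EndpointId": "arn:aws:elasticloadbalancing:us-east-1:111122223333:loadbalancer/net/voip-nlb/0123456789abcdef",
            "Weight": 128,
        }
    ],
)
```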
Question13
A company runs an online marketplace web application on AWS. The application serves hundreds of thousands of users during peak hours. The company needs a scalable, near-real-time solution to share the details of millions of financial transactions with several other internal applications. Transactions also need to be processed to remove sensitive data before being stored in a document database for low-latency retrieval.
Solution
C. Stream the transaction data into Amazon Kinesis Data Streams. Use AWS Lambda integration to remove sensitive data from every transaction and then store the transaction data in Amazon DynamoDB. Other applications can consume the transaction data off the Kinesis data stream.
Explanation
Scalability and Near-Real-Time Processing:
- Amazon Kinesis Data Streams is designed for real-time processing of streaming data at scale. It can handle high volumes of data and provide low-latency access to the data.
- AWS Lambda can be integrated with Kinesis Data Streams to process each transaction in near real-time. This allows you to remove sensitive data as soon as the transaction is streamed.
Data Storage and Low-Latency Retrieval:
- Amazon DynamoDB is a fully managed NoSQL database that provides fast and predictable performance with seamless scalability. It is ideal for storing and retrieving transaction data with low latency.
- After processing the transactions with Lambda, the cleaned data can be stored in DynamoDB, ensuring that it is available for low-latency retrieval by other applications.
Data Sharing:
- Other applications can consume the transaction data directly from the Kinesis Data Stream, allowing for real-time or near-real-time processing and sharing of the data.
- Additionally, you can set up multiple consumers or use Kinesis Data Streams' enhanced fan-out feature to ensure that multiple applications can read the data independently and in parallel.
Why Other Options Are Less Suitable
Option A (Store the transaction data into Amazon DynamoDB. Set up a rule in DynamoDB to remove sensitive data from every transaction upon write. Use DynamoDB Streams to share the transaction data with other applications.):
- DynamoDB does not have a built-in mechanism to automatically remove sensitive data upon write. You would need to use a Lambda function triggered by DynamoDB Streams to process the data, which is less efficient than processing the data in the stream itself.
- This approach also adds complexity and potential latency, as the data needs to be written to DynamoDB first, then processed, and then potentially re-written.
Option B (Stream the transaction data into Amazon Kinesis Data Firehose to store data in Amazon DynamoDB and Amazon S3. Use AWS Lambda integration with Kinesis Data Firehose to remove sensitive data. Other applications can consume the data stored in Amazon S3.):
- Kinesis Data Firehose is more suitable for batch processing and delivery to storage services like S3. It is not designed for real-time consumption by multiple applications.
- While you can use Lambda to process the data, the primary purpose of Firehose is to deliver data to storage, not to enable real-time consumption by multiple applications.
Option D (Store the batched transaction data in Amazon S3 as files. Use AWS Lambda to process every file and remove sensitive data before updating the files in Amazon S3. The Lambda function then stores the data in Amazon DynamoDB. Other applications can consume transaction files stored in Amazon S3.):
- This approach is more suitable for batch processing rather than real-time or near-real-time processing.
- Storing and processing data in S3 files introduces additional latency and complexity, and it is not as efficient for real-time data sharing and consumption.
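As a rough illustration of the processing step in Option C, the Lambda handler below could be invoked by an event source mapping on the Kinesis data stream; the DynamoDB table name and the sensitive field names are assumptions for illustration.

```python
import base64
import json

import boto3

table = boto3.resource("dynamodb").Table("transactions")   # placeholder table name
SENSITIVE_FIELDS = {"card_number", "cvv"}                    # placeholder field names


def handler(event, context):
    """Process a batch of Kinesis records: strip sensitive data, store in DynamoDB."""
    for record in event["Records"]:
        # Kinesis record payloads arrive base64 encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))

        # Remove sensitive attributes before persisting.
        cleaned = {k: v for k, v in payload.items() if k not in SENSITIVE_FIELDS}

        table.put_item(Item=cleaned)
```

Other internal applications can consume the same stream independently (for example as additional consumers or with enhanced fan-out) without interfering with this function.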
Question14
A company hosts its multi-tier applications on AWS. For compliance, governance, auditing, and security, the company must track configuration changes on its AWS resources and record a history of API calls made to these resources.
Solution
B. Use AWS Config to track configuration changes and AWS CloudTrail to record API calls.
Explanation
AWS Config:
- Configuration Tracking: AWS Config provides a detailed view of the configuration of AWS resources in your account. It records the current configuration of your resources and can also track changes to those configurations over time.
- Compliance and Governance: AWS Config helps you evaluate the compliance of your resources by using predefined or custom rules. This is useful for auditing and ensuring that your resources adhere to your company's policies and best practices.
AWS CloudTrail:
- API Call Recording: AWS CloudTrail logs API calls made to your AWS account, including those made through the AWS Management Console, AWS SDKs, command-line tools, and other AWS services. This includes who made the request, what the request was, when it was made, and the source IP address.
- Security and Auditing: CloudTrail logs are essential for security analysis, resource change tracking, and compliance auditing. They provide a history of all API calls, which can be used to detect and respond to unauthorized or suspicious activity.
Why Other Options Are Less Suitable
Option A (Use AWS CloudTrail to track configuration changes and AWS Config to record API calls):
- Incorrect Usage: AWS CloudTrail is not designed to track configuration changes; it is used to record API calls. AWS Config, on the other hand, is specifically designed to track configuration changes and not to record API calls.
Option C (Use AWS Config to track configuration changes and Amazon CloudWatch to record API calls):
- Incorrect Usage: Amazon CloudWatch is primarily used for monitoring and logging metrics, logs, and events. While it can be used to monitor some API call-related metrics, it does not provide the detailed logging of API calls that AWS CloudTrail does.
Option D (Use AWS CloudTrail to track configuration changes and Amazon CloudWatch to record API calls):
- Incorrect Usage: As mentioned, AWS CloudTrail is for recording API calls, not for tracking configuration changes. Amazon CloudWatch, while useful for monitoring, is not the primary tool for recording detailed API call history.
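A minimal setup sketch for Option B is shown below; trail, bucket, and role names are placeholders, and the S3 buckets are assumed to already have policies that allow CloudTrail and AWS Config to write to them.

```python
import boto3

# --- AWS CloudTrail: record a history of API calls across all Regions ---
cloudtrail = boto3.client("cloudtrail")
cloudtrail.create_trail(
    Name="org-audit-trail",
    S3BucketName="example-cloudtrail-logs",   # placeholder bucket
    IsMultiRegionTrail=True,
)
cloudtrail.start_logging(Name="org-audit-trail")

# --- AWS Config: track configuration changes on all supported resources ---
config = boto3.client("config")
config.put_configuration_recorder(
    ConfigurationRecorder={
        "name": "default",
        "roleARN": "arn:aws:iam::111122223333:role/config-recorder-role",  # placeholder
        "recordingGroup": {"allSupported": True, "includeGlobalResourceTypes": True},
    }
)
config.put_delivery_channel(
    DeliveryChannel={"name": "default", "s3BucketName": "example-config-snapshots"}
)
config.start_configuration_recorder(ConfigurationRecorderName="default")
```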
Question15
A company recently launched a variety of new workloads on Amazon EC2 instances in its AWS account. The company needs to create a strategy to access and administer the instances remotely and securely. The company needs to implement a repeatable process that works with native AWS services and follows the AWS Well-Architected Framework.
Solution
B. Attach the appropriate IAM role to each existing instance and new instance. Use AWS Systems Manager Session Manager to establish a remote SSH session.
Explanation
- AWS Systems Manager (SSM) Session Manager:
- Security: Using Session Manager for remote access does not require opening inbound ports, maintaining bastion hosts, or managing SSH keys. This reduces the attack surface and enhances security.
- Auditability: All sessions through Session Manager can be logged and audited, which is essential for compliance and governance requirements.
- Operational Simplicity: No additional infrastructure (like a bastion host) or complex network configurations are needed. Administrators can easily connect to EC2 instances via the AWS Management Console, AWS CLI, or SDK.
- Integration: Session Manager integrates with AWS Identity and Access Management (IAM), allowing fine-grained control over who can access which instances.
Why Other Options Are Less Suitable
Option A (Use the EC2 serial console to directly access the terminal interface of each instance for administration):
- Limitations: The serial console is primarily used for troubleshooting and emergency access. It is not suitable for regular management and maintenance. Additionally, it does not provide full SSH functionality and is not supported for all instance types.
- Operational Overhead: Requires manual enabling of the serial console and may need additional security measures to protect console access.
Option C (Create an administrative SSH key pair. Load the public key into each EC2 instance. Deploy a bastion host in a public subnet to provide a tunnel for administration of each instance):
- Operational Overhead: Requires managing a bastion host, including its security and availability. Also, requires managing SSH key pairs, ensuring their secure storage and distribution.
- Complexity: Involves configuring network security rules (such as security groups) to allow traffic from the bastion host to the EC2 instances. This adds complexity to network configuration.
Option D (Establish an AWS Site-to-Site VPN connection. Instruct administrators to use their local on-premises machines to connect directly to the instances by using SSH keys across the VPN tunnel):
- Operational Overhead: Establishing and maintaining a Site-to-Site VPN connection requires additional network configuration and management. Also, requires managing SSH key pairs and ensuring their secure storage and distribution.
- Complexity: Involves configuring and managing the VPN connection, along with related routing and security settings. This adds complexity to network and security configurations.
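To sketch Option B, the snippet below attaches an instance profile (whose role carries the AmazonSSMManagedInstanceCore managed policy) to an existing instance and then requests a session; the profile name and instance ID are placeholders, and the interactive shell itself requires the Session Manager plugin on the administrator's machine.

```python
import boto3

ec2 = boto3.client("ec2")
ssm = boto3.client("ssm")

# Attach an instance profile that grants the SSM agent permission to register
# (profile name and instance ID are placeholders).
ec2.associate_iam_instance_profile(
    IamInstanceProfile={"Name": "ssm-managed-instance-profile"},
    InstanceId="i-0123456789abcdef0",
)

# Administrators then connect without any open inbound ports, e.g. from the CLI:
#   aws ssm start-session --target i-0123456789abcdef0
# The equivalent API call:
session = ssm.start_session(Target="i-0123456789abcdef0")
print(session["SessionId"])
```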
Question16
A company has thousands of edge devices that collectively generate 1 TB of status alerts each day, with each alert being approximately 2 KB in size. The company needs a highly available solution to ingest and store these alerts for future analysis. The requirements include:
- Minimizing costs.
- Avoiding the management of additional infrastructure.
- Keeping 14 days of data available for immediate analysis.
- Archiving any data older than 14 days.
Solution
A. Create an Amazon Kinesis Data Firehose delivery stream to ingest the alerts. Configure the Kinesis Data Firehose stream to deliver the alerts to an Amazon S3 bucket. Set up an S3 Lifecycle configuration to transition data to Amazon S3 Glacier after 14 days.
Explanation
Amazon Kinesis Data Firehose:
- Serverless and Fully Managed: Kinesis Data Firehose is a serverless service that does not require managing any infrastructure, which aligns with the company's requirement to avoid additional infrastructure management.
- High Availability: Kinesis Data Firehose is designed to be highly available, capable of handling large volumes of data streams, and includes automatic failover and retry mechanisms.
- Data Ingestion and Delivery: Kinesis Data Firehose can reliably deliver data to Amazon S3, ensuring that no data is lost during the process.
Amazon S3 and S3 Lifecycle Policies:
- Storage and Archival: Storing data in Amazon S3 provides easy access and analysis capabilities. By setting up S3 Lifecycle policies, data can be automatically transitioned to Amazon S3 Glacier after 14 days, reducing storage costs.
- Data Retention: S3 offers durability and high availability, making it suitable for long-term data storage.
Why Other Options Are Less Suitable
Option B (Launch Amazon EC2 instances across two Availability Zones and place them behind an Elastic Load Balancer to ingest the alerts. Create a script on the EC2 instances that will store the alerts in an Amazon S3 bucket. Set up an S3 Lifecycle configuration to transition data to Amazon S3 Glacier after 14 days):
- Operational Complexity: Managing multiple EC2 instances, a load balancer, and custom scripts increases operational complexity and management overhead.
- Cost: Running EC2 instances incurs ongoing costs, and additional management and monitoring are required.
Option C (Create an Amazon Kinesis Data Firehose delivery stream to ingest the alerts. Configure the Kinesis Data Firehose stream to deliver the alerts to an Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster. Set up the Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster to take manual snapshots every day and delete data from the cluster that is older than 14 days):
- Cost and Management: Managing an OpenSearch Service cluster requires additional resources and costs. While it provides real-time analysis capabilities, it is not the most cost-effective solution for just storing and archiving data.
- Data Retention: Manual snapshot and data deletion processes add operational complexity and may not be as automated and reliable as S3 lifecycle policies.
Option D (Create an Amazon Simple Queue Service (Amazon SQS) standard queue to ingest the alerts, and set the message retention period to 14 days. Configure consumers to poll the SQS queue, check the age of the message, and analyze the message data as needed. If the message is 14 days old, the consumer should copy the message to an Amazon S3 bucket and delete the message from the SQS queue):
- Operational Complexity: Requires writing and maintaining consumer code to handle messages, and managing the message lifecycle, which increases operational complexity.
- Performance: The message polling mechanism in SQS can introduce latency, especially when handling large volumes of data. Additionally, manually copying messages to S3 and deleting old messages is less automated and efficient compared to using S3 lifecycle policies.
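A rough sketch of the ingestion side of Option A follows; the stream name, role ARN, bucket ARN, and buffering values are placeholders, and the same S3 lifecycle configuration shown earlier would transition objects to Glacier after 14 days.

```python
import boto3

firehose = boto3.client("firehose")

# Delivery stream that buffers incoming alerts and writes them to S3.
firehose.create_delivery_stream(
    DeliveryStreamName="device-alerts",
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::111122223333:role/firehose-delivery-role",  # placeholder
        "BucketARN": "arn:aws:s3:::example-alert-archive",                    # placeholder
        "Prefix": "alerts/",
        "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
    },
)

# Edge devices (or a lightweight forwarder) push each ~2 KB alert with PutRecord.
firehose.put_record(
    DeliveryStreamName="device-alerts",
    Record={"Data": b'{"device_id": "edge-001", "status": "OK"}\n'},
)
```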
Question17
A company's application integrates with multiple software-as-a-service (SaaS) sources for data collection. The company currently runs Amazon EC2 instances to receive the data and upload it to an Amazon S3 bucket for analysis. The same EC2 instance also sends a notification to the user when an upload is complete. The company has noticed slow application performance and wants to improve the performance as much as possible.
Solution
B. Create an Amazon AppFlow flow to transfer data between each SaaS source and the S3 bucket. Configure an S3 event notification to send events to an Amazon Simple Notification Service (Amazon SNS) topic when the upload to the S3 bucket is complete.
Explanation
Amazon AppFlow:
- Data Transfer: Amazon AppFlow is a fully managed service that allows easy data transfer between SaaS applications and AWS services. It supports multiple SaaS providers (such as Salesforce, Slack, Google Analytics, etc.) and can automatically handle data extraction and loading.
- Serverless: Using AppFlow eliminates the need to manage any infrastructure, thereby reducing operational overhead.
- Scalability: AppFlow can handle large volumes of data and automatically scales as needed.
S3 Event Notifications:
- Notifications: Configure S3 event notifications to trigger events and send them to an Amazon SNS topic when data is uploaded to the S3 bucket. This ensures that users are notified immediately upon completion of the upload.
- Simple Configuration: S3 event notifications are easy to configure using the AWS Management Console or API.
Why Other Options Are Less Suitable
Option A (Create an Auto Scaling group so that EC2 instances can scale out. Configure an S3 event notification to send events to an Amazon Simple Notification Service (Amazon SNS) topic when the upload to the S3 bucket is complete):
- Operational Complexity: While Auto Scaling can improve performance, it requires managing Auto Scaling groups, load balancers, and EC2 instances, increasing operational complexity and management overhead.
- Cost: Running multiple EC2 instances incurs additional costs.
Option C (Create an Amazon EventBridge (Amazon CloudWatch Events) rule for each SaaS source to send output data. Configure the S3 bucket as the rule's target. Create a second EventBridge (CloudWatch Events) rule to send events when the upload to the S3 bucket is complete. Configure an Amazon Simple Notification Service (Amazon SNS) topic as the second rule's target):
- Complexity: Managing multiple EventBridge rules for each SaaS source adds complexity to the configuration and management.
- Limitation: EventBridge is primarily used for event-driven architectures and is not specifically designed for data transfer. Although it can configure S3 as a target, it is not as specialized as AppFlow for this purpose.
Option D (Create a Docker container to use instead of an EC2 instance. Host the containerized application on Amazon Elastic Container Service (Amazon ECS). Configure Amazon CloudWatch Container Insights to send events to an Amazon Simple Notification Service (Amazon SNS) topic when the upload to the S3 bucket is complete):
- Operational Complexity: Managing Docker containers, ECS clusters, and related configurations increases operational complexity and management overhead.
- Cost: Running an ECS cluster and managing containers can add to the overall cost.
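The notification half of Option B is a one-call configuration, sketched below; the bucket name and topic ARN are placeholders, and the SNS topic policy is assumed to allow s3.amazonaws.com to publish.

```python
import boto3

s3 = boto3.client("s3")

# Notify users through SNS as soon as an object lands in the bucket that
# AppFlow writes to (names and ARNs are placeholders).
s3.put_bucket_notification_configuration(
    Bucket="example-saas-landing-bucket",
    NotificationConfiguration={
        "TopicConfigurations": [
            {
                "TopicArn": "arn:aws:sns:us-east-1:111122223333:upload-complete",
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)
```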
Question18
A company runs a highly available image-processing application on Amazon EC2 instances in a single VPC. The EC2 instances run inside several subnets across multiple Availability Zones. The EC2 instances do not communicate with each other but download and upload images from and to Amazon S3 through a single NAT gateway. The company is concerned about data transfer charges and wants to find the most cost-effective way to avoid regional data transfer charges.
Solution
C. Deploy a gateway VPC endpoint for Amazon S3.
Explanation
- Amazon S3 Gateway VPC Endpoint:
- Zero Data Transfer Fees: By using an Amazon S3 Gateway VPC Endpoint, EC2 instances can communicate directly with Amazon S3 without going through a NAT gateway. This means there are no additional regional data transfer fees.
- High Availability: VPC Endpoints are highly available because they are managed by AWS and can be used across multiple Availability Zones.
- No Additional Management: There is no need to manage additional infrastructure, such as NAT gateways or NAT instances, reducing operational complexity and management overhead.
Why Other Options Are Less Suitable
Option A (Launch the NAT gateway in each Availability Zone):
- Cost: Each NAT gateway incurs additional costs, and deploying multiple NAT gateways in different Availability Zones would increase expenses.
- Data Processing Fees: Even with a NAT gateway in each Availability Zone, all S3 traffic would still flow through the gateways and incur per-GB NAT gateway data processing charges, so the cost concern is not eliminated.
Option B (Replace the NAT gateway with a NAT instance):
- Management Complexity: NAT instances require self-management, including software updates, patching, and health monitoring.
- Availability and Reliability: NAT instances are less reliable and less highly available than NAT gateways, because each one depends on a single EC2 instance.
- Data Transfer Fees: Using NAT instances would still result in regional data transfer fees.
Option D (Provision an EC2 Dedicated Host to run the EC2 instances):
- Cost: EC2 Dedicated Hosts are generally more expensive than on-demand instances unless there are specific compliance or performance requirements.
- Flexibility: Dedicated Hosts offer less flexibility in dynamically adjusting resources based on demand.
Question19
A solutions architect is developing a VPC architecture that includes multiple subnets. The architecture will host applications that use Amazon EC2 instances and Amazon RDS DB instances. The architecture consists of six subnets in two Availability Zones. Each Availability Zone includes a public subnet, a private subnet, and a dedicated subnet for databases. Only EC2 instances that run in the private subnets can have access to the RDS databases.
Solution
Option C: Create a security group that allows inbound traffic from the security group that is assigned to instances in the private subnets. Attach the security group to the DB instances.
Explanation
To meet the requirement that only EC2 instances in the private subnets can access the RDS databases, Option C is the most suitable choice:
Security Groups: Security groups act as virtual firewalls for your instances to control inbound and outbound traffic. By creating a security group that specifically allows inbound traffic from the security group assigned to instances in the private subnets, you ensure that only those instances can access the RDS databases.
Implementation Steps:
- Create a Security Group for RDS Instances:
- Define rules that allow inbound traffic on necessary ports (e.g., port 3306 for MySQL or Aurora) from the security group used by EC2 instances in the private subnets.
- Example:
| Type | Protocol | Port Range | Source |
| --- | --- | --- | --- |
| MySQL/Aurora | TCP | 3306 | sg-private-subnet |
- Attach the Security Group to RDS Instances:
- Assign this newly created security group to the RDS DB instances during creation or modification.
Why Other Options Are Less Suitable
A. Create a new route table that excludes the route to the public subnets' CIDR blocks. Associate the route table with the database subnets:
- Route tables control how traffic is routed within a VPC but do not enforce access controls based on source or destination IP addresses. This option does not provide the necessary security isolation required.
B. Create a security group that denies inbound traffic from the security group that is assigned to instances in the public subnets. Attach the security group to the DB instances:
- Denying specific traffic can be less secure and more complex to manage. A better approach is to explicitly allow only the necessary traffic, as done in Option C.
D. Create a new peering connection between the public subnets and the private subnets. Create a different peering connection between the private subnets and the database subnets:
- Peering connections are used to connect VPCs or different regions, not subnets within the same VPC. Additionally, it complicates the network configuration unnecessarily for this scenario.
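A minimal boto3 sketch of Option C is shown below; the VPC ID and the private-subnet application security group ID are placeholders corresponding to the example rule above.

```python
import boto3

ec2 = boto3.client("ec2")

# Security group for the RDS DB instances (VPC ID is a placeholder).
db_sg = ec2.create_security_group(
    GroupName="rds-mysql-sg",
    Description="Allow MySQL only from the private-subnet application tier",
    VpcId="vpc-0123456789abcdef0",
)

# Inbound rule on TCP 3306 whose source is the application tier's security
# group rather than a CIDR range, so only those EC2 instances can connect.
ec2.authorize_security_group_ingress(
    GroupId=db_sg["GroupId"],
    IpPermissions=[
        {
            "IpProtocol": "tcp",
            "FromPort": 3306,
            "ToPort": 3306,
            "UserIdGroupPairs": [{"GroupId": "sg-0privateappexample"}],
        }
    ],
)
```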
Question20
A company has a website hosted on AWS. The website is behind an Application Load Balancer (ALB) that is configured to handle HTTP and HTTPS separately. The company wants to forward all requests to the website so that the requests will use HTTPS.
Solution
Option C: Create a listener rule on the ALB to redirect HTTP traffic to HTTPS.
Explanation
To meet the requirement of forwarding all requests to the website using HTTPS, Option C is the most suitable choice:
- Listener Rules: An Application Load Balancer (ALB) can be configured with listener rules that allow you to define how the ALB should respond to incoming traffic. By creating a listener rule specifically for HTTP traffic (port 80), you can set up a redirection action to automatically redirect all HTTP requests to HTTPS (port 443).
Why Other Options Are Less Suitable
A. Update the ALB's network ACL to accept only HTTPS traffic:
- Network ACLs operate at the subnet level and can block or allow traffic based on IP address and port number. However, they cannot perform URL-based redirection. This option would prevent HTTP access but not redirect it to HTTPS.
B. Create a rule that replaces the HTTP in the URL with HTTPS:
- While this might work in theory, rewriting the URL is not a built-in feature of the ALB and would require custom implementation, such as changes in the application code or an additional edge service in front of the load balancer. It is less straightforward than configuring a listener redirect rule.
D. Replace the ALB with a Network Load Balancer configured to use Server Name Indication (SNI):
- A Network Load Balancer (NLB) operates at Layer 4 (transport layer) and does not support HTTP/HTTPS redirection natively. SNI is used for hosting multiple SSL certificates on a single NLB, which is unrelated to the task of redirecting HTTP to HTTPS.
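For completeness, the redirect in Option C can be created as the default action of the HTTP listener, as in the sketch below; the load balancer ARN is a placeholder, and an HTTPS listener on port 443 with the site's certificate is assumed to exist already.

```python
import boto3

elbv2 = boto3.client("elbv2")

# HTTP (port 80) listener whose only action is a permanent redirect to HTTPS.
elbv2.create_listener(
    LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:loadbalancer/app/example-alb/0123456789abcdef",
    Protocol="HTTP",
    Port=80,
    DefaultActions=[
        {
            "Type": "redirect",
            "RedirectConfig": {
                "Protocol": "HTTPS",
                "Port": "443",
                "StatusCode": "HTTP_301",
            },
        }
    ],
)
```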