******************************************************************************
* Description: Knowledge of AWS solution architect associate certificate exam
* Date: 05:35 PM EST, 09/02/2017
******************************************************************************

	 
	 
<1> AWS services covered within the exam:
     |
     |__ 1) Messaging:
     |       |
     |       |__ o. SNS: Simple Notification Service.
     |
     |__ 2) Desktop and App streaming:
     |       |
     |       |__ o. WorkSpaces:
     |
     |__ 3) Security & identity:
     |       |
     |       |__ o. IAM:
     |       |
     |       |__ o. Inspector: An agent installed on a virtual machine that produces a security report [lightly covered in this exam].
     |       |
     |       |__ o. Certificate Manager: SSL certificates for your domain names.
     |       |
     |       |__ o. Directory Service: Connects Active Directory to AWS [import].
     |       |
     |       |__ o. WAF - Web Application Firewall. Protects your website at the network/application level, e.g. against SQL injection.
     |       |
     |       |__ o. Artifact - a way to get AWS compliance documents from the console.
     |
     |
     |__ 4) Management tools:
     |       |
     |       |__ o. CloudWatch: Monitoring performance. EC2 disk/CPU utilization.
     |       |
     |       |__ o. CloudFormation: Turns your IT infrastructure into code that describes each component, the cloud counterparts of a physical firewall, network switch, or physical machine.
     |       |                      A single command creates the whole IT environment.
     |       |
     |       |__ o. CloudTrail: Audits activities and changes within the IT environment.
     |       |
     |       |__ o. Config: Automatic warnings about your environment's configuration.
     |       |
     |       |__ o. OpsWorks: Automates deployment using Chef.
     |       |
     |       |__ o. Service Catalog: For larger enterprises - a catalog of the specific servers/services the company has authorized for use. [Not an exam topic].
     |       |
     |       |__ o. Trusted Advisor: Automated checks of your environment's health, e.g. disk fault tolerance.
     |       |
     |       |__ o. Managed Services:
     |      
     |
     |__ 5) Storage:
     |       |
     |       |__ o. S3: Object based.
     |       |
     |       |__ o. Glacier: Low-cost storage for infrequently accessed files, but retrieval is slow.
     |       |
     |       |__ o. EFS: File based (a shared file system). Block based storage is EBS.
     |       |
     |       |__ o. Storage Gateway: A virtual server that you set up from the AWS console to connect your data center to the cloud.
     |                               It provides seamless and secure integration between an organization's on-premises IT environment and AWS's storage infrastructure.
     |
     |
     |__ 6) Database:
     |       |
     |       |__ o. RDS: relational database service.
     |       |
     |       |__ o. DynamoDB: non-relational (NoSQL) database, very high performance - covered more in the developer exam.
     |       |
     |       |__ o. Redshift: data warehouse service - big data.
     |       |
     |       |__ o. ElastiCache: caching data in the cloud.
     |
     |
     |
     |__ 7) Networking & content Delivery:
     |       |
     |       |__ o. VPC: Virtual Private Cloud. A virtual data center.
     |       |
     |       |__ o. Route 53: AWS DNS service. Users can register domain names here. Port 53 is the DNS port.
     |       |
     |       |__ o. Direct Connect: A dedicated network line from your physical data center to the cloud - secure, reliable connectivity.
     |       |
     |       |__ o. CloudFront: A distribution lets you deliver content through a worldwide network of edge locations that provide low latency and high data transfer speeds.
     |	 
     |
     |__ 8) Compute:
     |       |
     |       |__ o. EC2:
     |       |
     |       |__ o. EC2 Container Service:
     |       |
     |       |__ o. Elastic Beanstalk: Aimed more at developers than at solution architects. Broadly, it analyzes the developer's code and provisions the infrastructure for it.
     |       |
     |       |__ o. Lambda: serverless - upload your code and it is executed automatically. Not in exam.
     |       |
     |       |__ o. Lightsail: For publishing things like a WordPress blog. Not in exam.
     |	 
     |
     |__ 9) AWS global infrastructure:
     |       |
     |       |__ o. Regions: a geographical area consisting of 2 or more availability zones.
     |       |
     |       |__ o. Availability zones: essentially a data center within a region. AZs in the same region are close together to keep latency low.
     |       |
     |       |__ o. Edge location: content delivery network endpoints for CloudFront. They mainly cache large media files, such as videos.
     |                             If a video is hosted in NY but requested from China, the edge location caches it near the viewer. There are more edge locations than regions.
     |
     |__ 10) Migration:
     |       |
     |       |__ o. Snowball: import/export - bundled disk transfer into the cloud (S3, EBS virtual disks). IDA. Enterprise level.
     |       |
     |       |__ o. DMS: Database Migration Service lets you migrate your primary database to the cloud.
     |       |           For example, migrate an Oracle database from your physical data center to AWS RDS/Aurora to get rid of the Oracle licensing fee. Replication with no downtime.
     |       |
     |       |__ o. SMS: Server Migration Service - moves virtual machines.
     |
     |
     |__ 11) Analytics Service:
             |
             |__ o. Athena: Suppose your S3 bucket holds lots of .csv/.json files.
             |              This service lets you run SQL queries against the data in those files, turning flat text files into a searchable database.
             |
             |__ o. EMR: big data processing (a high-level view, and how to access it, is enough).
             |
             |__ o. CloudSearch: search engine for your website.
             |
             |__ o. Elasticsearch Service:
             |
             |__ o. Kinesis: Streaming real-time data. Processes TBs of data per hour.
             |
             |__ o. Data Pipeline: moves data, e.g. from S3 to DynamoDB. Not much of it at associate level.
             |
             |__ o. QuickSight: Business analysis tool. Create dashboards for your data.
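
The Athena entry above describes running SQL directly against flat .csv/.json files. The idea can be tried locally without an AWS account. The sketch below is only an analogy and does not use the Athena API: it loads a small made-up CSV sample into an in-memory SQLite table and queries it. The sample data, the table name `logs`, and the helper `query_flat_file` are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a .csv object stored in an S3 bucket.
SAMPLE_CSV = """region,service,requests
us-east-1,s3,120
eu-west-1,s3,80
us-east-1,ec2,45
"""


def query_flat_file(csv_text, sql):
    """Load CSV text into an in-memory table named `logs` and run one SQL query."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logs (region TEXT, service TEXT, requests INTEGER)")
    # Named placeholders are filled from each row dictionary.
    conn.executemany("INSERT INTO logs VALUES (:region, :service, :requests)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


totals = query_flat_file(
    SAMPLE_CSV,
    "SELECT region, SUM(requests) FROM logs GROUP BY region ORDER BY region",
)
print(totals)
```

The point is the same as with Athena: the files stay flat text, and SQL is layered on top when a question needs answering.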
			 
	

<2> Major Concepts:
     1) A region is a geographical area. Each region consists of 2 or more availability zones, and an AZ is essentially a data center. AZs in the same region are close together to keep latency low.
     2) Edge Location: content delivery network endpoints for CloudFront. They mainly cache large media files, such as videos.
                       There are more edge locations than regions. Suppose a video is stored on a server in NY, but requested from China.
                       Once the first user in China has watched it, the edge location in China serves the cached copy to the second user.
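
The edge location behaviour described above (first request goes to the origin, later requests are served from the cache) can be sketched as a toy model. This is only an illustration, not how CloudFront is implemented: the class `EdgeLocation`, the dictionary standing in for the NY origin server, and the 60-second time to live are all assumptions made for the example.

```python
import time


class EdgeLocation:
    """Toy model of one edge location caching objects from an origin."""

    def __init__(self, origin, ttl_seconds, clock=time.monotonic):
        self.origin = origin            # dict standing in for the origin server
        self.ttl_seconds = ttl_seconds  # how long a cached copy stays valid
        self.clock = clock              # injectable so expiry can be simulated
        self._cache = {}                # key -> (value, expiry time)
        self.origin_fetches = 0

    def get(self, key):
        now = self.clock()
        cached = self._cache.get(key)
        if cached is not None and cached[1] > now:
            return cached[0], "HIT"
        # Miss or expired entry: fetch from the origin and cache the result.
        value = self.origin[key]
        self.origin_fetches += 1
        self._cache[key] = (value, now + self.ttl_seconds)
        return value, "MISS"


origin_ny = {"video.mp4": b"video-bytes"}
edge_china = EdgeLocation(origin_ny, ttl_seconds=60)
first = edge_china.get("video.mp4")   # first viewer: fetched from NY
second = edge_china.get("video.mp4")  # second viewer: served from the edge
print(first[1], second[1], edge_china.origin_fetches)
```

Only the first viewer pays the long round trip to the origin; everyone after that, until the cached copy expires, is served locally.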
<3> Identity Access Management [IAM] - allows you to manage users and their level of access to the AWS console:
     o. IAM is global/universal. It does not apply to any particular region.
     o. You can customize the AWS console sign-in URL.
     o. The root user account is the one you created with your email address.
     o. A new user has no permissions.
     o. The Access Key ID & Secret Access Key are only for API access, not for console login.
     o. The Access Key ID & Secret Access Key are shown only once, when generated. If you lose them, a new pair must be created.
     o. Always set up Multi-Factor Authentication on your root account. [ Google Authenticator ]
     o. An admin can customize the password policy, e.g. length, upper/lower case, numeric characters.
     o. Power user - can access any AWS service except the management of groups and users within IAM.
     o. Centralised control of your AWS account.
     o. Shared access to your AWS account.
     o. Granular permissions.
     o. Identity federation (including Active Directory, Facebook, LinkedIn etc).
     o. Multi-factor authentication.
     o. Provides temporary access for users/devices and services where necessary.
     o. Allows you to set up your own password rotation policy.
     o. Integrates with many different AWS services.
     o. Supports PCI DSS compliance.
     o. Users.
     o. Groups - a collection of users under one set of permissions.
     o. Roles - you create roles and can assign them to AWS resources.
     o. Policies - a document that defines one or more permissions. Comparable to an Oracle user profile.
     o. Region - only global.
     o. Console DNS - customizable.
     o. Activate multi-factor authentication (MFA) on your root account - the root account is the email address that you sign up/in with.
     o. Enable MFA - Google Authenticator.
     o. IAM user console URL - https://91932334972.signin.aws.amazon.com/console

<4> Storage - Simple Storage Service [S3] - provides secure, durable, highly-scalable object storage:
     o. Object based storage only, such as video, flat files, pdf, word, image.
     o. The data is spread across multiple devices and facilities, and it is cheap.
     o. Files can be from 0 bytes to 5 TB.
     o. Storage is unlimited - AWS monitors usage across regions globally.
     o. Files are stored in buckets. A bucket is just a folder with a unique name. [The name should be all lower case]
     o. S3 is a universal namespace, that is, names must be unique globally.
     o. When you upload a file to S3, you receive an HTTP 200 code if the upload was successful.
     o. Data Consistency Model for S3 [Critical]
          => Read after write consistency for PUTS of new objects.
          => Eventual consistency for overwrite PUTS and DELETES (can take some time to propagate).
          => In short: when you upload a new object such as a PDF file, you can read it immediately. When you update or delete
             an object, the change may take some time [milliseconds] to propagate, because the object is stored across multiple
             devices. If a file is read while it is being updated, you get either the old version or the new version,
             never partial or corrupted data.
     o. There is no fee for copying data from EC2 to S3 if they are in the same region.
     o. S3 is highly available and fault tolerant. No need to worry if a single facility/AZ goes down.
     o. URL of S3 bucket .................................. https://s3-eu-west-1.amazonaws.com/emeralit
     o. URL of S3 bucket holding a static website ......... https://S3_Bucket_Name.s3-website-ap-us-east-1.amazonaws.com
     o. When using Route 53 for a website hosted in S3, the bucket name must be the same as the domain name.
     o. The minimum file size that you can upload to S3 is 0 bytes.
     o. Versioning of S3 must be enabled for cross region replication.
     o. Key - the name of the object.
     o. Value - the data, made of a sequence of bytes.
     o. Version id - important for versioning.
     o. Metadata - data about the data you are storing, such as last uploaded/modified time.
     o. Subresources - access control list (object permission), torrent.
     o. Built for 99.99% availability for the S3 platform.
     o. Amazon guarantees 99.9% availability.
     o. Amazon ensures 99.999999999% (11 nines) durability for S3.
     o. Tiered storage available.
     o. Lifecycle management - deletion, retention.
     o. Versioning.
     o. Encryption.
     o. Secure your data using access control lists and bucket policies.
     o. Storage Tiers/Classes:
          >> S3 - 99.99% availability, 99.999999999% durability, stored redundantly across multiple devices/facilities and designed to sustain the loss of 2 facilities concurrently.
          >> S3 - IA (Infrequently Accessed) - for data that is accessed less frequently, but requires rapid access when needed. Lower fee than S3, but you are charged a retrieval fee.
          >> Reduced Redundancy Storage - designed to provide 99.99% durability and availability of objects over a given year.
          >> Glacier - very cheap, but used for archival only. It takes 3-5 hours to restore from Glacier.
     o. Glacier is an extremely low-cost storage service for data archival. It stores data for as little as $0.01 per GB per month, and is optimized for data that is
        infrequently accessed and for which retrieval times of 3 to 5 hours are suitable.
     o. S3 charges - storage / requests / storage management pricing / data transfer pricing (replication from one region to another) / transfer acceleration.
     o. Amazon S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between your end users and an S3 bucket.
        Transfer Acceleration takes advantage of Amazon CloudFront's globally distributed edge locations [an edge location is a site very close to the end user].
        As the data arrives at an edge location, it is routed to Amazon S3 over an optimized network path.
     o. Create S3 bucket: Go to --> Properties --> Storage Management --> Switch between "New Console" and "Old Console".
     o. Properties:
          |
          |__ Versioning: protects against users deleting object files accidentally.
          |
          |__ Logging [auditing user access to the bucket]: CloudTrail audits access to all services, which can be overwhelming.
          |
          |__ Static website hosting [HTML only, not for PHP]
          |       |
          |       |__ o. Scalable
          |       |
          |       |__ o. Cheap
          |
          |__ Tag
          |
          |__ Cross region replication
          |
          |__ Events: e.g. a notification when a photo is uploaded by a user
          |
          |__ Lifecycle: S3 / S3-IA / S3 Reduced Redundancy
          |
          |__ Permission: every S3 bucket is private by default when created.
          |
          |__ Management: metrics with number of requests, storage usage, inventory reports.
     o. Versioning of S3 bucket:
          ==> Versioning can only be suspended, not removed.
          ==> If you delete one version of an object, you cannot restore it via the new console.
          ==> Clicking the object link points to the latest version --> restore from the old console via the "Delete Marker".
          ==> A deleted object is marked with a delete marker, but can still be restored via the old console.
              It is like moving a file to the recycle bin before deleting it permanently.
     o. Stores all versions of an object, including all writes, even if you delete the object.
     o. Can be used for backup purposes as well.
     o. Once versioning is enabled, it cannot be disabled, only suspended.
     o. Integrates with lifecycle rules, such as moving objects to Glacier after a retention period, or deleting them.
     o. Versioning offers MFA delete capability: deleting a file then requires multi-factor authentication, which improves security.
     o. S3 - Cross Region Replication:
          ==> Versioning must be enabled on both the source and destination buckets.
          ==> Regions must be unique. You cannot replicate between buckets in the same region.
          ==> Files in an existing bucket are not replicated automatically. All subsequently updated files are replicated automatically.
          ==> You cannot replicate to multiple buckets or use daisy chaining (at this time).
          ==> Delete markers are replicated.
          ==> Deleting individual versions or delete markers is not replicated.
          ==> Understanding what cross region replication is at a high level is enough.
          ==> A prefix is just a folder under your bucket.
     o. S3/Glacier - Lifecycle management:
          ==> Can be used in conjunction with versioning.
          ==> Can be applied to current versions and previous versions.
          ==> The following actions can now be done:
               1) Transition to the Standard - Infrequent Access storage class (objects of at least 128 KB, 30 days after the creation date).
               2) Archive to the Glacier storage class (30 days after IA, if relevant).
               3) Permanently delete.
          ==> A Glacier archive can be up to 40 TB; the number of archives is unlimited; uploads are synchronous; upload/download/delete are supported;
              once an archive is uploaded, it is immutable [unchangeable]. Glacier and AWS Storage Gateway encrypt data natively.
     o. Securing your buckets:
          ==> By default, all newly created buckets are PRIVATE.
          ==> You can set up access control to your bucket using:
               <1> Bucket policies.
               <2> Access control lists - individual permissions.
          ==> S3 buckets can be configured to create access logs, which log all requests made to the S3 bucket. The logs can be written to another bucket.
          ==> Encryption:
               <1> In transit - SSL/TLS [HTTPS]
               <2> At rest:
                    >> Server side encryption:
                         1) S3 managed keys - SSE-S3.
                         2) AWS Key Management Service - SSE-KMS.
                         3) Server side encryption with customer provided keys - SSE-C.
                    >> Client side encryption: you encrypt the data yourself and then upload it to S3.

<5> Storage - CloudFront - A content delivery network (CDN) is a system of distributed servers (network) that deliver webpages and other web content to a user based on the geographic location of the user, the origin of the webpage and a content delivery server.
     o. Key Terminology: Edge Location - the location where content is cached.
     o. The accelerated URL format is https://bucket_name.s3-accelerate.amazonaws.com
     o. An edge location is separate from an AWS Region/AZ.
     o. Origin - the origin of all the files that the CDN will distribute. This can be an S3 bucket, an EC2 instance, an Elastic Load Balancer or Route 53.
     o. Distribution - the name given to the CDN, which consists of a collection of edge locations.
     o. CloudFront can be used to deliver your entire website, including dynamic, static, streaming, and interactive content, using a global network of edge locations.
        Requests for your content are automatically routed to the nearest edge location, so content is delivered with the best possible performance.
     o. CloudFront is optimized to work with other services, such as S3, EC2, ELB, and Route 53. It also works seamlessly with any non-AWS origin server,
        which stores the original, definitive versions of your files.
     o. Web distribution - typically used for websites.
     o. RTMP - used for media streaming - Adobe Flash.
     o. Web distribution/RTMP distribution.
     o. Edge locations are not just READ only, you can write to them too.
     o. Objects are cached for the life of the TTL [Time to Live - how long an object stays cached].
     o. You can clear cached objects, but you will be charged. [The default TTL is 24 hours. If you remove objects earlier, e.g. to publish a new version of a video, you will be charged.]
     o. CloudFront is not only for download, it is also available for upload.

<6> Storage - Storage Gateway - a service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization's on-premises IT environment and AWS's storage infrastructure. The service enables you to securely store data in the AWS cloud for scalable and cost-effective storage.
     o. AWS Storage Gateway's software appliance is available for download as a virtual machine image that you install on a host in your data center.
        It supports either VMware ESXi or Microsoft Hyper-V. Once you have installed your gateway and associated it with your AWS account through the activation process,
        you can use the AWS management console to create the storage gateway option that is right for you.
     o. Four types of Storage Gateway (counting the two volume modes separately):
          >> File Gateway (NFS) - flat files - .pdf/video/.word ===> S3 - brand new. The most recently used data is cached on the gateway for low-latency access,
             and data transfer between your data center and AWS is fully managed and optimized by the gateway.
          >> Volume Gateway (iSCSI) - block based storage, OS/virtual hard disk/SQL Server, not flat files:
               a) Stored mode: primary data is stored locally and your entire dataset is available for low-latency access while asynchronously backed up to AWS. [Google Drive]
               b) Cached mode: primary data is written to S3, while your frequently accessed data is retained locally in a cache for low-latency access.
          >> Tape Gateway (VTL)
     o. File Gateway:
          >> Files are stored as objects in your S3 buckets, accessed through a Network File System mount point.
          >> Ownership, permissions, and timestamps are durably stored in S3 in the user metadata of the object associated with the file.
          >> Once objects are transferred to S3, they can be managed as native S3 objects, and bucket policies such as versioning, lifecycle management,
             and cross-region replication apply directly to objects stored in your bucket.
     o. Volume Gateway [virtual hard disk]:
          >> The volume interface presents your applications with disk volumes using the iSCSI block protocol.
          >> Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots.
          >> Snapshots are incremental backups that capture only changed blocks. All snapshot storage is also compressed to minimize your storage charges.
     o. Tape Gateway:
          >> It offers a durable, cost-effective solution to archive your data in the AWS cloud.
          >> The VTL interface it provides lets you leverage your existing tape-based backup application infrastructure to store data on virtual tape cartridges that you create on your tape gateway.
          >> Each tape gateway is pre-configured with a media changer and tape drives, which are available to your existing client backup applications as iSCSI devices.
          >> You add tape cartridges as you need to archive your data.
          >> Supported by NetBackup, Backup Exec, Veeam etc.
     o. Exam Tips:
          >> File Gateway - for flat files, stored directly on S3.
          >> Volume Gateway:
               a) Stored volumes - the entire dataset is stored on site and is asynchronously backed up to S3.
               b) Cached volumes - the entire dataset is stored on S3 and the most frequently accessed data is cached on site.
          >> Gateway Virtual Tape Library (VTL) - used for backup, and works with popular backup applications like NetBackup, Backup Exec, Veeam etc.

<7> Storage - Snowball - Previously, AWS Import/Export Disk accelerated moving large amounts of data into and out of the AWS cloud using portable storage devices for transport. AWS Import/Export Disk transfers your data directly onto and off of storage devices using Amazon's high-speed internal network, bypassing the internet. The problem was that every client sent in different devices, which were hard to manage. That is why Snowball was introduced. Snowball is a physical device produced by AWS, about the size of a suitcase.
     o. Snowball types:
          a) Standard Snowball.
          b) Snowball Edge.
          c) Snowmobile.
     o. Exam Tips - Snowball:
          >> Understand what Snowball is.
          >> Understand what Import/Export is.
          >> Snowball can:
               >> Import to S3
               >> Export from S3

<8> EC2:
     o. Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. Amazon EC2 reduces the time required to obtain and
        boot new server instances to minutes, allowing you to quickly scale capacity both up and down as your computing requirements change.
     o. Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use. Amazon EC2 provides developers the tools to build
        failure resilient applications and isolate themselves from common failure scenarios.
     o. EC2 types by price:
          >> On Demand ... Allows you to pay a fixed rate by the hour with no commitment. Suits a startup company.
          >> Reserved .... Provides you with a capacity reservation, and offers a significant discount on the hourly charge for an instance. 1 year or 3 year terms.
          >> Spot ........ Enables you to bid whatever price you want for instance capacity, providing even greater savings if your applications have flexible start and end times.
          >> Dedicated ... A physical EC2 server dedicated for your use. Dedicated Hosts can help you reduce costs by allowing you to use your existing server-bound software licenses.
     o. EC2 - On Demand:
          >> Users that want the low cost and flexibility of Amazon EC2 without any up-front payment or long-term commitment.
          >> Applications with short term, spiky, or unpredictable workloads that cannot be interrupted.
          >> Applications being developed or tested on Amazon EC2 for the first time, e.g. at a startup company.
     o. EC2 - Reserved:
          >> For example, reserving 2 servers for the annual Black Friday.
          >> Applications with steady state or predictable usage.
          >> Applications that require reserved capacity.
          >> Users able to make upfront payments to reduce their total computing costs even further.
          >> A reserved instance can be moved across availability zones, is specific to one instance type, and lowers the Total Cost of Ownership of a system.
          >> Reserved instances are for long-term use with predictable CPU/memory/IO performance.
     o. EC2 - Spot:
          >> Sudden requirement for large, urgent data computing.
          >> The Spot price you set is the most you are willing to pay, while AWS has a floating market price for the resource. While your price meets the floating price,
             your EC2 instance runs; otherwise it is terminated.
          >> Use cases:
               a) Applications that have flexible start and end times.
               b) Applications that are only feasible at very low compute prices.
               c) Users with urgent computing needs for large amounts of additional capacity.
          >> If the Spot instance is terminated by Amazon EC2, you will not be charged for that partial hour of usage.
             However, if you terminate the instance yourself, you will be charged for any hour in which the instance ran.
     o. EC2 - Dedicated:
          >> For example, government use.
     o. EC2 Type - DR MC GIFT PX:
          >> D - Density
          >> R - RAM
          >> M - main choice for general purpose apps
          >> C - Compute
          >> G - Graphics
          >> I - IOPS
          >> F - FPGA
          >> T - cheap general purpose (think T2 Micro)
          >> P - Graphics (think Pics)
          >> X - Extreme Memory

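
The Spot termination rule in this section (a partial hour is free when AWS terminates the instance, but charged when you terminate it yourself) can be written down as a small function. This is a minimal sketch that assumes the per-hour billing model these notes describe. The function name `spot_billable_hours` and its parameters are made up for the illustration, and actual AWS billing rules may differ from this model.

```python
def spot_billable_hours(runtime_minutes, terminated_by_aws):
    """Billable hours for one Spot instance under the hourly rule in these notes.

    Assumption: usage is billed per started hour, and the final partial hour
    is waived only when AWS (not the user) terminates the instance.
    """
    if runtime_minutes < 0:
        raise ValueError("runtime_minutes must be non-negative")
    full_hours, remainder = divmod(runtime_minutes, 60)
    if remainder == 0:
        return full_hours          # no partial hour to decide about
    if terminated_by_aws:
        return full_hours          # partial hour is free
    return full_hours + 1          # user terminated: partial hour is charged


# 2 hours 30 minutes of runtime in both cases.
print(spot_billable_hours(150, terminated_by_aws=True))
print(spot_billable_hours(150, terminated_by_aws=False))
```

For the exam, the only thing to remember is which side of the `terminated_by_aws` branch is free.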
<9> EBS Volume - block based storage: o. Amazon EBS allows you to create storage volumes and attach them to Amazon EC2 instances. Once attached, you can create a file system on top of these volumes, run a database, or use them in any other way you would use a block device. Amazon EBS volumes are placed in a specific Availability Zone, where they are automatically replicated to protect you from the failure of a single component. All EBS are cross multiple facilities, but only in single AZ. If the AZ falled, the vloume can not be fault tolence. o. EBS Type: >> General Purpose SSD (G P2) - General purpose, balances both price and performance: a) Ratio of 3 lOPS per GB with up to 10,000 lOPS and the ability to burst up to 3000 lOPS for extended perionds of time for volumes under lGib. >> Provisioned IOPS SSD (lOi) - Designed for I/O intensive applications such as large relational or NoSQL databases: a) Use if you need more than 10,000 lOPS. b) Can provision up to 20,000 lOPS per volume. c) 16 TiB as maximum size. Caution: it is TiB [bits, base-2], instead of TB [byte, base-10]. >> Throughput Optimized HDD (ST1): a) Big data. b) Data warehouse. c) Log processing - sequential read/write. d) Can NOT be a boot volume. >> Cold HDD (SC1): a) Lowest Cost Storage for infrequently accessed workloads. b) File Server c) Can NOT be a boot volume >> Magnetic (Standard): a) Lowest cost per gigabyte of all EBS volume types that is bootable. b) Magnetic volumes are ideal for workloads where data is accessed infrequently, and applications where the lowest storage cost is important. o. EB2/EBS Operation: a) Launch an EC2 Instance. b) Security Group Basics. c) Volumes and Snapshots. d) Create an AMI. e) Load Balancers & Health Checks. f) Cloud Watch. g) AWS Command Line h) IAM Roles with EC2 i) Bootstrap Scripts j) Launch Configuration Groups k) Autoscaling 101 l) EFS m) Lambda n) HPS (High Performance Compute) & Placement Groups o. 
EC2 Exam Tips: >> Know the difference between: a) On Demand b) Spot c) Reserved d) Dedicated >> Remember with spot instance: a) If you terminate the instance, you pay for the hour. b) If AWS terminates the spot instance, you get the hour it was terminated in for free. o. EBS Exam Tips: >> EBS Volum Types: a) SSD, General Purporse - GP2 - (Up to 10,000 lOPS) b) SSD, Provisioned lOPS - 101 - (More than 10,000 lOPS) c) HDD, Throughput Optimized - ST1 - frequently accessed workloads d) HDD, Cold - SC1 - less frequently accessed data. e) HDD, Magnetic - Standard - cheap, infrequently accessed storage >> You cannot mount 1 EBS volume to multiple EC2 instances, instead use EFS [This File System could be shared]. >> Encrypttion supportes on all types of EBS volumes. >> Snapshot of encrypted EBS are automatically encrypted. >> Encryotion does not support for all instance types. >> Existing EBS volume can not be encrypted. >> Shared EBS volume can not be encryted. >> Root EBS volumes cannot be encrypted, except using third party tool. >> The EBS volume size, type, and IOPS can be changed when it is attached on an EC2 instance. o. EC2 Family: >> DR MC GIFT PX might be missing in some of the regions. Not all regions have all types of EC2 instance. o. When setup EC2 instance, one subnet can not go across multiple availability zone. o. When setup EC2 instance, add tag on an instance can help you check the where the bill is from which server. o. When setup EC2 instance, in security group, you can indicate server access network protocal and associated port. "My IP" means, if your server's IP matches this, you can login directly. The default value is "Anywhere". o. T2/C4 only can be backed by EC2. C3/M3 are avaiable for SSD backed storage. o. $ yum install httpd -y [install Appache Server] $ yum update -u $ service httpd start $ /var/www/html is the default gateway folder, like /home/amos/public_html $ ssh ec2-user@192.168.52.91 -i /home/amos/ssh_key_pair.pem o. 
EC2 instance monitoring interval 5 minutes is free, but switch to 1 minute will cost bit more. o. Exam Tips: >> Termination Protection is turned off by default, you have to turn it on by yourself. >> On an EBS-backed instance, the default action for the root EBS volume is to be deleted when the instance is terminated. >> EBS Root Volumes of your DEFAULT AMI’s cannot be encrypted. But, You can also use a third party tool (such as bit locker etc) to encrypt the root volume, or this can be done when creating AMI's in the AWS console or using the API. >> Snapshot of data and root volume can be encrypted and attached to an AMI. >> EBS encryption support boot volume data being copied to an EBS snapshot with encryption option. >> Additional volumes can be encrypted. [The volumn that you created can be encrypted] o. Security Group - server virtual firewall >> When creating EC2 instance, in security group tab, you can choose: a) SSH, Port 22, Source: 0.0.0.0/0 ==> IP/Mask means that all the IP can access to this server. >> After EC2 instance up and installed Apache, create one HTML page. If you delete HTTP from security group, the webpage will be fobbiden to review immeidately. a) Any changes that you applied on security group will be in effective immeidately. >> All Inbound Traffic is Blocked By Default. >> All Outbound Traffic is Allowed. >> Changes to Security Groups take effect immediately. >> You can have any number of EC2 instances within a security group. >> You can have multiple security groups attached to EC2 Instances >> Security Groups are STATEFUL. a) If you create an inbound rule allowing traffic in, that traffic is automatically allowed back out again. >> You cannot block specific IP addresses using Security Groups,instead use Network Access Control Lists. >> You can specify allow rules, but not deny rules. o. 
Steps of upgrade volume type: >> AWS console ==> EC2 ==> Volumn ==> Action "Deatach" the target volumn [This action will garantee your file is consistency, even wait until disk write done] ==> make a snapshot of the volumn ==> Restore the snapshot on a upper level storage, such as "Provisioned IO" disk ==> Attach the volumn ==> Mount File System again. o. AWS EBS volume usaually goes with RAID or 10, rarely chose 5. o. To increase disk IO, you can create a RAID array by multiple AWS EBS individual volumes. o. Problem - Take a snapshot, the snapshot excludes data held in the cache by applications and the OS. This tends not to matter on a single volume, however using multiple volumes in a RAID array, this can be a problem due to interdependencies of the array. >> So, how to make a snapshot of a disk array: a) Stop the application from writing to disk. b) Flush all caches to the disk. c) How to archive above 2 pionts? ==> Freeze the file system. ==> Unmount the RAID Array. ==> Shutting down the associated EC2 instance. o. To create a snapshot for Amazon EBS volumes that serve as root devices, you should stop the instance before taking the snapshot. o. Snapshots of encrypted volumes are encrypted automatically.Volumes restored from encrypted snapshots are encrypted automatically. o. You can share snapshots, but only if they are unencrypted. These snapshots can be shared with other AWS accounts or made public. o. When you snapshot the boot volume, you must make a AMI image first, and then be able to launch EC2 instance afterwards. o. AMI Type [based on root device]: >> EBS backed volume ...................... It is a product of AWS created from an EBS volume. >> Instance store backed volume ........... These images could be created by any third party amateur [ AWS console => Launch EC2 => Go to "Community AMI" to choose AMI ] >> All AMIs are categorized as either backed by Amazon EBS or backed by instance store. 
        >> For EBS volumes: The root device for an instance launched from the AMI is an AWS EBS volume created from an Amazon EBS snapshot.
        >> For instance store volumes: The root device for an instance launched from the AMI is an instance store volume created from a template stored in Amazon S3. Since the image has to be fetched from S3, the launch is slower than for an EBS-backed instance.
        >> Sometimes a server needs to be stopped and started again for maintenance. That is why people usually choose EBS-backed instances rather than instance-store-backed ones.
        >> Data on an instance store volume survives a reboot, but is lost when the instance is stopped or terminated, or when the underlying host fails.
        >> An instance store volume usually holds temporary data, such as cache or buffers. Also, for reasons of price and disk I/O, people sometimes choose an instance store AMI.
        >> AMIs can be copied across regions for disaster recovery, and can be bought or sold in the AWS Marketplace.
    o. Main difference between EBS-backed and instance-store-backed instances:
        >> An instance launched from EBS can be stopped.
        >> However, an instance launched from "instance store" can only be rebooted or terminated; it cannot be stopped. Once the instance goes away, all the data is lost.
        >> Previously, all images were stored in instance storage instead of on EBS volumes.
    o. Exam Tips - EBS vs Instance Store:
        >> Instance Store Volumes are sometimes called Ephemeral Storage.
        >> Instance-store-backed instances cannot be stopped. If the underlying host fails, you will lose your data.
        >> EBS-backed instances can be stopped. You will not lose the data on the instance if it is stopped.
        >> An EBS volume lives in only one AZ, and is replicated within that AZ.
        >> You can reboot both; you will not lose your data.
        >> By default, both ROOT volumes will be deleted on termination; however, with EBS volumes, you can tell AWS to keep the root device volume.
    o. Elastic Load Balancer:
        >> Instances monitored by ELB are reported as: InService, or OutOfService.
        >> Health Checks check the instance health by talking to it.
        >> ELBs have their own DNS name. You are never given an IP address.
        >> An AWS load balancer can only be resolved through its DNS (alias) name, not through an IP.
        >> ELB uses the access logs feature to capture all the client connection details.
        >> ELB has a health check feature. If an EC2 instance behind the ELB fails the health check, traffic is no longer routed to that instance.
        >> An ELB works within a single region: it can distribute traffic across multiple Availability Zones in that region, but not across regions.
    o. CloudWatch:
        >> What are the default monitoring metrics: CPU, Disk, Network, and Status. Not RAM.
        >> What is the difference between CloudWatch and CloudTrail: CloudWatch monitors the resources of the IT environment; CloudTrail audits user actions, such as creating a user.
        >> Standard Monitoring = 5 Minutes
        >> Detailed Monitoring = 1 Minute
        >> Dashboards - Create dashboards to see what is happening in your AWS environment.
        >> Alarms - Allow you to set alarms that notify you when particular thresholds are hit.
        >> Events - CloudWatch Events helps you respond to state changes in your AWS resources.
        >> Logs - CloudWatch Logs helps you aggregate, monitor, and store logs. The log agent is a tool installed on the server that sends log data to CloudWatch.
    o. IAM with EC2:
        >> Storing an individual secret access key on an instance is insecure. Instead, you can create an IAM role, which is global and not limited to a region.
        >> Use case: you need to create an application that users can log in to via Facebook, so you grant an IAM role to the EC2 instance without storing credentials on the instance itself.
        >> After creating the IAM role, you can attach it to the EC2 instance. As long as the EC2 instance holds the role (and the role's policy allows it), it can access S3 without any stored credentials.
    o. Querying EC2 instance metadata via HTTP API request:
        >> Log in to the EC2 instance.
        >> curl http://169.254.169.254/latest/meta-data
        >> curl http://169.254.169.254/latest/meta-data/public-ipv4
    o. EC2 instance auto scaling:
        >> You need to go to AWS console ==> EC2 ==> Auto Scaling Groups ==> first create a "Launch Configuration" (where the EC2 instance type is specified) ==> then create the "Auto Scaling Group".
        >> An auto scaling group can contain EC2 instances across multiple AZs, but only within one region.
        >> Auto scaling termination policy:
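           A rough way to picture what a termination policy decides is the sketch below. It assumes the default policy works in simplified form as: pick the AZ with the most instances, then the instance with the oldest launch configuration. The `Instance` class, the `launch_config_age_days` field and the tie-breaking rules are hypothetical and only for illustration; the real service also considers scale-in protection and billing time, and this is not an AWS API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    instance_id: str
    az: str
    launch_config_age_days: int  # hypothetical field: larger means older launch configuration


def pick_instance_to_terminate(instances):
    """Simplified model: busiest AZ first, then oldest launch configuration."""
    if not instances:
        raise ValueError("no instances to choose from")
    per_az = Counter(i.az for i in instances)
    # Step 1: the AZ holding the most instances (ties broken by AZ name, an assumption).
    busiest_az = min(per_az, key=lambda az: (-per_az[az], az))
    candidates = [i for i in instances if i.az == busiest_az]
    # Step 2: the oldest launch configuration (ties broken by instance id, an assumption).
    return min(candidates, key=lambda i: (-i.launch_config_age_days, i.instance_id))


fleet = [
    Instance("i-aaa", "us-east-1a", 30),
    Instance("i-bbb", "us-east-1a", 90),
    Instance("i-ccc", "us-east-1b", 120),
]
victim = pick_instance_to_terminate(fleet)
print(victim.instance_id)
```

           In this toy fleet, us-east-1a holds two instances, so the choice is made inside that AZ even though the instance in us-east-1b has the oldest configuration overall. This keeps the group balanced across AZs.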
    o. EC2 Placement Group:
        >> A placement group is a logical grouping of instances within a single AZ. Using placement groups enables applications to participate in a low latency, 10 Gbps network. Placement groups are recommended for applications that benefit from low network latency, high network throughput, or both.
        >> Basically, a placement group is for grid-computing, cluster-based IT environments that require a low latency network, such as Oracle RAC.
    o. EC2 Placement Group Exam Tips:
        >> A placement group can’t span multiple Availability Zones.
        >> The name you specify for a placement group must be unique within your AWS account.
        >> Only certain types of instances can be launched in a placement group (Compute Optimized, GPU, Memory Optimized, Storage Optimized).
        >> AWS recommends homogeneous instances within placement groups, i.e. all instances share the same family and type, which helps reduce network latency.
        >> You can’t merge placement groups.
        >> You can’t move an existing instance into a placement group.
        >> You can create an AMI from your existing instance, and then launch a new instance from the AMI into a placement group.
    o. EC2 Lambda:
        >> An event-driven function. For example, if 5 users each send an HTTP request to API Gateway, 5 Lambda functions will be triggered.
        >> What is Lambda? It sits on top of the following stack of layers:
            ==> Data Centres
            ==> Hardware
            ==> Assembly Code/Protocols
            ==> High Level Languages
            ==> Operating Systems
            ==> Application Layer/AWS APIs
            ==> AWS Lambda
        >> AWS Lambda is a compute service where you can upload your code and create a Lambda function. AWS Lambda takes care of provisioning and managing the servers that you use to run the code. You don’t have to worry about operating systems, patching, scaling, etc. You can use Lambda in the following ways:
            ==> As an event-driven compute service where AWS Lambda runs your code in response to events.
            ==> These events could be changes to data in an Amazon S3 bucket or an Amazon DynamoDB table.
            ==> As a compute service to run your code in response to HTTP requests using Amazon API Gateway or API calls made using AWS SDKs.
        >> Programming languages that Lambda supports:
            ==> Node.js
            ==> Python
            ==> Java
            ==> C#
        >> AWS Lambda pricing:
            ==> Number of requests: First 1 million requests are free. $0.20 per 1 million requests thereafter.
            ==> Duration: Duration is calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 100 ms. The price depends on the amount of memory you allocate to your function. You are charged $0.00001667 for every GB-second used.
        >> Caution: A Lambda function can run for at most 5 minutes.
        >> Exam Tips - EC2 Lambda:
            ==> Lambda scales out (not up) automatically.
            ==> Lambda functions are independent, 1 event = 1 function.
            ==> Lambda is serverless.
            ==> Know what AWS services are serverless: API Gateway, DynamoDB ...
            ==> Lambda functions can trigger other Lambda functions; 1 event can = x functions if functions trigger other functions.
            ==> Architectures can get extremely complicated; AWS X-Ray allows you to debug what is happening.
            ==> Lambda can do things globally; you can use it to back up S3 buckets to other S3 buckets etc.
            ==> Know your triggers.
            ==> Maximum Lambda function execution time is 300 seconds.
    o. Elastic File System:
        >> It works like a shared drive for a cluster.
    o. Serverless:
        >> Similar to a serverless SMTP monitoring script on the Google platform.
        >> General HTTP request router:
            ==> User browser
            ==> HTTP request for an .html file
            ==> In the HTML file, there is a JavaScript function xhttp(GET, "AWS-API-Gateway-URL")
            ==> In the AWS console, you need to go to "API-Gateway" to create a trigger based on a Python function script
            ==> When the JavaScript request arrives at API Gateway, it triggers the Python script to perform the function that was set up.

<10> Route 53:
    o. You can use both IPv4 and IPv6 with AWS.
    o. An Elastic Load Balancer does not expose a fixed IPv4 or IPv6 address, only a DNS name.
    o. A domain name such as emeralit.com is a naked domain: it has no prefix such as www, mail, sale, etc.
    o. Top Level Domains: These top level domain names are controlled by the Internet Assigned Numbers Authority (IANA) in a root zone database, which is essentially a database of all available top level domains.
    o. Domain Registrars: Because all of the names in a given domain have to be unique, there needs to be a way to organize this all so that domain names aren’t duplicated. This is where domain registrars come in.
    o. A registrar is an authority that can assign domain names directly under one or more top-level domains. These domains are registered with InterNIC, a service of ICANN, which enforces uniqueness of domain names across the Internet. Each domain name becomes registered in a central database known as the WHOIS database. Popular domain registrars include GoDaddy.com, 123-reg.co.uk etc.
    o. SOA Records - The SOA record stores information about:
        >> The name of the server that supplied the data for the zone.
        >> The administrator of the zone.
        >> The current version of the data file.
        >> The number of seconds a secondary name server should wait before checking for updates.
        >> The number of seconds a secondary name server should wait before retrying a failed zone transfer.
        >> The maximum number of seconds that a secondary name server can use data before it must either be refreshed or expire.
        >> The default number of seconds for the time-to-live (TTL) value on resource records.
    o. NS Records: NS stands for Name Server. These records are used by Top Level Domain servers to direct traffic to the Content DNS server which contains the authoritative DNS records.
    o. A Records: An “A” record is the fundamental type of DNS record and the “A” in A record stands for “Address”. The A record is used by a computer to translate the name of the domain to the IP address. For example http://www.acloud.guru might point to http://123.10.10.80.
    o. TTL: The length of time that a DNS record is cached on either the Resolving Server or the user's own local PC is equal to the value of the “Time To Live” (TTL) in seconds. The lower the time to live, the faster changes to DNS records propagate throughout the internet.
    o. CNAMES: A Canonical Name (CName) can be used to resolve one domain name to another. For example, you may have a mobile website with the domain name http://m.acloud.guru that is used when users browse to your domain name on their mobile devices. You may also want the name http://mobile.acloud.guru to resolve to this same address.
    o. Alias Records: Alias records are used to map resource record sets in your hosted zone to Elastic Load Balancers, CloudFront distributions, or S3 buckets that are configured as websites. The term "Alias record" only exists within AWS. An AWS load balancer can only be resolved by its domain name, NOT by IP. Alias records work like a CNAME record in that you can map one DNS name (www.example.com) to another ‘target’ DNS name (elb1234.elb.amazonaws.com). Key difference - a CNAME can’t be used for naked domain names (zone apex). You cannot have a CNAME for http://acloud.guru; it must be either an A record or an Alias.
    o. Route 53 - Exam Tips:
        >> ELBs do not have pre-defined IPv4 addresses; you resolve to them using a DNS name.
        >> Understand the difference between an Alias Record and a CNAME.
        >> Given the choice, always choose an Alias Record over a CNAME.
        >> Route 53 is a global service, not specific to any region.
    o. Route53 Routing Policies:
        >> Simple:
        >> Weighted:
            ==> This policy splits your traffic based on the different weights assigned. For example, you can set 10% of your traffic to go to US-EAST-1 and 90% to go to EU-WEST-1.
            ==> Use case: 80% of the business customers are from California, and 20% are from New York.
            ==> Use case: You are trying to fix a feature in the production application, and have already applied the enhancement on the dev server.
                Then, you can redirect some of the connections to the dev server for testing purposes.
        >> Latency:
            ==> Latency based routing allows you to route your traffic based on the lowest network latency for your end user (pick the region that gives the fastest response time).
            ==> To use latency-based routing you create a latency resource record set for the Amazon EC2 (or ELB) resource in each region that hosts your website. When Amazon Route 53 receives a query for your site, it selects the latency resource record set for the region that gives the user the lowest latency. Route 53 then responds with the value associated with that resource record set.
        >> Failover:
            ==> Failover routing policies are used when you want to create an active/passive setup. For example you may want your primary site to be in EU-WEST-2 and your secondary DR site in AP-SOUTHEAST-2. Route53 will monitor the health of your primary site using a health check. A health check monitors the health of your end points.
        >> Geolocation:
            ==> Geolocation routing lets you choose where your traffic will be sent based on the geographic location of your users (i.e. the location from which DNS queries originate). For example, you might want all queries from Europe to be routed to a fleet of EC2 instances that are specifically configured for your European customers. These servers may have the local language of your European customers and display all prices in Euros.
            ==> Use case: When a user from Sweden visits your web page, you can redirect her to a website in Swedish according to her geolocation.
    o. Exam Tips - Route53:
        >> ELBs do not have pre-defined IPv4 addresses; you resolve to them using a DNS name.
        >> Understand the difference between an Alias Record and a CNAME.
        >> Given the choice, always choose an Alias Record over a CNAME.

<11> AWS Database Service:
    o. AWS Database Types:
        >> RDS - OLTP:
            ==> SQL Server
            ==> MySQL
            ==> PostgreSQL
            ==> Oracle
            ==> Aurora
            ==> MariaDB
        >> DynamoDB - NoSQL:
        >> Redshift - OLAP:
        >> ElastiCache - In Memory Caching:
            ==> Memcached
            ==> Redis
        >> DMS:
    o. Non Relational Databases:
        >> Database:
            ==> Collection ................ Table
            ==> Document .................. Row
            ==> Key Value Pairs ........... Fields
    o. JSON/NoSQL:
        {
            "_id": "3d8kd90dmw382dz",
            "firstname": "Amos",
            "lastname": "Geng",
            "address": [
                {
                    "street": "20182 Oracle Dr",
                    "city": "Reston"
                }
            ]
        }
    o. Data Warehousing:
        >> Used for business intelligence. Tools like Cognos, Jaspersoft, SQL Server Reporting Services, Oracle Hyperion, SAP NetWeaver. Used to pull in very large and complex data sets. Usually used by management to do queries on data (such as current performance vs targets etc).
    o. OLTP & OLAP:
    o. ElastiCache:
        >> ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud. The service improves the performance of web applications by allowing you to retrieve information from fast, managed, in-memory caches, instead of relying entirely on slower disk-based databases.
        >> ElastiCache supports two open-source in-memory caching engines:
            ==> Memcached
            ==> Redis
        >> ElastiCache can be used to store transient data, such as user sessions or any temporary objects. DynamoDB can be used for the same purpose, but saves the data persistently.
        >> ElastiCache is comparable to the WebCache product that accompanies Oracle WebLogic.
    o. DMS - Database Migration Service:
        >> Announced at re:Invent 2015, DMS stands for Database Migration Service. It allows you to migrate your production database to AWS. Once the migration has started, AWS manages all the complexities of the migration process like data type transformation, compression, and parallel transfer (for faster data transfer) while ensuring that data changes to the source database that occur during the migration process are automatically replicated to the target.
           The AWS Schema Conversion Tool automatically converts the source database schema and a majority of the custom code, including views, stored procedures, and functions, to a format compatible with the target database.
    o. AWS database backup types:
        >> Automated Backups.
        >> Snapshot Backups.
    o. Automated Backups:
        >> There are two different types of backups for RDS on AWS: Automated Backups and Database Snapshots. Automated Backups allow you to recover your database to any point in time within a “retention period”. The retention period can be between one and 35 days. Automated Backups take a full daily snapshot and also store transaction logs throughout the day. When you do a recovery, AWS first chooses the most recent daily backup, and then applies the transaction logs relevant to that day. This allows you to do a point-in-time recovery down to a second, within the retention period.
        >> Automated Backups are enabled by default.
        >> The backup data is stored in S3 and you get free storage space equal to the size of your database. So if you have an RDS instance of 10 GB you will get 10 GB worth of backup storage. Backups are taken within a defined window. During the backup window, storage I/O may be suspended while data is being backed up, and you may see elevated latency.
    o. DB Snapshots are done manually (i.e. they are user initiated). They are stored even after you delete the original RDS instance, unlike automated backups.
    o. Whenever you restore either an Automated Backup or a manual Snapshot, the restored version of the database will be a new RDS instance with a new end point.
    o. Encryption:
        >> Database encryption is supported for MySQL, Oracle, SQL Server, PostgreSQL & MariaDB. Encryption is done using the AWS Key Management Service (KMS). Once your RDS instance is encrypted, the data stored at rest in the underlying storage is encrypted, as are its automated backups, read replicas, and snapshots.
        >> At the present time, encrypting an existing DB Instance is not supported.
           To use Amazon RDS encryption for an existing database, create a new DB Instance with encryption enabled and migrate your data into it.
    o. AWS database Multi-AZ:
        >> Multi-AZ allows you to have an exact copy of your production database in another Availability Zone. AWS handles the replication for you, so when your production database is written to, the write is automatically synchronised to the standby database. In the event of planned database maintenance, DB Instance failure, or an Availability Zone failure, Amazon RDS will automatically fail over to the standby so that database operations can resume quickly without administrative intervention.
        >> Multi-AZ is for Disaster Recovery only. It is not primarily used for improving performance. For performance improvement you need Read Replicas.
    o. The AWS Multi-AZ feature supports the following databases:
        >> SQL Server
        >> Oracle
        >> MySQL
        >> PostgreSQL
        >> MariaDB
    o. Read Replica:
        >> Read replicas allow you to have a read-only copy of your production database. This is achieved by using asynchronous replication from the primary RDS instance to the read replica. You use read replicas primarily for very read-heavy database workloads. The feature is similar to Oracle Active Data Guard.
        >> Read replicas are supported for: MySQL, PostgreSQL, MariaDB.
        >> Caution:
            ==> Used for Scaling! Not for DR!
            ==> Must have automatic backups turned on in order to deploy a read replica.
            ==> You can have up to 5 read replica copies of any database.
            ==> You can have read replicas of read replicas (but watch out for latency).
            ==> Each read replica will have its own DNS end point.
            ==> You cannot have Read Replicas that have Multi-AZ.
            ==> You can create Read Replicas of Multi-AZ source databases.
            ==> Read Replicas can be promoted to be their own databases. This breaks the replication.
            ==> Read Replicas in a second region are available for MySQL and MariaDB, not for PostgreSQL.
    o. DynamoDB:
        >> Amazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed database and supports both document and key value data models. Its flexible data model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT, and many other applications.
        >> It is a NoSQL database. Compared with a relational DB, the difference is that you can add any columns on the fly, without a schema check and without declaring them NULL or NOT NULL. All operations can be performed through the GUI without any SQL commands.
        >> DynamoDB offers “push button” scaling, meaning that you can scale your database on the fly, without any down time. RDS is not so easy and you usually have to use a bigger instance size or add a read replica.
        >> Stored on SSD storage. Spread across 3 geographically distinct data centres.
        >> DynamoDB is fully managed by AWS, to a greater degree than RDS.
        >> Eventually Consistent Reads (Default):
            ==> Consistency across all copies of data is usually reached within a second.
            ==> Repeating a read a short time after a record was written should return the updated data. (Best Read Performance).
        >> Strongly Consistent Reads:
            ==> A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read.
        >> DynamoDB Price - Provisioned Throughput Capacity:
            ==> Write Throughput $0.0065 per hour for every 10 units.
            ==> Read Throughput $0.0065 per hour for every 50 units.
            ==> Expensive for writes, and cheap for reads. If your application does not need many joins, DynamoDB would be a good choice.
        >> DynamoDB Pricing Example:
            ==> Let’s assume that your application needs to perform 1 million writes and 1 million reads per day, while storing 3 GB of data. First, you need to calculate how many writes and reads per second you need.
                1 million evenly spread writes per day is equivalent to 1,000,000 (writes) / 24 (hours) / 60 (minutes) / 60 (seconds) = 11.6 writes per second. A DynamoDB Write Capacity Unit can handle 1 write per second, so you need 12 Write Capacity Units. Similarly, to handle 1 million strongly consistent reads per day, you need 12 Read Capacity Units. With Read Capacity Units, you are billed in blocks of 50; with Write Capacity Units you are billed in blocks of 10.
                Write Capacity Units cost per day = (0.0065/10) x 12 x 24 = $0.1872
                Read Capacity Units cost per day  = (0.0065/50) x 12 x 24 = $0.0374
    o. Redshift:
        >> Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. Customers can start small for just $0.25 per hour with no commitments or upfront costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of most other data warehousing solutions.
        >> Redshift Configuration:
            ==> Single Node (160 GB).
            ==> Multi-Node.
            ==> Leader Node (manages client connections and receives queries).
            ==> Compute Node (stores data and performs queries and computations). Up to 128 Compute Nodes.
            ==> Block size for columnar storage is 1024 KB.
        >> Columnar Data Storage:
            ==> Instead of storing data as a series of rows, Amazon Redshift organizes the data by column. Unlike row-based systems, which are ideal for transaction processing, column-based systems are ideal for data warehousing and analytics, where queries often involve aggregates performed over large data sets. Since only the columns involved in the queries are processed and columnar data is stored sequentially on the storage media, column-based systems require far fewer I/Os, greatly improving query performance.
        >> Advanced Compression:
            ==> Columnar data stores can be compressed much more than row-based data stores because similar data is stored sequentially on disk.
                Amazon Redshift employs multiple compression techniques and can often achieve significant compression relative to traditional relational data stores. In addition, Amazon Redshift doesn’t require indexes or materialized views and so uses less space than traditional relational database systems. When loading data into an empty table, Amazon Redshift automatically samples your data and selects the most appropriate compression scheme.
        >> Massively Parallel Processing (MPP):
            ==> Amazon Redshift automatically distributes data and query load across all nodes. Amazon Redshift makes it easy to add nodes to your data warehouse and enables you to maintain fast query performance as your data warehouse grows.
        >> Redshift pricing:
            ==> Compute Node Hours (total number of hours you run across all your compute nodes for the billing period. You are billed for 1 unit per node per hour, so a 3-node data warehouse cluster running persistently for an entire month would incur 2,160 instance hours. You will not be charged for leader node hours; only compute nodes will incur charges.)
            ==> Backup.
            ==> Data transfer (only within a VPC, not outside it).
        >> Redshift Security:
            ==> Encrypted in transit using SSL.
            ==> Encrypted at rest using AES-256 encryption.
            ==> By default Redshift takes care of key management.
            ==> Manage your own keys through HSM.
            ==> AWS Key Management Service.
        >> Redshift Availability:
            ==> Currently only available in 1 AZ.
            ==> Can restore snapshots to new AZs in the event of an outage.
    o. ElastiCache:
        >> ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud. The service improves the performance of web applications by allowing you to retrieve information from fast, managed, in-memory caches, instead of relying entirely on slower disk-based databases.
        >> It is similar to the SGA/PGA memory areas of an Oracle database.
        >> Amazon ElastiCache can be used to significantly improve latency and throughput for many read-heavy application workloads (such as social networking, gaming, media sharing and Q&A portals) or compute intensive workloads (such as a recommendation engine). Caching improves application performance by storing critical pieces of data in memory for low-latency access.
        >> Cached information may include the results of I/O-intensive database queries or the results of computationally-intensive calculations.
        >> ElastiCache Types:
            ==> Memcached: A widely adopted memory object caching system. ElastiCache is protocol compliant with Memcached, so popular tools that you use today with existing Memcached environments will work seamlessly with the service.
            ==> Redis....: A popular open-source in-memory key-value store that supports data structures such as sorted sets and lists. ElastiCache supports Master / Slave replication and Multi-AZ, which can be used to achieve cross-AZ redundancy.
        >> Exam Tips - ElastiCache: Typically you will be given a scenario where a particular database is under a lot of stress/load. You may be asked which service you should use to alleviate this.
            ==> ElastiCache is a good choice if your database is particularly read heavy and not prone to frequent changes.
            ==> Redshift is a good answer if the reason your database is feeling stress is that management keeps running OLAP transactions on it etc.
    o. Aurora:
        >> Amazon Aurora, a home-grown AWS product, is a MySQL-compatible relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases.
        >> Amazon Aurora provides up to five times better performance than MySQL at a price point one tenth that of a commercial database while delivering similar performance and availability.
        >> Scaling:
            ==> Starts with 10 GB and scales in 10 GB increments up to 64 TB (Storage Autoscaling).
            ==> Compute resources can scale up to 32 vCPUs and 244 GB of memory.
            ==> 2 copies of your data are kept in each availability zone, with a minimum of 3 availability zones, i.e. 6 copies of your data.
            ==> Aurora is designed to transparently handle the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability.
            ==> Aurora storage is also self-healing. Data blocks and disks are continuously scanned for errors and repaired automatically.
        >> Replica:
            ==> 2 types of replicas are available:
                . Aurora Replicas (currently 15) .......... can fail over automatically
                . MySQL Read Replicas (currently 5) ....... can NOT fail over automatically
    o. RDS - Exam Tips:
        >> Your VPC must span at least 2 AZs in the same region, with one subnet to hold the RDS instance.

<12> VPC:
    o. Amazon Virtual Private Cloud (Amazon VPC) lets you provision a logically isolated section of the Amazon Web Services (AWS) Cloud where you can launch AWS resources in a virtual network that you define. You have complete control over your virtual networking environment, including selection of your own IP address range, creation of subnets, and configuration of route tables and network gateways. You can easily customize the network configuration for your Amazon Virtual Private Cloud. For example, you can create a public-facing subnet for your webservers that has access to the Internet, and place your backend systems such as databases or application servers in a private-facing subnet with no Internet access. You can leverage multiple layers of security, including security groups and network access control lists, to help control access to Amazon EC2 instances in each subnet. You can create a Hardware Virtual Private Network (VPN) connection between your corporate datacenter and your VPC and leverage the AWS cloud as an extension of your corporate datacenter.
    o. Public Subnet .... EC2 instances within a public subnet must have a public/Elastic IP to be reachable from the internet ...... [ Web server, or Jump Host ]
       Private Subnet ... EC2 instances within a private subnet are not reachable from the internet ............................ [ Database server, or Application server ]
    o. Difference between public subnet and private subnet:
        >> The main difference is the route for 0.0.0.0/0 in the associated route table. A private subnet sets that route to a NAT instance. Private subnet instances only need a private IP, and internet traffic is routed through the NAT in the public subnet. You could also have no route to 0.0.0.0/0 to make it a truly private subnet with no internet access in or out.
        >> A public subnet routes 0.0.0.0/0 through an Internet Gateway (igw). Instances in a public subnet require public IPs to talk to the internet. The warning appears even for private subnets, but the instance is only accessible inside your VPC.
    o. A subnet is set up within exactly one AZ; it cannot span multiple AZs. Security groups, route tables and Network ACLs can span multiple AZs. Each VPC can only have one Internet gateway.
    o. What can you do with a VPC?
        >> Launch instances into a subnet of your choosing.
        >> Assign custom IP address ranges in each subnet.
        >> Configure route tables between subnets.
        >> Create an internet gateway and attach it to your VPC.
        >> Much better security control over your AWS resources.
        >> Instance security groups.
        >> Subnet network access control lists (ACLs).
    o. Default VPC & Custom VPC:
        >> The default VPC is user friendly, allowing you to immediately deploy instances.
        >> All subnets in the default VPC have a route out to the Internet.
        >> Each EC2 instance has both a public and a private IP address.
        >> If you delete the default VPC the only way to get it back is to contact AWS.
    o. VPC Peering:
        >> Allows you to connect one VPC with another via a direct network route using private IP addresses.
        >> Instances behave as if they were on the same private network.
        >> You can peer VPCs with other AWS accounts as well as with other VPCs in the same account.
        >> Peering is in a star configuration, i.e. 1 central VPC peers with 4 others.
        >> Use case: you need to connect the Production environment with the Test/Dev environment.
        >> NO TRANSITIVE PEERING!!!
    o. Exam Tips - VPC:
        >> Think of a VPC as a logical datacenter in AWS.
        >> Consists of IGWs (or Virtual Private Gateways), Route Tables, Network Access Control Lists, Subnets, Security Groups.
        >> 1 Subnet = 1 Availability Zone.
        >> By default, you can create a maximum of 5 VPCs per region. Beyond that, you need to submit a request to increase the limit.
        >> Security Groups are Stateful, Network Access Control Lists are Stateless.
        >> NO TRANSITIVE PEERING.
        >> By default, all subnets within the same VPC can communicate with each other, no matter whether they are public or private.
    o. NAT instance:
        >> A NAT instance is just like a traditional physical server acting as a subnet gateway.
        >> You can search the Community AMIs for an AWS NAT instance image, and create an EC2 instance from it to act as the gateway.
        >> You always have to disable the source/destination check on a NAT instance.
    o. NAT Gateway:
        >> NAT Gateway is a feature provided by AWS VPC, which logically functions the same as a physical gateway, but is virtual and managed by AWS.
        >> Best practice is that when you create a VPC, you always create at least one public subnet and one private subnet within it, and place the NAT Gateway in the public subnet, because you do not want your private subnet exposed to the public Internet.
        >> If you use the NAT Gateway feature, you do not need to maintain a NAT instance yourself, such as applying security patches on a periodic basis.
    o. Exam Tips - NAT instances:
        >> When creating a NAT instance, disable the Source/Destination Check on the instance.
        >> A NAT instance must be in a public subnet.
        >> There must be a route out of the private subnet to the NAT instance, in order for this to work.
        >> Must have an Elastic IP [similar to a static IP] to work.
        >> The amount of traffic that a NAT instance supports depends on the instance size. If you are bottlenecking, increase the instance size.
        >> You can create high availability using Auto Scaling groups, multiple subnets in different AZs and a script to automate failover.
        >> Sits behind a Security Group.

     o. Exam Tips - NAT Gateways:
        >> Very new, may not be in the exams yet.
        >> Preferred by the enterprise. No extra maintenance, unlike a NAT instance.
        >> Scales automatically up to 10 Gbps.
        >> No need to patch.
        >> Not associated with security groups.
        >> Automatically assigned a public IP address (the Elastic IP you select when creating it).
        >> Remember to update your route tables.
        >> No need to disable Source/Destination Checks.
        >> An Elastic IP incurs charges when it is allocated but attached to a stopped instance, or not associated with any instance at all. The charge exists to discourage wasting IPs.

     o. Access Control List [ACL]:
        >> A network ACL is like a firewall for a subnet.
        >> One subnet can have only one ACL.

     o. Comparison of ACL and Security Group:

     o. Blocking certain IPs via ACL by setting up rules at different levels. The lower the rule number, the higher the rule priority:

     o. Exam Tips - ACL:
        >> Your VPC automatically comes with a default network ACL, and by default it allows all outbound and inbound traffic.
        >> You can create a custom network ACL. By default, each custom network ACL denies all inbound and outbound traffic until you add rules.
        >> Each subnet in your VPC must be associated with a network ACL. If you don’t explicitly associate a subnet with a network ACL, the subnet is automatically associated with the default network ACL.
        >> You can associate a network ACL with multiple subnets; however, a subnet can be associated with only one network ACL at a time. When you associate a network ACL with a subnet, the previous association is removed.
        >> A network ACL contains a numbered list of rules that is evaluated in order, starting with the lowest numbered rule.
        >> A network ACL has separate inbound and outbound rules, and each rule can either allow or deny traffic.
        >> Network ACLs are stateless; responses to allowed inbound traffic are subject to the rules for outbound traffic (and vice versa).
        >> Block IP addresses using network ACLs, not Security Groups.

     o. Bastion:
        >> A bastion is just a "jump host"; it can be accessed by SSH/PuTTY or Remote Desktop/RDP.
        >> The case is exactly the same as the HSUS or HSUK access method.

     o. Exam Tips - NAT vs Bastions:
        >> A NAT is used to provide Internet traffic to EC2 instances in private subnets.
        >> A Bastion is used to securely administer EC2 instances (using SSH or RDP) in private subnets.

     o. Exam Tips - Resilient Architecture:
        >> If you want resiliency, always have 2 public subnets and 2 private subnets. Make sure the two subnets of each kind are in different availability zones.
        >> With ELBs, make sure they are in 2 public subnets in 2 different availability zones.
        >> With Bastion hosts, put them behind an Auto Scaling group with a minimum size of 2. Use Route53 (either round robin or using a health check) to automatically fail over.
        >> NAT instances are tricky to make resilient. You need 1 in each public subnet, each with its own public IP address, and you need to write a script to fail over between the two. Where possible, use NAT gateways instead.

     o. Exam Tips - VPC Flow Logs:
        >> You can monitor network traffic within your custom VPCs using VPC Flow Logs.


<13> Application:

     o. SQS - Simple Queue Service:
        >> Amazon SQS is a web service that gives you access to a message queue that can be used to store messages while waiting for a computer to process them. Amazon SQS is a distributed queue system that enables web service applications to quickly and reliably queue messages that one component in the application generates to be consumed by another component. A queue is a temporary repository for messages that are awaiting processing.
        >> A message queue is for communication between different applications.
        >> Using Amazon SQS, you can decouple the components of an application so they run independently, with Amazon SQS easing message management between components. Any component of a distributed application can store messages in a fail-safe queue. Messages can contain up to 256 KB of text in any format, and each message is guaranteed to be delivered at least once. SQS message retention ranges from 1 minute to 14 days; the default is 4 days. A standard queue can have up to 120,000 in-flight messages (received by a consumer but not yet deleted). A delete call is needed once a message has been processed. Any component can later retrieve the messages programmatically using the Amazon SQS API.

     o. Message queue type:
        |
        |__ o. Standard queue [Default]:
        |       . Amazon SQS offers standard as the default queue type.
        |       . A standard queue lets you have a nearly-unlimited number of transactions per second.
        |       . Standard queues guarantee that a message is delivered at least once.
        |         However, occasionally (because of the highly-distributed architecture that allows high throughput), more than one copy of a message might be delivered, or messages might arrive out of order.
        |         Standard queues provide best-effort ordering, which means that messages are generally delivered in the same order as they are sent.
        |
        |__ o. FIFO queue [First-in-First-out]:
                . The FIFO queue complements the standard queue.
                . The most important features of this queue type are FIFO (first-in-first-out) delivery and exactly-once processing.
                . The order in which messages are sent and received is strictly preserved, and a message is delivered once and remains available until a consumer processes and deletes it; duplicates are not introduced into the queue.
                . FIFO queues also support message groups that allow multiple ordered message groups within a single queue.
                . FIFO queues are limited to 300 transactions per second (TPS), but have all the capabilities of standard queues.

     o. SQS - Key Facts:
        >> SQS is pull-based, not push-based.
        >> Messages are up to 256 KB in size.
        >> Messages can be kept in the queue from 1 minute to 14 days. The default is 4 days.
        >> Visibility Timeout is the amount of time that the message is invisible in the SQS queue after a reader picks up that message.
        >> Provided the job is processed before the visibility timeout expires, the reader then deletes the message from the queue. If the job is not processed within that time, the message will become visible again and another reader will process it. This could result in the same message being delivered twice.
        >> The maximum visibility timeout is 12 hours.
        >> SQS guarantees that your messages will be processed at least once.
        >> Amazon SQS long polling is a way to retrieve messages from your SQS queues. Regular short polling returns immediately, even if the message queue being polled is empty. Long polling does not return a response until a message arrives in the message queue, or the long poll times out.
        >> By default, SQS is configured for short polling, which means the queue is polled every so often for new messages.
        >> Long polling reduces the number of empty responses, so fewer requests are needed to retrieve the same messages.
        >> To reduce the number of polling cycles, enable long polling so that each request waits longer. This is done by setting ReceiveMessageWaitTimeSeconds > 0.

     o. SWF - Simple Work Flow:
        >> Amazon Simple Workflow Service (Amazon SWF) is a web service that makes it easy to coordinate work across distributed application components. Amazon SWF enables applications for a range of use cases, including media processing, web application back-ends, business process workflows, and analytics pipelines, to be designed as a coordination of tasks. Tasks represent invocations of various processing steps in an application which can be performed by executable code, web service calls, human actions, and scripts.
        >> Key hint: if a human being is involved in the process, think SWF.

     o. SWF - Actors:
        >> Starter - e.g. the Amazon shopping website, where a customer places an order to buy a pair of shoes.
        >> Decider - A program that decides what the next step in the whole order process is.
        >> Worker - could be a human worker in the physical warehouse, or a script.

     o. SWF - Workers:
        >> Workers are programs that interact with Amazon SWF to get tasks, process received tasks, and return the results.

     o. SWF - Decider:
        >> The decider is a program that controls the coordination of tasks, for example their ordering, concurrency, and scheduling according to the application logic.

     o. SWF - Workers & Decider:
        >> The workers and the decider can run on cloud infrastructure, such as Amazon EC2, or on machines behind firewalls. Amazon SWF brokers the interactions between workers and the decider. It allows the decider to get consistent views into the progress of tasks and to initiate new tasks in an ongoing manner. At the same time, Amazon SWF stores tasks, assigns them to workers when they are ready, and monitors their progress. It ensures that a task is assigned only once and is never duplicated. Since Amazon SWF maintains the application’s state durably, workers and deciders don’t have to keep track of execution state. They can run independently, and scale quickly.

     o. SWF - Starters:
        >> An application that can initiate (start) a workflow. Could be your e-commerce website when placing an order or a mobile app searching for bus times.

     o. SWF - Domain:
        >> Your workflow and activity types and the workflow execution itself are all scoped to a domain. Domains isolate a set of types, executions, and task lists from others within the same account. You can register a domain by using the AWS Management Console or by using the RegisterDomain action in the Amazon SWF API.
        >> The parameters are specified in JavaScript Object Notation (JSON) format:
           https://swf.us-east-1.amazonaws.com
           RegisterDomain
           {
              "name": "867530901",
              "description": "music",
              "workflowExecutionRetentionPeriodInDays": "60"
           }

     o. SWF - Life cycle:
        >> A workflow execution can last up to 1 year, and the value is always measured in seconds.

     o. SWF vs SQS:
        >> Amazon SWF presents a task-oriented API, whereas Amazon SQS offers a message-oriented API.
        >> Amazon SWF ensures that a task is assigned only once and is never duplicated. With Amazon SQS, you need to handle duplicated messages and may also need to ensure that a message is processed only once.
        >> Amazon SWF keeps track of all the tasks and events in an application. With Amazon SQS, you need to implement your own application-level tracking, especially if your application uses multiple queues.

     o. SNS - Simple Notification Service:
        >> SNS is a web service that makes it easy to set up, operate, and send notifications from the cloud. It provides developers with a highly scalable, flexible, and cost-effective capability to publish messages from an application and immediately deliver them to subscribers or other applications.
        >> Amazon SNS follows the “publish-subscribe” (pub-sub) messaging paradigm, with notifications being delivered to clients using a “push” mechanism that eliminates the need to periodically check or “poll” for new information and updates. With simple APIs requiring minimal up-front development effort, no maintenance or management overhead and pay-as-you-go pricing, Amazon SNS gives developers an easy mechanism to incorporate a powerful notification system with their applications.
        >> Push notifications to Apple, Google, Fire OS, and Windows devices, as well as Android devices in China with Baidu Cloud Push.
        >> Besides pushing cloud notifications directly to mobile devices, Amazon SNS can also deliver notifications by SMS text message or email, to Amazon Simple Queue Service queues, or to any HTTP endpoint. To prevent messages from being lost, all messages published to Amazon SNS are stored redundantly across multiple availability zones.
        >> "Push" means the service sends the message to you without being asked, e.g. the news pop-ups pushed by the QQ desktop client.
        >> SNS allows you to group multiple recipients using topics.
           A topic is an “access point” for allowing recipients to dynamically subscribe for identical copies of the same notification. One topic can support deliveries to multiple endpoint types -- for example, you can group together iOS, Android and SMS recipients. When you publish once to a topic, SNS delivers appropriately formatted copies of your message to each subscriber.
        >> Instantaneous, push-based delivery (no polling).
        >> Simple APIs and easy integration with applications.
        >> Flexible message delivery over multiple transport protocols.
        >> Inexpensive, pay-as-you-go model with no up-front costs.
        >> Web-based AWS Management Console offers the simplicity of a point-and-click interface.

     o. SNS Subscribers:
        >> HTTP
        >> HTTPS
        >> Email
        >> Email-JSON
        >> SQS
        >> Application
        >> Lambda

     o. SNS vs SQS:
        >> Both are messaging services in AWS.
        >> SNS - Push.
        >> SQS - Polls (Pulls).

     o. SNS Pricing:
        >> Users pay $0.50 per 1 million Amazon SNS requests.
        >> $0.06 per 100,000 notification deliveries over HTTP.
        >> $0.75 per 100 notification deliveries over SMS. [Short Message Service, which is used for most phone text message delivery]
        >> $2.00 per 100,000 notification deliveries over Email.

     o. Elastic Transcoder:
        >> Media transcoder in the cloud.
        >> Converts media files from their original source format into different formats that will play on smartphones, tablets, PCs etc.
        >> Provides transcoding presets for popular output formats, which means that you don’t need to guess about which settings work best on particular devices.
        >> Pay based on the minutes that you transcode and the resolution at which you transcode.

     o. API Gateway:
        >> Amazon API Gateway is a fully managed service that makes it easy for developers to publish, maintain, monitor, and secure APIs at any scale.
           With a few clicks in the AWS Management Console, you can create an API that acts as a “front door” for applications to access data, business logic, or functionality from your back-end services, such as applications running on Amazon Elastic Compute Cloud (Amazon EC2), code running on AWS Lambda, or any web application.
        >> API Gateway supports: GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS.

     o. API Caching:
        >> You can enable API caching in Amazon API Gateway to cache your endpoint’s response. With caching, you can reduce the number of calls made to your endpoint and also improve the latency of the requests to your API. When you enable caching for a stage, API Gateway caches responses from your endpoint for a specified time-to-live (TTL) period, in seconds. API Gateway then responds to the request by looking up the endpoint response from the cache instead of making a request to your endpoint.
        >> Low cost and efficient.
        >> Scales effortlessly.
        >> You can throttle requests to prevent attacks.
        >> Connect to CloudWatch to log all requests.

     o. Same Origin Policy:
        >> In computing, the same-origin policy is an important concept in the web application security model. Under the policy, a web browser permits scripts contained in a first web page to access data in a second web page, but only if both web pages have the same origin.
        >> As I understand it: the first web page GETs/POSTs ['value'] and passes it to the second web page, and both web pages must be in the same domain.

     o. Cross-origin Resource Sharing:
        >> CORS is one way the server at the other end (not the client code in the browser) can relax the same-origin policy. Cross-origin resource sharing (CORS) is a mechanism that allows restricted resources (e.g. fonts) on a web page to be requested from another domain outside the domain from which the first resource was served.
        >> Error - “Origin policy cannot be read at the remote resource?”. You need to enable CORS on API Gateway.

     o. Exam Tips - API Gateway:
        >> Remember what API Gateway is at a high level.
        >> API Gateway has caching capabilities to increase performance.
        >> API Gateway is low cost and scales automatically.
        >> You can throttle API Gateway to prevent attacks.
        >> You can log results to CloudWatch.
        >> If you are using JavaScript/AJAX that uses multiple domains with API Gateway, ensure that you have enabled CORS on API Gateway.

     o. Streaming data:
        >> Streaming data is data that is generated continuously by thousands of data sources, which typically send in the data records simultaneously, and in small sizes (order of kilobytes).
        >> Purchases from online stores (think amazon.com).
        >> Stock prices.
        >> Game data (as the gamer plays).
        >> Social network data.
        >> Geospatial data (think uber.com).
        >> IoT sensor data.

     o. Kinesis:
        >> Amazon Kinesis is a platform on AWS to send your streaming data to. Kinesis makes it easy to load and analyze streaming data, and also gives you the ability to build your own custom applications for your business needs.
        >> Core services of Kinesis:
        >> Kinesis Streams - "Shard is the unique marker":
           ==> Kinesis Streams consist of shards. Each shard supports up to 5 transactions per second for reads, up to a maximum total data read rate of 2 MB per second, and up to 1,000 records per second for writes, up to a maximum total data write rate of 1 MB per second (including partition keys).
           ==> The data capacity of your stream is a function of the number of shards that you specify for the stream. The total capacity of the stream is the sum of the capacities of its shards.
        >> Kinesis Firehose:
        >> Kinesis Firehose with ElasticSearch Cluster:
        >> Kinesis Analytics - You can run SQL queries on top of Kinesis Streams and Firehose:

     o. Exam Tips - Kinesis:
        >> Know the difference between Kinesis Streams and Kinesis Firehose. You will be given scenario questions and you must choose the most relevant service.
        >> Understand what Kinesis Analytics is.
        >> Records added to the stream are accessible for 24 hours by default, but you can extend data retention to a maximum of 7 days.


<14> AWS White Paper:

     o. Cloud computing is the on-demand delivery of IT resources and applications via the Internet with pay-as-you-go pricing. Cloud computing provides a simple way to access servers, storage, databases, and a broad set of application services over the Internet. Cloud computing providers such as AWS own and maintain the network-connected hardware required for these application services, while you provision and use what you need using a web application.

     o. 6 Advantages Of Cloud:
        >> Trade capital expense for variable expense.
        >> Benefit from massive economies of scale.
        >> Stop guessing about capacity.
        >> Increase speed and agility.
        >> Stop spending money running and maintaining data centers.
        >> Go global in minutes.

     o. Security:
        >> State of the art electronic surveillance and multi-factor access control systems.
        >> Staffed 24 x 7 by security guards.
        >> Access is authorised on a “least privilege basis”.

     o. Compliance:
        >> SOC 1/SSAE 16/ISAE 3402 (formerly SAS 70 Type II)
        >> SOC 2
        >> SOC 3
        >> FISMA, DIACAP, and FedRAMP
        >> PCI DSS Level 1
        >> ISO 27001
        >> ISO 9001
        >> ITAR
        >> FIPS 140-2
        >> Several industry-specific standards:
           . HIPAA
           . Cloud Security Alliance (CSA)
           . Motion Picture Association of America (MPAA)

     o. AWS Solution Architect - Associate Major Topic:

     o. AWS Solution Architect - Associate Topic Percentage:

     o. Shared Security Model:
        >> AWS is responsible for securing the underlying infrastructure that supports the cloud, and you’re responsible for anything you put on the cloud or connect to the cloud.

     o. AWS Security Responsibilities:
        >> Amazon Web Services is responsible for protecting the global infrastructure that runs all of the services offered in the AWS cloud. This infrastructure is comprised of the hardware, software, networking, and facilities that run AWS services.
           AWS is responsible for the security configuration of its products that are considered managed services. Examples of these types of services include Amazon DynamoDB, Amazon RDS, Amazon Redshift, Amazon Elastic MapReduce, Amazon WorkSpaces.

     o. Customer Security Responsibilities:
        >> IaaS - services such as Amazon EC2, Amazon VPC, and Amazon S3 are completely under your control and require you to perform all of the necessary security configuration and management tasks.
        >> Managed Services - AWS is responsible for patching, antivirus etc.; however, you are responsible for account management and user access. It’s recommended that MFA be implemented, that you communicate with these services using SSL/TLS, and that API user activity logging be set up with CloudTrail.

     o. Storage Decommissioning:
        >> When a storage device has reached the end of its useful life, AWS procedures include a decommissioning process that is designed to prevent customer data from being exposed to unauthorized individuals. AWS uses the techniques detailed in DoD 5220.22-M (“National Industrial Security Program Operating Manual”) or NIST 800-88 (“Guidelines for Media Sanitization”) to destroy data as part of the decommissioning process. All decommissioned magnetic storage devices are degaussed and physically destroyed in accordance with industry-standard practices.

     o. Network Security:
        >> Transmission Protection - You can connect to an AWS access point via HTTP or HTTPS using Secure Sockets Layer (SSL), a cryptographic protocol that is designed to protect against eavesdropping, tampering, and message forgery. For customers who require additional layers of network security, AWS offers the Amazon Virtual Private Cloud (VPC), which provides a private subnet within the AWS cloud, and the ability to use an IPsec Virtual Private Network (VPN) device to provide an encrypted tunnel between the Amazon VPC and your data center.
        >> Amazon Corporate Segregation - Logically, the AWS Production network is segregated from the Amazon corporate network by means of a complex set of network security / segregation devices.

     o. Network Monitoring & Protection:
        >> DDoS
        >> Man in the middle attacks (MITM)
        >> IP Spoofing
        >> Port Scanning
        >> Packet Sniffing by other tenants

     o. Network Monitoring & Protection:
        >> Port Scanning - Unauthorized port scans by Amazon EC2 customers are a violation of the AWS Acceptable Use Policy. You may request permission to conduct vulnerability scans as required to meet your specific compliance requirements. These scans must be limited to your own instances and must not violate the AWS Acceptable Use Policy. You must request a vulnerability scan in advance.

     o. AWS Credentials:

     o. AWS Trusted Advisor:
        >> Trusted Advisor inspects your AWS environment and makes recommendations when opportunities may exist to save money, improve system performance, or close security gaps. It provides alerts on several of the most common security misconfigurations that can occur, including leaving certain ports open that make you vulnerable to hacking and unauthorized access, neglecting to create IAM accounts for your internal users, allowing public access to Amazon S3 buckets, not turning on user activity logging (AWS CloudTrail), or not using MFA on your root AWS Account.

     o. Instance Isolation:
        >> Different instances running on the same physical machine are isolated from each other via the Xen hypervisor. In addition, the AWS firewall resides within the hypervisor layer, between the physical network interface and the instance’s virtual interface. All packets must pass through this layer, thus an instance’s neighbors have no more access to that instance than any other host on the Internet and can be treated as if they are on separate physical hosts. The physical RAM is separated using similar mechanisms.
        >> Customer instances have no access to raw disk devices, but instead are presented with virtualized disks. The AWS proprietary disk virtualization layer automatically resets every block of storage used by the customer, so that one customer’s data is never unintentionally exposed to another.
        >> In addition, memory allocated to guests is scrubbed (set to zero) by the hypervisor when it is unallocated from a guest. The memory is not returned to the pool of free memory available for new allocations until the memory scrubbing is complete.

     o. Guest Operating System:
        >> Virtual instances are completely controlled by you, the customer. You have full root access or administrative control over accounts, services, and applications. AWS does not have any access rights to your instances or the guest OS.
        >> Firewall - Amazon EC2 provides a complete firewall solution; this mandatory inbound firewall is configured in a default deny-all mode and Amazon EC2 customers must explicitly open the ports needed to allow inbound traffic.
        >> Encryption of sensitive data is generally a good security practice, and AWS provides the ability to encrypt EBS volumes and their snapshots with AES-256. The encryption occurs on the servers that host the EC2 instances, providing encryption of data as it moves between EC2 instances and EBS storage. In order to be able to do this efficiently and with low latency, the EBS encryption feature is only available on EC2’s more powerful instance types (e.g., M3, C3, R3, G2).

     o. Elastic Load Balancing:
        >> SSL termination on the load balancer is supported.
        >> Allows you to identify the originating IP address of a client connecting to your servers, whether you’re using HTTPS or TCP load balancing.

     o. Direct Connect:
        >> Bypass Internet service providers in your network path. You can procure rack space within the facility housing the AWS Direct Connect location and deploy your equipment nearby.
           Once deployed, you can connect this equipment to AWS Direct Connect using a cross-connect. Using industry standard 802.1q VLANs, the dedicated connection can be partitioned into multiple virtual interfaces. This allows you to use the same connection to access public resources such as objects stored in Amazon S3 using public IP address space, and private resources such as Amazon EC2 instances running within an Amazon VPC using private IP space, while maintaining network separation between the public and private environments.

     o. Shared Responsibility Model:
        >> Moving IT infrastructure to AWS services creates a model of shared responsibility between the customer and AWS. This shared model can help relieve the customer’s operational burden as AWS operates, manages and controls the components from the host operating system and virtualization layer down to the physical security of the facilities in which the service operates.
        >> The customer assumes responsibility and management of the guest operating system (including updates and security patches), other associated application software as well as the configuration of the AWS provided security group firewall.

     o. Risk:
        >> AWS management has developed a strategic business plan which includes risk identification and the implementation of controls to mitigate or manage risks. AWS management reevaluates the strategic business plan at least biannually. This process requires management to identify risks within its areas of responsibility and to implement appropriate measures designed to address those risks.
        >> AWS Security regularly scans all Internet facing service endpoint IP addresses for vulnerabilities (these scans do not include customer instances). AWS Security notifies the appropriate parties to remediate any identified vulnerabilities. In addition, external vulnerability threat assessments are performed regularly by independent security firms.
           Findings and recommendations resulting from these assessments are categorized and delivered to AWS leadership. These scans are done in a manner for the health and viability of the underlying AWS infrastructure and are not meant to replace the customer’s own vulnerability scans required to meet their specific compliance requirements.
        >> Customers can request permission to conduct scans of their cloud infrastructure as long as they are limited to the customer’s instances and do not violate the AWS Acceptable Use Policy.

     o. Compliance:
        >> SOC 1/SSAE 16/ISAE 3402 (formerly SAS 70 Type II)
        >> SOC 2
        >> SOC 3
        >> FISMA, DIACAP, and FedRAMP
        >> PCI DSS Level 1
        >> ISO 27001
        >> ISO 9001
        >> ITAR
        >> FIPS 140-2
        >> Several industry-specific standards:
           . HIPAA
           . Cloud Security Alliance (CSA)
           . Motion Picture Association of America (MPAA)

     o. Each Service Contains:
        >> Summary
        >> Ideal Usage Patterns
        >> Performance
        >> Durability and Availability
        >> Cost Model
        >> Scalability & Elasticity
        >> Interfaces
        >> Anti-Patterns

     o. Business Benefits of Cloud:
        >> Almost zero upfront infrastructure investment
        >> Just-in-time Infrastructure
        >> More efficient resource utilization
        >> Usage-based costing
        >> Reduced time to market

     o. Technical Benefits of Cloud:
        >> Automation - “Scriptable infrastructure”
        >> Auto-scaling
        >> Proactive Scaling
        >> More Efficient Development lifecycle
        >> Improved Testability
        >> Disaster Recovery and Business Continuity
        >> “Overflow” the traffic to the cloud

     o. Design For Failure:
        >> Rule of thumb: Be a pessimist when designing architectures in the cloud; assume things will fail. In other words, always design, implement and deploy for automated recovery from failure. In particular, assume that your hardware will fail. Assume that outages will occur. Assume that some disaster will strike your application. Assume that you will be slammed with more than the expected number of requests per second some day. Assume that with time your application software will fail too.
           By being a pessimist, you end up thinking about recovery strategies during design time, which helps in designing an overall system better.

     o. Decouple Your Components:
        >> The key is to build components that do not have tight dependencies on each other, so that if one component were to die (fail), sleep (not respond) or remain busy (slow to respond) for some reason, the other components in the system are built so as to continue to work as if no failure is happening. In essence, loose coupling isolates the various layers and components of your application so that each component interacts asynchronously with the others and treats them as a “black box”.
        >> For example, in the case of web application architecture, you can isolate the app server from the web server and from the database. The app server does not know about your web server and vice versa; this decouples these layers, and there are no dependencies between them from a code or functional perspective. In the case of batch processing architecture, you can create asynchronous components that are independent of each other.

     o. Implement Elasticity:
        >> The cloud brings a new concept of elasticity to your applications. Elasticity can be implemented in three ways:
           1. Proactive Cyclic Scaling: Periodic scaling that occurs at a fixed interval (daily, weekly, monthly, quarterly).
           2. Proactive Event-based Scaling: Scaling when expecting a big surge of traffic requests due to a scheduled business event (new product launch, marketing campaigns).
           3. Auto-scaling based on demand: By using a monitoring service, your system can send triggers to take appropriate actions so that it scales up or down based on metrics (utilization of the servers or network I/O, for instance).

     o. AWS Import/Export:
        >> AWS Import/Export accelerates moving large amounts of data into and out of AWS using portable storage devices for transport.
           AWS transfers your data directly onto and off of storage devices using Amazon’s high-speed internal network, bypassing the Internet. For significant datasets, AWS Import/Export is often faster than Internet transfer and more cost-effective than upgrading your connectivity.

     o. AWS Storage Gateway:
        >> AWS Storage Gateway is a service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization’s on-premises IT environment and AWS’s storage infrastructure. The service enables you to securely store data in the AWS cloud for scalable and cost-effective storage.
        >> Gateway-cached volumes allow you to use Amazon S3 for your primary data, while retaining some portion of it locally in a cache for frequently accessed data. These volumes minimize the need to scale your on-premises storage infrastructure, while still providing your applications with low-latency access to their frequently accessed data. You can create storage volumes up to 32 TB in size and mount them as iSCSI devices from your on-premises application servers. Data written to these volumes is stored in Amazon S3, with only a cache of recently written and recently read data stored locally on your on-premises storage hardware.
        >> Gateway-stored volumes store your primary data locally, while asynchronously backing up that data to AWS. These volumes provide your on-premises applications with low-latency access to their entire datasets, while providing durable, off-site backups. You can create storage volumes up to 1 TB in size and mount them as iSCSI devices from your on-premises application servers. Data written to your gateway-stored volumes is stored on your on-premises storage hardware and asynchronously backed up to Amazon S3 in the form of Amazon EBS snapshots.

<15> Additional Topics:

     o. API Gateway:
        >> API Gateway is integrated with CloudFront automatically in the background to ensure better distribution of API call responses.
        >> Each API Gateway stage provides a unique URL that users call to reach the associated API snapshot.
        >> When your API's resources receive requests from a domain other than the API's own domain, you must enable cross-origin resource sharing [CORS] for the selected methods on the resource.
        >> API caching is not eligible for the Free Tier.
        >> API Gateway can act as an AWS service proxy to make calls to Amazon S3, SNS, and Kinesis.
        >> Amazon Cognito can be used to control access to your API Gateway.
        >> Endpoints exposed by API Gateway are HTTPS.
        >> API Gateway should always be integrated with an IAM role for security purposes.

     o. Load balancer:
        >> If you suspend AddToLoadBalancer, Auto Scaling launches the instances but does not add them to the ELB or target group.
        >> Termination policy - The first goal is to keep instances, and therefore the traffic that ELB distributes, spread evenly across AZs. The AZ with the most EC2 instances is picked first when terminating.
        >> When the Auto Scaling group uses both EC2 and ELB health checks, and either of them reports an instance as unhealthy, the Auto Scaling group will replace the instance.
        >> ELB plus ASG is the combination for AWS high availability.

     o. Security Token Service:
        >> The default lifetime of a session token received from STS with SAML federation is 1 hour. Min 15 minutes; max 1 hour.
        >> SAML - Active Directory, single sign-on between on-premises and the AWS cloud.
        >> Web Identity - web login, such as Google or Facebook, with temporary credentials.

     o. Container Service:
        >> Docker diagnostics for troubleshooting errors.
        >> Private registry authentication.
        >> Task definition files ensure that you are using the right Docker image.
        >> Elastic Container Service use cases: continuous integration, continuous deployment, microservices.
        >> Dynamic port mapping advantages:
           . Ability to assign an unused port.
           . An ELB can be shared among multiple services using path-based routing.
           . Ability to create an environment variable with the service's ELB DNS name, supporting service discovery.
        >> The main purpose of port mappings is to let containers send and receive traffic on ports of the host container instance.

     o. Elastic Load Balancer:
        >> Application ELB: path-based routing, microservices; Classic ELB: TCP-based routing.
        >> Internet-facing ELB: web servers behind it; Internal ELB: databases behind it.
        >> Cross-zone load balancing needs to be configured explicitly.
        >> ELB - if the health check fails, the EC2 instance is marked as out-of-service.
        >> ELB - logging [ELB Request Tracing] captures detailed information about each request, and it is free.

     o. Networking:
        >> VPC peering works between VPCs in the same account or across accounts, but they must be in the same region.
        >> The CIDR blocks of peered VPCs cannot overlap with each other.
        >> VPC peering is not transitive.
        >> You cannot route traffic to a NAT gateway through a VPC peering connection, a VPN connection, or AWS Direct Connect.
        >> By default, IAM users do not have permission to work with NAT gateways. You can create an IAM user policy that grants users permission to create, describe, and delete NAT gateways. NAT gateway API operations do NOT support resource-level permissions.

     o. EBS Encryption:
        >> CMK for EBS encryption - you cannot change the CMK that is associated with an existing snapshot or encrypted volume. You can associate a different CMK during a snapshot copy operation.
        >> Encrypted EBS volume covers: data at rest inside the volume; all data moving between the volume and the instance; all snapshots created from the volume.
        >> You cannot enable encryption on an existing EBS volume.
        >> Default AMI ==> root volume cannot be encrypted. Root/data volume snapshot ==> AMI: fine.
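
        >> Because encryption cannot be switched on for an existing volume, the usual workaround follows the snapshot-copy note above: snapshot the volume, copy the snapshot with encryption and a CMK, then create a new volume from the encrypted copy. Below is a minimal sketch of those steps, not a definitive implementation. It assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`). The function name `encrypt_existing_volume` and its parameters are hypothetical names chosen for illustration.

```python
def encrypt_existing_volume(ec2, volume_id, region, availability_zone, kms_key_id):
    """Return the ID of a new encrypted volume built from an unencrypted one.

    `ec2` is assumed to be a boto3 EC2 client; all other names are illustrative.
    """
    # 1. Snapshot the existing (unencrypted) volume and wait until it completes.
    snap = ec2.create_snapshot(
        VolumeId=volume_id,
        Description="pre-encryption snapshot",
    )
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

    # 2. Copy the snapshot, turning encryption on and choosing the CMK.
    #    The copy operation is the point where a CMK can be (re)assigned.
    copy = ec2.copy_snapshot(
        SourceSnapshotId=snap["SnapshotId"],
        SourceRegion=region,
        Encrypted=True,
        KmsKeyId=kms_key_id,
    )
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[copy["SnapshotId"]])

    # 3. Create a new volume from the encrypted copy; it inherits the encryption.
    vol = ec2.create_volume(
        SnapshotId=copy["SnapshotId"],
        AvailabilityZone=availability_zone,
    )
    return vol["VolumeId"]
```

           The new volume still has to be attached to the instance in place of the old one. That step is left out of the sketch.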