- Regions & Availability Zones
- Region > Availability Zone (AZ)
- IAM (Identity & Access Management)
- Globally accessible service (not Region or AZ specific)
- Users
- Assigned to people
- Groups
- group of people by Function, Team, etc.
- Roles
- Assigned to machines or applications (internal use)
- IAM policies written in JSON
- MFA (Multi Factor Authentication) can be set up
- IAM has predefined policies
- IAM Federation
- To enable AWS login using enterprise credentials
- Never share or write IAM credentials in code
- Never use ROOT account
- Complete security checks to secure the account
- Delete root access keys
- Activate MFA for root account (Google Authenticator)
- Create individual IAM users
- Create own user with Administrator Access policy
- Use groups to assign permissions
- Create 'Admin' group
- Assign Administrator Access policy to it.
- Add own user to this group.
- Detach the individually assigned Administrator Access policy from own user.
- Apply IAM password policy
- At least 1 upper-case
- At least 1 lower-case
- At least 1 number
- At least 1 non-alphanumeric
- Allow users to change their password
- Enable password expiration with an expiration duration
- Prevent reuse of the previous x passwords
- Password expiration requires administrator reset
- Access account using IAM Admin user, not the root user
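The four character-class rules of the password policy above can be sketched as a small checker. This is only an illustration of what the policy enforces, not how IAM implements it; the function name `check_password` is hypothetical.

```python
def check_password(password: str) -> list[str]:
    """Return the policy rules the password violates (empty list = compliant)."""
    problems = []
    if not any(c.isupper() for c in password):
        problems.append("no upper-case letter")
    if not any(c.islower() for c in password):
        problems.append("no lower-case letter")
    if not any(c.isdigit() for c in password):
        problems.append("no number")
    # True when every character is a letter or digit (also for an empty string)
    if all(c.isalnum() for c in password):
        problems.append("no non-alphanumeric character")
    return problems


print(check_password("Abcdef1!"))  # compliant password, no violations
print(check_password("abc"))       # violates three of the four rules
```

Expiration, reuse prevention and administrator reset are account-level settings and are not modelled here.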
- EC2
- Capabilities
- Renting VMs (EC2)
- Storing data on virtual drives (EBS)
- Distributing load across machines (ELB)
- Scaling services using auto-scaling group (ASG)
- Launch EC2 instance
- Choose AMI (Amazon Machine Image)
- Choose Instance type (t2.micro)
- Configure instance details
- No. of instances (For ASG)
- Choose VPC
- Choose Subnet
- Assign Public IP
- Select Placement Group
- Select Capacity Reservation
- Choose IAM Role
- Set Shutdown behaviour
- Enable termination protection
- Enable Cloudwatch monitoring
- Decide Tenancy
- Add Storage
- Set Tags
- Configure Security Groups
- Decide ports and IPs to allow access
- Review & Launch
- Select Key-Pairs
- Save the .pem file securely
- SSH into EC2 instance
- Use Putty
- Convert .pem to .ppk using PuttyGen
- Login to instance using public IP and .ppk file (SSH > Auth)
- Change permission of .pem file to 0400 on Linux
- Security Groups (Details)
- Acts like Firewall outside EC2 instance
- Controls the inbound and outbound traffic of EC2 instances
- If the issue is a timeout, check the security group rules first
- If the issue is 'connection refused', then it's an application-level issue
- EC2 instances and Security Groups have a many-to-many relation
- All inbound traffic is blocked by default
- All outbound traffic is authorised by default
- Traffic from Security Groups as a source can also be allowed instead of specific IPs (Useful in case of load balancers)
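The default behaviour described above (inbound blocked unless a rule allows it) can be sketched with the standard `ipaddress` module. This is a simplified model under stated assumptions: a rule is a hypothetical `(cidr, port)` pair, and real security groups also support port ranges, protocols and other security groups as sources.

```python
from ipaddress import ip_address, ip_network


def inbound_allowed(rules, source_ip, port):
    """Allow traffic only if some rule matches; with no rules, everything is blocked."""
    src = ip_address(source_ip)
    return any(
        src in ip_network(cidr) and port == allowed_port
        for cidr, allowed_port in rules
    )


# Hypothetical rule set: HTTP from anywhere, SSH only from one office range
rules = [("0.0.0.0/0", 80), ("203.0.113.0/24", 22)]

print(inbound_allowed(rules, "198.51.100.7", 80))  # HTTP is open to everyone
print(inbound_allowed(rules, "198.51.100.7", 22))  # SSH from outside the office range
```

A blocked request in this model corresponds to the timeout symptom mentioned above: the packet is dropped before it reaches the instance.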
- Private vs Public vs Elastic IP
- Public IP
- Accessible to internet
- Changes after instance stop/start (a plain reboot keeps it)
- Private IP
- Accessible inside private network
- Retained across instance restart
- Elastic IP (Billed)
- Fixed public IP that is retained across EC2 instance stop/start
- Can mask failures by remapping the Elastic IP to a different instance
- EC2 User Data
- Runs a script only once, on the instance's first start
- Can do ANYTHING (the more it does, the longer startup takes)
- User Data Script runs as the root user
- Add User Data Script while creating EC2 instance under Advanced Details on 'Configure Instance Details' page
- EC2 Launch Types
- On Demand Instances
- Short Workload
- Predictable Pricing
- Pay for what you use (per second billing)
- Highest cost
- No upfront payment
- No long term commitment
- Reserved Instances
- Long Workload (>= 1 year)
- Up to 75% discount compared to On-Demand
- Pay upfront for long term commitment
- Reserve for 1 - 3 years
- Reserve a specific instance type (Eg.: x4-large)
- Convertible Reserved Instances
- Long Workload with flexible instances
- Can change instance type
- Up to 54% discount compared to On-Demand
- Scheduled Reserved Instances
- Launch within reserved time window
- Use when you need (Day, Week, Month) (Eg.: Every Sat-Sun)
- Spot Instances
- Short Workload
- Cheap
- Can lose instances
- Up to 90% discount compared to On-Demand
- Get the instance by bidding
- Keep the instance as long as the bid amount is above the spot price
- Price depends on demand and offers
- Instance is reclaimed with a 2-minute notice once the spot price exceeds the bid amount
- Typically used for Batch Jobs, Big Data Analysis which are resilient to failures
- Dedicated Instances
- No other customer will share hardware
- May share hardware with other instances in the same account
- Dedicated Hosts
- Book entire physical server
- Control instance placement
- Visibility to underlying socket, processor cores, hardware, etc.
- Allocated for a 3 year period
- More expensive
- Useful in cases of Complicated Licensing model or strict company compliance policies
- Other important details
- EC2 Pricing (only while the instance is running)
- Per Hour Pricing depending on
- Region (Mumbai)
- Instance Type (t2.micro)
- Launch Type (On-Demand)
- OS (Linux)
- Per second billing (after 60 seconds)
- Other chargeable factors
- Storage
- Data Transfer
- Fixed IP
- Load Balancing
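The per-second billing rule with its 60-second minimum can be worked through as a small calculation. The hourly rate used here is a made-up figure for illustration, and the function covers only instance runtime, not the other chargeable factors listed above.

```python
def on_demand_cost(hourly_rate: float, seconds_running: int) -> float:
    """Runtime cost under per-second billing with a 60-second minimum."""
    if seconds_running <= 0:
        return 0.0  # an instance that never ran is not billed for runtime
    billable_seconds = max(60, seconds_running)
    return hourly_rate * billable_seconds / 3600


# Hypothetical rate of 3.60 per hour
print(on_demand_cost(3.60, 10))    # billed as 60 seconds
print(on_demand_cost(3.60, 3600))  # one full hour
```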
- AMI (Amazon Machine Images)
- Predefined
- Ubuntu
- Fedora
- RedHat
- Windows
- Etc.
- Can be customised using EC2 User Data Scripts
- Custom AMIs can be used
- Pre-installed packages
- Faster Boot time (No User Data Scripts)
- Pre installed monitoring and network tools for enterprise
- Control maintenance and updates over time
- Configure LDAP out-of-the-box
- Install application before machine boot on all machines (During auto-scaling)
- Someone else's exported AMI
- Custom AMIs are region specific
- EC2 Instances Characteristics
- RAM (Type, Amount, Generation)
- CPU (Cores, Type, Make, Frequency, Generation)
- I/O (Disk Performance, EBS optimisations)
- Network (Bandwidth, Latency)
- GPU (Present or Not?)
- Permutations result in more than 50 instance types
- Summary : https://ec2instances.info
- R/C/P/G/H/X/I/F/Z/CR - Best in their characteristic
- M - Balanced (Good overall, Great in nothing)
- T2/T3
- OK CPU overall
- On a processing spike, the CPU bursts (VERY GOOD processing)
- In burst mode, uses "burst credits"
- If all credits are used, CPU becomes BAD
- If machine stops bursting, credits accumulate over time
- T2 Unlimited
- Unlimited credits for high cost
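The burst-credit behaviour of T2/T3 can be illustrated with a toy hourly simulation. Everything numeric here is an assumption for the sketch (earn rate, baseline, credit cap), and the model treats one credit as one minute of a single vCPU at 100%; actual accounting is finer-grained.

```python
def simulate_burst(cpu_demand_by_hour, earn_per_hour=6.0, max_credits=144.0,
                   start_credits=0.0):
    """Return a list of (delivered_cpu_percent, credits_left) per hour."""
    credits = start_credits
    timeline = []
    for demand in cpu_demand_by_hour:
        available = credits + earn_per_hour
        needed = demand * 60 / 100  # credits to sustain `demand`% for 60 minutes
        if needed <= available:
            delivered = float(demand)
            credits = available - needed
        else:
            # Credits run out: average CPU over the hour is capped by what is available
            delivered = available * 100 / 60
            credits = 0.0
        credits = min(max_credits, credits)
        timeline.append((delivered, credits))
    return timeline


# Two idle hours bank credits, then a full-load hour can only partly burst
for hour, (cpu, left) in enumerate(simulate_burst([0, 0, 100]), start=1):
    print(hour, cpu, left)
```

With the assumed defaults, an instance that stays at 10% CPU spends credits exactly as fast as it earns them, which is why that level acts as the baseline in this model.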
- EC2 Checklist
- SSH to EC2
- .pem to .ppk (PuTTY)
- change permission of .pem (0400 on Linux)
- use security group properly
- public vs private vs elastic IP
- User Data Scripts to customize instance
- Custom AMI to enhance OS
- EC2 billing is per second (after 60 seconds)
- Load Balancing & Auto Scaling
- Load Balancer
- Forwards user traffic to multiple instances
- Spreads load across multiple downstream instances
- Handle failures of application instances
- Regular health checks of instances
- SSL termination (No need to have SSL certificates on instances)
- Stickiness with cookies (ensures all requests from a user session are forwarded to the same instance)
- High availability across zones (if an AZ fails)
- Separate public traffic from private traffic
- EC2 Load Balancer
- Uptime guarantee
- AWS handles Upgrades, maintenance, high availability
- Provides only a few configuration options
- Costs more than running your own load balancer, but takes far less effort
- integrated with many AWS services
- Types of Load balancer (Internal / External)
- Classic (V1 - Old - 2009)
- Checks for health by hitting API (/health) and expects 200 OK response
- Application (V2 - New - 2016) (Recommended)
- Layer 7 (HTTP level)
- Balance multiple HTTP apps across machines
- Balance multiple apps on same machine
- Balance based on route in URL
- Balance based on Hostname in URL
- Great for microservices and container-based apps (Docker, Amazon ECS)
- Port mapping feature to redirect to dynamic port
- Can have multiple Target Groups
- Each target group can have multiple EC2 instances and a health checkup logic
- Load Balancer redirects traffic from HTTP to target group
- Stickiness can be enabled at the target group level
- stickiness cookie is generated at ALB, not application
- ALB supports HTTP/HTTPS
- Client IP invisible to server
- Client IP is present at 'X-Forwarded-For'
- Port at 'X-Forwarded-Port'
- Protocol at 'X-Forwarded-Proto'
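Since the server behind an ALB only sees the load balancer's address, the application has to read the forwarded headers. A minimal sketch, assuming the request headers arrive as a plain dict keyed with exactly these names (real frameworks normalise header case differently); `client_info` is a hypothetical helper.

```python
def client_info(headers: dict) -> dict:
    """Extract original client details from the X-Forwarded-* headers."""
    # The header can hold a chain "client, proxy1, proxy2"; the left-most entry is
    # conventionally the original client, but it is client-supplied and can be spoofed.
    forwarded_for = headers.get("X-Forwarded-For", "")
    client_ip = forwarded_for.split(",")[0].strip() or None
    port = headers.get("X-Forwarded-Port")
    return {
        "ip": client_ip,
        "port": int(port) if port else None,
        "proto": headers.get("X-Forwarded-Proto"),
    }


print(client_info({
    "X-Forwarded-For": "203.0.113.10, 10.0.0.5",
    "X-Forwarded-Port": "443",
    "X-Forwarded-Proto": "https",
}))
```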
- Network (V2 - New - 2017)
- Layer 4 (TCP Traffic)
- Handle millions of req/sec
- Supports static/elastic IP
- Less Latency - 100ms (vs 400ms for ALB)
- Used for extreme performance (Not commonly used)
- Load Balancers have static hostname
- Do not use underlying IP
- Load balancers can scale but not instantaneously (contact AWS for 'warm-up')
- NLB can see client IP
- If LB can't connect to the app, check security group
- If 503 error, capacity error or no registered target
- Auto Scaling Group
- Scale out to match increased load (Add instances)
- Scale in to match decreased load (Remove instances)
- Define minimum & maximum number of instances for scaling
- Automatically register instances
- ASG Attributes
- Launch Configuration
- AMI + Instance Type
- EC2 User Data
- EBS Volume
- Security Group
- SSH Key pair
- Min Size
- Max Size
- Initial Capacity
- Network & Subnet info
- Load Balancer info
- Scaling policy
- Auto Scaling Alarm
- ASG scales based on CloudWatch Alarms
- Alarm monitors a metric (Eg: Average CPU)
- Metrics are computed for overall ASG instances
- Scale based on Alarms
- Metrics
- Target Average CPU Usage
- Number of requests on ELB per instance
- Avg Network In
- Avg Network Out
- Custom Metrics (Eg. : No. of connected users)
- Send custom metric from app to CloudWatch (PutMetricData API)
- Create CloudWatch alarm to react to high/low values
- Use CloudWatch alarm as scaling policy
- IAM roles attached to ASG get assigned to EC2 instances
- ASGs are FREE
- ASG will replace instances if they get terminated for whatever reason
- ASG terminates instances marked as unhealthy by the LB and replaces them by creating new instances
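How a target-style scaling policy interacts with the minimum and maximum size can be sketched as a proportional rule clamped to the group limits. This is a simplified illustration, not the exact algorithm AWS uses; `desired_capacity` and its parameters are hypothetical names.

```python
import math


def desired_capacity(current, avg_cpu, target_cpu, min_size, max_size):
    """Scale the group so that average CPU moves towards the target, within limits."""
    proposed = math.ceil(current * avg_cpu / target_cpu)
    return max(min_size, min(max_size, proposed))


# 4 instances at 80% average CPU with a 40% target: scale out to 8
print(desired_capacity(current=4, avg_cpu=80, target_cpu=40, min_size=1, max_size=10))
# Same load, but the group may never exceed 6 instances
print(desired_capacity(current=4, avg_cpu=80, target_cpu=40, min_size=1, max_size=6))
```

The metric is averaged over the whole group, which matches the note above that metrics are computed for the ASG overall rather than per instance.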
- EBS Volume
- EC2 instance loses root volume after termination
- EBS Volume - network drive to persist data (not physical drive)
- Reattach to different instance
- Adds a little latency (it is a network drive)
- Volumes are AZ specific (to move across AZ, take snapshot)
- Get billed for entire capacity
- Can increase the capacity later
- Can attach multiple EBS volumes to the same instance
- Volume Types
- GP2 (SSD - General purpose)
- IO1 (SSD - Highest perf., low latency, high throughput)
- ST1 (HDD - Low cost, throughput intensive)
- SC1 (HDD - Lowest cost)
- EBS Volumes Characteristics
- Size
- Throughput
- IOPS (I/O operations per sec.)
- EBS volumes size can be increased (repartition after resize)
- Size
- IOPS (IO1 Only)
- EBS Snapshot
- Uses actual used space, not entire volume space
- Used for
- Backup
- Resizing
- Change volume type
- Encrypt volume
- Migration between AZs
- EBS Encryption
- Data at rest is encrypted
- Data in flight is also encrypted
- Snapshots are also encrypted
- All volumes created from an encrypted snapshot are also encrypted
- Minimal impact on latency
- Uses KMS keys (AES-256) for encryption
- Copying an unencrypted snapshot allows encryption
- EBS vs. Instance Store
- Some instances don't have a Root EBS volume, but have an Instance Store
- Instance Store is physically attached to instance
- Better I/O perf.
- Data loss on termination
- Can't be resized
- Backups to be done by users
- EBS backups shouldn't be done while the application is in use (I/O intensive)
- Root EBS volumes get terminated on instance termination (this can be disabled)
- Route 53
- Managed DNS
- DNS Records
- A - hostname to IPv4
- AAAA - hostname to IPv6
- CNAME - hostname to hostname
- Alias - hostname to AWS Resource
- Can use Public Domain (owned or purchased) or private domain resolvable in your VPC
- Load Balancing through DNS (client side load balancing)
- Limited Health checks
- Routing Policy
- Simple
- Failover
- GeoLocation
- GeoProximity
- Latency
- Weighted
- Prefer Alias over CNAME for AWS resource (better performance)
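The Weighted routing policy can be illustrated as a weighted random choice between records: each DNS answer is picked with probability proportional to its weight. This sketch only models the selection; the IPs and weights are made up, and `pick_record` is a hypothetical helper.

```python
import random


def pick_record(weighted_records, rng=random):
    """Pick one record value with probability proportional to its weight."""
    values = [value for value, _ in weighted_records]
    weights = [weight for _, weight in weighted_records]
    return rng.choices(values, weights=weights, k=1)[0]


# Hypothetical records: roughly 70% of answers point at the first instance
records = [("10.0.0.1", 70), ("10.0.0.2", 30)]
rng = random.Random(42)  # seeded so the run is repeatable
answers = [pick_record(records, rng) for _ in range(10_000)]
print(answers.count("10.0.0.1") / len(answers))
```

Because the client receives one answer and connects to it directly, this is the client-side load balancing mentioned above: no traffic passes through Route 53 itself.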
- RDS (Relational Database Service)
- Managed DB service
- Uses SQL language
- Supported DBs
- Postgres
- Oracle
- MySQL
- MariaDB
- Microsoft SQL Server
- Aurora (AWS Proprietary DB)
- Continuous backups and restore
- Monitoring dashboards
- Read replicas for better performance
- Up to 5 read replicas
- Within AZ
- Cross AZ
- Cross Region
- Replication is ASYNC (reads are eventually consistent)
- Application needs to change the connection string to leverage read replicas
- Application needs to add connection string for all replicas
- Multi AZ setup (Disaster Recovery)
- Replication is SYNC from Master to Standby instance of different AZ
- Only one DNS name (Automatic failover to standby)
- Increase availability
- Failover during loss of AZ, network, instance or storage
- No manual intervention in application
- Not suitable for scaling
- Maintenance windows for upgrades
- Scaling capability (vertical & horizontal)
- Can't SSH to RDS instances
- Backups
- Automatically enabled
- Daily full snapshot
- Capture transaction logs in real time
- Can restore to any point in time
- 7 days retention (can be increased to 35)
- Can manually trigger snapshot
- Manual snapshot can be retained for as long as we want
- RDS Encryption
- Encryption at rest with AWS KMS - AES-256
- SSL certificates to encrypt in-flight data
- To Enforce SSL
- Postgres : rds.force_ssl = 1 in AWS RDS Console
- MySQL : `GRANT USAGE ON *.* TO 'mysqluser'@'%' REQUIRE SSL;`
- To Connect using SSL
- Provide SSL Trust Cert (download from AWS)
- Provide SSL option while connecting to the DB
- RDS security
- Usually deployed in private subnets, not in public
- Uses Security Groups
- IAM policies control who manages RDS
- Username/Password can be used to login
- IAM users can be used too (MySQL/Aurora)
- RDS vs Aurora
- Aurora is proprietary technology by AWS (not open-source)
- Postgres & MySQL are supported by Aurora (Same Drivers)
- AWS cloud optimized
- 5x performance over MySQL
- 3x performance over Postgres
- Storage grows from 10GB to 64TB automatically
- Aurora can have 15 replicas (MySQL - 5)
- Aurora failover is instantaneous
- Aurora costs about 20% more than RDS, but is more efficient
- ElastiCache
- Managed Redis or Memcached
- Caches are in-memory databases (high performance, low latency)
- Reduces load off of DB for read intensive apps
- Helps to maintain stateless application
- Write scaling capability using sharding
- Read scaling capability using read replicas
- Multi AZ with failover capability
- AWS takes care of maintenance, optimization, setup, config, monitoring, failover recovery, backup
- Invalidation strategy is decided by application
- Can also be used to manage session across multiple application instances
- Redis
- In-memory key-value store
- Super low latency (sub ms)
- Cache survives reboots (persistence)
- Handles following
- User sessions
- Leaderboard
- Distributed states
- Relieve pressure on DB
- Pub / Sub capability for messages
- Multi AZ with auto failover
- Supports read replica
- Memcached
- In memory object-store
- Doesn't survive reboots
- ElastiCache Patterns
- Helpful for read-heavy applications
- Helpful for compute-intensive workloads
- Lazy Loading
- Load when necessary
- In case of a cache miss, the app reads from the DB and writes to the cache
- Only requested data is cached
- Node failures are not fatal
- 3 round trips in case of cache miss
- Stale data can cause a problem. Can be overcome using TTL (Time to live)
- Write Through
- Add or Update cache when DB is updated
- In case of a write, the app writes to the DB and then to the cache
- No Cache miss
- No stale data
- Every write makes 2 calls
- Unused data is also present in the cache, making the cache huge
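Both patterns can be sketched with in-memory stand-ins. Here a plain dict plays the database and another dict plays the cache, which is purely an assumption for illustration; a real setup would talk to Redis or Memcached through a client library. The class and function names are hypothetical.

```python
import time


class LazyCache:
    """Lazy loading: fill the cache only on a miss, expire entries with a TTL."""

    def __init__(self, db, ttl_seconds=60.0, clock=time.monotonic):
        self.db = db
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0]                 # cache hit
        value = self.db[key]                # cache miss: read from the DB...
        self.store[key] = (value, self.clock() + self.ttl)  # ...then write to the cache
        return value


def write_through(db, cache, key, value):
    """Write through: every write updates the DB first and then the cache."""
    db[key] = value
    cache.store[key] = (value, cache.clock() + cache.ttl)


db = {"user:1": "Ann"}
cache = LazyCache(db, ttl_seconds=30)
print(cache.get("user:1"))                  # miss, loaded from the DB
write_through(db, cache, "user:2", "Bo")    # two calls per write
print(cache.get("user:2"))                  # hit, no DB read needed
```

The TTL is what bounds staleness in the lazy path: a value changed directly in the DB is served from the cache until its entry expires.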
- AWS VPC
- Within Region, VPCs are created
- Each VPC contains subnets
- Each subnet mapped to an AZ
- public subnets and private subnets are present
- Can have many subnets per AZ
- Public Subnets
- Load balancers
- Static Websites
- Files
- Public Authentication Layers
- Private Subnets
- Web application server
- Databases
- Public & Private subnets can communicate if in same VPC
- Can use VPN connection to connect to VPC
- VPC Flow logs allows to monitor traffic within, in & out of VPC
- VPC are per Account per Region
- Subnets are per VPC per AZ
- Some resources can't be deployed in VPC
- Can peer VPC (within or across accounts) to make it look like a same network
- AWS 3 Tier Architecture
- Amazon S3
- Advertised as "infinitely scaling storage"
- Many websites use S3 for integration
- S3 Buckets is a Global Service
- Allows to store objects (files) inside bucket (directory)
- Must have globally unique name
- Defined at Region-level
- Naming Convention
- No uppercase
- No underscore
- 3-63 characters long
- Not an IP
- Must start with lower-case or number
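The naming rules above can be checked with a short validator. This sketch covers only the rules listed in these notes plus the restriction to lower-case letters, digits, dots and hyphens; the official rules contain further constraints that are not modelled, and `is_valid_bucket_name` is a hypothetical helper.

```python
import ipaddress
import re

# Starts with a lower-case letter or digit, 3-63 characters in total
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{2,62}")


def is_valid_bucket_name(name: str) -> bool:
    if not _BUCKET_NAME.fullmatch(name):
        return False  # wrong length, upper-case, underscore, or bad first character
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True   # not formatted as an IP address
    return False      # looks like an IP address, which is not allowed


for candidate in ["my-bucket-01", "My-Bucket", "my_bucket", "ab", "192.168.1.1"]:
    print(candidate, is_valid_bucket_name(candidate))
```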
- Objects have key (full path with '/')
- No concept of directory (only on UI for navigation)
- Max size - 5TB
- For objects larger than 5GB, must use "multi-part upload"
- Object can have Metadata (list of key-value pair)
- Object can have Tags (up to 10 Unicode key-value pairs)
- Object can use Version ID for versioning
- S3 Versioning
- Needs to be enabled at bucket level
- If a file is overwritten, the version increments automatically.
- S3 remembers all versions
- Protects against unintended deletes
- Easy rollbacks to previous versions
- Takes more space
- Unversioned files before enabling versioning have version 'null'
- S3 Encryption
- 4 Methods of encryption
- SSE-S3 : using keys handled & managed by AWS
- Encryption on server-side
- AES-256 used
- Must set header "x-amz-server-side-encryption" : "AES256"
- SSE-KMS : AWS Key Management Service to manage keys
- KMS advantages - User Control + Audit trail
- Encryption on server-side
- Uses KMS CMK (Customer Master Key) to encrypt
- Must set header "x-amz-server-side-encryption" : "aws:kms"
- SSE-C : Manage your own keys
- S3 doesn't store keys
- HTTPS must be used
- key must be provided in headers for every request
- Client Side Encryption
- Client library : Amazon S3 Encryption Client (makes it easy to use)
- Client has to Encrypt & Decrypt the data
- Customer has to manage the keys and encryption/decryption logic
- Encryption in transit (SSL / TLS)
- HTTP endpoints : non encrypted
- HTTPS endpoints : encrypted
- Default Encryption can be enabled in Bucket properties
- S3 Security
- User Based
- IAM policies
- which API calls should be allowed for a specific user from IAM console
- Resource Based
- Bucket Policies
- Bucket wide rules from S3 console
- Allows cross account
- JSON based policies
- Resources - Buckets & Objects
- Actions - Set of API to Allow or Deny
- Effect - Allow / Deny
- Principal - Account or User to apply the policy
- Used to grant public access to the bucket
- Force objects to be encrypted at upload
- Grant access to another account (cross account)
- Use Policy Generator to generate complex JSON policies
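The four policy elements (Resource, Action, Effect, Principal) and the "force encryption at upload" use case can be combined in one example policy. The sketch below builds it as a Python dict and prints the JSON; the bucket name is hypothetical and the statement follows a commonly documented pattern, so treat it as a starting point to verify rather than a finished policy.

```python
import json


def deny_unencrypted_uploads(bucket_name: str) -> dict:
    """Bucket policy that rejects PutObject requests not asking for SSE-S3 (AES256)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",                      # Effect: Allow / Deny
                "Principal": "*",                      # Principal: applies to everyone
                "Action": "s3:PutObject",              # Action: the API call concerned
                "Resource": f"arn:aws:s3:::{bucket_name}/*",  # Resource: objects in the bucket
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "AES256"
                    }
                },
            }
        ],
    }


print(json.dumps(deny_unencrypted_uploads("example-bucket"), indent=2))
```

The condition key mirrors the "x-amz-server-side-encryption" request header from the encryption section: uploads that do not set it to AES256 are denied.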
- Object Access Control List (ACL)
- Finer Control
- Bucket Access Control List (ACL)
- Less Common
- Other Security concepts
- Networking
- Supports VPC Endpoints (access from a VPC without going over the public internet)
- Logging & Audit
- S3 access logs can be stored in another bucket
- access logs should not be stored in the same bucket to avoid recursion
- API calls can be logged in AWS CloudTrail
- User Security
- MFA can be enabled in versioned buckets to delete objects
- Signed URLs : URLs valid for limited time (ex : premium video service for logged in users)