AWS Certified Developer Associate Notes

  • Regions & Availability Zones
    • Region > Availability Zone (AZ)
  • IAM (Identity & Access Management)
    • Globally accessible service (not Region or AZ specific)
    • Users
      • Assigned to people
    • Groups
      • group of people by Function, Team, etc.
    • Roles
      • Assigned to machines or application (internal use)
    • IAM policies written in JSON
    • MFA (Multi Factor Authentication) can be setup
    • IAM has predefined policies
    • IAM Federation
      • To enable AWS login using enterprise credentials
    • Never share or write IAM credentials in code
    • Never use ROOT account
    • Complete security checks to secure the account
      • Delete root access keys
      • Activate MFA for root account (Google Authenticator)
      • Create individual IAM users
        • Create own user with Administrator Access policy
      • Use groups to assign permissions
        • Create 'Admin' group
        • Assign Administrator Access policy to it.
        • Add own user to this group.
        • Detach the individually assigned Administrator Access policy from own user.
      • Apply IAM password policy
        • At least 1 upper-case
        • At least 1 lower-case
        • At least 1 number
        • At least 1 non-alphanumeric
        • Allow users to change their password
        • Enable password expiration with an expiration duration
        • Prevent password reuse across the previous x passwords
        • Password expiration requires administrator reset
    • Access account using IAM Admin user, not the root user
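
The password policy rules listed above can be sketched as a small checker. This is only an illustration of the rules in these notes, not how IAM evaluates passwords internally; the function name and the minimum length of 8 are assumptions.

```python
def password_violations(password: str, min_length: int = 8) -> list:
    """Return the policy rules the password breaks (empty list if compliant).

    min_length is an assumed value; the notes do not state one.
    """
    checks = {
        "at least 1 upper-case letter": any(c.isupper() for c in password),
        "at least 1 lower-case letter": any(c.islower() for c in password),
        "at least 1 number": any(c.isdigit() for c in password),
        "at least 1 non-alphanumeric character": any(not c.isalnum() for c in password),
        f"at least {min_length} characters": len(password) >= min_length,
    }
    # Keep only the rules whose check failed
    return [rule for rule, ok in checks.items() if not ok]


print(password_violations("Abcdef1!"))
print(password_violations("abcdef1!"))
```

A password is acceptable only when the returned list is empty, which makes it easy to show the user every rule that failed at once.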
  • EC2
    • Capabilities
      • Renting VMs (EC2)
      • Storing data on virtual drives (EBS)
      • Distributing load across machines (ELB)
      • Scaling services using auto-scaling group (ASG)
    • Launch EC2 instance
      • Choose AMI (Amazon Machine Image)
      • Choose Instance type (t2.micro)
      • Configure instance details
        • No. of instances (For ASG)
        • Choose VPC
        • Choose Subnet
        • Assign Public IP
        • Select Placement Group
        • Select Capacity Reservation
        • Choose IAM Role
        • Set Shutdown behaviour
        • Enable termination protection
        • Enable Cloudwatch monitoring
        • Decide Tenancy
      • Add Storage
      • Set Tags
      • Configure Security Groups
        • Decide ports and IPs to allow access
      • Review & Launch
      • Select Key-Pairs
        • Save the .pem file securely
    • SSH into EC2 instance
      • Use Putty
      • Convert .pem to .ppk using PuttyGen
      • Login to instance using public IP and .ppk file (SSH > Auth)
      • Change permission of .pem file to 0400 on Linux
    • Security Groups (Details)
      • Acts like Firewall outside EC2 instance
      • Controls the inbound and outbound traffic of EC2 instances
      • If the issue is timeout-related, better check the security group rules
      • If the issue is 'connection-refused', then it's an application-level issue
      • EC2 instance and Security Group has Many-To-Many relation
      • All inbound traffic is blocked by default
      • All outbound traffic is authorised by default
      • Traffic from Security Groups as a source can also be allowed instead of specific IPs (Useful in case of load balancers)
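
The "inbound blocked by default, allow by rule" behaviour above can be modelled in a few lines. This is a simplified sketch: real security groups also match on protocol and can reference other security groups as a source. The rule tuples and the IP ranges (documentation ranges) are hypothetical.

```python
import ipaddress


def inbound_allowed(rules, source_ip, port):
    """rules: list of (cidr, from_port, to_port) tuples.

    Security groups only contain allow rules, so traffic that matches
    no rule is denied.
    """
    src = ipaddress.ip_address(source_ip)
    return any(
        src in ipaddress.ip_network(cidr) and lo <= port <= hi
        for cidr, lo, hi in rules
    )


rules = [
    ("0.0.0.0/0", 80, 80),       # HTTP from anywhere
    ("203.0.113.0/24", 22, 22),  # SSH only from a hypothetical office range
]

print(inbound_allowed(rules, "198.51.100.7", 80))  # HTTP from any address
print(inbound_allowed(rules, "198.51.100.7", 22))  # SSH from outside the office range
```

A request that matches no rule is silently dropped, which is why a misconfigured security group usually shows up as a timeout rather than 'connection-refused'.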
    • Private vs Public vs Elastic IP
      • Public IP
        • Accessible to internet
        • Changes after the instance is stopped and started again
      • Private IP
        • Accessible inside private network
        • Retained across instance restart
      • Elastic IP (Billed)
        • AWS IP to retain across EC2 instance restart
        • Can mask failure by remapping the Elastic IP to a different instance
    • EC2 User Data
      • Runs a script only once, on the instance's first start
      • Can do ANYTHING (More you do, startup time increases)
      • User Data Script runs with ROOT user
      • Add User Data Script while creating EC2 instance under Advanced Details on 'Configure Instance Details' page
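
As a rough illustration of a User Data script, the sketch below prepares one the way the EC2 API expects it (base64-encoded; the console and most SDKs do this encoding for you). The helper name is hypothetical, and the script body assumes an Amazon Linux style instance with yum.

```python
import base64

USER_DATA_LIMIT_BYTES = 16 * 1024  # EC2 limit on raw (pre-encoding) user data


def encode_user_data(script: str) -> str:
    """Validate a shell user data script and return it base64-encoded."""
    raw = script.encode("utf-8")
    if not script.startswith("#!"):
        raise ValueError("a shell user data script should start with a shebang line")
    if len(raw) > USER_DATA_LIMIT_BYTES:
        raise ValueError("user data exceeds the 16 KB limit")
    return base64.b64encode(raw).decode("ascii")


# Hypothetical bootstrap script: runs as root on first boot only
script = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
"""

encoded = encode_user_data(script)
print(encoded[:24] + "...")
```

Because the script runs as root, no sudo is needed inside it; keeping it short also keeps instance startup time down.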
    • EC2 Launch Types
      • On Demand Instances
        • Short Workload
        • Predictable Pricing
        • Pay for what you use (per second billing)
        • Highest cost
        • No upfront payment
        • No long term commitment
      • Reserved Instances
        • Long Workload (>= 1 year)
        • Up to 75% discount compared to On-Demand
        • Pay upfront for long term commitment
        • Reserve for 1 - 3 years
        • Reserve a specific instance type (Eg.: x4-large)
      • Convertible Reserved Instances
        • Long Workload with flexible instances
        • Can change instance type
        • Up to 54% discount compared to On-Demand
      • Scheduled Reserved Instances
        • Launch within reserved time window
        • Use when you need (Day, Week, Month) (Eg.: Every Sat-Sun)
      • Spot Instances
        • Short Workload
        • Cheap
        • Can lose instances
        • Up to 90% discount compared to On-Demand
        • Get the instance by bidding
        • Use the instance till bid amount is above the spot price
        • Price depends on demand and offers
        • Instance is reclaimed with a 2-minute notification once the spot price rises above the bid amount
        • Typically used for Batch Jobs, Big Data Analysis which are resilient to failures
      • Dedicated Instances
        • No other customer will share hardware
        • May share hardware with other instances in the same account
      • Dedicated Hosts
        • Book entire physical server
        • Control instance placement
        • Visibility to underlying socket, processor cores, hardware, etc.
        • Allocated for a 3 year period
        • More expensive
        • Useful in cases of Complicated Licensing model or strict company compliance policies
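
To make the discount figures above concrete, here is a small arithmetic sketch comparing yearly cost per launch type. It treats the quoted percentages as best-case discounts and uses a hypothetical On-Demand price of $0.10/hour; real prices vary by Region, instance type and term.

```python
HOURS_PER_YEAR = 24 * 365

# Best-case discounts versus On-Demand, taken from the notes above
DISCOUNTS = {
    "on-demand": 0.00,
    "reserved": 0.75,
    "convertible-reserved": 0.54,
    "spot": 0.90,
}


def yearly_cost(on_demand_hourly: float, launch_type: str) -> float:
    """Cost of running one instance 24x7 for a year, rounded to cents."""
    rate = on_demand_hourly * (1 - DISCOUNTS[launch_type])
    return round(rate * HOURS_PER_YEAR, 2)


for launch_type in DISCOUNTS:
    print(launch_type, yearly_cost(0.10, launch_type))
```

The comparison only holds for an instance that runs all year; for short workloads On-Demand or Spot can still be cheaper than a reservation that sits idle.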
    • Other important details
      • EC2 Pricing (Only if the instance is running)
        • Per Hour Pricing depending on
          • Region (Mumbai)
          • Instance Type (t2.micro)
          • Launch Type (On-Demand)
          • OS (Linux)
        • Per second billing (after 60 seconds)
        • Other chargeable factors
          • Storage
          • Data Transfer
          • Fixed IP
          • Load Balancing
      • AMI (Amazon Machine Images)
        • Predefined
          • Ubuntu
          • Fedora
          • RedHat
          • Windows
          • Etc.
        • Can be customised using EC2 User Data Scripts
        • Custom AMIs can be used
          • Pre-installed packages
          • Faster Boot time (No User Data Scripts)
          • Pre installed monitoring and network tools for enterprise
          • Control maintenance and updates over time
          • Configure LDAP out-of-the-box
          • Install application before machine boot on all machines (During auto-scaling)
          • Someone else's exported AMI
        • Custom AMIs are region specific
      • EC2 Instances Characteristics
        • RAM (Type, Amount, Generation)
        • CPU (Cores, Type, Make, Frequency, Generation)
        • I/O (Disk Performance, EBS optimisations)
        • Network (Bandwidth, Latency)
        • GPU (Present or Not?)
        • Permutations result in more than 50 instance types
        • Summary : https://ec2instances.info
        • R/C/P/G/H/X/I/F/Z/CR - Best in their characteristic
        • M - Balanced (Good overall, Great in nothing)
        • T2/T3
          • OK CPU overall
          • On a processing spike, the CPU bursts (VERY GOOD processing)
          • In burst mode, uses "burst credits"
          • If all credits are used, CPU becomes BAD
          • If the machine stops bursting, credits accumulate over time
        • T2 Unlimited
          • Unlimited credits for high cost
      • EC2 Checklist
        • SSH to EC2
        • .pem to .ppk (PuTTY)
        • change permission of .pem (0400 on Linux)
        • use security group properly
        • public vs private vs elastic IP
        • User Data Scripts to customize instance
        • Custom AMI to enhance OS
        • EC2 billing is per second (after 60 seconds)
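
The "per second billing (after 60 seconds)" rule can be read as a 60-second minimum charge. A minimal sketch of that arithmetic, assuming a hypothetical hourly rate and ignoring storage, data transfer and other chargeable factors:

```python
def on_demand_charge(hourly_rate: float, seconds_running: int) -> float:
    """Charge for one run of an instance billed per second with a 60 s minimum."""
    if seconds_running <= 0:
        return 0.0
    billable_seconds = max(60, seconds_running)  # first minute is always charged in full
    return hourly_rate * billable_seconds / 3600


# With a hypothetical rate of $3.60/hour, every second costs $0.001
print(on_demand_charge(3.60, 10))    # charged as 60 seconds
print(on_demand_charge(3.60, 90))    # charged as 90 seconds
```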
  • Load Balancing & Auto Scaling
    • Load Balancer
      • Forwards user traffic to multiple instances
      • Spreads load across multiple downstream instances
      • Handle failures of application instances
      • Regular health checks of instances
      • SSL termination (No need to have SSL certificates on instances)
      • Stickiness with cookies (Ensures all requests from a user session are forwarded to the same instance)
      • High availability across zones (if an AZ fails)
      • Separate public traffic from private traffic
      • EC2 Load Balancer
        • Uptime guarantee
        • AWS handles Upgrades, maintenance, high availability
        • Provides only a few configuration options
        • Costs more than running your own load balancer, but takes far less effort
        • integrated with many AWS services
      • Types of Load balancer (Internal / External)
        • Classic (V1 - Old - 2009)
          • Checks for health by hitting API (/health) and expects 200 OK response
        • Application (V2 - New - 2016) (Recommended)
          • Layer 7 (HTTP level)
          • Balance multiple HTTP apps across machines
          • Balance multiple apps on same machine
          • Balance based on route in URL
          • Balance based on Hostname in URL
          • Great for microservices and container-based apps (Docker, Amazon ECS)
          • Port mapping feature to redirect to dynamic port
          • Can have multiple Target Groups
          • Each target group can have multiple EC2 instances and a health checkup logic
          • Load Balancer redirects traffic from HTTP to target group
          • Stickiness can be enabled at the target group level
            • stickiness cookie is generated at ALB, not application
          • ALB supports HTTP/HTTPS
          • Client IP invisible to server
          • Client IP is present at 'X-Forwarded-For'
          • Port at 'X-Forwarded-Port'
          • Protocol at 'X-Forwarded-Proto'
        • Network (V2 - New - 2017)
          • Layer 4 (TCP Traffic)
          • Handle millions of req/sec
          • Supports static/elastic IP
          • Less Latency - 100ms (vs 400ms for ALB)
          • Used for extreme performance (Not commonly used)
        • Load Balancers have static hostname
          • Do not use underlying IP
        • Load balancers can scale but not instantaneously (Contact AWS for 'warm-up')
        • With NLB, the instance can see the client IP
        • If the LB can't connect to the app, check the security group
        • A 503 error means a capacity issue or no registered target
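
Behind an ALB the application sees the load balancer as its peer, so the original client details have to be read from the X-Forwarded-* headers mentioned above. A minimal sketch, assuming the headers arrive as a plain dict with these exact key names (real frameworks usually normalise header case); the function name is hypothetical.

```python
def original_request(headers: dict, peer_ip: str) -> dict:
    """Recover client IP, port and protocol from the headers an ALB adds."""
    forwarded_for = headers.get("X-Forwarded-For", "")
    # The header may hold a comma-separated chain; the left-most entry is the
    # address reported for the original client (it can be spoofed by clients).
    client_ip = forwarded_for.split(",")[0].strip() or peer_ip
    port = headers.get("X-Forwarded-Port")
    return {
        "client_ip": client_ip,
        "port": int(port) if port else None,
        "protocol": headers.get("X-Forwarded-Proto"),
    }


headers = {
    "X-Forwarded-For": "203.0.113.9, 10.0.1.15",
    "X-Forwarded-Port": "443",
    "X-Forwarded-Proto": "https",
}
print(original_request(headers, peer_ip="10.0.1.15"))
```

When no forwarding headers are present (for example, a direct connection or an NLB), the sketch falls back to the peer address.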
    • Auto Scaling Group
      • Scale out to match increased load (Add instances)
      • Scale in to match decreased load (Remove instances)
      • Define minimum & maximum number of instances for scaling
      • Automatically register instances
      • ASG Attributes
        • Launch Configuration
          • AMI + Instance Type
          • EC2 User Data
          • EBS Volume
          • Security Group
          • SSH Key pair
        • Min Size
        • Max Size
        • Initial Capacity
        • Network & Subnet info
        • Load Balancer info
        • Scaling policy
      • Auto Scaling Alarm
        • ASG scales based on CloudWatch Alarms
        • Alarm monitors a metric (Eg: Average CPU)
        • Metrics are computed for overall ASG instances
        • Scale based on Alarms
      • Metrics
        • Target Average CPU Usage
        • Number of requests on the ELB per instance
        • Avg Network In
        • Avg Network Out
      • Custom Metrics (Eg. : No. of connected users)
        • Send custom metric from app to CloudWatch (PutMetricData API)
        • Create CloudWatch alarm to react to high/low values
        • Use CloudWatch alarm as scaling policy
      • IAM roles attached to an ASG get assigned to its EC2 instances
      • ASGs are FREE
      • ASG will launch replacement instances if any get terminated for whatever reason
      • ASG terminates instances marked as unhealthy by the LB and replaces them by creating new instances
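
How a scaling policy turns a metric such as average CPU into an instance count can be sketched with a simple proportional rule. This is an illustrative approximation, not the algorithm AWS uses internally; the function and parameter names are assumptions.

```python
import math


def desired_capacity(current: int, avg_metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Scale the group so the per-instance metric moves toward the target.

    The result is always clamped to the ASG's min and max size.
    """
    if current <= 0 or target <= 0:
        return min_size
    proposed = math.ceil(current * avg_metric / target)
    return max(min_size, min(max_size, proposed))


# 4 instances at 80% average CPU with a 40% target: scale out to 8
print(desired_capacity(4, 80.0, 40.0, min_size=1, max_size=10))
# 4 instances at 10% average CPU: scale in, but never below min_size
print(desired_capacity(4, 10.0, 40.0, min_size=2, max_size=10))
```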
    • EBS Volume
      • EC2 instance loses root volume after termination
      • EBS Volume - network drive to persist data (not physical drive)
      • Reattach to different instance
      • Experience little latency
      • Volumes are AZ specific (to move across AZ, take snapshot)
      • Get billed for entire capacity
      • Can increase the capacity later
      • Can attach multiple EBS volumes to the same instance
      • Volume Types
        • GP2 (SSD - General purpose)
        • IO1 (SSD - Highest perf., low latency, high throughput)
        • ST1 (HDD - Low cost, throughput intensive)
        • SC1 (HDD - Lowest cost)
      • EBS Volumes Characteristics
        • Size
        • Throughput
        • IOPS (I/O operations per sec.)
      • EBS volumes size can be increased (repartition after resize)
        • Size
        • IOPS (IO1 Only)
      • EBS Snapshot
        • Uses actual used space, not entire volume space
        • Used for
          • Backup
          • Resizing
          • Change volume type
          • Encrypt volume
          • Migration between AZs
      • EBS Encryption
        • Data at rest is encrypted
        • Data in flight is also encrypted
        • Snapshots are also encrypted
        • All volumes created from an encrypted snapshot are also encrypted
        • Minimal impact on latency
        • Uses KMS keys (AES-256) for encryption
        • Copying an unencrypted snapshot allows encryption
      • EBS vs. Instance Store
        • Some instances don't have a root EBS volume, but have an Instance Store instead
        • Instance Store is physically attached to instance
        • Better I/O perf.
        • Data loss on termination
        • Can't be resized
        • Backups to be done by users
      • EBS backups shouldn't be done while the application is in use (I/O intensive)
      • Root EBS volumes get terminated on instance termination (this can be disabled)
  • Route 53
    • Managed DNS
    • DNS Records
      • A - URL to IPv4
      • AAAA - URL to IPv6
      • CNAME - URL to URL
      • Alias - URL to AWS Resource
    • Can use Public Domain (owned or purchased) or private domain resolvable in your VPC
    • Load Balancing through DNS (client side load balancing)
    • Limited Health checks
    • Routing Policy
      • Simple
      • Failover
      • GeoLocation
      • GeoProximity
      • Latency
      • Weighted
    • Prefer Alias over CNAME for AWS resource (better performance)
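
The Weighted routing policy listed above can be pictured as a weighted random choice between records. The sketch below only illustrates the idea on the resolver side; the record values are hypothetical documentation addresses and the function name is an assumption.

```python
import random


def pick_weighted(records: dict, rng=random):
    """records maps a target (e.g. an IP) to its weight.

    Each target is returned with probability weight / sum(weights);
    a weight of 0 means the target never receives traffic.
    """
    targets = list(records)
    weights = [records[t] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


records = {"192.0.2.10": 70, "192.0.2.20": 30}
rng = random.Random(42)  # seeded only to make the demo repeatable
sample = [pick_weighted(records, rng) for _ in range(10_000)]
print(sample.count("192.0.2.10") / len(sample))
```

Over many lookups roughly 70% of answers point at the first record, which is how weighted records are used to shift a fraction of traffic to a new deployment.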
  • RDS (Relational Database Service)
    • Managed DB service
    • Uses SQL language
    • Supported DBs
      • Postgres
      • Oracle
      • MySQL
      • MariaDB
      • Microsoft SQL Server
      • Aurora (AWS Proprietary DB)
    • Continuous backups and restore
    • Monitoring dashboards
    • Read replicas for better performance
      • Up to 5 read replicas
      • Within AZ
      • Cross AZ
      • Cross Region
      • Replication is ASYNC (reads are eventually consistent)
      • Application needs to change the connection string to leverage read replicas
      • Application needs to add connection string for all replicas
    • Multi AZ setup (Disaster Recovery)
      • Replication is SYNC from Master to Standby instance of different AZ
      • Only one DNS name (Automatic failover to standby)
      • Increase availability
      • Failover during loss of AZ, network, instance or storage
      • No manual intervention in application
      • Not suitable for scaling
    • Maintenance windows for upgrades
    • Scaling capability (vertical & horizontal)
    • Can't SSH to RDS instances
    • Backups
      • Automatically enabled
      • Daily full snapshot
      • Capture transaction logs in real time
      • Can restore to any point in time
      • 7 days retention by default (can be increased to 35)
      • Can manually trigger snapshot
      • Manual snapshot can be retained for as long as we want
    • RDS Encryption
      • Encryption at rest with AWS KMS - AES-256
      • SSL certificates to encrypt in-flight data
        • To Enforce SSL
          • Postgres : rds.force_ssl = 1 in AWS RDS Console
          • MySQL : GRANT USAGE ON *.* TO 'mysqluser'@'%' REQUIRE SSL;
        • To Connect using SSL
          • Provide SSL Trust Cert (download from AWS)
          • Provide SSL option while connecting to the DB
    • RDS security
      • Usually deployed in private subnets, not in public
      • Uses Security Groups
      • IAM policies control who manages RDS
      • Username/Password can be used to login
      • IAM users can be used too (MySQL/Aurora)
    • RDS vs Aurora
      • Aurora is proprietary technology by AWS (not open-source)
      • Postgres & MySQL are supported by Aurora (Same Drivers)
      • AWS cloud optimized
        • 5x performance over MySQL
        • 3x performance over Postgres
      • Storage grows from 10GB to 64TB automatically
      • Aurora can have 15 replicas (MySQL - 5)
      • Aurora failover is instantaneous
      • Aurora costs about 20% more than RDS, but is more efficient
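
Since each read replica has its own endpoint, the application has to decide which connection string to use per query. A minimal sketch of that routing, assuming hypothetical endpoint names and a deliberately naive "SELECT means read" rule:

```python
import itertools


class EndpointRouter:
    """Send writes to the primary and spread reads across read replicas."""

    def __init__(self, primary: str, replicas: list):
        self.primary = primary
        # Round-robin over replicas; None when there are no replicas
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = EndpointRouter(
    primary="mydb.example.internal",                      # hypothetical endpoints
    replicas=["mydb-ro-1.example.internal", "mydb-ro-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
```

Because replication is ASYNC, a read sent to a replica right after a write may not see that write yet; reads that must be fresh should go to the primary.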
  • ElastiCache
    • Managed Redis or Memcached
    • Cache are in-memory databases (high performance, low latency)
    • Reduces load off of DB for read intensive apps
    • Helps to maintain stateless application
    • Write scaling capability using sharding
    • Read scaling capability using read replicas
    • Multi AZ with failover capability
    • AWS takes care of maintenance, optimization, setup, config, monitoring, failover recovery, backup
    • Invalidation strategy is decided by application
    • Can also be used to manage session across multiple application instances
    • Redis
      • In-memory key-value store
      • Super low latency (sub ms)
      • Cache survives reboots (persistence)
      • Handles following
        • User sessions
        • Leaderboard
        • Distributed states
        • Relieve pressure on DB
        • Pub / Sub capability for messages
      • Multi AZ with auto failover
      • Supports read replica
    • Memcached
      • In memory object-store
      • Doesn't survive reboots
    • ElastiCache Patterns
      • Helpful for read-heavy applications
      • Helpful for compute-intensive workloads
      • Lazy Loading
        • Load when necessary
        • In case of Cache miss, app read from DB and writes to Cache
        • Only requested data is cached
        • Node failures are not fatal
        • 3 round trips in case of cache miss
        • Stale data can cause a problem. Can be overcome using TTL (Time to live)
      • Write Through
        • Add or Update cache when DB is updated
        • In case of Write, app writes to DB and then to cache
        • No Cache miss
        • No stale data
        • Every write makes 2 calls
        • Unused data is also kept in the cache, so the cache can grow large
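
The two caching patterns above can be sketched with plain dictionaries standing in for the cache and the database. The class and attribute names are hypothetical, and a real setup would talk to Redis or Memcached over the network instead.

```python
import time


class CachedStore:
    """Lazy loading with TTL on reads, write-through on writes."""

    def __init__(self, db: dict, ttl_seconds: float, clock=time.monotonic):
        self.db = db
        self.ttl = ttl_seconds
        self.clock = clock
        self._cache = {}   # key -> (value, expires_at)
        self.db_reads = 0  # counts cache misses that reached the DB

    def get(self, key):
        entry = self._cache.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0]                      # cache hit
        self.db_reads += 1
        value = self.db[key]                     # cache miss: read from DB...
        self._cache[key] = (value, self.clock() + self.ttl)  # ...then fill cache
        return value

    def write_through(self, key, value):
        self.db[key] = value                     # write to DB first
        self._cache[key] = (value, self.clock() + self.ttl)  # then update cache


store = CachedStore({"user:1": "Ada"}, ttl_seconds=60)
print(store.get("user:1"), store.db_reads)  # first read is a miss
print(store.get("user:1"), store.db_reads)  # second read is served from cache
```

Combining both patterns is common: write-through keeps hot keys fresh, while the TTL bounds how long data written by other paths can stay stale.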
  • AWS VPC
    • Within Region, VPCs are created
    • Each VPC contains subnets
    • Each subnet mapped to an AZ
    • public subnets and private subnets are present
    • Can have many subnets per AZ
    • Public Subnets
      • Load balancers
      • Static Websites
      • Files
      • Public Authentication Layers
    • Private Subnets
      • Web application server
      • Databases
    • Public & Private subnets can communicate if in same VPC
    • Can use VPN connection to connect to VPC
    • VPC Flow logs allows to monitor traffic within, in & out of VPC
    • VPC are per Account per Region
    • Subnets are per VPC per AZ
    • Some resources can't be deployed in VPC
    • Can peer VPC (within or across accounts) to make it look like a same network
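
The VPC, AZ and subnet relationship described above can be illustrated by carving a VPC address range into one public and one private subnet per AZ. The CIDR block, the /24 subnet size and the AZ names are assumptions chosen for the example.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, azs: list, new_prefix: int = 24) -> list:
    """Assign consecutive, non-overlapping subnets: public + private per AZ."""
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = []
    for az in azs:
        for tier in ("public", "private"):
            plan.append({"az": az, "tier": tier, "cidr": str(next(blocks))})
    return plan


for subnet in plan_subnets("10.0.0.0/16", ["ap-south-1a", "ap-south-1b"]):
    print(subnet)
```

Each subnet lives in exactly one AZ, so spreading the same tiers over at least two AZs is what lets load balancers and Multi AZ databases survive the loss of one zone.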
  • AWS 3 Tier Architecture (diagram)
  • Amazon S3
    • Advertised as "infinitely scaling storage"
    • Many websites use S3 for integration
    • S3 console shows buckets from all Regions (global view)
    • Allows to store objects (files) inside bucket (directory)
    • Must have globally unique name
    • Defined at Region-level
    • Naming Convention
      • No uppercase
      • No underscore
      • 3-63 characters long
      • Not an IP
      • Must start with lower-case or number
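
The naming convention above translates into a short validator. It covers only the rules listed in these notes plus the allowed character set (lower-case letters, digits, dots and hyphens); AWS applies a few further restrictions that are not checked here. The function name is hypothetical.

```python
import ipaddress
import re

# Starts with a lower-case letter or digit, 3-63 characters in total,
# no upper-case letters and no underscores
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{2,62}")


def is_valid_bucket_name(name: str) -> bool:
    if not _BUCKET_NAME_RE.fullmatch(name):
        return False
    try:
        ipaddress.IPv4Address(name)  # names formatted like an IP are rejected
        return False
    except ValueError:
        return True


for candidate in ["my-bucket-01", "My-Bucket", "my_bucket", "ab", "192.168.1.1"]:
    print(candidate, is_valid_bucket_name(candidate))
```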
    • Objects have key (full path with '/')
    • No concept of directory (only on UI for navigation)
    • Max object size - 5TB
    • Objects larger than 5GB must use "multi-part upload"
    • Object can have Metadata (list of key-value pair)
    • Object can have Tags (Up to 10 Unicode key-value pairs)
    • Object can use Version ID for versioning
    • S3 Versioning
      • Needs to be enabled at bucket level
      • If a file is over-written, a new version is created automatically.
      • S3 remembers all versions
      • Protects against unintended deletes
      • Easy rollbacks to previous versions
      • Takes more space
      • Unversioned files before enabling versioning have version 'null'
    • S3 Encryption
      • 4 Methods of encryption
        • SSE-S3 : using keys handled & managed by AWS
          • Encryption on server-side
          • AES-256 used
          • Must set header "x-amz-server-side-encryption" : "AES256"
        • SSE-KMS : AWS Key Management Service to manage keys
          • KMS advantages - User Control + Audit trail
          • Encryption on server-side
          • Uses KMS CMK (Customer Master Key) to encrypt
          • Must set header "x-amz-server-side-encryption" : "aws:kms"
        • SSE-C : Manage your own keys
          • S3 doesn't store keys
          • HTTPS must be used
          • key must be provided in headers for every request
        • Client Side Encryption
          • Client library : Amazon S3 Encryption Client (makes it easy to use)
          • Client has to Encrypt & Decrypt the data
          • Customer has to manage the keys and encryption/decryption logic
      • Encryption in transit (SSL / TLS)
        • HTTP endpoints : non encrypted
        • HTTPS endpoints : encrypted
      • Default Encryption can be enabled in Bucket properties
    • S3 Security
      • User Based
        • IAM policies
          • which API calls should be allowed for a specific user from IAM console
      • Resource Based
        • Bucket Policies
          • Bucket wide rules from S3 console
          • Allows cross account
          • JSON based policies
            • Resources - Buckets & Objects
            • Actions - Set of API to Allow or Deny
            • Effect - Allow / Deny
            • Principal - Account or User to apply the policy
          • Used to grant public access to the bucket
          • Force objects to be encrypted at upload
          • Grant access to another account (cross account)
          • Use Policy Generator to generate complex JSON policies
        • Object Access Control List (ACL)
          • Finer Control
        • Bucket Access Control List (ACL)
          • Less Common
      • Other Security concepts
        • Networking
          • Supports VPC Endpoints (access without going over the public internet)
        • Logging & Audit
          • S3 access logs can be stored in another bucket
          • access logs should not be stored in the same bucket to avoid recursion
          • API calls can be logged in AWS CloudTrail
        • User Security
          • MFA Delete can be enabled on versioned buckets to require MFA when deleting object versions
          • Signed URLs : URLs valid for limited time (ex : premium video service for logged in users)
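
As an illustration of a JSON bucket policy with the Resource, Action, Effect and Principal elements described above, the sketch below builds a policy that forces SSE-S3 encryption at upload by denying any PutObject request that does not carry the "x-amz-server-side-encryption: AES256" header. The helper name and the bucket name are hypothetical.

```python
import json


def deny_unencrypted_uploads_policy(bucket_name: str) -> dict:
    """Bucket policy that rejects uploads not requesting SSE-S3 (AES256)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",                      # applies to everyone
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "AES256"
                    }
                },
            }
        ],
    }


policy = deny_unencrypted_uploads_policy("example-notes-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON is what would be pasted into the bucket policy editor in the S3 console; an explicit Deny always wins over any Allow granted elsewhere.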