TakingOpsToNextLevel Summit 2014

Taking AWS Operations to the Next Level


Welcome!
• Take a seat!
• Meet your neighbors!
• We’ll be starting a bit after 9.

Today:
• 9am-5pm
   • Hour break for lunch 11:30-12:30
   • Bio breaks during labs are cool too
• If we get done early? To the bar/tables!
• Some presentation, some hands-on labs
• Ask questions, but let’s not get too sidetracked
• Everyone is here to learn and level up
   • Mixed backgrounds, experience, titles, companies, industries, countries

Welcome to the AWS Summit 2014
Me: Chris Munns
• AWS Solutions Architect based in NYC
• Previously Senior Sys-Ops Engineer at a number of places like Etsy and Meetup

Also:  (today’s helpers)

Today’s Goals: Mine:
• Have everyone here learn something new and feel confident enough in that knowledge to take action on it when you return to your normal day to day. That action should make your infrastructure (and hence life and business) better.

Today’s Goals: Yours:
• Understand the power of AWS CloudFormation to create, provision, manage, and update your infrastructure on AWS
• Use host based application configuration management tools and methodology to manage the systems and applications living inside your infrastructure
• Combine the above with the Amazon Simple Workflow Service, service APIs and other tools, to automate routine tasks in your infrastructure

Sound good?

https://secure.flickr.com/photos/stevendepolo/5749192025/

Taking AWS Operations to the Next Level


Operations: “Systems Operations refers to a team, or possibly even a department within the IT group, which is responsible for the running of the centralized systems and networks. “ – http://www.yourwindow.to/information-security/gl_systemsoperations.htm

Operations:
• Creating, modifying, provisioning, updating systems, software and networks
• Perform day to day tasks to keep the infrastructure:
   • Available?
   • Growing?
   • Scaling?
   • Performing?
   • Secure?

Operations:
• Creating, modifying, provisioning, updating systems, software and networks
• Perform day to day tasks to keep the infrastructure: working the way the business needs, or will need, for the business to meet or exceed its goals

Operations: HELPING THE BUSINESS MAKE MONEY

Next Level: How many RPG (role playing game) fans do we have in the room?

http://diablo3crack.com/images/Demon-Hunter-Level-Up-Diablo-3.png

Next Level: In RPGs, reaching the next level (leveling up) can mean you are now better and more capable at what you were doing before
• Be Smarter
• Be Stronger
• Be Faster
• Kill monsters/bad guys better. Win.

Next Level: In operations on AWS, reaching the next level (leveling up) can mean you are now better and more capable at what you were just doing before
• Better use of APIs & automation (Be Smarter)
• Better availability/uptime (Be Stronger)
• Better agility as an organization (Be Faster)
• AWS gives you the tools (to kill those infrastructure bad guys)

http://findingagap.files.wordpress.com/2012/08/level-up.jpg

Today’s Agenda: Cover three tools that will help you to take things to the next level:
• AWS CloudFormation
• Host based configuration management with Chef + AWS OpsWorks
• Amazon Simple Workflow Service + APIs

Today’s Agenda: Going to be covering this from a Linux/open source perspective. ( sorry Windows d00ds!) The big picture is around automation and ease of deployment These are just some examples, but lots of flexibility and options

Today’s Agenda: We’re going to build out a highly available, scalable, and self-healing WordPress site. It will live inside a VPC, be served via ELB, make use of OpsWorks for deployment and management of the web tier, and use ElastiCache and RDS for caching and database. We’ll also use SWF later to help with some of the self-healing aspects of our infrastructure.

Our Infrastructure
[Architecture diagram: a Virtual Private Cloud in AWS US-West-2 spanning Availability Zones A, B and C. Each zone shows a public VPC subnet, a private VPC subnet, an ELB and a web instance. Also shown: RDS DB instance primary and standby (Multi-AZ), ElastiCache, NAT instance, SWF worker instance, leap/bastion instance and an Internet gateway. Supporting services: Amazon CloudWatch, Amazon SNS, Amazon SQS, Amazon S3, Amazon SWF, AWS CloudFormation, AWS OpsWorks and the EC2 API.]

AWS CloudFormation

AWS CloudFormation
• AWS CloudFormation gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion
• First released in 2011

AWS CloudFormation Templates to describe the AWS resources and any associated dependencies or runtime parameters required to run your application

AWS CloudFormation You don’t need to figure out the order in which AWS services need to be provisioned or the subtleties of how to make those dependencies work
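One way to picture the ordering point above: a template already states which resource refers to which, so a creation order can be derived from those references. The sketch below is only an illustration in plain Python, not how CloudFormation is implemented. The resource names (`Vpc`, `Subnet`, `WebInstance`) and the helper functions are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def find_refs(node, names):
    """Collect logical resource names referenced via Ref or Fn::GetAtt."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Ref" and isinstance(value, str) and value in names:
                found.add(value)
            elif (key == "Fn::GetAtt" and isinstance(value, list) and value
                  and isinstance(value[0], str) and value[0] in names):
                found.add(value[0])
            else:
                found |= find_refs(value, names)
    elif isinstance(node, list):
        for item in node:
            found |= find_refs(item, names)
    return found


def creation_order(template):
    """Return resource names so that every resource follows what it depends on."""
    resources = template["Resources"]
    names = set(resources)
    graph = {}
    for name, body in resources.items():
        deps = find_refs(body.get("Properties", {}), names)
        depends_on = body.get("DependsOn", [])
        if isinstance(depends_on, str):
            depends_on = [depends_on]
        graph[name] = deps | set(depends_on)
    return list(TopologicalSorter(graph).static_order())


# Hypothetical three-resource template: instance -> subnet -> VPC.
TEMPLATE = {
    "Resources": {
        "WebInstance": {"Type": "AWS::EC2::Instance",
                        "Properties": {"SubnetId": {"Ref": "Subnet"}}},
        "Subnet": {"Type": "AWS::EC2::Subnet",
                   "Properties": {"VpcId": {"Ref": "Vpc"}}},
        "Vpc": {"Type": "AWS::EC2::VPC",
                "Properties": {"CidrBlock": "10.0.0.0/16"}},
    }
}

print(creation_order(TEMPLATE))
```

The template lists the instance first, yet the derived order starts with the VPC, which is the kind of bookkeeping the slide says you no longer do by hand.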

AWS CloudFormation Once deployed, you can modify and update the AWS resources in a controlled and predictable way, allowing you to version your AWS infrastructure in the same way as you version your software

AWS CloudFormation

CloudFormation takes care of this for you

AWS CloudFormation AWS CloudFormation is available at no additional charge, and you pay only for the AWS resources needed to run your applications

AWS CloudFormation Templates to describe the AWS resources Modify and update your AWS resources in a controlled and predictable way Version control your AWS infrastructure


Anatomy of a template
• JSON
• Plain text
• Perfect for version control
• JSON validatable
• Declarative language

{
  "AWSTemplateFormatVersion" : "2010-09-09",

  "Description" : "AWS CloudFormation Sample Template EC2InstanceSample: Create an Amazon EC2 instance running the Amazon Linux AMI. The AMI is chosen based on the region in which the stack is run. This example uses the default security group, so to SSH to the new instance using the KeyPair you enter, you will need to have port 22 open in your default security group. **WARNING** This template creates an Amazon EC2 instance. You will be billed for the AWS resources used if you create a stack from this template.",

  "Parameters" : {
    "KeyName" : {
      "Description" : "Name of an existing EC2 KeyPair to enable SSH access to the instance",
      "Type" : "String"
    }
  },

  "Mappings" : {
    "RegionMap" : {
      "us-east-1" : { "AMI" : "ami-7f418316" },
      "us-west-1" : { "AMI" : "ami-951945d0" },
      "us-west-2" : { "AMI" : "ami-16fd7026" },
      "eu-west-1" : { "AMI" : "ami-24506250" },
      "sa-east-1" : { "AMI" : "ami-3e3be423" },
      "ap-southeast-1" : { "AMI" : "ami-74dda626" },
      "ap-northeast-1" : { "AMI" : "ami-dcfa4edd" }
    }
  },

  "Resources" : {
    "Ec2Instance" : {
      "Type" : "AWS::EC2::Instance",
      "Properties" : {
        "KeyName" : { "Ref" : "KeyName" },
        "ImageId" : { "Fn::FindInMap" : [ "RegionMap", { "Ref" : "AWS::Region" }, "AMI" ]},
        "UserData" : { "Fn::Base64" : "80" }
      }
    }
  },

  "Outputs" : {
    "InstanceId" : {
      "Description" : "InstanceId of the newly created EC2 instance",
      "Value" : { "Ref" : "Ec2Instance" }
    },
    "AZ" : {
      "Description" : "Availability Zone of the newly created EC2 instance",
      "Value" : { "Fn::GetAtt" : [ "Ec2Instance", "AvailabilityZone" ] }
    },
    "PublicDNS" : {
      "Description" : "Public DNSName of the newly created EC2 instance",
      "Value" : { "Fn::GetAtt" : [ "Ec2Instance", "PublicDnsName" ] }
    }
  }
}

[The same template repeated with its sections called out:]
• HEADERS: "AWSTemplateFormatVersion" and "Description"
• PARAMETERS: "KeyName"
• MAPPINGS: "RegionMap"
• RESOURCES: "Ec2Instance"
• OUTPUTS: "InstanceId", "AZ", "PublicDNS"

Parameters
• Provision-time specification
• Command-line options

Mappings
• Conditionals
• Case statements

Resources

Outputs
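To make the "command-line options" and "case statements" analogy concrete, here is a small sketch that resolves `Ref` and `Fn::FindInMap` by hand for the sample template's instance properties. It is an illustration only: CloudFormation does this for you at provision time, the `resolve` helper is hypothetical, the key name `my-key` is made up, and only two regions of the `RegionMap` are copied in.

```python
# A trimmed copy of the sample template: one parameter, one mapping, one resource.
TEMPLATE = {
    "Parameters": {"KeyName": {"Type": "String"}},
    "Mappings": {
        "RegionMap": {
            "us-east-1": {"AMI": "ami-7f418316"},
            "us-west-2": {"AMI": "ami-16fd7026"},
        }
    },
    "Resources": {
        "Ec2Instance": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "KeyName": {"Ref": "KeyName"},
                "ImageId": {"Fn::FindInMap": ["RegionMap", {"Ref": "AWS::Region"}, "AMI"]},
            },
        }
    },
}


def resolve(node, template, params, region):
    """Replace Ref and Fn::FindInMap nodes with concrete values."""
    if isinstance(node, dict):
        if set(node) == {"Ref"}:
            name = node["Ref"]
            # AWS::Region is a pseudo parameter; everything else here is a user parameter.
            return region if name == "AWS::Region" else params[name]
        if set(node) == {"Fn::FindInMap"}:
            map_name, top_key, second_key = [
                resolve(part, template, params, region) for part in node["Fn::FindInMap"]
            ]
            return template["Mappings"][map_name][top_key][second_key]
        return {key: resolve(value, template, params, region) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, template, params, region) for item in node]
    return node


properties = TEMPLATE["Resources"]["Ec2Instance"]["Properties"]
resolved = resolve(properties, TEMPLATE, {"KeyName": "my-key"}, "us-west-2")
print(resolved)
```

The parameter behaves like an option supplied when the stack is created, and the mapping behaves like a case statement keyed on the region.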

AWS CloudFormation Resources: Almost any AWS service
What’s missing (right now)?
• Amazon Elastic MapReduce (EMR)
• Amazon Simple Workflow Service (SWF)
• Amazon Simple Email Service (SES)
• Amazon Glacier
• Amazon CloudSearch
• Might be small new features from other services not yet incorporated

Let us know what you need and how badly!

AWS CloudFormation Resources – Amazon Elastic Compute Cloud (EC2):
{
  "Type" : "AWS::EC2::Instance",
  "Properties" : {
    "AvailabilityZone" : String,
    "DisableApiTermination" : Boolean,
    "EbsOptimized" : Boolean,
    "IamInstanceProfile" : String,
    "ImageId" : String,
    "InstanceType" : String,
    "KernelId" : String,
    "KeyName" : String,
    "Monitoring" : Boolean,
    "PlacementGroupName" : String,
    "PrivateIpAddress" : String,
    "RamdiskId" : String,
    "SecurityGroupIds" : [ String, ... ],
    "SecurityGroups" : [ String, ... ],
    "SourceDestCheck" : Boolean,
    "SubnetId" : String,
    "Tags" : [ EC2 Tag, ... ],
    "Tenancy" : String,
    "UserData" : String,
    "Volumes" : [ EC2 MountPoint, ... ]
  }
}

AWS CloudFormation /dev/null 2>&1" end

Being a Chef
Defining a role: (frontends_role.json)
{
  "name": "Frontends",
  "chef_type": "role",
  "json_class": "Chef::Role",
  "default_attributes": { },
  "description": "Front End Web hosts",
  "run_list": [
    "role[Base]",
    "recipe[apache::frontend]",
    "recipe[syslog::frontend]",
    "recipe[ganglia]"
  ],
  "override_attributes": {
    "ganglia": {
      "config": "/etc/ganglia/gmond.conf",
      "gmetad_name": [ "ganglia.munnsdemo.prv" ],
      "cluster_name": [ "FrontEnds" ]
    }
  }
}

Being a Chef
In attributes:
{
  "ganglia": {
    "config": "/etc/ganglia/gmond.conf",
    "gmetad_name": [ "ganglia.munnsdemo.prv" ],
    "cluster_name": [ "FrontEnds" ]
  }
}

In recipe:
variables(
  :gmetad_name => node[:ganglia][:gmetad_name],
  :cluster_name => node[:ganglia][:cluster_name]
)

In template:
…
name = ""
…
host =
port = 8649
ttl = 1
}
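The flow from role attributes to a rendered config file can be sketched outside Chef. This is a loose Python analogy, not Chef itself: real Chef has more attribute precedence levels than the two shown, the cookbook defaults are hypothetical, and the template text is an assumption written for the sketch rather than the ERB used in the talk.

```python
from string import Template


def deep_merge(base, override):
    """Merge nested dicts; values from override win, as role overrides do over defaults."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical cookbook defaults.
COOKBOOK_DEFAULTS = {
    "ganglia": {"config": "/etc/ganglia/gmond.conf", "cluster_name": ["unspecified"]}
}

# Override attributes taken from the role on the slide.
ROLE_OVERRIDES = {
    "ganglia": {"gmetad_name": ["ganglia.munnsdemo.prv"], "cluster_name": ["FrontEnds"]}
}

node = deep_merge(COOKBOOK_DEFAULTS, ROLE_OVERRIDES)

# Assumed template text, standing in for a gmond.conf template.
GMOND_SNIPPET = Template(
    'cluster { name = "$cluster_name" }\n'
    "udp_send_channel { host = $gmetad_name port = 8649 ttl = 1 }"
)

rendered = GMOND_SNIPPET.substitute(
    cluster_name=node["ganglia"]["cluster_name"][0],
    gmetad_name=node["ganglia"]["gmetad_name"][0],
)
print(rendered)
```

The recipe's `variables(...)` call plays the role of `substitute(...)` here: it hands node attributes to the template by name.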


Working with Chef
Chef main application components:
• Chef-solo
OR
• Chef-client & Chef-server

Working with Chef
Chef-server + Chef-client options:
• Hosted Chef – “Get instant access to a highly available, dynamically scalable, fully managed and supported automation environment - powered by the experts at Opscode.”
• Private Chef – “All the benefits of Hosted Chef, delivered with 24/7/365 support and implementation consulting, installed as enterprise software behind the corporate firewall.”
• Open Source – You do it all. Not that difficult, community supported, free.

Working with Chef

OR…

AWS OpsWorks
• Integrated application management solution for ops-minded developers and IT admins
• Model, control and automate applications of nearly any scale and complexity
• Management Console, SDKs, or CLI
• No additional cost

AWS OpsWorks
• SIMPLE: Easy to use, quick to get started and productive
• PRODUCTIVE: Reduces errors with conventions and scripted configuration
• FLEXIBLE: Simplifies deployments of any scale and complexity
• POWERFUL: Reduces cost and time with automation
• SECURE: Enables control with fine grained permissions

AWS OpsWorks

A stack represents the cloud infrastructure and applications that you want to manage together.

A layer defines how to setup and configure a set of instances and related resources.

Decide how to scale: manually, with 24/7 instances, or automatically, with load-based or time-based instances.

Then deploy your app to specific instances and customize the deployment with Chef recipes.

AWS OpsWorks Instance Lifecycle Agent on each EC2 instance understands a set of commands that are triggered by OpsWorks. The agent then runs a Chef solo run.

Setup

Configure

Deploy

Undeploy

Shutdown

AWS OpsWorks Instance Lifecycle
[State diagram. Instance states: new / stopped, requested, pending, booting, running setup, online, shutting down, terminating. Event labels: instance gets Setup event; instance gets Shutdown event; Configure: all instances get Configure event (shown on two transitions).]
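The lifecycle above can be read as a simple rule: the instance changing state gets Setup or Shutdown, and every online instance gets Configure whenever membership changes. The sketch below simulates just that rule in memory. It is a simplification under that assumption (Deploy and Undeploy are left out), and the `Stack` class and instance names are hypothetical.

```python
from collections import defaultdict


class Stack:
    """Tiny in-memory model of which lifecycle events each instance receives."""

    def __init__(self):
        self.online = []
        self.events = defaultdict(list)

    def _send(self, instance, event):
        self.events[instance].append(event)

    def boot(self, instance):
        # The new instance runs Setup, then every online instance gets Configure.
        self._send(instance, "setup")
        self.online.append(instance)
        for member in self.online:
            self._send(member, "configure")

    def stop(self, instance):
        # The leaving instance runs Shutdown, then the remaining ones get Configure.
        self._send(instance, "shutdown")
        self.online.remove(instance)
        for member in self.online:
            self._send(member, "configure")


stack = Stack()
stack.boot("web1")
stack.boot("web2")
stack.stop("web1")

for name, received in stack.events.items():
    print(name, received)
```

This is why Configure recipes are a natural home for anything that depends on the current set of peers, such as a load balancer or cache node list.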

OpsWorks Agent Communication
[Diagram: a “Deploy App” request, the OpsWorks Service, the EC2 Instance, and your repo, e.g. GitHub.]
1. Instance connects with OpsWorks service to send keep-alive heartbeat and receive lifecycle events
2. OpsWorks sends lifecycle event with pointer to configuration JSON (metadata, recipes) in S3 bucket
3. Download configuration JSON
4. Pull recipe and other build assets from your repo
5. Execute recipe with metadata
6. Upload Chef log
7. Report Chef run status

How OpsWorks Bootstraps the EC2 instance
Instance is started with IAM role
• UserData passed with instance private key, OpsWorks public key
• Instance downloads and installs OpsWorks agent

Agent connects to instance service, gets run info
• Authenticate instance using instance’s IAM role
• Pick up configuration JSON from the OpsWorks instance queue
• Decrypt & verify message, run Chef recipes
• Upload Chef log, return Chef run status

Agent polls instance service for more messages

AWS OpsWorks + Chef
• OpsWorks uses Chef Solo to configure the software on the instance
• OpsWorks provides many Chef Server functions to users.
   • Associate cookbooks with instances
   • Dynamic metadata that describes each registered node in the infrastructure
• Supports "Push" Command and Control Client Runs
• With Chef 11, broader support for community cookbooks

AWS OpsWorks + Chef Predefined Layers: ELB HAProxy MySQL Memcached Ganglia

Ruby on Rails Node.js PHP Static Web Server Java

AWS OpsWorks + Chef Differences between OpsWorks & Chef-Server Recipes:

Environments = Stacks Data bags = Custom JSON / S3 Search = Stack Configuration / Deployment JSON
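As an illustration of the "Search = Stack Configuration / Deployment JSON" row: where a Chef Server recipe would run a search for peer nodes, an OpsWorks recipe reads them out of the JSON it is handed. The sketch below shows the idea in Python. The JSON shape (`opsworks` → `layers` → layer name → `instances`), the layer name `php-app` and the IP addresses are assumptions made for the sketch, so check the stack configuration JSON your own stack produces.

```python
import json

# Assumed shape of a stack configuration document; all values are made up.
STACK_CONFIG = json.loads("""
{
  "opsworks": {
    "layers": {
      "php-app": {
        "instances": {
          "php-app1": {"private_ip": "10.0.1.10"},
          "php-app2": {"private_ip": "10.0.2.10"}
        }
      }
    }
  }
}
""")


def layer_private_ips(stack_config, layer_name):
    """Return the private IPs of every instance in a layer, sorted for stable output."""
    instances = stack_config["opsworks"]["layers"][layer_name]["instances"]
    return sorted(details["private_ip"] for details in instances.values())


print(layer_private_ips(STACK_CONFIG, "php-app"))
```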

AWS OpsWorks Chef Recipes github.com/aws/opsworks-cookbooks

AWS OpsWorks + Chef
include_recipe 'deploy'

node[:deploy].each do |application, deploy|
  template "#{deploy[:current_path]}/wp-config.php" do
    source "wp-config.php.erb"
    owner "root"
    group "root"
    mode "0644"
    variables(
      :database => node['wordpress']['db']['database'],
      :user => node['wordpress']['db']['user'],
      :password => node['wordpress']['db']['password'],
      :dbhost => node['wordpress']['dbhost'],
      :lang => node['wordpress']['languages']['lang'],
      :cachenode => node['wordpress']['cachenode']
    )
  end
end

AWS OpsWorks + Chef Resources to learn more: https://aws.amazon.com/opsworks/ https://aws.amazon.com/documentation/opsworks/ https://github.com/aws/opsworks-cookbooks http://www.opscode.com/ http://www.opscode.com/chef http://wiki.opscode.com/display/chef/Documentation

Lab 2 Deploying our App Using AWS OpsWorks

Lab 2
Three main parts:
• Add RDS to our infrastructure with CloudFormation
• Deploy our WordPress site with OpsWorks
• Modify WordPress by adding ElastiCache Memcached, adding a caching module to the code, and deploying in OpsWorks

One hour. Again, ask questions if stumped!

Lab 2
[Architecture diagram: a Virtual Private Cloud in AWS US-West-2 spanning Availability Zones A, B and C. Each zone shows a public VPC subnet, a private VPC subnet, an ELB and a web instance. Also shown: RDS DB instance primary and standby (Multi-AZ), ElastiCache, NAT instance, leap/bastion instance and an Internet gateway. Supporting services: Amazon CloudWatch, Amazon S3, AWS CloudFormation and AWS OpsWorks.]

Taking AWS Operations to the Next Level
Lab 2 Recap:
1. Added in host based configuration management to help with software/configuration management on the host
2. Using AWS CloudFormation with OpsWorks together helps you get from nothing to fully-running infrastructure, with lots of ongoing benefits and easier maintainability
3. Keep track of the who/what/when/why of your infrastructure/servers/applications

Taking AWS Operations to the Next Level
Lab 2 Recap:
• Continue to make use of software revision control/testing techniques for our Chef recipes
• Using OpsWorks makes deployment/configuration even easier
• Can start off building this onto your current infrastructure today, no need for greenfield


http://nintendobuddyboards.freeforums.org/sonic-3-and-knuckles-t402.html

So now we have provisioning and configuration down pretty well. But what could be missing?

Taking AWS Operations to the Next Level What about resources that aren’t easy to use AWS CloudFormation or Chef on? What about responding to failure/incidents? What might wake us up at 4am when we’re on call? What about SPOFs? What other routine things can be automated?

Taking AWS Operations to the Next Level

Some current deficiencies from what we’ve covered today:
• AWS CloudFormation is passive and won’t react to instance issues/failures
   • How to re-assign resources like Amazon VPC routes, ENIs, EBS?
• OpsWorks won’t handle out of band resources that it currently doesn’t support
• Pools of hosts, clusters, and things that shard are easy (one-offs, not so much)

What’s the missing piece? https://secure.flickr.com/photos/moxievision/470346677/

AWS Auto Scaling, Amazon Simple Workflow Service, AWS APIs, and a little elbow grease

Traffic to our site vs Provisioned Capacity Manually
[Chart: traffic to our site plotted against manually provisioned capacity. Chart labels: 76% and 24%.]

Traffic to our site vs Provisioned Capacity with Auto-Scaling
[Chart: traffic to our site plotted against provisioned capacity under Auto-Scaling.]

Auto-Scaling
• Works with Amazon EC2
• Triggers via Amazon CloudWatch alarms, scheduled actions, or manually
• Works with spot pricing!
• Scale up number of instances of a certain class
• Scale down number of instances of a certain class

OR

Auto-Scaling
• You can use Auto-Scaling for singular instances that don’t scale up or down
   • min = 1, max = 1
• Auto-Scaling gives you the ability to specify multiple Availability Zones, even if you only need a single host
   • gives you multi-AZ failover
• Auto-Scaling supports notifications on instance creation/termination
   • Useful for configuring other resources, bootstrapping, and provisioning
• Auto-Scaling is free!

Auto-Scaling Min = 1, Max = 1, Desired = 1 Set this around a single host and get a self-healing pool of 1
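A self-healing pool of 1 can be written as a CloudFormation resource. The sketch below builds one as a Python dict and prints it as JSON so it can be pasted into a template's "Resources" section. The logical names (`NatGroup`, `NatLaunchConfig`, `PublicSubnetA` and so on) are hypothetical and assume those resources are defined elsewhere in the same template.

```python
import json


def self_healing_group(launch_config_name, subnet_names):
    """Build an Auto Scaling group resource pinned to exactly one instance."""
    return {
        "Type": "AWS::AutoScaling::AutoScalingGroup",
        "Properties": {
            "LaunchConfigurationName": {"Ref": launch_config_name},
            # Subnets in several Availability Zones let the single instance
            # be replaced in another zone if its zone has problems.
            "VPCZoneIdentifier": [{"Ref": name} for name in subnet_names],
            "MinSize": "1",
            "MaxSize": "1",
            "DesiredCapacity": "1",
        },
    }


NAT_GROUP = self_healing_group(
    "NatLaunchConfig", ["PublicSubnetA", "PublicSubnetB", "PublicSubnetC"]
)
print(json.dumps({"NatGroup": NAT_GROUP}, indent=2))
```

With min, max and desired all set to 1, Auto Scaling never scales the group, but it does launch a replacement whenever the one instance is terminated or fails its health check.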

Auto-Scaling
What other fun things can we do with Auto Scaling? Event triggers
• Send messages via an Amazon SNS “topic” to a destination such as Amazon SQS, Email, SMS, or HTTP(S) Post
• When?
   • Instance initialization
   • Instance termination
   • Errors of either


How does knowing when an instance is created/terminated help us?
From before:
• AWS CloudFormation is passive. Won’t react to instance issues/failures
• OpsWorks won’t handle out of band resources that it currently doesn’t support

So...

Use these Auto Scaling notifications as a trigger to do other things
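A rough sketch of using a notification as a trigger: when an Auto Scaling notification travels through SNS into SQS, the queue message body is an SNS envelope whose "Message" field holds the Auto Scaling event as a JSON string. The code below only parses a hand-written sample and picks an action. The group name, instance ID and action names are hypothetical, and no AWS calls are made.

```python
import json

# Hand-written sample of an SQS message body carrying an SNS-wrapped notification.
SAMPLE_SQS_BODY = json.dumps({
    "Type": "Notification",
    "Subject": "Auto Scaling: launch for group \"nat-group\"",
    "Message": json.dumps({
        "Event": "autoscaling:EC2_INSTANCE_LAUNCH",
        "AutoScalingGroupName": "nat-group",
        "EC2InstanceId": "i-0123abcd",
    }),
})

# Hypothetical mapping from event type to what the management host should do.
ACTIONS = {
    "autoscaling:EC2_INSTANCE_LAUNCH": "start_provisioning_workflow",
    "autoscaling:EC2_INSTANCE_TERMINATE": "start_cleanup_workflow",
}


def parse_notification(sqs_body):
    """Unwrap the SNS envelope and pull out the fields we act on."""
    envelope = json.loads(sqs_body)
    message = json.loads(envelope["Message"])
    return {
        "event": message["Event"],
        "group": message["AutoScalingGroupName"],
        "instance_id": message["EC2InstanceId"],
    }


def choose_action(notification):
    """Pick an action for the event, ignoring event types we do not handle."""
    return ACTIONS.get(notification["event"], "ignore")


notification = parse_notification(SAMPLE_SQS_BODY)
print(notification, choose_action(notification))
```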

Things such as?
• Change Amazon VPC routes to point to a new NAT instance
• Restore a backup Amazon EBS data volume snapshot
• Remove hosts from monitoring/alerting
• Interact with any other AWS resources
• Configure something in another piece of software
• Update DNS
• Update third-party tools
• Remove anything manual in getting this host running

New Automation Process
1. Auto Scaling does something
   • Start, stop, scale up, scale down
2. A notification is sent
   • Where?

Ideally somewhere scalable, redundant, multi-homed, reliable, API-able

Like Amazon SQS?


YES!

Amazon SQS
• One of the first AWS services
• Extremely scalable – potentially millions of messages
• Extremely reliable – multi-AZ built in
• Simple – messages get sent in, messages get pulled out
• Secure – API credentials needed
• Inexpensive – $0.50 per 1,000,000 requests, or $0.0000005 per request

Free tier: “You can get started with Amazon SQS for free. New and existing customers receive 1 million Amazon SQS queuing Requests for free each month.”
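To put those numbers in perspective, here is a quick back-of-the-envelope calculation for a host that polls a queue on a fixed schedule. It uses only the price and free-tier figures quoted on the slide (2014 figures, so check current pricing), and the `monthly_cost` helper and the polling rates are hypothetical.

```python
PRICE_PER_REQUEST = 0.50 / 1_000_000      # $0.50 per million requests, from the slide
FREE_REQUESTS_PER_MONTH = 1_000_000       # free tier quoted on the slide


def monthly_cost(polls_per_minute, days=30):
    """Return (requests made, dollars billed) for a month of steady polling."""
    requests = polls_per_minute * 60 * 24 * days
    billable = max(0, requests - FREE_REQUESTS_PER_MONTH)
    return requests, billable * PRICE_PER_REQUEST


for rate in (1, 60):
    requests, cost = monthly_cost(rate)
    print(f"{rate} poll(s) per minute: {requests} requests, ${cost:.3f} per month")
```

Polling once a minute is 43,200 requests a month, which fits inside the free tier. Polling once a second is 2,592,000 requests, of which 1,592,000 are billable, so the bill stays under a dollar.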


New Automation Process
1. Auto Scaling does something
   • Start, stop, scale up, scale down
2. A notification is sent
   • To an Amazon SQS queue
3. Pull message out of Amazon SQS
   • With what?

Some instance we can use as a management host

New Automation Process
Some instance we can use as a management host:
• That can be set up via AWS OpsWorks
• Run Chef to install scripts/tools to interact with Amazon SQS and other things
• Can be a very lightweight host (t1.micro perhaps)
• Run cron on it to regularly poll Amazon SQS for changes to your infrastructure
• Assign an Amazon EC2 IAM role giving it permissions to just what it needs

New Automation Process
1. Auto Scaling does something
   • Start, stop, scale up, scale down
2. A notification is sent
   • To an Amazon SQS queue
3. Pull message out of Amazon SQS
   • With a management host
4. Do something
   • Shell scripts??

The problem with doing this as a huge shell script
• Large monolithic scripts lead to bad behaviors
• Hard to handle chained actions with various logic
• Hard to handle failure situations
   • What happens if a shell script dies midway?
   • What happens if the host running this job dies?
• What if various different systems need to do independent actions?
• Separation of tasks and duties within an infrastructure?
• Handling coordination of multiple flows and intertwining dependencies
• SOA all the things!

We need a common way of reliably coordinating all the steps in our automation

Amazon Simple Workflow (SWF)
• Orchestration tool across your infrastructure
• Use it as a middle layer to pass messages and set up tasks to be completed
• Break down individual tasks into different workers
• You define logic between workers
• Anything that can be scripted can be made into a worker task
• Built-in retries, timeouts, logging
• Low cost, reliability, and scalability built in

YOUR CODE = Deciders & Workers

Amazon SWF

Amazon Simple Workflow (SWF)
Working with SWF – Terms to know:
• Domain -> collection of workflows
• Workflow -> collection of actions
• Action -> task or workflow step

Actors:
• Workflow starters -> start a workflow
• Activity Workers -> implement actions
• Deciders -> coordinate workflow actions

Amazon Simple Workflow (SWF)
Other features:
• No more than one delivery (unlike Amazon SQS)
• Uses long polling, which reduces number of polls without results
• Visibility of task state via API
• Timers, signals, markers, child workflows
• Supports versioning
• Keeps workflow history for a user-specified time

Amazon Simple Workflow (SWF)
Deciders & Workers
• You make these.
• Independent, stateless, decoupled.
• Poll for work.
• Run in parallel to scale.
• Run multiple of each for different purposes.

Amazon Simple Workflow Service (SWF)
EXAMPLE: Making a new NAT host
• Activity: Add an EIP
• Activity: Change Src/Dest Check
• Activity: Change VPC Routing
• Activity: Update monitoring
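The split between deciders and workers in the NAT example can be sketched without touching AWS. In the toy below, the decider looks only at the workflow history to choose the next activity, and each worker is a stateless function. Everything runs in one process and in memory, whereas real deciders and workers poll Amazon SWF for tasks. The activity names, the worker bodies and the instance ID are hypothetical stand-ins for real API calls.

```python
# Ordered activities for the "new NAT host" workflow from the slide.
NAT_WORKFLOW = ["add_eip", "change_src_dest_check", "change_vpc_routing", "update_monitoring"]


def decide(history):
    """Decider: read the history and return the next activity, or None when done."""
    completed = [e["activity"] for e in history if e["type"] == "ActivityCompleted"]
    for activity in NAT_WORKFLOW:
        if activity not in completed:
            return activity
    return None


# Workers: stateless functions that would each wrap one real API call.
WORKERS = {
    "add_eip": lambda ctx: f"EIP attached to {ctx['instance_id']}",
    "change_src_dest_check": lambda ctx: f"source/dest check disabled on {ctx['instance_id']}",
    "change_vpc_routing": lambda ctx: f"private routes now point at {ctx['instance_id']}",
    "update_monitoring": lambda ctx: f"monitoring updated for {ctx['instance_id']}",
}


def run_workflow(ctx):
    """Alternate between decider and workers until the decider says we are finished."""
    history = [{"type": "WorkflowStarted"}]
    while True:
        next_activity = decide(history)
        if next_activity is None:
            history.append({"type": "WorkflowCompleted"})
            return history
        result = WORKERS[next_activity](ctx)
        history.append({"type": "ActivityCompleted", "activity": next_activity, "result": result})


history = run_workflow({"instance_id": "i-0123abcd"})
for event in history:
    print(event)
```

Because the decider derives everything from history, a crashed decider or worker can be restarted and pick up from the last recorded event, which is the property a monolithic shell script lacks.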


New Automation Process
1. Auto Scaling does something
   • Start, stop, scale up, scale down
2. A notification is sent
   • To an Amazon SQS queue
3. Pull message out of Amazon SQS
   • With a management host
4. Kick off Amazon SWF workflow
   • Takes actions to complete desired scaling flow


Amazon Simple Workflow Service (SWF)
Other uses:
• Help infrastructure management
• Backup orchestration
• Dev/test environment refreshes
• Processes within your application
• Response to infrastructure incidents

What else can you think of??

Lab 3 Adding Auto Scaling, Amazon SQS, and Amazon SWF to our infrastructure for total host life cycle automation

Lab 3
Three main parts:
• Set up Amazon SWF
• Set up Amazon SQS/SNS hooks and infrastructure helper host
• Re-launch our NAT instance in Auto Scaling

One hour.
One extra part: Terminate the NAT instance. See it fix itself.

Lab 3
[Architecture diagram: a Virtual Private Cloud in AWS US-West-2 spanning Availability Zones A, B and C. Each zone shows a public VPC subnet, a private VPC subnet, an ELB and a web instance. Also shown: RDS DB instance primary and standby (Multi-AZ), ElastiCache, NAT instance, SWF worker instance, leap/bastion instance and an Internet gateway. Supporting services: Amazon CloudWatch, Amazon SNS, Amazon SQS, Amazon S3, Amazon SWF, AWS CloudFormation, AWS OpsWorks and the EC2 API.]

Taking AWS Operations to the Next Level
Lab 3 Recap:
1. Wrapped hosts in Auto Scaling to make self-healing pools of 1
2. Took notifications from Auto Scaling through Amazon SQS and into Amazon SWF
3. Used Amazon SWF to make out-of-band changes to our infrastructure to complete provisioning/configuration, and recover from incidents
4. All of this automation for low cost (if not almost free)
5. Still mostly set up via AWS CloudFormation + OpsWorks


Wrapping It All Up
We’ve covered a whole lot today:
• Treating your infrastructure as code with AWS CloudFormation
• Using Chef via OpsWorks to control the software lifecycle on our instances
• Building a scalable automation framework with Amazon SWF, Amazon SQS, Auto Scaling, etc.

Wrapping It All Up
[Architecture diagram: a Virtual Private Cloud in AWS US-West-2 spanning Availability Zones A, B and C. Each zone shows a public VPC subnet, a private VPC subnet, an ELB and a web instance. Also shown: RDS DB instance primary and standby (Multi-AZ), ElastiCache, NAT instance, SWF worker instance, leap/bastion instance and an Internet gateway. Supporting services: Amazon CloudWatch, Amazon SNS, Amazon SQS, Amazon S3, Amazon SWF, AWS CloudFormation, AWS OpsWorks and the EC2 API.]


Wrapping It All Up
Where to go from here?
• None of these things require greenfield; you can add them in pieces and small bits to what you do today on AWS
• Start using software revision control for your infrastructure and server configurations ASAP
• Start thinking about what manual things you do regularly that can be broken up and automated

Wrapping It All Up
Where to go from here?
• Documentation!
   • https://aws.amazon.com/documentation/ (reads like a romance novel, I swear!)
• Free tier to play!
   • https://aws.amazon.com/free/
• Ask us!!!
   • Talk with your account reps, SAs, TAMs, etc.

