Thursday, November 29, 2012

AWS re:Invent Werner Vogels Keynote Live Blog

This is a live blog of Werner Vogels' Thursday morning keynote on the next generation of cloud architectures. This will be quick and dirty so I can keep up with the information as it is presented.


  • Werner takes the stage to Nirvana playing...  getting the crowd fired up
  • usual recap of the Day One announcements: the S3 price reduction and the new data warehouse service (Redshift)
  • Werner shows a slide he created in 2007 and how its message is still relevant today (the cloud removed the heavy lifting of capital constraints: physics, people, scope)
  • AWS was developed because Amazon's core business needed to scale but was too constrained to keep growing, or had capacity sitting underutilized; they were having trouble with the peaks and valleys of traffic to the business (think Black Friday)
  • 11.10.10 - turned off the last physical web server supporting the Amazon business; 10.31.11 - turned off the last physical server supporting the UK business, removing the physical constraints on growth for their business
  • Key aspects of a 21st century architecture - secure, scalable, fault tolerant, high performance, cost effective
  • everything is a programmable resource - data centers, networks, compute, storage, databases, load balancers, all just services now - No more "hugging" anything
  • when projects are focused on fixed resources, 31% are never completed and 52% run over budget, due to inaccurate resource estimates, changing requirements, unmanaged risk, scope creep, and complexity
  • If you are no longer "bogged down" by resources, Werner feels this will change and the numbers will improve
  • How do you change this mindset?
  • Take a step back and don't think of the resources anymore, e.g. an EC2 instance is not a server anymore
  • Decompose into small, loosely coupled, stateless building blocks
  • discussing imdb.com - in the old architecture it ran on Amazon's web servers attached to the IMDb service; it wasn't scalable because the businesses were tightly coupled: if Amazon scaled up, IMDb had to scale up
  • the new architecture loosely coupled them by serving IMDb's HTML from S3, so if Amazon scaled up, IMDb wouldn't have to
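A minimal sketch of that decoupling pattern: publish pre-rendered HTML to an S3 bucket configured as a static website, so the pages scale independently of the web tier. Bucket and key names are illustrative, and this uses today's boto3 SDK rather than the boto library of the era:

```python
import boto3

s3 = boto3.client("s3")
bucket = "imdb-style-static-pages"  # hypothetical bucket name

# Create the bucket and enable static website hosting on it.
s3.create_bucket(Bucket=bucket)
s3.put_bucket_website(
    Bucket=bucket,
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)

# Publish a pre-rendered page; the web tier never touches these requests.
s3.put_object(
    Bucket=bucket,
    Key="title/tt0111161.html",  # illustrative page key
    Body=b"<html>...rendered page...</html>",
    ContentType="text/html",
)
```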
  • Automate your application and processes
  • Werner says humans are terrible at automation - if you have to ssh or log in to an instance, it isn't automated. He recommends Chef or Puppet
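In that spirit, a hedged sketch of ssh-free provisioning: bake the bootstrap into EC2 user data so the instance configures itself at boot (a Chef or Puppet agent could be installed from the same hook). The AMI ID and packages are placeholders:

```python
import boto3

# Everything the instance needs happens at boot; nobody logs in by hand.
user_data = """#!/bin/bash
yum install -y httpd
# A config-management agent (e.g. a Puppet agent pointed at its master)
# could be installed and started here instead.
service httpd start
"""

ec2 = boto3.client("ec2")
ec2.run_instances(
    ImageId="ami-12345678",   # placeholder AMI
    InstanceType="m1.small",
    MinCount=1,
    MaxCount=1,
    UserData=user_data,
)
```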
  • Let Business levers control the system
  • Be Agile, break down to small building blocks, let the business decide and be able to pivot in a short time to the new demands
  • Architect with cost in mind
  • Time for customer testimony
  • Ryan Park - Pinterest technical operations now on stage
  • Introduction to Pinterest
  • Design principles used - Flexibility (Apache ZooKeeper used for redirection and balancing, alongside AWS Elastic Load Balancing)
  • Scalable (everything decomposed into services vs. a monolithic design); the databases are thousands of shards - no one server contains the whole database
  • Measurability (monitor application and infrastructure performance at all times)
  • peak traffic follows US daytime hours - autoscaling shuts down 20% of instances after hours, reducing cost when traffic is lower
  • reserved instances cover the standard baseline traffic, then on-demand and spot instances handle the elastic load throughout the day; watchdog processes handle the spin-up and spin-down
  • $54 an hour to run initially; after these changes, $20 an hour
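A sketch of the after-hours scale-down Pinterest describes, expressed as Auto Scaling scheduled actions; the group name, capacities, and cron times are made up, and boto3 is the modern SDK rather than what they used:

```python
import boto3

autoscaling = boto3.client("autoscaling")
group = "pinterest-web"  # hypothetical Auto Scaling group

# Shrink the fleet ~20% when US traffic drops off at night...
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=group,
    ScheduledActionName="after-hours-scale-down",
    Recurrence="0 6 * * *",   # 06:00 UTC, late night in the US
    DesiredCapacity=80,
)

# ...and bring it back before the morning peak.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=group,
    ScheduledActionName="morning-scale-up",
    Recurrence="0 13 * * *",
    DesiredCapacity=100,
)
```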
  • Werner back discussing the spot instance market
  • Now talking Resilient design aspects (you shall protect your customers at all times)
  • Protecting your customer is the first priority
  • Encrypt all user data!
  • Amazon encrypts all traffic in transit as well as data at rest
  • HTTPS is used everywhere
  • In production, deploy to at least two availability zones
  • You need to protect your business: if you go to production, span zones, period
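One way to honor the "span zones, period" rule, sketched as an Auto Scaling group stretched across two availability zones; all names and the AMI are illustrative:

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.create_launch_configuration(
    LaunchConfigurationName="web-v1",
    ImageId="ami-12345678",   # placeholder AMI
    InstanceType="m1.small",
)

# At least two AZs, so losing one zone doesn't take the site down.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web",
    LaunchConfigurationName="web-v1",
    MinSize=2,
    MaxSize=8,
    AvailabilityZones=["us-east-1a", "us-east-1b"],
    LoadBalancerNames=["web-elb"],  # hypothetical classic ELB
)
```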
  • Integrate security into your application from the ground up
  • "If firewalls were the answer, we would still have moats around all are cities" - classic quote
  • Build, test, integrate and deploy continuously
  • Don't wait for the "next version" to implement features, constantly iterate and deploy
  • Amazon average deployment time between "versions" is 11.6 seconds, constant iteration of the site (wow, that is amazing!)
  • Shows the old architecture - it was very error prone, and this deployment method wasn't possible in that model
  • In the old way, rollback was almost impossible; today rollback is a single API call
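Werner doesn't show the call itself; one plausible reading, sketched here, is a CloudFormation-style deployment where rolling back means re-applying the previous template version (the stack name and template source are hypothetical):

```python
import boto3

cfn = boto3.client("cloudformation")

def rollback(stack_name: str, previous_template: str) -> None:
    """Roll back by re-applying the last known-good template."""
    cfn.update_stack(StackName=stack_name, TemplateBody=previous_template)

# previous_template would come straight out of version control,
# e.g. the template tagged for the prior release.
```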
  • Dr. Matt Wood - Chief Data Scientist for Amazon is on stage to give a demonstration
  • decoupled, stateless architectures are easy to maintain at scale but you don't need to be Amazon to take advantage of this.
  • Live Demo Time - Photo Management Application running on EC2, showing version 1.0 running
  • put the full stack into version control so all dependencies are "stored" as a template, including your bootstrap environment - super easy to manage this way (sketched after the demo notes below)
  • everything is a load balanced, stateless architecture built across 10 instances
  • simulated traffic coming into the site, with the data in DynamoDB
  • a photo comes in, is processed and cleaned up, then published
  • discussion of the cost of processing 1000 images using version 1.0 of the product
  • spun up version 2.0 of the product using a different (faster) instance type, will this make a difference in the costs?
  • a new instance is launched behind the load balancer and the cost metric drops in real time - saving money just by using the template with a faster instance type
  • since this brought costs down, the rest were replaced with faster instances and costs dropped further, on the fly, without users being aware
  • "Don't become attached to your compute infrastructure"
  • Back to Werner - Don't think in single failures
  • "There is always a failure waiting around the corner"
  • Don't treat failure as an exception, treat it as a normal possible state
  • On stage now - Alyssa Henry, VP of Storage to discuss S3 design
  • S3 runs within multiple availability zones
  • How does an S3 request get processed (say a put request)
  • Load balancer -> availability zone -> web server -> the index service and storage service store it on multiple instances in multiple facilities
  • What happens in a failure? Everything is redundant, so even if an availability zone goes down you simply change the DNS weights to route traffic away from the failure
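A sketch of that DNS-weight shift using Route 53 weighted records; the hosted zone ID, domain, and endpoint names are invented:

```python
import boto3

route53 = boto3.client("route53")

def set_weights(weight_a: int, weight_b: int) -> None:
    """Shift traffic between two weighted records, e.g. (100, 0) to drain AZ B."""
    changes = []
    for set_id, target, weight in [
        ("az-a", "elb-a.example.com", weight_a),
        ("az-b", "elb-b.example.com", weight_b),
    ]:
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "api.example.com",
                "Type": "CNAME",
                "SetIdentifier": set_id,
                "Weight": weight,
                "TTL": 60,
                "ResourceRecords": [{"Value": target}],
            },
        })
    route53.change_resource_record_sets(
        HostedZoneId="Z123EXAMPLE",  # hypothetical zone
        ChangeBatch={"Changes": changes},
    )

# Route everything to AZ A while AZ B recovers:
# set_weights(100, 0)
```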
  • Adaptability - they weren't sure how the traffic would grow, so they needed loose coupling of services so they could scale in whatever direction the customers took them
  • S3, circa 2006, saw amazing growth; they built a new storage service side by side with the old one and migrated from it over time - in-place migration and upgrades (better performance, higher availability, and lower costs to customers) without downtime. There was never a "version 2.0" of S3 offered; users were simply migrated over
  • Werner back - Now talking Adaptive design
  • Assume Nothing
  • Build your architecture focused on your customer, not the available resources at your disposal
  • You get more business value by starting from scratch with no assumptions
  • This prevents you from being locked in (e.g. capacity planning on a project - what if you are wrong?)
  • On Stage: CEO of Animoto - Brad Jefferson
  • Brad explains the site (user video creation site)
  • each video is custom rendered frame by frame on their site, a single server per video; this caused an EC2 instance explosion when they announced Facebook integration
  • went from 100 instances to 5000+ instances in 12 months
  • In 2007 everything was on AWS except the rendering, which ran on homegrown in-house servers
  • In 2008, rendering was done on AWS using extra large instances
  • they were good at building videos, not at building servers
  • 2009, high-CPU extra large instances, higher performance (faster to render for users, higher resolution), added medium instances for lower resolution to lower costs
  • in 2011, Cluster GPU instances (quadruple extra large); now full HD video, streaming before the video has even finished rendering
  • Werner back on stage
  • Werner talks about the AWS Startup Challenge - if you own a business, submit for possible EC2 credits to help build startups; the deadline is in the next 7 days
  • Announcement: Two New EC2 Instance Types
    • Cluster High Memory: 240 GB of RAM, 2 x 120 GB SSDs
    • High Storage: 48 TB across 24 hard drives, with 117 GB of RAM, for "big data"
  • Talking about the importance of collection of data metrics to determine where your business is going
  • Don't look at averages or means... what if 20% of your customers are having a bad experience?
  • Look at the 99.9th percentile of your customers, not the average
  • "control the worst experience your customers are getting"
  • Announcement: AWS Data Pipeline, data driven workload service to move data from one service to another
    • Orchestration service for data workflows
    • Automated and scheduled data flows
    • pre-integrated with AWS data sources
    • Easily connect with 3rd party and on-premise sources
  • Demonstration - AWS Data Pipeline, Dr. Matt Wood back on stage
    • creating a pipeline is a drag-and-drop interface
    • pre-made templates (example: DynamoDB to S3)
    • configure the source, destination and requirements
    • automate this by setting a schedule
    • Another Example - Data Logs from S3
    • create a daily report and analyze it using MapReduce (a new Hadoop cluster) - pay-as-you-go log analysis
    • After Hadoop analysis, reports will be stored in another S3 bucket
    • In addition to the daily logs, create a weekly roll up analytics report
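The demo used the console, but a sketch of defining such a pipeline through the API might look like this (boto3's datapipeline client; the definition is heavily simplified and the names are invented):

```python
import boto3

dp = boto3.client("datapipeline")

# Register the pipeline, then push a (simplified) definition into it.
pipeline_id = dp.create_pipeline(
    name="daily-log-report", uniqueId="daily-log-report-v1"
)["pipelineId"]

dp.put_pipeline_definition(
    pipelineId=pipeline_id,
    pipelineObjects=[
        {
            "id": "DailySchedule",
            "name": "DailySchedule",
            "fields": [
                {"key": "type", "stringValue": "Schedule"},
                {"key": "period", "stringValue": "1 day"},
                {"key": "startDateTime", "stringValue": "2012-11-30T00:00:00"},
            ],
        },
        # A real pipeline would add data nodes (the S3 log bucket, the
        # report bucket) and an EMR activity for the Hadoop analysis here.
    ],
)

dp.activate_pipeline(pipelineId=pipeline_id)
```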
  • Conclusion and wrap up time....
In conclusion, very interesting stuff!
