
Wednesday, February 27, 2013

ApacheCon LiveBlog: Software Defined Networking (SDN) in CloudStack


This is a live blog from ApacheCon that I'm attending this week.  This session is with Chiradeep Vittal.

Usual Live Blog Disclaimer: This is more of a brain dump, typing as fast as I can - please excuse typos, formatting, and a general lack of coherence.



  • Introduction covers how Amazon built a cloud (see his previous session for this part)
  • SDN Definition - separation of the control plane from the hardware performing the forwarding - also centralized control
  • Central control eases configuration, troubleshooting, and maintenance over time
  • Eliminates the tedious "log into every box" model of network maintenance - log into the controller instead
  • Is OpenFlow SDN? - NO, it is a protocol for the control plane to talk to the forwarding elements
  • Control is on the "top" and forwarding is on the "bottom"
  • Flexibility example - different routes based on direction: with Box A and Box B, the A-to-B flow and the B-to-A flow can differ if needed
  • IaaS and SDN go hand in hand - Agility, API configuration, Scalability,  Elasticity (all the ity's!)
  • SDN enables virtual networking - the illusion of isolated networks on a physical wire
  • SDN does have issues - Discovery of virtual addresses -> physical address mapping for instance
  • He is now going over a multi-tenant topology example:

  • CloudStack model - map virtual networks to physical network - define and provision networks and manage elasticity and scale
  • CloudStack Network Model is very robust (see pic, too much to type, things in box tend to be SDN functions)
  • How do we put this together?
  • CloudStack Service Catalog - Cloud users don't see the "guts" of the configuration, the cloud admin or operator designs the service catalog and presents this to the users
    • example - Gold Network - LB + FW + VPN using virtual appliances
    • Platinum - LB + FW + VPN but using hardware devices
  • Now going over topology examples of the Gold and Platinum offerings (Platinum uses a Juniper firewall and a NetScaler to load balance)
  • In both examples the user has no idea whether they are on the Gold or Platinum network
  • Multi-Tier virtual networking - can define application tiers and isolate based on need as well, who is connected where
  • Orchestration - He went through the Multi-Tier example and demonstrated all the steps that would have to be done manually (too many to list); all of this will be done through orchestration
  • CloudStack Orchestration Architecture (see picture) - plugin Framework allows this to happen
  • SDN works with CloudStack through the plugin model, the SDN controller talks to the plugin, today there is integration with Nicira NVP, BigSwitch, Midokura, and CloudStack Native (requires XenServer)
  • CloudStack Native Controller uses GRE and talks to Open vSwitch on the XenServer
  • All isolation happens through the concept of a tenant key over the GRE tunnels. Each tenant has a unique key
  • What makes the CloudStack controller different? 
    • It is purpose built for IaaS and is not a general purpose SDN solution
    • Proactive model - deny all flows except ones programmed by the end-user API; other solutions send unknown flows to a central controller and may have problems at scale
    • Use the CloudStack virtual router to provide L3-L7 services (mainly because most hardware doesn't understand GRE today)
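The proactive model above can be sketched as a tiny flow table: deny everything by default, and forward only flows the end-user API has explicitly programmed. This is a hypothetical illustration (the class and method names are mine, not CloudStack code):

```python
# Toy sketch of a proactive SDN flow table: default-deny, with
# asymmetric rules allowed (A->B and B->A may take different paths).
# Hypothetical illustration, not actual CloudStack controller code.

class FlowTable:
    def __init__(self):
        # (src, dst) -> action
        self.flows = {}

    def program_flow(self, src, dst, action):
        """Called by the orchestration layer on behalf of the user API."""
        self.flows[(src, dst)] = action

    def forward(self, src, dst):
        # Proactive model: unknown flows are dropped here instead of
        # being punted to a central controller (which may not scale).
        return self.flows.get((src, dst), "drop")

table = FlowTable()
table.program_flow("10.1.1.5", "10.1.2.9", "forward:gre-tunnel-42")
print(table.forward("10.1.1.5", "10.1.2.9"))  # programmed flow
print(table.forward("10.1.2.9", "10.1.1.5"))  # reverse flow not programmed
```

Note how the reverse direction stays dropped until it is programmed - the "different flow from A to B and B to A" point above.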

ApacheCon LiveBlog: Powering CloudStack w/ Ceph RBD


This is a live blog from ApacheCon that I'm attending this week.  This session is with Patrick McGarry.

Usual Live Blog Disclaimer: This is more of a brain dump, typing as fast as I can - please excuse typos, formatting, and a general lack of coherence.

(No title slide picture this time - missed it)

  • What is Ceph - storage that does object, block, and file all in one; block is thin-provisioned with snapshots and cloning - object has a REST API
  • RADOS (Google it) is the object store at the lowest level
  • Why object at the lowest level - more useful than blocks, single namespace, scales better, simple API, the workload is easily parallelized
  • Because of this: define a pool (1 to 100's), independent namespaces and object collections
  • (Topic change) - Architecture
  • Aggregate a bunch of different machines so that you have a "large enough" front end to handle a large number of incoming requests
  • In this "pile" you will have monitors. Monitors provide consensus for decisions, are always an odd number, and do not store data - they act as traffic controllers to the storage nodes (OSD nodes)
  • On an OSD node -> physical disk -> file system -> OSD layer
  • CRUSH - pseudo-random algorithm for data placement, Ceph's "secret sauce"; allows for stable mapping and uniform distribution with additional rule configuration (can apply weights, topology rules)
  • How does it work? Take an object, talk to the monitors, and CRUSH breaks it up and places it around according to the rules
  • What happens when something breaks? If an OSD node is lost, the ones with the copy of the data replicates the blocks somewhere else according to CRUSH rules and moves on
  • How to talk to it? LIBRADOS - library for RADOS, support for C, C++, Java, Python, Ruby, PHP
  • Also RADOSGW - Rest gateway compatible with S3 & Swift
  • Ceph FS - A POSIX-compliant distributed file system with a Linux kernel client
  • RBD - reliable and fully-distributed block device sitting on top of the object store
  • RADOS Block Device (RBD) - storage of disk images in RADOS; allows decoupling of the VM from the host; images are striped across the pool; snapshots; copy-on-write clones
  • What does this look like? vm's are now split across the cluster, great for large capacity as well as high I/O instances of vm's
  • same model as Amazon EBS
  • it is a shared environment, so you can migrate running instances across cluster
  • Copy-On-Write Cloning (he gets lots of questions on this) - think of a golden-image master VM you want 100 copies of: you spin up the 100 instantly, and additional storage is consumed only as the VMs grow.
  • Question: Is there a performance impact to this? A: No, but as usual it depends on the architecture (how many devices are hitting it)
  • CloudStack 4.0 and RBD? Via KVM only - no Xen or VMware support today
  • Live migrations are supported
  • No snapshots yet
  • NFS still required for system vm's
  • Can be added easily as RBD Primary storage in CloudStack
  • Snapshot and backup support should be coming in version 4.2, cloning is coming, and support for secondary (backup) storage is coming in 4.2
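The key idea behind CRUSH above is that placement is *computed* from the object name and cluster map rather than looked up in a central table, so any client finds data independently and losing an OSD only moves the objects mapped to it. Here is a deliberately simplified rendezvous-hash sketch of that idea - not the real CRUSH algorithm, and all names are illustrative:

```python
# Toy illustration of CRUSH-style placement: deterministic, computed
# mapping from object name to storage daemons (OSDs). Simplified
# rendezvous hashing, NOT the actual CRUSH algorithm.
import hashlib

def place(obj_name, osds, replicas=2):
    """Pick `replicas` OSDs deterministically for an object."""
    def score(osd):
        h = hashlib.md5(f"{obj_name}:{osd}".encode()).hexdigest()
        return int(h, 16)
    # Every client computes the same ranking -> stable mapping,
    # no central lookup table needed.
    return sorted(osds, key=score, reverse=True)[:replicas]

osds = ["osd.0", "osd.1", "osd.2", "osd.3"]
replica_set = place("disk-image-chunk-0007", osds)
print(replica_set)

# If an OSD dies, only objects that mapped to it move; placement of
# everything else stays stable - re-run place() on the survivors.
survivors = [o for o in osds if o != replica_set[0]]
print(place("disk-image-chunk-0007", survivors))
```

Real CRUSH adds weights and topology rules (racks, hosts) on top of this basic "stable computed placement" idea, as noted in the session.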



ApacheCon LiveBlog: DevCloud - A CloudStack SandBox



This is a live blog from ApacheCon that I'm attending this week.  This session is with Sebastien Goasguen.

Usual Live Blog Disclaimer: This is more of a brain dump, typing as fast as I can - please excuse typos, formatting, and a general lack of coherence.

  • Today's talk will focus on DevCloud and CloudMonkey
  • Sebastien giving overview of IaaS market in general. He was actually an OpenNebula guy prior to CloudStack
  • With IaaS setting up a virtual sand box can be tricky since there are a larger number of moving parts: hypervisors, storage, networking
  • DevOps - quick introduction to DevOps to help everyone understand why this is such a big movement in the industry right now (bringing development closer to operations)
  • This helps us set up an environment to enable a software defined datacenter that allows for automation at all levels
  • Now talking about ASF (Apache Software Foundation) and CloudStack. He has a LOT of analysis around the community. The growth once joining as an incubation project shows a HUGE spike (CloudStack is now the #1 Apache project when it comes to commits)
  • On to the internals of CloudStack, goal is to be as agnostic as possible (multi-hypervisor, both block and object storage) 
  • Network tends to be the most challenging for new folks (firewall, load balancing, basic networking vs. advanced networking, VPN, etc.) - See the bottom line on the picture above
  • Apache CloudStack 4.0 was released in November, 4.0.1 was just released, and 4.1 is set for March. The goal is a new release every six months
  • Architecture -> Zone(datacenter) -> Pods(rack) -> Cluster (hosts) -> primary & secondary storage -> Instance (virtual machines)
  • Centralized management server - can be multiple management servers behind a load balancer and replicated MySQL for large scale
  • system vm's are used to communicate from the management server to some features (firewall, secondary storage, etc.)
  • (Topic change) - What is DevCloud - CloudStack in a box, aimed at developers but can be a local EC2/S3 "cloud in a box"
  • Self-contained - CloudStack management server, ttylinux (to stay small), system VMs, MySQL, and the interface all on one laptop; on a beefy laptop expect a good number of instances
  • What is CloudMonkey - cloudstack CLI - great for auto-completion of features, tabular output, help, scriptable, shell interaction, etc.
  • Intro - Launch CloudMonkey, you now have a shell to talk to your cloud, need to do a key exchange, then ready to access your devCloud instance
  • Demo Time - He is running VirtualBox on a Mac Book Air, he is using a NAT interface, forwarding a few ports needed (8080, 2222, 8443, 5901, 7080) - The vm uses nested vm's to launch inside the virtual machine on the laptop
  • 2nd Demo - He is also running the 4.0.1 release on his laptop directly from the source code instead of the DevCloud VM.
  • Back to DevCloud - He shows the system vm's up and running and an instance that is halted.
  • Went into the Web UI - Gave an overview of the infrastructure; you will have a zone and pod that are already defined (named devcloud), and from there a single host as well
  • Secondary storage - NFS storage is built in and emulated, primary storage is "local". No need to stand up an external NFS service
  • templates - the system vm's and the small linux template are already included.
  • Sebastien went through creation of a new instance using the included tiny instance and shows everything spinning up.
  • You can take snapshots (saves to secondary storage)
  • The first time a template is used it is pulled from secondary storage and copied down to primary storage
  • Global Settings - turn on the EC2 API feature if you want to run EC2 commands against it
  • Now going over CloudMonkey features
  • First thing, set the API key (get this from the UI)
  • Now you can do common tasks (list virtual machines, start/stop virtual machines, etc.)
  • Another way to use DevCloud: different network type, 2 vNICS, one host only and one NAT
  • Build it from source (need Maven dependencies), deploy the database, basically build it yourself. Because you build it this way, there are no zones, pods, etc.  You build everything yourself.
  • One thing you can do with this is build your entire infrastructure from scripts. This allows you to test build process of CloudStack for replication.  This is a very powerful use case!
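Setting the API key is the first CloudMonkey step above because every CloudStack API call must be signed with the API/secret key pair from the UI. A minimal sketch of that signing scheme (sort the parameters, lowercase the query string, HMAC-SHA1, base64) - the keys and URL here are placeholders, and you should check the CloudStack docs for the exact details:

```python
# Sketch of CloudStack API request signing, as a client like
# CloudMonkey would do it. Keys below are placeholders.
import base64
import hashlib
import hmac
import urllib.parse

def sign_request(params, secret_key):
    # Sort parameters by name and URL-encode the values, then
    # lowercase the whole query string before signing it.
    query = "&".join(
        f"{k}={urllib.parse.quote(str(v), safe='*')}"
        for k, v in sorted(params.items())
    )
    digest = hmac.new(secret_key.encode(),
                      query.lower().encode(),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

params = {"command": "listVirtualMachines",
          "apikey": "PLACEHOLDER-API-KEY",
          "response": "json"}
params["signature"] = sign_request(params, "PLACEHOLDER-SECRET")
# The signed request would then go to something like
#   http://localhost:8080/client/api?<urlencoded params>
print(params["signature"])
```

Once the key exchange is done, CloudMonkey wraps calls like this behind its tab-completed shell, which is why "set the API key" is step one.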

Really great presentation and great overview to those new to CloudStack and DevCloud!





Tuesday, February 26, 2013

ApacheCon LiveBlog: Object Storage with CloudStack & Hadoop


This is a live blog from ApacheCon that I'm attending this week.  This session is with Chiradeep Vittal.

Usual Live Blog Disclaimer: This is more of a brain dump, typing as fast as I can - please excuse typos, formatting, and a general lack of coherence.


  • How does Amazon build a cloud:
    • Commodity Hardware -> Open Source Xen -> AWS Orchestration Software -> AWS API -> Amazon eCommerce Platform
    • How would YOU build the same cloud on CloudStack - You can in much the same way: Hardware -> Hypervisor -> CloudStack -> API -> Customer Solution
  • CloudStack is built in the concept of a Zone (much like an AWS Zone)
    • Under the zone is a logical unit of Pods (think of it as a rack)
  • Secondary Storage is used for templates, snapshots, etc. (items that are stored and not changed often, and need to be shared across pods)
  • Cloud Style Workloads = low cost, standardized hardware, highly automated & efficient (it's the Pets vs. Cattle analogy)
  • At scale, everything breaks eventually
  • Regions and Zones - e.g. Region "West"; the hope is that a Region will not go down when another Region goes down - replication from one Region to another is the norm
  • Secondary Storage in CloudStack 4.0 today
    • NFS is the server default - mounted by any CloudStack Hypervisor, easy to set up
    • BUT - it doesn't scale well, it's "chatty", and it may need WAN optimization. What if 1000 hypervisors talk to one NFS share?
    • At large scale NFS shows some issues
    • One solution is use object storage for secondary storage
  • Object Storage has redundancy, replication, auditing built in to the technology typically
  • In addition, this technology enables other applications - put an API server in front of the object store and you now have "Dropbox", etc.; typically static content and archival kinds of applications
  • Object storage offers 99.9% availability and 99.999999999% (eleven 9s) durability according to Amazon S3, plus massive scale (1.3 trillion objects in AWS today serving 800k requests per second)
  • Scalable objects cannot be modified, only deleted (they are immutable objects)
  • Simple API with a flat namespace - think KISS principle
  • CloudStack S3 API Server - understands the Amazon S3 API with a pluggable backend; the default backend is a POSIX filesystem (not very useful in production); Caringo was mentioned as a replacement, as was HDFS
  • Question - Does CloudStack handle all the ACL's / Answer: Yes
  • Follow-up - Does that mean the SQL database is a possible constraint? / Answer: Yes
  • Integrations are available with Riak CS and OpenStack Swift
  • Upcoming in CloudStack 4.2 - Framework to expand this much more
  • Given all of this, what could we build? (Topic switch)
  • Want an Open Source, scales to 1 billion objects, reliability & durability on par with S3, S3 API
  • This is now a theoretical design (hasn't been tested)
  • (See picture for architecture)

  • Hadoop meets all of these requirements and is proven to work (200 million objects in 1 cluster, 100PB in 1 cluster), need to scale, just add a node, very easy
  • BUT - Name Node scalability (at 100's of millions of blocks, could run into GC issues); the Name Node is a SPOF (Single Point of Failure) - this is being worked on currently; cross-zone replication (Hadoop has rack awareness, but what if nodes are further apart?) - this isn't really tested today; and where do you store metadata (ACLs, for instance)?
  • Take a 1-billion-object example (bunch of assumptions here) - needs about 450GB per name node; at 16TB / node that's 1000 data nodes
  • Name Node management is federated (sorry this is vague, getting beyond my knowledge of Hadoop architecture at this point). Name Node and HA really hasn't been tested to date
  • Namespace shards - how do you shard them? Do you need a DB just to store this? What about rebalancing between name nodes?
  • Replication over lossy/slower links (solution really breaks down here today)
    • Async replication - how do you handle master/slave relationships?
    • Sync - not very feasible if you lose a zone (writes never acknowledged so will not continue)
  • Where do you store Metadata?
    • Store in HDFS along with the object, reads become expensive and meta data is mutable (needs to be edited), needs a layer on top of HDFS
    • Use another storage system (like HBase) - required for Name node federation anyway, but ANOTHER system to manage
    • Modify the Name Node to store the metadata
      • high performance (doesn't exist today)
      • not extensible and not easy to just "plug in"
  • What can you do with Object Store in HDFS today?
    • Viable for small size deployments - up to 100-200 million objects (Facebook does this) with datacenters close together
    • Larger deployments needs development and there is really no effort around this today
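The object-store semantics described above - a flat namespace of immutable objects that can only be created, read, or deleted, never modified in place - can be captured in a few lines. This is a toy in-memory model of those semantics, not the CloudStack S3 server:

```python
# Minimal model of the immutable, flat-namespace object store
# discussed in the session. Hypothetical sketch, in-memory only.

class ObjectStore:
    def __init__(self):
        self.objects = {}  # flat namespace: key -> bytes

    def put(self, key, data):
        # Immutability: writes never modify an existing object.
        if key in self.objects:
            raise ValueError("objects are immutable; delete first")
        self.objects[key] = bytes(data)

    def get(self, key):
        return self.objects[key]

    def delete(self, key):
        del self.objects[key]

store = ObjectStore()
store.put("templates/tiny-linux.img", b"...image bytes...")
print(store.get("templates/tiny-linux.img"))
```

Immutability is what makes the simple API and massive scale possible: replicas never have to coordinate partial updates, only whole-object creates and deletes.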

ApacheCon LiveBlog: CloudStack Top 10 Network Issues


This is a live blog from ApacheCon that I'm attending this week.  This session is by Kirk Kosinski.

Usual Live Blog Disclaimer: This is more of a brain dump, typing as fast as I can - please excuse typos, formatting, and a general lack of coherence.



  • Kirk was an original cloud.com support engineer so he has seen a LOT over the years
  • # 1 Issue - VLANS! - biggest single reason for issues in CloudStack, check switch misconfiguration (Are all VLANs trunked by default?)
    • Does DHCP work for only some of the VMs? That's a leading indicator of this problem - the VMs running on the same host work, but the VLANs are messed up
    • So many reasons why VLANs could be a problem, this can be very hard to troubleshoot depending on the complexity of your environment (firewalls, layers of switches, etc)
  • #2 - Hypervisor problems - mostly network related again - NIC drivers, bonding (especially Xen), cabling, etc.
    • don't try to manually hack your management server database!
  • #3 Open vSwitch on XenServer - It is the default now. Make sure you run the latest patches!
  • #4 Security Groups - KVM, works out of the box most of the time, Xen, must enable Linux bridge back-end, must install Cloud Supplemental Pack (XS < 6.1), doesn't work on vSphere currently
  • #5 Host Connectivity - between hypervisors to system vm's and secondary storage
  • #6 CloudStack "Physical Networks" - not necessarily "physical", traffic labels - multiple NICS, etc.
  • #7 Console Proxy virtual machine - Connectivity from management server to end users web browser
    • check realhostip.com connection, check SSL cert status
  • #8 Templates - was it eth0 and you are now using eth1?, sysprep for Windows errors
  • #9 Password Reset Feature - reset script problems, check DHCP client & version
    • Daemon Problems - check 8080/tcp on virtual router (socat process, stop and restart)
  • #10 User and Meta-Data - Start/Stop vm, Start/Stop virtual router, Destroy/Recreate virtual router, check management-server.log
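Several of the tips above (the password-reset daemon on 8080/tcp, console proxy connectivity) start with the same question: does the port answer at all? A small helper in that spirit - the router IP below is a placeholder, not a value from the talk:

```python
# Quick TCP reachability check, useful for tips like "check 8080/tcp
# on the virtual router" before digging into daemon internals.
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. check the password-reset daemon on a virtual router
# (10.1.1.1 is a placeholder address):
print(port_open("10.1.1.1", 8080, timeout=0.5))
```

If the port is closed, stopping and restarting the socat process on the virtual router (per tip #9) is the next step; if it is open, move up the stack.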

ApacheCon LiveBlog: CloudStack's Plugin Model

This is a live blog from ApacheCon that I'm attending this week.  This session is by Don Lafferty.

Usual Live Blog Disclaimer: This is more of a brain dump, typing as fast as I can - please excuse typos, formatting, and a general lack of coherence.

 


  • Open Source Community Leadership Drives Enterprise-Grade Innovation is the opening bullet
    • Of course, CloudStack's plugin model permits this!
  • This presentation will be a case study of adding Hyper-V support into CloudStack as a newcomer (Don himself is a newcomer) - it shows where Don is coming from and the challenges he faced getting started in this ecosystem
  • To learn to plug into CloudStack, break it into pieces to make the learning curve more manageable: hardware management (provisioning plugins), CloudStack orchestration, adapters (to bridge orchestration & provisioning), and the framework
  • A plugin serves two masters: the server component (Java, adapter APIs, RESTful APIs, etc.) and the server resource (an agent proxy, e.g. KVM, or direct connect, e.g. Xen)
  • Follow the Apache Process for New Features:
    • Announce over mailing list
    • Publish Spec & Design
    • JIRA Ticket
    • Setup Dev Environment
    • Branch on github, use your own public branch
    • Submit changes to Review Board
  • Decide which wiki you want to use: Incubator Wiki (cleaner, simpler) or CloudStack Wiki (more in depth, harder for newcomers) - DO NOT use the pre-Apache wiki!
  • Don recommends breaking the project into small steps to start just to learn the Apache process. Once you have the process down, then move onto more complex development
  • For the Hyper-V example he broke this into Phase 1 (talk over Message Bus to talk to an agent) and Phase 2 (WS-Management to the WMI layer)
  • Reuse and Repurpose rather than Rewrite!  (There is a ton of CloudStack code that exists, use it!)
  • Don discusses Phase One - He borrowed code from the KVM version of the agent and communicated over the message bus and combined with code from the Hyper-V plugin for OpenStack
  • Don went into some code examples and command examples - over my head :) - Takeaway: how do you want to structure the "conversation" between the management server and the agent.
  • Pay attention to the development mailing lists and see if a development trend helps solve your issue (e.g. NFS to secondary storage in CloudStack - Hyper-V prefers an SMB connection, and there was a project already going on to make this happen, so no need to write that code)
  • Make preparations for IP clearance - these things take time, and the source code must be cleared for donation to the Apache Foundation so everything is kosher from a legal standpoint.
  • Session wrapped up with Q&A around how much of the learning curve is Apache related vs. CloudStack related.  It is a "two headed monster" to get going.  You have to learn the process and you have to learn the product.  They go hand in hand.
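The "plugin serves two masters" idea above can be sketched as a small adapter registry: orchestration only sees a common interface, and each hypervisor ships a server resource implementing it (an agent proxy for KVM, a new WS-Management-based resource for Hyper-V). All class and method names here are illustrative, not CloudStack's actual Java interfaces:

```python
# Toy sketch of the CloudStack plugin/adapter idea from the session.
# Orchestration stays hypervisor-agnostic by talking to an interface.

class ServerResource:
    def execute(self, command):
        raise NotImplementedError

class KvmAgentProxy(ServerResource):
    def execute(self, command):
        # Real code would forward the command over the message bus
        # to an agent running on the KVM host.
        return f"kvm-agent handled {command}"

class HyperVResource(ServerResource):
    def execute(self, command):
        # Phase 2 of the talk: WS-Management down to the WMI layer.
        return f"hyperv handled {command} via WS-Man"

REGISTRY = {"kvm": KvmAgentProxy(), "hyperv": HyperVResource()}

def orchestrate(hypervisor, command):
    # Orchestration only sees ServerResource.execute(); adding a
    # hypervisor means registering one new implementation.
    return REGISTRY[hypervisor].execute(command)

print(orchestrate("hyperv", "StartCommand"))
```

This is why Don could reuse the KVM agent's message-bus plumbing for phase one: the orchestration side never cares which resource is on the other end.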