This is a live blog from ApacheCon that I'm attending this week. This session is with Chiradeep Vittal.
Usual Live Blog Disclaimer: This is more of a brain dump, typing as fast as I can, so please excuse typos, formatting, and coherent thought process in general.
- How does Amazon build a cloud:
- Commodity Hardware -> open source Xen hypervisor -> AWS Orchestration Software -> AWS API -> Amazon eCommerce Platform
- How would YOU build the same cloud on CloudStack - You can in much the same way: Hardware -> Hypervisor -> CloudStack -> API -> Customer Solution
- CloudStack is built around the concept of a Zone (much like an AWS Zone)
- Under the Zone are logical units called Pods (think of a Pod as a rack)
- Secondary Storage is used for templates, snapshots, etc. (items that don't change often and need to be shared across Pods)
- Cloud Style Workloads = low cost, standardized hardware, highly automated & efficient (it's the Pets vs. Cattle analogy)
- At scale, everything breaks eventually
- Regions and Zones - e.g. a Region "West"; the hope is that one Region will not go down when another Region goes down. Replication from one Region to another is the norm
- Secondary Storage in CloudStack 4.0 today
- NFS is the default - it can be mounted by any hypervisor CloudStack supports, and it's easy to set up
- BUT - it doesn't scale well, it's "chatty", and it may need WAN optimization. What if 1000 hypervisors talk to one NFS share?
- At large scale NFS shows some issues
- One solution is to use object storage for secondary storage
- Object storage typically has redundancy, replication, and auditing built into the technology
- In addition, this technology enables other applications: put an API server in front of the object store and you now have "Dropbox", etc. - typically static content and archival kinds of applications
- Object storage offers 99.9% availability and 99.999999999% (eleven 9s) durability according to Amazon S3, plus massive scale (1.3 trillion objects in AWS today, serving 800K requests per second)
- To stay scalable, objects cannot be modified, only deleted (they are immutable)
- Simple API with a flat namespace - think KISS principle (a minimal sketch of the model follows)
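To make the flat-namespace, immutable-object model concrete, here is a minimal Python sketch using boto3 against S3. The bucket and key names are made up for illustration; this is my addition, not anything shown in the session.

```python
# Flat namespace: "directories" are just key prefixes; there is no real hierarchy.
# Immutable objects: you can only PUT a whole new object or DELETE one,
# never edit bytes in place.
import boto3

s3 = boto3.client("s3")

with open("root.qcow2", "rb") as body:
    s3.put_object(Bucket="example-secondary-storage",     # hypothetical bucket
                  Key="templates/centos-6.3/root.qcow2",  # flat key; prefix only looks like a path
                  Body=body)

# To "change" the object you replace it entirely or delete it.
s3.delete_object(Bucket="example-secondary-storage",
                 Key="templates/centos-6.3/root.qcow2")
```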
- CloudStack S3 API Server - understands the Amazon S3 API with a pluggable backend; the default backend is a POSIX filesystem (not very useful in production). Caringo was mentioned as a replacement backend, as was HDFS
- Question - Does CloudStack handle all the ACLs? / Answer: Yes
- Follow-up - Does that mean the SQL database is a possible constraint? / Answer: Yes
- Integrations are available with Riak CS and OpenStack Swift
- Upcoming in CloudStack 4.2 - a framework to expand this much further (a sketch of talking to an S3-compatible endpoint follows)
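Because the CloudStack S3 API server speaks the Amazon S3 API, a standard S3 client should be able to talk to it just by overriding the endpoint. A hedged sketch; the endpoint URL and credentials are placeholders I made up.

```python
# Point a standard S3 client at an S3-compatible endpoint (the CloudStack S3 API
# server, Riak CS, or Swift behind S3 middleware). Endpoint and keys are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://cloudstack-s3.example.com:8080",  # hypothetical endpoint
    aws_access_key_id="EXAMPLE_ACCESS_KEY",
    aws_secret_access_key="EXAMPLE_SECRET_KEY",
)

s3.create_bucket(Bucket="snapshots")
s3.put_object(Bucket="snapshots", Key="vm-42/2013-02-24.vhd", Body=b"...")
print(s3.list_objects(Bucket="snapshots")["Contents"][0]["Key"])
```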
- Given all of this, what could we build? (Topic switch)
- Want: open source, scales to 1 billion objects, reliability & durability on par with S3, and an S3 API
- This is now a theoretical design (hasn't been tested)
- (See picture for architecture)
- Hadoop meets all of these requirements and is proven to work (200 million objects in one cluster, 100PB in one cluster); to scale you just add a node - very easy (a sketch of mapping objects onto HDFS follows)
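To show roughly what "objects on HDFS" could look like, here is a minimal sketch using the `hdfs` WebHDFS client for Python. The NameNode address and the bucket/key-to-path layout are my assumptions, not something from the talk.

```python
# Map each object onto one immutable HDFS file via WebHDFS.
# NameNode URL and the /objectstore/<bucket>/<key> layout are assumptions.
from hdfs import InsecureClient

client = InsecureClient("http://namenode.example.com:50070", user="cloud")

def put_object(bucket, key, data):
    # One object = one file; "modifying" it means rewriting the whole file.
    client.write("/objectstore/%s/%s" % (bucket, key), data, overwrite=True)

def get_object(bucket, key):
    with client.read("/objectstore/%s/%s" % (bucket, key)) as reader:
        return reader.read()

put_object("templates", "centos-6.3/root.qcow2", b"...")
```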
- BUT: Name Node scalability (at hundreds of millions of blocks you can run into GC issues); the Name Node is a SPOF (Single Point of Failure), which is being worked on currently; cross-zone replication (Hadoop has rack awareness, but what if the racks are much further apart?) isn't really tested today; and where do you store metadata (ACLs, for instance)?
- Take a 1 billion object example (with a bunch of assumptions): you'd need about 450GB of memory on the Name Node, and at 16TB per data node that's roughly 1000 data nodes (rough math sketched below)
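Here is the back-of-envelope math behind that bullet. The per-object Name Node heap cost, average object size, and replication factor are my own assumptions, picked so the totals land near the numbers quoted in the talk.

```python
# Rough capacity math for 1 billion objects. All per-object constants are assumptions.
objects = 1_000_000_000

heap_bytes_per_object = 450        # assumed NameNode heap for one file + its blocks
print("NameNode heap ~ %.0f GB" % (objects * heap_bytes_per_object / 1e9))  # ~450 GB

avg_object_mb = 5                  # assumed average object size
replication = 3                    # default HDFS replication factor
raw_tb = objects * avg_object_mb * replication / 1e6
disk_tb_per_node = 16
print("Data nodes ~ %.0f" % (raw_tb / disk_tb_per_node))  # ~940, i.e. on the order of 1000
```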
- Name Node management would have to be federated (sorry this is vague - getting beyond my knowledge of Hadoop architecture at this point). Name Node HA really hasn't been tested to date
- Namespace shards - how do you shard them? Do you need a database just to store the shard map? What about rebalancing between Name Nodes? (a simple hash-based sharding sketch follows)
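One simple way (my assumption, not from the talk) to shard a flat namespace across federated Name Nodes is to hash the bucket/key; it also makes the rebalancing problem obvious, since adding a shard changes the mapping for most keys.

```python
# Hash-based namespace sharding across federated NameNodes (shard list is hypothetical).
import hashlib

NAMENODE_SHARDS = ["nn1.example.com", "nn2.example.com", "nn3.example.com"]

def shard_for(bucket, key):
    digest = hashlib.md5(("%s/%s" % (bucket, key)).encode()).hexdigest()
    return NAMENODE_SHARDS[int(digest, 16) % len(NAMENODE_SHARDS)]

print(shard_for("templates", "centos-6.3/root.qcow2"))
```

Consistent hashing would reduce how many keys move on a rebalance, but either way you need somewhere durable to keep the shard map.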
- Replication over lossy/slower links (the solution really breaks down here today)
- Async replication - how do you handle master/slave relationships?
- Sync replication - not very feasible if you lose a zone (writes would never be acknowledged, so they will not continue)
- Where do you store Metadata?
- Store it in HDFS along with the object - reads become expensive, and metadata is mutable (it needs to be edited), so this needs a layer on top of HDFS
- Use another storage system (like HBase) - required for Name Node federation anyway, but it's ANOTHER system to manage (see the sketch after this list of options)
- Modify the Name Node to store the metadata
- would give high performance (but doesn't exist today)
- would not be extensible, and not easy to just "plug in"
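A minimal sketch of the "use another storage system" option: keep the mutable metadata (ACLs, content type) in HBase rows keyed by bucket/key, while the immutable bytes stay in HDFS. This uses the happybase client; the Thrift host, table name, and column names are assumptions.

```python
# Mutable metadata in HBase, keyed by bucket/key; object bytes live elsewhere (HDFS).
# Host, table, and column names are assumptions.
import happybase

conn = happybase.Connection("hbase-thrift.example.com")
meta = conn.table("object_metadata")

def set_acl(bucket, key, acl):
    meta.put(("%s/%s" % (bucket, key)).encode(), {b"meta:acl": acl.encode()})

def get_acl(bucket, key):
    row = meta.row(("%s/%s" % (bucket, key)).encode())
    return row.get(b"meta:acl", b"").decode()

set_acl("templates", "centos-6.3/root.qcow2", "public-read")
print(get_acl("templates", "centos-6.3/root.qcow2"))
```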
- What can you do with Object Store in HDFS today?
- Viable for small deployments - up to 100-200 million objects (Facebook does this) with datacenters close together
- Larger deployments need development work, and there is really no effort around this today