
Thoughts on AWS

The more I think about it, the more I realize that we’re witnessing a transformation in IT. Despite the AWS tattoo and kool-aid mustache, I’m not married to AWS. Nothing’s perfect, and AWS is no exception. There are always questions about vendor lock-in and backing out or shifting services as the landscape changes.

Vendor lock-in is a possibility with every IT partnership, and there are ways to mitigate those risks. If Cisco, Red Hat, IBM, or Oracle went out of business, it would have a huge impact; the same goes if any of them changed their pricing structure. In a practical sense, you usually have time to react. AWS might get purchased, or the CEO might change, but it’s unlikely that we’ll be informed on Monday that they’re shutting down and we have three days to relocate services. With AWS, there’s no upfront cost: you pay as you go. A provider can’t succeed with that model without a rock-solid service. Furthermore, there’s no financial lock-in: no multi-million dollar contracts or support agreements, which are probably the most common and painful cases. AWS frees us from those kinds of lock-in.

AWS isn’t outsourcing; it’s a force multiplier. I guarantee that OIT engineers, with minimal additional training, can build more secure systems with better architectures in Amazon than they could on-premises, for a fraction of the cost and time. Infrastructure is hard. It takes massive resources. AWS has the economies of scale and has been able to execute. The result is a game-changing, historic transformation of IT. Really, look at it: it’s that profound. It enables experimentation and simplification. Take sizing systems. Just rough out your scaling, pick something larger than you need, and scale it back after two months of usage data. Calculate the low nominal use and scale it back further, then auto-scale for the peak. Script it and reproduce everything: network, servers, security policies, all the things. That kind of architecture is an engineer’s dream, for pennies per hour, now, this instant, not years from now.

AWS or not, there is no question that this is the future of IT. Except for low-latency, high-performance, or very specialized applications that require a local presence, most companies will, in my opinion, shed their data centers for providers like AWS. My guess: it’s going to come faster than we think. It’s not that we can’t accomplish these things, or that we don’t understand them. We simply don’t have the resources. Take long-term storage and archiving. Glacier and S3 have eleven 9s of durability. Elasticity? I can upload 500 TB into Glacier tomorrow, or a petabyte, no problem. Spin up 10 machines, 100 machines, 1000 machines: done. Chris did a data warehouse in a couple of days. There’s really no turning back.

I suspect we’ll be cautious, as we should be, moving services to AWS. But my forecast is that with each success, it will become apparent that we should do more. It’s going to become so compelling that you can’t look away.

I’m really glad we’ve picked AWS for IaaS. It is a huge opportunity.

Scripting AWS with Boto

It’s pretty easy to get yourself an Amazon Web Services account, log into the web console, and start firing up instances left and right.  It feels awesome, and it should, because you are doing a lot of cool things with just a few mouse clicks.  Soon, however, that buzz is going to wear off, because working this way is not sustainable at scale, and it is not easily repeatable.  If you want to get to #DevOps nirvana, creating and destroying instances automatically with each branch commit, you are going to have to abandon that GUI and get to the command line. 

So you google up “aws command line,” and behold: AWS Command Line Interface.  These tools are great, and probably worth a blog post in themselves.  They are an attempt to provide a unified CLI to replace prior offerings that were unique to each service (e.g., an ec2 tool, a CloudFormation tool), and they let you do quite a bit from the command line. But wait, because if you want to control everything in AWS while also leveraging the power of a mature scripting language, you want Boto for Python.

First, we’ll need Python.  Mac and Linux users are good to go; Windows users will need to install it first.  We’re also going to need the Python package manager, called pip.  For some reason it doesn’t come with Python, but there’s a Python tool that’ll fetch it for us.  This worked on my Mac:

$ sudo easy_install pip

Okay, now you’d better learn some basic Python syntax. Go ahead; I’ll wait.

Good.  Now we’re ready.  Install the boto SDK with pip:

$ pip install boto

Then create a file called .boto in your home directory with the following contents:

[Credentials]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY

Where to get those credentials?  I’ll let this guy tell you.
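
To sanity-check that boto can actually see those credentials, fire up the Python interpreter and make any read-only call. Here’s a minimal example, assuming your keys have S3 access (the bucket list will just be empty if you haven’t created any buckets yet):

$ python
>>> import boto
>>> s3 = boto.connect_s3()     # reads the keys from ~/.boto
>>> s3.get_all_buckets()       # any read-only call will do
[]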

Now we’re ready for some examples.  Readers in the OIT can check out the full scripts here.  Let’s start by making a Virtual Private Cloud, aka a virtual data center.  VPCs are constructs that, despite their relative importance in a cloud infrastructure, are quite simple to create.  From create_vpc.py:

#!/usr/bin/python
import boto
from boto.vpc import VPCConnection

You’ve got the typical shebang directive, which invokes the python interpreter and lets us execute the script directly.  Then the important part: importing boto.  The first import is really all you need, but if you want to be able to reference parts of the boto library without fully qualifying them, you’ll want to do something like line 3 there.  This lets us reference “VPCConnection” instead of having to say “boto.vpc.VPCConnection.”

The parameters for creating a VPC are…

  • name: friendly name for the VPC (we’ll use VPC_ND_Datacenter.  This is actually optional, but we’re going to tag the VPC with a “Name” key after we make it)
  • cidr_block: a Classless Inter-Domain Routing block, of course (alright, Bob Winding helped me).  Example: 10.0.0.0/16, which gives the VPC the 65,536 private addresses from 10.0.0.0 through 10.0.255.255
  • tenancy: default / dedicated.  Dedicated requires you to pay a flat fee to Amazon, but it means you never have to worry that you’re sharing hardware with another AWS customer.
  • dry_run: set to True if you want to run the command without persisting the result

 

Basically the cidr_block is the only thing you really even need.  I told you it was a simple construct!  Note that the variables cidr, tenancy, and dryrun hold each of these parameters in the create_vpc call below.

c = VPCConnection()
datacenters = c.get_all_vpcs(filters=[("cidrBlock", cidr)])
if not len(datacenters):
    datacenter = c.create_vpc(cidr_block=cidr, instance_tenancy=tenancy, dry_run=dryrun)
    print datacenter
else:
    datacenter = datacenters.pop(0)
    print "Requested VPC already exists!"

datacenter.add_tag("Name", "My awesome VPC")

First, get the VPCConnection object, which has the methods we’ll use to list/create a VPC.  Note that this is available due to the “from..import” above.  Next, use the method “get_all_vpcs” with a filter to check that no VPC with this cidr block already exists.

If that list is empty, we’ll call create_vpc.  Otherwise, we’ll print a message that it already exists.  We can also pop the first item off the list, and that’s the object representing the existing VPC.  Finally, we’ll add a tag to name our VPC.

This stuff is that easy.
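
Want a taste of those next steps?  Here’s a minimal sketch of carving a subnet out of the VPC and adding a security group with the same connection object.  The CIDR block, group name, and port are placeholders I made up, not values from the full scripts:

# Carve a /24 subnet out of the VPC's /16
subnet = c.create_subnet(datacenter.id, '10.0.1.0/24')
print subnet

# VPCConnection inherits from EC2Connection, so it can manage security groups too
sg = c.create_security_group('web-servers', 'Allow inbound web traffic', vpc_id=datacenter.id)
sg.authorize(ip_protocol='tcp', from_port=443, to_port=443, cidr_ip='0.0.0.0/0')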

Once you divide that VPC into subnets and create a security group or two, how about creating an actual EC2 instance?

from boto.ec2.connection import EC2Connection

ec2 = EC2Connection()

reservation = ec2.run_instances(
    image_id='ami-83e4bcea',
    subnet_id='MY_SUBNET_ID',
    instance_type='t1.micro',
    security_group_ids=['SOME_SECURITYGROUP_ID', 'ANOTHER_ONE'])

reservation.instances[0].add_tag("Name", "world's greatest ec2 instance")

Similar setup here.  We’re using the EC2Connection object instead.  Note that the run_instances method doesn’t pass back the instance directly, but gives you a “reservation” object that holds an array of instances.  It launches a single instance by default, though you can pass min_count and max_count to start several at once.  Anyway, we tag it and boom!  Here’s our instance, initializing:

i could do this all day

and to think you wanted to click buttons
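
If you’d rather watch from the script than from the console, the instance object can poll its own state.  A minimal sketch (the ten-second sleep is arbitrary):

import time

instance = reservation.instances[0]
while instance.update() != 'running':   # update() refreshes the instance and returns its state
    time.sleep(10)
print "%s is running at %s" % (instance.id, instance.private_ip_address)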

We’ve got more work to do before we can create this instance and provision it via puppet, deploy applications on it, run services, and make it available as a Notre Dame domain.  Still, this is a great start.  Maybe some OIT folks want to jump in and help!  Talk to me, and be sure to check out the full boto API reference for all available methods.

Rearview mirrors

In 1911, Ray Harroun entered a car in the inaugural Indianapolis 500 automobile race.

Though primitive by today’s standards (the car averaged roughly 75 mph over the course of the 500 mile race), it was a massively controversial vehicle in its day.

Why?

Because it was the first car in history to be equipped with a rearview mirror.  This put Mr. Harroun at a considerable advantage, because his car was the only one that did not have a passenger mechanic on board.  Eliminating an onboard mechanic represented a significant weight savings, which Harroun used to his advantage, winning the race by just over half a mile.

Three years later, automobile manufacturers started putting rearview mirrors on cars available to the public.

Since then, racing has continued to improve the breed.  In 1961, Giotto Bizzarrini combined his engineering talent with that of Piero Drogo, a car body specialist.  Embracing the Kammback design, they created a unique-looking Ferrari racing car whose long tail section minimized air resistance, allowing for higher speeds.


In 1997, Toyota introduced the Prius, the first widely available hybrid electric car aimed at the mass market.  Look closely at the profile, and you will see echoes of the Kammback principle.  Instead of winning races, the slippery aerodynamics help the Prius compete against the wind in order to deliver more miles per gallon.


Today, there are many pieces of automotive technology so mundane to us that we take them for granted.  Anti-lock brakes.  Disc brakes.  Rearview mirrors.  How many of us drive cars without rearview mirrors?

Step back and look at the infrastructure as a service landscape.  Private industry is our race team.  They are working constantly, pouring effort, energy, and capital into improving the breed.  We are the fortunate auto manufacturer on the receiving end of the trickle-down effect.  Technologies that we fantasized about twenty years ago are available at our fingertips.

Though cutting edge for higher education, IaaS and DevOps have proven their mettle in the stiffest competition.  Let us step forward confidently and learn from our racing team, figuring out how to operationalize the use of these technologies in our environment.

And for those of you interested in learning more about technologies that we take for granted whose roots are in racing, please explore this interesting article.

OIT Collaboration with University Communications

We started a new project today to assist UC in migrating the Conductor CMS from Rackspace to AWS. This is a great opportunity to help UC improve their deployment and to build skills in additional AWS services. The initial model is to deploy the service as a two-tier (web/application and database) service using a VPC with EC2 and RDS instances. This will also fill in some of our governance policies as we create IAM roles for the project team members. We will also use a shared system management model: UC develops and manages the application and content, and OIT manages the server stack and infrastructure. The service hosts over 300 websites with 2 TB/month of network traffic.
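
To give a flavor of what scripting that IAM work could look like in boto, here is a minimal sketch using an IAM group with an attached policy.  The group name, user name, and policy document are purely hypothetical, not our actual governance policy:

import json
import boto

iam = boto.connect_iam()

# Hypothetical project group with read-only access to EC2 and RDS
iam.create_group('conductor-team')
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ec2:Describe*", "rds:Describe*"],
        "Resource": "*"
    }]
}
iam.put_group_policy('conductor-team', 'conductor-readonly', json.dumps(policy))

# Add a hypothetical team member to the group
iam.create_user('jdoe')
iam.add_user_to_group('conductor-team', 'jdoe')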

AWS in the Classroom

The ability to control provisioning of an entire development stack in AWS is not just a fantastic opportunity for the enterprise; it is also a great way to let students learn using infrastructure that might otherwise be prohibitively expensive and difficult to procure. Understanding this potential to facilitate learning and train a new generation of cloud-native developers, Amazon offers educational grants for use of AWS services. I started using the program about a year ago for my Database Topics class in the Mendoza College of Business, and Chris Frederick is looking into it for his NoSQL/Big Data class.

It gives students complete hands-on experience with servers and services without the hassle of spinning up servers in house!

Amazon offers a long list of services under the program.  The main services I have used in the RDBMS class are EC2 and RDS.  Each student was able to have their own Oracle DB instance with DBA privileges, so they could set up users, privileges, and roles, which would be much more difficult to do in a shared environment.  They could then work on projects without fear of bringing down another student’s database.  One of the other projects I did in a class, with the help of Xiaojing Duan, was to set up a PHP server to show integration with Facebook.  And there are a LOT more services available for classroom use!
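
For a sense of how little scripting per-student infrastructure takes, here is a minimal boto sketch that loops over a roster and requests a small Oracle RDS instance for each student.  The roster, instance class, engine name, and password are illustrative assumptions, not the actual class setup:

import boto.rds

# Hypothetical roster -- in practice this would come from the course list
students = ['student1', 'student2', 'student3']

rds = boto.rds.connect_to_region('us-east-1')
for s in students:
    db = rds.create_dbinstance(
        id='%s-oracle' % s,            # one instance per student
        allocated_storage=10,          # gigabytes
        instance_class='db.t1.micro',
        master_username=s,             # the student gets full DBA rights on their own instance
        master_password='CHANGE_ME',
        engine='oracle-se1')           # assumed RDS engine name for Oracle SE One
    print db.id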


The First Step

Building on the energy of the #DevOpsND workshop, the first target of opportunity identified by the attendees on Day Two is enterprise backup.  Of all the cloud candidates discussed, transitioning from on-premises backup to Amazon Glacier represents the greatest financial opportunity with the least effort in the shortest period of time.

On Day One of the workshop, Shane Creech came up with the idea of piloting the backup of our remote campus locations using the newest of Amazon’s backup offerings.  The latest addition to Amazon’s Storage Gateway product is the Virtual Tape Library.  Available as an iSCSI interface, the Virtual Tape Library is capable of using S3 or Glacier as a destination.
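
The tape gateway itself is configured through the AWS console, but for a feel of Glacier as a destination, here is a minimal boto sketch that pushes a single archive into a vault.  The vault name and file path are made up:

import boto

glacier = boto.connect_glacier()                        # credentials come from ~/.boto
vault = glacier.create_vault('remote-site-backups')     # hypothetical vault name
archive_id = vault.upload_archive('/backups/fileserver-2013-11-21.tar.gz')
print archive_id    # keep this; the archive ID is the only handle for retrieval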

Part of Day Two’s work saw John Pozivilko, Shane Creech, and Jaime Preciado-Beas work together to successfully pilot the Storage Gateway.  The next step is to work with Mike Anderson and the storage group to test against our existing enterprise backup software.

Onward!

AWS Governance

Jason Williams and Jaime Preciado-Beas did a great job on the security assessment of our current deployment of nd.edu in Amazon. As discussed in day 2 of the workshop, this document will serve as a great starting point to flesh out our governance structure with respect to AWS. Let’s use it as a guide as we deploy new projects. Governance is critical to our long term success in leveraging AWS, so we need to advance and refine it with each project. Let the projects begin!

The Start of Something Big

The spark of DevOps at ND became a flame early this week as the OIT welcomed Dan Ryan to McKenna Hall for a two-day bootcamp on DevOps and Infrastructure-as-a-Service.  Over 50 OIT professionals from networking, product management, custom development, virtualization, senior leadership, information security, data center operations, identity/access management, and architecture gathered together to learn about DevOps and decide upon an IaaS provider.

Day One kicked off with an introduction by CIO Ron Kraemer, who challenged us to seize the “historic opportunity” represented by cloud computing.

Continuing a discussion started with his appearance at the Sept 2013 ND Mobile Summit, Dan made a compelling case not only for migrating our data center to the cloud, but for doing it using Amazon Web Services.


Dan Ryan dropping some #IaaS knowledge


AWS easily surpasses their closest competitor, Rackspace

The morning concluded with the assembly agreeing that Amazon Web Services is our infrastructure-as-a-service provider of choice, based on organizational capability, price, and position in the market.

That afternoon and all of Day Two saw working groups spring up across OIT departments to tackle the practical architecture, design, and implementation details of a few specific projects, including AAA, Backup, and the Data Governance Neo4j application.  Bobs Winding and Richman led a discussion of how exactly to lay out VPCs, subnets, and security groups.


A virtual data center is born.

Assisted by Milind Saraph, Chris Frederick and I dove into Boto, the Python SDK for AWS scripting, and started building automation scripts for realizing that early design.  Evan Grantham-Brown stood up a Windows EC2 instance and created a public, Amazon-hosted version of the Data Governance dictionary.


Just look at all that inter-departmental collaboration

Around 2pm, we were joined via WebEx by AWS Solutions Architect Leo Zhadanovsky, who talked us through some of the particulars of automating instance builds with CloudInit and Puppet, as detailed in this YouTube presentation.
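
To tie that back to boto: run_instances accepts a user_data argument, which is the hook for handing a cloud-init script to a new instance so it can bootstrap Puppet on first boot.  A rough sketch, where the AMI, subnet, and Puppet master hostname are placeholders:

from boto.ec2.connection import EC2Connection

# Hypothetical cloud-init payload: install the Puppet agent and run it once against our master
user_data = """#!/bin/bash
yum install -y puppet
puppet agent --server puppet.example.nd.edu --onetime --no-daemonize
"""

ec2 = EC2Connection()
ec2.run_instances(
    image_id='ami-XXXXXXXX',      # placeholder AMI
    subnet_id='subnet-XXXXXXXX',  # placeholder subnet
    instance_type='t1.micro',
    user_data=user_data)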

As the day came to a close, the conversation turned to governance, ownership, and process documentation. This Google Doc outlines next steps for continuing to roll out AWS and DevOps practices in many areas of OIT operations, and contains the roster of a Design Review Board to guide the architecture, implementation, and documentation of our new data center in the cloud.


The goal: continuous integration. Automated builds, testing, and deployment.

Aside from the decision to adopt AWS as our IaaS provider, it was really encouraging to see so many people cross departmental lines to try things out and make things happen.  Here’s to making every day look like these.  Many thanks go to Sharif Nijim for conceiving and coordinating this event, to Mike Chapple and OIT leadership for supporting the idea, and especially to Dan for showing us the way forward.  Let’s do it!


ALL OF THEM!  (thanks @dowens)

Artifacts from the DevOpsND workshop are available here.