Why we are thankful


Heading into this Thanksgiving holiday, it is a good time to pause and reflect on the opportunity that lies before us.  We have the opportunity to make use of modern, proven technology in order to maximize the value we deliver to the University.

What am I thankful for?  That our leadership has the vision to take us there as an organization.



Focus.  It is such a remarkably productive state.  We have all experienced it as individuals, the Zen moments of super-productivity when you are completely, 100% focused on accomplishing a given task.  Societally, we recognize the perils of focus dilution as texting while driving becomes increasingly legislated.

After last week’s workshop, we have seen a groundswell of excitement and activity around both Amazon as an IaaS provider and DevOps as a way of operating.  The surest path to organizational success is to focus on delivering the greatest value to Notre Dame.  As identified in this blog post, the greatest target of opportunity has to do with understanding the potential of the Storage Gateway.  To realize that benefit, we need to focus.

Focus on understanding the capabilities.

Focus on understanding the frequency.

Focus on understanding restoration urgency.

Focus on understanding the cost.

Focus, focus, focus.

Appropriately enough, Ford’s entry into the World Touring Car Championship is a Focus.  Take some time out of your evening and watch this clip, then reflect and comment on how focus applies.

Further Thoughts on AWS Lock-in

In the last blog post, Bob Winding made some excellent points about the risk vs reward of building a data center in AWS.  For a more specific, non-ND perspective on this topic, take a look at this blog post, which actually categorizes AWS products by their degree of lock-in risk. It makes the very relevant point that most Amazon services are based on open standards, so lock-in is minimal. For example, ElastiCache can easily be swapped for another hosted or even in-house memcached server, and the only change required is to point application configuration URLs to the new location.  In the few cases where lock-in risk is greater, weighing the value of the service against the likelihood of actually needing to migrate away may still make one inclined to proceed.

As Bob alluded, lock-in only becomes a problem when the service provider is either going away or imposing an egregious / unexpected financial burden.  Historical evidence indicates that neither scenario is likely to occur on AWS.  In fact, AWS Senior VP Andy Jassy stated in his re:Invent 2013 keynote that Amazon has reduced service prices 38 times since 2006.  Watch that video, and you’ll also see that they have a program that analyzes customer accounts to identify unnecessary spending and contacts those clients to recommend ways to reduce their footprint and save money. Amazon is looking to profit from economies of scale, not from gouging individual clients.

Furthermore, a concentrated move towards DevOps practices will naturally decouple us from our IaaS.  When it comes to building infrastructure and deploying custom apps, the more we automate with tools like Puppet and Capistrano, the less beholden we are to AWS.  Those scripts can be deployed as easily on a VM in ITC as they can in AWS.  This is why I’m taking pains in my boto scripts to separate configuration files from the scripts that deploy them.  The boto part may be Amazon-specific, but the configuration can travel with us wherever we go.
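Here is a minimal sketch of that separation.  It is not taken from the actual scripts: the JSON layout and the function names are hypothetical, chosen only to show plain-data configuration feeding two different targets, with the provider-specific part kept thin.

```python
import json

# Hypothetical, provider-neutral configuration. In practice this would live
# in its own file next to the deploy scripts; it is inlined here for brevity.
CONFIG_TEXT = """
{
  "name": "web01",
  "cidr_block": "10.0.0.0/16",
  "packages": ["nginx", "ruby"]
}
"""


def load_config(text):
    """Parse the configuration; nothing in here is Amazon-specific."""
    return json.loads(text)


def plan_for_aws(config):
    """Translate the neutral config into AWS-flavored steps (illustrative only)."""
    steps = ["create VPC %s" % config["cidr_block"],
             "launch EC2 instance %s" % config["name"]]
    return steps + ["install %s" % pkg for pkg in config["packages"]]


def plan_for_local_vm(config):
    """The same config drives a non-AWS target; only this function differs."""
    steps = ["clone VM template as %s" % config["name"]]
    return steps + ["install %s" % pkg for pkg in config["packages"]]


config = load_config(CONFIG_TEXT)
aws_plan = plan_for_aws(config)
local_plan = plan_for_local_vm(config)
```

Both plans end with the same install steps, because those come from the shared configuration rather than from anything the provider dictates.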

If we take a smart approach, we should be no more tied to Amazon for infrastructure than we are to AT&T for cell service.  And unlike AT&T, Amazon’s service is a lot more reliable, with no contract to sign.  There are many gains waiting to be realized with IaaS in terms of cost savings, operational efficiency, repeatability, and reliability.  Let’s not leave those benefits on the table for fear of vendor lock-in.

Thoughts on AWS

The more I think about it, the more I realize that we’re witnessing a transformation in IT. Despite the AWS tattoo and kool-aid mustache, I’m not married to AWS. Nothing’s perfect, and AWS is no exception. There are always questions about vendor lock-in and backing out or shifting services as the landscape changes.

Vendor lock-in is a possibility with every IT partnership, and there are ways to mitigate those risks. If Cisco, Redhat, IBM, or Oracle go out of business, it will have a huge impact. If any change their pricing structure, the same. In a practical sense, you usually have time to react. AWS might get purchased, or the CEO might change, but it’s unlikely that we’ll be informed on Monday that they’re shutting down, and we have 3 days to relocate services. With AWS, there’s no upfront cost:  you pay as you go. A provider can’t have success with that model without a rock-solid service. Furthermore, there’s no financial lock-in:  no multi-million dollar contracts or support agreements, which are probably the most common and painful cases.  AWS frees us from those kinds of lock-in.

AWS isn’t outsourcing; it’s a force multiplier. I guarantee that OIT engineers, with minimal additional training, can build more secure systems with better architectures in Amazon than they’d be able to on-premises, for a fraction of the cost and time. Infrastructure is hard. It takes massive resources. AWS has the economy of scale and has been able to execute. The result is a game-changing, historic transformation of IT. Really, look at it:  it’s that profound. It enables experimentation and simplification. Take sizing systems. Just rough out your scaling, pick something larger than you need, and scale it back after two months of usage data. Calculate the low nominal use and scale it back further, then auto-scale for the peak. Script it and reproduce everything: network, servers, security policies, all the things. That kind of architecture is an engineer’s dream, for pennies per hour, now, this instant, not years from now.
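The sizing approach can be sketched with a little arithmetic. This is only an illustration, written for Python 3: the load samples, the per-instance capacity, and the choice of the median as "low nominal use" are all assumptions, not measurements or a prescribed method.

```python
import math


def size_fleet(samples, capacity_per_instance):
    """Return (baseline, peak) instance counts from observed load samples.

    baseline: instances needed for the median load (the always-on fleet)
    peak:     instances needed for the highest observed load (auto-scale ceiling)
    """
    ordered = sorted(samples)
    median = ordered[len(ordered) // 2]  # upper median for even-length input
    baseline = max(1, math.ceil(median / capacity_per_instance))
    peak = max(baseline, math.ceil(ordered[-1] / capacity_per_instance))
    return baseline, peak


# Hypothetical requests-per-second samples gathered during the trial period.
observed = [120, 90, 110, 400, 130, 95, 105]
baseline, peak = size_fleet(observed, capacity_per_instance=100)
```

With these made-up numbers the always-on fleet is small, and the single spike is left to auto-scaling instead of being paid for around the clock.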

AWS or not, there is no question that this is the future of IT. Except for low latency or high performance, or very specialized applications that require a local presence, in my opinion, most companies will shed their datacenters for providers like AWS. My guess:  it’s going to come faster than we think. It’s not that we can’t accomplish these things, or that we don’t understand them. We simply don’t have the resources. Take long-term storage and archiving. Glacier and S3 have eleven 9s of durability. As for elasticity, I can upload 500 TB into Glacier tomorrow, or a petabyte, no problem. Spin up 10 machines, 100 machines, 1000 machines:  done. Chris did a data warehouse in a couple of days. There’s really no turning back.
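To put eleven 9s in perspective, here is a back-of-the-envelope calculation. It treats the durability figure as an independent annual loss probability per object, which is a simplifying assumption, and the object count is hypothetical.

```python
from fractions import Fraction

# Eleven nines of annual durability: 99.999999999%
durability = Fraction(10**11 - 1, 10**11)
annual_loss_rate = 1 - durability   # chance of losing a given object in a year

objects_stored = 10**7              # hypothetical: ten million objects
expected_losses_per_year = objects_stored * annual_loss_rate
years_per_lost_object = 1 / expected_losses_per_year
```

Exact fractions are used instead of floats because subtracting a number this close to 1 from 1 in floating point would introduce rounding error.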

I suspect we’ll be cautious, as we should be, moving services to AWS. But my forecast is that with each success, it will become apparent that we should do more. It’s going to become so compelling that you can’t look away.

I’m really glad we’ve picked AWS for IaaS. It is a huge opportunity.

Scripting AWS with Boto

It’s pretty easy to get yourself an Amazon Web Services account, log into the web console, and start firing up instances left and right.  It feels awesome, and it should, because you are doing a lot of cool things with just a few mouse clicks.  Soon, however, that buzz is going to wear off, because working this way is not sustainable at scale, and it is not easily repeatable.  If you want to get to #DevOps nirvana, creating and destroying instances automatically with each branch commit, you are going to have to abandon that GUI and get to the command line. 

So you google up “aws command line,” and behold: AWS Command Line Interface.  These tools are great, and probably worth a blog post in themselves.  They are an attempt to provide a unified CLI to replace prior offerings that were unique to each service (e.g., an ec2 tool, a CloudFormation tool), and they let you do quite a bit from the command line. But wait, because if you want to control everything in AWS while also leveraging the power of a mature scripting language, you want Boto for Python.

First, we’ll need python.  Mac and Linux users are good to go; Windows users will need to install it first.  We’re also going to need the python package manager, called pip.  For some reason it doesn’t come with python, but there’s a python tool that’ll fetch it for us.  This worked on my mac:

$ sudo easy_install pip

Okay, now you’d better learn some basic Python syntax. Go ahead; I’ll wait.

Good.  Now we’re ready.  Install the boto SDK with pip:

$ pip install boto

Then create a file called .boto in your home directory with the following contents:

[Credentials]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
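boto reads this file as an INI-style config and looks for the two keys under a [Credentials] section.  If a script fails to authenticate, a quick sanity check like the sketch below can rule out a malformed file.  It uses only Python 3's standard configparser module; the helper name is hypothetical, and the sample text stands in for the real file.

```python
import configparser

# Same shape as a ~/.boto file, inlined so the sketch is self-contained.
# The values are placeholders, not real credentials.
SAMPLE = """
[Credentials]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
"""

REQUIRED = ["aws_access_key_id", "aws_secret_access_key"]


def missing_credentials(text):
    """Return the required keys that are absent from the [Credentials] section."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
    except configparser.MissingSectionHeaderError:
        # Keys with no section header at all: nothing is usable.
        return list(REQUIRED)
    if not parser.has_section("Credentials"):
        return list(REQUIRED)
    return [key for key in REQUIRED if not parser.has_option("Credentials", key)]


missing = missing_credentials(SAMPLE)
```

An empty list means both keys were found where boto expects them.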

Where to get those credentials?  I’ll let this guy tell you.

Now we’re ready for some examples.  Readers in the OIT can check out the full scripts here.  Let’s start by making a Virtual Private Cloud, aka a virtual data center.  VPCs are constructs that, despite their relative importance in a cloud infrastructure, are quite simple to create.  From create_vpc.py:

#!/usr/bin/env python
import boto
from boto.vpc import VPCConnection

You’ve got the typical shebang directive, which invokes the python interpreter and lets us execute the script directly.  Then the important part: importing boto.  The first import is really all you need, but if you want to be able to reference parts of the boto library without fully qualifying them, you’ll want to do something like line 3 there.  This lets us reference “VPCConnection” instead of having to say “boto.vpc.VPCConnection.”

The parameters for creating a VPC are…

  • name: friendly name for the VPC (we’ll use VPC_ND_Datacenter.  This is actually optional, but we’re going to tag the VPC with a “Name” key after we make it)
  • cidr_block: a Classless Inter-Domain Routing block, of course (alright, Bob Winding helped me.  I’ll update with clarification later) Example:
  • tenancy: default / dedicated.  Dedicated requires you to pay a flat fee to Amazon, but it means you never have to worry that you’re sharing hardware with another AWS customer.
  • dry_run: set to True if you want to run the command without persisting the result


Basically the cidr_block is the only thing you really even need.  I told you it was a simple construct!  Note that variables exist for each in the create_vpc line below.
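For anyone else fuzzy on CIDR notation: the number after the slash is how many leading bits of the address are fixed, and the remaining bits are yours to hand out.  The sketch below uses Python 3's standard ipaddress module to show the arithmetic.  The 10.0.0.0/16 range is just an example, not the block used for our VPC; AWS accepts VPC blocks between /16 and /28.

```python
import ipaddress

# Hypothetical private range for the VPC; a /16 leaves 16 bits for hosts.
vpc_block = ipaddress.ip_network("10.0.0.0/16")
total_addresses = vpc_block.num_addresses   # 2 ** (32 - 16)

# Carve the VPC into /24 subnets, e.g. one per tier or availability zone.
subnets = list(vpc_block.subnets(new_prefix=24))
first_subnet = subnets[0]
```

The same module can confirm that a proposed subnet actually falls inside the VPC block before you ask AWS to create it.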

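# cidr, tenancy, and dryrun are set earlier in the full script
c = VPCConnection()

# look for an existing VPC with this CIDR block before creating one
datacenters = c.get_all_vpcs(filters=[("cidrBlock", cidr)])
if not len(datacenters):
    datacenter = c.create_vpc(cidr_block=cidr, instance_tenancy=tenancy, dry_run=dryrun)
    print(datacenter)
else:
    datacenter = datacenters.pop(0)
    print("Requested VPC already exists!")

# tag the new or existing VPC with a friendly name
datacenter.add_tag("Name", "My awesome VPC")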

First, get the VPCConnection object, which has the methods we’ll use to list/create a VPC.  Note that this is available due to the “from..import” above.  Next, use the method “get_all_vpcs” with a filter to check that no VPC with this cidr block already exists.

If that list is empty, we’ll call create_vpc.  Otherwise, we’ll print a message that it already exists.  We can also pop the first item off the list, and that’s the object representing the existing VPC.  Finally, we’ll add a tag to name our VPC.
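That check-then-create logic is worth wrapping in a function so the script can be run repeatedly without piling up duplicate VPCs.  Below is a rough sketch, not the code from create_vpc.py: ensure_vpc is a hypothetical helper, and the Fake classes are stand-ins that mimic only the three boto methods used above so the example runs without an AWS account.

```python
def ensure_vpc(conn, cidr, name):
    """Return (vpc, created): reuse the VPC for this CIDR block or create one."""
    existing = conn.get_all_vpcs(filters=[("cidrBlock", cidr)])
    if existing:
        vpc, created = existing[0], False
    else:
        vpc, created = conn.create_vpc(cidr_block=cidr), True
    vpc.add_tag("Name", name)
    return vpc, created


# Stand-ins for boto objects (assumptions, not the real classes).
class FakeVPC(object):
    def __init__(self, cidr):
        self.cidr_block = cidr
        self.tags = {}

    def add_tag(self, key, value):
        self.tags[key] = value


class FakeConnection(object):
    def __init__(self):
        self.vpcs = []

    def get_all_vpcs(self, filters=None):
        cidr = dict(filters or []).get("cidrBlock")
        return [v for v in self.vpcs if cidr is None or v.cidr_block == cidr]

    def create_vpc(self, cidr_block):
        vpc = FakeVPC(cidr_block)
        self.vpcs.append(vpc)
        return vpc


conn = FakeConnection()
first, created_first = ensure_vpc(conn, "10.0.0.0/16", "VPC_ND_Datacenter")
second, created_second = ensure_vpc(conn, "10.0.0.0/16", "VPC_ND_Datacenter")
```

The second call finds the VPC made by the first and hands back the same object.  With real boto you would pass a VPCConnection in place of the fake.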

This stuff is that easy.

Once you divide that VPC into subnets and create a security group or two, how about creating an actual ec2 instance?

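from boto.ec2.connection import EC2Connection

ec2 = EC2Connection()

# run_instances requires an image_id; the AMI ID below is a placeholder
reservation = ec2.run_instances(
    image_id='SOME_AMI_ID',
    security_group_ids=['SOME_SECURITYGROUP_ID', 'ANOTHER_ONE'])

reservation.instances[0].add_tag("Name", "world's greatest ec2 instance")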

Similar setup here.  We’re using the EC2Connection object instead.  Note that the run_instances method doesn’t pass back the instance directly, but gives you a “reservation” object that holds a list of instances.  That is because run_instances accepts min_count and max_count arguments, so one call can launch several instances; by default it launches just one.  Anyway, we tag it and boom!  Here’s our instance, initializing:
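Because a reservation holds a list of instances (boto's run_instances takes min_count and max_count, so one call can launch several), it is handy to tag them in a loop.  The sketch below is an illustration only: tag_all is a hypothetical helper, and the Fake classes stand in for boto's Reservation and Instance objects so it runs without touching AWS.

```python
def tag_all(reservation, base_name):
    """Tag every instance in a reservation with a numbered Name; return the names."""
    names = []
    for index, instance in enumerate(reservation.instances, start=1):
        name = "%s-%02d" % (base_name, index)
        instance.add_tag("Name", name)
        names.append(name)
    return names


# Minimal stand-ins for boto's Reservation and Instance (assumptions).
class FakeInstance(object):
    def __init__(self):
        self.tags = {}

    def add_tag(self, key, value):
        self.tags[key] = value


class FakeReservation(object):
    def __init__(self, count):
        self.instances = [FakeInstance() for _ in range(count)]


# With real boto, the reservation would come from a run_instances call
# made with min_count and max_count set to 3.
reservation = FakeReservation(3)
names = tag_all(reservation, "web")
```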

i could do this all day

and to think you wanted to click buttons

We’ve got more work to do before we can create this instance and provision it via puppet, deploy applications on it, run services, and make it available as a Notre Dame domain.  Still, this is a great start.  Maybe some OIT folks want to jump in and help!  Talk to me, and be sure to check out the full boto API reference for all available methods.

Rearview mirrors

In 1911, Ray Harroun entered a car, the Marmon Wasp, in the inaugural Indianapolis 500 automobile race.

Though primitive by today’s standards (the car averaged roughly 75 mph over the course of the 500 mile race), it was a massively controversial vehicle in its day.


Why?  Because it was the first car in history to be equipped with a rearview mirror.  This put Mr. Harroun at a considerable advantage, because his car was the only one that did not have a passenger mechanic on board.  Eliminating an onboard mechanic represented a significant weight savings, which Harroun used to his advantage, winning the race by just over half a mile.

Three years later, automobile manufacturers started putting rearview mirrors on cars available to the public.

Since then, racing has continued to improve the breed.  In 1961, Giotto Bizzarrini combined his engineering talent with that of Piero Drogo, a car body specialist.  Embracing Kammback design, they created a unique-looking Ferrari racing car.  The long tail section of the car minimized air resistance, allowing for higher speeds.


In 1997, Toyota introduced the first widely available hybrid electric car, aimed at the mass market.  Look closely at the profile, and you will see echoes of the Kammback principle.  Instead of winning races, the slippery aerodynamics help the Prius compete against the wind in order to deliver more miles per gallon.


Today, there are many pieces of automotive technology so mundane to us that we take them for granted.  Anti-lock brakes.  Disc brakes.  Rearview mirrors.  How many of us drive cars without rearview mirrors?

Step back and look at the infrastructure as a service landscape.  Private industry is our race team.  They are working constantly, pouring effort, energy, and capital into improving the breed.  We are the fortunate auto manufacturer on the receiving end of the trickle-down effect.  Technologies that we fantasized about twenty years ago are available at our fingertips.

Though cutting edge for higher education, adopting IaaS and DevOps has proven its mettle in the stiffest competition.  Let us step forward confidently and learn from our racing team, figuring out how to operationalize the use of these technologies in our environment.

And for those of you interested in learning more about technologies that we take for granted whose roots are in racing, please explore this interesting article.

OIT Collaboration with University Communications

We started a new project today to assist UC in the migration of the Conductor CMS from Rackspace to AWS. This is a great opportunity to help UC improve their deployment and to build skills in additional AWS services. The initial model is to deploy the service as a two-tier (Web/Application and Database) service using a VPC with EC2 and RDS instances. This will also fill in some of our governance policies as we create roles in IAM for the project team members. We will also be using a shared system management model in which UC develops and manages the application and content, and OIT manages the server stack and infrastructure. The service hosts over 300 websites with 2TB/month of network traffic.

AWS in Classroom

The ability to control provisioning of an entire development stack in AWS is not just a fantastic opportunity for the enterprise; it is also a great way to let students learn using infrastructure that might otherwise be prohibitively expensive and difficult to procure. Understanding this potential to facilitate learning and train a new generation of cloud-native developers, Amazon offers educational grants for use of AWS services. I started using the program about a year ago for my Database Topics class in the Mendoza College of Business, and Chris Frederick is looking into it for his No-SQL/Big Data class.

It can give students complete hands-on experience with servers and services without the issues of spinning up servers in house!

A snapshot of services is below.   The main services I have used in the RDBMS class are EC2 and RDS.   Each student was able to have their own Oracle DB instance with DBA privileges, so they could run their own database and set up users, privileges, and roles, which would be much more difficult to do in a shared environment.   They could then work on projects without fear of bringing down another student’s database.  One of the other projects I did in a class, with the help of Xiaojing Duan, was to set up a PHP server to show integration with Facebook.   As you can see from the list below,  there are a LOT more services available for classroom use!

[Screenshot of the AWS services list, 11-21-2013 1-06-37 PM]

The First Step

Building on the energy of the #DevOpsND workshop, the first target of opportunity identified by the attendees on Day Two is enterprise backup.  Of all the cloud candidates discussed, transitioning from on-premises backup to Amazon Glacier represents the greatest financial opportunity with the least effort in the shortest period of time.

On Day One of the workshop, Shane Creech came up with the idea of piloting backing up our remote campus locations using the newest of Amazon’s backup offerings.  The latest addition to Amazon’s Storage Gateway product is the Virtual Tape Library.  Available as an iSCSI interface, the Virtual Tape Library is capable of using S3 or Glacier as a destination.

Part of Day Two’s work saw John Pozivilko, Shane Creech, and Jaime Preciado-Beas work together to successfully pilot the Storage Gateway.  The next step is to work with Mike Anderson and the storage group to test with our existing enterprise backup software.


AWS Governance

Jason Williams and Jaime Preciado-Beas did a great job on the security assessment of our current deployment of nd.edu in Amazon. As discussed in day 2 of the workshop, this document will serve as a great starting point to flesh out our governance structure with respect to AWS. Let’s use it as a guide as we deploy new projects. Governance is critical to our long term success in leveraging AWS, so we need to advance and refine it with each project. Let the projects begin!