Monday Morning, Start Again

Happy Monday!

Monday morning, start again

Thanks to http://www.usedave.com for the artwork!

It’s another beautiful Monday morning here in the Midwest.  It’s sunny, hot, and humid.  People are at their desks, working away.

I took a look at OnBase user activity, and it is normal for this time of year.  People are making updates to student files, annotating, making decisions, doing their jobs.  Just another normal Monday morning – nothing has changed.

Except for the fact that our OnBase platform is running in AWS!!!!  Massive kudos to the team for an excellent job, a well-designed plan, and a beautifully transparent migration.  Zero user impact apart from the negotiated outage on Saturday.

On to the next thing – just another normal Monday morning at #NDCloudFirst.

Automating Configuration Management

So many choices!  Puppet, Chef, Salt, Ansible!  What’s an organization to do?
We initially went down the Puppet path, as one of our distributed IT organizations invested lots of time in getting Puppet going.  We ended up not going too far down the path as we started using Ansible.
The biggest reason is that Ansible is agentless.  All the commands go over ssh, and there is nothing to install on destination servers.  We’ve run into a couple of issues where the documentation doesn’t match the behavior when developing an Ansible playbook, but nothing insurmountable.
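To make "agentless" concrete, here is a minimal Python sketch of the underlying idea: the control machine runs a command on a remote host over plain ssh, and nothing beyond an SSH server has to be installed on the destination.  This is not how Ansible itself is implemented, and the function names and host name are hypothetical.

```python
import subprocess


def build_ssh_command(host: str, remote_command: str) -> list[str]:
    """Build the argument list for running one command on a remote host."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which is what unattended automation needs.
    return ["ssh", "-o", "BatchMode=yes", host, remote_command]


def run_remote(host: str, remote_command: str) -> str:
    """Run the command on the remote host and return its standard output."""
    result = subprocess.run(
        build_ssh_command(host, remote_command),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Hypothetical usage; the host name is made up:
# print(run_remote("web01.example.edu", "uname -a"))
```

Key-based authentication on the destination is assumed, and that is about all the destination needs.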
We realize many benefits from having a fully self-documenting infrastructure, and find that it, in concert with git (we use Bitbucket because of its free unlimited private repos for educational institutions), enables the adoption of devops principles.
At a high level, we have a playbook we call Ansible-Core which contains a variety of roles, maintained by our Platform team.  These roles correspond to specific configurations, including:
  • Ensuring that our traditional Platform/OS engineers have accounts/sudo
  • Account integration with central authentication
  • Common software installation
    • NGINX, configuration of our wildcard SSL certificate chain, etc
When developing a playbook for an individual service, the developer scripting software installation/configuration may encounter a dependency which is not specific to the service, such as the AWS CLI (not there by default if you start with a minimal machine config).  That realization leads to a conversation with the Platform team about adding a role for it to Ansible-Core.  The role can be added in two ways:
  • By the dev, who issues a pull request to the Platform team.  That team reviews the change and merges as appropriate.
  • By a member of the Platform team
In the process of creating Ansible scripts, conversation between traditional operations folks and developers flows naturally, and we end up with truly reusable chunks of infrastructure code.  Everyone wins, and more importantly, everyone learns!

Gluecon 2015

Gluecon

I had the good fortune of attending Gluecon this past week.  It is a short, intense conference held in Broomfield, Colorado, attended by some of the best and brightest folks in technology.  There was a lot of talk about microservices, APIs, DevOps, and containers – all the latest tech, with an eye towards 2018.

While the majority of slides are available here, this is a quick synopsis of what I took away.

Tweet early, tweet often

I’m a sporadic tweeter, but go into full-on microblogging mode when at a conference.  It’s a great way to share information with the public, and a great way to make connections.  After I added a column in TweetDeck for #gluecon, the following image caught my eye:

Per @_nicolemargaret, “Doing some realtime analysis of hashtag co-occurrence among #gluecon tweets. #neo4j #rstats #d3 #nosql”

 

I love graphs, and the powerful way they communicate relationships.  Through that tweet, I had the opportunity to meet Nicole White.  Come to find out that she works for Neo Technology, the company behind neo4j.  While I have other friends at Neo, I had never heard of or met Nicole before (she is a relatively new hire).  I’m happy to have added her as a new node on my graph, as it were.  Very cool.
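For the curious, hashtag co-occurrence is simple to compute.  The sketch below is my own minimal Python illustration, not Nicole’s actual analysis (her tweet mentions neo4j, R, and d3): it counts how often each pair of hashtags shows up in the same tweet, and each counted pair becomes a weighted edge in the graph.  The sample tweets are made up.

```python
from collections import Counter
from itertools import combinations


def hashtag_cooccurrence(tweets: list[str]) -> Counter:
    """Count how often each pair of hashtags appears in the same tweet."""
    pair_counts: Counter = Counter()
    for tweet in tweets:
        # Distinct hashtags in this tweet, lowercased, with trailing
        # punctuation removed, sorted so each pair has one canonical order.
        tags = sorted({
            word.rstrip(".,!?:;").lower()
            for word in tweet.split()
            if word.startswith("#")
        })
        # Every unordered pair of tags in one tweet is one graph edge.
        pair_counts.update(combinations(tags, 2))
    return pair_counts


# Hypothetical sample tweets, made up for illustration.
sample_tweets = [
    "Great keynote at #gluecon on #microservices",
    "Graphs everywhere! #gluecon #neo4j",
    "#neo4j demo was the highlight of #gluecon.",
]
counts = hashtag_cooccurrence(sample_tweets)
```

In this sample, the pair (#gluecon, #neo4j) is counted twice and (#gluecon, #microservices) once.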

Tweeting is also a great way to reflect back on gems and tidbits – simply look at your own history to help organize your thoughts.  Like I’m doing now.

It’s always the people

There were quite a few sessions talking about the importance of culture and talent in making for productive, healthy organizations.  Salesforce did a good keynote, illustrating the gap between available technology jobs and qualified candidates.

A challenge for us all

This theme continued through the very last keynote session:

Is this clear?

Almost every session I attended had a subtle undertone of “we need talent.”  Most messages were not subtle.

Talking about microservices, Adrian Cockcroft made a great cultural point.  When operating microservices, organizations need to fully embrace the DevOps model with a clear escalation chain.

Culture Win: the VP of Engineering volunteers to be on call while expecting never to get called.

Tools?  What tools?

Building on the importance of people, let’s talk about tools.  Specifically, let’s talk orchestration tools – Ansible, Puppet, and Chef.  I happen to think agentless Ansible is the way to go, but ultimately, it’s what you and your organization are capable of doing with the tools, not the tools you pick.  Brian Coca illustrated many possible ways in which Ansible can be used…because he deeply understands how to use Ansible!

You can do this with Ansible…should you?

One of my favorite one-liners from the conference sums up the tools discussion: “Rather than teach everyone to code, let’s teach them to think. The coding can come later; it’s easier.”

Right on – pick a tool that can be successfully adopted by you/your organization, and stick with it.  Stay focused, and don’t get distracted.

APIs still rule…and enterprise software still lags behind

APIs have been a thing for years now.  I remember writing the customer profile master data store for a major airline in the late 90s.  As a master data source, many internal systems wanted to access/update said data.  Instead of giving each system direct access to our database, we surrounded it with a cloud of services.  At the time, these were written in C, using Tuxedo.

What has changed in the last 20 years?  The utter ubiquity of APIs in the form of web services.  The concept is the same – encapsulate business logic, publish a defined interface, and let the data flow quickly and easily.  And yes, it is much, much easier to get going with a RESTful API than a Tuxedo service.
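As a rough illustration of "encapsulate business logic, publish a defined interface," here is a minimal Python sketch.  The class, method, and field names are hypothetical and have nothing to do with the original Tuxedo services; the point is only that callers go through the service and never touch the underlying store.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CustomerProfile:
    customer_id: str
    name: str
    email: str


class CustomerProfileService:
    """Hypothetical service layer: the only way callers reach the data."""

    def __init__(self) -> None:
        # Stand-in for the master data store; callers never see it directly.
        self._profiles: dict[str, CustomerProfile] = {}

    def create_profile(self, profile: CustomerProfile) -> None:
        if profile.customer_id in self._profiles:
            raise ValueError(f"profile {profile.customer_id} already exists")
        self._profiles[profile.customer_id] = profile

    def get_profile(self, customer_id: str) -> CustomerProfile:
        return self._profiles[customer_id]

    def update_email(self, customer_id: str, new_email: str) -> CustomerProfile:
        # The business rule lives in one place instead of in every caller.
        if "@" not in new_email:
            raise ValueError("email address must contain '@'")
        updated = replace(self._profiles[customer_id], email=new_email)
        self._profiles[customer_id] = updated
        return updated
```

Swap the in-memory dictionary for a real database and put HTTP in front of the methods, and you have the shape of a typical web service.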

Table stakes for software vendors

What else has changed?  Organizations simply expect data to be available via APIs.  If you are an enterprise software vendor and your data/business logic is locked up in a proprietary monolithic application, start opening up or face customer defection.

What a Wonderful World

In his presentation, Runscope CEO and co-founder John Sheehan put this slide up.

Can you imagine life without these tools?

Think for a moment about how remarkably powerful any one of the concepts listed in his slide is.  We live in a world where all of them are simply available for use.  With the remarkably rich tools which are out there, there is simply no excuse for a poorly performing web site or API.  If an organization is running into scale issues, the technology is not likely to be at fault – it’s how that technology is implemented.

Talk with Adrian

If you get the chance, spend some time talking with Adrian Cockcroft.  I was fortunate enough to spend 20 or 30 minutes talking with him over lunch.  First off, he is a genuinely friendly and kind person.  Second, he likes interesting cars (Tesla Roadster, Lotus Elise, among others).  Finally, he’s flat brilliant with loads of experience.

I was able to glean useful tidbits about containers, tertiary data storage, and autocrossing.

Lunch with the incomparable Adrian Cockcroft

One thing Adrian mentioned that stuck with me concerned the current state of containers.  They are mutating so fast that even companies who are working with them full time have difficulty keeping up.  That said, the speed at which this space moves makes for a high degree of agility.  However, every organization has finite limits on what new/emerging technologies can be pursued.  Containerize if you wish, but you should have a clearly defined objective in mind with palpable benefits.

Serious about automation? Take away SSH access.

One of my favorite tidbits was to remove SSH access from servers.  If no individual has SSH access, it forces them to automate everything.  At that point, servers truly become disposable.

Get beyond the tech

I was pretty fried by the afternoon of day two of the conference.  I took the opportunity to skip a couple of sessions and spend some time with Kin Lane.  His dedication to understanding and explaining APIs earned him a Presidential Innovation Fellowship in 2013.

Yes, we talked tech…and proceeded to go beyond.  Kin likes motorcycles.  He is a former VW mechanic.  He’s gone through a material purge and enjoys the mobility his work affords him.  Yes, he strives to be all things API, but that’s only one facet of his very interesting personality.

Opting to hang with Kin Lane instead of attending a session

Repeat attendance?

So, would I go to Gluecon again?  Most definitely.  It was a worthy spend of time, providing insight into the leading edge of technology in the context of microservices, devops, containers, and APIs.  Not too long, and not too big.

I came away from the conference with a better understanding of trends in technology.  With that knowledge, I am better prepared to work with, ask questions of, and assess potential vendors.

We are poets!

So I just finished day two of RailsConf, and I had a very interesting experience. My final session of the day was held by an actor-turned-programmer by the name of Adam Cuppy. He started things off with a quote by DHH from last year’s conference, where David challenged everyone to see themselves not as software engineers, but as software writers. However, he showed how the term writer has no purpose behind it; it is simply a description of things a writer has done. Taking a look at synonyms leads to the word poet. Examining poet shows us that a poet has super-powers! Now who doesn’t want super-powers? And one thing everyone knows is that a person with super-powers has the responsibility to use them for the betterment of others.

Through this super-power of expression, poets are able to relay meaning and feeling with their words. They are able to use the syntax of a language to convey feeling, and paint a picture in the reader’s mind. And so too are we challenged to paint a picture with the language we are writing in. Just imagine for a minute how easy it would be for a programmer new to a project to on-board if the application code they were looking at read like a play. What if we could engage this new person and make it not only easy, but enjoyable for them to learn the application?

Think about the last time you began on an existing code base. What did you spend the most time trying to learn: the how of the program, or the why of the program? With forethought and planning, we can develop conventions that make it easy to discern the meaning of our code. If we can make it easy for someone reading it to know how this method fits into the application, they will be able to see the whole picture much more quickly. And think how beautiful that picture could look.

So I echo his enthusiasm: go forth and be poets! Give meaning to your code, and do something great.

Capistrano upload fails with no error

There are a few reasons the upload! method of Capistrano might fail on you, but you usually see some kind of an error, such as a read-only file system error on the logged-in user.  I just had a very mysterious failure with no error text whatsoever, even in verbose mode.

Turns out the issue was STDOUT output from my remote .bashrc file.  I was setting some things in my remote host’s .bashrc, and had it echoing debug messages (i.e., “started ssh agent”, and, most cleverly, “hi from bashrc!”).  Almost every other Capistrano command had no problem with this, but upload! would just sit there and hang, not receiving the server response it expects.

I removed the echo statements, and everything went back to normal.  So… maybe don’t do that.  Hopefully this helps somebody, someday.
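If you want to check a host for this problem before a deploy, one approach is to run a command that prints nothing over a non-interactive ssh session and see whether anything comes back.  The Python sketch below is a minimal illustration, assuming key-based ssh access; the function names and host name are hypothetical.  My understanding is that file transfers layered on SSH are the usual victims of chatty startup files, which would explain why only upload! hung.

```python
import subprocess


def startup_noise(host: str) -> str:
    """Return whatever a non-interactive ssh session prints on its own."""
    # "true" prints nothing itself, so any captured output comes from the
    # remote shell's startup files (for example, echo lines in .bashrc).
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, "true"],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout


def is_clean(output: str) -> bool:
    """A transfer-safe shell setup prints nothing at all."""
    return output == ""


# Hypothetical usage; the host name is made up:
# noise = startup_noise("deploy@app01.example.edu")
# if not is_clean(noise):
#     print("Remote startup files print output:", repr(noise))
```

If you really want the debug messages, the usual remedy is to print them only when the shell is interactive.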

Why Differing Perspectives are Good

Yesterday afternoon, we were pushing to get a Ruby on Rails app deployed into AWS.  With only one firewall rule standing in the way, the developer put in a change request that he thought would get us there, and stepped out for a quick break.  Since we’re all sitting in the same room, some might think a formal change request overkill.

Cloud Central

We have an information security professional embedded in our Cloud First team who has final authority over security-related decisions.  Reviewing the request, he didn’t feel comfortable with the nature of the change.

When the developer returned, he and our InfoSec person discussed the nature of the change request.  It turns out what was being requested was valid, but there was a slight error in how it was written up.

Problem solved, firewall rule processed, and boom – our app was in business.

Having a team with broad perspectives is not only desirable, it is absolutely necessary for success.  And that is exactly the type of consideration that went into drawing members for our #NDCloudFirst mission:

 

Augmented Team

Onward!

#NDCloudFirst

Today is the most exciting day of my modern professional life.  It’s the day we are announcing to the world our goal of migrating 80% of our IT service portfolio to the cloud over the next three years.

Yes, that’s right, 80% in 3 years.  What an opportunity!  What a challenge!  What a goal!  What a mission for a focused group of IT professionals!

The following infographic accurately illustrates our preference in terms of prioritizing how we will achieve this goal.

Screen Shot 2014-11-07 at 8.54.34 AM

 

Opportunistically, we will select SaaS products first, then PaaS products, followed by IaaS, with solutions requiring on-premises infrastructure reserved for situations where there is a compelling need for geographic proximity.

The layer at which we, as an IT organization, can add value without disrupting university business processes is IaaS.  After extensive analysis, we have selected Amazon Web Services as our IaaS partner of choice, and are looking forward to a strong partnership as we embark upon this journey.

Already documented on this blog are success stories Notre Dame has enjoyed migrating www.nd.edu, the infrastructure for the Notre Dame mobile app, Conductor (and its ~400 departmental web sites), a copy of our authentication service, and server backups into AWS.  We have positioned ourselves to capitalize on what we have learned from these experiences and proceed with migrating the rest of the applications which are currently hosted on campus.

So incredibly, incredibly fired up about the challenge that is before us.

If you want to learn more, please head over to Cloud Central: http://oit.nd.edu/cloud-first/

Onward!

Just because you can, doesn’t mean you should

One of the applications that is a shoo-in candidate for migration has a small-time usage profile.  We are talking 4 hits/day.  No big deal, it’s a business process consideration.

It needs to interface with enterprise data, resident in our Banner database.  No worries there, the data this app needs access to is decoupled via web services.  Now let’s swing our attention to the app’s transactional data storage requirements.

First question – does it need any of the Oracle-specific feature set?  No.  So, let’s not use Oracle – no need to further bind ourselves in that area.  Postgres is a reasonable alternative.

OK, so, RDS?  Yes please – no need to start administering a Postgres stack when all we want to do is use it.

Multiple availability zones?  Great question.  Fantastic question!  Glad you asked.

Consider the usage profile of this app.  4 records per day. 4.  Can the recovery point/time objectives be met with snapshotting?  Absolutely.  Is that more cost-effective than running a multi-AZ configuration?  Yes.

Does it make sense for this application?

Yes.
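To put a number on the recovery point question, here is a back-of-the-envelope sketch in Python.  The 4 records per day comes from the usage profile above; the snapshot intervals are hypothetical, and the estimate assumes writes are spread evenly through the day.

```python
def expected_records_at_risk(records_per_day: float,
                             snapshot_interval_hours: float) -> float:
    """Records written since the last snapshot if the database is lost
    just before the next snapshot, assuming evenly spread writes."""
    return records_per_day * snapshot_interval_hours / 24.0


# 4 records/day is from the post; the intervals are made-up examples.
for interval_hours in (24, 6, 1):
    at_risk = expected_records_at_risk(4, interval_hours)
    print(f"snapshot every {interval_hours:>2} h: about {at_risk:.2f} records at risk")
```

With a daily snapshot the exposure is about four records, which a business process handling four requests a day can re-enter by hand.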

Thank you Amazon for providing a fantastic set of tools, and thank you to the #NDCloudFirst team for thinking through using those tools appropriately.

The Speed of Light

How fast is it really?  In the course I teach, students have the opportunity to interact with a database, taking their logical models, turning them into physical designs, and finally implementing them.

Up until this semester, I have made use of a database that is local to campus.  The ongoing management and maintenance of that environment is something which is of no particular interest to me – I just want to use the database.  Database-as-a-Service, as it were.  As in, Amazon Relational Database Service.

Lucky for us all, Amazon has a generous grant program for education.  After a very straightforward application process, I was all set to experiment.

To baseline, I executed a straightforward query against a small, local table.  Unsurprisingly, the response time was lightning-fast.

local

 

Using RDS, I went ahead and created an Oracle database, just like the one I have typically used on campus.  After setting up a VPC, subnets, and a database subnet group, I chose to create this instance in Amazon’s N. Virginia Eastern Region.  Firing off the test, we find that, yes, it takes time for light to travel between Notre Dame’s campus and northern Virginia:

east

 

Looks like it added about 30 milliseconds.  I can live with that.

Out of curiosity, how fast would it be to the west coast?  Say, Amazon’s Oregon Western Region?  Fortunately, it is a trivial exercise to find out.  I simply snapshotted the database and copied the snapshot from the eastern region to the west.  A simple restore and security group assignment later, and I could re-execute my test:

west

 

Looks like the time added was roughly double – 60 milliseconds.

Is that accurate?  According to Google Maps, it looks like yes indeed, Oregon is roughly twice as far away from Notre Dame as Virginia.  The speed of light doesn’t lie.
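For a sense of the physical floor, here is a small Python sketch.  It assumes light in optical fiber travels at roughly two thirds of its vacuum speed, and the distances are illustrative round numbers, not the actual cable routes from campus.  Measured times will always be higher, since they also include routing, queuing, and the database doing its work.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in a vacuum
FIBER_FACTOR = 2 / 3  # assumption: light in fiber travels at about 2/3 of c


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fiber for a one-way distance."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_PER_S * FIBER_FACTOR)
    return 2 * one_way_seconds * 1000


# Illustrative round-number distances, not measured routes.
for km in (1000, 2000, 4000):
    print(f"{km} km one way: at least {min_round_trip_ms(km):.1f} ms round trip")
```

Under these assumptions every 1,000 km of one-way distance costs about 10 ms of round trip, and the floor scales linearly: double the distance, double the minimum delay.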

So, what did I learn?  First, imagine for a moment what I just did.  Instantiate an Oracle database, on the east coast, and the west coast.  From nothing!  No servers to order, no routers to buy, no disks to burn in, no gnomes to wire equipment together, no Oracle Universal Installer to walk through.  I still get a thrill every time I use Amazon services and think about what is actually happening.  I can already see myself when I’m 70, regaling people with stories about what it was like to actually see a data center.

OK, deep breath.

Second, is 30 milliseconds acceptable?  For my needs, absolutely.  My students can learn what they need to, and the 30 millisecond hit per interaction is not going to inhibit that process.  It’s certainly a reasonable price to pay, especially considering there is nothing to maintain.

What is the enterprise implication?  Is an extra 30 milliseconds going to be too much?  An obstacle that inhibits business processes?  We shall see.  For local databases and remote web/application servers, perhaps.  Perhaps not.
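The reason it might matter is that the delay is paid on every round trip, and application servers often make many database calls to satisfy one request.  A minimal Python sketch of the arithmetic, assuming the calls run one after another; the 30 milliseconds is from my test above, and the call counts are hypothetical.

```python
def added_latency_ms(round_trips: int, latency_per_trip_ms: float = 30.0) -> float:
    """Total extra wait when every sequential database call pays the same delay."""
    return round_trips * latency_per_trip_ms


# Hypothetical numbers of sequential database calls per user request.
for calls in (1, 10, 50):
    print(f"{calls:>2} calls: {added_latency_ms(calls):.0f} ms added")
```

One call per request is invisible; fifty sequential calls add a second and a half, which users will notice.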

This is why we test, remembering that despite what a remarkably amazing toolset AWS represents, we are still bound by the speed limit of light.

AWS Midwest Region, anyone?

Enthusiasm – it’s contagious

I just interviewed a guy today for a position as a Ruby on Rails developer. This would be his first software development position, having previously worked as a customer service rep for an insurance company. His background includes a degree in film, so not your typical coder back story. He was a confident guy, well-spoken and quite personable. While speaking with him I could feel how excited he was at the prospect of continuing to work with the platform. But it wasn’t just what he would be doing, but also where he would be doing it. He complimented the campus (something you hear a lot about Notre Dame), was even happy about the color in the trees, but he stated that he couldn’t wait to help work on things that would influence change. To him working in a higher education setting, having the ability to impact the students and make a positive change for the world as a whole was quite invigorating.

I found myself coming out of that interview with a new energy for my position. I think too often we start to take for granted where we are in life, and forget to really appreciate what we are doing and the influence we can actually have in the world. I have a really cool career, get the opportunity to work with some really great people, and no matter how long the walk is in to work every day, the view is always incredible. So as we look to the future in our careers, trying always to get somewhere better, take a moment to remember what made you so excited when you first started down the path you are on.