Monday, June 9, 2014

IC Projects: The Switch Upgrade of 5/29

IC isn't a department for slackers. It turns out that maintaining all the networks and computing systems of this institution is a hefty workload, but each day we rise like a coffee-fueled phoenix from the ashes of yesterday to meet the challenges of tomorrow.

This post is the first in our intermittent summer blog series, IC Projects. Grab your spectacles and a spade, and let's dig into the events of 5/29.

Some of you might recall a few services being down late into the night of 5/29. If you were wondering what that was or why it happened, it was IC completing a project, and I am here to tell the tale with the help of IC's lead network wizard (and turkey hunter extraordinaire), Ken Lambert.

Between his tasks, I grabbed Ken to discuss exactly what went down on the 29th so that you, the SU user, could be in the know.

Q: Could you describe for SU users what happened on the 29th?
A: Basically, the core of our network needed to be upgraded. This would be like upgrading the processor of a computer, or the brain of a human (if that could be done). That's a good way of putting it. We essentially upgraded the processor of our network. We went bigger, better, and faster on this one. Because it was our primary switch that we upgraded, we had some disruption of services. Imagine trying to upgrade a computer's processor without turning off the machine, or upgrading a human's brain without putting them to sleep, you know, if that were possible. Keep in mind that there is a lot of technical stuff going on here, so this simple explanation does not describe everything perfectly. Additionally, we upgraded the link to HPB from a 1-gig to a 10-gig connection, and we upgraded the switch at the pharmacy school to one that is 20x faster. In short, there is a lot more speed in some critical areas.

Q: Wow, so your team did a ton that night. What was the impact on end users, if any, as a result of this upgrade?
A: At this point, end users might not notice any significant impact, because we did this upgrade before there was an issue. In the evenings, when we back things up, we were noticing that the backups would push our network up to 50% of its capacity. Now, after the upgrade, what used to be 50% of capacity on a 1-gig connection at peak load is much less significant, because the connection is now a 10-gig connection. Again, this was all done to stay ahead of the curve.
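
For the curious, here is a rough back-of-the-envelope sketch of the arithmetic behind Ken's answer. It is only an illustration: the function name `utilization_percent` is made up for this example, and the 500 Mbps figure is an assumption (half of a 1-gig link), with the further assumption that the backup traffic itself stays the same after the upgrade.

```python
def utilization_percent(load_mbps: float, capacity_mbps: float) -> float:
    """Return how much of a link's capacity a given load uses, as a percentage."""
    if capacity_mbps <= 0:
        raise ValueError("capacity_mbps must be positive")
    return load_mbps * 100 / capacity_mbps


# Assumption: nightly backups push roughly 500 Mbps, i.e. 50% of a 1-gig link.
BACKUP_LOAD_MBPS = 500

old_link = utilization_percent(BACKUP_LOAD_MBPS, 1_000)   # 1-gig (1,000 Mbps) link
new_link = utilization_percent(BACKUP_LOAD_MBPS, 10_000)  # 10-gig (10,000 Mbps) link

print(f"1-gig link:  {old_link:.0f}% used")   # prints "1-gig link:  50% used"
print(f"10-gig link: {new_link:.0f}% used")   # prints "10-gig link: 5% used"
```

Under those assumptions, the same backup load that filled half of the old link would take up about a twentieth of the new one, which is the headroom Ken is describing.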

Q: I heard through the grapevine that you all were up pretty late making sure this project was completed properly. Is that right?
A: Yeah, we left at about ten minutes to three, and we were back at work at 8 a.m. the next day. By about 8:30 p.m. on 5/29, the internet, Blackboard, and Datatel servers were back up and running, but in this particular case there was some more to do. Because this was such a large upgrade, there were some adjustments to make, but ultimately we got it all finished by about 2:50 a.m. on 5/30.

In Summary:  

On 5/29, IC performed an upgrade, not because there was a major issue or failure of any kind, but because we could see issues coming. We took preemptive action so that SU users continue to experience quality networking services, even during times of peak load (meaning lots of stuff is happening all at once). Now that there is a faster connection between main campus and HPB, we have additional networking capacity, which will ultimately give the IC department more options for serving SU down the road.
