We used to not have very good internet at our office. But our internet should have been good. Because our company pays Comcast for good internet. Yet, the service we were receiving on a daily basis would oft not allow us to even view emails.
This is the story of how we used data to prove to ourselves, and the world, that the provided internet was indeed… not very good. Not to ruin the ending, but it seems Comcast did something about it… though a new set of problems has crept up, which are totally their fault.
Now let me set the scene. I work for a mobile development firm, and we often use the internet to get things done. Fading internet can get you worked up fast. See below.
The Chatter Continues, and Eyebrows Elevate
There was a good month in there where we further identified the problem. Namely that there was bad internet. Many times the bad quality of the internet was mentioned. But talk is cheap, and internet is not, so something had to give.
The Business Analyst Gets Involved
Now I’m not the kind of business analyst who just sits around while her developers flounder. I may not be the most technical at our company, but I never shy away from the opportunity to collect data.
In short, I decided that I was going to fix the internet.
And with no little coincidence, the project that I have been working on for the past couple of months, Optiko, turned out to be my first ‘in’ to the whole mess.
Optiko is a software solution that helps companies troubleshoot issues for their enterprise mobile devices. It works by continuously collecting data from each mobile device in the system. Then we massage the data a bit and display enterprise-wide results for all to see.
And guess what? One of the most common issues that companies face is bad network connectivity. So Optiko collects a lot of data about wifi. And guess who has Optiko installed on the wifi-connected phone that sits on her desk at work? Yeah, this girl. Time to get the data cranking.
With my first round of data digging, the thing I noticed was that my phone was most often connecting to the access point in the office that was furthest away, with the worst signal.
I was most often connected to the production room access point, even though my home base was in the Fish Bowl.
On the graph below, each circle represents a signal strength data point. The color of the circle represents which access point I was connected to. The more negative the signal strength, the worse the signal is. Below -80 dBm is considered pretty bad.
When I talked with the lads around the office about the connection patterns of my phone, they suggested that we might have too many access points or that there might be channel interference. Either circumstance could cause packet (data) loss and dropped connections.
Now, one must consider that I was looking at data from my mobile device and not my computer – the connection patterns would be different. However, with this new lead, I decided to keep on diving deeper.
I knew that I could likely collect a lot of this information from the command line. So when I went in search of commands to run, I looked for things that could show me:
- What the channel configuration of our access points was
- What access point I was connected to, given the signal strength and channel of all access points
- What my packet loss was at work compared to my packet loss at home.
Command Line with Airport
Because of course you TOO would like to find out what your internet is doing, I’ve provided some instructions on what to do (if you have a Mac):
First, run the command: sudo ln -s /System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport /usr/sbin/airport
After you run that, the following commands get you the good info:
/usr/sbin/airport -s : Gives you a list of all the access points your computer can see, their signal strength (RSSI), what channel they are on and their security settings. In the picture below, our APs have the common SSID of ‘BunderMifflin.’
/usr/sbin/airport -I : Gives you information about the network you are actually connected to.
Great info, right? RIGHT! It even answered my questions about our channel configuration – we actually did have a lot of redundancy in our channels (not shown in picture above), so we definitely had some opportunity to do some optimization there.
But remember that I wanted to know how my connection behaved over time. I needed to continuously run these commands. Again, I questioned the lads, and Alex suggested I run my commands in a cron job (cron is a Unix utility, available on both Linux and macOS, that will run a command or a script at a scheduled date/time).
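For anyone who would rather skip the GUI, a cron job is just one line in your crontab. Here is a minimal sketch of how such a line can be built; the script and log paths are hypothetical, and the every-minute schedule is an assumption rather than the schedule I necessarily used.

```shell
#!/bin/sh
# Print a crontab entry that runs a script every minute and appends
# its output (and any errors) to a log file.
# Both paths passed in are hypothetical examples.
cron_line() {
    script_path=$1
    log_path=$2
    # The five "*" fields mean: every minute, hour, day of month, month, weekday.
    printf '* * * * * %s >> %s 2>&1\n' "$script_path" "$log_path"
}

# Example (not run here): append the entry to the current user's crontab.
# (crontab -l 2>/dev/null; cron_line "$HOME/wifi/scan.sh" "$HOME/wifi/scan.log") | crontab -
```

Cron runs jobs with a minimal environment, so absolute paths for both the script and the log are the safer choice.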
Here is the cron job GUI editor that I used:
Not being particularly familiar with the command line and having never written a script or made a cron job before, I proceeded to spend a couple hours trying to sort it all out. Once I had reached a point of frustration, Alan took a look at my work and pointed out that my file paths were not consistent. With that nice little fix, I began to pump out data. (All scripts can be found at the end of this article)
Now remember. The premise of this blog is that Comcast was the big culprit with the internet issues. Based on our pattern of service, such as all having connection problems at the same time of day at peak internet usage hours, I was pretty sure that Comcast was the bad guy. But our problems could have stemmed from several sources, and I wanted to make sure there wasn’t anything we could fix on our end.
From the data that we gathered about channel configuration, we decided to optimize what channels our APs were on. If you look at the graph below, each color represents a different AP. There were changes made on May 15th and June 3rd. You can see that the number of different channels in the sampling increased through these changes.
Now looking at this next graph, you can see that my computer did in fact change its behavior of connection, albeit slightly, following the changes made on May 13th and June 3rd. Again, each color represents a different AP that I am connected to throughout the day.
This last graph shows the RSSI signal range of my computer, post all of the channel changes. Each color represents a different AP BSSID. My conclusion from it is that I have decent signal strength. At this point, I don’t believe that our APs are the culprits. Coming up next, turning up the heat on Comcast.
Turning Up the Heat
Enough with inward reflection. I knew that in order to pin the bad wifi tail to the Comcast donkey, I would have to provide some data that revealed the issue. I wrote another script to run in a cron job to conduct a ping test to google.com. The theory here was that if we experienced packet loss going out to the Googs, Comcast was probably to blame.
The results supported my every suspicion and fear about the quality of our internet. Just look at the difference between the packet loss going on at BlueFletch (Comcast Business) vs. my home network (Comcast Xfinity Home) vs. my parents’ network (Verizon Fios). Mmm, not great, yeah?
Now keep in mind that the graph above is showing you average packet loss. So, good times mixing in with bad times. Let’s see what it looks like on the minute-to-minute. My home network packet loss generally looks like this:
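To see how an average can hide the bad minutes, here is a small sketch that reports both the mean and the worst minute from a packet loss log. The log format (one line per minute, a timestamp and a loss percentage separated by a tab) is an assumption made for illustration.

```shell
#!/bin/sh
# Read "timestamp<TAB>loss_percent" lines on stdin and print the number
# of samples, the mean loss, and the single worst sample.
# Lines with an empty loss field are skipped.
summarize_loss() {
    awk -F'\t' '
        $2 != "" {
            sum += $2
            n++
            if (n == 1 || $2 + 0 > max) max = $2 + 0
        }
        END {
            if (n) printf "samples=%d mean=%.1f max=%.1f\n", n, sum / n, max
        }
    '
}

# Example (not run here), with a hypothetical log file:
# summarize_loss < "$HOME/wifi/packet_loss.log"
```

Two clean minutes and one minute at 30% loss average out to 10%, which looks far milder than the minute in which nothing would load.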
But try to use the internet at BlueFletch on an afternoon around, say, 2:42pm. Then the horrid packet loss will look a little something like this:
To Make Comcast Fix Internet
I had fun learning about scripts and cron jobs and command line and wifi data, so I can’t be the hero of this story. The real hero is Lauren Lynn. See, I didn’t actually fix the internet. She did. In many long, arduous conversations with Comcast. It was a touch-and-go process for sure, as the bad internet did not always rear its mighty head at appropriate times.
A Soft Conclusion
The only way I can conclude this post is to say that Comcast got their act together for about a week. The sort of terrible thing that happened next is quite unfortunate. See, in all of my digging, I also found that we probably did not pay for the level of internet service that we were expecting. Granted, we weren’t even receiving what we paid for, but perhaps 50Mbps download and 10Mbps upload is not good enough internets for an office of our type.
So we upgraded to a better plan. 100Mbps download and 20Mbps upload. It was good for one day, then our packet loss soared to unprecedented levels.
The sort of confusing thing about this latest development is that it seems like the increased service that we have sort of kind of a little bit offsets the increased packet loss. But still. COMCAAAAAST!
That concludes our blog post! Below please find screen shots of the scripts that I used in the cron jobs. Thanks!
This script shows you all the networks that your computer can see. I’ve limited it to only display the ones that contain the name of our network.
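A script along these lines might look like the sketch below. It assumes the /usr/sbin/airport symlink from earlier in the post; the function name and the timestamp prefix are illustrative choices, not a transcription of the screenshot.

```shell
#!/bin/sh
# Path to the airport tool (the symlink created earlier) and the network
# name to keep. Both can be overridden from the environment.
AIRPORT="${AIRPORT:-/usr/sbin/airport}"
SSID="${SSID:-BunderMifflin}"

# Read `airport -s` output on stdin, keep only the lines that contain
# the given network name, strip the leading padding, and prefix each
# kept line with a timestamp.
filter_scan() {
    awk -v ssid="$1" -v stamp="$2" '
        index($0, ssid) { sub(/^ +/, ""); print stamp, $0 }
    '
}

# Example (not run here): one timestamped scan of our own access points.
# "$AIRPORT" -s | filter_scan "$SSID" "$(date '+%Y-%m-%d %H:%M:%S')"
```

The timestamp is there because a cron job appends every run to the same file, and without one you cannot tell the scans apart later.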
This guy shows you information about the specific network that you are connected to.
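As a rough sketch of the idea: `airport -I` prints one "key: value" pair per line, so a script can pull out the fields it cares about and log them as a single row. The choice of fields and the tab-separated layout below are assumptions for illustration.

```shell
#!/bin/sh
AIRPORT="${AIRPORT:-/usr/sbin/airport}"   # symlink created earlier

# Print the value of one field from `airport -I` output on stdin.
# Keys are right-aligned in that output, so leading spaces are trimmed
# before comparing; the match is exact, so SSID does not match BSSID.
get_field() {
    awk -F': ' -v key="$1" '
        { name = $1; gsub(/^ +/, "", name) }
        name == key { print $2 }
    '
}

# Turn one `airport -I` report (stdin) into a tab-separated row:
# timestamp, SSID, BSSID, channel, RSSI.
info_to_row() {
    stamp=$1
    info=$(cat)
    printf '%s\t%s\t%s\t%s\t%s\n' "$stamp" \
        "$(printf '%s\n' "$info" | get_field SSID)" \
        "$(printf '%s\n' "$info" | get_field BSSID)" \
        "$(printf '%s\n' "$info" | get_field channel)" \
        "$(printf '%s\n' "$info" | get_field agrCtlRSSI)"
}

# Example (not run here):
# "$AIRPORT" -I | info_to_row "$(date '+%Y-%m-%d %H:%M:%S')"
```

Tabs are used instead of commas because the channel field can itself contain a comma.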
This one gives you a second-by-second account of your wifi signal strength. Because why not?
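Cron cannot schedule anything more often than once a minute, so a second-by-second log needs a loop inside the script. Here is a minimal sketch, assuming the signal strength is read from the agrCtlRSSI field of `airport -I`; the function name and its parameters are hypothetical.

```shell
#!/bin/sh
AIRPORT="${AIRPORT:-/usr/sbin/airport}"   # symlink created earlier

# Print "time<TAB>RSSI" COUNT times, pausing INTERVAL seconds
# (default 1) after each sample.
watch_rssi() {
    count=$1
    interval=${2:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        rssi=$("$AIRPORT" -I | awk -F': ' '/agrCtlRSSI/ { print $2 }')
        printf '%s\t%s\n' "$(date '+%H:%M:%S')" "$rssi"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example (not run here): roughly one minute of samples.
# watch_rssi 60 >> "$HOME/wifi/rssi.log"
```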
And at last, the golden jewel. This script shows you your packet loss over the past minute.
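A sketch of how such a ping test can work: send one ping per second to google.com, then pull the loss percentage out of the summary line that ping prints at the end. The function names, the default count, and the log layout are assumptions, not a copy of the screenshot.

```shell
#!/bin/sh
TARGET="${TARGET:-google.com}"

# Extract the percentage from a ping summary line such as
# "60 packets transmitted, 57 packets received, 5.0% packet loss".
parse_loss() {
    sed -n 's/.* \([0-9.][0-9.]*\)% packet loss.*/\1/p'
}

# Send COUNT pings (default 60, about one per second) and print
# "timestamp<TAB>loss_percent". The loss field is left empty if ping
# produces no summary, for example when the name cannot be resolved.
log_loss() {
    count=${1:-60}
    stamp=$(date '+%Y-%m-%d %H:%M:%S')
    loss=$(ping -c "$count" "$TARGET" 2>/dev/null | parse_loss)
    printf '%s\t%s\n' "$stamp" "$loss"
}

# Example (not run here), e.g. from a once-a-minute cron job:
# log_loss 55 >> "$HOME/wifi/packet_loss.log"
```

When cron starts the job every minute, a count a little under 60 keeps one run from overlapping the next.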