Tag: statistics

Understanding Scale

There’s all sorts of bad advice about how people just aren’t trying hard enough to not be poor — if only you saved more money, as if there were a surfeit of money around to save. Work more, as if you could add a couple of extra hours to each day or just jam another day into the week. And this guy … who evidently thinks the whole problem is that people don’t understand … scale?

The funniest part to me? This dude wants to start with “you don’t understand scale, I’m gonna educate you …” and then proceeds to not understand scale. Small-scale purchases will yield the highest price per pound — someone who is buying tomatoes by the tonne certainly isn’t paying a buck a tomato or even fifty cents a tomato. What’s the price for a tonne of tomatoes? The tomato price per tonne data I’ve found are a little outdated, but let’s say $100 a tonne for easy mental math. Even if these tomatoes weigh a pound each (unlikely), then every 2k tomatoes gets you $100. He has about 4 million tomatoes … so 2,000 tonnes of tomatoes @ $100 a tonne grosses $200,000. In addition to not understanding scale, he is not understanding gross vs. net income. And, well, tomatoes.
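For anyone who wants to poke at the numbers, here is a minimal sketch of that back-of-the-envelope math. The constants are the post’s own round figures (the $100 a tonne price is an easy-math assumption, not a market quote), and the variable names are just for illustration.

```python
# Round figures from the paragraph above; none of these are market data.
TOMATOES = 4_000_000        # "about 4 million tomatoes"
TOMATOES_PER_TONNE = 2_000  # generous: a pound each, roughly 2,000 lb to the tonne
PRICE_PER_TONNE = 100       # dollars, assumed for easy mental math

tonnes = TOMATOES / TOMATOES_PER_TONNE
gross = tonnes * PRICE_PER_TONNE

print(f"{tonnes:,.0f} tonnes at ${PRICE_PER_TONNE}/tonne grosses ${gross:,.0f}")
```

Swap in a different bulk price and the gross scales linearly, but it stays gross: none of the growing, picking, or hauling costs are in there.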

Even if we ignore the required land (which wouldn’t be trivial — planting 150k tomato plants with adequate spacing is going to be 10+ acres), equipment, and labor required to produce and harvest all of those tomatoes, the logistics don’t work. Say they ripen over a 90 day period (which is super generous in my part of the world, but again pretending it’s reasonable for the sake of argument): you need to move some 44,000 tomatoes A DAY for 90 days. Where are these things going as they get picked? How do I transport them to these hypothetical customers? And who are these customers? Even if every customer buys ten tomatoes a week, I need over 30,000 unique customers (every single one of whom repeats their ten tomato a week purchase for three months straight). Are there actually 30,000 people willing to buy a $10/week tomato subscription for the entire harvest season?
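The daily volume and customer count fall out of a few divisions. This is a rough sketch using the same assumed inputs as the paragraph above (4 million tomatoes, a 90 day harvest, ten tomatoes per customer per week).

```python
# Assumed inputs, taken from the paragraph above.
TOMATOES = 4_000_000
HARVEST_DAYS = 90
TOMATOES_PER_CUSTOMER_PER_WEEK = 10

per_day = TOMATOES / HARVEST_DAYS
per_week = per_day * 7
customers_needed = per_week / TOMATOES_PER_CUSTOMER_PER_WEEK

print(f"Tomatoes to move per day: {per_day:,.0f}")
print(f"Weekly customers needed:  {customers_needed:,.0f}")
```

A shorter, more realistic ripening window only makes both numbers bigger.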

This guy’s hypothetical tomatoes aren’t an example of scale, they’re an example of generational wealth. If you inherited a few thousand acres of land (probably complete with an irrigation system and greenhouses), equipment, warehouses, and a fleet of trucks to move ’em … then maybe you could employ a lot of people for planting, harvesting, and selling at farm markets where you might hope to get something even approaching a buck a tomato. Even then, you aren’t netting hundreds of millions of dollars — you’ve got electrical, transportation, and labor expenses to pay. That’s not building a tomato empire from fifty bucks and a handful of tomato plants — that’s millions of dollars in inherited assets to net maybe a million bucks a year.

Making Statistics Work for You

The local newspaper had a poll (in a heavily Republican area) asking if readers support gun control — now they didn’t define “gun control”, so it’s possible some individuals said “no” because they envisioned something unreasonably restrictive or some said “yes” because they think ‘gun control’ includes arming teachers in classrooms or something. Based on the way they elected to bucket the data, there’s no clear “winner”.

But looking at it as just ‘yes’ or ‘no’ — almost 80% of the readers said “yes”.

They could break it out by party affiliation and show that only 10% of self-identified Democrats said they don’t support gun control, while 28% of self-identified independents and 24% of self-identified Republicans don’t support gun control.

But any of these charts clearly show that a significant majority supports some type of gun control.

Center-Right

I keep seeing that this is a “center right” country, but the election results we’re seeing make me question this analysis. I see ‘center right’ as an average without a standard deviation. If it’s 70 degrees every day, the average temp is 70. If it’s 100 degrees half of the year and 40 the other half, the average temp is also 70; but that average hides two very different realities. It’s the standard deviation that shows you how representative an average *is*.
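The temperature example is easy to check. Here is a small sketch using Python’s `statistics` module, with twelve monthly readings standing in for “every day” and “half of the year” (that simplification is mine).

```python
from statistics import mean, pstdev

# Twelve monthly readings per place; a simplification of the example above.
steady = [70] * 12             # 70 degrees all year
swingy = [100] * 6 + [40] * 6  # 100 half the year, 40 the other half

for name, temps in (("steady", steady), ("swingy", swingy)):
    print(f"{name}: mean={mean(temps):.0f}, std dev={pstdev(temps):.0f}")
```

Both places report a mean of 70, but the standard deviation is 0 for the first and 30 for the second, which is the whole point: the same average, very different places to live.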
 
If there were a low standard deviation on center-right, then the Democratic party’s approach would make sense — you’re pretty close to their moderate position, so earning your vote is possible. If there’s a high standard deviation, there’s no appealing to “the other side” — your task is to energize people on “your side”, get them enthusiastic about voting, get them engaged in getting their friends out voting.

Influenza Data

Scott hypothesized that 2020 should have a fairly low rate of illness apart from SARS-CoV-2. The preventative measures taken to limit the spread of this virus should also have reduced the number of people with colds, flu, etc. There’s no way to tell for mild illnesses, but I knew the CDC tracked flu and pneumonia cases … you can link the CDC’s CSV data sources into Excel, create a Pivot table to get rows of week numbers or months & columns of year-by-year case counts, then create a chart that compares case counts year-to-year. Unfortunately, they have a new file name each week. You’ve got to find the latest URL from https://www.cdc.gov/flu/weekly/index.htm
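If you would rather skip Excel, the same pivot (week numbers down the side, one column per year) can be sketched in a few lines of Python. The column names `year`, `week`, and `count` are assumptions for illustration, not the CDC file’s actual headers, and the sample rows are placeholders that only show the shape of the result.

```python
import csv
import io
from collections import defaultdict

def pivot_by_week_and_year(csv_text, year_col="year", week_col="week", count_col="count"):
    """Return {week: {year: total}} from CSV text. Column names are assumed."""
    table = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(csv_text)):
        table[int(row[week_col])][int(row[year_col])] += int(row[count_col])
    return {week: dict(years) for week, years in sorted(table.items())}

# Placeholder rows only; these are not CDC figures.
sample = "year,week,count\n2019,1,10\n2020,1,12\n2019,2,11\n2020,2,15\n"
for week, by_year in pivot_by_week_and_year(sample).items():
    print(week, by_year)
```

To use it on the real data, read the downloaded CSV into a string and pass whatever the file’s actual column headers are.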

I was surprised to see 2020 significantly higher than the previous two years through the end of April and bumping back up again between weeks 26 and 27 (late June / early July)

Broken out by state and filtered to a few states to make the chart readable, I see the same trend. 2020 is generally higher than 2019 or 2018.

The significant increase in pneumonia deaths this year? That’s probably not people who actually had pneumonia completely unrelated to SARS-CoV-2. The influenza/pneumonia data set includes an “All Deaths” column — which depicts the excess deaths for 2020 (I assume the past month or so of data is not yet finalized, as the numbers fall off sharply in the final weeks of the data set).

Mid-stream

Hospitals have been instructed to provide SARS-CoV-2 data to HHS instead of CDC. CDC falls under HHS so it’s a little like having the “parent company” handle something some subsidiary used to do. Which means the move isn’t as alarming as some people are making it out to be. The ‘parent company’ will have the authority to more readily mobilize resources, and moving responsibility for a project to the parent company can signify the importance of the project.

Which isn’t to say I think it’s a good move … from an IT perspective, CDC has the infrastructure in place to handle the reporting & publicizing of data. About the best case would be a reorganization — same people supporting the same thing, but adding in the uncertainty of a new organizational structure (new processes, new priorities, a new person’s take on what you should be doing). If HHS is taking over that system, there’s opportunity for failure because the new people don’t know what the old people know. If HHS is bringing up a new system, there’s a LOT of opportunity for failure because, well, it’s a new system. Mid-disaster isn’t when I’d want to change my reporting process. Maybe run two in parallel because the new one is going to provide some great new insights. But I would never say “hey, everyone, stop using A and move over to B on Thursday”.

Additionally, it doesn’t inspire confidence that the HHS website has been throwing a lot of connection errors since the announcement. I expect it’s a load problem as people begin to learn what HHS is … but ‘the guy who cannot keep his website online will be taking over statistics for us’ is not exactly the direction I’d move critical reporting.

Statistical Coverup

I keep encountering people who cite the fact that “only” half a percent of kids who get SARS-CoV-2 are dangerously ill. A small percentage of a very large number is still *a large number*.
 
The Department of Education estimated 50,800,000 public school students started the 2019-2020 school year. School admission rates have been trending up, but 2019 is the latest available data. Data from the CDC puts ICU admittance for children infected with SARS-CoV-2 at 0.58% (between 0.58% and 2%, but I’ll use the lower number since I haven’t encountered an ‘only two percent’ argument).
 
If only 1% of the kids who enter public school get infected, that’s over 2,900 kids in the ICU. If 5% get infected, that’s over 14,000 in the ICU. I doubt anyone would make the argument “Schools should re-open because only 14k kids are going to end up in the ICU”.
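The multiplication behind those figures, as a quick sketch. The enrollment number and the 0.58% ICU rate are the ones quoted above; the infection rates are the hypothetical scenarios from this post, not forecasts.

```python
STUDENTS = 50_800_000  # public school enrollment estimate quoted above
ICU_RATE = 0.0058      # 0.58%, the low end of the range quoted above

def icu_admissions(infection_rate):
    """Expected ICU admissions if this share of students gets infected."""
    return STUDENTS * infection_rate * ICU_RATE

for rate in (0.01, 0.05):
    print(f"{rate:.0%} infected: about {icu_admissions(rate):,.0f} kids in the ICU")
```

Using the 2% upper end of the ICU range instead would more than triple both results.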

SARS-CoV-2 Visualizations

I see charts of the cumulative number of infections (‘the curve’) and the number of tests administered … but comparing the daily number of tests to the cumulative number of infections is not particularly meaningful beyond seeing that the increase in infections is still rather exponential.

A better visualization compares the cumulative tests to the cumulative infections (or, for less staggering numbers, the daily tests administered and the daily number of new infections identified). No, it doesn’t appear that ‘the curve’ is flattening. I’m curious to see, however, what impact multiple states going into lock-down will have in a week or two.

Looking at a number of infections, especially compared across the globe, provides a bit of a distorted view. Comparing countries by the percent of the population that’s been identified as infected instead of the raw number of identified infections avoids the appearance that small countries are less impacted (and that highly populated countries are disproportionately impacted).
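The normalization is a single division. In this sketch the two countries and their counts are made-up placeholders (not real data), chosen only to show how the ranking can flip once population is taken into account.

```python
def percent_infected(infections, population):
    """Identified infections as a percent of the population."""
    return 100 * infections / population

# Hypothetical placeholders: (identified infections, population).
countries = {
    "Smallland": (5_000, 500_000),
    "Bigland": (50_000, 50_000_000),
}

for name, (infections, population) in countries.items():
    pct = percent_infected(infections, population)
    print(f"{name}: {infections:,} infections, {pct:.2f}% of population")
```

The hypothetical big country has ten times the raw count, but the small one has ten times the share of its population infected.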

Did you know … Windstream’s Teams usage statistics through 2Q2019

Windstream will replace Skype for Business with Teams by the end of 3Q2019. That’s only three months away, so I wanted to provide another update on our progress toward this goal. There are just under 14,000 IM-enabled accounts (this includes both employees and contractors). About 3,800 accounts (about 27% of Windstream’s IM users) have been upgraded to Teams Only. Two thirds of the company should have already received communication letting them know when their accounts will be upgraded. The remainder of the company will see messages throughout July and August. If you are ready to upgrade before your scheduled date (visit Stream for more info on what to expect when your account is upgraded), use <redacted> to upgrade your account.

Teams accounts for around 80% of Windstream’s IM activity. This is a significant change from the same time last year when under 10% of our IM traffic was in Teams.

From 20 May onward, there have been more people logging into Teams than Skype each day. About 80% of our IM enabled accounts have logged into Teams at least once; and over 8,000 people, about 60% of our IM enabled accounts, are logging into Teams each day. If you aren’t one of these people who are already using Teams, check it out.

In the past year, the number of chat messages sent in Teams has increased 20-fold – from under 10,000 messages a day in July 2018 to around 200,000 messages a day in June 2019. We’ve seen a reduction in the number of instant messages sent through Skype – from 100,000 daily messages last year to under 50,000 daily messages this month. Some messages that would have been sent in Skype are now being sent in Teams, but the number of IMs sent across the company has doubled in the past year too – from 110,000 messages each weekday this time last year to 220,000 messages each weekday today.

How are people accessing Teams? Teams is predominantly accessed with the Windows desktop app. About 80% of the people using Teams each day use the Windows desktop app (OSX users haven’t been left out, but there are only 20 people using the Mac desktop app). About 5% of the people using Teams each day use the web client. Most of the desktop features are available in the web app, and you can use the web app from a computer that isn’t managed by Windstream. While screen sharing isn’t yet available in private chats, screen sharing is available in meetings when using the Chrome browser. We’ve seen mobile app usage increase from about 10% at the beginning of the year to 15% today, and the mobile app is used by about 30% of Teams users over weekends. The Teams mobile app makes it easy to check in at work while you’re working away from the office, and setting ‘quiet time’ keeps work from intruding on your time.

At the beginning of the year, mobile app usage was split about 50/50 between Android and iOS. By the end of 1Q2019, the percent of Android users dropped to 40%; and, at the end of 2Q2019, iOS accounts for 75% of the mobile app usage.

The Teams: Teams is more than just a replacement for Skype. In addition to private chats, Teams offers collaboration spaces too. The Teams spaces include conversations, shared files, tabs – even a SharePoint Online site. More than 7,000 messages a day are posted in Teams channel conversations.

There are over 3,000 Teams groups – over 1,000 of which were created this quarter.

Fifteen percent of these groups are public – meaning anyone can join the group. Public teams are a great way to solicit end-user feedback, organize local events, provide mentorship, or even discuss “fun stuff”. You can currently search public Teams groups at <redacted>. Microsoft is currently testing a setting which will allow specially-configured private Teams to be searchable too. Follow our Stream space – we’ll post information on how to mark your private Team as searchable when the feature becomes available.

82% of our Teams have between 2 and 25 members. For anyone wondering what the point of a Team with one person is … Teams doesn’t let you send messages to yourself, but you can send messages to a Team that is just you. You can add tabs to provide quick access to your frequently used sites, use connectors to make external data readily available (as an example, I use my Teams space as an RSS aggregator), and play around with Teams functionality without annoying your colleagues.

59% of our Teams have had conversation activity in the past week. Most of the Teams with no conversation activity in the past year have been archived. Archived Teams keep information accessible – visible and searchable – without risking individuals starting conversations in a Teams space that is no longer watched by members.

If you want to see more information about Windstream’s Teams usage, current Teams usage information is available at <redacted>.

Unimaginably Large Numbers

Unimaginably large numbers are, unfortunately, hard to conceptualize. FEMA has delivered 6,200,000 gallons of water to Puerto Rico in the month since Hurricane Maria hit the island. That sounds like a lot of water and probably makes for a good press release. Problem is there are 3.5 million residents, who should each drink half a gallon or so a day (3/4 of a gallon is the WHO recommendation for an adult, but there are kids there too, and I like lazy math). There have been 30 days since the hurricane struck. Three and a half million people drinking half a gallon of water a day for thirty days is 52,500,000 gallons of water. That’s not quite 12% of the water needed, and my estimate is significantly low.
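The lazy math, spelled out as a short sketch. Every input is a figure from the paragraph above, including the deliberately low half gallon per person per day.

```python
DELIVERED_GALLONS = 6_200_000
RESIDENTS = 3_500_000
GALLONS_PER_PERSON_PER_DAY = 0.5  # deliberately low estimate from the post
DAYS = 30

needed = RESIDENTS * GALLONS_PER_PERSON_PER_DAY * DAYS
share_delivered = DELIVERED_GALLONS / needed

print(f"Needed:    {needed:,.0f} gallons")
print(f"Delivered: {share_delivered:.1%} of that")
```

Bump the per-person figure toward the 3/4 gallon adult recommendation and the delivered share drops further.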

Doesn’t sound quite so impressive if you say FEMA has delivered roughly 10% of the water needed in Puerto Rico. It also makes breaking into superfund sites to access water more understandable. With a 100% chance of death if you don’t get water, even a 98% chance of death from poisoning is a better option.