Digging for Truth




Microsoft: Take the VistaChallenge!

Read the VistaChallenge (pdf) posted on http://www.thenixedreport.com/. Spread the challenge. :)

Microsoft has learned that it can no longer control who gets to play in the sandbox - Sum Yung Gai



Find your Lunix






Distribution of one-month page hit "popularity" data sampled from distrowatch.com on April 6, 2007. The red curve is a fit of the data to a Pareto-like distribution function.

y = 78750 * 1.25 * (x + 3)^(-1.25)

The x+3 is an offset in the index into the distro list and may be thought of as due to several "distros" not sampled. Like ... how about Windows, Apple Mac, and Unix? Well, maybe. ;)

My post on distrowatch.com

The Distribution of Distro Popularity

http://distrowatch.com/weekly.php?issue=20070402&mode=28#comments
Comment 167: The Distribution of Distro Popularity (by Fractalguy on 2007-04-07 03:00:35 GMT from United States)

It looks like a good question might be how many distros should be featured in a "top" list? Top 10? Top Dozen? Top Hundred?

How many are in the database? I don't know, but some very nice ones are not getting any notice at all - and their creators don't mind, at this point. Maybe they aren't ready to go "public". I'm guessing it might push 500 in all.

So, how about a little analysis then. I'm thinking the Distribution of Distro Popularity (gotta love that title) is one of those long tail things "they" write about. I used the long tail maths in a work-related report about 9 years ago. :)

For background see http://en.wikipedia.org/wiki/The_Long_Tail

Anyways, the long tail can add up to a very large total, depending on where you cut off the sum. And some companies are learning that lesson the hard way (think brick-and-mortar bookstores vs. Amazon, which understands it). The long tail represents the little-known books or songs that only get bought once in a while. And these numbers add up to be more than the top 40. So, you ignore the long tail at your own risk.

Here on Distrowatch, the total (one month) hits over the first 100 distros today is, by my Gnumeric spreadsheet table, 32821. So how about if we say the "Top" list should cover half of the "voting" visitors. Then that would be, it turns out, the top 11, pretty close to a "Top 10". Interesting.
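For anyone who wants to try the "cover half the visitors" cutoff on their own numbers, here is a minimal Python sketch. The function name `top_n_for_share` and the sample counts are made up for illustration; they are not the Distrowatch data and not the spreadsheet I used.

```python
from itertools import accumulate


def top_n_for_share(hits, share=0.5):
    """Return the smallest N such that the N largest counts
    hold at least `share` of the total."""
    ranked = sorted(hits, reverse=True)  # most popular first
    total = sum(ranked)
    for n, running in enumerate(accumulate(ranked), start=1):
        if running >= share * total:
            return n
    return len(ranked)  # only reached for an empty list


# Hypothetical hit counts, for illustration only.
example_hits = [40, 30, 15, 10, 5]
print(top_n_for_share(example_hits))
```

Feeding it the 100 sampled hit counts instead of the toy list would give the size of the "Top" list for that sample.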

I modeled the data from Distrowatch with a Pareto function, see http://www.statsoft.com/textbook/stdisfit.html#pareto.

When I extrapolate the function out to 500 distros, to cover more of the rare but important long tail, the accumulated total is about 40890, half of which falls within a Top 20 list.
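The extrapolation step can be sketched the same way: sum a fitted curve over ranks 1 to N, then find the rank where the running sum passes half of that total. The sketch below assumes a curve of the form scale * (rank + offset)^(-exponent). The function names are hypothetical, and the parameters in the example are toy values, not my fitted constants, so it will not reproduce the totals quoted above.

```python
def pareto_like(rank, scale, offset, exponent):
    """Pareto-like curve: scale * (rank + offset) ** (-exponent)."""
    return scale * (rank + offset) ** (-exponent)


def extrapolated_total(n_distros, scale, offset, exponent):
    """Sum the curve over ranks 1..n_distros."""
    return sum(
        pareto_like(rank, scale, offset, exponent)
        for rank in range(1, n_distros + 1)
    )


def half_coverage_rank(n_distros, scale, offset, exponent):
    """Smallest rank whose running sum reaches half of the extrapolated total."""
    half = extrapolated_total(n_distros, scale, offset, exponent) / 2
    running = 0.0
    for rank in range(1, n_distros + 1):
        running += pareto_like(rank, scale, offset, exponent)
        if running >= half:
            return rank
    return n_distros


# Toy parameters for illustration only (assumed, not fitted to any data).
print(half_coverage_rank(4, scale=100.0, offset=0.0, exponent=1.0))
```

Swapping in fitted constants and n_distros=500 would repeat the calculation for a longer tail.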

Hmmm. So, what to do with the extra 10 distros? Well, I think they might satisfy both the "More taste, Less filling" crowds and the "More eye candy, less bloat" fans, if not most of them. Will the long tail ever merge with the top distros? I doubt it and don't see why it should. Many might run a Big Ten distro on one machine/partition while tinkering with their favorite on another.

You can see the chart on my web page.

http://www.geocities.com/e_8013/index.htm#Distrowatch



Linux distributions I've installed or explored (live CD): distro-list.txt

