Bristol indymedia - Competing With The Evening Post - South West welcome

munkeeunit
Offline
Joined: 10-03-04
Nov 9 2006 11:35
Bristol Indymedia now consistently gets nearly as many hits as the Evening Post's ‘This Is Bristol’ website! If you search for 'Bristol' on UK Google pages, Bristol Indymedia comes out at around 7th place, and jostles for position with the 'This Is Bristol' website on busier days.

Join in with the movement to democratise Bristol’s corporate dominated media landscape. Publish your news story now! It's only a click away! Click the 'Submit An Article' button and your story will appear on the right-hand newswire within minutes of posting.

BRISTOL INDYMEDIA - Read It, Write It, Your Site, Your News...
http://www.bristol.indymedia.org/

PUBLISH YOUR NEWS
http://www.bristol.indymedia.org/publish.php

Even though our newswire carries the banner 'Read It, Write It, Your Site, Your News' it seems that not everyone is yet aware that this means exactly what it says. The 'Submit An Article' button, in the top-right corner, takes you to our article submission form. All you need to do is fill in the form, click the 'Preview Before Publishing' button at the end of the form to check it looks OK, and then click either 'Edit Again' or 'Publish!'. If you have any problems filling out the form, or with uploading articles, please see our brief guide below.

HOW TO CONTRIBUTE ARTICLES - A Brief Guide
http://www.bristol.indymedia.org/newswire.php?story_id=25376

PUBLISH YOUR EVENT
http://bristol.indymedia.org/whatson/calendar.php

If you have an event to promote you can use our calendar too. Again, just like with the newswire, all you need to do is click 'Add Your Event' and follow the instructions, and your event will be immediately displayed in the calendar. If your event has an interesting report attached to it, then you can turn that into an article too, on the main newswire, to go with your calendar entry.

GUIDELINES
Only if your article is outside of, or breaks, guidelines will it be moved from the right-hand newswire.
http://bristol.indymedia.org/editorial.php

BRISTOL INDYMEDIA - Read It, Write It, Your Site, Your News...
http://www.bristol.indymedia.org/

Admin - please don't post thread titles in capitals

Steven.
Offline
Joined: 27-06-06
Nov 9 2006 12:05
munkeeunit wrote:
Bristol Indymedia now consistently gets nearly as many hits as the Evening Post's ‘This Is Bristol’ website! If you do a 'Bristol' on UK Google pages, Bristol Indymedia comes out at around 7th place, and jostles for position with the 'This Is Bristol' website on busier days.

That's not what google searches mean - they count the number of links to a site, not the number of views. To compare the number of views the most effective way is on www.alexa.com

AndrewF
Offline
Joined: 28-02-05
Nov 9 2006 12:13
John. wrote:
munkeeunit wrote:
Bristol Indymedia now consistently gets nearly as many hits as the Evening Post's ‘This Is Bristol’ website! If you do a 'Bristol' on UK Google pages, Bristol Indymedia comes out at around 7th place, and jostles for position with the 'This Is Bristol' website on busier days.

That's not what google searches mean - they count the number of links to a site, not the number of views. To compare the number of views the most effective way is on www.alexa.com

Alexa reduces to the parent domain so you just get the (very impressive) visits for indymedia.org (Traffic Rank for indymedia.org: 2,852)

Also Alexa is not very accurate for ranks above 100,000, or in situations where one or more of the site editors has the Alexa toolbar (Jack does), as this breaks the assumed 'random' nature of their sampling. I suspect this is why libcom has such mad fluctuations (see below) in the alexa graphs - a large percentage comes from the surfing of one or two people.

Steven.
Offline
Joined: 27-06-06
Nov 9 2006 12:33

It was like that before Jack got it as well, I remember you yourself pointed it out then.

But that graph as you'll see is the rank - i.e. how it compares to other sites. You'd expect that to be up and down. The graphs for reach (no. visitors) and page views are much steadier:

Mike Harman
Offline
Joined: 7-02-06
Nov 9 2006 12:56
I don't think that amount of variation in rank is unusual at all:
Steven.
Offline
Joined: 27-06-06
Nov 9 2006 13:09
Mike Harman wrote:
I don't think that amount of variation in rank is unusual at all:

Exactly - the ranks are relative, so they will shoot up and down. They average them out though for a general look. The graph joe posted of average rank to average number of views was accurate for us when he posted it. I wonder if it still is...

gurrier
Offline
Joined: 30-01-04
Nov 9 2006 13:41

Nope. It's just a basic of the methodology.

Alexa only counts visitors who have the alexa toolbar installed. This is a tiny and decreasing percentage of internet users (decreasing due to firefox proliferation and the fact that most organisations with IT policies are smart enough to ban its use). Once you get down to the high numbers, the samples become tiny, so a single visit from somebody using the toolbar will have an enormous effect on rank once it is extrapolated up.
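
To put a rough number on the small-sample effect described above, here is a minimal sketch. It assumes toolbar users are a random sample of a site's visitors (which may well not be true) and that the count of sampled visitors behaves like a Poisson count, so the relative error is roughly 1/sqrt(n). The function name is made up for illustration; nothing here reflects how Alexa actually computes anything.

```python
import math


def relative_sampling_error(expected_sampled_visitors: float) -> float:
    """Approximate relative standard error of a sampled visitor count.

    Assumption: the count is Poisson distributed, so its standard
    deviation is sqrt(n) and the relative error is 1 / sqrt(n).
    """
    if expected_sampled_visitors <= 0:
        raise ValueError("expected_sampled_visitors must be positive")
    return 1.0 / math.sqrt(expected_sampled_visitors)


if __name__ == "__main__":
    # Fewer sampled visitors means a noisier estimate once extrapolated up.
    for n in (4, 100, 10_000):
        print(f"{n} sampled visitors -> about {relative_sampling_error(n):.0%} error")
```

With only a handful of toolbar users in the sample, one extra visit is a large fraction of the whole count, which is the point being made about sites far down the rankings.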

My own experiments with it showed that when two people who visit a site extensively installed the toolbar, the site's rank changed from about 150,000 to about 20,000 despite no change in traffic.

Also, comparing indymedia ireland traffic to its alexa rankings, there is just no correlation at all. The one big recent spike in rank corresponded with a promotion by politics.ie to get their users to use alexa - their spillover usage on indymedia probably caused the ranking bump.

Thus, I think it's pretty safe to say that alexa ranks are now mostly useless.

It's also worth remembering that the toolbar passes every single detail of your browsing history to a capitalist company who have the right to then sell this information on to whoever they like.

AndrewF
Offline
Joined: 28-02-05
Nov 9 2006 13:48
Jack wrote:
JoeBlack2 wrote:
Also Alexa is not very accurate for ranks above 100,000.

Isn't that just according to the lunatic who runs wikipedia watch, tho?

No - I'm pretty sure last time we had this discussion I dug out the actual disclaimer alexa have that says this. But anyway it is based on what they say about their data and not what their critics say.

And yes of course the rank graph will have the most fluctuation - I chose it because this means it's the clearest example. But libcom's fluctuations are much greater than those for the others posted, which suggests that a very small number of toolbars are having a very big impact. (In fact you'll see that for the top ranked sites the level of fluctuation is very smoothed out, which is what you'd expect statistically.)

I'm not saying alexa has no value, I just think people tend to draw too big conclusions from it. It's a statistical guess and so is subject to all the normal problems such guesses have - and assuming non-random sampling is actually random is one of the biggest problems you can have - see the debate about Iraq death tolls and the lancet reports as an example.

Mike Harman
Offline
Joined: 7-02-06
Nov 9 2006 14:13
gurrier wrote:

Alexa only counts visitors who have the alexa toolbar installed. This is a tiny and decreasing percentage of internet users (decreasing due to firefox proliferation...

Only partially true. There are firefox plugins which send information to Alexa. That'd be a tiny percentage as well though of course.

Quote:
Also, comparing indymedia ireland traffic to its alexa rankings, there is just no correlation at all. The one big recent spike in rank corresponded with a promotion by politics.ie to get their users to use alexa - their spillover usage on indymedia probably caused the ranking bump.

Well, there are two events within the past year on libcom we can use to gauge it. The cpe blog (end of February into March), and the hack (between May-July since it happened a couple of times, but really we've only been running full steam again since mid-September).

I think if anything admin traffic went up around the hack periods so ought to have compensated, but those two periods are very clearly marked in the Alexa graphs and for comparison our forums stats, although the forums stats are for the past two years, not one year, so you have to squint a bit.

The spike in October also correlates with our stats, although obviously site design affects page views and I think that's gone up both positively in terms of stuff being interlinked more and negatively with some of the forum niggles (the lack of jump to last post button for example).

Quote:
It's also worth remembering that the toolbar passes every single detail of your browsing history to a capitalist company who have the right to then sell this information on to whoever they like.

It's quite possible to use the alexa site for fun and not have the toolbar installed. Have you stopped using google?

Quote:
Thus, I think it's pretty safe to say that alexa ranks are now mostly useless.

They're mostly useless if taken in isolation, but probably not that bad a sample compared to most polls or TV ratings, and interesting in themselves. Since most sites don't publish their web traffic openly, it's about the only way to see how various sites compare - although technorati does a ranking thing in a different way:

http://technorati.com/search/libcom.org

http://technorati.com/search/indymedia.org

I think John was more than justified in pointing it out as an alternative to comparing google search result rankings, which might improve traffic if you do well for certain popular terms, but definitely aren't a reflection of traffic at all.

Mike Harman
Offline
Joined: 7-02-06
Nov 9 2006 14:27
JoeBlack2 wrote:
But libcom's fluctuations are much greater than those for the others posted, which suggests that a very small number of toolbars are having a very big impact.

Hmm. We have some news articles get 80 views in three days, others get a couple of thousand within the same period. Something like 50% of our visits (not page views, independent visits - around 50k visits/month from google fwiw) are referrals from google, and those fluctuate by 10-20% each month up and down. So there's a very real fluctuation in traffic, and that's dependent on how google search rankings change, whether people are searching about certain subjects or not, whether certain sites link to stuff here or not. With indymedia.org, or mtv.co.uk or bbc.co.uk you're more likely to have a regular audience who dips in every so often via direct browsing(as I do with BBC news), and much higher numbers of casual visitors. So I think it's a combination of a higher statistical sample combined with the nature of traffic to sites with brand identities like that.

Quote:
I'm not saying alexa has no value, I just think people tend to draw too big conclusions from it. It's a statistical guess and so is subject to all the normal problems such guesses have - and assuming non-random sampling is actually random is one of the biggest problems you can have - see the debate about Iraq death tolls and the lancet reports as an example.

If you're up for it, it'd be interesting to compare page views/visits/google referrals/top search terms between a few different sites to see how they compare. If we could get 5-10 sites to agree to it, it might be a very worthwhile thing to try for about 3-6 months??

AndrewF
Offline
Joined: 28-02-05
Nov 9 2006 14:33
Mike Harman wrote:
If you're up for it, it'd be interesting to compare page views/visits/google referrals/top search terms between a few different sites to see how they compare. If we could get 5-10 sites to agree to it, it might be a very worthwhile thing to try for about 3-6 months??

Most if not all of the sites I'm involved with have a policy of not routinely collecting such statistics as a (limited) protection against court injunctions ordering them to hand over info on a particular poster. (This has happened a number of times in the USA, including to flag see http://flag.blackened.net/forums/viewtopic.php?t=72081 )

This is annoying as it was great when I had access to this info but on the other hand I can also see the argument for not being put in that position.

BTW I think Gurrier presents the problems with the Alexa stats better than I did. What percentage of your hits are to news rather than forum articles - I'd have presumed forums took a lot more hits (but maybe not unique visits?)

AndrewF
Offline
Joined: 28-02-05
Nov 9 2006 14:40
Jack wrote:
gurrier wrote:
It's also worth remembering that the toolbar passes every single detail of your browsing history to a capitalist company who have the right to then sell this information on to whoever they like.

What, so they can give money to libcom every time I buy something off Amazon? The bastards. wink

I think the point is that 'whoever they like' probably includes the equivalent of the Economic League (are they still going?), Homeland Security, journalists etc It's easy to be complacent about this until something like this happens to you - the person is a poster here and the article which was obviously aimed at getting her sacked included quotes from various internet posts.

Mike Harman
Offline
Joined: 7-02-06
Nov 9 2006 15:17
JoeBlack2 wrote:
Most if not all of the sites I'm involved with have a policy of not routinely collecting such statistics as a (limited) protection against court injunctions ordering them to hand over info on a particular poster. (This has happened a number of times in the USA, including to flag see http://flag.blackened.net/forums/viewtopic.php?t=72081 )

sad

Quote:
BTW I think Gurrier presents the problems with the Alexa stats better than I did. What percentage of your hits are to news rather than forum articles - I'd have presumed forums took a lot more hits (but maybe not unique visits?)

The majority of our page views are on the forums, but quite a small minority of the visits. Page views are dodgy though - since it takes 2-10 page views for me to click on a thread, read a comment, go to the reply page, read the next reply etc. whereas with a news article it's one click. That counts between sites as well I think, let alone sections of one site. Before we were using one system, I could probably have given figures, but now non-forums pages (/tracker) is used for the forums and vice versa so it's impossible to tell distribution accurately. I've done some trickery on google analytics though, and it ought to be possible for us to compare the main non-forums traffic - i.e. library, news, history with each other in a couple of months since those are a bit more clear cut.

The visits we get from people typing libcom.org into their browser are consistently less than 35-40%, possibly lower, and I'd imagine 80-90% or more of those are regular users, putting non-forum visits at around 60-80%. Absolute unique visitors (i.e. counting distinct ip addresses regardless of how many times someone visits the site) - this is some ridiculous number like 93-95% or higher of visits, but I don't really know how that's worked out.

Again Alexa is kind of useful for gauging this, with page views crashing when we were down/forums hacked, but reach more steady - since traffic from links and search engines would still count on alexa even if there wasn't anything to see.

gurrier
Offline
Joined: 30-01-04
Nov 9 2006 15:51
Quote:
Only partially true. There are firefox plugins which send information to Alexa. That'd be a tiny percentage as well though of course.

Actually, it's entirely true. The firefox plugins are much less common than the IE toolbar and therefore as firefox proliferates, alexa penetration decreases.

Quote:
It's quite possible to use the alexa site for fun and not have the toolbar installed. Have you stopped using google?

I was referring to the down sides of installing the toolbar, not browsing to the site. Using the google site does not give google access to anything other than your use of google. Using alexa toolbar gives them a complete record of your browsing history, which is far far more valuable and which is sold to all sorts of marketing people.

Quote:
I think if anything admin traffic went up around the hack periods so ought to have compensated, but those two periods are very clearly marked in the Alexa graphs and for comparison our forums stats, although the forums stats are for the past two years, not one year, so you have to squint a bit.

That doesn't really say too much though. If a single one of your regular non-admin users has the toolbar installed, then you'd expect any outage to show a marked downturn. If 3 or four people with the toolbar installed browse to an article, that'll give you a big bump. The problem is that it is impossible to distinguish between a situation where 3 regular users have decided to install the toolbar and where 3,000,000 users have browsed to the site and 3 of them have the toolbar installed.

For all those reasons and more (e.g. they are bad at mapping installed toolbars to individual users, they don't map multiply named domains together, etc, etc) I really don't think that it's useful as anything more than a tool which allows you to see if a site is:
a) really big (0-5,000)
b) reasonably big (5,000 - 200,000)
c) small (200,000 +)

I mean, here are the indymedia.ie actual traffic numbers:

Date           Hits     Bytes transferred
Oct 12, 2006   478183   8.38 GB
Oct 13, 2006   483278   9.10 GB
Oct 14, 2006   367837   8.13 GB
Oct 15, 2006   384620   8.98 GB
Oct 16, 2006   515486   9.43 GB
Oct 17, 2006   500309   9.24 GB
Oct 18, 2006   491701   9.60 GB
Oct 19, 2006   456729   8.77 GB
Oct 20, 2006   447459   8.29 GB
Oct 21, 2006   360629   8.77 GB
Oct 22, 2006   404585   9.25 GB
Oct 23, 2006   519915   10.43 GB
Oct 24, 2006   506750   9.72 GB
Oct 25, 2006   480349   8.78 GB
Oct 26, 2006   473322   7.12 GB
Oct 27, 2006   431756   8.38 GB
Oct 28, 2006   367899   7.65 GB
Oct 29, 2006   407757   8.63 GB
Oct 30, 2006   435003   8.11 GB
Oct 31, 2006   470603   8.66 GB
Nov 1, 2006    458541   8.87 GB
Nov 2, 2006    446691   8.47 GB
Nov 3, 2006    431116   9.00 GB
Nov 4, 2006    354044   8.09 GB
Nov 5, 2006    385437   8.38 GB
Nov 6, 2006    503229   9.47 GB
Nov 7, 2006    490691   9.19 GB
Nov 8, 2006    481152   8.78 GB

And here is the corresponding alexa chart:

As you can see, there is no correlation between the two. The traffic was pretty much flat throughout the month, the graph shows an enormous bump in the middle and the rank goes from about 40,000 to about 150,000 during a period where there is no change in traffic. If this can happen to a site which gets about a half million hits a day (i.e. the sample size is significant), then sites which get less traffic are going to experience even wilder swings.
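
For anyone who wants to check how flat the hits figures really are, here is a minimal sketch that summarises the daily numbers from the table above. The helper names are made up for illustration, and the weekday/weekend split is only one plausible way to look at the data; the dips appear to line up with weekends, which the script lets you check for yourself.

```python
from datetime import date, timedelta
from statistics import mean

# Daily hits for indymedia.ie, copied from the table above
# (Oct 12, 2006 through Nov 8, 2006, in order).
HITS = [
    478183, 483278, 367837, 384620, 515486, 500309, 491701,
    456729, 447459, 360629, 404585, 519915, 506750, 480349,
    473322, 431756, 367899, 407757, 435003, 470603, 458541,
    446691, 431116, 354044, 385437, 503229, 490691, 481152,
]
START = date(2006, 10, 12)


def split_by_weekend(hits, start):
    """Return (weekday_hits, weekend_hits) for consecutive days from start."""
    weekday_hits, weekend_hits = [], []
    for offset, value in enumerate(hits):
        day = start + timedelta(days=offset)
        # date.weekday() returns 5 for Saturday and 6 for Sunday.
        if day.weekday() >= 5:
            weekend_hits.append(value)
        else:
            weekday_hits.append(value)
    return weekday_hits, weekend_hits


if __name__ == "__main__":
    weekday_hits, weekend_hits = split_by_weekend(HITS, START)
    print("lowest day:   ", min(HITS))
    print("highest day:  ", max(HITS))
    print("weekday mean: ", round(mean(weekday_hits)))
    print("weekend mean: ", round(mean(weekend_hits)))
```

Hits are a crude measure, so this only says something about the shape of the server-side numbers, not about visitors.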

Steven.
Offline
Joined: 27-06-06
Nov 9 2006 17:09
gurrier wrote:
As you can see, there is no correlation between the two.

Yes but that's the rank chart. They're not meant to correlate. The rank is the rank of your site compared to the rest of the internet. More interesting would be a graph of your view rate compared with the alexa one. Actually we should try to do that with ours.

Quote:
If this can happen to a site which gets about a half million hits a day (i.e. the sample size is significant),

I'm sure you know that "hits" are pretty meaningless, as that's the number of files downloaded from your site. As a file can be any tiny little thing, like this, say: [image: libcom arrow for bullet points], you can easily pick up hundreds of "hits" with only one page view.

gurrier
Offline
Joined: 30-01-04
Nov 9 2006 17:24
Quote:
Yes but that's the rank chart. They're not meant to correlate. The rank is the rank of your site compared to the rest of the internet. More interesting would be a graph of your view rate compared with the alexa one. Actually we should try to do that with ours.

Yes, but the levels of traffic on the internet each day are roughly the same (a bit of weekly and seasonal variation on top of an underlying constant rate of increase). We also have to assume that the distribution of the traffic is more or less similar on most days - although the particular sites will vary in their traffic, the distribution of traffic across all sites should retain a pretty constant pattern. Therefore, a site that gets a constant amount of traffic should retain more or less a constant rank - it might vary a bit as everything can vary, but unless the general traffic distribution changes wildly, or the traffic levels suddenly increase or decrease across the board (neither of which happens in practice), it should remain at or around the same level. As you can see from the image below, alexa's page views have pretty much the same shape as the rank.

Quote:
I'm sure you know that "hits" are pretty meaningless, as that's the number of files downloaded from your site. As a file can be any tiny little thing, like this, say: [image: libcom arrow for bullet points], you can easily pick up hundreds of "hits" with only one page view.

Yup, I know, I wuz keeping it simple. In fact due to caching and other optimisations, you never really get more than about 1:20 in terms of page impressions to hits (at least I've never seen that). On indymedia.ie the ratio is about 1:2.5 (which means about 200,000 page impressions per day). As I said above, that's a significant sample size and any process that comes up with such an oscillating score with such a sample size is going to need a really huge sample before it approaches anything like accuracy. Maybe 100 times more traffic would smooth it out - but 20 million page impressions per day is into the top few hundred sites in the world.

Tacks
Offline
Joined: 8-11-05
Nov 9 2006 17:35
gurrier wrote:
It's also worth remembering that the toolbar passes every single detail of your browsing history to a capitalist company who have the right to then sell this information on to whoever they like.

what toolbar? Do i have one? Does it make any difference if u delete ur private data in the Tools menu beforehand?

Tacks
Offline
Joined: 8-11-05
Nov 9 2006 18:20

But i always paid you in cash grin

Mike Harman
Offline
Joined: 7-02-06
Nov 11 2006 12:23
gurrier wrote:
The problem is that it is impossible to distinguish between a situation where 3 regular users have decided to install the toolbar and where 3,000,000 users have browsed to the site and 3 of them have the toolbar installed.

Except it tracks reach (different toolbars) and page views. I reckon one or two toolbars with regular users would have a smallish effect on reach.

Quote:

I mean, here are the indymedia.ie actual traffic numbers:

Oct 23, 2006 519915 10.43 GB

And here is the corresponding alexa chart:

The day with the biggest traffic (10gb) corresponds with the spike.

Hits and bandwidth really don't tell you anything at all though. Even page views, total visits and unique visitors only give you an idea. If a video was posted on the 23rd that could account for the bandwidth spike, if a post with loads of photos was posted that accounts for both extra hits per page (one page might generate 30 or 40 hits easily, if not more) and extra bandwidth.

Mike Harman
Offline
Joined: 7-02-06
Nov 12 2006 00:09

OK I know I'm a sad git, but this thread on Bristol Indymedia competing with the Evening Post is now competing with the Bristol Indymedia article on Bristol Indymedia competing with the Evening Post.

http://www.google.co.uk/search?q=bristol+indymedia+evening+post&start=0&...

AndrewF
Offline
Joined: 28-02-05
Nov 12 2006 11:44
Mike Harman wrote:
If a video was posted on the 23rd that could account for the bandwidth spike, if a post with loads of photos was posted that accounts for both extra hits per page (one page might generate 30 or 40 hits easily, if not more) and extra bandwidth.

Indymedia.ie don't host video so that explanation is out.

They have a good archive system so you can actually get an exact listing of posts for that date. There aren't very many and they are not very interesting unless you happen to be a Sinn Fein member (they had held a health care rally the previous day). One post does have 15 pictures, but that would not be an unusual amount.

The obvious explanation for the huge Alexa spike in comparison with a pretty indifferent peak in actual activity is that a couple of Sinn Fein members with the Alexa toolbar were browsing those articles to see if they were in any of the photos! Or someone from the PR section of one of the rival parties.

Basically the comparative stats gurrier posts show that even for a site with massive traffic and a wide readership the correlation between the Alexa stats and the real ones is so weak as to be almost insignificant. I was cynical about Alexa but I'd have expected a much closer correlation for indymedia.ie because it does have traffic volume.

I'd even suggest that in SEO/PR terms it may well be common to install the Alexa toolbar so that you can then dazzle your clients with the apparent popularity of their site via ones they will reckon to be popular.

Mike Harman
Offline
Joined: 7-02-06
Nov 12 2006 15:01
Quote:
Yes, but the levels of traffic on the internet each day are roughly the same (a bit of weekly and seasonal variation on top of an underlying constant rate of increase). We also have to assume that the distribution of the traffic is more or less similar on most days - although the particular sites will vary in their traffic, the distribution of traffic across all sites should retain a pretty constant pattern. Therefore, a site that gets a constant amount of traffic should retain more or less a constant rank - it might vary a bit as everything can vary, but unless the general traffic distribution changes wildly, or the traffic levels suddenly increase or decrease across the board (neither of which happens in practice), it should remain at or around the same level.

That doesn't quite cut it though. I'd imagine vast numbers of sites, from the smallest to the biggest, have very, very similar levels of traffic. So if the top 20,000-100,000 sites on Alexa have traffic between 200,000 and 2 million page views per month (for argument's sake), a bit of daily variation up or down would easily show massive jumps in rank from a smallish change in traffic.
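
A minimal sketch of that argument, using only the made-up "for argument's sake" figures above. It assumes rank follows a simple power law in traffic fitted through those two points; the power-law shape, the constants and the function name are all assumptions for illustration, not anything Alexa publishes.

```python
import math

# Hypothetical anchor points from the "for argument's sake" figures:
# rank 20,000 ~ 2,000,000 page views/month,
# rank 100,000 ~ 200,000 page views/month.
HIGH_RANK, HIGH_TRAFFIC = 20_000, 2_000_000
LOW_RANK, LOW_TRAFFIC = 100_000, 200_000

# Exponent of the assumed power law rank = k * traffic ** -ALPHA,
# chosen so the curve passes through both anchor points.
ALPHA = math.log(LOW_RANK / HIGH_RANK) / math.log(HIGH_TRAFFIC / LOW_TRAFFIC)


def estimated_rank(page_views_per_month: float) -> float:
    """Estimated rank under the assumed power law (lower is better)."""
    if page_views_per_month <= 0:
        raise ValueError("page_views_per_month must be positive")
    return HIGH_RANK * (HIGH_TRAFFIC / page_views_per_month) ** ALPHA


if __name__ == "__main__":
    before = estimated_rank(200_000)
    after = estimated_rank(220_000)  # a 10% rise in traffic
    print(f"rank before: {before:,.0f}")
    print(f"rank after:  {after:,.0f}")
    print(f"places gained: {before - after:,.0f}")
```

Under these assumptions a 10% change in traffic moves a site several thousand places, simply because so many sites sit close together in that part of the tail.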

Quote:
Indymedia.ie don't host video so that explanation is out.

They have a good archive system so you can actually get an exact listing of posts for that date. There aren't very many and they are not very interesting unless you happen to be a Sinn Fein member (they had held a health care rally the previous day). One post does have 15 pictures, but that would not be an unusual amount.

Nonono, I was pointing out "hits" as being an incredibly limited way of determining traffic, particularly when trying to compare with other sites. Visits, page views, and now length of visit are the most accurate afaik, hits have been pretty much given up on apart from people who like big numbers. Using them as an "accurate" measure of traffic compared to Alexa isn't therefore a particularly useful plan.

Quote:
Most if not all of the sites I'm involved with have a policy of not routinely collecting such statistics as a (limited) protection against court injunctions ordering them to hand over info on a particular poster.

So you don't keep Apache logs at all then? Or you simply don't collect statistics? If you've got apache logs then that's a useless policy since all the raw data interpreted into statistics is there.

AndrewF
Offline
Joined: 28-02-05
Nov 12 2006 15:31
Mike Harman wrote:
So you don't keep Apache logs at all then? Or you simply don't collect statistics? If you've got apache logs then that's a useless policy since all the raw data interpreted into statistics is there.

AFAIK no logs at all are collected on a routine basis.

Mike Harman
Offline
Joined: 7-02-06
Nov 13 2006 13:21

fwiw, here's a new post from the Alexa blog detailing how sites routinely jump from 200,000 to 100,000 without much traffic changes in the rankings, pretty much what I said on here before I read it:

http://awis.blogspot.com/2006/11/traffic-on-long-long-tail.html

gurrier
Offline
Joined: 30-01-04
Nov 13 2006 13:38
Mike Harman wrote:
fwiw, here's a new post from the Alexa blog detailing how sites routinely jump from 200,000 to 100,000 without much traffic changes in the rankings, pretty much what I said on here before I read it:

http://awis.blogspot.com/2006/11/traffic-on-long-long-tail.html

You're still missing my point though.

The fact that a website can jump from position 150,000 to position 20,000 without any change in traffic shows that the sampling is inadequate to provide any real indication of traffic levels.

At its height Alexa was claiming an installed base of 5 million; due to increased awareness of spyware, this has now dropped off a lot (they no longer report the figure). Since a fairly small percentage of installed toolbars will be used every day, that means that the daily sample size is probably no more than a few hundred thousand at best. That means that each different user will add something like 5 points to the daily reach and since a reach of 40 per million is going to get you into the top 10,000, from the outer reaches, the whole thing is way too open to sampling biases to mean anything outside the top few thousand.
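
The arithmetic behind that, as a minimal sketch. The daily sample of 200,000 active toolbars is an assumption (it is the figure that makes "5 points per user" come out, and sits within "a few hundred thousand"); the function names are made up for illustration.

```python
import math

# Assumed number of toolbars active on a given day (not a published figure).
ASSUMED_DAILY_SAMPLE = 200_000


def reach_per_million(toolbar_users_seen: int, daily_sample_size: int) -> float:
    """Reach per million users implied by the toolbar users seen on a site."""
    return toolbar_users_seen * 1_000_000 / daily_sample_size


def users_needed_for_reach(target_reach: float, daily_sample_size: int) -> int:
    """Smallest number of toolbar users that yields at least target_reach."""
    return math.ceil(target_reach * daily_sample_size / 1_000_000)


if __name__ == "__main__":
    print(reach_per_million(1, ASSUMED_DAILY_SAMPLE))        # one extra user
    print(users_needed_for_reach(40, ASSUMED_DAILY_SAMPLE))  # reach of 40 per million
```

So on these assumptions a reach of 40 per million corresponds to fewer than ten toolbar users, which is why a couple of regulars installing it can move a site so far.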

Mike Harman
Offline
Joined: 7-02-06
Nov 13 2006 13:46

I've not said it's highly accurate though - all I've said on this thread is it's the only way to compare traffic with other sites on the internet, and it can be used alongside other stats programmes. Plus the massive jumps in our rankings do generally have some correlation with traffic.

Now you may not give a shit about traffic comparisons, and I'm fully prepared to admit it's 1. a bit geeky 2. a bit silly 3. not all that useful, but you care enough to post on this thread so it's obviously something you've looked into yourself. You've not actually posted up any alternative ways to look into stuff like that though, and my one concrete suggestion (other than technorati, which is useful in its own right, but the rankings really don't mean much with that since it's weighted so heavily towards tech blogs) looks like it won't go anywhere if so many sites are deleting/not collecting their apache logs.

gurrier
Offline
Joined: 30-01-04
Nov 13 2006 14:38
Mike Harman wrote:
I've not said it's highly accurate though - all I've said on this thread is it's the only way to compare traffic with other sites on the internet, and it can be used alongside other stats programmes. Plus the massive jumps in our rankings do generally have some correlation with traffic.

Now you may not give a shit about traffic comparisons, and I'm fully prepared to admit it's 1. a bit geeky 2. a bit silly 3. not all that useful, but you care enough to post on this thread so it's obviously something you've looked into yourself. You've not actually posted up any alternative ways to look into stuff like that though, and my one concrete suggestion (other than technorati, which is useful in its own right, but the rankings really don't mean much with that since it's weighted so heavily towards tech blogs) looks like it won't go anywhere if so many sites are deleting/not collecting their apache logs.

The best way to compare sites is to compare their stats for page impressions, visits and unique IPs. If you don't have that information available, google page rank data is probably about the best metric to use. Since a significant percentage of most sites' traffic comes from google referrals, the page rank correlates reasonably well with traffic (you can finesse it by doing some weighting depending on the type of site).

My point is just that the sample size of alexa has fallen to such a point where it's probably only statistically significant in the top 5,000 or so.

Mike Harman
Offline
Joined: 7-02-06
Nov 13 2006 14:44
Quote:
The best way to compare sites is to compare their stats for page impressions, visits and unique IPs.

Like I said, I'd be very interested in doing this (along with keyword referrals) with a few relatively like-minded sites, sounds like it's not possible though...

gurrier
Offline
Joined: 30-01-04
Nov 13 2006 14:59
Mike Harman wrote:
Quote:
The best way to compare sites is to compare their stats for page impressions, visits and unique IPs.

Like I said, I'd be very interested in doing this (along with keyword referrals) with a few relatively like-minded sites, sounds like it's not possible though...

Oh it is. Although indymedia.ie and the various other oscailt sites I help to run (anarkismo, wsm) don't normally collect apache logs, we know how many page impressions / visits / unique IPs we get with fairly good accuracy.

The stats for hits and bytes above are collected by the ISP (they're what they bill us on). Every so often I turn on apache logs for a few days and collect a sample of traffic to the various sites. From these I can work out the ratios of hits/bytes to sites, hits/bytes to visits and hits/bytes to page impressions. The hits to page impressions ratio for all oscailt sites is pretty much similar - it runs from about 2 to about 3 with very little variation on a particular site (on indymedia it's between 2.4 and 2.7). From this data I can work out a formal probability distribution for page impressions, visits and so on. A cheap and easy way to do it though is just the following:
pages = hits / 2.5
unique IPs = hits / 30
Which gives you more or less the centre of the probability distribution for each metric.

Unique IPs is the only one that shows much variation in ratio, as big stories will attract a lot of one-off visitors, but that just means that the probability curve drops off less steeply for that metric.
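
The cheap and easy version above, written out as a minimal sketch. The two ratios are the ones quoted in this post for the oscailt sites; the function name is made up for illustration, and the ratios would need re-measuring for any other site.

```python
# Ratios quoted above for the oscailt sites (site-specific, not universal).
HITS_PER_PAGE_IMPRESSION = 2.5
HITS_PER_UNIQUE_IP = 30


def estimate_from_hits(hits: float) -> tuple[float, float]:
    """Return (page_impressions, unique_ips) estimated from raw hits.

    These are rough central estimates only; the unique IP ratio in
    particular varies when big stories bring in one-off visitors.
    """
    if hits < 0:
        raise ValueError("hits must not be negative")
    pages = hits / HITS_PER_PAGE_IMPRESSION
    unique_ips = hits / HITS_PER_UNIQUE_IP
    return pages, unique_ips


if __name__ == "__main__":
    pages, unique_ips = estimate_from_hits(500_000)
    print(f"page impressions: {pages:,.0f}")
    print(f"unique IPs:       {unique_ips:,.0f}")
```

For a day with about half a million hits this gives about 200,000 page impressions, matching the figure mentioned earlier in the thread.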