Wikipedia:Modelling Wikipedia's growth

This page analyzes the article count data and attempts to fit a simple numerical model of past and future growth to the observed article count and growth-rate data.

[Figure: EnwikipediaArt.PNG]

Here, several models are presented to attempt to explain the observed general trends in article growth.

[Figure: EnglishWikipediaArticleCountGraphs.png]

Article growth was approximately 100% per year from 2003 through most of 2006, but has tailed off since roughly September 2006. Since then, the trend has been closer to linear than exponential.

Notes

A few notes on features of the graph:

The graph of article count on the right is plotted on a logarithmic scale, so exponential growth appears as a straight line. Between October 2002 and July 2006, the data fit the dotted line very well, while from July 2006 onwards there is a noticeable fall-off from linear behaviour. Before October 2002, the behaviour is more complex.

The graph on the right below is a close-up of the data points that follow a linear trend; the best-fit line in red was computed using linear regression. From the slope of this best-fit line, the characteristic time (time constant) of the exponential growth can be found.
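The fitting step described above can be sketched in a few lines. This is a minimal illustration, not the original analysis: the data are synthetic (generated to double once a year), and the names `fit_exponential`, `tau` and `doubling_time` are introduced here for the example. A straight line is fitted to ln(N) against time. The reciprocal of its slope is the time constant of the growth, and multiplying that by ln 2 gives the doubling time.

```python
import math

def fit_exponential(t_years, counts):
    """Fit N(t) = N0 * exp(t / tau) by least squares on ln(N)."""
    y = [math.log(c) for c in counts]
    n = len(t_years)
    t_mean = sum(t_years) / n
    y_mean = sum(y) / n
    # Ordinary least-squares slope and intercept of ln(N) against t.
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in zip(t_years, y))
    sxx = sum((t - t_mean) ** 2 for t in t_years)
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    tau = 1.0 / slope                  # e-folding time, in years
    doubling_time = tau * math.log(2)  # time for the count to double
    return math.exp(intercept), tau, doubling_time

# Synthetic data only: a count that doubles every year, starting at 100,000.
t = [m / 12 for m in range(43)]          # monthly samples over 3.5 years
counts = [100_000 * 2 ** x for x in t]
n0, tau, doubling_time = fit_exponential(t, counts)
print(f"N0 = {n0:.0f}, tau = {tau:.3f} yr, doubling time = {doubling_time:.3f} yr")
```

On this synthetic series the fit recovers a doubling time of one year, which corresponds to the roughly 100% annual growth described earlier. Applied to real article counts, the same function would be fed the dates and counts from the linear part of the log-scale graph.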

By the end of 2006, when there were 1.5 million articles, the projection was already overestimating the growth by 10-15%, and its prediction of over 3 million articles by the end of 2007 was well above the actual figure of about 2.1 million articles.
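The size of the end-of-2007 miss can be checked directly from the figures quoted above. The snippet below is only an arithmetic illustration. It takes 3.0 million as a lower bound for the projection (the text says "over 3 million"), and the function name `relative_overestimate` is introduced here for the example.

```python
def relative_overestimate(projected, actual):
    """Return how far a projection exceeds the actual value, as a fraction of the actual."""
    return (projected - actual) / actual

projected_end_2007 = 3.0e6   # lower bound: the text says "over 3 million"
actual_end_2007 = 2.1e6      # "about 2.1 million" from the text

overshoot = relative_overestimate(projected_end_2007, actual_end_2007)
print(f"Projection exceeds actual by at least {overshoot:.0%}")
```

With these inputs the exponential projection exceeds the actual count by roughly 43%, a much larger gap than the 10-15% seen a year earlier.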

Questions:

Note that since the beginning of December 2005, only registered users have been able to create new pages.


In 2009, continued strong growth indicated that there was no obvious nearby midpoint in the growth of new articles. Although growth was slowing, it was slowing more gradually, and could be expected to continue for more than another 15 years, producing up to 10 million articles. The 3-million-article mark was predicted to arrive much earlier, in mid-August 2009. The growth was supported by the need for various spin-off articles, such as unseen-hand and lost-world articles, by millions of missing red-link articles, and by the many thousands of new disambiguation pages needed to connect the other millions of pages. The new projected midpoint might occur in 2011, although a mass automated upload of articles could change the schedule, for example an effort to auto-generate red-link stubs with sources suggested from search-engine results. The continued strong growth fits a model that reaches about 10 million articles before deletions and merges offset the addition of new articles.
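The idea of a growth "midpoint" and a ceiling of about 10 million articles can be pictured with a logistic (S-shaped) curve. The sketch below is one possible reading, not the model actually used for the projection. The logistic form itself is an assumption. The 10 million ceiling and the 2011 midpoint come from the paragraph above, and the `rate` value is purely hypothetical, chosen only so that the example runs.

```python
import math

def logistic_articles(year, ceiling=10e6, midpoint=2011.0, rate=0.25):
    """Article count under an assumed logistic curve.

    ceiling  -- eventual article count (about 10 million in the text)
    midpoint -- year at which half the ceiling is reached (2011 in the text)
    rate     -- steepness per year; hypothetical, not taken from the source
    """
    return ceiling / (1.0 + math.exp(-rate * (year - midpoint)))

for year in (2006, 2011, 2016, 2026):
    print(year, round(logistic_articles(year)))
```

In this form the curve sits at exactly half the ceiling at the midpoint, where growth is fastest. After that, yearly additions shrink as the count approaches the ceiling. A different `rate` changes how quickly that happens but not the midpoint or the ceiling.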

