The Dissemination of Scholarly Information: Journals, Open-Access and Distributed Filtering

Current methods of disseminating scholarly information focus on the use of journals that retain exclusive rights in the material they publish. Recently there has been increasing dissatisfaction with this model, along with suggestions for alternative approaches such as “Open Access”.

Together with a colleague (Omar Al-Ubaydli) I’ve been working to explore the reasons for the development of the traditional journal model, why it is no longer efficient and how it could be improved upon. We’re particularly interested in going beyond the basic question of distribution (access) to that of filtering, i.e. the process of matching information with the scholars who want it.

With the volume of information production ever growing – and attention ever more scarce – filtering is becoming crucial. Digital technology offers us some radically new possibilities. In particular, distribution and filtering can be separated, which in turn allows filtering to be decentralized and distributed – a model which promises dramatic increases in transparency, innovation and efficiency.

Below is an overview of our analysis with the full version of the current paper here: http://rufuspollock.org/economics/papers/scholars_and_journals.pdf

Overview

It is crucial to the progress of any domain of scholarship that those engaged therein are able to communicate their discoveries and activities to others. As such, a variety of systems and institutions have been developed to support ‘scholarly communication’ in one form or another, ranging from personal letters to physical meetings.

In recent times, the growth of scholarship, combined with its increasing geographical dispersion, has resulted in the centrality of the written word and its dissemination via ‘journals’. In this paper we set out the purposes of any system of scholarly communication and assess the current academic journal system in light of them. This examination highlights several deficiencies and also suggests various possible improvements.

When thinking about the possible mechanisms of scholarly communication it is useful to specify in more detail the criteria against which they should be measured. That is, to put it more succinctly, what do we want a good mechanism for scholarly communication to do? In particular, when we say ‘communicate’ we must ask ourselves what, to whom, in what form, and so on.

For it is clear that when we talk of communication we usually mean more than the simple transmission of a piece of information. In fact, today, with so much scholarship available, the challenge may often not lie in the transmission from the author to the reader but in the matching of authors and readers – the decision of ‘what to read’.

This growing focus on choice is a natural one in a world where time and attention are limited and the amount of scholarship available is ever increasing. As such it suggests that there are at least two distinct functions performed by a system of scholarly communication:

Distribution – getting information from authors to readers (and back again)
Selection (filtering) – deciding what to distribute and to whom

In appreciating this distinction it is illuminating to consider how practice has changed over time. Originally communication between scholars, at least in written form, primarily took the form of letters between the individuals involved. As such, the two activities of distribution and filtering would be almost completely identical.

Then, as the number of authors and readers grew, this became infeasible and dedicated journals were created, which disseminated to their particular readers a selection of what was submitted to them. Thus, what was once a direct peer-to-peer relationship became mediated by a new institutional form: the academic journal – though of course journals were often run by the very readers and authors who used them.

Finally, today, thanks to digitization and the Internet, peer-to-peer is once again a possibility, though with important differences: unlike in the past, where a letter writer chose the recipient, the modern peer-to-peer approach more closely resembles journals in that the author and reader act independently – the author uploads or publishes his/her work to a repository entirely separately from the reader finding, downloading and reading it. This last discussion suggests breaking down our original two categories a little further:

‘Making available’ – publishing material
Discovery – finding out what is available
Choice – choosing from what is available
Reading – getting access to the material (in the form required)

Here, the first and fourth items would come under the ‘distribution’ heading while the second and third would come under ‘selection’. In addition we should mention two other functions performed by such a system, both of which relate to selection: a) improvement of work via peer-review (distinct from the filtering process itself); b) ‘quality signalling’, whereby the selection of work helps signal the quality of its creators, which in turn is important for the purpose of resource allocation (jobs, grants etc) within the scholarly community.

With these added to the list we now have a good number of separate goals which a scholarly communication mechanism may seek to satisfy. The next stage is to consider how the current system, largely based on academic journals, fares in respect of them.

Goals, Instruments and the Current Journal System

It is well known that in order to fully address a given number of (independent) goals one needs an equal number of instruments. For example, if one is seeking to address both congestion and pollution in relation to road traffic, a single instrument, such as petrol taxes, will be insufficient.

Here too there are multiple independent goals, most notably distribution and selection (matching). These are clearly distinct goals and require distinct instruments for their achievement, but journals are a single instrument which combines distribution and filtering in one mechanism.

Originally, the restrictions of reproduction and distribution technologies meant that journals were the best instrument available. Today, with the advent of the computer and the Internet, this is no longer true: distribution (uploading and downloading) can be done by almost anyone and quite separately from the recommendation and rating of that material.

As such, the traditional journal system is becoming a serious constraint, particularly in its closed access form. There are two distinct aspects of this constraint.

First, on the distribution side, journals delay and restrict access as a result of higher prices arising either from simple monopoly control or the costs of the (inefficient) selection mechanism the traditional model necessitates.

Second, on the selection side, the forced combination of selection and distribution and the associated monopoly control of content greatly limit the efficiency (and utility) of the selection and filtering processes used to match authors and readers together.

Unfortunately, the two-sided nature of the journal market (based on expectations), combined with the current evaluation structure of academia, continues to lock society into this inefficient restriction.

Open-access journals are an important part of improving the current situation. However, as we discuss below, they are only a first step: in order to reap the full benefits of new technology we must move away from the traditional ‘journal’ model to a system that allows for full separation between the distribution and selection operations.

The Technological Origins of Modern Inefficiency

At this point it is worth considering in a little more detail why restricted-access journals originally came about. The answer lies in the nature of the technology available in earlier periods to manage distribution (printing and transmission).

When many journals were originally started the cost of transmitting information was very high and journals acted as a club good by which the costs of reproduction and distribution could be (efficiently) shared (the efficiency arising here from economies of scale).

At the same time, given the limited ‘bandwidth’ it was natural for journals to take on some filtering role in order to economize on the scarce distribution capacity. In this situation, dissemination is limited and with only one instrument available (journals), it is natural to tie dissemination and filtering together (with filtering in many ways secondary).

Once filtering is being done it is natural for journals to ‘tie’ material to the journal explicitly via copyright – though at an early stage given the scale economies of journals this explicit tying was not actually necessary and was probably done for simple legal convenience.

With the advent of digital communications, in particular the Internet, bandwidth is no longer scarce. What is now scarce is attention. In this setup the importance of a journal is not its role in efficiently sharing reproduction and distribution costs but its role as a filtering mechanism.

However, there is now a problem: when distribution is central it is natural to ‘add in’ filtering, but it is not natural, or necessary, to tie distribution to filtering when filtering is central. In fact it seems clear that distribution and filtering can be done entirely separately (there are potentially lots of ways for you to download my paper quite separate from getting it from a journal – and lots of ways to do matching and filtering other than by journal editors and reviewers).

The Open Access movement can be seen as largely about achieving this separation: with open access there is no longer a connection between access/distribution (which would be free) and the filtering mechanism (the choice of which articles go in a particular journal).

That said, the ‘Open Access’ movement still has a large focus on journals – albeit open-access ones. This, in our view, is a mistake.

Technology has also affected possibilities for filtering. In particular it is no longer clear why the centralized mechanism of official peer-review and journals is superior to alternative decentralized options. The last decade has witnessed widespread, and often successful, experimentation with distributed voting and evaluation mechanisms (for example Slashdot’s story-ratings and Google’s link-based site rankings).

Thus, to be more radical, it makes sense not only to remove centralized control of distribution but also centralized control of filtering.

A more distributed (market-like?) filtering mechanism would permit the same freedom (and same status) for reviewing and recommendation as already exists in the production of scholarly information. At the same time it would deliver greater transparency and, by permitting ‘free-entry’ in filtering, would permit greater specialization, greater diversity, increased participation and the increased efficiency flowing from greater competition.

As such, the gains from going ‘open’ are not simply wider access, but a reduction in the time and energy scholars spend finding and processing research information. Notably, this second gain, which is less frequently mentioned in discussions of ‘Open Access’, may well be the more significant.

Distributed Filtering: A Proposal

Here we give a concrete proposal as to how a distributed filtering mechanism for scholarly information would function:

All papers are uploaded to ‘open’ repositories from which free access is allowed. Each paper receives an identifier.
All members of the community (or anyone in fact) are allocated identifiers by the Reviewing/Filtering network.
There exist Reviewing/Filtering servers where people can log on and make a ‘review’ of a paper. A review could consist of a single vote or a proper detailed critique.
Review weighting and ranking. Given review weighting (and perhaps author ranking) one can produce a ‘quality’ value for each paper. The weighting system would be one of the most obvious places for innovation and competition to deliver efficiency improvements. We discuss this in more detail below but do mention one important aspect:
Ranking and Reviewing/Filtering by groups (NotAJournals). One major innovation permitted is that one need not have one single ranking/weighting algorithm. Instead one could allow groups to form which supplied their own weights and ranking (most obviously they could have a weighting in which only reviews by the group mattered – this corresponds to a traditional journal).

Reviews

At its simplest a review is a ‘vote’ (this could either be discrete, e.g. one of 1,2,3,4,5, or it could be continuous, e.g. any number in [1,5]).

However, it can also include full commentary in the traditional manner. The more detailed this commentary, the more valuable the review. Reviews could also be non-anonymous. Non-anonymous reviews could be given higher weight than anonymous ones.

Ranking and Review Weighting

As discussed already, there could be many different ranking and weighting functions. However, it is still worth considering some generally attractive properties. For example, weighting should probably be higher for non-anonymous reviews, though this weighting would interact with the valence of the review, with negative non-anonymous reviews getting a higher weighting than positive non-anonymous reviews. The weighting would also be higher for detailed reviews than for non-detailed reviews.
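
As a rough illustration of these properties, here is a minimal Python sketch of one possible review-weighting rule. Everything specific in it is an assumption made for the example rather than part of the proposal: the `Review` record, the multiplier values, the neutral rating of 3, and above all the use of a word count as a crude stand-in for ‘detailed’.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    rating: float      # vote in [1, 5]
    anonymous: bool    # True if the reviewer did not sign the review
    comment: str = ""  # optional free-text commentary


# Hypothetical constants, chosen only to make the example concrete.
NEUTRAL_RATING = 3.0         # ratings below this count as "negative"
SIGNED_BONUS = 1.5           # multiplier for a signed, non-negative review
SIGNED_NEGATIVE_BONUS = 2.0  # multiplier for a signed, negative review
DETAILED_BONUS = 1.5         # multiplier for a detailed review
DETAILED_MIN_WORDS = 50      # crude proxy for "detailed"


def review_weight(review: Review) -> float:
    """Return the weight given to one review under the rule sketched above."""
    weight = 1.0
    if not review.anonymous:
        # Signed reviews count for more, and signed negative ones most of all.
        negative = review.rating < NEUTRAL_RATING
        weight *= SIGNED_NEGATIVE_BONUS if negative else SIGNED_BONUS
    if len(review.comment.split()) >= DETAILED_MIN_WORDS:
        weight *= DETAILED_BONUS
    return weight
```

A competing weighting scheme would simply replace `review_weight` with a different function over the same review attributes.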

Of course this may be difficult to implement in an automated way. To remedy this it might make sense to take the whole reviewing mechanism a step back and ‘review’ reviews. That is, just as scholars can vote on articles, they can vote on reviews of articles. These reviews of reviews would then be used to generate (or alter) the weighting system for reviewers and hence for reviews. Users would not be confined to using the default weighting system but could design their own. This issue is considered in the next section.
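
One way such ‘reviews of reviews’ might feed into reviewer weights is sketched below in Python. The rule used (a reviewer's weight is the mean vote received across their reviews, divided by a neutral vote of 3) is purely an assumption for illustration, as are the function name and the input shapes; many other rules would fit the description equally well.

```python
from collections import defaultdict
from statistics import mean


def reviewer_weights(review_authors, meta_votes, neutral=3.0, default=1.0):
    """Derive a weight per reviewer from votes cast on their reviews.

    review_authors: dict mapping review id -> id of the reviewer who wrote it
    meta_votes:     iterable of (review id, vote) pairs, votes in [1, 5]
    Reviewers whose reviews have received no votes get the default weight.
    """
    votes_by_reviewer = defaultdict(list)
    for review_id, vote in meta_votes:
        votes_by_reviewer[review_authors[review_id]].append(vote)

    weights = {}
    for reviewer in set(review_authors.values()):
        votes = votes_by_reviewer.get(reviewer)
        # A mean vote equal to the neutral value yields a weight of 1.
        weights[reviewer] = mean(votes) / neutral if votes else default
    return weights


# Example with made-up identifiers.
authors = {"r1": "alice", "r2": "alice", "r3": "bob", "r4": "carol"}
votes = [("r1", 5), ("r2", 4), ("r3", 1), ("r3", 2)]
example_weights = reviewer_weights(authors, votes)
```

Here the well-regarded reviewer ends up with a weight above 1, the poorly regarded one below 1, and the reviewer with no votes keeps the default.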

NotAJournals

A given user could decide that they only care about non-anonymous reviews provided by a particular group of people – this would correspond to a weighting of zero for the rest of the reviewing population. Extending this, it is natural for groups to develop to review in particular areas. Members of such a group would review particular sets of articles (with efforts to ensure that each article gets reviewed by at least several members of the group).

At the same time the group could make available a weight-set corresponding exactly to that group as well as a selected articles set – members of the group likely also review outside of the area of this group, and so a weight-set alone is not sufficient information to permit others to get just the group’s set of recommended articles. The group would thus be making available a particular set of reviewed articles. In this sense the group would function, at least on the filtering side of things, much like a traditional journal. However, they would not have exclusivity over the articles. For this reason we have christened them NotAJournals.
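
To make the distinction between the weight-set and the selected articles set concrete, here is a small Python sketch. The `NotAJournal` class, its field names and the `group_scores` function are hypothetical names introduced for this example, and the scoring is the simplest possible one (weight times rating, summed).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotAJournal:
    name: str
    weights: dict        # scholar id -> weight; anyone not listed counts as 0
    selected: frozenset  # ids of the articles the group has chosen to cover


def group_scores(group, reviews):
    """Score the group's selected articles using only its members' reviews.

    reviews: iterable of (scholar id, article id, rating) triples.
    """
    scores = {}
    for scholar, article, rating in reviews:
        weight = group.weights.get(scholar, 0.0)
        # Skip reviews by outsiders and reviews of articles outside the set.
        if weight == 0.0 or article not in group.selected:
            continue
        scores[article] = scores.get(article, 0.0) + weight * rating
    return scores


# Example with made-up identifiers.
group = NotAJournal("example-group", {"alice": 1.0, "bob": 1.0},
                    frozenset({"a1", "a2"}))
all_reviews = [
    ("alice", "a1", 5), ("bob", "a1", 4),
    ("carol", "a1", 1),   # not a member: weight 0
    ("alice", "a9", 5),   # a member reviewing outside the group's area
    ("bob", "a2", 3),
]
example_scores = group_scores(group, all_reviews)
```

Without the `selected` set, the member's review of the unrelated article would leak into the group's recommendations, which is the reason the weight-set alone is not enough.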

Formal Mechanism

Formally, we could describe the system as consisting of scholars s(i) (i=1,2,…), articles a(j), and reviews by scholars of articles r(i,j) (note that a review can be just a single number or could be considered as a tuple of a rating plus other attributes such as comments). A weight-set is then a tuple of weights w(i) together with a review evaluation function f (which is a function of the computable attributes of the review, the article and the scholar). An overall weight function is then:

W(article j) = sum over scholars i: w(i) f(r(i,j), a(j))

Some examples:

Simplest linear case. The only attribute of the review is a rating, which with simple linear weighting gives: W(article j) = sum over scholars i: w(i) r(i,j)
Giving value to detailed comments. Here r(i,j) is a tuple consisting of a ‘numeric’ rating plus an indicator of whether the review was detailed or not.
Limiting attention to reviews from reviewers with expertise in the relevant area. This is similar to the first example but with the weights set to 0 for all reviewers without expertise.
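
The overall weight function and the three examples above can be sketched in a few lines of Python. This is an illustrative reading of the formula, not a reference implementation: the `Review` tuple, the function names and the 1.5 multiplier for detailed reviews are assumptions, and the article identifier stands in for the article attributes a(j).

```python
from typing import Callable, Dict, NamedTuple, Tuple


class Review(NamedTuple):
    rating: float           # the numeric rating
    detailed: bool = False  # indicator: was the review detailed?


# (scholar i, article j) -> r(i,j)
Reviews = Dict[Tuple[str, str], Review]


def overall_weight(article: str,
                   weights: Dict[str, float],
                   reviews: Reviews,
                   f: Callable[[Review, str], float]) -> float:
    """W(article j) = sum over scholars i of w(i) * f(r(i,j), a(j))."""
    return sum(weights.get(i, 0.0) * f(r, j)
               for (i, j), r in reviews.items() if j == article)


def linear(review: Review, article: str) -> float:
    """Example 1: f is just the rating."""
    return review.rating


def value_detail(review: Review, article: str) -> float:
    """Example 2: detailed reviews count for more (1.5 is an assumed factor)."""
    return review.rating * (1.5 if review.detailed else 1.0)


reviews = {
    ("alice", "p1"): Review(4, detailed=True),
    ("bob", "p1"): Review(2),
    ("alice", "p2"): Review(5),
}
everyone = {"alice": 1.0, "bob": 0.5}
experts_only = {"alice": 1.0}  # Example 3: non-experts implicitly weighted 0

w_linear = overall_weight("p1", everyone, reviews, linear)
w_detail = overall_weight("p1", everyone, reviews, value_detail)
w_expert = overall_weight("p1", experts_only, reviews, linear)
```

Because a weight-set is just the pair of `weights` and `f`, swapping either one yields a different ranking over the same underlying reviews, which is what allows many rankings to coexist.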