Archive for September, 2008

Changes to Google Scholar’s Search algorithm

Thursday, September 4th, 2008

I was in Philadelphia for a meeting yesterday prior to the Society for Scholarly Publishing’s Top Management Roundtable.  I was talking to a colleague about changes to Google Scholar’s search algorithm that were released last week.  Apparently, these changes rank freely available articles higher in the search results than articles behind a subscription wall.  This is an interesting change that could create some stir in the community.

I have several thoughts about this change.  First, I wonder who knew about or recognized the change when it happened.  For a service that many (most?) researchers and students use, the underlying basis on which results are presented is completely unknown.  This has been a common criticism of Google for a long time.  Google’s PageRank algorithm has been the “secret sauce” and among the most closely guarded secrets in a highly secretive company, although, according to Scholar’s “About” page, the algorithm for Scholar is different from the one used by the rest of Google.  Interestingly, though no one (outside of Google) knows why one article ranks higher on the list than another, everyone seems to rely on it, despite some research that has shown other library search services to be more effective.  Many in the community have been critical of this practically since Scholar was released.

More interesting than the ongoing debate about Google’s openness are the ramifications that this particular change has for copyright.  If an article is found in a subscription-walled system and is copied and posted to an open site, then under this change the pirated copy would appear higher in the search results than the legitimate copy.  The person I was discussing this with saw some examples of content from their site that had been posted on open sites.  Obviously, people post content for numerous reasons and some have legitimate rights to do so.  For example, most publishers allow author self-archiving or posting to an individual’s home page.  In this case, it probably is preferable from the author’s perspective to have the freely available copy ranked higher, because it directs readers to the author’s site, where more information about the author and their work probably resides.  This is one area where NISO’s Journal Article Versions recommended practice would be usefully applied.

What is likely occurring more often is that authorized users find an article, copy the file and post it outside of the subscription wall.  This might be done knowingly or not, but it is odd that the new changes would drive traffic to files that could well have been posted in violation of copyright law.

Of course, if the algorithm changes again, we might never know.

The annual world of college freshmen

Monday, September 1st, 2008

Every fall, with the start of the new school year, a list is published of things that the incoming college freshmen have never known or experienced.  This year’s college freshmen were generally born in 1990.  Among the interesting items on this year’s list, published by Beloit College, are:

    • 18) WWW has never stood for World Wide Wrestling.
    • 28) IBM has never made typewriters.
    • 36) Kids may have been given a Nintendo Game Boy to play with in the crib
    • 51) Windows 3.0 operating system made PCs user-friendly

 My own son is many years off from college, but it’s interesting to reflect on things that he will never have considered.

    • He’ll never know dial-up internet.  
    • There will always have been people living in space.
    • He’s never seen a skyline of NYC with a World Trade Center
    • He’s always been driven on roads shared with lots of vehicles that could run on electricity (at least some of the time) instead of gasoline.
    • He could never buy Polaroid film
    • The video game industry has always been bigger than the movie industry
    • He’ll never know a time when neither an African-American nor a woman had been elected president or vice president
    • “To google” has always been a verb

  I wonder which of these will have come true by then:

    • Textbooks will no longer be available in print
    • PDAs, e-book readers, cell phones and laptops will have converged into a single device
    • Moore’s Law will have reached the limits of physics

 It’s always a fun thing to consider, particularly on a holiday afternoon while the boy’s napping.