Guardian moves on from Java and Oracle
Written by Kay Ewbank   
Thursday, 07 April 2011

The developer team at the Guardian's online news site is making another interesting decision to use emerging technology for its site, and its choice is Scala.

The Guardian website has the highest readership of any online news site apart from the New York Times. Over recent months developers working on the site have been revealing plans to move from the current Java and Oracle based system to one based on Scala and MongoDB. The changeover is starting with the Content API which is used for collecting the content from the online newspaper.

The API was developed in Java, but the team has decided to switch to the JVM-based Scala instead. The decision was made because of the need to reduce the time taken to deliver new features. Scala, although it runs on the Java Virtual Machine and interoperates with Java code, is very much a modern language organised around a functional programming approach plus objects. It is certainly a more advanced language than Java, but it is also a less tried and trusted tool.



Graham Tackley, the Web Platform Development Team Lead for the Guardian, explained at Eurocon 2010 how the team represents its 50-table relational database model in Apache Solr for media storage and uses Scala for real-time content searching, indexing and updating. He said moving to Scala reduced the time for building the search index from 20 hours to one.

The slides of this talk on Solr are available here with details of the system architecture. Tackley has also written some interesting posts about Scala and why he likes it on his blog.

Meanwhile, another developer at the Guardian, Mat Wall, spoke at QCon London about the decisions behind moving from Oracle to MongoDB. MongoDB stores documents in JSON (JavaScript Object Notation) format rather than using a relational structure, which means that documents with new attributes can be added to the database at runtime. The Guardian team was being restricted by the fact that updating the code that runs the site often meant updating the database schema as well, which put content updates on hold while the schema was changed.
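
To make the schema point concrete, here is a minimal sketch of the idea of adding documents with new attributes at runtime. It uses plain Python dictionaries as an in-memory stand-in and does not use the MongoDB driver; the names `articles`, `insert`, `find` and the field `live_updates` are invented for illustration and are not taken from the Guardian's system.

```python
import json

# In-memory stand-in for a document collection (hypothetical; not the MongoDB driver).
articles = []


def insert(doc):
    """Store a copy of the document; no schema is checked or enforced."""
    articles.append(dict(doc))


def find(**criteria):
    """Return the documents whose fields match every given criterion."""
    return [d for d in articles if all(d.get(k) == v for k, v in criteria.items())]


# An existing document with the original set of attributes.
insert({"id": 1, "headline": "Budget 2011", "section": "politics"})

# A later release stores a document with a new attribute; no schema change is needed.
insert({"id": 2, "headline": "Live blog", "section": "news", "live_updates": True})

# Older documents simply lack the new field, so they do not match this query.
live = find(live_updates=True)

# Each document serialises directly to JSON.
serialized = json.dumps(articles[1], sort_keys=True)
```

In a relational table the new `live_updates` column would have to be added to the schema before the second insert could succeed, which is the kind of coupling the team described.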

Both MongoDB and Oracle are in use at the moment, and this is being handled by a custom API layer that acts as a wrapper for the database access to the two very different data structures. Mat Wall also said the team is looking to move its data to being hosted in the cloud. The slides from the session can be downloaded from the QCon website.
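
A wrapper layer of the kind described above might look roughly like the following sketch. It is not the Guardian's code: an in-memory SQLite database stands in for the relational side, a dictionary stands in for the document side, and the class names and the "document store first, then relational" routing rule are assumptions made purely for illustration.

```python
import sqlite3


class RelationalStore:
    """Stand-in for the relational side, using in-memory SQLite."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE article (id INTEGER PRIMARY KEY, headline TEXT)"
        )

    def add(self, article_id, headline):
        self.conn.execute("INSERT INTO article VALUES (?, ?)", (article_id, headline))

    def get(self, article_id):
        row = self.conn.execute(
            "SELECT id, headline FROM article WHERE id = ?", (article_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "headline": row[1]}


class DocumentStore:
    """Stand-in for the document side, using a dict keyed by id."""

    def __init__(self):
        self.docs = {}

    def add(self, doc):
        self.docs[doc["id"]] = dict(doc)

    def get(self, article_id):
        return self.docs.get(article_id)


class ContentRepository:
    """Single access point; callers never see which store answered."""

    def __init__(self, documents, relational):
        self.documents = documents
        self.relational = relational

    def get(self, article_id):
        # Assumed routing rule: prefer the document store, fall back to relational.
        doc = self.documents.get(article_id)
        return doc if doc is not None else self.relational.get(article_id)


relational = RelationalStore()
relational.add(1, "Legacy article")

documents = DocumentStore()
documents.add({"id": 2, "headline": "Migrated article", "tags": ["scala"]})

repo = ContentRepository(documents, relational)
```

Calling `repo.get(1)` returns the row held in the relational stand-in, while `repo.get(2)` returns the richer document, so code above the wrapper does not need to change as content moves from one store to the other.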

You have to admire the team for making decisions to use tools that are "non-safe". After all, it is an old saying that no one ever got fired for buying into the mainstream. You also can't help but think that they are deliberately setting out to be trailblazers. Whether they or the Guardian will regret it in the longer term is a different matter. Adopting new technologies on such a scale is an interesting experiment and one we can all learn from.


