Free book on web mining
Thursday, 30 December 2010

 

Mining of Massive Datasets, a textbook written for an advanced graduate course taught at Stanford University, has been made available for free download by its authors, Anand Rajaraman and Jeffrey D. Ullman.

The book focuses on mining data so large that it doesn't fit into main memory, and it uses examples derived from the Web. Its approach is algorithmic: it treats data mining as applying algorithms to data, rather than using data to train a machine-learning model.

According to its Preface, the principal topics covered are:

  1. Distributed file systems and map-reduce as a tool for creating parallel algorithms that succeed on very large amounts of data.
  2. Similarity search, including the key techniques of minhashing and locality-sensitive hashing.
  3. Data-stream processing and specialized algorithms for dealing with data that arrives so fast it must be processed immediately or lost.
  4. The technology of search engines, including Google's PageRank, link-spam detection, and the hubs-and-authorities approach.
  5. Frequent-itemset mining, including association rules, market-baskets, the A-Priori Algorithm and its improvements.
  6. Algorithms for clustering very large, high-dimensional datasets.
  7. Two key problems for Web applications: managing advertising and recommendation systems.
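
To give a flavour of the second topic, here is a minimal sketch of minhashing in Python. It is an illustration written for this article, not code from the book: the function names, the use of 3-character shingles, the CRC32 base hash and the random linear hash functions are all assumptions made for the example. The idea is that the fraction of positions where two minhash signatures agree estimates the Jaccard similarity of the underlying sets.

```python
import random
import zlib

PRIME = 2_147_483_647  # 2**31 - 1, an assumed modulus for the hash family


def shingles(text, k=3):
    """Return the set of k-character shingles of text."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def make_hashers(n, seed=0):
    """Create n random (a, b) pairs defining h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(n)]


def minhash_signature(items, hashers):
    """For each hash function, keep the minimum hash value over the set."""
    if not items:
        raise ValueError("cannot compute a signature for an empty set")
    # CRC32 gives a deterministic integer code for each string item.
    codes = [zlib.crc32(s.encode("utf-8")) for s in items]
    return [min((a * c + b) % PRIME for c in codes) for a, b in hashers]


def estimate_similarity(sig1, sig2):
    """Fraction of signature positions on which the two signatures agree."""
    return sum(x == y for x, y in zip(sig1, sig2)) / len(sig1)


if __name__ == "__main__":
    hashers = make_hashers(128)
    s1 = shingles("mining of massive datasets")
    s2 = shingles("mining massive data sets")
    sig1 = minhash_signature(s1, hashers)
    sig2 = minhash_signature(s2, hashers)
    print("exact Jaccard:   ", jaccard(s1, s2))
    print("minhash estimate:", estimate_similarity(sig1, sig2))
```

The signatures are much smaller than the shingle sets, so they can be compared in bulk. Locality-sensitive hashing then builds on such signatures to avoid comparing every pair.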

Although this is an academic text, it is written in an accessible style, making it suitable for other readers with existing knowledge of SQL, data structures and algorithms, and software systems.

If you are interested in big data then this is a must, and given that it is free, the price is right too.

You can read it online (HTML) or download it as a PDF.

Download it from:

http://infolab.stanford.edu/~ullman/mmds.html

 

Copyright © 2015 i-programmer.info. All Rights Reserved.