This Course is Expired

CS246 - Mining Massive Data Sets

The course covers key problems in mining massive data sets, from handling huge document databases and infinite streams of data to mining large social networks and web graphs.

Course at a Glance

Mode of learning : Online - Self Paced

Domain / Subject : Engineering & Technology

Function : Information Technology (IT)

Trainer name : Jure Leskovec

Starts on : 6th Jan 2015

Difficulty : Advanced


Data has become central to business decisions, strategy, and behavior in recent years. Predictive analytics, data mining, and machine learning provide new methods for analyzing massive data sets, and companies value people who can understand and manipulate large data sets to produce informative results.

Students gain hands-on experience designing algorithms for analyzing very large amounts of data and learn existing data mining and machine learning algorithms. Case studies give first-hand insight into how big data problems and their solutions help companies like Google succeed in the market.


Students enrolling under the non degree option are required to take the course for 4.0 units.

Prerequisites

  • At least one of: Computer Organization & Systems (Stanford Course CS107) or Introduction to Databases (Stanford Course CS145), or equivalent
  • At least one of: Intro to Probability for Computer Scientists (Stanford Course CS109) or Theory of Probability (Stanford Course STATS116), or equivalent

Tuition Option(s)

  • For Credit: $3,960.00
  • For Credit (member): $3,360.00
  • Units: 3

Computer Science Department Requirement
Students taking graduate courses in Computer Science must enroll for the maximum number of units and maintain a B or better in each course in order to continue taking courses under the Non Degree Option.

Non Degree Option 
Note: Enrolling in this course for credit under the Non Degree Option requires an approved application. If you do not already have an approved application on record, the application will be presented to you as part of the checkout process. If your application is denied, tuition and fees for the course will be refunded.

Textbooks/Course Materials 
Students enrolled in a graduate course for credit are required to complete homework assignments and projects and to take exams, as required of all students during the 10-week quarter. Information regarding textbooks and materials is usually covered in the first lecture and may also be found on the course website.

Topics Include

  • Shingling, minhashing, random hyperplanes, locality-sensitive hashing
  • Dimensionality reduction: UV, SVD, and CUR decompositions
  • Algorithms for very large scale mining: clustering, nearest-neighbor search, gradient descent, support-vector machines, classification, and regression
  • Submodular function optimization

Units: 3.0 - 4.0
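
As an illustration of the shingling and minhashing topics listed above, the following is a minimal Python sketch of estimating the Jaccard similarity of two short documents from MinHash signatures. It is not taken from the course materials. The function names (shingles, jaccard, minhash_signature, estimate_jaccard), the shingle length, the number of hash functions, and the use of CRC32 plus random linear hashes modulo a Mersenne prime are all assumptions chosen for illustration.

```python
import random
import zlib

# Assumption: 2**61 - 1 (a Mersenne prime) as the modulus for the linear hashes.
_PRIME = (1 << 61) - 1


def shingles(text, k=5):
    """Return the set of character k-shingles of a whitespace-normalized text."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}


def jaccard(a, b):
    """Exact Jaccard similarity |a & b| / |a | b| of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def minhash_signature(items, num_hashes=128, seed=0):
    """Return a MinHash signature for a non-empty set of strings.

    Each of the num_hashes hash functions has the form (a*x + b) mod _PRIME,
    where x is the CRC32 of an item. Signatures are only comparable when
    built with the same num_hashes and seed.
    """
    rng = random.Random(seed)
    coeffs = [(rng.randrange(1, _PRIME), rng.randrange(0, _PRIME))
              for _ in range(num_hashes)]
    # CRC32 gives a stable integer id (the built-in hash() varies per process).
    ids = [zlib.crc32(s.encode("utf-8")) for s in items]
    return [min((a * x + b) % _PRIME for x in ids) for a, b in coeffs]


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    doc_a = shingles("the quick brown fox jumps over the lazy dog")
    doc_b = shingles("the quick brown fox jumped over a lazy dog")
    sig_a = minhash_signature(doc_a, num_hashes=256)
    sig_b = minhash_signature(doc_b, num_hashes=256)
    print("exact Jaccard:    ", jaccard(doc_a, doc_b))
    print("MinHash estimate: ", estimate_jaccard(sig_a, sig_b))
```

The estimate approaches the exact Jaccard similarity as the number of hash functions grows. In a locality-sensitive hashing setup, signatures like these would typically be split into bands so that only documents sharing a band are compared; that step is omitted here.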

