My 2013 PyCon talk: Awesome Big Data Algorithms

Schedule link

Description

Random algorithms and probabilistic data structures are algorithmically efficient and can provide shockingly good practical results. I will give a practical introduction, with live demos and bad jokes, to this fascinating algorithmic niche. I will conclude with some discussions of how our group has applied this to large sequencing data sets (although this will not be the focus of the talk).

Abstract:

I propose to start with Python implementations of most of the data structures and algorithms mentioned in this excellent blog post:

http://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/

and also discuss skip lists and any other random algorithms that catch my fancy. I’ll put everything together in an IPython notebook and add visualizations as appropriate.
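
To give a flavor of this kind of data structure, here is a minimal sketch of a Bloom filter, one of the best-known probabilistic structures for set membership. It is an illustration only, not code from the talk or the notebook: the `BloomFilter` class, its `num_bits` and `num_hashes` parameters, the salted SHA-256 hashing scheme, and the sample DNA-like strings are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per "bit" keeps the sketch readable; a real
        # implementation would pack these into an actual bit array.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # The item is "possibly present" only if every position is set.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for word in ["ATGGC", "GGCTA", "TTACG"]:  # hypothetical sample items
    bf.add(word)

print("ATGGC" in bf)  # True: anything that was added is always found
print("CCCCC" in bf)  # very likely False; a false positive is possible but rare
```

The memory used is fixed by `num_bits` no matter how many items are added; the price is a false-positive rate that grows as the filter fills up.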

I’ll finish with some discussion of how we’ve put these approaches to work in my lab’s research, which focuses on compressive approaches to large data sets (and is regularly featured in my Python-ic blog, http://ivory.idyll.org/blog/).
