9 episodes

A podcast about big data, distributed systems, and scalability

Scale Cast

    • Technology

    • video
    An Introduction to ZooKeeper Video

    In 2006 we were building distributed applications that needed a master (also known as a coordinator or controller) to manage the applications' subprocesses. It was a scenario that we had encountered before and something that we saw repeated over and over again inside and outside of Yahoo!. For example, we have an application that consists […]

    • video
    More Optimal Bloom Filters

    The Bloom filter, conceived by Burton H. Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. False positives are possible, but false negatives are not. Elements can be added to the set, but not removed (though this can be addressed with […]
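    The properties named in this description can be illustrated with a minimal sketch. It is not taken from the talk: the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the use of salted SHA-256 to derive bit positions are all assumptions made for illustration. It shows why an added element is always reported as present (no false negatives) while an element that was never added may still be reported as present (false positives).

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: a bit array plus k hash positions per item."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed eight slots to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive k positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Setting bits is the only mutation, so items cannot be removed.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


if __name__ == "__main__":
    bf = BloomFilter(num_bits=256, num_hashes=3)
    for word in ["zookeeper", "mapreduce", "bloom"]:
        bf.add(word)
    print(bf.might_contain("bloom"))
```

    Because bits are only ever set, every position for an added item stays set, which is why a lookup for that item cannot fail. Different items can share positions, which is where false positives come from and why clearing bits to remove an item is unsafe in this basic form.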

    • video
    An Overview of High Performance Computing and Challenges for the Future

    In this talk we examine how high performance computing has changed over the last 10 years and look toward the future in terms of trends. These changes have had and will continue to have a major impact on our software. A new generation of software libraries and algorithms is needed for the effective and reliable use […]

    • video
    Disk-Based Parallel Computation, Rubik’s Cube, and Checkpointing

    This talk takes us on a journey through three varied but interconnected topics. First, our research lab has engaged in a series of disk-based computations extending over five years. Disks have traditionally been used for filesystems, for virtual memory, and for databases. Disk-based computation opens up an important fourth use: an abstraction for multiple disks […]

    • video
    Lecture 5: Cluster Computing and MapReduce

    link to video

    • video
    Lecture 3: Cluster Computing and MapReduce

    link to video