Friday, June 5, 2015

Weekend reading list

Let's see: I need to finish Seveneves, get going on H is for Hawk, play The Witcher 3: Wild Hunt, watch the Champions League final, watch the NBA Finals...

Oh yeah, and I need to read all of this:

  • Flying faster with Twitter Heron
    A real-time streaming system demands certain systemic qualities to analyze data at a large scale. Among other things, it needs to: process billions of events per minute; have sub-second latency and predictable behavior at scale; in failure scenarios, have high data accuracy, resiliency under temporary traffic spikes and pipeline congestions; be easy to debug; and simple to deploy in a shared infrastructure.
  • Everything You Ever Wanted to Know About Message Latency
    To get a feel for message latency, we’ll use the analogy of a train going through a tunnel. This setup is shown in Figure 1. The train represents the message; longer trains stand for larger messages. The tunnel represents the link connecting the sender and receiver. The message latency is the time it takes for the train to enter the tunnel and leave the other side.
  • Turing Lecture: The Computer Science of Concurrency: The Early Years
    This is a personal view of the first dozen years of the history of the field of concurrency—a view from today, based on 40 years of hindsight. It reflects my biased perspective, so despite covering only the very beginning of what was then an esoteric field, it is far from complete. The geneses of my own contributions are described in comments in my publications web page.
  • Recommending items to more than a billion people
    Collaborative filtering (CF) is one of the important areas where this applies. CF is a recommender systems technique that helps people discover items that are most relevant to them. At Facebook, this might include pages, groups, events, games, and more. CF is based on the idea that the best recommendations come from people who have similar tastes. In other words, it uses historical item ratings of like-minded people to predict how someone would rate an item.
  • How Global Network Latency Affects Your Mobile App
    Looking at G20 countries, South Korea comes in as the fastest at 259 ms, while Saudi Arabia clocks in the slowest at 695 ms. For full global stats, check out the map.
  • Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity
    Specifically, we focus our study on the common use of feral, or application-level, mechanisms for maintaining database integrity, which, across a range of ORM systems, often take the form of declarative correctness criteria, or invariants. We quantitatively analyze the use of these mechanisms in a range of open source applications written using the Ruby on Rails ORM and find that feral invariants are the most popular means of ensuring integrity (and, by usage, are over 37 times more popular than transactions).
  • Optimizing Optimistic Concurrency Control for Tree-Structured, Log-Structured Databases
    If applications do not partition perfectly, then transactions accessing multiple partitions end up being distributed, which has well-known scalability challenges. To address them, we describe a high-performance transaction mechanism that uses optimistic concurrency control on a multi-versioned tree-structured database stored in a shared log. The system scales out by adding servers, without partitioning the database.
  • Let’s Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems
    These new NVM devices are almost as fast as DRAM, but all writes to it are potentially persistent even after power loss. Existing DBMSs are unable to take full advantage of this technology because their internal architectures are predicated on the assumption that memory is volatile. With NVM, many of the components of legacy DBMSs are unnecessary and will degrade the performance of data intensive applications.
  • Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems
    We present a novel MVCC implementation for main-memory database systems that has very little overhead compared to serial execution with single-version concurrency control, even when maintaining serializability guarantees. Updating data in-place and storing versions as before-image deltas in undo buffers not only allows us to retain the high scan performance of single-version systems but also forms the basis of our cheap and fine-grained serializability validation mechanism.
  • From Theory to Practice: Efficient Join Query Evaluation in a Parallel Database System
    In this paper, we describe a system that can compute efficiently complex join queries, including queries with cyclic joins, on a massively parallel architecture. We build on two independent lines of work for multi-join query evaluation: a communication-optimal algorithm for distributed evaluation, and a worst-case optimal algorithm for sequential evaluation.
  • SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
    The annual ACM SIGMOD/PODS conference is a leading international forum for database researchers, practitioners, developers, and users to explore cutting-edge ideas and results, and to exchange techniques, tools, and experiences.

Good thing it's summer and the days are long, long, long...
