Reynold Xin

Fields: Computer science
Alma mater: UC Berkeley (doctoral study); University of Toronto (B.A.Sc.)
Doctoral advisor: Michael J. Franklin
Known for: Apache Spark

Reynold Xin is a computer scientist and engineer specializing in big data, distributed systems, and cloud computing. He is a co-founder and Chief Architect of Databricks and a frequent conference speaker on big data and open source software. He is best known for his work on Apache Spark, which as of June 2016 is the top open-source big data project. He designed and led development of the GraphX, Project Tungsten, and Structured Streaming components and co-designed DataFrames, all of which are part of the core Apache Spark distribution. He also served as the release manager for the Spark 2.0 release. As of September 2016 he is the most active contributor to Spark, with over 1000 commits.

Xin started his work on the Spark open source project while he was a PhD candidate at the UC Berkeley AMPLab.

The first research project, Shark, was a system able to efficiently execute SQL and advanced analytics workloads at scale. Shark won the Best Demo Award at SIGMOD 2012. It was one of the first open source interactive SQL-on-Hadoop systems, with claims that it was between 10 and 100 times faster than Apache Hive. Shark was used by technology companies such as Yahoo, but was replaced in 2014 by a newer system called Spark SQL.

The second research project, GraphX, was a graph processing system built on top of Spark, a general data-parallel system. At the same time, GraphX challenged the notion that specialized systems are necessary for graph computation. GraphX was released as an open source project and merged into Spark in 2014 as its graph processing library.


...
Wikipedia