Kansas City User Groups: June 21st: Data Science KC - Untangling Healthcare with Spark and Dataflow

Tuesday, June 21, 2016, 6:30 PM
C2FO
4210 Shawnee Mission Pkwy, Fairway, KS

Spark is becoming a data processing giant, but it leaves much as an exercise for the user. Developers need to write specialized logic to move between batch and streaming modes, manually deal with late or out-of-order data, and explicitly wire complex flows together.

In this talk, Ryan Brush discusses tackling these problems over a multi-petabyte dataset at Cerner. We start with how hand-written solutions to these problems evolved into prescriptive practices, opening up development of such systems to a wider audience. From there we look at how the emergence of Google’s Dataflow on Spark is helping us take the next step: the tradeoffs between correctness, latency, and cost are becoming a simple, easily changeable decision rather than a deep analysis for each new need. Finally, we look at challenges unique to processing data in large organizations, such as making independent units of processing composable into large pipelines and usable in both batch and stream modes.

Ryan Brush is a software engineer at Cerner, where he works on Hadoop-based systems to bring together and make sense of the world's health data. He dabbles in writing, having contributed chapters to Hadoop: The Definitive Guide and 97 Things Every Programmer Should Know. He is also the author of Clara, an open-source rule engine in Clojure. Ryan's recent focus is on ways to declaratively express domain expertise and apply it at scale.

Kansas City User Groups

IT Education opportunities in the Kansas City Metro area.
