Building Distributed Pipelines for Data Science using Kafka, Spark, and Cassandra

Andy Petrella (@noootsab) is a mathematician turned distributed computing entrepreneur, as well as a Scala and Spark trainer. He has worked on many projects built with Spark, Cassandra, and other distributed technologies, in fields including geospatial, IoT, automotive, and smart cities. Andy is the creator of the Spark Notebook, the only reactive and fully Scala notebook for Apache Spark. In 2015, he founded Data Fellas, which is working on an integrated and reactive distributed data science toolkit orchestrated from within the Spark Notebook.


