Scale Up & Scale Out

Big Data, advanced analytics and scientific computing bring exciting opportunities for businesses to leverage more and richer types of data to create meaningful business value and impact. However, they also create serious computational challenges. To effectively manage this increasingly complex landscape, Data Science teams need technologies that scale easily and take advantage of the available processing power, from the desktop to high performance clusters with the latest advances in chipsets.

Python is the fastest growing Open Data Science programming language, with an incredibly rich open source ecosystem. It is the go-to language for Data Science teams because of its easy learning curve, simplicity and readability, which result in faster development time compared with compiled languages. However, because it is an interpreted language, it is not usually associated with high performance, and with today’s increasingly large data sets and complex analytics, scaling has become an issue for many Data Science teams.

The Anaconda platform delivers high performance analysis to Data Science teams by leveraging investments in all types of infrastructure to both Scale Up and Scale Out workloads. The Anaconda Python distribution is a high performance distribution of Python that has been compiled with the Intel® Math Kernel Library (MKL), which delivers faster throughput on numerically intensive calculations. Additionally, Anaconda offers a multitude of Scale Up options that allow Data Science workloads to maximize the investment in single machine infrastructure.

In this paper, you’ll discover why Anaconda, the leading Open Data Science platform, is the right solution to deliver that range of flexibility and performance. We will:
  • Define the computing constraints
  • Identify techniques for resolving the computing constraints
  • Describe ways to deliver high performance on today’s modern architectures
  • Showcase the flexibility of the Anaconda platform to scale with any modern infrastructure



About the Author

Dr. Stan Seibert

Dr. Stan Seibert is the High Performance Python Team Lead at Continuum Analytics. His work focuses on high performance GPU computing and designing data analysis, simulation and processing pipelines.
Before Continuum, Stan did research in experimental particle physics at the University of Pennsylvania and Los Alamos National Laboratory.
Stan holds a Ph.D. in Physics from the University of Texas at Austin and undergraduate degrees in Physics and Computer Science from the University of Arizona.
In this whitepaper, Dr. Seibert is joined by Dr. Kristopher Overholt, Michele Chambers and Christine Doig of Continuum Analytics.