Paddy Mullen

Introducing Wakari

We are proud to introduce Wakari, our hosted Python data analysis environment.

We believe that programmers, scientists, and analysts should spend their time writing code, not setting up a system. Data should be shareable, and analysis should be repeatable. We built Wakari to achieve these goals.

Sample Use Cases

We think Wakari will be useful to people across many industries. Here are three of the use cases it is designed to help with.

Learning Python

If you want to learn Python, Wakari is the perfect environment. Wakari makes it easy to start writing code immediately, without needing to install any software on your own computer.  You will be able to show instructors your code and get feedback as to where you’re getting hung up.

Academia

If you’re an academic frustrated by setting up computing environments and annoyed that your colleagues can’t easily run your code, Wakari is made for you.  Wakari handles all of the problems related to setting up a Python scientific computing environment.  Because Wakari builds on Anaconda, useful libraries like SciKit, mpi4py and NumPy are right at your fingertips without compilation gymnastics.

Since you run code on our servers through a web browser, it is easy for your colleagues to re-run your code to repeat your analysis, or try out variations on their own. At Continuum, we understand that reproducibility is an important part of the scientific process: your results must be consistent for reviewers and colleagues.
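A shared environment takes care of the software side of reproducibility, but the analysis code itself also has to be deterministic. The sketch below is only an illustration, not part of Wakari: the function name `bootstrap_mean`, the seed value, and the sample numbers are all made up. It shows how fixing the seed of a random number generator makes a resampling analysis return the same number every time it is re-run.

```python
import random


def bootstrap_mean(values, n_resamples=1000, seed=42):
    """Estimate the mean of `values` by bootstrap resampling.

    A private, seeded generator is used so that repeated calls with the
    same arguments return exactly the same estimate.
    """
    rng = random.Random(seed)  # independent of the global random state
    means = []
    for _ in range(n_resamples):
        # Draw a resample of the same size, with replacement.
        sample = [rng.choice(values) for _ in values]
        means.append(sum(sample) / float(len(sample)))
    return sum(means) / float(len(means))


data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]  # made-up numbers for illustration
first_run = bootstrap_mean(data)
second_run = bootstrap_mean(data)
print(first_run == second_run)  # True: same seed, same result
```

The seed only guarantees identical results within one Python version and library set, which is one reason a common, pre-configured environment matters for repeatable analysis.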

Finance

For users who work in finance, Wakari lets you avoid the drudgery of emailing Excel files to share analysis, data, and visuals. Since data feeds are integrated into the Python environment, it is effortless to import financial data into your coding environment. When it is time to share results, you can email colleagues a URL that links to running code. Interactive charts are easy to create and share from Python. Since Wakari is built on top of Anaconda, great libraries like NumPy, SciPy, Matplotlib, and pandas are already installed. Wakari includes support for Anaconda’s multiple environments, so you can easily change between versions of Python (including Python 3.3!) and versions of fundamental libraries.

How Are We Doing It?

Wakari integrates many different components. We are currently going into private beta with a “minimum viable product” and we will iterate on every area.  Each feature contributes in some way to our goal of easy, repeatable, and shareable data analysis.

Web-based Code Editing

The code editor is key to the environment: because it is web-based, you don’t have to download any additional software to create and edit Python code. We use the Ace Editor as our code editor. The editor also works with a file browser that provides GUI file management.

Sample Data Stores

Wakari comes with a number of sample data stores from our data library. These stores are built on top of IOPro for fast, indexed access, and each data store includes an example script. Using a data store is as simple as:

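from wakaridata.wikilogs import WikiLogs

logs = WikiLogs()  # open the WikiLogs sample data store
keys = logs.keys  # keys available in the store
print(logs[keys[0]][:])  # slice out everything stored under the first key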


Shareable JS Charting

Charting and visualization are a key part of any data analysis package. To make it easy to share interactive plots with colleagues, we have integrated our CDX plotting library into Wakari. This allows fully interactive JavaScript charts to be created from Python. CDX also enables users to export single plots or groups of plots into their own URLs, which can be shared with anyone.

Multiple Python Environments

Many users have to run their code against different versions of Python and NumPy. Normally this leads to a morass of compiling, linking, and installing.  With Wakari you can seamlessly run your code against multiple Python versions and multiple library versions.  Changing the Python environment is as simple as clicking on a dropdown.
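When switching between environments, it helps to confirm which versions your code actually ran against. The following is a minimal sketch using only the Python standard library; the helper name `describe_environment` and the default package list are our own choices for illustration and are not part of Wakari or Anaconda.

```python
import importlib
import platform


def describe_environment(packages=("numpy", "scipy", "pandas")):
    """Return the Python version and the versions of the named packages."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not available in the active environment.
            report[name] = "not installed"
        else:
            # Most scientific packages expose __version__; fall back if not.
            report[name] = getattr(module, "__version__", "unknown")
    return report


for name, version in sorted(describe_environment().items()):
    print("%s: %s" % (name, version))
```

Running this at the top of a script or notebook records the environment alongside the results, so a colleague who re-runs the analysis can tell whether a difference comes from the code or from the library versions.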

Execution on Compute Nodes

We wanted Python environments to be set up without users having to go through complicated install steps.  This is why all of your code runs on our compute nodes.  We have a fleet of Amazon EC2 nodes that are all configured with the full Anaconda environment.

When we start adding cluster and Disco integration, your code will be seamlessly distributed to those nodes.

Many Ways to Interact with the Running Python Environment

Wakari lets you run code in many ways. We have IPython shell integration, IPython notebook integration, and a Unix shell environment, all accessible from your web browser. Including these powerful tools lets you write code the way you’re used to.

In the coming months we will be adding many features to Wakari.

Here is a tentative list:

  • Python debugger integration
  • Profiler integration
  • Better shell integration
  • Better plotting
  • Code sharing via GitHub
  • Private cloud installation
  • Increased compute ability via GPU compute nodes
  • Batch jobs
  • Scheduled jobs

We are opening Wakari beta registration and will be releasing a fixed number of accounts each day. Sign up today at http://wakari.io/!

Tags: Wakari