Julia: Taking stock of how it has been doing

I came across a 2012 question that had a very good discussion about Julia as an alternative to R / Python for various types of statistical work.

Here is the original question from 2012 about Julia’s promise.

Unfortunately, Julia was very new back then, and the toolkits needed for statistical work were somewhat primitive. Bugs were still being ironed out. Distributions were difficult to install. Et cetera.

Someone had a very apt comment on that question:

This said, it’ll be 5 years before this question could possibly be
answered in hindsight. As of right now, Julia lacks the following
critical aspects of a statistical programming system that could
compete with R for day-to-day users:

That was in 2012. Now that it’s 2015 and three years have passed, I was wondering: how do people think Julia has done?

Is there a richer body of experience with the language itself and the overall Julia ecosystem? I would love to know.

Specifically:

  1. Would you advise any new users of statistical tools to learn Julia over R?
  2. For what sort of statistics use cases would you advise someone to use Julia?
  3. If R is slow at a certain task, does it make sense to switch to Julia or to Python?

Note: First posted June 14, 2015.

Answer

I have switched to Julia, and here are my pragmatic reasons:

  • It does glue code really well. I have a lot of legacy code in MATLAB, and MATLAB.jl took 5 minutes to install, works perfectly, and has a succinct syntax that makes calling MATLAB functions feel natural. Julia offers the same for R, Python, C, Fortran, and many other languages.
  • Julia does parallelism really well. I’m not just talking about multi-processor (shared-memory) parallelism, but also multi-node parallelism. I have access to some HPC nodes that aren’t used very often because each is pretty slow, so I decided to give Julia a try. I added @parallel to a loop, started Julia with a machine file, and bam, it used all 5 nodes. Try doing that in R/Python. In MPI it would take a while to get that working (and that’s if you know what you’re doing), not a few minutes on your first try!
  • Julia’s vectorized code is fast (in many cases faster than in any other high-level language), and its devectorized code is almost as fast as C. So if you write scientific algorithms, you usually write them first in MATLAB and then re-write them in C. Julia lets you write the code once, add a few compiler hints, and 5 minutes later it’s fast. Even if you don’t, this means you can just write the code whatever way feels natural and it will still run well. In R/Python, you sometimes have to think pretty hard to get a good vectorized version (and that version can be tough to understand later).
  • The metaprogramming is great. Think of the number of times you’ve thought, “I wish I could ______ in this language.” Write a macro for it. Usually someone already has.
  • Everything is on GitHub: the source code and the packages. It’s super easy to read the code, report issues to the developers, talk to them to find out how to do something, or even improve packages yourself.
  • There are some really good libraries. For statistics, you’d probably be interested in the optimization packages (JuliaOpt is the organization that manages them). The numeric packages are already top-notch and only improving.
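To make the glue-code point concrete, here is a minimal sketch of calling MATLAB from Julia via MATLAB.jl. It assumes a local MATLAB installation and that the package is installed; the exact calls shown (`mxcall` with a function symbol and an output count) are MATLAB.jl’s documented pattern, but treat this as illustrative rather than a drop-in snippet:

```julia
using MATLAB

# mxcall(:f, nout, args...) calls MATLAB function f,
# requesting nout output values. Here: MATLAB's sqrt as a smoke test.
y = mxcall(:sqrt, 1, 2.0)

# Legacy code is called the same way, e.g. a function mysim.m on the
# MATLAB path (`mysim` is a placeholder name for your own function):
# result = mxcall(:mysim, 1, params)
```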
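The multi-node pattern described in the parallelism bullet looked roughly like this in 2015-era Julia (the `@parallel` reduction macro; in modern Julia it has been renamed `@distributed` in the Distributed standard library). A hypothetical sketch, assuming a `machines` file listing the node hostnames:

```julia
# Launch across nodes with:  julia --machinefile machines montecarlo.jl

# Estimate pi by Monte Carlo. @parallel (+) splits the loop's
# iterations across all available workers and sums the partial results.
n = 10^8
inside = @parallel (+) for i = 1:n
    Int(rand()^2 + rand()^2 <= 1)
end
println(4 * inside / n)
```

The point of the bullet is that nothing in the loop body mentions nodes, messages, or ranks; the machine file plus one macro is the whole setup.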
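As a tiny illustration of the metaprogramming bullet, here is a toy macro of the “I wish I could ______” sort: it wraps an expression so that running it also reports the elapsed time (a hypothetical example; Julia ships a similar built-in `@time`):

```julia
# @timeit expr evaluates expr, prints the elapsed wall time, and
# returns expr's value. esc() keeps the user's variables intact.
macro timeit(ex)
    quote
        t0 = time()
        val = $(esc(ex))
        println("elapsed: ", time() - t0, " s")
        val
    end
end

@timeit sum(rand(10^6))
```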

That said, I still really love RStudio, but the new Juno on Atom is really nice. Once it’s no longer in heavy development and is stable, I can see it being better than RStudio because of its ease of plugins (for example, it has a good plugin for adapting to HiDPI screens). So I think Julia is a good language to learn now. It has worked out well for me so far. YMMV.

Attribution
Source: Link, Question Author: Community, Answer Author: Chris Rackauckas
