I have been tinkering with lots of different programming languages (see here and here) over the last few years. Scheme is the only language so far that I have enjoyed enough to write a decent amount of code. Elixir first caught my eye back in April 2020, but I've only recently tried to write more than 'hello world' with it. So far, I think it is great and I'm excited to learn more. I haven't previously been a fan of code notebooks, but I think Livebook is amazing.
I have done some recent work on my dataframe library for Scheme (R6RS) and thought I would run through the examples in the Data Transformation chapter of R for Data Science (R4DS). In this post, I won't reproduce any of the R code and will provide only limited commentary on the Scheme code (which is also available via this gist).
As a learning exercise, I decided to translate examples from the book, From Python to NumPy, into R and Chez Scheme. This post describes the random walk example from Chapter 2. All of the code is in this repository, so I will only highlight a few pieces of code below. For context, I am a long-time R programmer who only periodically pokes at Python and dabbles in Scheme for fun. Because performance is the primary motivation for vectorizing code with NumPy in Python, I will be loosely comparing timings between Python, R, and Chez Scheme. Take these timings with a large grain of salt; I don't know how comparable they are.
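As a rough illustration of the kind of vectorization the book advocates, here is a minimal sketch of a 1-D random walk in Python, contrasting a step-by-step loop with a NumPy version. The function names are mine, and the details may differ from the book's code:

```python
import numpy as np

def walk_loop(n, rng):
    # Plain-Python approach: update the position one step at a time.
    position, path = 0, []
    for _ in range(n):
        position += 1 if rng.random() > 0.5 else -1
        path.append(position)
    return path

def walk_vectorized(n, rng):
    # NumPy approach: draw all n steps at once, then cumulative-sum them.
    steps = rng.choice([-1, 1], size=n)
    return np.cumsum(steps)

rng = np.random.default_rng(42)
print(walk_vectorized(10, rng))
```

The vectorized version trades the explicit loop for two array operations, which is where NumPy's speedup comes from.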
As a learning exercise, I wrote a dataframe library for Scheme (R6RS). Because I was learning Scheme while I wrote dataframe, I did not prioritize performance. However, as I've tried to use the dataframe library (exploratory data analysis, spam simulation, gapminder), I've encountered performance pitfalls that make dataframe largely unusable for datasets with more than a few thousand rows. I have a rough idea of where the bottlenecks are, but I thought it would be useful to take a step back and visualize the dataframe procedures as a network graph.
The stochastic logistic population model described in this blog post has become my default exercise when I'm exploring a new programming language (Racket, F#, Rust). I ended that Rust post by noting that I was interested in Rust as a modern alternative for scientific computing and thought it would be a good learning exercise to re-write small legacy Fortran programs in Rust. In the process of looking for Fortran programs to translate to Rust, though, I found myself becoming more interested in the idea of learning Fortran than Rust, particularly after learning about the efforts to improve the tooling around Fortran (e.g., here and here). So, here we are...exploring Fortran via the stochastic population model exercise.
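The model itself is described in the earlier post; as a rough illustration of the exercise, here is one common discrete-time formulation of a stochastic logistic model, sketched in Python. The parameter names and the lognormal noise structure are my assumptions, not necessarily those of the original post:

```python
import math
import random

def logistic_sim(years, n0, r, k, sd, seed=None):
    # Discrete-time logistic growth with multiplicative lognormal noise.
    # r = growth rate, k = carrying capacity, sd = noise standard deviation.
    rng = random.Random(seed)
    n, traj = float(n0), [float(n0)]
    for _ in range(years):
        noise = math.exp(rng.gauss(0.0, sd))  # exp(0) = 1 when sd = 0
        n = max(n + r * n * (1.0 - n / k) * noise, 0.0)  # population can't go negative
        traj.append(n)
    return traj

print(logistic_sim(5, 10, 0.5, 100, 0.1, seed=1))
```

With `sd = 0` the simulation reduces to the deterministic logistic map and converges to the carrying capacity, which makes a handy sanity check in any language.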
I keep my eye out for blog posts illustrating data analysis tasks in R that I can use to test the functionality of my dataframe libraries for Scheme (R6RS). A post comparing pandas (Python) and dplyr (R) in a basic analysis of the gapminder dataset provides a nice little test case. In this post, I will also include base R code used to accomplish the same tasks as a contrast to both the Scheme code and the dplyr code from the other post.
I recently came across this blog post on writing simple test programs in different programming languages as a way to get a feel for a language. At the bottom of the post, there were links to articles on implementations of a number guessing game in 13 different languages, including Fortran. I was curious about the Fortran example because I've been interested in learning a little Fortran after hearing about efforts to improve the tooling around Fortran (e.g., here and here). Well, the Fortran example was written in Fortran 77 so I decided that re-writing it in modern Fortran would be a nice little exercise.
I learned a lot about Scheme (R6RS) by writing a few libraries and I expect that there is more to learn by trying to use those libraries (e.g., EDA in Scheme). A blog post about a stochastic simulation of spam comments in R caught my eye as an interesting example to test my dataframe library.
As I spent a little time learning F# over the last few months, I found that it wasn't holding my attention. My interest in F# was based on the idea that I could write more robust code (via static typing) than in R and that I could more easily turn that code into web or desktop applications. I still think that F# could be a valuable tool to add to my toolbox, but I encountered just enough friction that I wasn't having fun with it. My primary point of frustration is that so much material for learning F# assumes that you already know C# and .NET. Plus, the rollout of .NET 5 and F# 5 this fall, while exciting, creates a period of increased confusion for beginners.
In three previous posts, I wrote about different programming languages that I have considered learning. I mentioned about 15 different languages in those posts. F# was not on the list. Because my background is in R, I thought I was better off sticking to learning dynamically typed languages at this point. Moreover, I hold a longstanding bias against Microsoft and Windows, and that bias was easy to transfer to F#.