Sunday, November 18, 2007

My first F# program

Back when I was hoping to get an interview with the nascent F# team (which I obviously did not get; sad faces all around), I started adding F# blogs to my feed, including that of Luke Hoban (not Luke H on my links), a former C# PM whom I had the privilege of working with during my internship and ICFP 2006. Jomo Fisher, a new developer on the F# team, posted a small challenge: "pivot" a list of lists in a small number of lines in any language. The goal is to be as concise and clean as possible while beating his solution and a yet-to-be-posted solution, which are 9 and 4 lines of F# respectively. After a while I came up with my first F# program (it looks like I just copied someone else's from the comments, but comments are moderated, so they weren't all there), which was kind of fun:

let rec pivot(l: 'a list list) =
    match List.hd l with
        | head::tail -> [for sub in l -> List.hd sub]::pivot [for sub in l -> List.tl sub]
        | [] -> [];;

Possibly due to my amateurism in functional programming or my unfamiliarity with F#, these four lines took me quite a while to figure out (more than an hour). Reading the code is easy enough, though; I wonder if functional code exhibits this hard-to-write, easy-to-read behavior in more than just this case. My solution is not quite the best, since my pattern matching operates on the head of the list. It's probably better to match the list itself against []::_ and _, which was someone else's solution.

The imperative solutions in the comments have been predominantly longer than 4 lines, with a C# with LINQ solution being the shortest. What other tricks can people come up with? I think the use of zip in Python is interesting. It solves the problem in one line by unpacking the list of lists and passing the sublists to zip, a built-in function that does the transposition. Of course, zip is basically the problem objective itself, so it seems kind of like the cheap way out, and Python's * operator for unpacking the list makes it quite easy to do. F# has List.zip, but I have no idea if it can unpack the list like Python can.
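For reference, here is a rough sketch of that Python trick. The function name pivot and the sample data are my own, made up for illustration; the only real machinery is the built-in zip and the * unpacking operator.

```python
def pivot(rows):
    # zip(*rows) is the same as calling zip(rows[0], rows[1], ...).
    # zip yields tuples, so each column is converted back to a list.
    return [list(column) for column in zip(*rows)]


# Made-up sample data: two rows of three items each.
rows = [[1, 2, 3], [4, 5, 6]]
print(pivot(rows))  # [[1, 4], [2, 5], [3, 6]]
```

One thing to keep in mind is that zip stops at the shortest sublist, so a ragged list of lists is silently truncated rather than rejected.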

In other news, I finally got F# to work with Mono on my Linux computers; turns out all that I needed to do was install some libraries.

On C# and properties

I flew out to Seattle again this weekend for afternoon tea with Microsoft's CLR and C++ teams. The C++ developer I met was an interesting fellow, having worked on the MSVC++ compiler for over 14 years and other C/C++ compilers for even longer. I've never seen someone so passionate about compilers, especially not for C++, but he seemed to take the challenge of C++ and transform it into pure enjoyment.

At one point during our discussion over Thai food for lunch, we started talking about characteristics that constitute "high-quality code", as well as topics in programming languages. Of course, there was no way I could lie my way out and say that C++ is my favorite language (it's probably among my least favorites), but my mention of C# led to some discussion about why I liked it. I mentioned that C# was basically "Java done right", since Java is pretty clearly a direct predecessor to C#. Of course, no programming language is perfect, and he probably would have thought I was an opinionated jerk if I had said C# was.

Instead I decided to bring up something that I'd been thinking about for the past few days, with our software engineering team to blame. Essentially, I thought that properties should be able to encapsulate private variables properly for maintainers/fellow developers. Obviously they hide private variables and define the interface for designers of other classes, but for the designers of a given class, encapsulation is "broken." The maintainers of the code have two ways to access data (directly through the private variable and through the exposed property), which leads to some unintended problems:

  1. Finding references to a variable is separate from finding references to a property. The case where the two actually need to be distinguished is probably rare, at best. This is demonstrated by the popular use of properties that are simply one-line get/sets, which are no different from setting the private variable directly. Removing the private path of access would eliminate this problem, which is an irritation to anyone who might use "Find All References" in Visual Studio, as well as to any tool that performs static analysis on C# code.
  2. If a property does anything besides a one-line get/set, private access to the variable means some code that wasn't executed probably should have been. This was colorfully illustrated in our code, in which a property had a multi-page set method (which arguably shouldn't have happened, since properties are supposed to be light). Any single incorrect access to the private variable would have meant lots of code feeling lonely and would have resulted in a rebellion that would probably bring the program down.
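The second problem isn't unique to C#. As a minimal sketch, here is the same trap in Python, using its built-in property decorator. The Appointment class, its change_count counter, and both reschedule methods are hypothetical names made up for illustration; the counter merely stands in for whatever a heavier setter would do.

```python
class Appointment:
    def __init__(self, date):
        self.change_count = 0
        self.date = date  # goes through the property setter below

    @property
    def date(self):
        return self._date

    @date.setter
    def date(self, value):
        # Stand-in for the extra work a non-trivial setter performs.
        self.change_count += 1
        self._date = value

    def reschedule_via_property(self, new_date):
        self.date = new_date  # setter logic runs

    def reschedule_via_field(self, new_date):
        self._date = new_date  # setter logic is silently skipped


appointment = Appointment("2007-11-18")
appointment.reschedule_via_property("2007-11-19")
appointment.reschedule_via_field("2007-11-20")
print(appointment.date, appointment.change_count)  # 2007-11-20 2
```

The value changed three times (once in the constructor), but the setter only ran twice, and nothing in the class stops the third write from going around it.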

This is a problem that can probably be solved by simply prefixing the private variable with something (like _ or m_), but I can't say I'm a fan of that solution, and the variable still appears in IntelliSense. I tend to make the capitalization of the first letter the only difference (private members being camelCase and public being PascalCase) and steadfastly believe that prefixes are ugly. Of course, this is probably the only solution without changing C#, so I may have to adopt it in the future, but what fun is not changing C#?

The solution I suggested, which I hadn't completely thought through, was for C# properties to be able to embed variables in them. Today, C# properties can only have get/sets inside of them, but consider the following code:

public DateTime Date
{
    DateTime date;
    get { return date; }
    set { date = value; }
}

I think this provides an interesting, additional level of abstraction. If the date variable in the Date property is local to the property, it prevents access of any kind from outside the property; all access must go through the property. In a sense, you're almost creating a new class to wrap around the data (you can't actually do this in C#, since you can't overload the = operator). This means all reference searches would have to go through the property, which fixes both problems I listed above. Although it looks kind of funny, I think it would be a backwards-compatible change, since the private variable could theoretically still exist inside or outside the property, depending on the need to access the private variable directly.