Saturday, April 28, 2007

How to get the wrong people in your computer science program

An entertaining thread came up on the Something Awful forums that I had almost forgotten about. The thread is about those schools you see advertising on television ("Come to ITT Tech!!! Make some gamez!!!!!"); everyone who has seen those ads probably deduces that these schools aren't very good at what they do (why else would they be advertising?). The University of Advancing Technology made a highly amusing blunder on their website:

Eating is a serious pastime at UAT. Lunchtime is never a quiet, reflective time for college students in general. But at UAT, it's a noisy, rollicking journey down the highway of fun. It's not uncommon to witness students engaged in vigorous Guitar Hero contests, or watching the latest anime on the big-screen. Or you can eavesdrop on impassioned conversations about the merits of C++ versus Linux.

Enough said, I think.

Friday, April 27, 2007

Monoculture in open source software

Slashdot recently posted a story linking to an article about why Microsoft wins the development war so often. The excerpt from the Slashdot story:

"Microsoft offers the certainty of no choices. Choice isn't always good, and the open source community sometimes offers far too many ways to skin the same cat, choices that are born more out of pride, ego, or stubbornness than a genuine need for two different paths. I won't point fingers, everyone knows examples... The reality is that there are good, practical reasons that drive people into the arms of the Redmond tool set, and we need to accept that as a fact and learn from it, rather than shake our fists and curse the darkness."

I'm wondering how true this is for other people. Joel Spolsky posted an article last year about choices being headaches (ironically, talking about Microsoft!), but referring to GUI design rather than choosing software in general. Even though Joel is referring to the quirky shutdown menu in Windows Vista, I'd like to think it also applies to the context of the Slashdot article. I've noticed some projects fork off of others for no reason, or for the ones listed above (pride, etc.); but regardless of the reason, it feels like the open source world has way too many paths to pick from. And to top it off, sometimes none of those paths are even worthwhile!

Don't get me wrong, the concept of open source software is good (even software that's dependent on certain proprietary operating systems). I do think, though, that the whole "diversity" among existing software can be rather annoying. Quality software doesn't need to be forked (how many Firefox forks are out there, and who bothers to use them?). Maybe this is one reason why I'm still an avid Windows user; I have fewer decisions to make, and hurray for that.

Wednesday, April 18, 2007

AOL is in your chats, droppin' your packets

Since I started using Windows Vista a while ago, the AIM 5.2 + DeadAIM combination stopped working. DeadAIM is no longer maintained, and registering my key doesn't actually work. I started using shaim in its place (it just recently celebrated 1,000 revisions!), since I have a natural distaste for alien user interfaces, such as GTK on Windows (Gaim), Qt on Windows (Psi--not as related, I know), and whatever Trillian, AIM 6.0, and AIM Lite use. The reason I stuck with AIM 5.2 for so long was that it wasn't bloated, and it was simple; when AOL started adding in features from DeadAIM, I thought they made it look ridiculous, so I never upgraded.

In addition to using shaim, I also got commit access to its Subversion repository very recently, which makes this my first open source project. Unfortunately for all of you non-Windows users, shaim's UI is currently built on WPF, which means it's native to Windows. Now, this is a good thing (OS X users, see: TextMate); however, even though it's currently Windows-native, it's built on C#, so Mono compatibility (for non-UI elements) is very feasible; the UI is modular, and it can be rewritten using other toolkits/interfaces.

But now I'm getting off topic from what I actually want to mention; the UI rant can be saved for later. Recently, for our last OS project, I set up a Subversion repository to work with my roommate and left him instructions including the repository location, which was a file:// URI. The next day, he asked if I would be setting up a Subversion repository. What? I kindly informed him that he must have missed the message I sent him the night before, but he was fairly certain that he had read all of his messages. I sent him another one while he was in class, but apparently that didn't go through either.

I thought it might be a URL parsing error, since shaim previously had some problems with converting URIs to actual hyperlinks, but that turned out not to be the case. I followed the logic all the way down to watching Wireshark report the packet being sent out. Wireshark never reported the message coming back in! Apparently AOL's servers filter messages that contain file:// hyperlinks. Try it yourself! Sending file:// in plain text works fine, but shaim automatically parses all URIs into hyperlinks. If you try sending "file://test" hyperlinked to yourself, you'll only see one copy of the message--it never echoes! This is because AOL's AIM servers simply drop any packets that contain hyperlinked file:// URIs.

Is this really the best course of action? There isn't any sort of notification that the packet has been dropped, which is really user-unfriendly. AIM 6.0 doesn't automatically convert file:// URIs, but if you hyperlink one manually (for whatever reason), it still just gets dropped. It feels like there should be better ways to handle these potentially malicious hyperlinks. For example, why not offload the processing to the client? The client could either reject the incoming message (and send a notice! How novel!), or it could strip the <a href=... from the message. I'm guessing the AIM servers don't have the capacity for that kind of processing load, but the clients could do it easily. That would make it plainly obvious to everyone (senders, receivers, developers) what's going on, rather than dropping packets left and right and being so secretive about it.
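
For what it's worth, here's a minimal sketch of that client-side handling in C# (shaim's language); the regex and all the names here are my own invention, not shaim's or AOL's actual code:

using System;
using System.Text.RegularExpressions;

static class IncomingFilter
{
    // Matches <a href="file://..."> links; the pattern is deliberately
    // naive and is only meant to illustrate the idea.
    static readonly Regex FileLink = new Regex(
        "<a\\s+href=\"file://[^\"]*\"[^>]*>(.*?)</a>",
        RegexOptions.IgnoreCase);

    // Replaces a hyperlinked file:// URI with its plain text plus a
    // visible notice, so the message still arrives.
    public static string Defang(string html)
    {
        return FileLink.Replace(html, "$1 [file:// link removed]");
    }

    static void Main()
    {
        string msg = "see <a href=\"file://test\">file://test</a>";
        Console.WriteLine(Defang(msg));
        // prints: see file://test [file:// link removed]
    }
}

Rejecting the message outright with a notice would work just as well; the point is that somebody actually finds out.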

Tuesday, April 10, 2007

The perfect programming language

This really can't end well, because I haven't discovered one thus far. I'm sure lots of people in any language's camp would beg to differ, but whatever. This is clearly going to be a post based purely on opinion, because the definition of a perfect programming language is incredibly subjective. Personally, I tend to put a lot of emphasis on the aesthetics of a language. Don't get me wrong, features are obviously vital, but oftentimes it boils down to how languages look compared to their cousins (there are times when I'll argue in the feature field, usually about Java and C#).

I'm not trying to put any specific language in the spotlight, but is it just me, or do imperative languages look nicer than functional languages? Having grown up on imperative languages, maybe I'm biased; I know there are people out there who prefer some icky, icky-looking languages like Scheme as well, so maybe it all depends on how we're cultured. A while ago, after talking with a friend from IU, I decided I would finally stop being [as] lazy and pick out a functional language to dive into.

Now, being the person that I am, I like to analyze things to the point of death (read: complain) sometimes, so we spent the night looking at different languages on Wikipedia, after talking about how he didn't "get" imperative languages and I didn't "get" functional languages (even though he actually started on C++). We looked at languages such as Haskell, Lisp, ML, and OCaml before I decided that I thought functional languages were just ugly.

I have nothing really specific to complain about, either; I think they all just look really alien or messy. What I'm really wondering is: do functional languages purposely try to look as foreign as possible compared to imperative languages? After picking OCaml as the language and going through some tutorials, I noticed a lot of concepts that are shared between OCaml and various imperative languages. It seems like OCaml just [mostly] removes mutability, lets you pass functions to functions (which enables currying, probably the coolest feature I've seen), and adds full type inference. Since OCaml is impure, unlike Haskell, it does allow mutability in records (structs), references, and arrays.
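
To see how un-alien those concepts actually are, here's a tiny sketch of passing a function to a function in C# 2.0 terms; the delegate name Fn and the method Twice are my own inventions, not from any tutorial:

using System;

// Fn is a hand-rolled delegate type: "a function from int to int."
delegate int Fn(int x);

static class HigherOrder
{
    // Twice takes a function as an argument and applies it two times.
    static int Twice(Fn f, int x)
    {
        return f(f(x));
    }

    static void Main()
    {
        // An anonymous delegate (C# 2.0) standing in for a lambda.
        Fn addOne = delegate(int x) { return x + 1; };
        Console.WriteLine(Twice(addOne, 5)); // prints 7
    }
}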

It seems like a lot of the examples that are given in the OCaml tutorials could be expressed similarly in C#; however, there are still [obviously] advantages of using OCaml for functional programming.

  • OCaml is optimized as a functional language. For example, the CLR performs its .tail MSIL (now apparently called CIL) instruction sub-optimally. A developer for Nemerle commented on writing manual tail call optimizations, which eliminated the use of .tail in favor of a simple jump instruction, but pointed out that mutually recursive methods are still not optimized.
  • Currying in C#, while it can work, doesn't work very well. You have to create your own classes for generating lambdas via anonymous delegates. The main article shows a type-unsafe way to do it, as pointed out in the comments; in order to provide type safety, you would have to add a type parameter for each argument to the Curry method, which could get cumbersome (e.g. Curry<K1, T1>, Curry<K1, T1, T2>, etc., where K1 is the return type and the Tn are argument types). A sketch of the type-safe version follows this list.
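
For the two-argument case, here's roughly what that type-safe version might look like, as a sketch using C# 2.0 anonymous delegates; the delegate and method names are mine, and since the Func<> delegate family only arrives with .NET 3.5, I define custom ones:

using System;

// Hand-rolled delegate types; .NET 2.0 has no general Func<> family.
delegate TR Fn<T, TR>(T arg);
delegate TR Fn2<T1, T2, TR>(T1 a, T2 b);

static class Currying
{
    // Wraps a two-argument delegate into a chain of one-argument ones.
    // Every extra argument needs another overload with one more type
    // parameter--the "cumbersome" part mentioned above.
    static Fn<T1, Fn<T2, TR>> Curry<T1, T2, TR>(Fn2<T1, T2, TR> f)
    {
        return delegate(T1 a)
        {
            return delegate(T2 b) { return f(a, b); };
        };
    }

    static void Main()
    {
        Fn2<int, int, int> add = delegate(int a, int b) { return a + b; };
        Fn<int, int> addFive = Curry(add)(5); // partial application
        Console.WriteLine(addFive(3));        // prints 8
    }
}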

On the other hand, I think C# in the general case brings a lot more to the table concerning syntax (aside from full type inference, though it seems that C# 3.0 is moving in that direction). I definitely like the C-family syntax a lot more (inherited bias); it feels terser to me than the ML-family syntax. I also think it's kind of silly how OCaml uses a semicolon to separate statements, with double semicolons to end toplevel phrases. Using braces feels more sensible to me, or at least something to enclose a block.

Along the same lines, I've never been a big fan of optional parentheses for calling functions, either. I first encountered this when learning Ruby and thought, "Cool!" However, after a while, I changed my mind. It feels like optional parentheses exist just for lazy people, and they don't really increase readability; rather, they encourage laziness where applicable (I guess the space bar is a lot easier to hit than the parenthesis keys). I've never really gotten why Lisp and Scheme put the parenthesis before the function name, either. Of course, I know absolutely nothing about the Lisp family, so I'm sure there's a reason, and I'll get around to learning it. But really, what's so bad about "fun(arg1, arg2, arg3)"? In high school and onward, we're taught that functions are represented exactly like that, so I would think that using that representation would make more sense.

So why can't a language like C# take on the cooler features that are present in OCaml? Here's a [small] list:

  • Type inference - I know C# 3.0 is working towards type inference, as mentioned above; I don't think it will ever reach the point of full type inference like OCaml's. I had the opportunity to sit in on some language design meetings (and didn't understand much of them), but from what I understand, OCaml accomplishes full type inference by having literally no implicit casting at all, including promotion of numeric data types.
  • Currying - Maybe it's just that I think this is really, really cool; I haven't programmed anything major in OCaml, so I don't know just how valuable currying is (the examples provide some interesting uses, but they're just that--examples). Looking at Sriram's article, it seems like it would be possible to generate the currying functions he wrote on the fly, rather than forcing the user to write them all.
  • Support for more optimal tail recursion - From what I understand, recursion and tail recursion are used very often in functional programming languages. Thus, providing a better optimization for tail recursion would get some imperative programmers thinking more functionally (see the sketch below).
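
To show what's at stake, here's the kind of function in question, as a C# sketch of my own (not from the Nemerle discussion); every recursive call that isn't rewritten into a jump burns a stack frame, while OCaml would run the same function in constant stack space:

using System;

static class TailSum
{
    // Tail recursive: the recursive call is the very last thing the
    // method does, so it could be compiled as a jump instead of a call.
    static long Sum(long n, long acc)
    {
        if (n == 0) return acc;
        return Sum(n - 1, acc + n);
    }

    static void Main()
    {
        Console.WriteLine(Sum(10000, 0)); // 50005000; push n far enough
                                          // and an unoptimized build can
                                          // overflow the stack
    }
}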

Interestingly, the comment about Nemerle brought me to their website. In their own words:

Nemerle is a high-level statically-typed programming language for the .NET platform. It offers functional, object-oriented and imperative features. It has a simple C#-like syntax and a powerful meta-programming system.

It sounds a lot like what I'm looking for, but I wonder if I'll actually like it. I'll put it on my list of languages to learn/look into; I'm still going to finish going through OCaml tutorials, at least. Unfortunately, there won't be much of that until school is out.

Thursday, April 05, 2007

Java as a first programming language

A thread came up recently on the Something Awful forums about a high school, second-year computer science/programming course. The author was requesting help to convince his teacher to switch said course from VB6 to C#. There were a number of suggestions and alternatives given (Java, VB.NET, Python, Scheme), but that's not as relevant to the CS environment at Purdue (the thread is here if you're interested). I got into a couple of discussions about C# and Java, and it got me imagining again what Purdue would be like if we didn't teach Java as a first programming language.

The issue I brought up was an argument against teaching students how to write procedural code in a purely object-oriented language. In particular, someone suggested that teaching procedural code in C# was easy, since you could simply add the static keyword to class methods to write non-object-oriented code. I argued that making students use the static keyword before telling them what it means is a particularly bad idea. I've never been a fan of teaching stuff out of order (who is?), but it seems impossible to teach things in order in Java. This is plainly visible in Java's hello world program:

class Main {
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}

This is incredibly ugly, and C# is guilty of the exact same thing. For beginners, I don't think this is acceptable. Unfortunately, every compiled language that I know of is guilty of this to some degree; scripting languages, on the other hand, can do hello world in one line. When you introduce this Java program to a student, here is what new programmers tend to wonder:

  1. What is public?
  2. What is static?
  3. What is "main"?

The list can continue (what is System, out, println, etc.), but the point is, there are a lot of questions to be answered that can't be answered without covering material that they aren't ready for. In particular, I believe the static keyword is a real killer, even for some people who've programmed before (obviously not in depth). Instead of reiterating everything I said in the thread, I'll just copy and paste like the resourceful (read: lazy) person I am:

I think the static keyword versus other language syntax is slightly different. Why should you have to teach the static keyword first in an object-oriented language? In terms of thinking about objects, static "breaks" the OOP paradigm. One of the C# compiler devs even argues that the static keyword shouldn't exist. Obviously you need some way to differentiate between instance and static methods, but I never really liked the use of the word "static" to do so. Maybe it's just me, but when I hear the word in a general context, I take it to mean either static as in TV static, or static as in unchanging, the latter thanks to static versus DHCP IPs. Are people really supposed to be able to tell what "static" means, just offhand?

OOP languages like C# and Java are actually backwards in terms of expressing methods and their arguments. If you were to write OOP code in C, the "instance" methods would be the ones with the extra typing (the first parameter being the object), and the static methods would simply omit it. Declaring methods in Python works the exact same way: all instance methods in Python begin with a "self" parameter, whereas static methods simply don't have one. You can also call instance methods with static syntax, which gives some insight into how OOP methods are actually defined.

As far as learning is concerned, I think Python is better for teaching the difference between static/instance methods (other topics are debatable). I really don't like telling my students, "oh, don't worry about this huge, gigantic header you have to write for your main method--or what a main method actually is. You'll learn that later!" {}, (), [], and <> are introduced in a more appropriate order (though funnily enough, I do have students who still don't get the paren), so I don't think that's [as] relevant.
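
To make the first-parameter point concrete in C# terms (a sketch; the Counter class is made up for illustration), here's the same operation written both ways--the instance method just has its first parameter hidden:

using System;

class Counter
{
    public int Count;

    // Instance method: the compiler secretly passes "this" for you.
    public void Bump() { this.Count++; }

    // The same operation "C style": a static method taking the object
    // as an explicit first parameter, the way Python's self makes visible.
    public static void Bump(Counter self) { self.Count++; }
}

static class Demo
{
    static void Main()
    {
        Counter c = new Counter();
        c.Bump();                   // instance syntax
        Counter.Bump(c);            // static syntax, same effect
        Console.WriteLine(c.Count); // prints 2
    }
}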

A post from another user probably summarizes very well:

Introducing students to computer science through languages which introduce syntax issues long before they introduce computer science concepts is a recipe for failure, and I completely agree. Aside from the utter boneheadedness of an "objects early" or "objects first" approach, you're giving students a gigantic chunk of boilerplate code and not explaining what any of it does. Seriously, just think about it: a student is not going to be thinking in object-oriented terms right off the bat, so how do you explain what this "public class" and "public static void main" business is? The Javaschool answer is "we don't, we just tell them that it's not important and they'll understand it later." This is not an acceptable approach.

From what I've observed, Java is taught at Purdue (and other schools) for a few reasons:

  • Java is the number one language used, at the moment
  • Java is heavily involved in other courses (for Purdue, often data structures and compilers), and not teaching Java would have a severe impact on the curriculum
  • It's a good gateway to other statically typed languages, unlike Python and company, because of syntax similarities

There was, however, a comment by the same user above, regarding Java in computer science programs:

I've been doing a lot of research into computer science education recently (I'm more or less rewriting a computer science curriculum for my college), and there have been numerous studies which show poor computer science retention rates in Java-based computer science programs (equivalent to ACM CS1 core) that have improved substantially when the course syllabus was switched to a simpler language like Python or Scheme which allows students to focus on computer science concepts rather than fighting with language syntax. Pummel them with Java later.

While the idea of teaching a language that isn't statically typed bugs me, maybe that's just my upbringing in C. As far as static methods in classes go, I would certainly agree that Python does a better job of getting the idea across, as I said in my quote above. However, I also think Python offers a lot of freedoms that aren't available in C# and Java (as do most scripting languages), which might increase the difficulty of later learning statically typed OOP; never mind the syntax differences between Python and the "C-family" languages.

On the other hand, Python's enforced block indentation also brings something refreshing to the table. I've seen some extremely bad indentation in Java (literally different levels on each line in a block!), and I think Python has taken a good step by enforcing uniform indentation. Unfortunately, we don't see this in most languages, so it's a nice plus for using Python as a learning language.

Unfortunately, I'm still not sure teaching Python as a first language is a good idea. I sure don't like Java (for anything), but then again, I don't know of any language I would rather teach students whose advantages are substantial enough to merit a change in a class syllabus. I don't think functional languages will ever fly here, regardless of how pretty or ugly they look; the idea of CS180 is to prepare students for CS240, which transitions from Java to C. I don't think the Pythonic way of doing things would prepare students for C; it may teach you how to program, but I'm guessing it would just pass the problem along to the CS240 instructors. And we all know what kind of monster that would awaken.

Monday, April 02, 2007

Apparently computer science at Purdue sucks (Part 2)

Here's a response to several posts (linked) about Purdue CS from "I'm a 10!":

Non-CS CS students

I'm sure the statistics agree that the percentage of students who enter without any computer science background and stay is much lower than the percentage of students who enter with prior experience and stay. I would also argue, however, that part of it is because those students enter CS without knowing what they really want. I've used this argument a lot, and it might not be true [anymore], but I think a lot of students come into our department because they think they're good with computers; this is patently flawed reasoning. I think people interested in CS should try to get into programming before they leave high school, to see if it's really for them. Of course, sometimes there's no formal way to do so.

While I'm one of the fortunate few whose high school offered AP Computer Science, not everyone has that option. One of our CS180 lab instructors came to Purdue with no formal experience, because his school didn't offer it. This is a commonly cited reason for not having experience, but I don't buy it as an excuse. In most situations, I find it hard to believe that a high school student doesn't have the time for some self-teaching; I also believe that if a student is unable to teach him or herself, it might be a sign; after all, software engineering is hard. This is not to say that it's impossible, or that people entering CS at Purdue with no experience will fail (a counter-example is provided above), but it's not making things easier on anyone. To me, computer science is one of the few areas where you can get some real, practical knowledge before entering college. I'm fairly ignorant about other fields, so it could be one of many.

Of course, to shift all of the blame onto the students is probably biased elitism. I think the curriculum could use an overhaul, but such a task is time-consuming, bureaucratic, and not likely to happen. There are a lot of issues at hand, and I daresay that it's impossible to solve them all. For starters, not all professors who teach classes want to be there. From what I understand, all professors are supposed to teach at least one class a semester, but not all professors are interested in teaching, much less teaching undergrads. The same goes for TAs; I've heard of a number of graduate TAs who don't seem to put much effort into their teaching, or whose knowledge of the subject is restricted to an inapplicable domain, or who have some other random problem. Is it impossible to find grad students who care and actually know something? I don't think so. I've come across several good TAs over the years; perhaps they're just a minority. In reality, this applies to undergrad TAs as well. I've seen some TAs who either don't care, don't actually know enough to be very good TAs, or just think that students who ask questions are dumb. Since I think I'm taking over as lab admin next semester, I sure hope I get a say in who will be on my lab staff.

Course syllabi also vary significantly from professor to professor. While I understand that professors have different teaching styles, I don't think it's a very good idea to have such a significant difference. For example, I took CS251, our data structures class, in the fall, which is off-semester. The class was a piece of cake and I hardly had to study for it. The kids who took it in the spring had incredible amounts of work, including building a web server using certain data structures, etc. While I think it's cool to work on practical projects like that, I think it's unfair to the students who have to do significantly more work for less return. The objective of the course should be learning about data structures, their run times, and how to implement them. All of the extra stuff is nice for the very advanced students, but adds incredible stress to everyone else.

Administration

I can't really comment much on this, since I'm not in the USB and have zero interaction with the higher administration, but I'm guessing it's related to my statements above, since the higher administration is a subset of the faculty. I have no idea what caused the strained relationship between the administration and the USB, but I really fail to see why bygones can't be bygones. It seems that both the students and the faculty think they know what's best for the students, which just causes a deadlock in improving our CS curriculum.

The new building

To some degree, I agree that the new building has had a negative impact on what little sociality our CS department had. I think the lack of a new undergrad lounge is kind of a moot point, though, since it's [kind of] migrated to the small area outside the advisors' offices, as well as to the computer labs, where a lot of people seem to do work. I'm pretty sure the administration knew the commons wouldn't be occupied by many CS majors, but I guess they didn't have much else to do with that space besides make it look pretty. Besides, the commons brings girls into our building; what can be so bad about that?!

My real issue with the new building is the bureaucracy that's been added to the advisors' offices. There's a secretary outside of that block now, and we have to sign in and out and set up appointments to meet our advisors; I would label this as being very close to the pinnacle of ridiculous. I don't think the advisors have ever had any trouble doing all of the work they needed with an open-door policy (which closes when stress levels rise), so who thought it would be a good idea to formalize social interaction with our advisors? I used to visit my advisor very frequently, and now it's reduced to one visit every few months; usually it's just to schedule my next semester. This sort of alienation between the students and the advisors is the last thing the department needs.

Apathy and entitlement

Apathy, as partly mentioned above, is a big kicker for both the students and the faculty. The faculty might not care, but the students often can't muster more motivation than to complain on a newsgroup or to their own peers. The CS feedback panel hosted by the USB every year sees little attendance from students who have issues with the way things are done, and sometimes the ones who do show up have very little constructive feedback to give. Even if there were larger attendance and more feedback, would the faculty even care? Does anyone read their course feedback? I've read mine from teaching a CS158 (C for engineers) lab as well as for MA366 (ODE), but I've never really seen anyone in CS180 actively caring about what sort of feedback TAs get. Is it like this for all classes/professors? No, but probably a majority.

My own conclusions

Despite all of this prose looking like a bucket full of complaining, I'm actually not as dissatisfied with CS as I seem. I've passed all of my classes and have started learning new stuff lately, which is good. Education, and how to teach better in a CS180 lab, are specific areas I'm interested in as a TA. I feel like it's my duty to try to reduce the number of people who drop out of CS because of CS180, but what's the use if they just drop out when they get to CS240?

I've jumped to conclusions before, but I wonder if other universities have these problems. It seems like a university with a CS department that genuinely cares about its undergrads would be a utopian department. Unfortunately, I don't think these universities exist.