The Project "SICP-2020".

1 TL;DR: Project artefact summary for the impatient.

2 Abstract

At some point in my life I decided that I need to know programming well in order to navigate the modern world.

This page describes the project that I undertook in 2019-2020, with the goal of modernising computer science education. Its purpose is to collect in one place links to all artefacts that were produced during the project execution, with commentary.

The object of study was one of the most famous programming problem textbooks, the "Structure and Interpretation of Computer Programs", Second Edition, by Abelson, Sussman and Sussman. At the moment of writing of this document, 24 years have passed since the release of SICP, and many things may change in computing in such a time frame. Hence, apart from looking at the book itself, it was interesting how much effort would be additionally required in order to replicate the experience of the era when the book was still fresh.

This documents tells the story of this endeavour. Initially, there was no plan for making any kind of comprehensive post-mortem of the project. However, the project turned out so long, so labour-intensive, and eventually, so fruitful, that making a summary has arisen to be necessary, even if for myself and my readers only, and only to list artefacts.

In short, I have solved all of the SICP's problem set in a consistent and reproducible fashion, using modern software, filling the gaps in supporting software and libraries, and documenting as much of my work as possible, in particular how much time every problem required, and how much external help was requested.

I tried going through the book honestly, without cutting any corners whatsoever. This would give me a chance to learn the subject as it is presented, along with all the technologies that the book may not touch directly but which would prove to be necessary.

As a result, I have produced, quite unexpectedly, a far greater amount of artefacts than I had expected.

Some philosophers would argue that people in general live by producing stories about themselves, and telling those stories to other people. This is my final "story" of this project, containing links to all the sub-stories of the project.

Interested readers are invited to read the Experience Report, as published in the Proceedings of the International Conference on Functional Programming 2020, archived as the University of Michigan Technical Report CSE-TR-001-21.

In short, it required me 729 hours to go through the SICP.

3 The Story

3.1 Time Management Video Tutorial

In order to find time, as a working person, I needed a more systematic approach to time management. Since I had already chosen org-mode as the solution medium, using org's time-management facilities was an obvious choice.

Following the general principle that learning something is the easiest when you are teaching it, I organised a seminar/lecture. This lecture was well-received by the audience, because it was given during the time of the corona-virus pandemic of 2020, so quite a lot of people found themselves locked in their home, and in the need to structure their time more efficiently.

This approach is not unlike the one outlined by John Lees-Miller (https://github.com/jdleesmiller) in his article "How a Technical Co-founder Spends his Time".

There is no presentation file, because it was a hands-on tutorial given right into the time management software.

3.2 SICP Solution

In short, I have solved all of the problem set, measuring how much time every problem required. Furthermore, I had to write and publish several libraries, in order for the solution to be runnable on modern Schemes, and be as portable between them, as possible.

3.2.1 Solution

Eventually, there were two artefacts produced from the solution of SICP:

The org-version is strongly preferred to the pdf version, because the pdf version requires certain uncanny tricks to get built, has no added value and is 5000 pages long.

To use the org-mode one, you need chibi-scheme of a sufficiently recent version, ImageMagick, as well as GNU Fortran for the last two exercises. Emacs is also strongly recommended.

3.2.2 Schemetran

During the solution process, one of the exercises requires writing a Scheme interpreter in "a low level language of your choice". In my case this choice happened to be Fortran, partly due to a relatively greater popularity of Fortran in 1996, partly due to relatively straightforward memory management. It is a toy implementation, not recommended for any serious applications. It's likely leaking memory and compares symbols in \(O(n)\).

3.2.3 SRFI 203: A Simple Drawing Language in the Style of SICP

Scheme Requests for Implementation is the Scheme's equivalent of XEPs or PEPs or JCPs. In order to make possible working with graphics in Scheme, I had to implement several interfaces assumed to be "given" in SICP. The graphics sub-library found its place as SRFI-203:

3.2.4 SRFI 216: SICP Prerequisites (Portable)

I also had to write a support library that packages the functions that are already available on modern schemes. (Unlike SRFI-203, which implemented an unportable subset.)

There review process has been started on 2020-11-15. There is also a Lay Summary in English, and in Russian.

3.3 ICFP Paper

Since this took so long and made me think too much, I decided to analyse the solution process and to document it for future reference. The result of this analysis ended up being substantive enough for a whole "scientific" paper, and was later presented at ICFP 2020, Scheme Track.

The papers were "published online", which means that ACM is going to maintain the website with a lay summary for a while, and hope that the papers will be mirrored by the major publishers… I guess.

In any case, below you can find the:

3.4 ICFP Presentation

Every "scientific" paper nowadays needs a supplementary paper to explain what it is actually about. This report is not an exception.

4 The Post-Mortem

There are things that divide your live into "before" and "after".

I had originally planned to write a review on SICP within the loosely defined series of book reviews that had started with two books on writing Scientific Software, one by Rouson and Xu, and the other one by Oliveira and Stewart.

I was naive. SICP proved to be a significantly more influential material… I had started writing a review several times, but failed at each of those, and eventually ended up just creating this file, where I just listed the artefacts created in process.

I guess, some things just are too big to have a review written about them. Maybe it is a peculiar particular case of the uncertainty principle. You can write a review as long as the length of the process you are writing about is short compared to the length of your life. Then you can truly "see the object through the lens of your life". However, if the process takes too much time and effort that it stops being small compared to your life, writing a review becomes similar to a process of solving a coupled system of equations. The "review" would be an image of the object seen through itself.

So at the end of the day I just decided to attach the two previous review attempts to this document, and let them be. After all, perhaps every nice project must leave something undone… just as a hook in the memory.

4.1 Attempt 1

4.1.1 Background

This story started a long time ago, in 2013, when I just entered a Post-Grad programme at one world-renown university. I was writing a study program for myself, aimed at filling the gaps that the previous education had left, as well as outlining the potential future research and engineering directions.

Functional programming had been on my list of interesting things for a long time, and I even had a book in mind, one highly regarded at one computer enthusiasts community that I had been a part of. The book was called "Practical Common Lisp", and it got my attention when a few of the frequent visitors of an online community I used to skim through at that time were making fun of the idea that anything Lisp-related may be to any extent practical.

There was, however, and additional level of impracticality. (Evil people rumour that there are many more, but I refuse to acknowledge their existence.) The hint was give in the name of the book discussed above. If that Lisp was "common", there must as well be "uncommon" one, right?

Indeed, such an uncommon Lisp exists. (In fact there are several.) The uncommon Lisp is called "Scheme", and has kind of an unhappy reputation. Practical programmers often say that it is too academic, whereas academics make a wry face, and say that it's a great language, and they would have written a lot of code in it, unless they had been so busy doing Science.

Science… indeed, science was the thing that brought me to Scheme. Apart from the ever-chasing me feeling, "what is it that they are actually talking about", I had been pointed towards Scheme by marketing.

It may seem strange, but the background is the following: I used to write a sizeable amount of neural-network and other statistics related code in Python and TensorFlow and Theano. Most of it was just exercises, but they gave me that pythonic experience that I had needed for a long time, but didn't have an opportunity to get.

And the most evident thing when using TensorFlow was that it in no sense actually fits into the model of being used through Python. The computational graphs and the delayed evaluation were almost by themselves begging for being an organic feature of a language instead of being an artificially plugged-in entity.

And then I remembered a feature from ages ago. I remembered a children's book (who can guess the name?) which was giving an entertaining introduction into computing; the book was classifying programming languages by their attribution to different tasks, and Lisp was ennobled with the "for Artificial Intelligence" title. Hmm, I thought. Why don't I find out what those people actually meant by Artificial Intelligence back then, in the Golden Era? Additionally, we had a course on functional programming in the Uni, and we were supposed to have something like an embedding Lisp into a Categorical Abstract Machine as one of the course assignments. Why don't I remember a thing from that course, I wonder?

So Scheme comes into play with several kinds of an impetus. The Artificial Intelligence of the day, childhood memories, friends' recommendations, university memories.

How did I choose the book? To my greatest shame, I don't remember. I remember someone saying that people who studied through Structure and Interpretation of Computer Programs were the best programmers he (or she?) had ever met. Who was this friend? My memory fails me. A man or a woman? Online or offline? It's a shame to loose memory at such a young age.

4.1.2 The first attempt

I started reading the book several years later, in 2016. In fact, I have read four chapters out of five, ignoring the last chapter, dealing with what I saw as assembly language tricks that I supposedly had already learnt back when it was my first university year. Frankly speaking, that reading was perhaps worth doing once because it taught me that by only reading a book the chances of learning anything are close to zero. I had remember spending a couple of weeks staring at my phone screen (I did read the book from a phone), and I had remembered a few function names, such as display, but almost nothing else.

4.1.3 Present days

Reading SICP was one of those things that kind of separate your life into before and after. It is strangely hard to say why. I mean, I could iterate over a number of reasons: Because it shows you almost the whole multi-layer structure of the computer world, from the silicon to the command line? (And even a little bit about graphics.) Because it makes you approach every piece of software with a grain of salt? That is, it makes you, at one hand, see how crappy software is even when the developers have spent a lot of effort on making it look like it isn't, and on the other hand, makes you feel the itching that is telling you "I would have fixed this bug faster than they would even understand what I am talking about"? Maybe because it, by hiding all the gory details that are unnecessary for the narrative under the carpet, hangs all this giant bag of heuristics right onto your neck, and makes yourself responsible for it?

Or maybe because it makes you feel how little you actually know, and possibly even never be able to know much more?

All of the above is true, but there seems to be still something that is hard to enunciate.

Anyway, for myself it was very important because it was one of the few things that I did almost without any imitation.

4.1.4 Imitation

Imitation plays such a large role in the modern life. It's strange to write such a platitudinous phrase in a personal blog describing a personal experience, especially the one talking this much about seeking the truth.

Computers make it astonishingly easy to tell lies. They also make impossible possible, and let you see the unobvious truth where it is otherwise hard to see, however this requires effort, whereas lying happens almost naturally in the computer world.

The biggest lie of computing is, perhaps, that computing is easy. It is not. No matter what Larry Wall or Guido van Rossum tell you.

SICP makes you feel this difficulty to a full extent. SICP is by no means the only book on programming in Scheme or programming artificial intelligence. The other ones are also not really easy, but they are not deliberately hard.

Is it a general rule that good textbooks are always deliberately hard? Landau-Lifshits comes to mind. I always hated it, because it just readily ignores a lot of under-the-carpet problems with the story it tells.

But maybe there is actually something in it? The textbooks almost grow in difficulty from "hard because badly written" to "hard because huge" to "easy because well written" and finally to "hard because the authors deliberately nudge you into thinking about something"?

4.1.5 How do you read textbooks?

When I was younger, I believed in "learning by attending classes", then in "learning by listening", then in "learning by reading…", I even used to be at "learning by doing".

Now I am at the stage of "learning by writing".

I guess, the next stage is "learning by teaching", and it all ends with "you're still be a fool at your deathbed".

4.2 Attempt 2

It is relatively easy to make a programming language that is only capable of running on a single machine, or only on machines of a single product line. In fact, making machine languages is a routine exercise in universities offering majors in computing. Even I did not evade this exercise, even though I was not a low-level programming major.

This, however, although pleasurable to machine engineers, limits the practicality of the language itself, as we usually want it to be useful on as many machines as possible.

So far, only the C language has really managed to maintain a noticeable connection to the machine hardware, while still remaining a popular option among programmers. Most of the other languages embrace portability.

In fact, they embrace it so much, that the actual notion of machine parts remains nothing more than a nuisance for ordinary programmers, so huge is the abstraction gap. These programmers reason with everyday things, such as texts, pictures, thoughts.

Let me say, that in my view, SICP is a book that tries to build the bridge between the "reasoning about everyday things" and "reasoning about electric signals". And exactly because the gap is so wide, the authors were faced with a difficult choice.