On Further Study of Operating Systems

July 22, 2021 — Josh

Last semester I was a teaching assistant for our operating systems course at Hunter College, g0oD TiM3s. I wrote this message to the class and figured it would be cool to share it here too.

I also have a bunch of related links and so on saved for myself here, if this is something you're interested in.


Hi all,

This is Josh, one of your TAs. Thank you for a great semester. I hope you found the course interesting and learned something new. This is a post about approaches you could take to learn more about operating systems on your own time (and links to a bunch of resources). Note that this has nothing to do with the final exam, so please ignore this message until you have time to read it.

I'll start with a classification of operating systems courses by John Lions:

There seem to be three main approaches to teaching Operating Systems.

First there is the “general principles” approach, wherein fundamental principles are expounded, and illustrated by references to various existing systems. This is the approach advocated by the COSINE Committee, but in our view, many students are not mature or experienced enough to profit from it.

The second approach is the “building block” approach, wherein the students are enabled to synthesize a small scale or “toy” operating system for themselves. While undoubtedly this can be a valuable exercise, if properly organised, it cannot but fail to encompass the complexity and sophistication of real operating systems, and is usually biased towards one aspect of operating system design, such as process synchronisation.

The third approach is the “case study” approach. This is the one originally recommended for the Systems Programming course in “Curriculum ’68”, the report of the ACM Curriculum Committee on Computer Science, published in the March, 1968 issue of the “Communications of the ACM”. Ten years ago, this approach, which advocates devoting “most of the course to the study of a single system” was unrealistic because the cost of providing adequate student access to a suitable system was simply too high.

This quote is from 1976, but for the most part it still holds true. Our course was of the first type, a (relatively in-depth) survey of general principles. As Professor Weiss said on the first slide of Chapter 1, "This is an overview of computer operating systems. It is not an in-depth study of them." Clearly, there is still a lot left to learn. Here are some ideas:

Continuing Our Approach

As Professor Weiss mentioned at the end of class today, our course has covered the textbook chapters on processes. The other half of the chapters are, for the most part, centered on hardware-related topics (Main Memory, Virtual Memory, Mass-Storage Structure, I/O Systems) and other fundamental operating system structures (File-System Interface, File-System Implementation, File-System Internals, Security, Protection, Virtual Machines, Networks and Distributed Systems). Reading the remainder of the textbook seems like a good place to start learning more about those topics. There are also publisher slides for those chapters linked on our course's home page.

The "Case Study" Approach

The classic "case study" operating systems text is John Lions' A Commentary on the Sixth Edition UNIX Operating System -- it's an analytical commentary on the Version 6 UNIX source code (chosen because of its simplicity and conciseness). You can find a PDF of the book here.

The V6 source code is quite old, written in a pre-ANSI dialect of C, and meant to run on a PDP-11. A modern rendition of this code exists and is used in many operating systems courses today -- it's called xv6, and you can find more information about it here. The Wikipedia article on xv6 links to many pages of courses using xv6 to teach, so maybe following along with one of those courses would be a good way to proceed.

Another choice is Andrew Tanenbaum's text based on MINIX, a teaching operating system he wrote. The text is called Operating Systems: Design and Implementation. In class we discussed a flame war between Tanenbaum and Linus Torvalds regarding microkernels vs monolithic kernels -- they were having this conversation in the first place because Torvalds developed Linux on MINIX and was inspired by it (Linux is a from-scratch kernel rather than a rewrite of MINIX, but the influence is real). As a direct inspiration for Linux, MINIX has historical significance. According to Tanenbaum, this book is somewhere between a practical and theoretical study of operating systems.

The last choice I will mention is the most modern and maybe the most pragmatic. The Design and Implementation of the FreeBSD Operating System, as the title says, is a text on FreeBSD, which itself is a modern free and open source Unix-like operating system. Along with the FreeBSD source code, this might be a very good learning resource.

Systems Programming

Many of the code examples we've seen in class this semester revolved around using the API presented by the operating system (i.e. system calls) to perform some function. This is how system programs are written (recall the definition of a system program, for example stuff in /bin like ls, bash, etc). Now that you are familiar with some of the facilities provided by the operating system, you can use them to write such utilities yourself. Maybe this is more of interest to you than big ideas about operating systems. This is also a good way to get more familiar with C, as most system programs written for Unix-like systems are written in C.
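To make that concrete, here is a minimal sketch of the core loop of a cat-like utility, built only from the read and write system calls. The function name copy_fd and the 4096-byte buffer are my own choices for illustration, not taken from any real implementation, and the error handling is deliberately bare.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd using only
 * read(2) and write(2). Returns 0 on success, -1 on error (errno is left
 * as set by the failing call). */
int copy_fd(int in_fd, int out_fd)
{
    char buf[4096];                 /* buffer size is an arbitrary choice */
    ssize_t nread;

    for (;;) {
        nread = read(in_fd, buf, sizeof buf);
        if (nread == 0)
            return 0;               /* end of file */
        if (nread < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }

        /* write() may accept fewer bytes than requested, so keep
         * writing until the whole chunk has gone out. */
        ssize_t off = 0;
        while (off < nread) {
            ssize_t nwritten = write(out_fd, buf + off, (size_t)(nread - off));
            if (nwritten < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += nwritten;
        }
    }
}
```

A cat-like main could open each file named on the command line and call copy_fd(fd, STDOUT_FILENO) on it, or call copy_fd(STDIN_FILENO, STDOUT_FILENO) when given no arguments.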

A good way to go about this may be to read the source code of common command-line utilities, and to try to re-implement them by yourself. Parties responsible for different operating systems may write utilities in drastically different ways -- specifically GNU utilities were "often intentionally written in an odd style to remove all questions of Unix copyright infringement at the time that they were written", so keep this in mind when reading. Here are links to utilities from FreeBSD, GNU, Plan 9, and The Heirloom Project. You can likely find other implementations online. Here's a link to a YouTube channel with videos of implementing a few utilities (including cat, chmod, echo, and mkdir).

The Unix Programming Environment by Kernighan and Pike is a great book that may also come in handy. I've personally read half of this one so I can recommend it. Another one is The Art of UNIX Programming by Eric Raymond. Supposedly Advanced Programming in the UNIX Environment by W. Richard Stevens is a canonical UNIX systems programming book as well.

Two more useful documents are Command Line Interface Guidelines ("An open-source guide to help you write better command-line programs, taking traditional UNIX principles and updating them for the modern day") and perhaps In the Beginning was the Command Line.

Don't forget that the professor has written a bunch of demos for us in the /data/biocs/b/student.accounts/cs340_sw/demos directory -- these are a great resource. He's also taught a UNIX systems programming course before: here is its webpage.

Alternative Operating Systems and Languages

You may notice that there has been a strong emphasis on UNIX specifically in our course and in most of the resources I have linked. UNIX struck a balance between simplicity and expressiveness, and for historical reasons has persisted until now despite being created in the early 1970s. It may or may not be the be-all and end-all of operating systems, though.

Moving Past C?

Because UNIX and C co-evolved at Bell Labs, they share a heritage, and UNIX and its system utilities are mostly written in C. There are pros and cons to this.

  1. C and UNIX were built with each other in mind, so they are a good match.
  2. C is a thin layer over the instruction set of the PDP-11, the machine UNIX was built for. This means that on a PDP-11 you will for the most part know exactly how your C program will operate on your computer, which is great for those who have a thorough understanding of how their computer works (the language grants you all the control you need). But I'd wager that your computer is not a PDP-11. On many modern computers, C is no longer a low-level language, as it does not reflect the underlying architecture and how the computer works.
  3. C is fundamentally unsafe and insecure (e.g. buffer overflows can be exploited), and because of this many of the foundational technologies our society is built on are insecure, too. This is one of the main inspirations for new languages like Rust, which aim to be safe from the start (and many systems and utilities are being re-written in such languages).
  4. C's extensive usage of pointers makes some optimization and parallelization very difficult or impossible.
  5. C is an imperative language built (for the most part) for the von Neumann model of computing. If one wants to experiment with different ways of thinking about languages and architectures, C is not the place to look.
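Point 4 is easier to see with a small example. The sketch below shows the same loop twice: once with plain pointers, and once with the C99 restrict qualifier, which is the programmer's promise that the pointers do not overlap. The function names are made up for illustration, and whether a given compiler actually produces faster code for the second version is not guaranteed.

```c
#include <stddef.h>

/* Plain pointers: dst, src, and k are allowed to overlap. The compiler
 * must assume that a store to dst[i] might change *k or a later src
 * element, so it generally has to reload *k on each iteration or guard
 * any vectorized version with runtime overlap checks. */
void scale_add(float *dst, const float *src, const float *k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += *k * src[i];
}

/* Same loop, but restrict (C99) promises that the three pointers refer
 * to non-overlapping memory. The compiler is then free to keep *k in a
 * register and process several elements at once. Calling this with
 * overlapping arguments is undefined behavior. */
void scale_add_restrict(float *restrict dst, const float *restrict src,
                        const float *restrict k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += *k * src[i];
}
```

With non-overlapping arguments the two functions compute the same result; the difference is only in what the compiler is permitted to assume. Languages with stricter aliasing rules give the compiler this information without the programmer having to promise it by hand.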

Alternatives

In short, the above is meant to communicate the idea that there are alternatives to C and UNIX, and we should look for them and study them to get a better idea of what is possible.

Here's a start:

There are also experimental Unices (Unix-es?) and kernels, such as the GNU Hurd, Mach Kernel, and Haiku.

You may also be interested in Multics, which was a complex predecessor to UNIX. There were a lot of great ideas in Multics that made it into UNIX, and perhaps some that did not. Tom Van Vleck's page on Multics, https://multicians.org/, is the canonical place to look. For references to other documents check out the Multicians Bibliography. Also see https://ban.ai/multics/doc/.

General Resources

That's all. Thanks again for a great semester, and thank you for reading. If you need any help locating resources I didn't directly link to, or if you have any questions or just want to discuss further, send me an email. I look forward to hearing from you!

Josh