Skip to content

CS423

CS423 (Operating System Design) is a 3/4-credit-hour course that satisfies the Technical Elective requirements for ECE majors and satisfies an Advanced Computing Elective for CEs. It is offered only in fall semesters.

Content Covered

  • Interrupts
  • Concurrency, Synchronization, and threads
  • Scheduling
  • Debugging the Kernel
  • Memory Management
  • Virtualization, Virtual Memory
  • Disks and Files
  • Filesystems

If you are looking for another course like ECE391, this might not be it. Unlike ECE391, CS423 does not work with x86 or any other form of assembly; while there are code snippets in x86 in lectures and exam questions have asked about lower level topics 391 students built in MP3, the MPs themselves don't deal with hardware so much as drivers. The class takes on a CS-based, somewhat higher-level approach to kernel programming. The course uses the standard form of kernel C libraries, as used in the Linux Kernel. Most of the lower-level and hardware interaction is already covered by pre-existing C functions and libraries. The student's focus is on finding the best ways of implementing these functions from a design and engineering perspective. Lecture begins with an introduction to operating systems and kernel programming, including concepts such as processes, threads, and interrupts. The class soon dives into synchronization and scheduling (which is covered in more depth than in ECE391), followed by virtual memory (i.e. paging and memory) and filesystems. The weeks after Spring or Fall Break cover virtualization (i.e. virtual machines and virtualized environments) and distributed computing concepts, ending the semester with an overview of parallel computing and security in operating systems.

One good way to think about this class coming from 391 is a more broader and in-depth design oriented look at topics in 391 that were taught with just functionality in mind. For example, 391 taught how basic ext2 filesystems worked. 423 will teach how a disk works, why filesystems are the way they are, how these ideas evolved (from FFS to others), different ways of implementing them (from minor changes like extents instead of simple datablocks all the way to whole new systems like log-structured file systems), reliability (journaling, logs, how to protect information in the case of a crash as well as ordering and atomics), and then go into SSDs and even distributed filesystems. As for another example, ECE391 teaches the old linux scheduler, with an active and non-active list; 423 teaches several algorithms for scheduling (realtime ones like EDF and Rate Monotonic, then SJF, FIFO, good old Round Robin), implementations of these (e.g. Multilevel Feedback Queues) and then the actual current Linux Scheduler (the Completely Fair Scheduler) down to the red black trees. Other such topics include synchronization, memory (paging), threads, system calls and interrupts, and other such topics. There are also some topics not seen in 391; the main one is virtualization, as in virtual machines, hypervisors, and containers (e.g. how Docker works), which are a good topic in their own rights but also important in distributed systems with some relations to architecture. There's also reliability (as mentioned above), distributed systems, and security as well.

It is important to mention that the MPs teach a lot more about the linux kernel itself, simply due to having to look up details and functions used in the kernel itself. An invaluable skill is learning how to write kernel modules, as the MPs are all kernel modules (excepting the 4th one, which may be a distributed application or an LSM which is treated completely different). These are how device drivers such as given TUX software in 391 work, and knowing how to write these are an integral skill for kernel hacking along with anyone wanting to work with Operating Systems (or just linux for that matter). This class does use C extensively but most problems are solved either through implementation or preexisting kernel interfaces; as thus, the focus goes to actual linux documentation and linux code a lot more. Each MP also contains goals to teach something specific about the kernel. Finally, synchronization is an actual issue now that the kernel is no longer uniprocessor. A rudimentary list of what each MP teaches (among other things) is provided below (Course Structure). Additionally, throughout the semester, students are introduced to trending topics in the field, such as cloud computing and the Google Filesystem.

Prerequisites

The official prerequisites for this course are ECE391 (Computer Systems Engineering) or CS341 (System Programming). Students who have taken CS341 have a slight advantage during the first few weeks of the semester, as they have more experience working with synchronization constructs, but this can be made up by ECE students because of the sheer difficulty 391 prepares students for. Helpful topics to remember from ECE391 are synchronization and deadlocks, scheduling, virtual memory, and filesystems.

When to Take It

Students interested in operating systems should take this class by the second semester of their junior year, or at the very latest, by the first semester of their senior year in order to take more courses in the field before graduation. While the officially listed prerequisites are somewhat useful, the class can certainly be taken without them if you are willing to put in a little more effort at the beginning of the semester. However, before taking this course, you should be well-versed in C programming and in synchronization concepts, i.e. deadlocks, race conditions, etc. It would be important to mention that ECE391 gives a MASSIVE advantage (at least in the 2020 class' experience); as a matter of fact, quite a bit of the class seems like a review, and the MPs are a bit easier due to experience in kernel programming. Quite a bit is catered to 341 students.

Course Structure

The workload for CS 423 varies. There are one or two homework assignments during the semester, each requiring about 5-6 hours to complete. MP's are assigned on a monthly basis, totaling 4 MP's - each requiring about 15 to 20 hours of time, working in groups of two or three. The MPs given vary by professor. MPs so far were:

  • MP0: Setup - Compile and boot into the linux kernel (but this time from scratch - pull the right version from git, compile yourself, load for GRUB yourself, and boot into it yourself on a VM; none of that fake 391 stuff).
    • Pull, compile, and boot into an actual linux kernel (v 4.4.0, MUCH more current than 391)
    • Command line utilities (e.g. screen)
  • MP1: Intro to kernel programming - implementing linked lists in the kernel, initialize proc entries, set reading and writing buffers for files, etc.
    • Linked lists in the kernel (and the linked list interface)
    • Synchronization in the kernel
    • Kernel modules and loading/unloading them
    • The proc filesystem (and basic driver functions like read, write, etc.)
    • Kernel Timers and Timer Interrupts
    • Workqueues
  • MP2: implementing a real-time scheduler in the kernel. (NOTE: This is not overwriting the actual linux scheduler, this is more writing a module that processes subscribe to then behave with certain predefined patterns; the scheduler uses kernel functions to "emulate" scheduling, rescheduling, and preempting).
    • Linux scheduler functions (set priority, schedule(), etc.)
    • Slab Caches and Slab Allocation (ACTUALLY using them)
    • Parts of the Task Struct (like the Process Control Block but in Linux)
    • Rate Monotonic scheduling algorithm, along with real time
  • MP3: Implementing a CPU utilization profiler; this involved some work with paging and rewriting mmap for a character device driver, as info had to be written by the kernel and read by user space.
    • Profiling
    • More Task Struct
    • Virtual Memory (in actual linux)
    • Memory Mapping (you have to write mmap() using vmalloc() and reserve pages)
    • Character devices (you have to essentially write one and use mknod to let your user program use it)
    • Thrashing and paging traces MP 4: ONE OF
  • linux security module - implement your own major security module for the kernel filesystem through hooks, then implement least priority policy with the passwd binary program.
    • Linux Security Modules (Major vs Minor modules)
    • More detail on compiling kernels (your code gets compiled into the main kernel rather than being loaded as a module, which YOU need to set up)
    • Security in the kernel (hook functions, creds, security blobs, least privilege policy)
    • Real filesystems (inodes, dentries, traversing those from current task to getting security credentials out of them)
    • The strace utility
  • distributed systems - running multiple instance of a Java or C++ program and load-balancing work over the network to complete the task as quickly as possible.

There is one midterm and one final exam.

Instructors

CS423 has been taught by various professors from the CS department. Most recently, it has been taught by Professors Ram Alagappan and Tianyin Xu.

Life After

If you liked this course, consider a career in operating systems. Apart from your standard OS jobs with Apple and Microsoft, with the shift to cloud computing, there are plenty of emerging opportunities in distributed computing, virtualization, etc.

Interested students should consider taking additional courses in the field, such as CS425 (Distributed Systems) and CS 523 (Advanced Operating Systems) though students should note that CS523 is a graduate level course and cannot be taken for Tech Elective credit without permission from the ECE department.