Finding Lock Inversions with DTrace

Tracing lock acquisition order with DTrace's lockstat and fbt providers.
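
As a minimal starting point (not a full inversion detector), the lockstat provider can show which code paths take adaptive mutexes, which is the raw data you need when comparing acquisition order:

dtrace -n 'lockstat:::adaptive-acquire { @[stack()] = count(); }'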

Tweaking binaries with elfedit The Trouble with Tribbles...

On Solaris and illumos, you can inspect shared objects (binaries and libraries) with elfdump. Most commonly, you're simply looking to see which shared libraries you're linked against, in which case it's elfdump -d (or, for those of us who were doing this years before elfdump came into existence, dump -Lv). For example:

% elfdump -d /bin/true

Dynamic Section:  .dynamic
     index  tag                value
       [0]  NEEDED            0x1d6               libc.so.1
       [1]  INIT              0x8050d20          

and it goes on a bit. But basically you're looking at the NEEDED lines to see which shared libraries you need. (The other field that's generally of interest for a shared library is the SONAME field.)
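
For example, to check a library's SONAME (the path here is just the usual libc, purely for illustration):

% elfdump -d /usr/lib/libc.so.1 | grep SONAME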

However, you can go beyond this, and use elfedit to manipulate what's present here. You can essentially replicate the above with:

elfedit -r -e dyn:dump /bin/true

Here the -r flag says read-only (we're just looking), and -e says execute the command that follows, which is dyn:dump - or just show the dynamic section.

If you look around, you'll see that the classic example is to set the runpath (which you might see as RPATH or RUNPATH in the dump output). This was used to fix up binaries that had been built incorrectly, or where you've moved the libraries somewhere other than where the binary normally looks for them. Which might look like:

elfedit -e 'dyn:runpath /my/local/lib' prog

This is the first example in the man page, and the standard example wherever you look. (Note the quotes - that's a single command input to elfedit.)
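
If you just want to see what runpath is already set (or check what you've just changed), filtering the dynamic section is enough, since the dump output tags the entry as RPATH or RUNPATH:

elfdump -d prog | egrep 'RPATH|RUNPATH'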

However, another common case I come across is where libtool has completely mangled the link so the full pathname of the library (at build time, no less) has been embedded in the binary (either in absolute or relative form). In other words, rather than the NEEDED section being

libfoo.so.1

it ends up being

/home/ptribble/build/bar/.libs/libfoo.so.1

With this sort of error, no amount of tinkering with RPATH is going to help the binary find the library. Fortunately, elfedit can help us here too.

First you need to work out which element you want to modify. Back to elfedit again to dump out the structure:

% elfedit -r -e dyn:dump /bin/baz
     index  tag                value
       [0]  POSFLAG_1         0x1                 [ LAZY ]
       [1]  NEEDED            0x8e2               /home/.../libfoo.so.1

It might be further down, of course. But the entry we want to edit is index number 1. We can narrow down the output just to this element by using the -dynndx flag to the dyn:dump command, for example

elfedit -r -e 'dyn:dump -dynndx 1' /bin/baz

or, equivalently, using dyn:value

elfedit -r -e 'dyn:value -dynndx 1' /bin/baz

And we can actually set the value as well. This requires the -s flag to set a string, but you end up with:

elfedit -e 'dyn:value -dynndx -s 1 libfoo.so.1' /bin/baz

and then if you use elfdump or elfedit or ldd to look at the binary, it should pick up the library correctly.
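
For example, to confirm the fix has taken (assuming libfoo.so.1 is now somewhere the runtime linker can find it):

ldd /bin/baz | grep libfoo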

This is really very simple (the hardest part is having to work out what the index of the right entry is). I didn't find anything when searching that actually describes how simple it is, so I thought it worth documenting for the next time I need it.


On Tribblix Milestone 20 The Trouble with Tribbles...

Having released a new update for Tribblix, I thought I would add a little commentary on the progress that's being made and the direction things are going in.

This goes beyond the rather dry release notes and list of what's changed.

The big structural change is that the ISO has been built as a single root archive, rather than the old way with a split-off /usr that's lofi-mounted from a compressed image.

The original reason for doing this (and I experimented with it a while ago) was to allow installation on systems without drivers for the device that you're booting from. This might be a system with only USB3 ports, or I've had problems with laptops where illumos doesn't recognize the CD drive. The boot loader (and BIOS) load the initial boot archive, so if you don't need to ever talk to the media device again you're in much better shape.

While we now have USB3 support, this simplified boot is a good thing in any case, and it allows some neat tricks like iPXE boot.

Another logical change is in the release mechanism itself. I've discussed the Tribblix package repositories before. The snag with the traditional repository layout was that the packages that defined a release were in the main Tribblix repository. So, every time I make a new release I end up having to create a whole new Tribblix repository. Every time I update the illumos packages, I need a new Tribblix repository. Creating a new one isn't too bad; ongoing support for multiple repositories is a lot of unnecessary work.

The way to fix this is to split out the packages (there are 3 of them) that define the properties of a release into their own separate repo. This allows at least 2 new possibilities:

  1. I can release updated illumos packages without spinning a whole new Tribblix release. It would still use the same upgrade mechanism, but the main Tribblix repo is shared and it's a much lighter release process.
  2. I could create variants or spins. For example, I could create a variant that has LX (see omnitribblix). This would just have a different set of illumos packages but shares everything else. Or I could build a 32-bit or 64-bit only distro.
I haven't yet done either of those things, but it's going to happen.

Behind the scenes I've been gradually working to get more packages - especially those that deliver libraries - built as both 32-bit and 64-bit.

Tribblix is fairly clear that it will continue to support 32-bit and 64-bit hardware, at least for a while. (Whereas both OmniOS and OpenIndiana have effectively dropped 32-bit compatibility, mostly by neglect rather than design.) Of course, there is a reasonable amount of software now that's only 64-bit (anything built with go, for example, or OpenJDK 8), but there's a reasonable chance the people using 32-bit hardware aren't necessarily going to want the latest and greatest applications. (This isn't 100% true, by the way - sometimes you have to interoperate with other facilities in the environment.) But eventually we're going to have to make a full 64-bit transition, and it would be good to be ready.

That gives a rough idea of the work that's currently underway. Looking ahead, there's a whole long list of packages that need adding or updating (such is a maintainer's life). The one significant place I have been falling behind is that I haven't updated gcc, so that needs work. And, of course, I'm trying to get SPARC into some sort of reasonable shape. But, overall, Tribblix is now pretty solid, and a bit more polish and attention to detail would benefit it greatly.

Installing Tribblix on Vultr using iPXE The Trouble with Tribbles...

One of the new features in Tribblix 0m20 is that booting and installing using iPXE now works.

Here's an example of using this functionality to install a server running Tribblix in the Vultr cloud. A similar mechanism ought to work for any other provider that allows iPXE boot.

I'm assuming you have signed up and logged in; then go to deploy a server.

First choose where you want to deploy the server. I'm in the UK, so London is a good choice.


Then the critical bit: selecting the Server Type. The option you want here is in a slightly confusing location, under the "Upload ISO" tab. Select the "iPXE" radio button and put in the value http://pkgs.tribblix.org/m20/ipxe.txt


The other key option is Server Size. As with many providers, there's a simple scale. For testing, an instance with 1G of memory is more than adequate.


Then deploy it. After a few seconds of installing, you can click the link to manage the server, and then view the console, which uses VNC.

If you're reasonably quick you get to see the initial iPXE screen, and can see it downloading the images:


What you can see here is that it's downloaded the original ipxe script we specified. This looks like:

#!ipxe
dhcp
kernel /m20/platform/i86pc/kernel/amd64/unix
initrd /m20/platform/i86pc/boot_archive
boot
 
Which just says to set up the network using dhcp (this might have already been done, but if you're booting off an ipxe iso it may not have been, so we do it anyway), then download the kernel and the boot archive, then boot from what you've just downloaded.

The kernel and the boot archive come from the ISO; I've just unpacked them onto the server (so the URL given above for the ipxe script will be reasonably permanent for anybody to use). The only slight tweak I've had to make is that the original boot archive is actually gzip compressed and iPXE can't handle that, so it's been uncompressed. The boot archive also now contains the /usr file system as well, rather than it being split off as before. While I'm sure you could mangle the system to download it and sort things out, it's so much easier to put it inside the boot archive.
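
For the record, there's nothing clever about preparing those files. A rough sketch (the ISO filename and web root here are purely illustrative, and the lofi device is whatever lofiadm prints):

lofiadm -a /path/to/tribblix.iso
mount -F hsfs -o ro /dev/lofi/1 /mnt
mkdir -p /data/web/m20/platform/i86pc/kernel/amd64
cp /mnt/platform/i86pc/kernel/amd64/unix /data/web/m20/platform/i86pc/kernel/amd64/
gzip -dc < /mnt/platform/i86pc/boot_archive > /data/web/m20/platform/i86pc/boot_archive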

Then you get into the normal installer, so log in as jack, su to root, and see what disk(s) are available using the new diskinfo tool. Then you can install Tribblix to that disk:



Don't bother adding additional overlays at this point. It won't work - and you'll get an error about not being able to install overlays (you'll get the error anyway because the installer always tries to add some packages that aren't needed in the live environment). This will be fixed in a future update, but it's relatively harmless.

The other thing you should do before the installation is to change the passwords for root and jack. If you change them before running the installer then the change will propagate to the installed system (because all it's doing is a copy). You really don't want the system to boot up wide open to the internet with the default (and well known) passwords.
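
That's just the usual (as root, in the live environment, before you run the installer):

passwd root
passwd jack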

Once the (pretty quick) install finishes, it'll look like this:


That's just like a normal install, other than the missing overlays. Then just reboot and you'll soon see the new loader, followed by the system booting.

Due to the missing overlays, you'll get an error about the intrd service failing. You'll have to log in (ssh will work at this point) and then add at least the base overlay:

zap install-overlay base

Plus whatever other overlays you might want. Then you can clear the intrd service and you're good to go.
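
For example (kitchen-sink here is just a stand-in for whatever overlays you actually want):

zap install-overlay kitchen-sink
svcadm clear intrd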

Tribblix memory requirements The Trouble with Tribbles...

Compared to the other illumos distributions, Tribblix has lower memory requirements.

I'm not talking about crazy stunts like running in 48M; here I'm talking about running a fully fledged system.

I've been doing a bit of testing of the upcoming release, which includes running the install under a range of configurations. The test here is to boot the ISO image in VirtualBox with a range of memory sizes and then install the kitchen sink.

  • The live image won't boot at all on a 256M system
  • The live image will boot on a 512M system, but installing to zfs will fail
  • However, installing to ufs works on a 512M system
  • With 768M, installation to zfs is rather slow
  • With 1G or more, you're fine
The upcoming release is going to be built slightly differently, in that it's no longer a split-off /usr configuration. (I discussed how that worked and those strange zlib files some time ago.) The latest OmniOS is a single image; SmartOS likewise. It's just so much easier to construct, and far more reliable.

That change explains the 256M failure - the ramdisk is about 300M, so it simply won't fit. It's likely to have an impact on the 512M case too - in the old scenario you only paged in the bits of the /usr filesystem as and if you needed them; now it's locked into memory.

On a limited memory system there's a way to make things a bit easier. Simply install the base (no additional overlays) from the installer, then add the rest of the overlays and packages later. The point here is that running from disk doesn't lock up anywhere near as much memory as the full OS being resident in RAM does. And some of the packages in the kitchen sink are rather large, which causes problems.

Once you've got Tribblix installed, how well does it cope? Surprisingly well, to be honest. The Xfce desktop runs quite well in either 512M or 768M of memory. I can run firefox on the 768M system without too many problems (given the way it consumes memory, probably not for a long intensive browsing session), while firefox on a 512M system does run, but it's clearly starting to grind. Java applications work, some smaller ones at least. You need to be realistic in your expectations, but the point is that smaller systems do work.

The most limited systems would tend to be older, possibly 32-bit hardware. I could build a 32-bit only image which would be quite a bit smaller - maybe only two-thirds the size. (And if you really wanted to you could get it even smaller - but then you're in the realms of building custom images using mvi or the like.)

However, the aim of keeping Tribblix viable on smallish systems isn't just to allow the use of old hardware, beneficial though that is. If you're running a service on a cloud or hosting provider then being able to use a 1G server instead of a 2G server will halve your costs, and that's a very good thing to be able to do.

Tribblix SPARC progress The Trouble with Tribbles...

Tribblix is one of the relatively few illumos distributions that runs on both SPARC and x86 hardware.

There are valid reasons for the lack of SPARC support in other distributions. For those backed by commercial entities, it makes no sense to support SPARC as they don't have paying customers to foot the bill. Which leaves SPARC support firmly in the hobbyist realm.

Even in Tribblix, SPARC support has lagged the x86 version somewhat. Again, for entirely predictable reasons. While I do have SPARC hardware, it's relatively slow, noisy, power hungry, and heat-producing compared to my regular x86 boxes. And my day to day use is my x86 workstation, so that drives a lot of the desktop work.

But SPARC development of Tribblix hasn't stopped. Far from it, it's just naturally slower.

The current download ISO image at this time is still Milestone 16. Just to clarify the versioning here - that means it was built from exactly the same illumos commit as the corresponding x86 release. Because it took a little longer to get ready, the userland packages (such as they were) tended to be a bit newer.

There have been 3 more Tribblix releases on x86 since then. Over the winter (when it was cold and the heat output from the T5140 I use as a build server was a good thing) I tried building updated illumos versions. The T5140 I'm using to do the builds is running a cobbled-together frankendistro of bits of Tribblix, bits of OpenSXCE, some random bits from other people working on SPARC, and a whole lot of elbow grease. I managed to build illumos at the m17 and m18 release points, but m19 was a step too far (some of the native stuff assumes that the host OS isn't terribly antiquated). What this means is that I need to replace that with a current system, and get a properly self-hosting illumos build.

That modernizes the underlying illumos components a bit. What about the rest of the system? The primary effort there was to replace the old core components that had been borrowed from OpenSXCE while bootstrapping the distribution in the first place with native packages (which are then up to date and match the x86 build). Some of the components here are pretty crucial - zlib and libxml2, for instance. At one point I messed up libxml2 slightly - not enough to kill SMF (which would be a big worry) but enough to stop zones working (which, apart from indicating that I had broken it, also left me without an important test mechanism). Rebuild everything enough times and the problem eventually cleared.

I also had a go at getting my SunBlade 1500 workstation working. It's not terribly quick, but it's quiet enough and sufficiently low power that I can have it running without negatively impacting the home office. That was a bit of a struggle: the bge network driver currently in illumos doesn't work - I assume I'm seeing bug 7746 here, but the solution - to use an older version of the driver - works well enough. With that box available I not only have more testing capacity but also a lightweight machine that I can use to keep the package backlog under control.

Graphics on SPARC is an interesting problem. OK, so I don't expect this to be a priority, but it would be nice to have something that worked. The first problem I found (a while ago) was that some of the binary graphics drivers wouldn't work at all. For example, the m64 driver (which is what might drive the graphics in my SunBlade 2000) uses hat_getkpfnum, which was removed from illumos courtesy of bug 536. Even graphics drivers that do load often simply don't work, and getting an X server to start is a bit of a nightmare. After far too much manual fiddling I did manage to get a twm desktop running on the aforementioned SunBlade 1500, but don't expect native graphics support to improve any time soon.

Applications are another matter: there's no reason you couldn't run at least some applications on a SPARC system and display them back on your desktop machine. After all, X11 is a network display protocol (despite all the effort to eradicate that and turn it into a local-only display protocol). Or run a VNC server and access that remotely. So I've started (but not finished) building up the components for useful applications.

I haven't yet got an ISO image. That's likely to be a while, but if you have an existing SPARC system running Tribblix m16 then the upgrade to m18 ought to work. Although I would recommend a couple of changes to the procedure if you're going to try this:

  • Refresh and update everything: 'zap refresh ; zap update-overlay -a'
  • Download the current upgrade script from github and run that script in place of 'zap upgrade'
  • After booting into the newly updated BE, refresh and update everything again, just to make sure you're up to date

How to print ZFS filesystems ordered by space used blog'o'less

zfs get -o value,name -Hp used|sort -n
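
Or, letting zfs do the sorting itself and keeping human-readable sizes:

zfs list -o name,used -s used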

rpmbuild random notes blog'o'less

sudo dnf install rpmdevtools

rpmdev-setuptree

~/rpmbuild/SRPMS/
~/rpmbuild/SPECS/

~/rpmbuild/SOURCES/
~/rpmbuild/RPMS/
~/rpmbuild/BUILD/
~/.rpmmacros

sudo dnf download --source package
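
A typical cycle from there (package names are placeholders):

rpm -ivh package-1.0-1.src.rpm

rpmbuild -ba ~/rpmbuild/SPECS/package.spec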

Working at Netflix 2017 Brendan Gregg's Blog

I've now worked at Netflix for over three years. Time flies! I previously wrote about Netflix in [2015] and [2016], and if you are interested in what it's like to work here, I already covered much in those posts. As before, no one at Netflix has asked me to write this, and this is my personal blog and not a company post. I'll start with some exciting news, describe what my job is really like, the culture and mission, and some work updates.

## 100 Million Subscribers!

When I joined Netflix in April 2014, we had over 40 million subscribers in 41 countries. We are now in 190 countries and just crossed [100 million subscribers]! It's been thrilling to be part of this and help Netflix scale. You might imagine that at some point we had a major scaling crisis, where it looked like we'd fail due to an architectural bottleneck, and engineers worked long nights and weekends to save Netflix from certain disaster. That'd make a great story, but it didn't happen. We're on the EC2 cloud, which has great scalability, and our own cloud architecture of microservices is also designed for scalability. During this time we did do plenty of hard work, rolling out new technologies and major microservice versions, and fixed many problems big and small. But there was no single crisis point. Instead, it has been a process of continual improvements, by many engineers across the company.

## A Day in the Life (Performance Engineering)

What do I actually do all day? Most of my day is a 50/50 mixture of proactive projects, and reactive performance analysis. The proactive projects usually take weeks or months, and are where I'm developing a new technology or helping other teams with performance analysis or evaluations. Most of these projects aren't public yet, and some of them involve working with other companies on unreleased products. My work with Linux is different in that it is mostly public, and includes my perf-tools and bcc/eBPF tracing tools. Another long term project is Vector, our instance analysis tool, where I'm adding new performance analysis features. Getting frame pointer support in Java was another project I did a while ago. The reactive work can be for any performance problem that shows up, involving runtimes (Java, Node.js), Linux (and sometimes FreeBSD), or hypervisors (Xen, containers). Recently that's included:

- Debugging why perf profiling stopped working in recent Docker containers.
- Java core dump analysis for a crashing JVM.
- MSR analysis on an instance to show it was running at a lower clock rate.
- A latency outlier issue that happened every 15 minutes.
- Analyzing slab memory growth on an instance with containers.
- Getting flame graphs to work in a new environment.

Staff ask for help over chat, either to the perfeng chatroom or me directly, or they come visit my desk in F2. I'm also monitoring various chatrooms and metrics, and will jump in when needed. It's a good balance. Too much reactive work and you don't have time to build better tools and general fireproofing. Too much proactive work and you can become disconnected from the current company pain points, and start building solutions to the problems of yesteryear. About one hour on average each day is meetings. Some of these are regular meetings: we have a team meeting once every two weeks where everyone discusses what they are working on, and I have a one-on-one with my manager once every two weeks. At a lower frequency, I have scheduled meetings with my manager's manager, and their manager.
All these manager meetings keep me informed of the current company needs, and help connect me to the right people and projects at Netflix. Once every two weeks, I summarize what I've been working on in a shared doc: the team's bi-weekly status. Then there's some random events that happen during the year. We have offsites, where we plan what to work on each quarter, and team building events. There's also unofficial recreational groups at Netflix, including movie clubs (for good movies, and for bad ones), a karaoke group (which includes some Hamilton fans), and various sports teams. I'm on the Netflix cricket team (if you're at Netflix and didn't know we exist, join the cricket chatroom). I also usually speak at some conferences each year. ## Culture The biggest difference I've found working here is still the culture. We are empowered to do the right thing, and believe in "freedom and responsibility". This is documented in the Netflix [culture deck], and after three years I still find it true. The first seven slides point out that companies can have aspirational values, but the actual values differ:

The actual company values, as opposed to the nice-sounding values, are shown by who gets rewarded, promoted, or let go
Before joining Netflix, you're told to read it and see if this company is right for you. Then while working here, staff cite the culture deck in meetings for decision making advice. It's not nice-sounding values that are printed in the lobby and people forget about. It's an ongoing influence in the day to day running of Netflix. Having it online also beats learning the culture through word of mouth or trial and error. I know people in tech who are burned out but stay in lousy jobs, assuming every workplace is just as terrible. Jobs where there is little to no freedom, no responsibility or accountability, and where dumb office politics is the norm. I wish everyone could have a chance to work at a company like Netflix. Little to no bureaucracy. You can focus on engineering and getting stuff done, with awesome staff who will help you. ## Mission I spoke about this in my 2015 post, but it's worth repeating: our mission is to improve how entertainment is consumed worldwide, by building a great product that people choose to buy. I've noticed a widespread cynicism about successful companies, especially US corporates, where it's assumed that they must be doing something shady to be really competitive. Like selling customer data, or making it difficult to terminate membership. It's been amazing and inspiring to see how Netflix operates, contrary to this belief. We don't do anything shady, and we're proud of that. We're an honest company. ## Work Updates **SRE**: Last year I talked about my site reliability engineering (SRE) work. Since then, our CORE SRE team has grown and I'm no longer needed on the on-call rotation, so I'm back to focusing on performance work. My 18 months of SRE on-call provided many memories and valuable experiences, as well as a deeper understanding of SRE. I talked about what I learned in my [SREcon 2016] keynote, and how the aims and tools differ between performance engineering and SRE performance analysis. I miss the thrill of being paged and knowing I'm going to work with other awesome engineers and fix something _important_ in the next five minutes... or at least try to! If I miss this thrill too much, I can always jump into the CORE chatroom and help with production issues when they happen.

My new desk in building F
**Linux**: I've been contributing to profilers and tracers, and it's been satisfying to help fix these areas that I really care about. In the last three years I developed the ftrace-based [perf-tools] and used them to solve many problems, which I wrote about in [lwn.net] and spoke about at [LISA 2014]. I also worked with Alexei Starovoitov (now at Facebook) on enhanced BPF for tracing, and developed many [bcc tools] that use BPF. I spoke about these at Facebook's [Performance@Scale] event and other conferences. We're rolling out newer kernels now, and it's pretty exciting to use my bcc tools in production. For Linux, I've also done tuning, kernel analysis, [gdb], testing of [hist triggers], testing of some perf patches, and contributed a few trivial patches of my own. **PMCs**: When I considered joining Netflix three years ago, I had two technical concerns: 1. No advanced Linux tracer, and 2. No PMC access in EC2. How am I going to do advanced analysis without these? The more I thought about it, the more I became interested in the challenge, which would be the biggest of my career. Three years later, I've helped solve both of these (as well as devise some workarounds along the way). Now we have [Linux 4.9 eBPF] and [The PMCs of EC2]. Thanks to everyone who helped. **Team Changes**: Our team has grown a little, and we have a new manager, [Ed Hunter], who I worked for before at Sun Microsystems. It's great to be working with Ed again. Our prior manager, Coburn, was promoted. ## Summary When I use an awesome technology, I feel compelled to post about it and share. In this case, it's an awesome company and culture. After three years, I still find Netflix an awesome place to work, and every day I look forward to what I'll work on next. [culture deck]: http://www.slideshare.net/reed2001/culture-1798664 [2015]: http://www.brendangregg.com/blog/2015-01-20/working-at-netflix.html [2016]: http://www.brendangregg.com/blog/2016-03-30/working-at-netflix-2016.html [lwn.net]: http://lwn.net/Articles/608497/ [perf-tools]: https://github.com/brendangregg/perf-tools [LISA 2014]: /blog/2015-03-17/usenix-lisa-2014-linux-ftrace-perf-tools.html [bcc tools]: https://github.com/iovisor/bcc#tools [Performance@Scale]: http://www.brendangregg.com/blog/2016-03-05/linux-bpf-superpowers.html [linux.conf.au]: https://www.youtube.com/watch?v=JRFNIKUROPE [SCALE15x]: https://www.youtube.com/watch?v=w8nFRoFJ6EQ [hist triggers]: /blog/2016-06-08/linux-hist-triggers.html [gdb]: /blog/2016-08-09/gdb-example-ncurses.html [LISA 2016]: https://www.slideshare.net/brendangregg/linux-4x-tracing-tools-using-bpf-superpowers [SREcon 2016]: /blog/2016-05-04/srecon2016-perf-checklists-for-sres.html [Ed Hunter]: https://www.linkedin.com/in/edwhunter/ [100 million subscribers]: https://twitter.com/netflix/status/855545423276032000 [Linux 4.9 eBPF]: /blog/2016-10-27/dtrace-for-linux-2016.html [The PMCs of EC2]: /blog/2017-05-04/the-pmcs-of-ec2.html

Container Performance Analysis at DockerCon 2017 Brendan Gregg's Blog

At DockerCon 2017 I gave a talk on Linux container performance analysis, where I showed how to identify three types of performance bottlenecks in a container environment:

1. In the host vs container, using system metrics.
2. In application code in containers, using CPU flame graphs.
3. Deeper in the kernel, using tracing tools.

The talk video is on [youtube] \(42 mins\):

And the slides are on [slideshare]:
This talk was a tour of container performance analysis on Linux. I included a quick summary of the necessary background, cgroups and namespaces, as well as analysis methodologies, before digging into the actual tools and metrics. An overall takeaway is to know what is possible, not necessarily learning each tool in detail, as you can look them up later when necessary. I included many performance analysis tools, including basics including top, htop, mpstat, pidstat, free, iostat, sar, perf, and flame graphs; container-aware tools and metrics including systemd-cgtop, docker stats, /proc, /sys/fs/cgroup, nsenter, Netflix Vector, and Intel snap; and advanced tracing-based tools including iosnoop, zfsslower, btrfsdist, funccount, runqlat, and stackcount. ## Reverse Diagnosis I'm a fan of performance analysis methodologies, and I discussed how my [USE method] can be applied to container resource controls. But some controls, like CPU shares and disk I/O weights, get tricky to analyze. How do you know if a container is currently throttled by its share value, vs the system? To make sense of this, I came up with a reverse diagnosis approach: starting with a list of all possible outcomes, and then working backwards to see what metrics are required to identify one of the outcomes. I summarized it for CPU analysis with this flow chart:
The first step refers to /sys/fs/cgroup/.../cpu.stat -> throttled\_time, which indicates when a cgroup (container) is throttled by its hard cap (eg, capped at 2 CPUs). Since that's a straightforward metric, we check it first to take that outcome off the operating table, and continue. See the talk for more details, where I also included a few scenarios beforehand to see if the audience could identify the bottleneck. Try it yourself: it's hard (then try it with the above flow chart!). This may become easier over time as more metrics are added to diagnose states, and time in states, so also check for updates to cgroup metrics in the kernel. ## Netflix Titus The environment I've been analyzing is Netflix Titus, which I summarized at the start of the talk. It was covered in a post published just before my talk: [The Evolution of Container Usage at Netflix]. DockerCon was fun, and a big event: 6,000 attendees. My talk won a "top speaker" [award], which also meant I delivered it a second time for those who didn't catch the first one. Thanks to the Docker staff for putting on a great conference, and for everyone for attending my talk. [youtube]: https://www.youtube.com/watch?v=bK9A5ODIgac [slideshare]: https://www.slideshare.net/brendangregg/container-performance-analysis [The Evolution of Container Usage at Netflix]: https://medium.com/netflix-techblog/the-evolution-of-container-usage-at-netflix-3abfc096781b [USE method]: /usemethod.html [award]: https://twitter.com/brendangregg/status/854827187270242304
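
As a concrete example of the throttled_time check mentioned above, on a cgroup v1 host it can be read straight out of the cpu cgroup (the docker path varies with the cgroup driver, and the container ID and values here are made up for illustration):

# cat /sys/fs/cgroup/cpu,cpuacct/docker/<container-id>/cpu.stat
nr_periods 1000
nr_throttled 12
throttled_time 184000000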

CPU Utilization is Wrong Brendan Gregg's Blog

The metric we all use for CPU utilization is deeply misleading, and getting worse every year. What is CPU utilization? How busy your processors are? No, that's not what it measures. Yes, I'm talking about the "%CPU" metric used *everywhere*, by *everyone*. In every performance monitoring product. In top(1). What you may think 90% CPU utilization means:

What it might really mean:
Stalled means the processor was not making forward progress with instructions, and usually happens because it is waiting on memory I/O. The ratio I drew above (between busy and stalled) is what I typically see in production. Chances are, you're mostly stalled, but don't know it. What does this mean for you? Understanding how much your CPUs are stalled can direct performance tuning efforts between reducing code or reducing memory I/O. Anyone looking at CPU performance, especially on clouds that auto scale based on CPU, would benefit from knowing the stalled component of their %CPU.

## What really is CPU Utilization?

The metric we call CPU utilization is really "non-idle time": the time the CPU was not running the idle thread. Your operating system kernel (whatever it is) usually tracks this during context switch. If a non-idle thread begins running, then stops 100 milliseconds later, the kernel considers that CPU utilized that entire time. This metric is as old as time sharing systems. The Apollo Lunar Module guidance computer (a pioneering time sharing system) called its idle thread the "DUMMY JOB", and engineers tracked cycles running it vs real tasks as an important computer utilization metric. (I wrote about this [before].) So what's wrong with this? Nowadays, CPUs have become much faster than main memory, and waiting on memory dominates what is still called "CPU utilization". When you see high %CPU in top(1), you might think of the processor as being the bottleneck – the CPU package under the heat sink and fan – when it's really those banks of DRAM. This has been getting worse. For a long time processor manufacturers were scaling their clockspeed quicker than DRAM was scaling its access latency (the "CPU DRAM gap"). That levelled out around 2005 with 3 GHz processors, and since then processors have scaled using more cores and hyperthreads, plus multi-socket configurations, all putting more demand on the memory subsystem. Processor manufacturers have tried to reduce this memory bottleneck with larger and smarter CPU caches, and faster memory busses and interconnects. But we're still usually stalled.

## How to tell what the CPUs are really doing

By using Performance Monitoring Counters (PMCs): hardware counters that can be read using [Linux perf], and other tools. For example, measuring the entire system for 10 seconds:
# perf stat -a -- sleep 10

 Performance counter stats for 'system wide':

     641398.723351      task-clock (msec)         #   64.116 CPUs utilized            (100.00%)
           379,651      context-switches          #    0.592 K/sec                    (100.00%)
            51,546      cpu-migrations            #    0.080 K/sec                    (100.00%)
        13,423,039      page-faults               #    0.021 M/sec                  
 1,433,972,173,374      cycles                    #    2.236 GHz                      (75.02%)
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
 1,118,336,816,068      instructions              #    0.78  insns per cycle          (75.01%)
   249,644,142,804      branches                  #  389.218 M/sec                    (75.01%)
     7,791,449,769      branch-misses             #    3.12% of all branches          (75.01%)

      10.003794539 seconds time elapsed
The key metric here is **instructions per cycle** (insns per cycle: IPC), which shows on average how many instructions were completed for each CPU clock cycle. The higher, the better (a simplification). The above example of 0.78 sounds not bad (78% busy?) until you realize that this processor's top speed is an IPC of 4.0. This is also known as *4-wide*, referring to the instruction fetch/decode path. Which means, the CPU can retire (complete) four instructions with every clock cycle. So an IPC of 0.78 on a 4-wide system means the CPUs are running at 19.5% of their top speed. The new Intel Skylake processors are 5-wide. There are hundreds more PMCs you can use to dig further: measuring stalled cycles directly by different types.

### In the cloud

If you are in a virtual environment, you might not have access to PMCs, depending on whether the hypervisor supports them for guests. I recently posted about [The PMCs of EC2: Measuring IPC], showing how PMCs are now available for dedicated host types on the AWS EC2 Xen-based cloud.

## Interpretation and actionable items

If your **IPC is < 1.0**, you are likely memory stalled, and software tuning strategies include reducing memory I/O, and improving CPU caching and memory locality, especially on NUMA systems. Hardware tuning includes using processors with larger CPU caches, and faster memory, busses, and interconnects. If your **IPC is > 1.0**, you are likely instruction bound. Look for ways to reduce code execution: eliminate unnecessary work, cache operations, etc. [CPU flame graphs] are a great tool for this investigation. For hardware tuning, try a faster clock rate, and more cores/hyperthreads. For my above rules, I split on an IPC of 1.0. Where did I get that from? I made it up, based on my prior work with PMCs. Here's how you can get a value that's custom for your system and runtime: write two dummy workloads, one that is CPU bound, and one memory bound. Measure their IPC, then calculate their mid point.

## What performance monitoring products should tell you

Every performance tool should show IPC along with %CPU. Or break down %CPU into instruction-retired cycles vs stalled cycles, eg, %INS and %STL. As for top(1), there is tiptop(1) for Linux, which shows IPC by process:
tiptop -                  [root]
Tasks:  96 total,   3 displayed                               screen  0: default

  PID [ %CPU] %SYS    P   Mcycle   Minstr   IPC  %MISS  %BMIS  %BUS COMMAND
 3897   35.3  28.5    4   274.06   178.23  0.65   0.06   0.00   0.0 java
 1319+   5.5   2.6    6    87.32   125.55  1.44   0.34   0.26   0.0 nm-applet
  900    0.9   0.0    6    25.91    55.55  2.14   0.12   0.21   0.0 dbus-daemo

## Other reasons CPU utilization is misleading

It's not just memory stall cycles that make CPU utilization misleading. Other factors include:

- Temperature trips stalling the processor.
- Turboboost varying the clockrate.
- The kernel varying the clock rate with speed step.
- The problem with averages: 80% utilized over 1 minute, hiding bursts of 100%.
- Spin locks: the CPU is utilized, and has high IPC, but the app is not making logical forward progress.

## Update: is CPU utilization actually wrong?

There have been hundreds of comments on this post, here (below) and elsewhere ([1], [2]). Thanks to everyone for taking the time and the interest in this topic. To summarize my responses: I'm not talking about iowait at all (that's disk I/O), and there are actionable items if you know you are memory bound (see above). But is CPU utilization actually wrong, or just deeply misleading? I think many people interpret high %CPU to mean that the processing unit is the bottleneck, which is wrong (as I said earlier). At that point you don't yet know, and it is often something external. Is the metric technically correct? If the CPU stall cycles can't be used by anything else, aren't they therefore "utilized waiting" (which sounds like an oxymoron)? In some cases, yes, you could say that %CPU as an OS-level metric is technically correct, but deeply misleading. With hyperthreads, however, those stalled cycles can now be used by another thread, so %CPU may count cycles as utilized that are in fact available. That's wrong. In this post I wanted to focus on the interpretation problem and suggested solutions, but yes, there are technical problems with this metric as well. You might just say that utilization as a metric was already broken, as Adrian Cockcroft discussed [previously].

## Conclusion

CPU utilization has become a deeply misleading metric: it includes cycles waiting on main memory, which can dominate modern workloads. You can figure out what %CPU really means by using additional metrics, including instructions per cycle (IPC). An IPC < 1.0 likely means memory bound, and an IPC > 1.0 likely means instruction bound. I covered IPC in my [previous post], including an introduction to the Performance Monitoring Counters (PMCs) needed to measure it. Performance monitoring products that show %CPU – which is all of them – should also show PMC metrics to explain what that means, and not mislead the end user. For example, they can show %CPU with IPC, and/or instruction-retired cycles vs stalled cycles. Armed with these metrics, developers and operators can choose how to better tune their applications and systems. [UnixBench]: https://code.google.com/p/byte-unixbench/ [before]: http://www.brendangregg.com/usemethod.html#Apollo [Linux perf]: /perf.html [The PMCs of EC2: Measuring IPC]: /blog/2017-05-04/the-pmcs-of-ec2.html [previous post]: /blog/2017-05-04/the-pmcs-of-ec2.html [CPU flame graphs]: http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html [1]: https://news.ycombinator.com/item?id=14301739 [2]: https://www.reddit.com/r/programming/comments/6a6v8g/cpu_utilization_is_wrong/ [previously]: http://www.hpts.ws/papers/2007/Cockcroft_HPTS-Useless.pdf

The PMCs of EC2: Measuring IPC Brendan Gregg's Blog


IPC and LLC loads with a scaling workload
Performance Monitoring Counters (PMCs) are now publicly available from dedicated host types in the AWS EC2 cloud. PMC nerds worldwide rejoice! (All six of us.) There should be more of us in the future, as with the increasing scale of processors and speed of storage devices, the common bottleneck is moving from disks to the memory subsystem. CPU caches, the MMU, memory busses, and CPU interconnects. These can only be analyzed with PMCs.
Memory is the new disk.
If PMCs are new to you, then in a nutshell they are special hardware counters that can be accessed via processor registers, and enabled and read via certain instructions. PMCs provide low-level CPU performance statistics that aren't available anywhere else. In this post I'll summarize the PMCs available in EC2, which are for dedicated hosts only (eg, m4.16xl, i3.16xl), and I'll demonstrate measuring IPC. Note that PMCs are also known as HPCs (hardware performance counters), and other names as well. ### EC2 Dedicated Host PMCs The PMCs available are the architectural PMCs listed in the [Intel 64 and IA-32 Architectures Developer's Manual: vol. 3B], in section 18.2.1.2 "Pre-defined Architectural Performance Events", Table 18-1 "UMask and Event Select Encodings for Pre-Defined Architectural Performance Events". I've drawn my own table of them below with example event mnemonics. **Architectural PMCs**
| Event Name | UMask | Event Select | Example Event Mask Mnemonic |
|---|---|---|---|
| UnHalted Core Cycles | 00H | 3CH | CPU_CLK_UNHALTED.THREAD_P |
| Instruction Retired | 00H | C0H | INST_RETIRED.ANY_P |
| UnHalted Reference Cycles | 01H | 3CH | CPU_CLK_THREAD_UNHALTED.REF_XCLK |
| LLC Reference | 4FH | 2EH | LONGEST_LAT_CACHE.REFERENCE |
| LLC Misses | 41H | 2EH | LONGEST_LAT_CACHE.MISS |
| Branch Instruction Retired | 00H | C4H | BR_INST_RETIRED.ALL_BRANCHES |
| Branch Misses Retired | 00H | C5H | BR_MISP_RETIRED.ALL_BRANCHES |
What's so special about these seven architectural PMCs? They give you a good overview of key CPU behavior, sure. But Intel have also chosen them as a golden set, to be highlighted first in the PMC manual and their presence exposed via the CPUID instruction. Note that the Intel mnemonic for LLC here is "longest latency cache", but this is also known as "last level cache" or "level 3 cache" (assuming it's L3). ### PMC Usage Before I demonstrate PMCs, it's important to know that there's two very different ways they can be used: - **Counting**: where they provide a count over an interval. - **Sampling**: where based on a number of events, an interrupt can be triggered to sample the program counter or stack trace. Counting is cheap. Sampling costs more overhead based on the rate of the interrupts (which can be tuned by changing the event trigger threshold), and whether you're reading the PC or the whole stack trace. I'll demonstrate PMCs by using counting to measure IPC. ## Measuring IPC Instructions-per-cycle (IPC) is a good starting point for PMC analysis, and is measured by counting the instruction count and cycle count PMCs. (On some systems it is shown as its invert, cycles-per-instruction, CPI.) IPC is like miles-per-gallon for CPUs: how much bang for your buck. The resource here isn't gallons of gasoline but CPU cycles, and the result isn't miles traveled but instructions retired (ie, completed). The more instructions you can complete with your fixed cycles resource, the better. In the interest of keeping this short, I'll gloss over IPC caveats. There are situations where it can be misleading, like an increase of IPC because your program suffers more spin lock contention, and those spin instructions happen to be very fast. Just like MPG can be misleading, as it can be influenced by the route driven, not just the car's own characteristics. I'll use the Linux [perf] command to measure IPC of a program, noploop, which loops over a series of NOP instructions (no op):
# perf stat ./noploop
^C./noploop: Interrupt

 Performance counter stats for './noploop':

       2418.149339      task-clock (msec)         #    1.000 CPUs utilized          
                 3      context-switches          #    0.001 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
                39      page-faults               #    0.016 K/sec                  
     6,245,387,593      cycles                    #    2.583 GHz                      (75.03%)
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
    24,766,697,057      instructions              #    3.97  insns per cycle          (75.02%)
        14,738,991      branches                  #    6.095 M/sec                    (75.02%)
            24,744      branch-misses             #    0.17% of all branches          (75.04%)

       2.418826663 seconds time elapsed
I've highlighted IPC ("insns per cycle") in the output. I like noploop as a sanity test. Because this processor is 4-wide (instruction prefetch/decode width), it can process a maximum of 4 instructions with every CPU cycle. Since NOPs are the fastest possible instruction (they do nothing), they can be retired at an IPC rate of 4.0. This goes down to 3.97 with a little loop logic (the program is looping over a block of NOPs). The "<not supported>" metrics are cases where the PMC is not currently available (they are outside of the architectural set, in this case). You can also measure the entire system, using perf with -a. This time I'm measuring a software build:
# perf stat -a -- sleep 10

 Performance counter stats for 'system wide':

     641398.723351      task-clock (msec)         #   64.116 CPUs utilized            (100.00%)
           379,651      context-switches          #    0.592 K/sec                    (100.00%)
            51,546      cpu-migrations            #    0.080 K/sec                    (100.00%)
        13,423,039      page-faults               #    0.021 M/sec                  
 1,433,972,173,374      cycles                    #    2.236 GHz                      (75.02%)
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
 1,118,336,816,068      instructions              #    0.78  insns per cycle          (75.01%)
   249,644,142,804      branches                  #  389.218 M/sec                    (75.01%)
     7,791,449,769      branch-misses             #    3.12% of all branches          (75.01%)

      10.003794539 seconds time elapsed
That's reporting an IPC of 0.78. perf can also print statistics over time (-I), but the output becomes verbose. I've written a quick wrapper to clean this up and summarize the architectural PMCs on a single line. It's [pmcarch] \(first version\):
# pmcarch 1
CYCLES        INSTRUCTIONS    IPC BR_RETIRED   BR_MISPRED  BMR% LLCREF      LLCMISS     LLC%
90755342002   64236243785    0.71 11760496978  174052359   1.48 1542464817  360223840  76.65
75815614312   59253317973    0.78 10665897008  158100874   1.48 1361315177  286800304  78.93
65164313496   53307631673    0.82 9538082731   137444723   1.44 1272163733  268851404  78.87
90820303023   70649824946    0.78 12672090735  181324730   1.43 1685112288  343977678  79.59
76341787799   50830491037    0.67 10542795714  143936677   1.37 1204703117  279162683  76.83
[...]
This is from a production instance on EC2, and each line of output is a one second summary.

### Interpreting IPC

For real-world applications, here's how I'd interpret the IPC:

- **IPC < 1**: likely stall cycle bound, also likely memory bound (more PMCs can confirm). Stall cycles are when the CPU isn't making forward progress, likely because it's waiting on memory I/O. In this case, look to tune memory usage: allocate fewer or smaller objects, do zero copy, look at NUMA and memory placement tuning. A [CPU flame graph] will show which code is on-CPU during these stall cycles, and should give clues for where to look for memory usage.
- **IPC > 1**: likely instruction bound. Look to tune instructions: a [CPU flame graph] will show which code is on-CPU doing instructions: find ways to reduce executed code.

You can combine IPC and flame graphs to show everything at once: [CPI flame graphs] \(CPI is IPC inverted\). This requires using the sampling mode of PMCs to capture stack traces on overflow events. There are, however, caveats with doing this which I'll get to in another post. Note that I'm using these on a modern Linux kernel, 4.4+. There was a problem on older kernels (3.x) where PMCs would be measured incorrectly, leading to a bogus IPC measurement.

## RxNetty Study

In 2015 I found PMCs crucial in fully understanding the performance differences between RxNetty and Tomcat as they scaled with client load. Tomcat serves requests using threads for each connection, whereas RxNetty uses event loop threads. Between low and high client counts, Tomcat's CPU cycles per request was largely unchanged, whereas RxNetty became *more* efficient and consumed *less* CPU per request as clients increased. Can you guess why?


IPC and CPU/req as load scales

Click for a slide deck where I explain why on slides 25-27 (these slides are from the [WSPerfLab] repository, summarizing a study by myself, Nitesh Kant, and Ben Christensen.) We knew that we had a 46% higher request rate, and so we began a study to identify and quantify the reasons why. There was 5% caused by X, and 3% caused by Y, and so on. But after weeks of study, we fell short: over 10% of that 46% remained unexplained. I checked and rechecked our numbers, but fell short every time. It was driving me nuts, and casting doubt on everything we'd found so far. With PMCs I was able to identify this last performance difference, and the numbers finally added up! We could break down the 46% difference and explain every percentage point. It was very satisfying. It also emphasized the importance of PMCs: understanding CPU differences is a common task in our industry, and without PMCs you're always missing an important part of the puzzle. This study was done on a physical machine, not EC2, where I'd measured and studied dozens of PMCs. But the crucial PMCs I included in that slide deck summary were the measurements of IPC and the LLC, which are possible with the architectural PMCs now available in EC2.

## How is this even possible in the cloud?

You might be wondering how cloud guests can read PMCs at all. It works like this: PMCs are managed via the privileged instructions RDMSR and WRMSR for configuration (which I wrote about in [The MSRs of EC2]), and RDPMC for reading. A privileged instruction causes a guest exit, which is handled by the hypervisor. The hypervisor can then run its own code, and configure PMCs if the actual hardware allows, and save and restore their state whenever it context switches between guests. Mainstream Xen supported this years ago, with its virtual Performance Monitoring Unit (vPMU). It is configured using vpmu=on in the Xen boot line. However, it is rarely turned on. Why? There are hundreds of PMCs, and they are all exposed with vpmu=on. Could some pose a security risk? A number of papers have been published showing PMC side-channel attacks, whereby measuring certain PMCs while sending input to a known target program can eventually leak bits of the target's state. While these are unlikely in practice, and such attacks aren't limited to PMCs (eg, there's also timing attacks), you can understand a paranoid security policy not wanting to enable all PMCs by default. In the cloud it's even harder to do these attacks, as when Xen context switches between guests it switches out the PMCs as well. But still, why enable all the PMCs if they aren't all needed? Imagine if we could create a whitelist of allowed PMCs for secure environments. This is a question I pondered in late 2015, and I ended up contributing the [x86/VPMU: implement ipc and arch filter flags] patch to Xen to provide two whitelist sets as options, chosen with the vpmu boot flag:

- **ipc**: Enough PMCs to measure IPC only. Minimum set.
- **arch**: The seven architectural PMCs (see table above). Includes IPC.

More sets can be added. For example, I can imagine an extended set to allow some Intel vTune analysis. AWS just enabled architectural PMCs. My patch set might be a useful example of how such a whitelist can be implemented, although how EC2 implemented it might differ from this.

## Conclusion

PMCs are crucial for analyzing a (if not *the*) modern system bottleneck: memory I/O. A set of PMCs are now available on dedicated hosts in the EC2 cloud, enough for high-level analysis of memory I/O issues.
I used them in this post to measure IPC, which can identify if your applications are likely memory bound or instruction bound, directing further tuning efforts. I've worked with PMCs before, and the sort of wins they help you find can range from small single digit percentages to as much as 2x. The net result for companies like Netflix is that our workloads will run faster on EC2 because we can use PMCs to find these performance wins. Consider that, next time someone is comparing clouds by microbenchmarking alone. It's not just out-of-the-box performance that matters, it's also your ability to observe and tune your applications.
A cloud you can't analyze is a slower cloud.
Thanks to those at Netflix for supporting my work on this, the Xen community for their vpmu work, and everyone at Amazon and Intel who made this happen! (Thanks Joe, Matt, Subathra, Rosana, Uwe, Laurie, Coburn, Ed, Mauricio, Steve, Artyom, Valery, Jan, Boris, and more.) (Yes, I'm happy.) [The MSRs of EC2]: http://www.brendangregg.com/blog/2014-09-15/the-msrs-of-ec2.html [Intel 64 and IA-32 Architectures Developer's Manual: vol. 3B]: http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.html [perf]: /perf.html [CPU flame graph]: /FlameGraphs/cpuflamegraph.html [CPI flame graphs]: /blog/2014-10-31/cpi-flame-graphs.html [x86/VPMU: implement ipc and arch filter flags]: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e3cce1799df2f957dfa00f84a5315cbf896490fe [WSPerfLab]: https://github.com/Netflix-Skunkworks/WSPerfLab/tree/master/test-results [pmcarch]: https://github.com/brendangregg/pmc-cloud-tools/blob/master/pmcarch

Exclusive Or Character Josef "Jeff" Sipek

A couple of years ago I blogged about the CCS instruction in the Apollo Guidance Computer. Today I want to tell you about the XC instruction from the System/360 ISA.

Many ISAs have some sort of xor instruction. The 360 is no different. It offers several different xor instructions which differ in the type of operands that they operate on. In all cases, the operation they perform could be summarized as (using C syntax):

A ^= B;

That is, one of the operands is used as both a source and a destination.

There are the boring X (reg ^= memory), XR (reg ^= reg), and XI (memory byte ^= immediate). Then there is XC, which is what inspired this post. XC, or Exclusive Or Character, takes two memory locations and a length and performs what is effectively a byte-by-byte xor of the two buffers. (The hardware is smart enough to operate on bigger chunks of memory, but the effect is as if it were done a byte at a time.) In assembly XC looks like:

XC d1(l,b1),d2(b2)

The d's are 12-bit unsigned displacements, while the b's specify the registers holding the base addresses. For each operand the actual address is dX plus the value of the bX register. The l is a length field which encodes a length between 1 and 256 (the 8-bit field stores the length minus one).

To use more C pseudocode, XC does:

void XC(unsigned char *op1, size_t len, unsigned char *op2)
{
	while (len--) {
		*op1 ^= *op2;
		op1++;
		op2++;
	}
}

(This pseudocode ignores the condition code calculation and exception generation, which are not relevant to the discussion.)

This by itself is neat but not very exciting…until you remember that xor can be used to zero out a register. You can use XC to zero out up to 256 bytes of memory. It turns out this idiom is used pretty often in handwritten assembly, and compilers such as gcc even produce such instructions without any special effort on the programmer's part.

For example, in HVF I have this line:

memset(&psw, 0, sizeof(struct psw));

Which GCC helpfully turns into (struct psw is 16 bytes in size):

xc      160(16,%r15),160(%r15)

When I first saw that line in the disassembly of HVF years ago, it blew my mind. It is elegant, fast thanks to the microarchitectural optimizations, and once you are used to the idiom it is clear what it does. I hope your mind was as blown as mine. Till next time!

USENIX/LISA 2016 Linux bcc/BPF Tools Brendan Gregg's Blog

For USENIX LISA 2016 I gave a talk that was years in the making, on Linux bcc/BPF analysis tools.

"Time to rethink the kernel" - Thomas Graf
Thomas has been using BPF to create new network and application security technologies (project [Cilium]), and to build something that's starting to look like microservices in the kernel ([video]). I'm using it for advanced performance analysis tools that do tracing and profiling. Enhanced BPF might still be new, but it's already delivering new technologies and making us rethink what we can do with the kernel.

My LISA 2016 talk begins with a 15-minute demo, showing the progression from ftrace, to perf\_events, to BPF (due to the audio/video settings, this demo is a little hard to follow in the full video, but there's a separate recording of just the demo here: [Linux tracing 15 min demo]). Below is the full talk video (youtube):
The slides are on [slideshare] \([PDF]\):
## Installing bcc/BPF

To try out BPF for performance analysis you'll need to be on a newer kernel: at least 4.4, preferably 4.9. The main front end is currently [bcc] (BPF compiler collection), and there are [install instructions] on github, which keep getting improved. For Ubuntu, installation is:
echo "deb [trusted=yes] https://repo.iovisor.org/apt/xenial xenial-nightly main" | sudo tee /etc/apt/sources.list.d/iovisor.list
sudo apt-get update
sudo apt-get install bcc-tools
There's currently a pull request to add snap instructions, as there are nightly builds for snappy as well.

## Listing bcc/BPF Tools

This install will add various performance analysis and debugging tools to /usr/share/bcc/tools. Since some require a very recent kernel (4.6, 4.7, or 4.9), there's a subdirectory, /usr/share/bcc/tools/old, which has older versions of the same tools that work on Linux 4.4 (albeit with some caveats).
# ls /usr/share/bcc/tools
argdist       cpudist            filetop         offcputime   solisten    tcptop    vfsstat
bashreadline  cpuunclaimed       funccount       offwaketime  sslsniff    tplist    wakeuptime
biolatency    dcsnoop            funclatency     old          stackcount  trace     xfsdist
biosnoop      dcstat             gethostlatency  oomkill      stacksnoop  ttysnoop  xfsslower
biotop        deadlock_detector  hardirqs        opensnoop    statsnoop   ucalls    zfsdist
bitesize      doc                killsnoop       pidpersec    syncsnoop   uflow     zfsslower
btrfsdist     execsnoop          llcstat         profile      tcpaccept   ugc
btrfsslower   ext4dist           mdflush         runqlat      tcpconnect  uobjnew
cachestat     ext4slower         memleak         runqlen      tcpconnlat  ustat
cachetop      filelife           mountsnoop      slabratetop  tcplife     uthreads
capable       fileslower         mysqld_qslower  softirqs     tcpretrans  vfscount
Just by listing the tools, you might spot something you want to start with (ext4*, tcp*, etc), or you can browse the bcc tracing tools diagram in the talk slides.
## Using bcc/BPF

If you don't have a good starting point, in the [bcc Tutorial] I included a generic checklist of the first eleven tools to try. I also included this in my LISA talk:
  1. execsnoop
  2. opensnoop
  3. ext4slower (or btrfs*, xfs*, zfs*)
  4. biolatency
  5. biosnoop
  6. cachestat
  7. tcpconnect
  8. tcpaccept
  9. tcpretrans
  10. runqlat
  11. profile
Most of these have usage messages, and are easy to use. They'll need to be run as root. For example, execsnoop to trace new processes:
# /usr/share/bcc/tools/execsnoop
PCOMM            PID    PPID   RET ARGS
grep             69460  69458    0 /bin/grep -q g2.
grep             69462  69458    0 /bin/grep -q p2.
ps               69464  58610    0 /bin/ps -p 308
ps               69465  100871   0 /bin/ps -p 301
sleep            69466  58610    0 /bin/sleep 1
sleep            69467  100871   0 /bin/sleep 1
run              69468  5160     0 ./run
[...]
And biolatency to record an in-kernel histogram of disk I/O latency:
# /usr/share/bcc/tools/biolatency 
Tracing block device I/O... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 64       |**********                              |
       512 -> 1023       : 248      |****************************************|
      1024 -> 2047       : 29       |****                                    |
      2048 -> 4095       : 18       |**                                      |
      4096 -> 8191       : 42       |******                                  |
      8192 -> 16383      : 20       |***                                     |
     16384 -> 32767      : 3        |                                        |
Here's its USAGE message:
# /usr/share/bcc/tools/biolatency -h
usage: biolatency [-h] [-T] [-Q] [-m] [-D] [interval] [count]

Summarize block device I/O latency as a histogram

positional arguments:
  interval            output interval, in seconds
  count               number of outputs

optional arguments:
  -h, --help          show this help message and exit
  -T, --timestamp     include timestamp on output
  -Q, --queued        include OS queued time in I/O time
  -m, --milliseconds  millisecond histogram
  -D, --disks         print a histogram per disk device

examples:
    ./biolatency            # summarize block I/O latency as a histogram
    ./biolatency 1 10       # print 1 second summaries, 10 times
    ./biolatency -mT 1      # 1s summaries, milliseconds, and timestamps
    ./biolatency -Q         # include OS queued time in I/O time
    ./biolatency -D         # show each disk device separately
In /usr/share/bcc/tools/doc, or the [tools subdirectory] on github, you'll find \_example.txt files for every tool, which have example output and discussion. Check them out! There are also man pages under man/man8. For more information, please watch my LISA talk at the top of this post when you get a chance, where I explain Linux tracing, BPF, and bcc, and tour various tools.

## What's Next?

My prior talk at LISA 2014 was [New Tools and Old Secrets (perf-tools)], where I showed similar performance analysis tools using ftrace, an older tracing framework in Linux. I'm still using ftrace, not just on older kernels, but also when it's more efficient (eg, kernel function counting using the funccount tool). BPF is programmatic, and can do things that ftrace can't. Having done ftrace at LISA 2014 and BPF at LISA 2016, you might wonder what I'll propose for LISA 2018. We'll see. I could be covering a higher-level BPF front end (eg, [ply], if it gets finished), or a BPF GUI (eg, via Netflix Vector), or I could be focused on something else entirely. Tracing was my priority when Linux lacked various capabilities, but now that's done, there are other important technologies to work on...

[youtube]: https://www.youtube.com/watch?v=GsMs3n8CB6g
[Linux tracing 15 min demo]: https://www.youtube.com/watch?v=GsMs3n8CB6g
[Linux tracing in 15 minutes]: /blog/2016-12-27/linux-tracing-in-15-minutes.html
[PDF]: /Slides/LISA2016_BPF_tools_16_9.pdf
[slideshare]: http://www.slideshare.net/brendangregg/linux-4x-tracing-tools-using-bpf-superpowers
[previous post]: /blog/2017-04-23/usenix-lisa-2013-flame-graphs.html
[video]: https://www.youtube.com/watch?v=ilKlmTDdFgk
[Cilium]: https://github.com/cilium/cilium
[New Tools and Old Secrets (perf-tools)]: /blog/2015-03-17/usenix-lisa-2014-linux-ftrace-perf-tools.html
[ply]: https://github.com/iovisor/ply
[install instructions]: https://github.com/iovisor/bcc/blob/master/INSTALL.md
[tools subdirectory]: https://github.com/iovisor/bcc/tree/master/tools
[bcc Tutorial]: https://github.com/iovisor/bcc/blob/master/docs/tutorial.md
[bcc]: https://github.com/iovisor/bcc

OmniTribblix The Trouble with Tribbles...

In Tribblix, it's a basic principle that I ship upstream software unmodified. I don't impose my own views on installation layout, nor do I customize it. Generally, I apply patches only to make stuff compile.

This means that what you see in Tribblix is exactly what the upstream author intended, and not some distro-specific bastardization of it.

It also makes my life easier: I don't have to maintain patches, and updating software is much easier if it's unmodified.

In particular, I use an absolutely vanilla illumos-gate. (For a long time it differed only in that I had the fix for 5188 applied, relevant because Tribblix actually uses SVR4 packaging, but now that the fix has been integrated I don't even need to do that.)

Again, this makes my life easier. (When you're maintaining a distro on your own in your spare time, making decisions that simplify your job is essential.)

But it also has another benefit: because I have no "special" features that I've added, I'm not tied to one particular version or variant or commit of illumos. Any version of illumos-gate will do just fine. When it comes time to make a release, I just clone the gate, build, and go.
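For the curious, that's essentially the standard illumos-gate nightly build (a rough sketch: the env-file settings, required build tools, and the Tribblix-specific packaging steps are omitted; see the illumos build documentation for the real prerequisites):

git clone https://github.com/illumos/illumos-gate.git
cd illumos-gate
cp usr/src/tools/env/illumos.sh .
# edit illumos.sh to suit (compilers, NIGHTLY_OPTIONS, paths, ...)
# then kick off the build (requires the onbld build tools to be installed)
/opt/onbld/bin/nightly illumos.sh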

What I could do, then, is build an instance of Tribblix atop some other fork of the gate. For example, illumos-omnios.

I did just that: built the gate (it needed a couple of changes to Makefiles because of the way that perl and snmp are slightly different in OmniOS than they are in Tribblix), created packages, built an ISO, and booted and installed it in VirtualBox.

As expected, it just works.

But just demonstrating that it works isn't really the reason I wanted to do this. What I'm really after is the LX brand, which has been integrated into current OmniOS.

Installing an LX zone requires a Linux image. The original (Joyent) work was built around their own deployment mechanism, using ZFS images, but OmniOS also supports installing from a tarball, so as soon as LX was available there that's what I used. The easiest way to create a Linux image is to create a Docker container set up the way you like it, and then export it to a tarball. I did that for Alpine and installed a zone based on that.
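Roughly, that image-creation step looks like this (a sketch: the container and file names are placeholders, the zone must first be configured with the lx brand via zonecfg, and the install options should be checked against the OmniOS LX zone documentation):

# build a root filesystem tarball from an Alpine container
docker create --name alpine-lx alpine:latest
docker export alpine-lx > alpine.tar
# install the (already configured) lx-branded zone from the tarball
# (illustrative; check zoneadm(1M) and the OmniOS docs for the exact options)
zoneadm -z lx1 install -s /path/to/alpine.tar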

Then you can do very simple things like:

# zlogin lx1 /bin/uname -a 
Linux lx1 4.4 BrandZ virtual linux x86_64 Linux

It's an attractive idea to simply use this as the base for the next Tribblix release. However, that requires illumos-omnios to be supported in the long term, which is currently at risk.

Modern Mercurial Josef "Jeff" Sipek

I’ve been using both Git and Mercurial since they were first released in 2005. I’ve messed with the internals of both, but I always had a preference for Mercurial (its user interface is cleaner, its design is well thought-out, and so on). So, it should be no surprise that I felt a bit sad every time I heard that some project chose Git over Mercurial (or worse yet, migrated from Mercurial to Git). At the same time, I could see Git improving release after release—but Mercurial did not seem to. Seem is the operative word here.

A couple of weeks ago, I realized that more and more of my own repositories have been Git based. Not for any particular reason other than that I happened to type git init instead of hg init. After some reflection, I decided that I should convert a number of these repositories from Git to Mercurial. The conversion itself was painless thanks to the most excellent hggit extension that lets you clone, pull, and push Git repositories with Mercurial. (I just cloned the Git repository with a hg clone and then cleaned up some of the mess manually—for example, I don’t need the bookmark corresponding to the one and only branch in the original Git repository.) Then the real fun began.

I resumed work on my various projects, but now with the brand-new Mercurial repositories. Soon after, I started hitting various quirks with the Mercurial UI, and realized that the workflow I was using wasn't really aligned with it. Undeterred, I looked for solutions. I enabled the pager extension, the color extension, overrode some of the default colors to be less offensive (and easier to read), and enabled the shelve, rebase, and histedit extensions to (along with mq) let me do some minor history rewriting while I iteratively work on changes. (I learned about and switched to the evolve extension soon after.) With each tweak, the user experience got better and better.
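For the record, the relevant part of an ~/.hgrc for that kind of setup might look something like this (a sketch; these are the standard bundled extensions, and third-party ones such as hggit or evolve are enabled the same way once installed):

[extensions]
pager =
color =
shelve =
rebase =
histedit =
mq =

[pager]
pager = less -FRX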

Then it suddenly hit me—before these tweaks, I had been using Mercurial like it’s still 2005!

I think this is a very important observation. Mercurial didn’t seem to be improving because none of the user-visible changes were forced onto the users. Git, on the other hand, started with a dreadful UI so it made sense to enable new features by default to lessen the pain.

One could say that Mercurial took the Unix approach—simple and not exactly friendly by default, but incredibly powerful if you dig in a little. (This extensibility is why Facebook chose Mercurial over Git as a Subversion replacement.)

Now I wonder if some of the projects chose Git over Mercurial at least partially because by default Mercurial has been a bit…spartan.

With my .hgrc changes, I get exactly the information I want in a format that’s even better than what Git provided me. (Mercurial makes so much possible via its templating engine and the revsets language.)
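For example, a query along these lines (the user name is a placeholder) combines a revset with a custom template:

hg log -r 'branch(default) and user("jeff")' \
    --template '{node|short} {date|shortdate} {desc|firstline}\n'

The revset selects which changesets to show, and the template controls exactly how each one is printed.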

So, what does all this mean for Mercurial? It's hard to say, but I'm happy to report that there are a number of good improvements that should land in the upcoming 4.2 release scheduled for early May. For example, the pager and color functionality is moving into the core and will be on by default.

Finally, I like my current Mercurial environment quite a lot. The hggit extension is making me seriously consider using Mercurial when dealing with Git repositories that I can’t convert.