Dropping Linux and KVM in Favor of FreeBSD and Jails


Before the Overhaul…

As mentioned previously, most of the services that I run out of my house have been via virtual machines. For a long while, I was using VMWare’s ESXi, but I got tired of its stupid Windows admin tool, and also tired of its restrictions on what hardware it would run on. I had a Mac Pro 5,1 that I’d replaced both Xeon CPUs with something faster, added a bunch of RAM, and packed it full of disks. Previously, I’d been using that to edit photos and videos, but I upgraded that system to something much faster, so I reassigned the Mac Pro to server duty. I installed CentOS on it, taught myself how to drive KVM, and moved all of my VMs over to it.

That ran very well, and probably would still be running well today. I had a VM for joker, a VM for my email server (riddler), a VM for a TeamSpeak server I was running for my gaming clan (clayface), and a VM for a website I was working on (penguin). Most VMs were FreeBSD except the TeamSpeak one, but I had plans to convert that one over, too.

I Love FreeBSD, but it’s Not a Hypervisor

I really wanted to convert the Mac Pro over to FreeBSD and run a hypervisor on it. The problem: KVM is no longer supported on FreeBSD. The hypervisor of choice under FreeBSD is BHyve. It’s… suboptimal. If the intention is to only ever run FreeBSD VMs under it, it may be a decent-enough solution. But add a Linux VM to it, and things get ugly quickly. There’s a multi-stage configuration and boot-up process for each Linux VM, as opposed to something like KVM that just spins the VM up and starts loading whatever is on the disk image. This appears to be a conscious decision by the BHyve developers, and that’s fine. I’ll pass.

Do I Need a Hypervisor?

My main VM: joker was running all of my local services. Things I used for me, like my IMAP server, my ssh login, a lot of my anti-spam fighting, etc. Everything else: running one process. Riddler: sendmail. Clayface: the TeamSpeak server. Penguin? Apache. Do these all need VMs? The answer is no. They’re processes that can be thrown in a jail.

With that, I set out to install FreeBSD 10.2 on my Mac Pro, and I failed. Every attempt to boot the FreeBSD installer thumb drive was met with complete failure. Nothing I did would make that Mac boot from it. With that, I decided to retire the Mac and build a new machine.

Joker Gets Rebuilt

I decided to go somewhat all-out on the new server build. I also decided I’d call it joker, and continue using it for all of my general stuff. Each of the other former VMs would end up in their own jail, and I’d likely add a couple more. With that:

Supermicro X10DAI-O motherboard
2 x Intel E5-2640 v3 Xeons
2 x Noctua NH-D9L CPU coolers
128GB DDR4 RAM
Cooler Master HAF-X server case and 1000W power supply

I figured I’d use two of the 300GB 10K RPM WD SATA drives, mirror them, and make them the machine’s zroot. Four other 7200RPM 1TB drives would be put together into a RAID10 volume and used for /local.

Why so much hardware and money spent? The intention is to leave this one alone for a long time. And leave it running for a long time. Xeons aren’t super-fast processors, but they’re a bit more reliable than the Core i7 consumer chips. They’re meant to be server chips, and that’s what I’m using them as. I also have some video processing that I can do via command line (with that wonderful application called ffmpeg) that will benefit from the large number of CPU cores. Each 2640 is an 8-core chip, meaning a total of 16 physical cores, or 32 logical cores with hyperthreading.

Ultimately, I don’t really have to justify it to anyone. So I’m not going to, any further.

Trouble Before Paradise?

As mentioned in the Storage Server post, I’m a fan of Supermicro’s motherboards for server duty. But, to be fair: I’ve only ever put them in Supermicro chassis. This was the first attempt to put one in someone else’s chassis. Supermicro claims that the board is EATX-sized. The assumption when they say that is that it also has stand-off holes where an EATX board should. This one doesn’t. Think of the corner of a given motherboard where all of the external devices connect. That entire corner is unsupported because its stand-off hole does not line up with an EATX chassis. There were 1 or 2 other stand-off holes that didn’t line up with anything on the chassis. I know this is Supermicro’s fault because I’m also a big fan of Cooler Master’s chassis. They don’t mess stuff like that up. If they say a motherboard stand-off should be here, here, and here, it better be there.

Fortunately, it didn’t make enough of a difference to matter. The motherboard would be standing on end inside the chassis, meaning that corner wouldn’t have any (or much?) mass on it. Further, since the motherboard has no onboard video, I had to toss in an old nVidia video card. That ended up serving as another anchor for the board, as did a pair of Intel PCI-E GigE NICs. The rest of the stand-offs worked perfectly and the board was secured in the case stably enough, even with the mass of 2 CPUs and their respective heat sinks and fans attached.

System First Boot and Install

Damn I’m good. I mean, really: I’m good. Or… I’ve just done this too many times. The system booted right up, beeped a few times as the motherboard realized it had 2 CPUs to contend with, rebooted a couple of times, and then came right up into its BIOS. I set the date and time, told it to boot off the USB thumb drive, and away it went.

Like when I installed the NAS, this FreeBSD install took all of a few minutes. I told the installer to create a zroot install, mirroring both 300GB drives together. I installed the base, source, and documentation packages off the thumb drive. Like the NAS, I also didn’t bother letting the installer worry about the networking as I’d be creating a couple of LAGG interfaces on my own. I finished up, and soon enough a new joker was born.

Networking

The Switch

As outlined in my NAS post, I have a managed Cisco switch that sits with the servers in my basement. Not only does it support 802.3ad bundling, but it also supports VLANs. While this is a server-based blog post, the config for the switch interfaces might be of interest to some:

interface gigabitethernet1/1/3
 description "joker : em0 : po3"
 channel-group 3 mode auto
!
interface gigabitethernet1/1/11
 description "joker : igb0 : po5"
 channel-group 5 mode auto
!
interface gigabitethernet1/1/15
 description "joker : em1 : po3"
 channel-group 3 mode auto
!
interface gigabitethernet1/1/23
 description "joker : igb1 : po5"
 channel-group 5 mode auto
!
interface Port-channel3
 description "joker : lagg1 : g1/1/3, g1/1/15"
 switchport mode access
 switchport access vlan 770
!
interface Port-channel5
 description "joker : lagg0 : g1/1/11, g1/1/23"
 switchport mode access
 switchport access vlan 690
!

The Server

The motherboard’s Intel GigE interfaces showed up as igb0 and igb1. The add-in PCI-E NICs: em0 and em1. I bundled those guys into 2 different LAGGs in /etc/rc.conf:

cloned_interfaces="lagg0 lagg1"
# Now create and configure the actual laggs
# lagg0 (public)
ifconfig_lagg0="inet XX.YY.ZZ.210/24 laggproto lacp laggport igb0 laggport igb1"
ifconfig_lagg0_ipv6="inet6 [REDACTED] prefixlen 64"
defaultrouter="XX.YY.ZZ.1"
ipv6_defaultrouter="[REDACTED]"
# lagg1 (private)
ifconfig_lagg1="inet 192.168.10.1/24 laggproto lacp laggport em0 laggport em1"
ifconfig_lagg1_ipv6="inet6 [REDACTED] prefixlen 64"
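
One thing the listing above doesn't show: the FreeBSD Handbook's lagg examples also bring each member port up explicitly. The lines below are a sketch of what that would look like for these four ports. They are an assumption about the rest of the file, not a copy of it.

```shell
# Assumed additions to /etc/rc.conf (not shown in the original config).
# Each physical port that belongs to a lagg is simply brought up;
# the addresses live on lagg0 and lagg1, not on the member ports.
ifconfig_igb0="up"
ifconfig_igb1="up"
ifconfig_em0="up"
ifconfig_em1="up"
```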

Then a couple of commands to make sure it all took:

service netif restart
service routing restart

joker$ ping 192.168.10.254
PING 192.168.10.254 (192.168.10.254): 56 data bytes
64 bytes from 192.168.10.254: icmp_seq=0 ttl=64 time=0.175 ms
64 bytes from 192.168.10.254: icmp_seq=1 ttl=64 time=0.150 ms
64 bytes from 192.168.10.254: icmp_seq=2 ttl=64 time=0.153 ms
64 bytes from 192.168.10.254: icmp_seq=3 ttl=64 time=0.148 ms
64 bytes from 192.168.10.254: icmp_seq=4 ttl=64 time=0.152 ms
^C
--- 192.168.10.254 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.148/0.156/0.175/0.010 ms
joker$ ping XX.YY.ZZ.1
PING XX.YY.ZZ.1 (XX.YY.ZZ.1): 56 data bytes
64 bytes from XX.YY.ZZ.1: icmp_seq=0 ttl=63 time=12.315 ms
64 bytes from XX.YY.ZZ.1: icmp_seq=1 ttl=63 time=2.645 ms
64 bytes from XX.YY.ZZ.1: icmp_seq=2 ttl=63 time=1.993 ms
64 bytes from XX.YY.ZZ.1: icmp_seq=3 ttl=63 time=2.596 ms
64 bytes from XX.YY.ZZ.1: icmp_seq=4 ttl=63 time=2.069 ms
^C
--- XX.YY.ZZ.1 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 1.993/4.324/12.315/4.004 ms
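
A ping proves the lagg passes traffic, but it will also succeed with only one member port in the bundle. A rough way to check that both ports negotiated LACP is to count the member lines that carry the ACTIVE flag in the ifconfig output. The helper below is a sketch; count_active_ports is a made-up name, and it assumes ifconfig prints one "laggport:" line per member with its flags, as FreeBSD's normally does.

```shell
# count_active_ports: read `ifconfig laggN` output on stdin and print how
# many member ports report the ACTIVE flag.
count_active_ports() {
    grep -c 'laggport:.*ACTIVE'
}

# Intended use on the server itself (needs a real lagg0, so left commented):
#   ifconfig lagg0 | count_active_ports    # 2 would mean both igb ports joined
```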

Dual Homed

I consciously and carefully dual-home joker. I know that it’s a security risk, and I don’t normally recommend that folks do that with a server. But I’m confident enough in the security I have in and around joker that I’m not as concerned about it. Again: I would not recommend doing this unless you know you can trust your security.

Filesystem Setup

Remember me mentioning those 4 7200RPM 1TB drives?

zpool create local mirror ada2 ada3 mirror ada4 ada5

And afterward:

joker# zpool status local
  pool: local
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	local       ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    ada2    ONLINE       0     0     0
	    ada3    ONLINE       0     0     0
	  mirror-1  ONLINE       0     0     0
	    ada4    ONLINE       0     0     0
	    ada5    ONLINE       0     0     0

errors: No known data errors
joker# df /local
Filesystem    Size    Used   Avail Capacity  Mounted on
local         1.8T    1.0G    1.7T     0%    /local
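
For anyone wondering why four 1TB drives show up as 1.8T: two mirrors striped together give the capacity of two drives, and df reports binary units while drive vendors count in powers of ten. The arithmetic can be sketched like this (usable_tib is a hypothetical helper name, and ZFS's own overhead is ignored):

```shell
# usable_tib MIRRORS DRIVE_BYTES: usable capacity of a stripe of mirrors,
# in binary terabytes (2^40 bytes). Each mirror contributes one drive's worth.
usable_tib() {
    awk -v m="$1" -v b="$2" 'BEGIN { printf "%.2f\n", m * b / 2^40 }'
}

# Two mirrors of 1 TB (10^12 byte) drives:
usable_tib 2 1000000000000   # prints 1.82
```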

My intent was to make /usr/local actually point to /local. I’d run the pkg command prior to making the /local zpool, so /usr/local already had stuff in it. That was simple to address:

cd /usr/local
tar cf - . | (cd /local ; tar xpvf -)
cd ..
rm -Rf local
ln -sf /local .

And done.

joker# cd /usr/local
joker# df .
Filesystem    Size    Used   Avail Capacity  Mounted on
local         1.8T    1.0G    1.7T     0%    /local
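
The same move can be wrapped in a couple of small functions that refuse to delete anything until the copy checks out. This is a sketch rather than what was run here: copy_tree and same_file_list are hypothetical names, the check only compares path names (not file contents), and it assumes the destination starts out empty.

```shell
# copy_tree SRC DST: copy everything under SRC into DST, preserving
# permissions, using a tar-to-tar pipe.
copy_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    (cd "$src" && tar cf - .) | (cd "$dst" && tar xpf -)
}

# same_file_list SRC DST: succeed only if both trees contain the same paths.
same_file_list() {
    a=$(cd "$1" && find . | sort)
    b=$(cd "$2" && find . | sort)
    [ "$a" = "$b" ]
}

# Intended use (commented out because it rewrites /usr/local):
#   copy_tree /usr/local /local && same_file_list /usr/local /local \
#       && rm -Rf /usr/local && ln -s /local /usr/local
```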

Packages

Over the course of using my joker FreeBSD virtual machine, I built up a list of packages that I needed installed such as Bind, Apache, PHP, Sendmail, etc. I won’t bore you with a complete list of them, but suffice it to say, I used the pkg command on the new joker to quickly add them all. Before too long, I had sendmail with Neal’s spamilter running, Dovecot for my IMAP mail retrieval, Bind 9.10 for name resolution both internally and externally, along with a few other services. And of course, I locked sshd down so that it wouldn’t respond to passwords, only ssh keys.
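
On the sshd point, a quick sanity check is to ask sshd for its effective configuration with "sshd -T" and make sure nothing password-like is still switched on. On FreeBSD that usually means looking at more than PasswordAuthentication, since PAM can still prompt for a password through challenge-response (called keyboard-interactive in newer OpenSSH releases). The function below is a sketch with a made-up name, and it assumes the lowercase "keyword value" format that sshd -T prints.

```shell
# password_logins_disabled: read `sshd -T` output on stdin and succeed only
# if no password-style authentication method is set to "yes".
password_logins_disabled() {
    ! grep -Eq '^(passwordauthentication|challengeresponseauthentication|kbdinteractiveauthentication) yes$'
}

# Intended use, as root on the server (left commented):
#   sshd -T | password_logins_disabled && echo "keys only"
```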

NFS Mount

I also followed the client steps in this post to get joker to mount bane’s exported filesystem on /opt. And with that mounted, I soft-linked /home over to it, then added my local user, along with a few others to joker’s /etc/passwd file.

A Note About User Management

I’m going to rant a bit here about user management. Most of the UNIX or UNIX-like operating systems that I’m familiar with have scripts that root users can run to add, delete, or change user attributes. One of the common things I see with them all is automatically making the UID and GID the same. So for each user that exists on the server, there also exists a matching group.

Why? I don’t ask because of any concern with resource wasting or whatnot. I ask because: Why? Why do that? If you have a group of what I call stupid-users (ie: non-root), why not have them all in the same GID? It’s kinda what groups are for!

Further, every single systems administrator better feel 100% comfortable and confident in themselves editing /etc/passwd with tools such as vipw. They should also be able to edit the /etc/group without fear and without fucking something up. If you as a SysAd can’t do either of those or are afraid to, step away from the keyboard. Seriously. These are very simple skills that you must understand how to do. Or you’re a hair’s breadth away from being replaced with a shell script.

I never use those silly scripts. The first thing I do is edit the group file, find an unused GID, and create a “users” group. I then edit the passwd file with vipw to add my user. The UID I use is the same one I’ve had since my AOL days (yes, I still remember it), and the GID is the one I just created. The user gets saved without a password, which I immediately fix with the passwd command.
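
Finding an unused GID is the only part of that routine that takes any squinting. A small helper can do the looking while the actual edit stays manual. This is a sketch: next_free_gid is a hypothetical name, and the starting GID of 1000 in the usage line is an arbitrary choice, not the one used here.

```shell
# next_free_gid GROUP_FILE START: print the lowest GID >= START that does
# not appear in the third field of GROUP_FILE.
next_free_gid() {
    awk -F: -v gid="$2" '
        $1 !~ /^#/ && $3 != "" { used[$3] = 1 }   # remember every GID in use
        END { while (gid in used) gid++; print gid }
    ' "$1"
}

# Intended use (read-only, so safe to run anywhere with an /etc/group):
#   next_free_gid /etc/group 1000
```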

These are things you should be able to do, too. Stop using those stupid scripts. And to the folks writing them: stop writing them. You’re doing beginners a disservice because they’ll never learn the joys of screwing up a system editing /etc/passwd and how to fix it all.

But, that’s just my opinion. Back to the story.

Jails for Public Facing Services

Prep the Config File

With joker up and running the way I wanted, I needed to get my former riddler virtual machine built as a jail reasonably quickly. Otherwise I wouldn’t be able to send any outbound mail. An exercise I worked on before the physical machine was even built was to ready the /etc/jail.conf file so that I could just build filesystems on the new machine, start the jails, and go. The file:

exec.start = "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";
exec.clean;
mount.devfs;
mount.fdescfs;
mount.procfs;
allow.raw_sockets = 1;

path = "/local/jails/$name";

# Riddler for outbound smtp with auth
riddler {
	host.hostname = "riddler.lateapex.net";
	ip4.addr += "lagg0|XX.YY.ZZ.217/32";
	ip4.addr += "lo1|127.0.0.2/32";
	ip6.addr += "lagg0|[REDACTED]::217/128";
	ip6.addr += "lo1|::2/128";
}

# Penguin
penguin {
	host.hostname = "penguin.lateapex.net";
	ip4.addr += "lagg0|XX.YY.ZZ.215/32";
	ip4.addr += "lo2|127.0.0.3/32";
	ip6.addr += "lagg0|[REDACTED]::215/128";
	ip6.addr += "lo2|::3/128";
}

# Clayface:
clayface {
	host.hostname = "clayface.lateapex.net";
	ip4.addr += "lagg0|XX.YY.ZZ.216/32";
	ip4.addr += "lo3|127.0.0.4/32";
	ip6.addr += "lagg0|[REDACTED]::216/128";
	ip6.addr += "lo3|::4/128";
}

# catwoman
catwoman {
	host.hostname = "catwoman.lateapex.net";
	ip4.addr += "lagg0|XX.YY.ZZ.219/32";
	ip4.addr += "lo4|127.0.0.5/32";
	ip6.addr += "lagg0|[REDACTED]::219/128";
	ip6.addr += "lo4|::5/128";
}

# scarecrow
scarecrow {
	host.hostname = "scarecrow.lateapex.net";
	ip4.addr += "lagg0|XX.YY.ZZ.212/32";
	ip4.addr += "lo5|127.0.0.6/32";
	ip6.addr += "lagg0|[REDACTED]::212/128";
	ip6.addr += "lo5|::6/128";
}

I know it’s silly of me to try to censor the IP addresses. If you’re reading this blog, you’re actually connected to one of them. Astute readers will be able to figure out which one, and what its IP address actually is.
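
One practical note on the config: jail(8) adds each address to the interface named in front of it, but it does not create that interface. The lo1 through lo5 loopbacks therefore have to exist before the jails start. The post doesn't show how they were created, so the line below is an assumption: the usual approach is to clone them in /etc/rc.conf next to the laggs.

```shell
# Assumed /etc/rc.conf line (not shown in the original): clone the extra
# loopback interfaces named in the jail config alongside the two laggs.
cloned_interfaces="lagg0 lagg1 lo1 lo2 lo3 lo4 lo5"
```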

Get the Filesystems Ready

I’ve read plenty of guides and suggestions for combining the read-only sections of a bunch of jails’ / and /usr filesystems into a single directory. It seems like a waste of time in my opinion, so I skipped that and just created new zfs filesystems in /local for them all.

zfs create local/jails
zfs create local/jails/riddler
zfs create local/jails/scarecrow
zfs create local/jails/clayface
zfs create local/jails/catwoman
zfs create local/jails/penguin

The directory /usr/freebsd-dist on the installation media has two files that I knew I’d be referencing with each of the jails. So I copied them over to the /local/jails base directory. That way they’d be handy if I needed to create another jail in the future and didn’t have the installation media mounted. Speaking of which, the installation media is on /opt (which is NFS-mounted from bane) in a directory called iso. Mount that on /media/freebsd and then copy the required files over:

mount -t cd9660 /dev/`mdconfig -a -t vnode -f /opt/iso/freebsd-10.2.iso` /media/freebsd
cp /media/freebsd/usr/freebsd-dist/base.txz /local/jails
cp /media/freebsd/usr/freebsd-dist/src.txz /local/jails
umount /media/freebsd

Now base.txz and src.txz are on the local filesystem. If and when I upgrade to a newer version of FreeBSD, I can easily replace those files.
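
One loose end with the commands above: umount releases the filesystem, but the md device that mdconfig attached stays around until it is detached with "mdconfig -d". A sketch of the same steps as a function that cleans up after itself follows. fetch_dist_sets is a hypothetical name, and the default mount point is simply the one used above.

```shell
# fetch_dist_sets ISO DEST [MOUNTPOINT]: attach ISO as a memory disk, copy
# base.txz and src.txz into DEST, then unmount and detach the memory disk.
fetch_dist_sets() {
    iso=$1
    dest=$2
    mnt=${3:-/media/freebsd}
    md=$(mdconfig -a -t vnode -f "$iso") || return 1   # prints e.g. md0
    mkdir -p "$mnt"
    if mount -t cd9660 "/dev/$md" "$mnt"; then
        cp "$mnt/usr/freebsd-dist/base.txz" "$mnt/usr/freebsd-dist/src.txz" "$dest"
        status=$?
        umount "$mnt"
    else
        status=1
    fi
    mdconfig -d -u "$md"    # detach the md device whether or not the copy worked
    return "$status"
}

# Intended use, as root on FreeBSD (left commented):
#   fetch_dist_sets /opt/iso/freebsd-10.2.iso /local/jails
```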

Build the Jails

Simple step, just takes a bit of time. From the /local/jails directory:

tar xpvf base.txz -C scarecrow
tar xpvf base.txz -C catwoman
tar xpvf base.txz -C clayface
tar xpvf base.txz -C penguin
tar xpvf base.txz -C riddler
tar xpvf src.txz -C riddler
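
With more jails, the repeated tar lines turn into a loop easily enough. The function below is a sketch of that, not what was typed here; populate_jails is a made-up name, and it expects the per-jail directories to exist already.

```shell
# populate_jails ARCHIVE JAILS_DIR NAME...: unpack ARCHIVE into
# JAILS_DIR/NAME for every jail name given, stopping at the first failure.
populate_jails() {
    archive=$1
    jails_dir=$2
    shift 2
    for jail in "$@"; do
        tar xpf "$archive" -C "$jails_dir/$jail" || return 1
    done
}

# Intended use with the layout above (left commented):
#   populate_jails /local/jails/base.txz /local/jails \
#       scarecrow catwoman clayface penguin riddler
```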

Start the Jails

Add the following to /etc/rc.conf:

# Get the jails going
jail_enable="YES"

And then fire them up:

service jail start

With that, I had 5 jails running, doing nothing but syslogging to themselves:

joker# jls
   JID  IP Address      Hostname                      Path
     6  XX.YY.ZZ.216  clayface.lateapex.net         /local/jails/clayface
     7  XX.YY.ZZ.217  riddler.lateapex.net          /local/jails/riddler
     8  XX.YY.ZZ.215  penguin.lateapex.net          /local/jails/penguin
     9  XX.YY.ZZ.219  catwoman.lateapex.net         /local/jails/catwoman
    10  XX.YY.ZZ.212  scarecrow.lateapex.net        /local/jails/scarecrow

To Be Continued

This is getting a bit long-winded, so I’m going to stop here and continue it with another post soon. Stay tuned!


JasonVanPatten.com

Jason's Various Ramblings
