Microstack is a stripped-down version of Openstack distributed by Canonical, less customizable and with a lot of Sane Defaults. Normally I’m not a fan of Canonical, but since I’m not familiar with openstack, this seems like a good starting place.
And honestly, microk8s wasn't such a bad thing, and this is the same deal, right?
On Ubuntu, install microstack from the beta channel:
sudo snap install microstack --devmode --beta
Now configure openstack:
sudo microstack init --auto --control
That seems to have worked out.
You might also need the client programs? I haven’t used them, I don’t think.
sudo snap install openstackclients
Finally, if you want to look at a GUI, you can get dashboard access:
sudo snap get microstack config.credentials.keystone-password
Go grab an Ubuntu image. The kvm image is ⧉here, but it did not work for me, giving
GRUB_FORCE_PARTUUID set, attempting initrdless boot
and locking up. ⧉This forum suggested that I instead use the non-kvm one, and it worked.
wget https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img
sudo cp jammy-server-cloudimg-amd64.img /var/snap/microstack/common/images/
Because snap-installed programs can only access certain directories, you have to place the file at /var/snap/microstack/common/images/jammy-server-cloudimg-amd64.img.
Then create the image like this:
microstack.openstack image create ubuntu-jammy-nokvm \
--disk-format qcow2 \
--file /var/snap/microstack/common/images/jammy-server-cloudimg-amd64.img \
--public
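If you want to sanity-check that the image registered before booting anything:
microstack.openstack image list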
Great, now we can run it:
microstack.openstack server create --flavor m1.medium --image ubuntu-jammy-nokvm --nic net-id=831a8a23-4a6c-40a1-8435-620412144195 --key-name own --security-group 0d57b7c3-2e6c-4eab-be2c-319ecba62c7f test99
You will need to find the correct IDs for your own setup, and create and attach a floating IP.
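For reference, a rough sketch of those lookups and of the floating IP step, using the bundled client (the external network, own keypair, and test99 server names are the ones used elsewhere in this post; the address itself comes out of the floating ip create output):
microstack.openstack network list
microstack.openstack security group list
microstack.openstack keypair list
microstack.openstack floating ip create external
microstack.openstack server add floating ip test99 <floating-ip-address>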
Alternatively, microstack has a nice shorthand:
microstack launch --flavor m1.medium ubuntu-jammy-nokvm
excellent!
This is where things get ugly. I'm used to being able to hack on Python code, which the microstack repository is replete with. Unfortunately, ⧉most of the content of snaps is read-only. So I'm stuck reconfiguring and restarting things. At least I can read the code though. That's OpenSource™.
I repeatedly got permission denied when trying to access hosts. I also briefly got no route to host… Hm. I started trying to apply versions of
#cloud-config
users:
  - default
  - name: ubuntu
    groups: sudo
    shell: /bin/bash
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    ssh-authorized-keys:
      - ssh-rsa AAA
to the --user-data parameter, and in the UI. Nothing was taking. I could not get into the image via the SPICE console because ⧉there is no default user/pass for the Ubuntu cloud image. The prepackaged cirrOS image would not allow me to SSH either; something was broken in there.
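For completeness, this is roughly how I was trying to feed that in on the CLI (cloud-config.yaml and test100 are placeholder names; the other arguments match the earlier server create):
microstack.openstack server create --flavor m1.medium --image ubuntu-jammy-nokvm --nic net-id=831a8a23-4a6c-40a1-8435-620412144195 --key-name own --security-group 0d57b7c3-2e6c-4eab-be2c-319ecba62c7f --user-data cloud-config.yaml test100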
Eventually, I found
[ 26.831426] cloud-init[603]: 2023-04-06 04:39:05,756 - url_helper.py[ERROR]: Timed out, no response from urls: ['http://169.254.169.254/openstack']
[ 26.832426] cloud-init[603]: 2023-04-06 04:39:05,763 - util.py[WARNING]: No active metadata service found
⧉169.254.169.254 is commonly used as a metadata server on cloud computing platforms. It's classed under “Link Local” in ⧉this RFC on special-use addresses. As far as I could find, that's what serves the cloud-config, so that explains why it's not working. Some variant of the above is probably used to configure the default SSH keys I added through the CLI.
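As a sanity check of what should happen: from inside a healthy guest, cloud-init can fetch its config from that address, along these lines (the first URL is the one from the log above; the second is the standard OpenStack metadata path):
curl http://169.254.169.254/openstack
curl http://169.254.169.254/openstack/latest/meta_data.json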
So I went fishing on search engines, and I found nothing. ⧉This person had the same issue, but resolved it by just restarting the VM :/ And the person right below says I should simply sudo snap restart microstack… So that's no help either.
Well, the docs said this was tested on 20.04, but I’m on a fresh 22.04. So I guess we’ll switch to 20.04 and we’ll see how that goes…
That worked out. On a fresh 20.04 the thing works as advertised. Yikes. Everything in the previous section worked fine. Not the most satisfying conclusion, but at least now we can move forward. Maybe I can revisit this in a 22.04 VM and try to figure out what's going on :)
After rebooting I noticed the machines did not start. Microstack keeps its nova.conf in /var/snap/microstack/common/etc/nova/nova.conf.d/nova-snap.conf. I edited it to include resume_guests_state_on_host_boot = True in the [DEFAULT] section and restarted the service with sudo systemctl restart snap.microstack.nova-compute.service, though since I need to reboot to verify this anyway, that part wasn't really necessary.
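In other words, the change amounts to this (only the resume_guests_state_on_host_boot line is added; everything else in the file stays as generated):
# /var/snap/microstack/common/etc/nova/nova.conf.d/nova-snap.conf
[DEFAULT]
# ...existing settings...
resume_guests_state_on_host_boot = True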
I realized I couldn’t get an internet connection on the VMs, which is kind of critical for the use case I’m thinking of. I found ⧉this answer, which then pointed to ⧉this answer. I did not apply the iptables rules. Instead I applied
(openstack) subnet set --dhcp external-subnet
(openstack) subnet set --dhcp test-subnet
(openstack) subnet set --dns-nameserver 8.8.8.8 external-subnet
(openstack) subnet set --dns-nameserver 8.8.8.8 test-subnet
(openstack) network set --share external
(openstack) network set --share test
followed by
sudo sysctl net.ipv4.ip_forward=1
I suspect I will find out whether the former was crucial if/when I set up another network/subnet. But for now, one of those did the trick.
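Note that the sysctl setting above does not persist across reboots on its own; a sketch of making it stick (the file name under /etc/sysctl.d/ is my choice):
echo 'net.ipv4.ip_forward=1' | sudo tee /etc/sysctl.d/99-ip-forward.conf
sudo sysctl --system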
update: I did a little bit more work on networking
Let’s perform the ⧉regular grub PCI-passthrough steps to get our card captured by the vfio driver.
Basically that consists of editing your /etc/default/grub to add the following:
GRUB_CMDLINE_LINUX_DEFAULT="splash amd_iommu=on kvm.ignore_msrs=1 vfio-pci.ids=10de:2231,10de:1aef"
YMMV for the particulars on vfio-pci.ids, not to mention whether you have an AMD CPU (for amd_iommu=on) or need ignore_msrs.
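After editing /etc/default/grub, the usual follow-up applies (the lspci filters use my GPU's vendor/device IDs from above; swap in yours):
sudo update-grub
sudo reboot
# after the reboot, check that vfio-pci claimed the devices
lspci -nnk -d 10de:2231
lspci -nnk -d 10de:1aef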
Next comes the nova configuration. This is basically just adding, e.g.,
[pci]
passthrough_whitelist = [{ "vendor_id": "10de", "product_id": "2231" },{ "vendor_id": "10de", "product_id": "1aef" },{ "address": "0000:02:00.0" },{ "vendor_id": "1022", "product_id": "14da" }]
alias = { "vendor_id":"10de", "product_id":"2231", "device_type":"type-PCI", "name":"a5" }
alias = { "vendor_id":"10de", "product_id":"1aef", "device_type":"type-PCI", "name":"a5audio" }
alias = { "vendor_id":"c0a9", "product_id":"540a", "device_type":"type-PCI", "name":"nvme" }
alias = { "vendor_id":"1022", "product_id":"14da", "device_type":"type-PCI", "name":"bridge" }
to your nova-snap.conf. Be careful that if you need to pass identical devices, you are using the address form of the whitelist object, and not the vendor_id/product_id form. For example, in the above I am using 0000:02:00.0 to whitelist my NVMe controller on that specific bus. Since I have two NVMe controllers on the same machine with the same vendor and product IDs, they would both be whitelisted if I used that form instead.
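Keep in mind that nova only reads this file at startup, so after editing it the relevant snap services need a restart (the unit names are the ones used elsewhere in this post):
sudo systemctl restart snap.microstack.nova-compute.service
sudo systemctl restart snap.microstack.nova-scheduler.service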
Then the flavor: this consists of creating (or modifying) a flavor and adding the relevant PCI devices via the aliases defined above:
openstack flavor set m1.large --property "pci_passthrough:alias"="a5:1,a5audio:1,nvme:1"
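The property can be double-checked on the flavor afterwards:
microstack.openstack flavor show m1.large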
After trying to piece together the correct configs for PCI passthrough using ⧉docs for the full version of openstack, I was left with
No valid host was found. There are not enough hosts available.
so I began debugging that. systemctl status snap.microstack.nova-scheduler.service tells me Filter PciPassthroughFilter returned 0 hosts.
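To dig further, following the scheduler and compute logs directly helps (journalctl against the snap's systemd units):
sudo journalctl -u snap.microstack.nova-scheduler.service -f
sudo journalctl -u snap.microstack.nova-compute.service -f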
OK. I must be configured wrong. ⧉Here's some doc for pci_passthrough in the properties section of a flavor.
After reading the ⧉newer docs a little more carefully, I realized a lot of the configuration I was doing was for SR-IOV, which I am not going to use here. Eventually I was able to get
Please ensure all devices within the iommu_group are bound to their vfio bus driver
in the nova logs! This error is comprehensible to me. Turns out my IOMMU group is not isolated.
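The listing below comes from walking /sys/kernel/iommu_groups; a loop along these lines (many variants of it float around) produces it:
for d in /sys/kernel/iommu_groups/*/devices/*; do
  g=${d#*/iommu_groups/}; g=${g%%/*}
  echo "IOMMU Group $g $(lspci -nns ${d##*/})"
done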
IOMMU Group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 0 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14db]
IOMMU Group 0 00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14db]
IOMMU Group 0 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2231] (rev a1)
IOMMU Group 0 01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aef] (rev a1)
IOMMU Group 0 02:00.0 Non-Volatile memory controller [0108]: Micron/Crucial Technology Device [c0a9:540a] (rev 01)
Non-Volatile memory controller is my NVMe drive. Turns out it shares a PCI lane with the GPU. I guess that's what I get for getting a compact motherboard. I managed to get a GPU machine running after a while though; here are some of the steps I took:
I ran driverctl against everything in the IOMMU group to try to either remove drivers or replace them with the vfio-pci driver. I think swapping the drivers was definitely critical, and this seems to be confirmed, since when my GPU is passed through I can no longer see the NVMe drive in the same group on the controller. I also suspect that installing virt-manager may have unstuck something, since I (re)installed a bunch of random dependencies including libvirt, and I wasn't really reading the output of the apt-get commands at that point.
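For the record, the driverctl step amounts to something like this (the addresses are from my IOMMU group listing above; adjust them to your own devices):
sudo driverctl set-override 0000:01:00.0 vfio-pci   # GPU
sudo driverctl set-override 0000:01:00.1 vfio-pci   # GPU audio function
sudo driverctl set-override 0000:02:00.0 vfio-pci   # NVMe controller in the same group
driverctl list-overrides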
I should be able to pass-through the nvme drive, and it would be a shame if I couldn’t, since it has data and models on it and is otherwise useless and inaccessible as it stands now.
And actually, it was a very simple matter once I read this in the ⧉openstack documentation:
If using vendor and product IDs, all PCI devices matching the vendor_id and product_id are added to the pool of PCI devices available for passthrough to VMs.
Basically, the passthrough_whitelist had to reference the address and not the vendor/product IDs, since those are not unique to each of the multiple NVMe controllers in my system.
Great! I have completed the basic setup of microstack on my server and can now use it for running GPU workloads. I have internet access and a large disk to use for models and data. I have learned that the second PCI slot is in an IOMMU group with my Ethernet controller, among others, which means this tower will be limited to a single GPU unless and until I get a different motherboard.