Hardware Setup: Deploying OpenStack on MAAS 1.9+ with Juju
This is part 2 of my new “Deploying OpenStack on MAAS 1.9+ with Juju” series, and it follows last week’s post, Introduction: Deploying OpenStack on MAAS 1.9+ with Juju. In this post I’m going to explain the steps I used to configure the actual hardware: the physical networking and wiring needed before MAAS is configured. I planned to include the MAAS setup in here as well, but the post grew larger than expected, so I’ll explain the MAAS setup in the next post.
NOTE: The hardware setup described below works for both MAAS 1.9 and MAAS 2.0+. Where this is not true, MAAS 2.0+ specific changes will be highlighted in green.
Hardware Setup for Deploying OpenStack on MAAS 1.9+ with Juju
The openstack-base bundle requires 4 machines, each with 2 network interface cards (NICs). Since we’re focusing on the networking side of things here, the other requirement (2 separate disks on each machine) is not a priority, though it should be trivial to add if needed. Instead, I used the MAAS 2.0+ web UI on the 3 storage/compute nodes to set up LVM storage with vg-root and vg-ceph LVs, the only partition on each containing a formatted ext4 file system, mounted at / and /srv/ceph-osd, respectively. To make the deployment slightly closer to a real-world scenario, I’ve decided to also use 2 zones in MAAS, putting a couple of machines in each zone, both of them plugged into a dedicated switch for the zone.
Used Hardware
- 4x Intel NUC Kits (DC53427HYE)
- I already have 2 of those, each with 120GB Crucial M2 M500 SSD drives
- Ordered 2 more from amazon.co.uk, each with larger 256GB Crucial M2 M550 SSD drives.
- There are cheaper options, but these NUCs support Intel vPro/AMT and can be powered on and off remotely by MAAS.
- 4x USB-to-Ethernet adapters
- Each NUC has only one 1Gbps NIC, so these are needed to satisfy the requirement for 2 NICs per machine.
- I’ve used Anker USB 3.0 to RJ45 Gigabit Ethernet Adapters, because they were cheap and support USB 3.0, but others should work as well.
- TP-LINK 24-Port Gigabit Easy Smart Switch (TL-SG1024DE)
- Used for zone1, has web-based admin panel, and most importantly has hardware support for IEEE 802.1Q VLANs.
- D-Link EasySmart 8-Port Gigabit Switch (DGS-1100-08)
- I already had this one and was using it for my local MAAS.
- Used for zone2 and provides mostly the same features as the one above.
- MAAS Server
- Originally an old i386 Pentium 4 tower box with 512 MB RAM. I’ve retired the P4 box as it turned out to be unreliable and too slow, and decided to use an HP Compaq 8510p mobile workstation with a much better dual-core CPU and 4GB RAM.
- Cables
- 10x short (1 to 2m long) UTP Cat5e cables (connecting NUCs to switches and switches to each other and to MAAS).
- 1x longer (5m) UTP Cat5e cable (connecting MAAS to the home WiFi router for external connectivity).
- 4x EU 3-prong power cords (e.g. like this one), one for each NUC (since the first 2 came from the US and the last 2 from the UK).
- 4x short (1.5m) HDMI cables (optional), for accessing NUC BIOS and diagnosing issues.
If you’re interested, click here for a couple of photos I took of the completed deployment with all those components.
Setting up Wiring and Switches
TL-SG1024DE
As described in the last post, I need to configure 8 VLANs for OpenStack in MAAS and on the switches: 7 tagged VLANs with VIDs 50, 100, 150, 200, 250, 30, and 99, plus the untagged default VLAN used for PXE booting the NUCs from MAAS. I’ll use a subnet with CIDR 10.14.0.0/20 for PXE booting the NUCs, and subnets with ranges 10.<VID>.0.0/20 for the VLANs (e.g. 10.150.0.0/20 for VID 150). Before wiring up everything, I need to configure both switches, adding the VLANs to them using the web admin panels.
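To make the addressing scheme concrete, here is a minimal sketch that derives the subnet for each VID from the 10.<VID>.0.0/20 pattern, using Python’s standard ipaddress module. The names VIDS, PXE_SUBNET, vlan_subnet, and plan are mine, made up for illustration; nothing here is part of MAAS or Juju.

```python
import ipaddress

# VIDs as listed in the post; the untagged default VLAN carries PXE traffic.
VIDS = [50, 100, 150, 200, 250, 30, 99]
PXE_SUBNET = ipaddress.ip_network("10.14.0.0/20")


def vlan_subnet(vid: int) -> ipaddress.IPv4Network:
    """Return the 10.<VID>.0.0/20 subnet for the given VLAN ID."""
    return ipaddress.ip_network(f"10.{vid}.0.0/20")


# Hypothetical helper structure: VID -> subnet.
plan = {vid: vlan_subnet(vid) for vid in VIDS}

if __name__ == "__main__":
    print(f"untagged (PXE): {PXE_SUBNET}, netmask {PXE_SUBNET.netmask}")
    for vid, net in plan.items():
        print(f"VLAN {vid:>3}: {net}")
```

The pattern only works because every VID used here fits in a single IPv4 octet (0 to 255); a VID such as 300 would make ip_network raise a ValueError.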
Following the user guide that came with the TL-SG1024DE, I powered on the switch, connected port 1 to my laptop, and statically assigned 192.168.0.2/24 to the laptop, then accessed the admin panel on 192.168.0.1, logging in with the default username and password (admin/admin). Of course, the first thing is to change the password to something more secure, using the menu on the left side (System > User Account). Next, using System > IP Setting, I changed the IP address (10.14.0.2), subnet mask (255.255.240.0), and default gateway (10.14.0.1 – this will be the MAAS server IP), and since MAAS will be managing most of the VLANs, I disabled DHCP on the switch. After changing the IP I needed to reconnect using the new address, but first I changed my laptop’s static IP to 10.14.0.1/24 for the NIC I was using (eth0). Finally, I created all the VLANs and configured the ports like in the screenshot below:
Port 1 will be trunking all VLANs (tagged and untagged), and will be connected to the MAAS server. Ports 4, 6, 8, and 10 will receive tagged traffic for VLANs 50, 100, and 150 (on each NUC’s primary NIC), while ports 3, 5, 7, and 9 receive tagged traffic for VLANs 200, 250, 99, and 30 (on each NUC’s secondary NIC). Finally, port 24 is configured the same way as port 1, and will be connected to the other switch. Now you can plug in all the UTP cables. IMPORTANT! Make sure to save the switch configuration once done (Save Config from the menu on the left), otherwise it will be LOST! Also, it’s a good idea to make a backup of it (System Tools > Backup and Restore).
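Since the screenshot may not always be at hand, the tagged VLAN membership described above can also be written down as a small table in code. This is only a sketch of the TL-SG1024DE layout as I read it from the text: the names (PORT_MAP, build_port_map, ports_for_vlan) are invented for illustration, and the untagged default VLAN, which every port also belongs to, is deliberately not modeled.

```python
# Tagged VLANs per NIC role, as described in the post.
PRIMARY_VLANS = {50, 100, 150}        # each NUC's primary (on-board) NIC
SECONDARY_VLANS = {200, 250, 99, 30}  # each NUC's secondary (USB) NIC
ALL_VLANS = PRIMARY_VLANS | SECONDARY_VLANS


def build_port_map() -> dict[int, set[int]]:
    """Map each configured switch port to the tagged VLANs it carries."""
    ports: dict[int, set[int]] = {}
    # Port 1 goes to the MAAS server, port 24 to the other switch:
    # both trunk every VLAN.
    for port in (1, 24):
        ports[port] = set(ALL_VLANS)
    for port in (4, 6, 8, 10):
        ports[port] = set(PRIMARY_VLANS)
    for port in (3, 5, 7, 9):
        ports[port] = set(SECONDARY_VLANS)
    return ports


def ports_for_vlan(port_map: dict[int, set[int]], vid: int) -> list[int]:
    """Return the sorted list of ports that carry the given VID tagged."""
    return sorted(p for p, vids in port_map.items() if vid in vids)


PORT_MAP = build_port_map()

if __name__ == "__main__":
    for vid in sorted(ALL_VLANS):
        print(f"VLAN {vid:>3}: ports {ports_for_vlan(PORT_MAP, vid)}")
```

Printing the map per VLAN gives the same view as the switch’s 802.1Q VLAN page, which makes it easier to compare both switches after configuring them.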
DGS-1100-08
The initial configuration is very similar, with a few differences:
- 10.90.90.90/24 is the default subnet for configuring the switch, and the admin panel is accessible on 10.90.90.90.
- I used 10.90.90.99/24 as IP for my laptop, connected directly to port 1 on the switch.
- No username required, just password (admin again being the default one).
Once again, changing the admin password was the first step, then configuring the IP address, etc. I’ve used IP address 10.14.1.2, same subnet mask and gateway (255.255.240.0, 10.14.0.1), and disabled DHCP. After reconnecting to the new IP from the laptop (the latter now using 10.14.0.1/24 as earlier), I enabled 802.1Q VLAN support and configured all the VLANs and ports to match the first switch (as seen in the screenshot below). Finally, I saved the switch configuration (using the Save dropdown menu), backed it up (using the Tools dropdown menu), and plugged in the cables the same way.
Preparing the NUCs for MAAS
MAAS can power these NUCs on and off remotely, provided AMT is set up correctly. For each NUC the same set of steps is needed to enable AMT. My colleague Dustin Kirkland explained the process in great detail and with nice screenshots in his “Everything you need to know about Intel” blog post, so I highly recommend following his guide. Some notes for my specific setup:
- I used 10.14.0.11 and 10.14.0.12 as IP addresses for the first two NUCs (which will be in zone1).
- For the other two NUCs (in zone2), I picked 10.14.1.21 and 10.14.1.22 respectively.
- All subnet masks and default gateways for AMT are the same – 255.255.240.0 and 10.14.0.1 respectively.
- I plugged in the USB2Ethernet adapters only after finishing the AMT setup (so all these IPs are set on the primary on-board NIC).
- Each NUC has a different AMT password (not necessarily required, but slightly more secure I guess).
- Of course, I made sure AMT is enabled before saving and exiting the BIOS.
- Finally, since MAAS needs to let each NUC PXE boot over the network, for each NUC I entered the (regular) BIOS and changed the boot order to Network boot first, then boot from the SSD.
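Before typing the AMT addresses into each BIOS, it can help to sanity-check the plan: every address should sit inside the 255.255.240.0 network and must not collide with the gateway or the switches. The sketch below does that with Python’s ipaddress module; the grouping by zone and the names (AMT_ADDRESSES, SWITCH_ADDRESSES, find_problems) are mine, chosen for illustration only.

```python
import ipaddress

# 10.14.0.0 with netmask 255.255.240.0 is the same network as 10.14.0.0/20.
MGMT_NET = ipaddress.ip_network("10.14.0.0/255.255.240.0")
GATEWAY = "10.14.0.1"  # the MAAS server

# AMT addresses from the notes above, grouped by zone.
AMT_ADDRESSES = {
    "zone1": ["10.14.0.11", "10.14.0.12"],
    "zone2": ["10.14.1.21", "10.14.1.22"],
}
# Addresses given to the two switches earlier in the post.
SWITCH_ADDRESSES = ["10.14.0.2", "10.14.1.2"]


def find_problems(amt, reserved, net):
    """Return human-readable problems found in the AMT address plan."""
    problems = []
    seen = {ipaddress.ip_address(a) for a in reserved}
    for zone, addrs in amt.items():
        for raw in addrs:
            addr = ipaddress.ip_address(raw)
            if addr not in net:
                problems.append(f"{raw} ({zone}) is outside {net}")
            if addr in seen:
                problems.append(f"{raw} ({zone}) is already in use")
            seen.add(addr)
    return problems


if __name__ == "__main__":
    issues = find_problems(AMT_ADDRESSES, [GATEWAY] + SWITCH_ADDRESSES, MGMT_NET)
    print("\n".join(issues) if issues else "AMT address plan looks consistent")
```

This checks only the addressing, not whether AMT itself is reachable; that still has to be verified from the MAAS server once the NUCs are wired up.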
Next Steps: MAAS Setup
Stay tuned! 🙂 Next post will be about setting up MAAS and modeling the OpenStack deployment with spaces, subnets, VLANs, and fabrics – matching the described hardware configuration.
Convenient links to all articles in the series:
- Introduction
- Hardware Setup (this one)
- MAAS Setup
- Nodes Networking
- Advanced Networking
- Finale
Great two articles so far – what date can we expect the third?
Thanks Paul! I’m glad you find the articles useful 🙂
I’m preparing the next 2 articles and plan to post them some time this week.
These articles are great. Can’t wait for the next one!
Thanks
Thanks!
Well, I was planning to post the next article last week, but couldn’t find the time unfortunately.
Hopefully, within the next week and a half I should be able to get some time away from Juju feature work to do the post.
Thanks! and please continue!
Out of curiosity, can you explain something in your physical layout? Your node-11 primary adaptor is connected to the zone1 switch to port 2 (according to your high level diagram) but you don’t specifically mention port 2 in your tagged ports. You mention 1,3,5,7,9 and 1,4,6,8,10.
If it was intentional, then node-11 and node-21 would have a connection to the switch(es) VLAN1 as untagged members (default). Fine, but why wouldn’t node-12 or node-22 have something similar?
Thanks.
Hey Rob,
Sorry, I’ve tried to explain the high-level-to-actual hardware wiring mapping. I see how it could be misleading – sorry about that!
I could’ve used 1:1 mapping between the high-level diagram (node-11 – ports 2 and 3), but it seemed easier to actually leave port 2 alone (node-11 uses ports 3 and 4; node-12 uses 5 & 6).
Similarly, node-21 uses ports 3 & 4; node-22 – 5 & 6 – leaving port 2 unused.
The logic in my mind is to keep nodes’ NICs plugged in the same order on both switches, while keeping the first and last switch ports for VLAN trunking (linking both switches and MAAS).
VLANs are configured as described – nodes’ eth0 switch ports trunk 0, 50, 100, and 150; (the following) eth1 switch ports trunk 0, 200, 250, 30, and 99.
Does that make sense? 🙂
Cheers,
Dimiter
Yes, it does. Thanks for responding.
In your experience, does having two NICs – ETH0 for VLAN 0, 50 (public), 100 (internal), and 150 (admin), and ETH1 for VLAN 0, 200 (data), 250 (storage), 30 (cluster), and 99 (external) – give you enough throughput / bandwidth for the respective tasks? Have you ever thought or had an inclination to track that?
If you had a quad port NIC, or even 10G interface(s), which of the service(s) do you feel would benefit if you had dedicated NIC’s – 1G, 10G, or both?
Thanks again!
Sorry, I should have mentioned that I understand that your NUC’s are limited in that regard, and that your environment is probably just meant for testing. I was just trying to speculate as to which of the 7 VLANs you are creating would benefit (*if you had the resources*) from being carved off on its own dedicated NIC.
Hey Rob,
It’s true I’ve mostly intended the deployment for testing purposes, and haven’t thought of tracking the optimal throughput.
Nevertheless, such a deployment can support a good number of guest VMs with moderate system load average.
Dual NICs are good to have (the more the better), so you can separate e.g. low-volume API / guest client traffic from high-volume storage replication / clustering.
With 1G and 10G dual NIC setup, it makes more sense to use the 10G NIC for storage replication and guest data, leaving the 1G NIC for the remaining VLANs.
Same applies for the network node – a 10G NIC dedicated for external compute (guest) traffic (if available) will yield “more responsive” VMs that download large blobs of data.
Any of those 8 VLANs can be on a separate NIC, but that’s just wasteful 🙂 It’s better to have e.g. 2 bonds over 4 1G NICs – for redundancy, HA, and load-balancing (depending on configured mode), than to use those 4 NICs separately (IMO – depends on the setup / constraints).
HTH,
Dimiter
Thanks again Dimiter for responding
I tend to get into the lower level for nothing more than my own curiosity. I am working on your design but implementing it completely virtual.
The way you worded the “low-volume API /guest client traffic vs. high volume traffic” is ultimately what I am seeking. I should have worded my question: based on the VLANs you created – assuming they are all necessary to separate traffic and relevant in the overall design, which of the VLANS would fall under the two categories you defined? Based on your response:
VLAN 50 (public) = low bandwidth?
VLAN 100 (internal) = low bandwidth?
VLAN 150 (admin) = low bandwidth?
VLAN 200 (data) = high bandwidth?
VLAN 250 (storage) = high bandwidth?
VLAN 99 (external) = high bandwidth?
VLAN 30 (cluster) = high bandwidth?
Would you agree with that assessment? Implement 1G for the lower traffic, and a single 10G for the higher traffic? From my perspective, separating traffic makes sense from a security stand point, but just trying to wrap my head around what traffic would benefit from what.
Hey Rob,
Yes, your assessment looks exactly right. Traffic separation is good not just for security, but also to ensure high volume traffic (like storage replication) does not crowd out low volume traffic (OpenStack API calls, used between the components and by Juju as well). Imagine you add a 2GB image in an existing deployment – these 2GB will be replicated among all storage nodes, while Juju and/or OpenStack still need fast, responsive API traffic flow to work in the meantime.
Another example is scaling out the storage nodes by adding a new one – it will need to be synced with the existing storage cluster and without separation that’ll cause API calls to lag behind the storage replication traffic (until sync is completed).
Cheers,
Dimiter
Hi Dimitern,
Congratulations on your work, although as of 2016 I have not found anything more lucid.
I currently want to start with 6 servers and expand to 10 servers for production. So I will have 3 servers on each switch and later on 5 on each. In this scenario, what would be the distribution to maintain traffic quality?
Thanks for sharing your work.
Greetings.
Hi dear Dimiter, thank you so much for the nice blog, but I am stuck here:
https://serverfault.com/questions/1050986/new-compute-node-with-openstack-base-bundle
would you please look at this question?