
Nodes Networking: Deploying OpenStack on MAAS 1.9+ with Juju

This is part 4 of my ongoing “Deploying OpenStack on MAAS 1.9+ with Juju” series. The last post, MAAS Setup: Deploying OpenStack on MAAS 1.9+ with Juju, described the steps to deploy MAAS 1.9 itself and configure it with the fabrics, subnets, and VLANs we need to deploy OpenStack. In this post we’ll finish the MAAS setup, add to MAAS the 4 dual-NIC NUC nodes and commission them, and finally configure each node’s network interfaces as required.

Updates

A lot has happened since I started these posts: MAAS 1.9 was released (latest stable is 1.9.4), followed by Juju 1.25 (latest stable is 1.25.7), which includes only a subset of the features I’m talking about; most of the new networking improvements are part of the recently released Juju 2.0.1. MAAS 2.0 (and even 2.1) is out now, with the latest stable version in ppa:maas/stable. I’m updating the posts to reflect the changes where appropriate, using green text color, so stay tuned! 🙂

Dual Zone MAAS Setup

MAAS physical-network layout for OpenStack


As I mentioned in the first post of the series, we’ll use 2 physical zones in MAAS – each containing 2 nodes (Intel NUCs) plugged into a VLAN-enabled switch. Switches are configured to trunk all VLANs on certain ports – those used to connect to the other switch or to the MAAS cluster controller. I’ve decided to use IP addresses with the following format:

10.X.Y.Z – where:

  • X is either 14 (for the PXE managed subnet 10.14.0.0/20) or it’s equal to the VLAN ID the address is from (e.g. 10.150.0.100);
  • Y is 0 for zone 1 and 1 for zone 2 (e.g. zone 1’s switch uses 10.14.0.2, while zone 2’s switch uses 10.14.1.2);
  • Z is 1 only for the IP address of the MAAS cluster controller’s primary NIC (the default gateway), 2 for each switch’s IP address, and for nodes it either matches the number at the end of the node’s hostname or is 100 plus that number (when possible) – e.g. node-11’s IP in VLAN 250 will be 10.250.0.111, while node-21’s IP in VLAN 30 is 10.30.1.121.
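As a quick sanity check of the scheme, here is a minimal shell sketch that derives a node’s address from the VLAN ID and the number in its hostname. The node_ip helper is made up for illustration, and it only covers the “100 plus the hostname number” case used for node NICs.

```shell
#!/bin/sh
# node_ip X NODE prints 10.X.Y.Z, where:
#   X    is 14 for the PXE subnet, otherwise the VLAN ID
#   NODE is the two-digit number from the hostname (11, 12, 21, 22);
#        its first digit is the zone number
node_ip() (
  x=$1
  node=$2
  zone=${node%?}        # drop the last digit: 21 becomes 2
  y=$((zone - 1))       # zone 1 maps to 0, zone 2 maps to 1
  z=$((100 + node))     # node-21 maps to 121
  printf '10.%s.%s.%s\n' "$x" "$y" "$z"
)

node_ip 250 11   # node-11 in VLAN 250, prints 10.250.0.111
node_ip 30 21    # node-21 in VLAN 30, prints 10.30.1.121
```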

MAAS comes with a “default” zone where all nodes end up when added. It cannot be removed, but we can easily add more using the MAAS CLI, like this:
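A sketch of what those commands could look like, assuming the CLI profile is named 19-root (the profile used in the commands later in this post); the zone descriptions are placeholders. The helper only prints the commands, so they can be reviewed first and then piped to sh to actually run them.

```shell
#!/bin/sh
PROFILE=19-root   # assumed CLI profile name

# Print one "zones create" command per physical zone.
# The description text is a placeholder, pick whatever is clear to you.
zone_create_cmds() {
  for n in 1 2; do
    printf 'maas %s zones create name=zone%s description="Physical zone %s"\n' \
      "$PROFILE" "$n" "$n"
  done
}

zone_create_cmds
```

To apply them for real: zone_create_cmds | sh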

The same can be done from the MAAS web UI – click on “Zones” in the header, then click “Add zone”. Description is optional, but why not use it for clarity? When done, you can list all zones (e.g. with $ maas 19-root zones read) to verify that zone1 and zone2 now appear alongside “default”.

So what are those 2 hostnames I mentioned above? MAAS can provide DHCP and DNS services not just to the nodes it manages, but also to other devices on the network – like switches, Apple TVs, containers, etc. While not required, I decided it’s nice to add the two switches as devices in MAAS, with the following CLI commands (or by clicking “Add Device” on the “Nodes” page):
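As a rough sketch, and assuming the MAAS 1.9 CLI verbs devices new and device claim-sticky-ip-address behave as I remember them, the commands could be generated like this. The MAC addresses are placeholders, $SYSTEM_ID stands for the device ID returned by the first command, and the 19-root profile name is the one used later in this post; maas-zone1-sw and maas-zone2-sw are the two switch hostnames.

```shell
#!/bin/sh
PROFILE=19-root   # assumed CLI profile name

# device_cmds HOSTNAME MAC IP
# Prints the command that registers the device, then the one that pins
# its IP address. $SYSTEM_ID is printed literally: replace it with the
# system ID that the first command returns.
device_cmds() {
  printf 'maas %s devices new hostname=%s mac_addresses=%s\n' \
    "$PROFILE" "$1" "$2"
  printf 'maas %s device claim-sticky-ip-address $SYSTEM_ID requested_address=%s\n' \
    "$PROFILE" "$3"
}

device_cmds maas-zone1-sw 00:11:22:33:44:01 10.14.0.2   # placeholder MAC
device_cmds maas-zone2-sw 00:11:22:33:44:02 10.14.1.2   # placeholder MAC
```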

Now the nodes UI page will show “0 Nodes | 2 Devices”, and clicking on the latter you can see this nice summary:

switches-as-devices

Adding and Commissioning All Nodes

OK, now we’re ready to add the nodes. Assuming all NUCs, switches, and MAAS itself are plugged in as described in the diagram in the beginning, the process is really simple: just turn them on! I’d recommend doing that one node at a time for simplicity: that way you know which one is currently on, which can be tricky when more than one NUC is on, since you have to match MAC addresses to tell them apart.

What happens during the initial enlistment, in brief:

  • MAAS will detect the new node while it’s trying to PXE boot.
  • An ephemeral image will be given to the node, and during the initial boot some hardware information will be collected (takes a couple of minutes).
  • When done, the node will shut down automatically.
  • A node with status New will appear with a randomly generated hostname (quite funny sometimes).

You can then (from the UI or CLI) finish the node config and get it commissioned (so it’s ready to use):

  • Accept the node and fill in the power settings to use, change the hostname, and set the zone.
  • Once those changes are saved, MAAS should automatically commission the node (takes a bit longer and involves at least 2 reboots – again, node shuts down when done).
  • During commissioning MAAS discovers the rest of the machine details, including hardware specs, storage, and networking.
  • Unless commissioning fails, the node should transition from Commissioning to Ready.

A lot more details can be found in the MAAS Documentation. I’ll be using the web UI for the next steps, but of course they can also be done via the CLI, following the documentation. I’m happy to include those CLI steps later, if someone asks about them.

Remember the steps we did in an earlier post – Preparing the NUCs for MAAS? Now we need the IP and MAC addresses for each NUC’s on-board NIC and the password set in the Intel MEBx BIOS. Revisiting Dustin’s “Everything you need to know about Intel” blog post will help a lot should you need to redo or change AMT settings.

Here is a summary of what to change for each node, in order (i.e. from the UI, edit the New node and first set the hostname, save, then zone – save again, finally set the power settings).

Node # | Hostname | Zone | Power Type | MAC Address | Power Password | Power Address
1 | node-11.maas | zone1 | Intel AMT | on-board NIC’s MAC address | (as set in Intel MEBx) | 10.14.0.11
2 | node-12.maas | zone1 | Intel AMT | on-board NIC’s MAC address | (as set in Intel MEBx) | 10.14.0.12
3 | node-21.maas | zone2 | Intel AMT | on-board NIC’s MAC address | (as set in Intel MEBx) | 10.14.1.21
4 | node-22.maas | zone2 | Intel AMT | on-board NIC’s MAC address | (as set in Intel MEBx) | 10.14.1.22

NOTE: While editing the node from the UI, MAAS may show a warning suggesting that you need to install the wsmancli package on the cluster controller machine in order to use AMT. Verify that the amttool binary (which MAAS uses to power AMT nodes on/off) exists; if not – install it with $ sudo apt-get install wsmancli.

Once all 4 nodes are ready, your “Nodes” page in the UI should look very much like this:

all-nodes-ready

Now we can let MAAS deploy each node (one by one or all at once – doesn’t matter) to verify it works. From the UI, click on the checkbox next to FQDN to select all nodes, then from the “Take action” drop-down menu (that just replaced the “Add Hardware” button) pick Deploy, choose series/kernel (defaults are OK) and click “Go”. Watch as nodes “come to life” and the UI auto-updates. It should take no more than 10-15 minutes for a node to get from Deploying to Deployed (unless an issue occurs). Then try to SSH into each node (username “ubuntu”, and eth0’s IP address as seen in the node details page) and check external connectivity, DNS resolution, and that you can ping MAAS, both switches, and the other nodes.

Setting up Nodes Networking

Now that all your nodes can be deployed successfully, we need to change the network interfaces on each node to make them usable for hosting the needed OpenStack components.

As described in the first post, 3 of the nodes will host nova-compute units (with ntp and neutron-openvswitch as subordinates), and collocated ceph units. The remaining node will host neutron-gateway (with ntp as a subordinate) and ceph-osd, and is also used for the Juju Controller. The rest of the OpenStack services are deployed inside LXD containers, distributed across the 4 nodes.

We can summarize each node’s connectivity requirements (i.e. which subnet each NIC should be linked to and what IP address to use) in the following matrix:

Subnet | Space | VLAN | node-11 | node-12 | node-21 | node-22
10.14.0.0/20 | default | untagged (0) | eth0: 10.14.0.111; eth1: 10.14.2.111 | eth0: 10.14.0.112; eth1: 10.14.2.112 | eth0: 10.14.1.121; eth1: 10.14.2.121 | eth0: 10.14.1.122; eth1: 10.14.2.122
10.50.0.0/20 | public-api | 50 | eth0.50: 10.50.0.111 | eth0.50: 10.50.0.112 | eth0.50: 10.50.1.121 | eth0.50: 10.50.1.122
10.100.0.0/20 | internal-api | 100 | eth0.100: 10.100.0.111 | eth0.100: 10.100.0.112 | eth0.100: 10.100.1.121 | eth0.100: 10.100.1.122
10.150.0.0/20 | admin-api | 150 | eth0.150: 10.150.0.111 | eth0.150: 10.150.0.112 | eth0.150: 10.150.1.121 | eth0.150: 10.150.1.122
10.200.0.0/20 | storage-data | 200 | eth1.200: 10.200.0.111 | eth1.200: 10.200.0.112 | eth1.200: 10.200.1.121 | eth1.200: 10.200.1.122
10.250.0.0/20 | compute-data | 250 | eth1.250: 10.250.0.111 | eth1.250: 10.250.0.112 | eth1.250: 10.250.1.121 | eth1.250: 10.250.1.122
10.30.0.0/20 | storage-cluster | 30 | N/A | eth1.30: 10.30.0.112 | eth1.30: 10.30.1.121 | eth1.30: 10.30.1.122
10.99.0.0/20 | compute-external | 99 | eth1.99: unconfigured | N/A | N/A | N/A

Since I found some issues while using the web UI to configure node NICs, I recommend using the CLI instead for the remaining steps, which roughly are:

  • Starting from the basic dual-NIC config, post-commissioning MAAS creates 2 physical NICs for each node.
  • Additionally, on each node we need to create as many VLAN NICs as specified above.
  • For node-11 in particular, we need eth1.99 (a VLAN NIC on the second physical NIC) to be linked to VLAN 99 and subnet 10.99.0.0/20 in the “compute-external” space. This is required for the OpenStack Neutron Gateway to work as expected. No need to assign an IP address, as Neutron will ignore it (i.e. leave eth1.99 unconfigured).
  • Finally, we need to link each NIC (physical or VLAN) to the subnet it needs to use and associate a static IP address for the NIC.
    NOTE: Using statically assigned IP addresses for each NIC vs. auto-assigned addresses is not required, but I’d like to have fewer gaps to fill in and more consistency across the board.

TIP: In case you need to later redo the network config from scratch, the easiest way I found is to simply re-commission all nodes, which will reset any existing NICs except for the physical ones discovered (again) during commissioning.

We can create a VLAN NIC on a node with the following CLI command (taking the node’s system ID, and the MAAS IDs for the VLAN and parent NIC):
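A minimal sketch of that command, assuming the 19-root CLI profile. The node system ID is hypothetical, and 5009 and 1050 are example IDs only; the helper just prints the command so it can be checked before running it.

```shell
#!/bin/sh
PROFILE=19-root   # assumed CLI profile name

# create_vlan_cmd NODE_ID VLAN_ID PARENT_NIC_ID
# Prints the create-vlan command for one VLAN NIC.
create_vlan_cmd() {
  printf 'maas %s interfaces create-vlan %s vlan=%s parent=%s\n' \
    "$PROFILE" "$1" "$2" "$3"
}

# Hypothetical IDs, for illustration only.
create_vlan_cmd node-0123abcd 5009 1050
```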

NOTE: It’s confusing, but the “vlan=” argument expects to see the (database) ID of the VLAN (e.g. 5009 – all of these are >5000), NOT the VLAN “tag” (e.g. 200) as you might expect.

Unfortunately, we can’t use a prefixed reference for the vlan and parent arguments. It would’ve been a nicer experience if the create-vlan CLI supported “vlan=vid:50” and “parent=name:eth0”. So we first need to list all VLANs in the “maas-management” fabric to get their IDs, and to do that we also need to know the fabric ID. Once we have the VLAN IDs, we then need the IDs of both physical NICs of the node. To summarize, this is the sequence of commands we need to run initially in order to get all the IDs we need:
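Roughly, the lookups could go in this order. This is a sketch assuming the 19-root profile; the fabric and node IDs passed in are hypothetical and have to be replaced with the ones from your own MAAS.

```shell
#!/bin/sh
PROFILE=19-root   # assumed CLI profile name

# lookup_cmds FABRIC_ID NODE_ID
# Prints the read-only commands in the order they are needed.
lookup_cmds() {
  # 1. List fabrics and note the ID of "maas-management".
  printf 'maas %s fabrics read\n' "$PROFILE"
  # 2. List the VLANs in that fabric and note each VLAN ID.
  printf 'maas %s vlans read %s\n' "$PROFILE" "$1"
  # 3. List the NICs of one node and note the IDs of eth0 and eth1.
  printf 'maas %s interfaces read %s\n' "$PROFILE" "$2"
}

lookup_cmds 0 node-0123abcd   # hypothetical fabric and node IDs
```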

Once we have these, on each node we run one or more of the following commands to set up each needed VLAN NIC and assign a static IP to it:
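A sketch of that pair of commands for a single VLAN NIC, assuming the 19-root profile and that link-subnet accepts mode=static together with ip_address. The $NODE11_ID, $VLAN50_ID and $NODE11_ETH0_ID names are placeholders printed literally, to be filled in with the IDs collected earlier.

```shell
#!/bin/sh
PROFILE=19-root   # assumed CLI profile name

# vlan_nic_static_cmds NODE_ID VLAN_ID PARENT_NIC_ID NIC CIDR IP
# Prints the command that creates the VLAN NIC, then the one that links
# it to its subnet with a static IP address.
vlan_nic_static_cmds() {
  printf 'maas %s interfaces create-vlan %s vlan=%s parent=%s\n' \
    "$PROFILE" "$1" "$2" "$3"
  printf 'maas %s interface link-subnet %s %s mode=static subnet=cidr:%s ip_address=%s\n' \
    "$PROFILE" "$1" "$4" "$5" "$6"
}

# eth0.50 on node-11; the single quotes keep the variable names literal.
vlan_nic_static_cmds '$NODE11_ID' '$VLAN50_ID' '$NODE11_ETH0_ID' \
  eth0.50 10.50.0.0/20 10.50.0.111
```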

See how much nicer the link-subnet command is to use? It lets you pass “eth0.50” instead of the NIC ID (e.g. 1050), and similarly “cidr:10.50.0.0/20” for the subnet instead of its ID.

Now we can use the CLI commands described above to configure each node’s NICs as required (see the table in the beginning of this section). For simplicity, in the commands below we’ll use variables like $VLAN50_ID, $NODE11_ID, $NODE21_ETH0_ID, and $NODE22_ETH1_ID to refer to any IDs we need to use. Those variables should make it easier to automate the steps using a script which pre-populates the IDs and then runs the steps.

node-11

node-12

node-21

node-22
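The per-node commands can also be generated from the matrix rather than typed by hand. The sketch below assumes the 19-root profile, the variable naming used in this section ($NODE11_ID, $VLAN50_ID, $NODE21_ETH0_ID and so on), and that link-subnet accepts mode=link_up for a NIC that should get no address. It covers only the VLAN NICs and leaves the two physical NICs out; the helper names are made up for illustration.

```shell
#!/bin/sh
PROFILE=19-root   # assumed CLI profile name

# vlan_nic_cmds NODE VLAN PARENT
#   NODE is 11, 12, 21 or 22; PARENT is eth0 or eth1.
#   Prints create-vlan plus a static link-subnet for PARENT.VLAN.
vlan_nic_cmds() (
  node=$1 vlan=$2 parent=$3
  parent_uc=$(printf '%s' "$parent" | tr 'a-z' 'A-Z')   # eth0 becomes ETH0
  zone=${node%?}                                         # 21 becomes 2
  ip="10.$vlan.$((zone - 1)).$((100 + node))"
  printf 'maas %s interfaces create-vlan $NODE%s_ID vlan=$VLAN%s_ID parent=$NODE%s_%s_ID\n' \
    "$PROFILE" "$node" "$vlan" "$node" "$parent_uc"
  printf 'maas %s interface link-subnet $NODE%s_ID %s.%s mode=static subnet=cidr:10.%s.0.0/20 ip_address=%s\n' \
    "$PROFILE" "$node" "$parent" "$vlan" "$vlan" "$ip"
)

# node_cmds NODE: all VLAN NICs for one node, following the matrix.
node_cmds() (
  node=$1
  for vlan in 50 100 150; do vlan_nic_cmds "$node" "$vlan" eth0; done
  for vlan in 200 250; do vlan_nic_cmds "$node" "$vlan" eth1; done
  if [ "$node" = 11 ]; then
    # node-11 only: eth1.99 is linked to its subnet but gets no IP address.
    printf 'maas %s interfaces create-vlan $NODE11_ID vlan=$VLAN99_ID parent=$NODE11_ETH1_ID\n' "$PROFILE"
    printf 'maas %s interface link-subnet $NODE11_ID eth1.99 mode=link_up subnet=cidr:10.99.0.0/20\n' "$PROFILE"
  else
    # All other nodes get eth1.30 on the storage-cluster subnet.
    vlan_nic_cmds "$node" 30 eth1
  fi
)

for node in 11 12 21 22; do
  printf '# node-%s\n' "$node"
  node_cmds "$node"
done
```

Because the variable names are printed literally, the output can be saved as a script and run after the ID variables have been set and exported.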

We can verify the steps worked by listing all NICs of each node with $ maas 19-root interfaces read $NODE##_ID, or with a quick look in the web UI.

Next Steps: Advanced networking with Juju (on MAAS)

Whew… quite a few steps needed to configure MAAS nodes for complex networking scenarios. Automating those with scripts seems the obvious solution, and in fact I’m working on a generalized solution, which I’ll post about soon, when it’s in a usable state. 

We now have the nodes configured the way we want them to be. In the following post we’ll take a deeper look into how to orchestrate the deployment of OpenStack on this MAAS with Juju. You’ll get a brief overview of the advanced networking features available in Juju 2.0.1 (released in October, 2016). Stay tuned, and thanks for reading so far! 🙂

Convenient links to all articles in the series:

  1. Introduction
  2. Hardware Setup
  3. MAAS Setup
  4. Nodes Networking (this one)
  5. Advanced Networking
  6. Finale

40 Comments

  • Tsvetin Vasilev
    March 11, 2016 - 11:46 | Permalink

    Hi Dimiter,

    It is very interesting article but it is still unclear for me how that env will be integrated with Openstack?

    Configuring all the networks described what will be our benefit at the end? Is there a way to push openstack using these via juju because now looks like when I deploy the openstack charms they’re going to use first available network, do we need to create manually some ovs bridges in order to utilize maas configured network?
    Is there a way to instruct juju what network shall be used, for example if I have:
    1. Maas server with two networks 192.168.0.0/24(eth0) and 10.0.0.0/24(eth1) both set managed with DHCP & DNS enabled
    2. Juju server with one addapter in 10.0.0.0
    3. openstack controller node again with one adapter in 10.0.0.0
    4. compute node connected to both 192.168.0.0 and 10.0.0.0
    When I start deploying juju it brings up juju-br0 on 10.0.0.0 network on 2 & 3 but when it comes to the compute node when both networks are available it try to utilize 192.168.0.0. for juju-br0 and that create problems. I tried to set 192.168.0.0 to unmanaged state (without dns/dhcp) or even to delete it and add it afterwards but I hit exactly the same result. Looks like for some reason juju decide to use that network for its activities instead of 10.0.0.0 as on all the other nodes.
    May be there is a way how to specify what network/adapter shall be used?

    Kind regards,
    Tsvetin

    • dimitern
      March 14, 2016 - 19:58 | Permalink

      Hi Tsvetin,

      I’ve been crazily busy lately, so I didn’t have time to reply earlier.

      So, a short answer: the next post(s) will describe exactly the details you’re asking about – i.e. how Juju can use and manage the described MAAS setup to deploy components of OpenStack to the machines / containers they need to go to, including setting up a multi-NIC network config on the machines / containers.

      Right now the plan is to finish the bulk of the feature work for Juju 2.0 by the end of this month, including the enhanced networking features.
      I’ll post updates with a lot more details once the mentioned features are usable.

      Thanks!
      Dimiter

      • Tsvetin Vasilev
        March 17, 2016 - 10:49 | Permalink

        Thanks a lot Dimiter!

        Will wait for your new posts with great interest

        BTW, I think would be useful if those posts/docs are somehow attached to official juju/maas/ubuntu page, they have lot of detailed info what is not well described in the official docs (there are very-very basic use cases and almost no info about the openstack on top of juju/maas )

        Thanks a lot,
        Tsvetin

  • April 27, 2016 - 22:11 | Permalink

    Hi Dimiter
    Thanks for some great posts!
    I bit curious when you will publish the post about Advanced networking? Both Juju 2.0 and MAAS 2.0 is on the way and things will change a lot compared to previous versions. Will you update your posts with the 2.0 versions?
    Thanks again!
    David

    • dimitern
      April 28, 2016 - 08:59 | Permalink

      Thanks David!
      I will post the last 2 articles in the series soon, not sure when as we’re pretty busy around the 2.0 release of Juju.
      So once it’s out, I’ll have some time for new posts.
      I plan later to post updates to outline the differences between Juju 1.25 and 2.0 on MAAS 1.9 and 2.0.

      • Johan
        April 29, 2016 - 22:19 | Permalink

        Looking forward to the next two articles in this series and the differences between Juju 1.25 and 2.0 and MAAS 1.9 and 2.0. We are currently setting up a test environment for Juju and MAAS. Thanks for your hard work 🙂

  • May 2, 2016 - 12:53 | Permalink

    Hi Dimiter

    I’ve setup a system according to your posts and have made attempts to install OpenStack but run into the following two issues:

    1. When using multiple networks, as in the setup, and deploying LXC containers the containers are stuck in pending state because there is no route to one of the subnets (10.100.0.111). How can I configure the LXC to use 10.14.0.111 instead? A workaround to use I did was to put all the subnets for the nodes as unconfigured but then I got stuck on issue #2.

    + printf Attempt 5 to download tools from %s…\n https://10.100.0.111:17070/tools/1.25.5-trusty-amd64
    Attempt 5 to download tools from https://10.100.0.111:17070/tools/1.25.5-trusty-amd64
    + curl -sSfw tools from %{url_effective} downloaded: HTTP %{http_code}; time %{time_total}s; size %{size_download} bytes; speed %{speed_download} bytes/s --noproxy * --insecure -o /var/lib/juju/tools/1.25.5-trusty-amd64/tools.tar.gz https://10.100.0.111:17070/tools/1.25.5-trusty-amd64
    curl: (7) Failed to connect to 10.100.0.111 port 17070: Connection timed out

    2. When I try to install the openstack-base charm using juju i get the following error for keystone:
    unit-keystone-0[914]: 2016-05-02 08:16:48 ERROR unit.keystone/0.juju-log server.go:268 FATAL ERROR: Could not determine OpenStack codename for version 8.1

    Thanks in advance!
    David

    • dimitern
      May 4, 2016 - 09:45 | Permalink

      Hi David,

      With Juju 1.25 the options are limited, Juju 2.0 supports multi-NIC containers on MAAS (each container gets provisioned with as many NICs as its host, which in turn gets multiple bridges created for that to work).
      Juju 1.25 only creates a single bridge. Can you please file a bug with the issue #1 you’re having, and attach the contents of /etc/network/interfaces on the host and the container, also /var/log/juju/machine-0.log will be useful.
      About issue #2, I suspect it might be because you’re not using xenial for that machine and the charm cannot find the expected packages to install?

      Hope this helps,
      Dimiter

      • May 4, 2016 - 11:10 | Permalink

        Thanks Dimiter!
        Please reply with a link where I can submit the bug.

        Regarding issue #2: There is a new update of openstack-base charm, version 41, that solves this issue and which I’ve verified.

        BTW: when will final releases of Juju 2.0 and MAAS 2.0 be available? Just beta versions available which I’ve tried but ran into some strange issues.

        Regards,
        David

        • dimitern
          May 4, 2016 - 11:21 | Permalink

          Sure, please use the following link to file a bug against juju:
          https://bugs.launchpad.net/juju-core/+filebug

          Glad to hear you managed to resolve issue #2!

          As for the release dates for Juju 2.0 and MAAS 2.0, unfortunately I can’t give you an exact date,
          but it should be soon (I suspect within the next 2 weeks).

          Cheers,
          Dimiter

  • bdx
    May 3, 2016 - 20:55 | Permalink

    Great article! After giving it a once over, I have a few questions for you.

    What is the ‘compute-data’ network used for?
    Are openstack charms aware of ‘compute-data’ network, how?

    Thanks!

    • dimitern
      May 4, 2016 - 09:54 | Permalink

      Thanks James!

      The compute-data network is used for guest VM traffic (instances started by OpenStack) to other guest VMs or to OpenStack services.
      If you have a look at the first post in the series, the architecture diagram shows the different networks OpenStack uses (the ‘data’ network there is called ‘compute-data’ in later posts, to distinguish it from the ‘storage-data’ network).
      OpenStack charms (nova-compute and neutron-gateway in particular) are aware of the compute-data network, either via config settings or extra-bindings (in latest versions of OpenStack charms).

      I see you asked this on the Juju mailing list as well, I’d suggest to ask there for details about how OpenStack charms are configured (I’m not quite up-to-date with the latest developments).

      Cheers,
      Dimiter

  • Luis Lozano
    May 17, 2016 - 17:11 | Permalink

    Hi, nice article, looking forward to seeing what’s next in this series.

    Regards

  • ak
    May 18, 2016 - 18:17 | Permalink

    Thanks for these posts! Extremely helpful so far. I’m also looking forward to the Juju 2.0 updates.

    As a quick question to the guidance above, using your “Physical Network Layout” diagram it appears you have physically setup the following:

    Nodes 12:eth0 –> maas-zone1-sw/4
    Nodes 21:eth0 –> maas-zone2-sw/2
    Nodes 22:eth0 –> maas-zone2-sw/4

    In the “Setting up Wiring and Switches” section you have also tagged the following ports on each of the switches as below:

    maas-zone1-sw/4: VLAN 1,30,99,200,250
    maas-zone2-sw/2: VLAN 1,30,99,200,250
    maas-zone2-sw/4: VLAN 1,30,99,200,250

    However, you are tagging the Node eth0 interfaces as:

    Nodes 12:eth0 –> VLAN 50,100,150,200,250
    Nodes 21:eth0 –> VLAN 50,100,150,200,250
    Nodes 22:eth0 –> VLAN 50,100,150,200,250

    Wilth VLANs 50,100,150 assigned to the odd ports on both switches, will this traffic ever navigate to the eth0 interfaces of these nodes?

    If you can get a chance to add a screen shot of the Web UI Node>Interfaces section, this would probably clear up the confusion for me.

    This is admittedly my first foray into advanced networking scenarios with MAAS and as such my understanding may be off.

    Once again, thanks for these posts as all other guidance does not present such a full picture as this.

    • dimitern
      May 19, 2016 - 10:26 | Permalink

      Thanks!

      I see how it might be confusing, because the ‘Physical Network Layout’ shows a simplified diagram, while the ‘Setting up Wiring and Switches’ describes the actual hardware setup.

      maas-zone1-sw has 24 ports, maas-zone2-sw has 8 ports.
      Out of those 24 ports in maas-zone1-sw only the following ports have wires plugged in them:

      • Port #1: connected to MAAS’s eth1, and carrying all trunked VLAN traffic (tagged and untagged)
      • Port #3: connected to node-11’s eth0, and carrying both untagged and tagged traffic (for VIDs 50, 100, 150)
      • Port #4: connected to node-11’s eth1, and carrying both untagged and tagged traffic (for VIDs 30, 99, 200, 250)
      • Port #5: connected to node-12’s eth0, and carrying both untagged and tagged traffic (for VIDs 50, 100, 150)
      • Port #6: connected to node-12’s eth1, and carrying both untagged and tagged traffic (for VIDs 30, 99, 200, 250)
      • Port #24: connected to maas-zone2-sw’s port #1, and carrying all trunked VLAN traffic (tagged and untagged)

      Very similar picture for the 8 ports in maas-zone2-sw; only the following ports are used:

      • Port #1: connected to maas-zone1-sw’ port #24, and carrying all trunked VLAN traffic (tagged and untagged)
      • Port #3: connected to node-21’s eth0, and carrying both untagged and tagged traffic (for VIDs 50, 100, 150)
      • Port #4: connected to node-21’s eth1, and carrying both untagged and tagged traffic (for VIDs 30, 99, 200, 250)
      • Port #5: connected to node-22’s eth0, and carrying both untagged and tagged traffic (for VIDs 50, 100, 150)
      • Port #6: connected to node-22’s eth1, and carrying both untagged and tagged traffic (for VIDs 30, 99, 200, 250)

      So both eth0 and eth1 on each node can receive all untagged traffic, and tagged traffic for only those VLANs configured on the ports each node NIC is connected to.

      I hope this helps 🙂
      Dimiter

      • Dean Poulin
        October 8, 2016 - 23:12 | Permalink

        Thanks so much for these posts!! I’m following them now to setup my openstack cluster. I have gotten up the the point of attempting to deploy to one of the nodes and after deployment through MAAS and sshing into the node and trying to ping google.com returns back ” ubuntu@node-15:~$ ping google.com
        connect: Network is unreachable”. I’m not sure why DNS isn’t working.

        I noticed something different between your comment here and the network diagram above. In this comment you are skipping port #2 and say that the switch is configured with port #1 connected to the MAS server and ports #3-6 connected to the nodes. should it be port #1 connected to the MAS server and ports #2-5 connected to the nodes for each switch?

        • dimitern
          October 9, 2016 - 11:55 | Permalink

          Hey Dean,

          Please double check you have the ‘maas-dns’ package installed on your MAAS server. Also, setting the ‘DNS Server’ field of the PXE subnet (10.14.0.0/20 as described in the posts) to the IP address of the MAAS server (10.14.0.1) might help with getting DNS to work (redeploy the node after changing that). If your nodes can resolve ‘google.com’, but cannot ping e.g. 8.8.8.8, you’ll need to set up the iptables rule for NAT-ing outgoing traffic from the nodes to the MAAS server IP address.

          As for the switch ports, what I’m describing with screenshots in the previous posts is how I configured the switch ports initially – #2 wasn’t used, #1 was connected to MAAS or the last port #24 of switch 1. The schematic diagram is simplified, not showing all ports. The principle is to set up the allowed VLANs in the switches configuration, such that the first NIC of each NUC is on the port carrying VLANs 0, 50, 100, 150, and the second NIC of each NUC – on ports carrying 0, 200, 250, 30, and 99. One way to do that is to use even/odd ports split, but you can also use any setup you like of course.

          As for

          • Dean Poulin
            October 10, 2016 - 01:20 | Permalink

            Thanks for your claifications! I ended up getting it to work. What I had to do was actually go in and setup iptables on the MAAS machine. with these commands
            $ sudo iptables -t nat -A POSTROUTING -o em1 -j MASQUERADE
            $ sudo iptables -A FORWARD -i em1 -o em2 -m state --state RELATED,ESTABLISHED -j ACCEPT
            $ sudo iptables -A FORWARD -i em2 -o em1 -j ACCEPT
            $ sudo iptables-save | sudo tee /etc/iptables.sav
            and updated rc.local like so
            #!/bin/sh -e
            #
            # rc.local
            #
            # This script is executed at the end of each multiuser runlevel.
            # Make sure that the script will “exit 0” on success or any other
            # value on error.
            #
            # In order to enable or disable this script just change the execution
            # bits.
            #
            # By default this script does nothing.
            iptables-restore < /etc/iptables.sav

            exit 0

            I removed the lines from /etc/network/interfaces
            post-up iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source 192.168.1.170
            post-down iptables -t nat -D POSTROUTING -o eth0 -j SNAT --to-source 192.168.1.170

            the one thing that I had to do was execute a MAAS api command to set the default_gateway_ip like so:
            $ maas admin subnet update 10.14.0.0/20 gateway_ip=10.14.0.1

            After doing that the machines had internet access.

            I just finished running the ubuntu openstack autopilot install on my set of servers and it completed successfully!

            Thanks so much for guiding me through the steps, your posts are very clear and easy to follow. Without your guide this would have taken me so much longer to complete!

  • Ogi
    May 31, 2016 - 17:14 | Permalink

    Hi

    i managed to install autopilot twice but i am not able to reproduce reliably. I have MAAS network 192.168.26.0/23 and a public network 192.168.24.0/23. I am able to provision nodes without any problems or errors. The actual problem start with running openstack autopilot. Two things happen it runs through the setup and completes but when installing openstack open vswitch does not show any networks that is supposed to learn from maas. Second more annoying problem is that autopilot fails frequently and i think it something to do with networking since lxc containers just hang until timeout is reached. Each node has eth3 and eth4 configured as eth3 is in maas network and eth4 in public network. Depending on how i set the public network IP as auto assign or unassigned depends how far i get through the install. Every 1 in 10 installs seems to succeed. Can you shine some light maybe on why would a setup fail?

    Logs show:

    juju.state allwatcher.go:351 getting a public address for unit “landscape-server/0” failed: “public no address”

    • dimitern
      May 31, 2016 - 17:30 | Permalink

      Hey,
      Have you tried juju-2.0beta7 and LXD containers instead of LXC? If you have to use LXC, I’d strongly suggest the config setting ‘lxc-clone=false’ to avoid at least some of the issues.
      Also, what version of MAAS are you using?

      I haven’t tried autopilot myself, but in the last few days I was able to deploy openstack from a bundle a few times.. some things are a bit unreliable still, unfortunately. I’m verifying the steps needed to do it and those will be part of the next post, but I’d like to make sure I’ve tried different combinations and possible workarounds for issues that can happen.

      With containers, I’d suggest looking for any that come up with machine-local IPs for any reason (e.g. 10.3.x.y) and run juju remove-machine X/lxc/Y --force then juju add-unit --to lxc:X (same for LXD: s/lxc/lxd/). Example: ceph-mon/2 failed to come up OK in 2/lxc/3. Run juju remove-machine 2/lxc/3 --force then juju add-unit ceph-mon --to lxc:2.

      Hope that might help,
      Dimiter

      • Ogi
        May 31, 2016 - 17:43 | Permalink

        hey,

        so i am running 1.9.2 maas and i seem to be stuck. Where do i find ‘lxc-clone=false’ option in which .conf file?

        For some reason 10.0.3.0 network is always present:

        default 192.168.26.1 0.0.0.0 UG 0 0 0 juju-br0
        10.0.3.0 * 255.255.255.0 U 0 0 0 lxcbr0
        192.168.24.0 * 255.255.254.0 U 0 0 0 eth3
        192.168.26.0 * 255.255.254.0 U 0 0 0 juju-br0

        • dimitern
          May 31, 2016 - 20:21 | Permalink

          The 10.0.3.0/24 network comes from the lxc package (which sets up lxcbr0). From the output I can infer you’re using Juju 1.25.x, where only a single bridge is created for single-NIC containers (in your case the 192.168.26.0/23 looks like the maas-network, on the default route, so juju-br0 was created there). I’d try to bootstrap with juju manually and add 1 container to see if it comes up OK before trying anything more complicated, like autopilot (which also uses juju): try following the docs – https://jujucharms.com/docs/stable/getting-started

          Briefly, you’ll need to edit (or create if missing) ~/.juju/environments.yaml with these:

          Then, juju bootstrap -e maas-19, followed by juju add-machine lxc:0 and e.g. watch juju status as it happens.

          • Ogi
            June 1, 2016 - 14:41 | Permalink

            Hey,

            ok i tried that and it still fails but looking at the logs for LXC containers i see they try and eth0 as a network card? None of my nodes use eth0, they all have eth2 configured and i use eth4. just so it happens how they were hooked up to the switch. So LXC logs show things like this

            So my question is really is there a workaround for this? Isn’t this supposed to come from MAAS?

          • dimitern
            June 1, 2016 - 18:19 | Permalink

            The “eth0” interface inside the container is generated by Juju, not related to the names of interfaces on the host machine.

            Can you paste the full /var/log/cloud-init-output.log file from the container to e.g. http://paste.ubuntu.com please ? From the host you can use sudo lxc-attach -n juju-machine-0-lxc-0 bash to get inside. It will be useful to also see the in-container contents of /etc/network/interfaces, /etc/network/interfaces.d/eth0.cfg (if it’s there – alternatively either 00-juju.cfg and/or 50-cloud-init.cfg if any of them are there).

            You did mention trying to set the maas-network or public one static or unconfigured, etc. In order for juju to be able to allocate an address for the container, the maas-network needs to be Static or Auto on the host. DNS issues are likely due to not having maas-dns package installed on the server?

            Thanks!

  • Ogi
    June 2, 2016 - 11:50 | Permalink

    Hey thanks for helping out, this is where i am at the moment: http://paste.ubuntu.com/16942805/

    After more investigation I have figured out that inside the LXC containers apt is, for some reason, using a proxy. I do not need a proxy, but I think when I ran apt-get install maas it installed maas-proxy as well. If I ping and dig from the container I am able to hit the internet just fine. The issue is that apt desperately wants to use some internal proxy, and that is failing.

    the log is massive but here is part of it: http://paste.ubuntu.com/16942814/

    (dimitern: Note I reformatted the comment to put logs in pastes)

    • dimitern
      June 3, 2016 - 11:56 | Permalink

      I suspect the maas-proxy is acting up in your setup.
      Alternatively, the LXC is not using the MAAS DNS server (it should, assuming the host node also is).

      I’d suggest uninstalling the maas-proxy and trying another deployment. Since this appears to be a MAAS issue, I’d also suggest asking in #maas on the Freenode IRC network.

  • Garrett Goebel
    June 9, 2016 - 19:39 | Permalink

    Thank you! Glad to see you are actively responding to comments.

    Do you have an ETA on Next Steps: Advanced networking with Juju (on MAAS) and openstack deployment?

    • dimitern
      June 10, 2016 - 11:04 | Permalink

      Hey Garrett,

      I have a working draft of the next post, just testing a few bits of it before I publish (hopefully some time over the weekend).

      Cheers!

  • Gabriele
    October 10, 2016 - 17:08 | Permalink

    Hi Dimitern,

    Thank you so much for your excellent posts.
    I followed all the steps until this section and I found it very interesting.
    My 7 nodes are now all marked as Ready (in MAAS 2.0) and perfectly configured (Network, VLANS, Fabrics, Spaces…)
    I look forward to reading what the next steps are to deploy OpenStack with Juju 2.0.

    Thanks again for your great job.

  • October 19, 2016 - 19:51 | Permalink

    Great Article series, been reading at length… but I wonder if you have any idea, no matter how I deploy MAAS, either your method or my method… I still get…

    Nodes, BOOT via PXE….

    and I get a warning about the url_helper trying to reach 169.254.169.254, which seems to be a common message if you Google it.

    Any ideas ?

    • dimitern
      October 19, 2016 - 21:20 | Permalink

      Thanks Andrew!

      Where are you getting this error? I doubt it’s on the PXE console, is it in /var/log/maas/*.log ?

      169.254.169.254 is the EC2 metadata service address, which cloud-init hits during node deployment, and MAAS emulates that metadata service.

      Describe your network setup on MAAS and the nodes?

  • October 20, 2016 - 01:27 | Permalink

    Thanks for prompt reply..

    Network setup exactly as yours….other than the External Network, which has different IP Address/Subnet….

    As for the nodes, it happens when a node PXE boots, approx 45 seconds after booting via PXE, just after the node displays its IP address and route info.

    The node does not shut down or register with MAAS, e.g. it does not appear in nodes, but it is observed by MAAS.

    checking logs

  • October 20, 2016 - 01:33 | Permalink

    It’s not in the /var/log/maas/*.log

    I’m using Ubuntu 14.04.5 LTS

    • dimitern
      October 20, 2016 - 10:58 | Permalink

      OK, in such cases I found it’s easier to try this on the MAAS server first before digging any deeper – usually fixes the issue:

      sudo dpkg-reconfigure maas-region-controller
      sudo dpkg-reconfigure maas-cluster-controller # maas-rack-controller in MAAS 2.0+

      For both cases, use the same IP address (in the former case as part of the MAAS URL) – the one on the managed subnet used for PXE booting.

      HTH, let me know 😉

  • October 21, 2016 - 19:40 | Permalink

    dimitern

    That resolved it, why it has the wrong IP Address, I’ll never know, I’m sure I checked it!

    Regards

    Andrew

  • Don
    January 7, 2017 - 04:59 | Permalink

    I have read this and the other articles so many times and maybe I am just confusing something in Neutron but what exactly happens at the end of 10.1.99.0 (compute-external).

    I have a network where the PXE/Internet proxy and everything else are 2 separate networks. So I need the Neutron/Horizon network on one VLAN and the Internet on another. I know the PXE network will route to the internet, and that is working, but I need to understand how to force Horizon and the VMs to work over our internal lab network without involving the internet in any way. What I need to do is put a static route in place and set up Horizon and the VMs to respond over it.

    So take your example and assume that 10.14.0.0 is my internet access but 10.99.0.0 is my corp network. Then route default over 10.14.X but static-route specific networks over 10.99.0.0. The issue seems to be routing within the LXDs, I think, or at least specifically the openstack-dashboard LXC. The LXCs do not get the static routes but the bare metal machines do.

    • Don
      January 7, 2017 - 21:30 | Permalink

      Sorry, ignore my last comment. I am getting a better handle on it now via pinger. I need to work through some details, but the big issue is that I need to deliver a static route to all the nodes. I have tried via the GUI, but that does not work because it requires you to create the subnet in MAAS and then add a static route to it. I would prefer to do it via a dhcp_snippet, but I can’t find any documentation on creating those. This would be my static route, and I know it is weird, but:

      route add -net 10.0.0.0/8 gw 10.1.32.1

      10.1.32.X/24 is my lab network but it is not my PXE network and PXE network has no access to it.
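
      For reference, a minimal sketch (not part of the original comment) of how the net-tools command above maps to the iproute2 form. The network and gateway are the values quoted in the comment; route_cmd and the DRY_RUN switch are hypothetical names used only for illustration. By default the script only prints the command, and a route added this way would not persist across reboots.

      ```shell
      # Values taken from the comment above.
      NET="10.0.0.0/8"
      GW="10.1.32.1"
      # Hypothetical switch: 1 (default) prints the command, anything else runs it.
      DRY_RUN="${DRY_RUN:-1}"

      route_cmd() {
          # Build the iproute2 command as a string so it can be inspected first.
          printf 'ip route add %s via %s\n' "$1" "$2"
      }

      if [ "$DRY_RUN" = "1" ]; then
          route_cmd "$NET" "$GW"
      else
          # Requires root privileges.
          ip route add "$NET" via "$GW"
      fi
      ```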

  • ageiger
    January 26, 2017 - 04:33 | Permalink

    I’m attempting a setup in MAAS 2.1.1, so I hope this is still a reasonable question to ask. When a node is deployed, should it be able to access the internet without further configuration? They can resolve DNS properly, but they never ping. I could set the http_proxy to my controller’s PXE address (port 3128), and then I get a 403 error.

    Or do I need to add in iptables rules or another squid proxy (besides the default one configured at /var/lib/maas/maas-proxy.conf )?

    I feel like I’ve missed something obvious, since a lot won’t work if the internet can’t be seen (like downloading images for juju to use), but there doesn’t seem to be a ton on the topic, and there doesn’t seem to be a ton of complaints about it.

    (I really appreciate any advice – the tutorial here helps enormously in making sense of the transition from 1.9 setups to the newer 2.x ones.)

    • ageiger
      January 27, 2017 - 18:40 | Permalink

      It seems that the setup had a bad NAT configuration. I needed the following added to my /etc/rc.local to get my maas-management NIC on the region/rack controller to allow nodes to talk through the maas-external NIC.

      /sbin/iptables -t nat -A POSTROUTING -o enp3s0f0 -j MASQUERADE
      /sbin/iptables -A FORWARD -i enp3s0f0 -o enp3s0f1 -m state --state RELATED,ESTABLISHED -j ACCEPT
      /sbin/iptables -A FORWARD -i enp3s0f1 -o enp3s0f0 -j ACCEPT

      echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf
      sysctl -p

      Now to really get started!
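
      A quick way to check a setup like the one above is to read the kernel's forwarding flag and list the NAT rules. This is a sketch under assumptions, not part of the original comment: forwarding_state is a hypothetical helper name, and the script only reads state without changing anything.

      ```shell
      forwarding_state() {
          # $1 is the content of /proc/sys/net/ipv4/ip_forward ("1" or "0").
          case "$1" in
              1) echo "enabled" ;;
              0) echo "disabled" ;;
              *) echo "unknown" ;;
          esac
      }

      PROC_FILE="/proc/sys/net/ipv4/ip_forward"
      if [ -r "$PROC_FILE" ]; then
          echo "IPv4 forwarding: $(forwarding_state "$(cat "$PROC_FILE")")"
      fi

      # To list the NAT rules as well (requires root):
      #   iptables -t nat -S POSTROUTING
      ```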
