Automated Server Provisioning with Device42, Stackstorm, and PXE Kickstart

TL;DR: This automation provides a provisioning workflow that can stand up any number of new servers in a data center, with configuration controlled at the device-specific level. We’ll use Device42 as a central hub for configuration info and to store each machine’s current lifecycle stage within the automation, while Stackstorm orchestrates the workflow and handles remote execution. Finally, we’ll use a dynamically distributed PXE Kickstart system for the actual work of installing a customized operating system on client machines.

If you’re looking for more in-depth coverage of this automation, take a look at the accompanying blog post, Automated Server Provisioning Technical Walkthrough.

Introduction:

Provisioning a new flock of servers in a data center is a substantial task that often requires the coordination of multiple individuals and teams. Sure, bootstrapping a single new production server might be a simple task that only takes part of an afternoon, but the same cannot be said for adding 10 or 100 such servers. When considered alongside all the other responsibilities of everyone involved with running the data center, the desire for a provisioning automation is immediately understandable.

That desire will be fulfilled today as we discuss an automation that streamlines and standardizes server provisioning workflows. Imagine that 100 new servers are needed in a data center to help deal with increased load.  Let’s say our requirements are that the first 50 machines will run databases, and the second 50 machines will run web servers. After ordering the machines, we will know their serial numbers, MAC addresses, hardware models, and potentially their intended role in the data center stack.   With only this limited information, we have everything we need to create a dynamic provisioning automation.

To deliver this automation, we’ll depend on the CI data in Device42 to broker networking and OS configurations. Here at Device42 we know that a crucial component of any powerful automation is accountable CI data for your entire infrastructure. Any CMDB or DCIM product in charge of managing data center infrastructure needs to provide accurate data for automations to work from, and this is certainly the case with Device42. Furthermore, this automation will serve to demonstrate Device42’s modern role as a dynamic configuration broker that promotes scalable, elastic infrastructure.

This provisioning automation will utilize Device42’s “Lifecycle Event Actions” system to track devices through the provisioning workflow. When lifecycle events are added to devices, we’ll configure webhooks to be emitted and processed by Stackstorm, a powerful event-driven automation platform.

We will use Device42’s IP suggestion function (a feature of its IPAM module) to pick addresses for new devices and automatically add them to a specific subnet. The automation will take care of making DHCP reservations for these devices to ensure they’re assigned the correct IP. We’ll then demonstrate a dynamic PXE configuration distribution method that delivers machine-specific OS installation instructions. Dynamic PXE configurations let us control which OS is provisioned on a machine, and exactly how it is provisioned, all from a single field in Device42.

Stackstorm is an “if this then that” Event-Driven Automation platform that supports our Device42 package. Our Stackstorm package provides you with valuable integration tools and workflows out of the box,  including the automation we will cover today. Stackstorm is highly customizable and can interface with just about anything, which makes it a wonderful platform for our integrations.

If you’re looking for an in-depth tutorial on how this automation is built and configured, take a look at the accompanying blog Automated Server Provisioning Technical Walkthrough.

Device42 Lifecycle Overview

As mentioned in the introduction, we will be keeping track of the current stage of each device throughout this automation using  Device42’s lifecycle event actions.

This automation will take machines all the way from being ordered by the procurement team to having a full OS installed. We’ll represent that process with the following lifecycle stages:

  • Purchasing
      • Machines are purchased
      • A spreadsheet is uploaded to D42 to initially create all the devices.
      • The spreadsheet contains (at least): serial number, MAC address, hardware model number, and the string “provisioning_auto” in the notes field.
        • The string in the notes field can be customized. This is discussed later.
        • All other fields are optional, but the OS field can be used to specify which PXE configuration the machine gets upon first boot.
  • Mounting
      • Physical machines are mounted in the data center once delivered.
      • Each physical machine’s BIOS should be set to PXE boot.
  • Networking
      • Device42 suggests an IP for the machine in a specified subnet.
      • Stackstorm creates IP for device on D42 and adds the IP to the subnet
      • A DHCP reservation is created for that device’s MAC and IP address.
      • A specific PXE configuration file is copied from a template matching the OS field in Device42, allowing for device specific PXE installation.
  • OS_Provisioning
      • The device properly boots for the first time and:
        • requests an IP from DHCP
        • receives a PXE configuration over TFTP
        • reboots
        • obtains the Kickstart file from the Nginx server
        • automatically installs the OS via Kickstart
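
The ordering of the stages above can be sketched as a tiny state machine. This is only an illustration: the stage names come from the list, but the helper functions and the device record layout (`serial_no`, `lifecycle`) are assumptions made for the example, not part of Device42 or the Stackstorm pack.

```python
# Stage names are taken from the lifecycle list; everything else is illustrative.
STAGES = ["purchasing", "mounting", "networking", "os_provisioning"]


def next_stage(current):
    """Return the stage that follows `current`, or None at the end of the workflow."""
    index = STAGES.index(current.lower())  # raises ValueError for unknown stages
    if index + 1 < len(STAGES):
        return STAGES[index + 1]
    return None


def advance(device):
    """Return a copy of a (hypothetical) device record moved to its next stage."""
    following = next_stage(device["lifecycle"])
    if following is None:
        raise ValueError(
            "device %s has already finished provisioning" % device["serial_no"]
        )
    return dict(device, lifecycle=following)
```

Keeping the stages in one ordered list means a new stage can be slotted into the workflow without touching the transition logic.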

This automation is designed to break down the provisioning workflow into smaller pieces of work, as seen in these lifecycle stages. Splitting one large, complex task into smaller, simpler tasks makes each piece easier to build, test, and troubleshoot. Additionally, lifecycle events provide us with a simple way to trigger any external system using webhooks as devices move through the provisioning automation.

That said, let’s see the automation in action!

Demonstration:

Create an Import/Export spreadsheet with a single device  like so:

Upload it to Device42 to create this device by navigating to Tools > Imports / Exports (xls):

Our first Stackstorm action in the automation will be triggered by this device creation. The device42.lifecycle_triggered_object_category_change rule adds a ‘purchasing’ lifecycle event to the new devices so that they can be easily found in Device42 or referenced in external systems.
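
The decision this first step makes can be sketched as a small filter: only devices tagged for provisioning get a ‘purchasing’ event. This is a rough sketch, not the pack’s rule definition. The payload field names (`name`, `notes`) and the shape of the event record are assumptions for illustration and may not match the actual Device42 webhook payload.

```python
# The tag is the notes string from the import spreadsheet described earlier.
PROVISIONING_TAG = "provisioning_auto"


def should_start_provisioning(payload, tag=PROVISIONING_TAG):
    """True when a device-creation payload carries the provisioning tag in its notes.

    `payload` is a hypothetical dict; the real webhook schema may differ.
    """
    notes = payload.get("notes") or ""
    return tag in notes


def purchasing_event(payload):
    """Build an illustrative lifecycle event record for a newly created device."""
    return {"device": payload["name"], "type": "purchasing"}
```

Because the tag is a parameter, a customized notes string only needs to be changed in one place.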

Once the device arrives at the data center, a data center operator moves it into the ‘mounting’ lifecycle stage and then racks and mounts it. Once complete, the operator moves the device into the ‘networking’ lifecycle stage and the networking lifecycle automation executes.

The networking lifecycle automation first gets information about the device from D42, gets an IP suggestion, and creates that IP on the device:
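
To make the idea of an IP suggestion concrete, here is a local stand-in written with Python’s standard `ipaddress` module. It is not Device42’s algorithm and makes no API call; in the real workflow the suggestion comes from Device42’s IPAM data. The function name and the “first free host address” policy are assumptions for the example.

```python
import ipaddress


def suggest_ip(subnet, used):
    """Return the first usable host address in `subnet` that is not in `used`.

    A simplified local stand-in for an IPAM suggestion; returns None when
    every host address in the subnet is already taken.
    """
    taken = {ipaddress.ip_address(ip) for ip in used}
    for host in ipaddress.ip_network(subnet).hosts():
        if host not in taken:
            return str(host)
    return None
```

For example, with `10.0.0.1` and `10.0.0.2` already assigned in `10.0.0.0/29`, the function returns `10.0.0.3`.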

Next, a DHCP lease reservation is created via an OMAPI request to the network DHCP server for that host.  This can be verified in /var/lib/dhcp/dhcpd.leases on the DHCP server:
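
As a point of reference, a reservation ties one MAC address to one fixed IP. The sketch below renders the equivalent static host declaration in ISC dhcpd syntax. It does not speak OMAPI, and the entry the DHCP server actually records in its leases file may contain additional statements. The function name and the host name `web01` are hypothetical.

```python
import ipaddress
import re

# Accepts colon-separated MAC addresses such as 00:50:56:00:00:02.
MAC_PATTERN = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$")


def host_declaration(name, mac, ip):
    """Render an ISC dhcpd host block that pins `ip` to `mac`."""
    mac = mac.lower()
    if not MAC_PATTERN.match(mac):
        raise ValueError("not a colon-separated MAC address: %r" % mac)
    ipaddress.ip_address(ip)  # raises ValueError if `ip` is malformed
    return (
        "host %s {\n"
        "  hardware ethernet %s;\n"
        "  fixed-address %s;\n"
        "}\n" % (name, mac, ip)
    )


if __name__ == "__main__":
    print(host_declaration("web01", "00:50:56:00:00:02", "10.0.0.5"))
```

Validating the MAC and IP before rendering keeps a malformed spreadsheet row from turning into a broken DHCP entry.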

After the DHCP reservation is made, the action device42.write_pxe_cfg creates a custom PXE configuration file for our device that corresponds to the device’s OS field.  This can be verified in /opt/tftpboot/pxelinux.cfg/01-00-50-56-00-00-02 on the PXE server:
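
The file name in that path follows the pxelinux convention: `01-` (the ARP hardware type for Ethernet) followed by the MAC address in lowercase, with dashes instead of colons. Here is a minimal sketch of the naming and copy step. It is not the pack’s actual implementation, and it assumes, purely for illustration, that each template file is named after the value of the device’s OS field.

```python
import shutil
import tempfile
from pathlib import Path


def pxe_cfg_name(mac):
    """Return the pxelinux per-machine config file name for a MAC address."""
    return "01-" + mac.lower().replace(":", "-")


def write_pxe_cfg(mac, os_name, template_dir, cfg_dir):
    """Copy the template for `os_name` to the per-MAC file in `cfg_dir`.

    Assumes one template file per OS value (a hypothetical layout).
    """
    template = Path(template_dir) / os_name
    if not template.is_file():
        raise FileNotFoundError("no PXE template for OS %r" % os_name)
    target = Path(cfg_dir) / pxe_cfg_name(mac)
    shutil.copyfile(template, target)
    return target


if __name__ == "__main__":
    # Demo in a throwaway directory instead of the real TFTP root.
    with tempfile.TemporaryDirectory() as root:
        (Path(root) / "templates").mkdir()
        (Path(root) / "pxelinux.cfg").mkdir()
        (Path(root) / "templates" / "centos7").write_text("default install\n")
        print(write_pxe_cfg("00:50:56:00:00:02", "centos7",
                            Path(root) / "templates", Path(root) / "pxelinux.cfg"))
```

Because pxelinux checks for the MAC-specific file before any more general configuration, copying a different template is enough to change what a given machine installs.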

To cap it off, the networking lifecycle automation moves the device into the os_provisioning lifecycle. The networking stage should complete within about a minute.

Now, the machine will boot, receive its specific PXE configuration over TFTP, obtain required OS files, reboot, and have the Kickstart script automate its path through the OS installation process.

You can watch the machine run through the installation all on its own, which is quite satisfying…

After installation is complete, the server is ready to be logged into using the credentials specified in the Kickstart configuration file. We can check ifconfig to verify that the IP suggested by Device42 was, in the end, assigned to the interface with the device’s MAC address:

Conclusion:

This post just scratches the surface of this automation. Much more information is available in the Automated Server Provisioning Technical Walkthrough.

This automation covers a lot of ground and makes big strides towards the ideal, fully automated data center. Provisioning workflows are a great automation target because they are typically rather consistent and systematic. Even so, these workflows often get bogged down by change request forms and inter-team delay. By implementing a pre-approved provisioning automation with a known workflow, we can alleviate much of that frustration and inefficiency.

Additionally, this automation helps promote a higher degree of standardization in your data center, which comes with its own benefits beyond saving time and resources. Namely, if all the servers in your data center belong to one of a few known archetypes, automating remediation workflows for common issues becomes far more powerful and achievable. Perhaps that’s an automation to cover here in the future!

As always, these blogs and libraries are released to our readers and clients with the hope that they will empower your team with everything you’ll need to develop your own custom automations using Device42.  If you’ve already been developing these types of solutions, we’d love to get in touch with you and hear all about it!

If you have any questions or suggestions, leave a comment below, or get in touch via email at support@device42.com for general inquiries, or email the author directly at will.acheson@device42.com.  Thanks for reading!