OpenStack Architecture & Components

OpenStack: The Open Source Cloud Operating System

OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter. These resources are managed and provisioned through APIs, and a web dashboard gives administrators control while letting users provision resources themselves.

History & Origins

OpenStack was launched in 2010 as a joint project of NASA and Rackspace Hosting. Today, it is governed by the Open Infrastructure Foundation and supported by a global community of contributors, including Red Hat, Canonical, and Mirantis. Major users include CERN, AT&T, and Walmart.

1. Why We Need OpenStack

  • Vendor Neutrality: Avoids lock-in to proprietary providers like AWS or VMware.
  • Cost Efficiency: Allows large organizations to utilize their existing hardware as a highly efficient private cloud.
  • API-Driven Infrastructure: Enables "Infrastructure as Code" (IaC) for internal IT teams.

2. Core Components in Depth

Nova (Compute)

The primary engine of OpenStack. It manages the lifecycle of virtual machines, interacting with hypervisors to schedule and spawn instances.
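
To make "lifecycle" concrete, here is a minimal sketch of an instance state machine. The state names follow Nova's vm_state values (building, active, stopped, deleted), but the action names and the transition table are simplified assumptions for illustration, not Nova's actual implementation.

```python
# Allowed transitions: (current state, action) -> next state.
# Action names are hypothetical labels chosen for this sketch.
TRANSITIONS = {
    ("building", "spawned"): "active",
    ("active", "stop"): "stopped",
    ("stopped", "start"): "active",
    ("active", "delete"): "deleted",
    ("stopped", "delete"): "deleted",
}


def next_state(state, action):
    """Return the state after applying an action, or reject an illegal one."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot '{action}' an instance in state '{state}'")


state = "building"
for action in ("spawned", "stop", "start"):
    state = next_state(state, action)
print(state)
```

Encoding the transitions as a table keeps illegal operations (such as starting a deleted instance) out of the code paths that talk to the hypervisor.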

Neutron (Networking)

Provides "Networking as a Service." It manages IP addresses, VLANs, and security groups to ensure connectivity between VMs.
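
The two ideas above, address management and security groups, can be sketched with Python's standard ipaddress module. This is a toy model, not the Neutron API: the Subnet class, the ingress_allowed function, and the rule dictionary layout are assumptions made for illustration.

```python
import ipaddress


class Subnet:
    """Toy stand-in for a Neutron subnet that hands out fixed IPs from a CIDR."""

    def __init__(self, cidr):
        self.network = ipaddress.ip_network(cidr)
        self._free = iter(self.network.hosts())   # usable addresses, in order
        self.gateway = str(next(self._free))      # assumption: first usable IP is the gateway
        self.ports = {}                           # port name -> allocated IP

    def allocate(self, port_name):
        """Assign the next free address to a port (raises StopIteration when full)."""
        ip = str(next(self._free))
        self.ports[port_name] = ip
        return ip


def ingress_allowed(rules, protocol, port, source_ip):
    """Security groups are default-deny: traffic passes only if some rule matches."""
    src = ipaddress.ip_address(source_ip)
    return any(
        rule["protocol"] == protocol
        and rule["port_min"] <= port <= rule["port_max"]
        and src in ipaddress.ip_network(rule["remote_cidr"])
        for rule in rules
    )


subnet = Subnet("10.0.0.0/24")
vm_ip = subnet.allocate("vm1-port")
ssh_only = [{"protocol": "tcp", "port_min": 22, "port_max": 22,
             "remote_cidr": "192.168.1.0/24"}]
print(vm_ip, ingress_allowed(ssh_only, "tcp", 22, "192.168.1.5"))
```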

Cinder (Block Storage)

Provides persistent block-level storage volumes. These function like virtual hard drives that can be attached to or detached from VMs.
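
A small sketch of what "persistent" means here: the volume's contents outlive any single attachment. The Volume class is a hypothetical in-memory model; only the status names "available" and "in-use" mirror values Cinder reports.

```python
class Volume:
    """In-memory model of a block volume that can move between servers."""

    def __init__(self, name, size_gb):
        self.name = name
        self.size_gb = size_gb
        self.status = "available"   # not attached to any server
        self.attached_to = None
        self.blocks = {}            # block number -> bytes; stands in for disk contents

    def attach(self, server_id):
        if self.status != "available":
            raise RuntimeError(f"{self.name} is already attached to {self.attached_to}")
        self.status, self.attached_to = "in-use", server_id

    def detach(self):
        if self.status != "in-use":
            raise RuntimeError(f"{self.name} is not attached")
        self.status, self.attached_to = "available", None


vol = Volume("data-vol", size_gb=10)
vol.attach("vm-a")
vol.blocks[0] = b"hello"     # vm-a writes to the volume
vol.detach()
vol.attach("vm-b")           # a different VM now sees the same data
print(vol.attached_to, vol.blocks[0])
```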

Swift (Object Storage)

A highly redundant system for storing unstructured data (like images or backups) across a distributed cluster of servers.
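
The sketch below illustrates how an object's name can deterministically select the servers that hold its replicas. It is a simplified take on Swift's ring idea (hash the object path, keep the top bits as a partition number); the real ring also salts the hash and balances partitions across zones and device weights. PART_POWER, REPLICAS, and the round-robin placement in nodes_for are assumptions for illustration.

```python
import hashlib

PART_POWER = 4   # 2**4 = 16 partitions (tiny, for readability)
REPLICAS = 3     # copies kept of every object


def partition_for(account, container, obj):
    """Map an object path to a partition using the top bits of its MD5 hash."""
    path = f"/{account}/{container}/{obj}".encode()
    digest = hashlib.md5(path).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - PART_POWER)


def nodes_for(partition, nodes):
    """Pick REPLICAS distinct nodes for a partition (naive round-robin placement)."""
    return [nodes[(partition + i) % len(nodes)] for i in range(REPLICAS)]


cluster = ["node1", "node2", "node3", "node4", "node5"]
part = partition_for("AUTH_demo", "backups", "db.tar.gz")
print(part, nodes_for(part, cluster))
```

Because placement depends only on the object's path, any proxy server can locate the replicas without consulting a central index.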

Keystone (Identity)

The central authentication and authorization service. Users authenticate with Keystone to obtain a token, and every other OpenStack service validates that token with Keystone before acting on a request.
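
As a rough illustration of the token flow, the following sketch models an identity service that issues opaque tokens and lets other services validate them. IdentityService and its methods are hypothetical; real Keystone tokens, storage, and expiry handling differ. The caller passes the current time explicitly so the example stays deterministic.

```python
import secrets


class IdentityService:
    """Toy token service: issues opaque tokens and validates them for other services."""

    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self._users = {}    # name -> {"password", "project", "roles"}
        self._tokens = {}   # token -> {"user", "project", "roles", "expires_at"}

    def add_user(self, name, password, project, roles):
        # Plaintext passwords keep the sketch short; a real service stores hashes.
        self._users[name] = {"password": password, "project": project, "roles": roles}

    def issue_token(self, name, password, now):
        user = self._users.get(name)
        if user is None or not secrets.compare_digest(
                user["password"].encode(), password.encode()):
            raise PermissionError("invalid credentials")
        token = secrets.token_urlsafe(16)
        self._tokens[token] = {"user": name, "project": user["project"],
                               "roles": user["roles"], "expires_at": now + self.ttl}
        return token

    def validate(self, token, now):
        """Called by other services; returns the token's context or None."""
        info = self._tokens.get(token)
        if info is None or now >= info["expires_at"]:
            return None
        return info


keystone = IdentityService(ttl_seconds=60)
keystone.add_user("alice", "s3cret", project="demo", roles=["member"])
token = keystone.issue_token("alice", "s3cret", now=0)
print(keystone.validate(token, now=10)["project"])
```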

Glance (Image Service)

Acts as the library for virtual machine disk images. It stores the templates (Ubuntu, Windows, etc.) used to boot new instances.

3. OpenStack vs. AWS Comparison

Service Type     | OpenStack Project | AWS Equivalent
-----------------|-------------------|-----------------------
Virtual Machines | Nova              | Amazon EC2
Object Storage   | Swift             | Amazon S3
Block Storage    | Cinder            | Amazon EBS
Identity Service | Keystone          | AWS IAM
Networking       | Neutron           | Amazon VPC
Dashboard        | Horizon           | AWS Management Console

4. Standard Provisioning Workflow

  1. User logs into Horizon and authenticates via Keystone.
  2. User requests a VM; Nova pulls the OS template from Glance.
  3. Neutron assigns a private IP and configures the virtual network.
  4. Cinder attaches a persistent storage volume to the new instance.
  5. The VM goes live on a physical compute node, ready for user access.

In-Depth: The Nova VM Spawning Workflow

Spawning a Virtual Machine in OpenStack is an asynchronous process coordinated by the Message Queue (RabbitMQ). It involves multiple sub-services within the Nova project working in concert.
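
The asynchronous hand-off can be sketched with Python's standard queue and threading modules, where a queue.Queue stands in for a RabbitMQ queue. The function names and message layout are assumptions for illustration; Nova's real RPC layer is not shown.

```python
import queue
import threading

bus = queue.Queue()   # stands in for a RabbitMQ queue
instances = {}        # instance name -> vm_state


def api_boot(name):
    """Record the request, publish a message, and return without waiting."""
    instances[name] = "building"
    bus.put({"method": "build_instance", "name": name})
    return instances[name]


def worker():
    """Consume messages until a None sentinel arrives."""
    while True:
        msg = bus.get()
        if msg is None:
            bus.task_done()
            break
        instances[msg["name"]] = "active"   # pretend the build succeeded
        bus.task_done()


state_at_return = api_boot("vm1")   # the caller sees "building" immediately
consumer = threading.Thread(target=worker)
consumer.start()
bus.put(None)                       # tell the worker to stop after draining
bus.join()
consumer.join()
print(state_at_return, "->", instances["vm1"])
```

The API call returns before any build work happens, which is why clients poll the instance status rather than block on the boot request.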

1. Key Architectural Components

  • Nova API: The entry point. It validates user requests, has the user's token verified by Keystone, and checks resource quotas.
  • Nova Scheduler: The "matchmaker" that uses filters and weights to choose the best physical host.
  • Nova Conductor: The coordinator that handles complex tasks and shields the database from direct access by Compute nodes.
  • Nova Compute: The worker agent that talks to the hypervisor (KVM/QEMU) on the physical host.

2. The Step-by-Step Spawning Process

Phase A: Request & Authentication

  1. API Intake: User sends a "boot" request to nova-api.
  2. Identity Check: API verifies the user's token with Keystone and checks resource quotas.
  3. Initial State: An entry is created in the Nova DB; the instance's vm_state is set to building, which the API reports as status BUILD.
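
Phase A can be sketched as a single function that performs the three steps in order. Everything here is a stand-in: VALID_TOKENS replaces a call to Keystone, QUOTA and nova_db are plain dictionaries, and the boot function and error messages are hypothetical.

```python
import uuid

VALID_TOKENS = {"token-alice": {"user": "alice", "project": "demo"}}   # stands in for Keystone
QUOTA = {"demo": {"instances": 2, "ram_mb": 4096}}                      # per-project limits
nova_db = {}                                                            # instance id -> record


def boot(token, name, ram_mb):
    """Validate the caller, enforce quotas, then create the initial DB record."""
    ctx = VALID_TOKENS.get(token)                       # step 2: identity check
    if ctx is None:
        raise PermissionError("401 Unauthorized")
    project = ctx["project"]
    owned = [rec for rec in nova_db.values() if rec["project"] == project]
    limits = QUOTA[project]
    over_count = len(owned) + 1 > limits["instances"]
    over_ram = sum(rec["ram_mb"] for rec in owned) + ram_mb > limits["ram_mb"]
    if over_count or over_ram:                          # step 2: quota check
        raise RuntimeError("403 Quota exceeded")
    instance_id = str(uuid.uuid4())
    nova_db[instance_id] = {"name": name, "project": project, "ram_mb": ram_mb,
                            "vm_state": "building", "host": None}   # step 3
    return instance_id


first = boot("token-alice", "vm1", ram_mb=2048)
print(nova_db[first]["vm_state"])
```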

Phase B: Scheduling & Placement

  1. Messaging: API sends an RPC message to the Message Queue for the nova-conductor.
  2. Host Selection: Conductor asks nova-scheduler for a host. The scheduler filters hosts based on RAM/CPU availability and weights them for the best fit.
  3. Record Update: Conductor updates the DB with the selected Host ID.
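
The filter-then-weigh approach from step 2 can be sketched as follows. The Host dataclass, the two filter functions, and the choice to weigh by free RAM (a spreading policy) are assumptions for illustration; Nova ships many more filters and configurable weighers.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_ram_mb: int
    free_vcpus: int


def ram_filter(host, req):
    return host.free_ram_mb >= req["ram_mb"]


def cpu_filter(host, req):
    return host.free_vcpus >= req["vcpus"]


def select_host(hosts, req, filters=(ram_filter, cpu_filter)):
    """Drop hosts that fail any filter, then pick the one with the highest weight."""
    candidates = [h for h in hosts if all(f(h, req) for f in filters)]
    if not candidates:
        raise LookupError("No valid host was found")
    return max(candidates, key=lambda h: h.free_ram_mb)   # weight: most free RAM


hosts = [
    Host("node1", free_ram_mb=2048, free_vcpus=8),    # fails the RAM filter
    Host("node2", free_ram_mb=8192, free_vcpus=1),    # fails the CPU filter
    Host("node3", free_ram_mb=16384, free_vcpus=4),
    Host("node4", free_ram_mb=6144, free_vcpus=16),
]
chosen = select_host(hosts, {"ram_mb": 4096, "vcpus": 2})
print(chosen.name)
```

Filters are hard constraints and weights are preferences, so adding a new placement rule usually means deciding which of the two it is.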

Phase C: The Build (Compute Node)

  1. Task Handoff: Conductor sends an RPC message to nova-compute on the target physical host.
  2. Resource Assembly: Compute node contacts Glance (Image), Neutron (Network/IP), and Cinder (Storage) to prepare the environment.
  3. Hypervisor Call: Compute triggers the Hypervisor (e.g., KVM via libvirt) to boot the VM using the gathered resources.

Why the Conductor exists: Historically, Compute nodes talked directly to the database. The Nova Conductor was introduced as a proxy so that Compute nodes no longer need database credentials; if a single physical server is compromised, the attacker does not gain direct access to the cloud-wide database.
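
The proxy idea can be sketched as below: the compute node holds only a handle to the conductor, never to the database. The class names and the UPDATABLE whitelist are hypothetical and exist only to make the isolation visible; in Nova the conductor is reached over the message queue rather than by direct method calls.

```python
class Database:
    """Stands in for the central Nova database; only the conductor holds a reference."""

    def __init__(self):
        self.instances = {"vm1": {"vm_state": "building", "host": "node3"}}


class Conductor:
    UPDATABLE = {"vm_state", "power_state"}   # hypothetical whitelist for this sketch

    def __init__(self, db):
        self._db = db

    def instance_update(self, instance_id, **changes):
        """Apply an update on behalf of a compute node, rejecting other fields."""
        rejected = set(changes) - self.UPDATABLE
        if rejected:
            raise PermissionError(f"fields not updatable by compute: {sorted(rejected)}")
        self._db.instances[instance_id].update(changes)


class ComputeNode:
    def __init__(self, conductor):
        self.conductor = conductor   # conductor handle only; no database credentials

    def report_spawned(self, instance_id):
        self.conductor.instance_update(
            instance_id, vm_state="active", power_state="running")


db = Database()
conductor = Conductor(db)
node = ComputeNode(conductor)
node.report_spawned("vm1")
print(db.instances["vm1"]["vm_state"])
```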

3. OpenStack vs. AWS Mapping

OpenStack Service | AWS Equivalent
------------------|--------------------------------------
Nova Scheduler    | EC2 Placement Groups / Internal Logic
Nova Compute      | AWS Nitro System / Hypervisor Agent
RabbitMQ (Queue)  | Internal Event Bus (Proprietary)