Many networked devices, such as UNIX servers, routers and switches, are accessible through local serial console ports, for initial configuration and emergency management.
Today, 'console management' is an integral part of the data center, allowing secure access to systems even when the network is unavailable. Until now, this functionality has not been available on Microsoft Windows servers, which are traditionally managed with cumbersome KVM (keyboard, video and mouse) switches.
In Windows Server 2003, Microsoft introduced Emergency Management Services (EMS), a powerful suite of features for out-of-band management. EMS enables true 'headless' server operation, which significantly reduces hardware costs by eliminating the need for video cards, monitors, keyboards and mice during OS installation, operation and recovery. This paper provides an overview of out-of-band management and EMS, and shows how to connect EMS ports to networks.
In-Band vs. Out-of-Band Management
By default, systems administrators use standard tools such as Terminal Services and Microsoft Management Console (MMC) to manage Windows servers through the network. These in-band tools require the target server to cooperate; if the server's TCP/IP stack isn't functioning, for example, there is no way to reach the server.
An out-of-band connection through a serial console port relies on only the most primitive of operating system services. As long as the kernel is functioning, access to the system is possible – even if the network stack or user interface is down. And because the console port delivers only text data, it offers good performance over low-bandwidth connections such as dial-up lines. Out-of-band management is a reliable way to reach servers when things have gone wrong.
Out-of-band management doesn't replace in-band management – it improves a system administrator's ability to quickly respond to situations when standard tools aren't available. Of course, faster response times translate to increased uptime.
Emergency Management Services
Emergency Management Services (EMS) is a suite of features spread over multiple elements of Windows Server 2003. Together, these applications enable remote management and system recovery through the server's serial console port, even when the server is unavailable through the network. EMS consists of Console Redirection and the Special Administration Console.
Console redirection allows administrators to monitor and control the boot process by sending all system output to both the video adapter (if present) and the serial console port. Likewise, it accepts inputs from the console port, as well as the keyboard (if present).
During a server's power-on self-test, before Windows begins to load, console redirection is typically a function of the server's BIOS. EMS console redirection starts as soon as Windows begins to load, and is available until the Windows graphical interface becomes active. It has been integrated into a number of tools, including the setup loader, the text-based setup process, the recovery console, the operating system loader and the Stop Error handler.
Once the Windows graphical interface is active, console redirection is no longer available, and EMS focuses on the Special Administration Console.
Special Administration Console
The Special Administration Console (SAC) allows administrators to access and control the operating system when standard remote management tools such as Terminal Services are unavailable. Since SAC is a kernel-level function, it remains accessible even after higher-level applications have ceased to respond because, for example, a misbehaving program has used all available memory.
SAC provides low-level emergency features, such as the ability to:
- set the server's IP address during the initial install
- reconfigure IP settings to regain in-band connectivity
- reboot or shut down the server
- list all used and available resources (physical memory, kernel memory, etc.)
- list all processes, kill them, limit memory usage or change their priority
- create command prompt channels, for access to the file system
- run text-based applications (e.g., traceroute, telnet, etc.)
SAC provides the ability to analyze the logs and restart the server even after a Stop Error (a.k.a. 'Bluescreen'), though most other features will be disabled.
SAC is text-based and command-line driven, very much like DOS.
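To give a feel for the command-line style, the sketch below maps the tasks listed above to SAC commands. The command names are recalled from Microsoft's SAC documentation and should be treated as an assumption; typing ? at the SAC> prompt shows the authoritative list for a given server. The dictionary and the sac_command helper are hypothetical names introduced only for this illustration.

```python
# Illustrative mapping from the tasks listed above to SAC commands.
# Command names are recalled from Microsoft's SAC documentation and are
# an assumption here; confirm them with "?" at the SAC> prompt.
SAC_COMMANDS = {
    "show or set IP settings": "i",
    "restart the server": "restart",
    "shut down the server": "shutdown",
    "list processes and resource usage": "t",
    "kill a process": "k <pid>",
    "limit a process's memory": "m <pid> <MB>",
    "lower a process's priority": "l <pid>",
    "raise a process's priority": "r <pid>",
    "create a command prompt channel": "cmd",
    "list channels": "ch",
}


def sac_command(task: str, **args: object) -> str:
    """Return the SAC command line for a task, filling in <name> placeholders."""
    line = SAC_COMMANDS[task]
    for name, value in args.items():
        line = line.replace(f"<{name}>", str(value))
    return line
```

For example, sac_command("kill a process", pid=1234) builds the line an administrator would type to end process 1234.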
Is EMS Reliable?
Because EMS is a function of the kernel, and doesn't rely upon any drivers or applications, it is much more robust than either network-based management tools or KVM switches.
Network-based tools, such as Windows Terminal Services, require the server's TCP/IP stack to be running and require system memory. For managing the Windows GUI, KVM switches require keyboard, video and mouse drivers to be running and accessible.
To test EMS, we created a program that uses up all system memory, leaving the server functioning but responding very slowly. Pressing Ctrl+Alt+Delete brought up the Windows Security screen, but a restart was not executed in a reasonable timeframe. Using SAC, we were able to kill the runaway process immediately and to restart the server. EMS offers much more reliable access than a KVM switch, and can quickly restore in-band management capabilities.
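The test program itself is not reproduced in this paper. As a rough idea of what such a program does, the following sketch allocates memory in fixed-size chunks and holds on to them. The function name and both parameters are hypothetical, and the limit_mb cap is a safety measure added for this illustration; the test described above had no such cap.

```python
def consume_memory(chunk_mb: int = 16, limit_mb: int = 64) -> list:
    """Allocate memory in chunks until limit_mb is reached and return the chunks.

    The cap is a safety measure added for this sketch; the test described
    above used all available memory, which should only be tried on a
    disposable test server.
    """
    chunk_size = chunk_mb * 1024 * 1024
    chunks = []
    allocated_mb = 0
    while allocated_mb + chunk_mb <= limit_mb:
        # Each chunk is a zero-filled buffer that stays referenced,
        # so the memory is not released while the list is alive.
        chunks.append(bytearray(chunk_size))
        allocated_mb += chunk_mb
    return chunks
```

Holding the returned list keeps the memory in use; dropping the reference releases it again.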
The problem of easily connecting serial devices to networks was solved more than a decade ago. Terminal servers (not to be confused with Terminal Services, a Windows application) are standalone network devices featuring an Ethernet interface and multiple serial ports, for attaching devices such as terminals and printers to the network without tying up valuable server resources. Typically, the serial devices are connected by telnet, or through COM port redirection software.
In the UNIX world, terminal servers are commonly used to provide access to the serial console ports of servers, routers and switches, for out-of-band management. The console ports are connected to the terminal server, which is attached to the network switch, and optionally, a dial-up modem. Using telnet or Secure Shell (SSH), the administrator can connect directly to a console port from anywhere on the corporate TCP/IP network, over the internet, or through a direct dial-up connection.
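The connection path described above can be sketched in a few lines. This is a minimal illustration that assumes the terminal server exposes each serial port as a raw TCP socket, which is a common arrangement but not something this paper specifies; in production an SSH client would take the place of the plain socket. The function name, host and port number are hypothetical.

```python
import socket


def send_console_command(host: str, port: int, command: str,
                         timeout: float = 2.0) -> str:
    """Send one command line to a console port exposed as a raw TCP socket
    and return whatever text arrives before the connection goes quiet."""
    received = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # Serial consoles generally expect a carriage return to end a line.
        conn.sendall(command.encode("ascii") + b"\r")
        try:
            while True:
                data = conn.recv(4096)
                if not data:  # peer closed the connection
                    break
                received.append(data)
        except socket.timeout:
            pass  # no more output within the timeout; treat as end of reply
    return b"".join(received).decode("ascii", errors="replace")
```

A call such as send_console_command("termserver.example.com", 2001, "t") would then send a single command to the console attached to that port and return the reply as text.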
Today's terminal servers are optimized for console port management. Encrypted SSH version 2 has replaced the insecure telnet connection, and all data exchanged with the console port is logged for troubleshooting and auditing purposes. The latest generation of terminal servers can also scan the console output for keywords and issue an SNMP trap or email message in case of an emergency. For ease of use, many terminal servers now offer a web-based interface for configuration and for access to the connected servers, routers and other network devices.
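The keyword scanning mentioned above is simple in principle. The sketch below is not any vendor's implementation; it only shows the idea of matching logged console lines against a keyword list, with the function name and the sample keywords chosen for illustration. Sending the SNMP trap or email is left out.

```python
from typing import Iterable, List, Tuple


def scan_console_output(lines: Iterable[str],
                        keywords: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line_number, keyword, line) for every keyword found in the log.

    Matching is case-insensitive; line numbers start at 1.
    """
    lowered = [k.lower() for k in keywords]
    alerts = []
    for number, line in enumerate(lines, start=1):
        text = line.lower()
        for keyword in lowered:
            if keyword in text:
                alerts.append((number, keyword, line.rstrip("\n")))
    return alerts
```

Each returned tuple identifies the line that triggered the alert, which is the information a trap or email notification would need to carry.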
Making EMS Easy
Windows administrators are accustomed to graphical user interfaces, rather than command lines. With that in mind, some new terminal servers have a browser-based point-and-click interface to SAC, which simplifies server management.
The browser-based front end makes SAC's functionality instantly available, and reduces training time to a minimum. Additionally, HTTPS makes the SAC interface secure.
Out-of-band management is the reliable way to reach servers when things go wrong. Emergency Management Services brings the benefits of out-of-band management to Windows Server 2003 systems, providing access to vital server functions over a low-bandwidth connection, even when the server's network stack and user interface are down.
By adding new Windows systems to terminal servers, one common technology can be used to reach any server, router or network device within a company through a simple graphical user interface, regardless of its location or operating system. This approach consolidates and simplifies out-of-band data center management, and can contribute to increased uptime and greater IT efficiency.
Burk Murray is vice president of terminal servers, Digi International (www.digi.com).