The open archive for STFC research publications

Full Record Details

DOI 10.5286/raltr.2010018
Persistent URL http://purl.org/net/epubs/work/53009
Record Status Checked
Record Id 53009
Title Developing a resilient power management portal for an enterprise infrastructure
Abstract The Scientific Computing Technology (SCT) group within the Science and Technology Facilities Council's (STFC) e-Science centre is responsible for managing large-scale cluster computing services for the UK's e-Infrastructure as well as STFC's core Facilities. Such an infrastructure needs managing, and SCT has invested a great deal of time and effort in finding suitable products to streamline System Administration duties where possible, and in developing in-house tools when solutions either do not exist or do not justify the cost. Alongside the requirement to provide power management functionality to an OS commissioning service, SCT also saw the need to extend that functionality to cover all power outlets available in the SCT infrastructure so that a consistent view could be maintained. To meet this challenge, a new service was developed which manages Power Distribution Units (PDUs), Intelligent Platform Management Interfaces (IPMI), and VMware Virtual Machines. With this service, authenticated System Administrators can perform fine-grained power control of servers and Virtual Machines, using information collected from both Oracle and MySQL databases. The service employs a database caching technique and runs in a fault-tolerant environment so that it remains available in the event of a building power failure, the scenario in which it is deemed most needed. It gives a consistent view of an enterprise infrastructure, with fine-grained control for power management. For the first time, SCT can perform fully automated Operating System deployment on virtually any host in its infrastructure.
Organisation ESC, STFC, ESC-SCT
Funding Information
Related Research Object(s):
Licence Information:
Language English (EN)
Type Report
Details RAL Technical Reports RAL-TR-2010-018. 2010.
URI(s)
Local file(s) RALTR2010018.pdf
Year 2010