Predictive Analysis – Fault Tolerance

The goal of this project is to build a state-of-the-art fault-tolerance system that uses Predictive Analysis to predict, based on past events, if and when faults such as component failures may occur.

Fault Tolerance is the ability of a system to continue running when one or more of its dependencies fail, whether through faults in hardware, firmware, the OS, software, or other components. This is usually addressed through some form of redundancy, which ensures that when a component fails, another is available to replace it automatically with minimal or, ideally, no degradation in performance. Live Migration, which moves running processes from one Host to another, has an advantage over purely redundant solutions: it can load and migrate processes dynamically without requiring duplicate resources, thus increasing the capacity and efficiency of the Data Center while reducing cost.

Combining Live Migration with Predictive Analysis can yield a more robust Fault Tolerant System. This would give Red Hat’s portfolio of products a differentiating advantage by building on existing investments while adding value for customers and subscribers.

In the initial phase, the approach is as follows: if degradation of performance can be detected for one or more elements in the system, the affected processes can be migrated from the Host or location experiencing the degradation to another Host in a “safer”, well-functioning environment. In this scenario, live migration allows processes to move from one location to another for better load balancing based not only on available resources, but also on historically reliable resources.
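
The decision logic of this initial phase can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not part of the project: the Host record, the latency samples used as the degradation signal, the threshold of 1.5 times the baseline, and the use of a past fault count as the measure of historical reliability. It only plans migrations; it does not call any real migration API.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Host:
    # Hypothetical record; all fields are illustrative assumptions.
    name: str
    free_capacity: float   # fraction of resources currently unused, 0.0 to 1.0
    past_faults: int       # faults recorded over the observation period
    latencies_ms: list = field(default_factory=list)  # recent samples, oldest first


def is_degrading(host, window=3, threshold=1.5):
    """Flag a host whose recent mean latency exceeds `threshold` times its earlier baseline."""
    samples = host.latencies_ms
    if len(samples) < 2 * window:
        return False  # not enough history to judge
    baseline = mean(samples[:-window])
    recent = mean(samples[-window:])
    return recent > threshold * baseline


def pick_target(hosts, source):
    """Choose the healthy host with the fewest past faults, breaking ties by free capacity."""
    candidates = [
        h for h in hosts
        if h is not source and not is_degrading(h) and h.free_capacity > 0
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda h: (h.past_faults, -h.free_capacity))


def plan_migrations(hosts):
    """Return (source, target) name pairs for every degrading host that has a safe target."""
    plan = []
    for host in hosts:
        if is_degrading(host):
            target = pick_target(hosts, host)
            if target is not None:
                plan.append((host.name, target.name))
    return plan


hosts = [
    Host("host-a", 0.2, 4, [10, 11, 10, 24, 26, 30]),
    Host("host-b", 0.6, 0, [9, 10, 9, 10, 9, 10]),
    Host("host-c", 0.8, 3, [12, 11, 12, 12, 11, 12]),
]
migration_plan = plan_migrations(hosts)
print(migration_plan)
```

In this sample data, host-c has the most free capacity, but host-b is chosen as the target because it has the better fault history, which reflects the preference for historically reliable resources over merely available ones. A real system would replace the fixed threshold with a predictive model trained on past events.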

Status

Research Area(s)

Contacts

Project Resources

RIG(s)

Affiliations