Tuesday, April 29, 2008

Causes and Limitations of Thrashing

In computer science, thrash is the term used to describe a degenerate situation on a computer where increasing resources are used to do a decreasing amount of work. Usually it refers to two or more processes accessing a shared resource repeatedly such that serious system performance degradation occurs because the system is spending a disproportionate amount of time just accessing the shared resource. Resource access time may generally be considered as wasted, since it does not contribute to the advancement of any process.

Cause of Thrashing
Thrashing results in several performance problems. Consider the following scenario, which is based on actual behavior of early paging systems.

The operating system monitors CPU utilization. If the CPU utilization is too low, we increase the degree of multiprogramming by introducing new processes to the system. A global page-replacement algorithm is used; it replaces pages without regard to the process that they belong. Now suppose that a process enters a new phase in its execution and needs more frames. It starts faulting and taking frames away from other processes. These processes need those pages, however, and so they also fault, taking frames away from other processes. These faulting processes must use the paging device to swap pages in and out. As they queue up for the paging device, the ready queue empties. As the processes wait for the paging device, CPU utilization decreases.

The CPU scheduler sees the decreasing CPU utilization and increases the degree of multiprogramming as a result. The new process tries to get started by taking frames from running processes, causing more page faults and a longer queue for the paging device. As a result CPU utilization drops even further, and CPU scheduler tries to increase the degree of multiprogramming even more. Thrashing has occurred, and system throughput plunges. The page fault rate increases tremendously. As a result, the effective memory-access time increases. No work is being done.

Limit the effect of Thrashing
The effect of thrashing can be limited by using a local replacement algorithm. With local replacement, if one process starts thrashing, it cannot steal frames from another process and cause the latter to thrash as well. However, the problem is not entirely solved. If the processes are thrashing they will be in the queue for the paging device. The average service time for a page fault will increase because of the longer average queue for the paging device. Thus, the effective access time will increase even for a process that is not thrashing.

To prevent thrashing we must provide a process with as many frames as it needs. But how do we know how many frames it “needs”?. There are several techniques. The working-set strategy is one such. This approach defines the locality model of process execution.

No comments: