Module 4: Queuing Models and Little's Law

Question of the Day

TBD

Assigned Reading

Chapter 3.1 - 3.2. ECE 6400 students should also read Chapter 4.4 as an independent study.

Queuing Models

The goal of this module is to introduce the simplest way in which we can model the behavior of a link.

Queueing Systems

A queueing system consists of one or more servers that provide some desired service to arriving customers, together with a line/queue in which those customers must wait to be served. The queue must be able to hold zero or more customers. Note that the term queue is often used to refer to both the servers and the line together. After all, it doesn't make sense to wait in a queue for nothing, right?

Such a system consists of the following dynamics, in order:

  1. A customer arrives at the queueing system.
  2. The customer waits in the queue for service by a server.
  3. The customer receives service from a server.
  4. The customer departs (ideally happily) from the queueing system.

Bringing this discussion back to what we know about LANs: a customer is a packet, message, or frame; the queue is a buffer or block of memory in which the packet is placed while it waits for processing; a server is the link, switch, or other resource on which the packet is waiting; and a departure is a packet that is transmitted by the server.

A queueing system is often depicted as a diagram in which arrivals enter a waiting line that feeds one or more servers, with departures leaving from the servers.

As an example, let's consider a point-to-point topology with a single link that allows data to be requested by one end host from a remote end host. Let's assume this link has no probability of error during its operation.

In this case, the customers are the requests from the client's application, the queue is the source buffer feeding into the NIC of the client, the service desired is the transmission of the request over the point-to-point link, and a departure is a completed transmission over the link to the remote end host.

Classification of Queues

Kendall's Notation

In order to keep track of the different types of queues, we'll adopt a notation called Kendall's notation. The general form appears as follows:

x / y / n / m / p

where:

  • x is the distribution of interarrival times
    • M is Markovian or memoryless, which implies a Poisson arrival process (i.e. exponentially distributed interarrival times)
    • D is Deterministic or constant
    • Er is r-stage Erlang
    • G is General (i.e. any valid distribution function)
  • y is the distribution of service times
    • M is Markovian or memoryless, which implies exponentially distributed service times
    • D is Deterministic or constant
    • Er is r-stage Erlang
    • G is General (i.e. any valid distribution function)
  • n is the number of parallel servers
    • supports 1 to infinity
    • greater than 1 server exploits processing parallelism
  • m is the number of customers allowed in the queueing system
    • supports 1 to infinity
    • 1 implies zero queue length and one position in server itself
    • infinity implies an infinite buffer space (assume infinity if m is omitted)
    • any surplus in customers must be returned to sender or discarded
  • p is the population size or number of customers that generate arrivals
    • supports 1 to infinity
    • if p is finite and p customers are in the system, then no new arrivals may occur (i.e. source exhaustion)
    • if p is omitted, assume p is infinite
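As a sketch, the notation can be unpacked mechanically. The helper below is illustrative (the function name and dictionary are invented here); the defaulting rules for omitted fields follow the description above:

```python
# Expand a Kendall-notation string into its component meanings.
# Omitted m and p default to infinity, as described above; codes not in
# the table (e.g. Er) pass through unchanged.
ARRIVAL_SERVICE = {"M": "Markovian (memoryless)", "D": "deterministic", "G": "general"}

def parse_kendall(notation):
    """Parse 'x/y/n[/m[/p]]' into a dict of queue parameters."""
    parts = notation.split("/")
    if len(parts) < 3:
        raise ValueError("Kendall notation needs at least x/y/n")
    x, y, n = parts[0], parts[1], int(parts[2])
    m = int(parts[3]) if len(parts) > 3 else float("inf")  # system capacity
    p = int(parts[4]) if len(parts) > 4 else float("inf")  # population size
    return {
        "interarrival": ARRIVAL_SERVICE.get(x, x),
        "service": ARRIVAL_SERVICE.get(y, y),
        "servers": n,
        "capacity": m,
        "population": p,
    }

print(parse_kendall("M/M/1"))
print(parse_kendall("M/D/2/10"))
```

For example, M/M/1 denotes Poisson arrivals, exponential service times, a single server, and (by the defaults) infinite buffer space and an infinite customer population.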

Queueing Discipline

The queueing discipline is the order in which customers are served from the queue or waiting line. There are a number of types of queueing disciplines, such as:

  • First-come, first-served (FCFS)
    • process customers in order of arrival
    • most common queueing discipline
    • e.g. linked list data structure where 0th element is always removed
  • Last-come, first-served (LCFS)
    • process most recent customer to have arrived
    • could lead to starvation
    • e.g. use of a stack via push and pop
  • Round robin (RR)
    • Poll from separate queues in a circular manner
  • Processor sharing (PS)
    • Time shared access to server
    • All customers served simultaneously (but a little at a time)
    • e.g. a CPU on a general purpose machine where the OS schedules tasks
  • Random
  • Priority
    • Multiple classes of customers
    • Some classes have higher or lower priority than others
    • Higher priority customers receive "better" service
    • Lower priority customers receive "worse" service
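Two of these disciplines can be sketched directly with Python's built-in containers (a toy illustration; the customer names are invented here):

```python
from collections import deque

# FCFS: remove from the front of a queue; LCFS: pop from the top of a stack.
arrivals = ["c1", "c2", "c3", "c4"]              # customers in arrival order

fcfs = deque(arrivals)
fcfs_order = [fcfs.popleft() for _ in arrivals]  # served oldest-first

lcfs = list(arrivals)
lcfs_order = [lcfs.pop() for _ in arrivals]      # served newest-first

print(fcfs_order)  # ['c1', 'c2', 'c3', 'c4']
print(lcfs_order)  # ['c4', 'c3', 'c2', 'c1']
```

Note that under LCFS, c1 is served last; with sustained arrivals it might never be served, which is the starvation risk noted above.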

Measuring the Performance of Queueing Systems

The performance of a queueing system can be assessed in a variety of ways, such as:

  • Mean delay
    • time a customer spent in the entire system
    • time a customer spent in the queue only
  • Mean queue length
    • number of customers in the entire system
    • number of customers in the queue only
  • Server utilization
    • fraction of time the server is busy
  • Throughput
    • the amount of data (excluding overhead) moved through the system per unit time

Using these metrics, we can determine useful information about the performance of links or even entire networks. In fact, a relatively small amount of information about network traffic statistics is required to ascertain network performance.

Basic Properties of Networks and Simplifying Assumptions

We can view a network as a "black box" system, which is to say we need not know any of the internal implementation details of the network. We can learn properties of this black box network though by observing characteristics, such as the entry and exit points of the network.

To simplify our models, we will assume a few things:

  1. Any customer that arrives from outside the network may be stored inside the network but must eventually depart the network. 
  2. Customers are neither created nor destroyed in the network.
  3. We will consider stable networks operating in steady state, so that the statistical behavior is the same at any time and measurements such as averages taken over long time intervals yield the corresponding statistical averages.
  4. The arrival rate of customers/packets into the network is equal to the rate of customers/packets leaving the network.

Note that for item (4), if the arrival rate is greater than the rate at which the network can serve customers, the number of customers in the network grows without bound, which results in an unstable system. Conversely, if the arrival rate is on average less than the service capacity, the number of customers stored in the network remains bounded, which results in a stable system in which the departure rate equals the arrival rate.

Little's Law

Intuitive Derivation of Little's Law

Little's Law can be used to derive properties of networks given statistical information about network traffic.

If no customers are in the network at time t = 0, then let A(t) be the number of arrivals in the interval [0, t] and let D(t) be the number of departures in the interval [0, t].

Thus, it can be said that the number of customers in the network at a given time t, N(t), is N(t) = A(t) - D(t).

A typical figure for this derivation plots the cumulative arrival count A(t) and departure count D(t) as staircase functions of time; the number of customers in the network, N(t), is the vertical gap between the two curves at time t.

The measured arrival rate over the interval [0, t] is:

λ_t = A(t) / t

and the total time spent in the system by all customers during this interval [0, t] is:

∫[0,t] (A(x) - D(x)) dx = ∫[0,t] N(x) dx

which is the area between A(x) and D(x) over all x in the interval [0, t].

Thus, the average number of customers in the system during the interval [0, t] is:

N_t = (1/t) ∫[0,t] N(x) dx

Consider all customers in the system at time t (i.e. N(t) customers). Let R_N(t) denote the total time after t that those N(t) customers spend in the system. The average time spent in the system by all customers arriving in the interval [0, t] is:

T_t = ( ∫[0,t] N(x) dx + R_N(t) ) / A(t)

So:

A(t) · T_t = ∫[0,t] N(x) dx + R_N(t)

and:

N_t = (1/t) ∫[0,t] N(x) dx = (A(t)/t) · T_t - R_N(t)/t = λ_t · T_t - R_N(t)/t

For the stable system we are concerned with, the following limits hold:

λ_t → λ and T_t → T as t → ∞

and

R_N(t)/t → 0 as t → ∞

So:

N = lim(t→∞) N_t = λT

For our systems of interest, N is the mean number of customers in the system, λ is the arrival rate (i.e. the mean number of arrivals per second), and T is the mean time that a customer spends in the system. Thus, Little's Law can be defined as:

N = λT

Little's Law applies to any closed boundary containing system elements that do not produce or consume customers. As such, it is valid for many deterministic or probabilistic queueing systems and does not depend on any specific distribution function or service times.
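The sample-path argument above can be reproduced numerically. The sketch below simulates a single FCFS server with memoryless arrivals and service (the rates are arbitrary choices) and compares the two sides of N = λT computed from the same sample path:

```python
import random

# Simulate one FCFS server, then check Little's Law on measured quantities:
# the time-average number in the system should equal the measured arrival
# rate times the mean time each customer spends in the system.
random.seed(1)
lam, mu = 3.0, 5.0               # arrival and service rates (per second)
t = depart_prev = 0.0
times_in_system = []
events = []                      # (time, +1 for arrival, -1 for departure)
for _ in range(50_000):
    t += random.expovariate(lam)             # next arrival time
    start = max(t, depart_prev)              # FCFS: wait for previous departure
    depart_prev = start + random.expovariate(mu)
    times_in_system.append(depart_prev - t)
    events.append((t, +1))
    events.append((depart_prev, -1))
events.sort()
area, n, last = 0.0, 0, 0.0      # area accumulates the integral of N(x) dx
for time, delta in events:
    area += n * (time - last)
    n, last = n + delta, time
N_avg = area / last                          # mean number in system
T_avg = sum(times_in_system) / len(times_in_system)
lam_meas = len(times_in_system) / last       # measured arrival rate
print(round(N_avg, 4), round(lam_meas * T_avg, 4))  # the two sides agree
```

Because every simulated customer both arrives and departs within the observation window, the two sides match to within floating-point error, regardless of the distributions chosen, just as the derivation promises.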

Applications of Little's Law

There are many ways we can apply Little's Law in analyzing networks.

Using Little's Law with a Queue and a Server

Let's first consider its application to the queue and server of a system. N is the expected number of customers in the system, and T is the expected time each customer spends in the system.

Thus, Little's Law can be applied directly as:

N = λT

Using Little's Law with Just a Queue

Next, let's consider its application to the queue only. Nq is the expected number of customers in the queue, and W is the expected time a customer spends waiting in the queue.

Thus, we can apply Little's Law scoped at the queue as:

Nq = λW

Computing the Utilization (i.e. Pr(server is busy))

The average service time is the inverse of the service rate, or 1/μ. Thus:

T = W + 1/μ

So, multiplying all terms by λ, we get:

λT = λW + λ/μ

which implies:

N = Nq + ρ

where:

ρ = λ/μ

is the utilization of the server, i.e. the average number of customers in service.

Note that ρ <= 1, since the arrival rate λ should be less than or equal to the service rate μ in a stable system.
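A numeric sketch of these relations, with assumed rates (λ = 3/s, μ = 5/s) and an assumed mean waiting time W; only the algebra from this section is used:

```python
# Worked example of T = W + 1/mu, Little's Law on the system and on the
# queue, and the identity N = Nq + rho. The rate and wait values are
# assumptions for illustration.
lam, mu = 3.0, 5.0
W = 0.3                    # assumed mean waiting time in the queue (seconds)
T = W + 1.0 / mu           # mean time in system: wait plus mean service time
N = lam * T                # Little's Law on the whole system
Nq = lam * W               # Little's Law scoped to the queue only
rho = lam / mu             # utilization
print(round(N, 6), round(Nq + rho, 6))  # 1.5 1.5
```

The difference N - Nq = ρ is the average number of customers in service, which for a single server is also the probability the server is busy.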

Example: Route Through a Network

This route has a queueing model that can be expressed as three queueing systems in series, with external arrival streams λ1, λ2, and λ3 entering system 1.

What is the average delay through the network?

The arrival rate to system 1 is λ = λ1 + λ2 + λ3 = 3 arrivals/s. For system 1 the rate in equals the rate out, so the arrival rate to system 2 is also λ = 3 arrivals/s. Similarly, the arrival rate to system 3 is λ = 3 arrivals/s.

The time spent in the entire system is T = T1 + T2 + T3. We can solve for each subsystem's time by applying Little's Law, Ti = Ni / λ, to each subsystem in turn.
Thus, T = T1 + T2 + T3 = 0.833 + 0.367 + 0.867 = 2.067 seconds.
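The arithmetic above can be checked quickly (assuming each of the three streams contributes 1 arrival/s, consistent with the stated total, and using per-subsystem delays that sum to the stated end-to-end delay):

```python
# Tandem-route check: three unit arrival streams merge into a 3 arrivals/s
# flow, and the end-to-end delay is the sum of the per-subsystem delays.
lam = 1.0 + 1.0 + 1.0            # lambda1 + lambda2 + lambda3, arrivals/s
T_sub = [0.833, 0.367, 0.867]    # T1, T2, T3 in seconds
T = sum(T_sub)
print(lam, round(T, 3))  # 3.0 2.067
```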

Utilization

Recall that utilization is the average fraction of time the system is busy. For a single server queueing system, the utilization is the average fraction of time the server is busy.

We can characterize the utilization in terms of the arrival rate λ and the service rate μ.

Method 1

The measured utilization during the interval [0, t'] is:

ρ_t' = (total time the server is busy during [0, t']) / t'

If D(t') customers have departed by time t' and each requires an average service time of 1/μ, the total busy time is D(t') · (1/μ), so ρ_t' = D(t') / (μ t').

For a stable system operating in steady state:

ρ = lim(t'→∞) ρ_t' = lim(t'→∞) (D(t')/t') · (1/μ)

Recall that for a stable system the departure rate equals the arrival rate, so lim(t'→∞) D(t')/t' = λ. Thus:

ρ = λ/μ

where:

λ is the arrival rate and μ is the service rate of the server.

Method 2

Consider a system with FCFS service and a single output channel.

We can remove the FCFS constraint, but the analysis becomes more complex. So, let N_i denote the number of customers in the network just after the departure of the ith customer, and let A_{i+1} denote the number of arrivals that occur while customer i+1 is being served.

Case 1

N_i > 0, meaning the network is not empty when the ith customer departs.

The service for the (i+1)st customer begins immediately after the departure of the ith customer, so N_{i+1} = N_i + A_{i+1} - 1.

Case 2

N_i = 0, meaning the network is empty when the ith customer departs.

The network remains empty until the arrival of the (i+1)st customer, so N_{i+1} = N_i + 1 + A_{i+1} - 1 = 0 + 1 + A_{i+1} - 1 = A_{i+1}.

Let's define an indicator function to denote whether or not the network is empty:

U(N_i) = 1 if N_i > 0, and U(N_i) = 0 if N_i = 0

Combining the expressions for cases 1 and 2 into a single expression, we can write N_{i+1} = N_i - U(N_i) + A_{i+1}, where N_{i+1}, N_i, and A_{i+1} are random variables, and therefore U(N_i) is a random variable as well.

This means that E[N_{i+1}] = E[N_i] - E[U(N_i)] + E[A_{i+1}]. Note that E[U(N_i)] = Pr(N_i > 0).

We assume a steady-state network, which means E[N_i] = E[N_{i+1}]. Thus, E[N_i] = E[N_i] - E[U(N_i)] + E[A_{i+1}], so E[U(N_i)] = E[A_{i+1}]. This result indicates that the probability the server is busy immediately after a departure equals the average number of arrivals that occur in one service time.
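This expectation balance can be checked by simulating the embedded recurrence directly; the rates below are arbitrary assumptions, and arrivals during one service are generated as a Poisson process over an exponential service interval:

```python
import random

# Embedded-chain sketch of N_{i+1} = N_i - U(N_i) + A_{i+1}. Arrivals are
# Poisson with rate lam; each service time is exponential with rate mu.
# In steady state, the fraction of departures that leave the system
# non-empty, E[U(N_i)], should match E[A_{i+1}] = lam/mu.
random.seed(2)
lam, mu = 3.0, 5.0
N = 0
busy_after = total_A = 0
steps = 200_000
for _ in range(steps):
    service = random.expovariate(mu)       # service time of customer i+1
    A, t = 0, random.expovariate(lam)      # count Poisson arrivals in [0, service)
    while t < service:
        A += 1
        t += random.expovariate(lam)
    U = 1 if N > 0 else 0                  # U(N_i): was the system non-empty?
    busy_after += U
    total_A += A
    N = N - U + A                          # the combined recurrence
print(round(busy_after / steps, 2), round(total_A / steps, 2))  # both near 0.6
```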

Our goal is to find the utilization (ρ), or the fraction of time the server is busy, or the probability the server is busy at an arbitrary time instant (all meaning the same thing).

When does ρ = E[U(N_i)]?

Deterministic Arrivals

Assume a D/D/1 queueing system (recall D = deterministic) in which one customer arrives per second and the service time is 1/2 second per customer.

N_i = 0 for all i, so E[U(N_i)] = E[0] = 0. But ρ = 1/2. So, E[U(N_i)] != ρ for this D/D/1 queueing system.
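A small script confirms both halves of this example: the system is empty at every departure instant, yet the server is busy half the time:

```python
# D/D/1 sanity check: arrivals every 1 s, deterministic 0.5 s service.
interarrival, service = 1.0, 0.5
n_customers = 1000
depart_prev = 0.0
busy_time = 0.0
N_at_departures = []
for i in range(n_customers):
    arrive = i * interarrival
    start = max(arrive, depart_prev)       # wait if the server is busy
    depart_prev = start + service
    busy_time += service
    # here at most the next customer could be present at this departure,
    # so this 0/1 value is exactly U(N_i)
    N_at_departures.append(1 if (i + 1) * interarrival < depart_prev else 0)
rho = busy_time / (n_customers * interarrival)
print(set(N_at_departures))  # {0}: empty at every departure, so E[U(N_i)] = 0
print(rho)                   # 0.5: yet the server is busy half the time
```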

Poisson Arrivals

However, it can be shown that for Poisson arrivals, E[U(N_i)] = ρ. Thus, for Poisson arrivals, ρ = E[U(N_i)] = E[A_{i+1}].

This means the utilization of the server is equal to the average number of arrivals during a service interval.

We will use this fact to evaluate the performance of queueing systems with Poisson arrivals.