Simulation: event-bounded vs. time-bounded termination
When doing discrete event (network) simulations, which termination criterion do you use? This article concerns OMNeT++/MiXiM but easily generalises to other simulators.
Event-bounded
The module which generates events creates a fixed number of them. For example, it creates 5000 packets and sends these to the MAC, which subsequently transmits them.
The MAC performs medium access (with all related timers and events) and the messages eventually arrive at the receivers. When no new packets are generated for transmission, the whole simulation will eventually terminate: the event queue runs empty. Modules then call their finish() functions to export the logged information.
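To make this concrete, here is a minimal sketch of an event-bounded generator in OMNeT++ 5/6-style C++. The module name, the lowerLayerOut gate and the 50 ms mean inter-arrival time are assumptions for illustration; this is not actual MiXiM code.

#include <omnetpp.h>
using namespace omnetpp;

// Generates a fixed budget of packets; once the budget is exhausted no new
// events enter the future event set, so the simulation drains and terminates.
class PacketGenerator : public cSimpleModule
{
  private:
    long numToGenerate = 5000;     // fixed number of packets (the event bound)
    long numGenerated = 0;
    cMessage *genTimer = nullptr;  // self-message driving the arrival process

  public:
    virtual ~PacketGenerator() { cancelAndDelete(genTimer); }

  protected:
    virtual void initialize() override {
        genTimer = new cMessage("genTimer");
        scheduleAt(simTime() + exponential(0.05), genTimer);  // Poisson arrivals, 50 ms mean
    }
    virtual void handleMessage(cMessage *msg) override {
        // hand one packet to the MAC below
        send(new cMessage("packet"), "lowerLayerOut");
        numGenerated++;
        // reschedule only while the budget lasts; afterwards the event queue
        // simply runs empty and the simulation ends by itself
        if (numGenerated < numToGenerate)
            scheduleAt(simTime() + exponential(0.05), genTimer);
    }
    virtual void finish() override {
        recordScalar("packetsGenerated", numGenerated);
    }
};

Define_Module(PacketGenerator);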
Time-bounded
The module which generates events creates an event and then schedules the next one, over and over. When the simulation time limit has been reached (say 250 seconds), the simulation terminates and all modules call their finish() functions.
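A sketch of the time-bounded counterpart (again with assumed names, not MiXiM code): the generator never stops by itself; instead, the run is cut off by the time limit configured in omnetpp.ini, e.g. sim-time-limit = 250s under [General].

#include <omnetpp.h>
using namespace omnetpp;

// Keeps generating packets indefinitely; termination comes from sim-time-limit.
class TimedPacketGenerator : public cSimpleModule
{
  private:
    long numGenerated = 0;
    cMessage *genTimer = nullptr;

  public:
    virtual ~TimedPacketGenerator() { cancelAndDelete(genTimer); }

  protected:
    virtual void initialize() override {
        genTimer = new cMessage("genTimer");
        scheduleAt(simTime() + exponential(0.05), genTimer);   // 20 Hz Poisson arrivals
    }
    virtual void handleMessage(cMessage *msg) override {
        send(new cMessage("packet"), "lowerLayerOut");
        numGenerated++;
        // always schedule the next arrival; the time limit ends the run
        scheduleAt(simTime() + exponential(0.05), genTimer);
    }
    virtual void finish() override {
        recordScalar("packetsGenerated", numGenerated);
    }
};

Define_Module(TimedPacketGenerator);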
Why use one or the other?
An event-bounded simulation has a very nice property: you know exactly how many events are generated. For instance, you can simply generate 5000 packets, then log how many arrive at their destination (say 4324) and there’s your success probability: 4324/5000. This holds irrespective of the arrival process you use to generate these events: it could be deterministic (one per 10 ms) or drawn from random distributions such as the exponential or uniform distribution. Batch arrivals are no problem either. The system only terminates when all work is done.
A time-bounded simulation (at least for networks) does not have this property, and I’ll explain why. Say we have Poisson arrivals at rate 20 Hz and simulate for 250 seconds. The expected number of events in those 250 seconds is 5000. However, you could also observe 4890 or 5110 events. This is due to the randomness of, well, the random distributions. Okay, so you just keep track of the number of generated events (say 4890) and the number of receptions (say 4324). Then, is your success probability 4324/4890? The answer, quite frankly, is NO. The module which generates packets puts them in a transmission buffer (or queue). Depending on your implementation, there may be several queues (such as at the network and MAC layer). Now, when the simulation time is up, the simulation terminates: packets which find themselves in buffers at that moment are discarded. They have been counted as generated but have not been counted as received (or lost, for that matter). Time-boundedness introduces an artificial form of packet loss!
However, we need not fear. If modules output the state of their buffers in their finish() methods, we can correct these values. Say 4890 packets are generated, and 5 are still in the queue upon termination. Then only 4885 packets are truly part of the experiment; only they were subjected to the system under investigation. So our success probability can be found as 4324 / (4890 − 5) ≈ 0.885.
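As a rough sketch of what that looks like (assuming a hypothetical queueing module with a cQueue member called txQueue; again not actual MiXiM code), the finish() method only needs to export the buffer occupancy so the correction can be done during post-processing:

#include <omnetpp.h>
using namespace omnetpp;

// Hypothetical queueing module that reports its leftover buffer content.
class ReportingQueue : public cSimpleModule
{
  private:
    cQueue txQueue;   // packets waiting for the MAC

  protected:
    virtual void handleMessage(cMessage *msg) override {
        // ... normal enqueue/dequeue logic would go here ...
        txQueue.insert(msg);
    }
    virtual void finish() override {
        // Packets still buffered were counted as generated but never entered
        // the experiment; subtract them when computing the success probability:
        //   received / (generated - leftInQueue), e.g. 4324 / (4890 - 5)
        recordScalar("packetsLeftInQueue", txQueue.getLength());
    }
};

Define_Module(ReportingQueue);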
Now, why would you use time-bounded simulations, if they require more elaborate processing in order to arrive at correct results? I will explain with an example.
Event-bounded example
Two wireless nodes (IEEE 802.11 MAC) share a common channel. They influence each other (backoff freezing, carrier sensing, etc.). If each application generates 5000 packets by means of a Poisson process, it could easily be that node A generates its events in less time than node B (it just randomly draws shorter inter-generation times). The period in which they influence each other may then end before the simulation does: A is done before B, and you are looking at results from (A and B) AND (only B). That is not what we want if we are interested in how A and B influence each other.
Time-bounded example
Now, the same scenario but with a time-bounded setup. A and B generate packets during the entire simulation time, say 250 s. A generates 5110 packets and B 4890. A is just lucky to draw shorter inter-generation times. When the simulation terminates, we are looking at results from (A and B); they influenced each other during the entire experiment. That’s good.
Doesn’t the method of multiple independent replications solve this?
Brief answer: no. By design, the experiment in the event-bounded case allows for a period in which nodes do not influence each other (and this grows more severe as the number of nodes increases). Statistically it is possible that some replications have a “large portion” of influence (i.e. the effective number of nodes in the system remains constant during most of the simulation run), but the average will still contain data from regimes where the effective number of nodes is lower.
The effect? For a metric such as success probability, this results in over-estimation: we think we are researching a system with 100 nodes, but the results are generated in a system which starts with 100 nodes, then gradually loses active nodes as they reach their configured number of events, until finally the last node generates its packets in isolation. This last node achieves VERY good performance compared to the regime with 100 active nodes.
Conclusion
Should you terminate your simulations based on time instead of events? In a system where events influence each other (such as through the carrier sensing of the MAC modules), this is a good plan. If you want to research a scenario of 100 cooperating nodes, you should actually simulate that scenario.