Rapid technological developments have increased computational speed and power through the exploitation of parallelism. However, new developments for parallel architectures are often difficult to test, because the target architecture may not be available or may be too expensive for the researcher to obtain. A common solution is to use a simulator that captures the behavior of the architectural system under development. In this chapter, we present the design of an execution-driven multiprocessor simulator and compare the execution-driven approach with the trace-driven approach. We observe that the trace-driven approach is faster because it does not actually execute the threads. However, it must store the traces and therefore requires more space than the execution-driven approach. Because the execution-driven simulator actually executes the input program, the output of the simulated parallel execution can be compared with the output of the original sequential program; this comparison serves as a functional verification of the threads produced by our parallelizing compiler.
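
As an illustration only, the contrast between the two approaches can be sketched with a toy model in Python. The workload (a sum of squares split across threads), the cost table `COST`, and all function names are hypothetical assumptions made for this sketch; none of them is taken from the simulator described in this chapter. The execution-driven function computes real values and can therefore be checked against the sequential result, while the trace-driven function only replays stored events to accumulate timing.

```python
from typing import List, Tuple

# Hypothetical cost model (cycles per memory event); an assumption for this sketch.
COST = {"load": 2, "store": 3}

# One trace record: (thread id, "load" or "store", address).
Event = Tuple[int, str, int]


def sequential_sum_of_squares(data: List[int]) -> int:
    """Reference output of the original sequential program."""
    total = 0
    for x in data:
        total += x * x
    return total


def execution_driven(data: List[int], n_threads: int) -> Tuple[int, int, List[Event]]:
    """Execute every thread's work on real values.

    Returns (program output, simulated cycles, recorded trace).
    """
    partial = [0] * n_threads
    cycles = [0] * n_threads
    trace: List[Event] = []
    # Thread t owns elements t, t + n_threads, t + 2 * n_threads, ...
    for i, x in enumerate(data):
        t = i % n_threads
        trace.append((t, "load", i))
        cycles[t] += COST["load"]
        partial[t] += x * x  # the computation is actually performed
    # Each thread writes its partial result to its own slot.
    for t in range(n_threads):
        trace.append((t, "store", len(data) + t))
        cycles[t] += COST["store"]
    # Threads run in parallel, so simulated time is that of the slowest thread.
    return sum(partial), max(cycles), trace


def trace_driven(trace: List[Event], n_threads: int) -> int:
    """Replay a stored trace: timing only, no program values are computed."""
    cycles = [0] * n_threads
    for t, kind, _addr in trace:
        cycles[t] += COST[kind]
    return max(cycles)


data = list(range(1, 9))
output, exec_cycles, trace = execution_driven(data, n_threads=4)
# Functional verification: parallel output must match the sequential output.
assert output == sequential_sum_of_squares(data)
# The replay reproduces the timing, but only because the trace was stored.
assert trace_driven(trace, 4) == exec_cycles
```

In this toy model the trace grows by one record per memory event, which mirrors the space cost noted above, and the trace-driven replay produces no program output, so it cannot be used for the functional check.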