--- title: "Introduction to StochSimR" author: "Ayush Kundu" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to StochSimR} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4, dpi = 100 ) set.seed(42) ``` ## Overview **StochSimR** is a modular simulation engine for stochastic processes. It provides a unified interface for simulating, analysing, and visualising paths from a wide range of processes used in finance, physics, queueing theory, and applied probability. Every simulator returns a `stoch_path` object that plugs directly into the plotting and analysis functions. ```{r} library(StochSimR) ``` ## Classic Processes ### Brownian Motion Standard Brownian motion (Wiener process) is the building block for most continuous-time models. The `sim_brownian()` function supports both exact simulation and Brownian bridge interpolation. ```{r brownian} bm <- sim_brownian(T_max = 1, n_steps = 1000, mu = 0, sigma = 1, n_paths = 100) bm plot_paths(bm, max_paths = 50, show_mean = TRUE, show_bands = TRUE) ``` The red line is the sample mean across 100 paths; the shaded band shows the pointwise 2.5%--97.5% quantile envelope. ### Poisson Process The counting process for random arrivals. Supports both homogeneous (constant rate) and inhomogeneous (time-varying rate) specifications. ```{r poisson} # Homogeneous pp <- sim_poisson(T_max = 10, lambda = 3, n_paths = 20) plot_paths(pp, show_bands = FALSE) ``` ```{r poisson-inhom} # Inhomogeneous with sinusoidal rate pp_ih <- sim_poisson( T_max = 10, lambda = function(t) 3 + 2 * sin(2 * pi * t / 5), lambda_max = 5, n_paths = 20 ) plot_paths(pp_ih, show_bands = FALSE, title = "Inhomogeneous Poisson (sinusoidal rate)") ``` ### Markov Chain Discrete-time Markov chains are simulated exactly by sampling from the transition matrix row at each step. ```{r markov} P <- matrix(c(0.7, 0.2, 0.1, 0.3, 0.4, 0.3, 0.2, 0.3, 0.5), nrow = 3, byrow = TRUE) mc <- sim_markov(P, n_steps = 300, x0 = 1, n_paths = 10) plot_paths(mc, show_bands = FALSE, show_mean = FALSE) ``` Verify convergence to the stationary distribution: ```{r markov-stationary} long_mc <- sim_markov(P, n_steps = 50000, x0 = 1, n_paths = 1) freq <- table(factor(long_mc$values, levels = 1:3)) / length(long_mc$values) freq ``` ## Advanced Processes ### Geometric Brownian Motion (GBM) The standard model for stock prices. Two methods are available: `"exact"` uses the known log-normal solution; `"euler"` uses Euler--Maruyama discretisation. ```{r gbm} stock <- sim_gbm(T_max = 1, n_steps = 252, mu = 0.08, sigma = 0.25, x0 = 100, n_paths = 200) plot_paths(stock, max_paths = 80) plot_distribution(stock) ``` ### Ornstein--Uhlenbeck Process A mean-reverting diffusion. With `method = "exact"`, the known Gaussian transition density is used -- no discretisation error. ```{r ou} ou <- sim_ou(T_max = 10, n_steps = 1000, theta = 2, mu = 5, sigma = 0.5, x0 = 0, n_paths = 50) plot_paths(ou, show_mean = TRUE, show_bands = TRUE) ``` Note how all paths converge toward the long-run mean of 5. ### Jump-Diffusion (Merton Model) Combines GBM with compound Poisson jumps in log-price. ```{r jump-diffusion} jd <- sim_jump_diffusion( T_max = 1, n_steps = 1000, mu = 0.05, sigma = 0.2, lambda = 3, mu_j = -0.02, sigma_j = 0.1, x0 = 100, n_paths = 50 ) plot_paths(jd, max_paths = 30) ``` ### Hawkes (Self-Exciting) Process A point process where each event temporarily increases the rate of future events. The branching ratio `alpha/beta` controls the degree of clustering. ```{r hawkes} hk <- sim_hawkes(T_max = 50, mu = 0.5, alpha = 0.8, beta = 1.2, n_paths = 15) plot_paths(hk, show_bands = FALSE, show_mean = TRUE) ``` ### Levy Processes Four families are available: ```{r levy-gamma} # Gamma subordinator (non-decreasing) gam <- sim_levy(T_max = 5, n_steps = 500, type = "gamma", shape = 2, rate = 1, n_paths = 20) plot_paths(gam, show_bands = FALSE) ``` ```{r levy-vg} # Variance-Gamma vg <- sim_levy(T_max = 1, n_steps = 500, type = "variance_gamma", theta = -0.1, sigma = 0.3, nu = 0.5, n_paths = 30) plot_paths(vg, max_paths = 20) ``` ## Summary Statistics The `path_summary()` function computes terminal value distribution, extremes, increment moments, skewness, and excess kurtosis: ```{r summary} path_summary(stock) ``` The `plot_acf_paths()` function shows the autocorrelation of path increments, useful for verifying independence (Brownian) or detecting dependence (OU): ```{r acf} plot_acf_paths(ou, path_idx = 1) ``` ## Variance Reduction StochSimR includes four variance reduction techniques. Each returns the estimate, standard error, and the variance reduction factor versus crude Monte Carlo. ### Control Variates Use the terminal stock price (whose expectation is known analytically) as a control for pricing a European call: ```{r cv} h_call <- function(vals) pmax(vals[nrow(vals), ] - 100, 0) g_term <- function(vals) vals[nrow(vals), ] E_g <- 100 * exp(0.05) cv <- vr_control_variate(sim_gbm, h = h_call, g = g_term, E_g = E_g, n_paths = 5000, T_max = 1, n_steps = 252, mu = 0.05, sigma = 0.2, x0 = 100) cat("Estimate:", round(cv$estimate, 4), "\n") cat("SE: ", round(cv$se, 4), "\n") cat("Reduction:", round(cv$reduction_factor, 1), "x\n") cat("Correlation(h, g):", round(cv$correlation, 3), "\n") ``` ### Importance Sampling Tilt the drift upward to estimate a tail probability more efficiently: ```{r is} h_tail <- function(vals) as.numeric(vals[nrow(vals), ] > 150) is_result <- vr_importance(sim_gbm, h = h_tail, drift_shift = 0.4, n_paths = 10000, T_max = 1, n_steps = 252, mu = 0.05, sigma = 0.2, x0 = 100) cat("P(S_T > 150):", round(is_result$estimate, 6), "\n") cat("ESS: ", round(is_result$ess), "/", is_result$n_samples, "\n") cat("Reduction: ", round(is_result$reduction_factor, 1), "x\n") ``` ## Monte Carlo Estimation The `mc_estimate()` function provides a general-purpose estimator with batch-means standard errors and a convergence trace: ```{r mc} h_pos <- function(vals) pmax(vals[nrow(vals), ], 0) mc <- mc_estimate(sim_brownian, h = h_pos, n_paths = 10000, batch_size = 500, T_max = 1, n_steps = 200) cat("E[max(B_1, 0)] =", round(mc$estimate, 4), "+/-", round(mc$se, 4), "\n") cat("Exact answer: ", round(1 / sqrt(2 * pi), 4), "\n") ``` ## Method Comparison Compare simulation methods on the same process to measure bias and speed: ```{r compare} comp <- compare_methods(sim_ou, methods = c("exact", "euler"), n_paths = 500, n_reps = 20, T_max = 5, n_steps = 500, theta = 2, mu = 1, sigma = 0.5, x0 = 0) print(comp) ``` The exact method uses the known Gaussian transition density and has zero discretisation bias; the Euler method is faster per step but introduces $O(\Delta t)$ weak error. ## Session Info ```{r session} sessionInfo() ```