Mitchell, Lewis; Roughan, Matthew; Kalenkova, Anna; Smart, Bridget Anna
2023-07-24; 2023-07-24; 2023
https://hdl.handle.net/2440/138944

In many real-world systems, numerous interacting components produce intricate structures and complex emergent dynamics. Modelling this complexity can be challenging, but doing so can produce meaningful insights into both individual and collective behaviours, revealing non-trivial patterns from simple pairwise interactions. This thesis focuses on models that describe how the information passed around these complex systems influences how components interact and behave. To understand the dynamics of complex systems, models that estimate features of pairwise relationships and information flows are critical. Often, existing techniques have not been tested in realistic data settings, or the statistical significance of their results is unknown. These limitations result in a number of domain-specific techniques that are inaccessible outside their originating field, exposing a need for robust, general frameworks to evaluate their performance. In Chapter 2, we begin by considering measures of association between point processes, stochastic objects often used to represent sets of event times, for example in social media, cognitive neuroscience or financial applications. We survey existing techniques across a broad range of fields and present them in four classes: heuristic, information-theoretic, stochastic and machine-learning-based. A major challenge for this body of research is the lack of a universal notation, which makes many approaches inaccessible. To overcome this, we present a discussion on types of association, a general notation, and a conceptual framework. The benefits and drawbacks of each class are discussed, highlighting potential applications and practical considerations.
While this survey increases the accessibility of techniques across fields, empirical evaluation remains a useful validation tool to ensure approaches are robust. In Chapter 3, a simulation framework is described which allows complex, realistic event time data to be generated for a set of point processes. This framework is used to construct a tool for evaluating the relevance of techniques that measure association in time series. While event time data is readily available, it is not unusual to have access to richer data associated with each event. In the case of social media data, posts can be taken as events, with the textual information used to measure information flows and influence between accounts online. In existing work, non-parametric information-theoretic estimators have been applied to measure information flows between individual accounts online. Here, we extend these estimators to calculate information flows between groups of accounts, and propose an empirical significance test to evaluate which of these flows are meaningful in context. In Chapter 4, these contributions are applied to a Twitter dataset concerning discussion of the 2022 Russian invasion of Ukraine, accompanied by an exploration of the potential impacts of these information flows. While exploring simulation-based validation for sequential entropy estimators, we discovered limitations on the convergence rate of the non-parametric entropy estimator. To overcome these limitations, work was done to calculate the theoretical entropy rate of a Linear Additive Markov Process (LAMP) model, a model used to generate sequential data with specified long-range dependency structures. As described in Chapter 5, the aim of this work is to enable the behaviour of non-parametric estimators to be explored on more realistic sequences with dependency structures, opening up an avenue for future work.
These results build upon and improve the robustness of existing techniques to measure information flows in real-world networks, ultimately improving our ability to model complex systems appropriately.

en
Computational social science; temporal relationships; entropy; networks; long-range dependency
Measuring and modelling information flows in real-world networks
Thesis