Term 2: Optional modules

Term 2 allows for specialism in either advanced technical or application areas. The application areas offered are those in which there is considerable demand for data scientists, and subject-specific experts from across the University deliver each pathway. We currently provide pathways in business intelligence, health, the environment, society and computing.

The available modules build upon the core set and have a prerequisite level of skills and knowledge.

Business intelligence pathway

  • Forecasting

    Every managerial decision concerned with future actions is based upon a prediction of some aspect of the future. Forecasting therefore plays an important role in enhancing managerial decision making.

    After introducing the topic of forecasting in organisations, time series patterns and simple forecasting methods (naïve and moving averages) are explored. Then, the extrapolative forecasting methods of exponential smoothing and ARIMA models are considered. A detailed treatment of causal modelling follows, with a full evaluation of the estimated models. Forecasting applications in operations and marketing are then discussed. The module ends with an examination of judgmental forecasting and how forecasting can best be improved in an organisational context. Assessment is through a report aimed at extending and evaluating student learning in causal modelling and time series analysis.

  • Optimisation and Heuristics

    Optimisation, sometimes called mathematical programming, has applications in many fields, including operational research, computer science, statistics, finance, engineering and the physical sciences. Commercial optimisation software is now capable of solving many industrial-scale problems to proven optimality.

    The module is designed to enable students to apply optimisation techniques to business problems. Building on the introduction to optimisation in the first term, students will be introduced to different problem formulations and algorithmic methods to guide decision making in business and other organisations.

  • Introduction to Intelligent Data Analysis (Data Mining)

    This module develops modelling skills on synthetic and empirical data by showing simple statistical methods and introducing novel methods from artificial intelligence and machine learning.

    The module will cover a wide range of data mining methods, from simple algorithms such as decision trees to state-of-the-art algorithms such as artificial neural networks, support vector regression and k-nearest neighbour methods. We will consider both data mining methods for descriptive modelling, exploration and data reduction, which aim to simplify and add insight to large, complex data sets, and data mining methods for predictive modelling, which aim to classify and cluster individuals into distinct, disjoint segments with different patterns of behaviour.

    The module will also include a series of workshops in which you will learn how to use the SAS Enterprise Miner software for data mining (a software skill much sought after in the job market) and how to apply it to real datasets in a real-world scenario.
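
As a rough illustration of the simple and extrapolative methods named in the Forecasting module description above (naïve, moving average and exponential smoothing), the following is a minimal Python sketch. The function names, the sample demand series and the smoothing parameter are illustrative assumptions and are not taken from the module materials.

```python
def naive_forecast(series):
    """Naive method: the next value is forecast as the last observed value."""
    return series[-1]


def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


def exponential_smoothing_forecast(series, alpha):
    """Simple exponential smoothing.

    The level is initialised to the first observation (one common choice,
    assumed here) and updated as: level = alpha * y + (1 - alpha) * level.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


# Hypothetical demand history, for illustration only.
demand = [10.0, 12.0, 11.0, 13.0]

print(naive_forecast(demand))                       # 13.0
print(moving_average_forecast(demand, window=2))    # 12.0
print(exponential_smoothing_forecast(demand, 0.5))  # 12.0
```

A larger `alpha` makes the exponential smoothing forecast react faster to recent observations, while a longer `window` makes the moving average smoother.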

Health pathway

  • Principles of Epidemiology

    Introducing epidemiology, the study of the distribution and determinants of disease in human populations, this module presents its main principles and statistical methods. The module addresses the fundamental measures of disease, such as incidence, prevalence, risk and rates, including indices of morbidity and mortality.

    Students will also develop an awareness of epidemiological study designs, such as ecological studies, surveys, and cohort and case-control studies, in addition to diagnostic test studies. Epidemiological concepts will be addressed, such as bias and confounding, matching and stratification, and the module will also address calculation of rates, standardisation and adjustment, as well as issues in screening.

    This module provides students with a historical and general overview of epidemiology and related strategies for study design, and should enable students to apply appropriate methods of analysis for rates and risk of disease. Students will develop skills in critical appraisal of the literature and, on completing this module, will have developed an appreciation of epidemiology and an ability to describe the key statistical issues in the design of ecological studies, surveys, case-control studies, cohort studies and randomised controlled trials (RCTs), whilst recognising their advantages and disadvantages.

  • Longitudinal Data Analysis

    This module presents an approach to the analysis of longitudinal data, based on statistical modelling and likelihood methods of parameter estimation and hypothesis testing. Among other topics, students will learn about exploratory and simple analysis strategies, the independence working assumption, the normal linear model with correlated errors and generalised estimating equations.

    Students will develop an understanding of how to deal with the correlated data commonly arising in longitudinal studies, as well as an awareness of issues associated with collecting and analysing longitudinal data, whilst gaining a deeper knowledge of the different modelling assumptions used in the analysis and their relation to the scientific aims of the study.

    On module completion, students will gain the ability to explain the difference between longitudinal studies and cross-sectional studies, in addition to the knowledge required to select appropriate techniques to explore data, and the ability to compare different approaches to estimation and their usage in the analysis. Finally, students will obtain the skill level required to build statistical models for longitudinal data and draw valid conclusions from their models.

  • Survival and Event History Analysis

    This module addresses a range of topics relating to survival data: censoring, hazard functions, Kaplan-Meier plots, parametric models and likelihood construction will be discussed in detail. Students will engage with the Cox proportional hazards model, partial likelihood, Nelson-Aalen estimation and survival time prediction and will also focus on counting processes, diagnostic methods, and frailty models and effects.

    The module provides an understanding of the unique features and statistical challenges surrounding the analysis of survival and event history data, in addition to an understanding of how non-parametric methods can aid in the identification of modelling strategies for time-to-event data, and recognition of the range and scope of survival techniques that can be implemented within standard statistical software.

    General skills will be developed, including the ability to express scientific problems in a mathematical language, improvement of scientific writing skills, and an enhanced range of computing skills related to the manipulation and analysis of data.

    On successful completion of this module, students will be able to apply a range of appropriate statistical techniques to survival and event history data using statistical software, accurately interpret the output of survival models fitted using standard software, and construct and manipulate likelihood functions from parametric models for censored data. Students will also gain observational skills, such as the ability to identify when particular models are appropriate through the application of diagnostic checks and model-building strategies.

  • Bioinformatics

    This course will equip students with a working knowledge of the main themes in bioinformatics. On successful completion, students should be confident and competent in all aspects of bioinformatics that can be executed via the web or on software running on Windows/Mac systems. They will have an understanding of the theoretical algorithms that underpin the various software applications that they use, and be able to perform bioinformatics within their own biological sub-field. More generally, this module also aims to encourage students to access and evaluate information from a variety of sources and to communicate the principles in a way that is well-organised, topical and recognises the limits of current hypotheses. It also aims to equip students with practical techniques including data collection, analysis and interpretation.

  • Modelling of Infectious Diseases

    This module aims to provide students with the necessary knowledge, and analytical and modelling skills to develop and fit mathematical transmission models to understand infection dynamics, explore interventions, and to inform control policy. It will also provide students with the ability to analyse outbreak information, and to implement transmission models using the R programming language. Students will gain experience of handling and linking epidemiological data relevant to infectious disease outbreaks. They will gain hands-on experience of developing transmission models, appropriate to a specific research question or epidemiological application, and of using those models for scenario exploration. Students will also gain experience in communicating and presenting epidemic models and their outputs.

  • Clinical Trials

    Clinical trials are planned experiments on human beings designed to assess the relative benefits of one or more forms of treatment. For instance, we might be interested in studying whether aspirin reduces the incidence of pregnancy-induced hypertension, or we may wish to assess whether a new immunosuppressive drug improves the survival rate of transplant recipients.

    This module combines the study of technical methodology with discussion of more general research issues, beginning with a discussion of the relative advantages and disadvantages of different types of medical studies. The module covers the definition and estimation of treatment effects, as well as cross-over trials, issues of sample size determination, and equivalence trials. There is an introduction to flexible trial designs that allow sample size re-estimation during the ongoing trial. Finally, other relevant topics such as meta-analysis and accommodating confounding at the design stage are briefly discussed.

    Students will gain knowledge of the basic elements of clinical trials. They will develop the ability to recognise and use principles of good study design, and will also be able to analyse and interpret study results to make correct scientific inferences.
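
As a rough illustration of the Kaplan-Meier estimator mentioned under Survival and Event History Analysis above, the following is a minimal pure-Python sketch of the product-limit calculation with right-censored data. The function name, the input format and the sample data are illustrative assumptions; in practice the module refers to standard statistical software, which this sketch does not replace.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) estimate of the survival function.

    times  -- observed follow-up time for each subject
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    at_risk = len(data)   # subjects still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Subjects censored at t are still counted as at risk at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up data, for illustration only.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]

for t, s in kaplan_meier(times, events):
    print(f"t = {t}: estimated S(t) = {s:.2f}")
```

Censored subjects contribute to the number at risk up to their censoring time but never trigger a drop in the curve, which is what distinguishes this estimate from a simple proportion of survivors.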

Environmental pathway

  • Geoinformatics

    This module introduces students to the fundamental principles of Geographical Information Systems (GIS) and Remote Sensing (RS) and shows how these complementary technologies may be used to capture/derive, manipulate, integrate, analyse and display different forms of spatially-referenced environmental data. The module is highly vocational with theory-based lectures complemented by hands-on practical sessions using state-of-the-art software (ArcGIS & ERDAS Imagine).

    In addition to the subject-specific aims, the module provides students with a range of generic skills to synthesise geographical data, develop suitable approaches to problem-solving, undertake independent learning (including time management) and present the results of the analysis in novel graphical formats.

  • Modelling Environmental Processes

    This module provides an introduction to the basic principles of, and approaches to, computer-aided modelling of environmental processes, with applications to real environmental problems such as catchment modelling, pollutant dispersal in rivers and estuaries, and population dynamics. More generally, the module provides an introduction to general aspects of dynamic systems modelling, including the role of uncertainty and data in the modelling process.

  • Extreme Value Theory

    This module aims to develop the asymptotic theory, and the associated techniques for modelling and inference, used in the analysis of extreme values of random processes. The module focuses on the mathematical basis of the models, the statistical principles for implementation and the computational aspects of data modelling.

    Students will develop an appreciation of, and facility in, the various asymptotic arguments and models, and will also gain the ability to fit appropriate models to data using specially developed R software, in addition to a working understanding of fitted models. Proficiency in R is an essential skill that is transferable to a wide range of modules on the mathematics programme, and beyond.

Societal pathway

  • Methods for Missing Data

    This module offers students an advanced understanding of statistics, and will explore the idea of missingness as a stochastic process. Students will develop and apply their knowledge in missing data formulas, focusing on the imputation model and the model of interest. More naive methods will be introduced, such as single imputation and listwise deletion, and students will develop the ability to recognise the limitations of each method and gain the knowledge required to identify situations where their use may be appropriate.

    A portion of the module will introduce VIM software and explore its uses for finding missingness patterns.

    This module will enhance deduction skills, and students will become accustomed to the differences between sampling and parameter uncertainty, in addition to noticing similarities between the Bayesian and imputation approaches.

  • Principles of Epidemiology

    Introducing epidemiology, the study of the distribution and determinants of disease in human populations, this module presents its main principles and statistical methods. The module addresses the fundamental measures of disease, such as incidence, prevalence, risk and rates, including indices of morbidity and mortality.

    Students will also develop an awareness of epidemiological study designs, such as ecological studies, surveys, and cohort and case-control studies, in addition to diagnostic test studies. Epidemiological concepts will be addressed, such as bias and confounding, matching and stratification, and the module will also address calculation of rates, standardisation and adjustment, as well as issues in screening.

    This module provides students with a historical and general overview of epidemiology and related strategies for study design, and should enable students to apply appropriate methods of analysis for rates and risk of disease. Students will develop skills in critical appraisal of the literature and, on completing this module, will have developed an appreciation of epidemiology and an ability to describe the key statistical issues in the design of ecological studies, surveys, case-control studies, cohort studies and randomised controlled trials (RCTs), whilst recognising their advantages and disadvantages.

  • Longitudinal Data Analysis

    This module presents an approach to the analysis of longitudinal data, based on statistical modelling and likelihood methods of parameter estimation and hypothesis testing. Among other topics, students will learn about exploratory and simple analysis strategies, the independence working assumption, the normal linear model with correlated errors and generalised estimating equations.

    Students will develop an understanding of how to deal with the correlated data commonly arising in longitudinal studies, as well as an awareness of issues associated with collecting and analysing longitudinal data, whilst gaining a deeper knowledge of the different modelling assumptions used in the analysis and their relation to the scientific aims of the study.

    On module completion, students will gain the ability to explain the difference between longitudinal studies and cross-sectional studies, in addition to the knowledge required to select appropriate techniques to explore data, and the ability to compare different approaches to estimation and their usage in the analysis. Finally, students will obtain the skill level required to build statistical models for longitudinal data and draw valid conclusions from their models.

  • Introduction to Intelligent Data Analysis (Data Mining)

    This module develops modelling skills on synthetic and empirical data by showing simple statistical methods and introducing novel methods from artificial intelligence and machine learning.

    The module will cover a wide range of data mining methods, from simple algorithms such as decision trees to state-of-the-art algorithms such as artificial neural networks, support vector regression and k-nearest neighbour methods. We will consider both data mining methods for descriptive modelling, exploration and data reduction, which aim to simplify and add insight to large, complex data sets, and data mining methods for predictive modelling, which aim to classify and cluster individuals into distinct, disjoint segments with different patterns of behaviour.

    The module will also include a series of workshops in which you will learn how to use the SAS Enterprise Miner software for data mining (a software skill much sought after in the job market) and how to apply it to real datasets in a real-world scenario.

Computing pathway

  • Applied Data Mining

    This module provides students with up-to-date information on current applications of data in both industry and research. Expanding on the module ‘Fundamentals of Data’, students will gain a more detailed level of understanding about how data is processed and applied on a large scale across a variety of different areas.

    Students will develop knowledge in different areas of science and will recognise their relation to big data, in addition to understanding how large-scale challenges are being addressed with current state-of-the-art techniques. The module will cover recommendation on the Social Web and its roots in social network theory and analysis, in addition to its adaptation and extension to large-scale problems. Topics include a primer, user-generated content and crowd-sourced data, social networks (theories, analysis), and recommendation (collaborative filtering, content recommendation challenges, and friend recommendation/link prediction).

    On completion of this module, students will be able to create scalable solutions to problems involving data from the semantic, social and scientific web, and to process networks and perform network analysis in order to identify key factors in information flow.

  • Building Big Data Systems

    In this module we explore the architectural approaches, techniques and technologies that underpin today's Big Data system infrastructure and particularly large-scale enterprise systems. It is one of two complementary modules that comprise the Systems stream of the Computer Science MSc, which together provide a broad knowledge and context of systems architecture enabling students to assess new systems technologies, to know where technologies fit in the larger scheme of enterprise systems and state of the art research thinking, and to know what to read to go deeper.

    The principal ethos of the module is to focus on the principles of Big Data systems, and to apply those principles using state-of-the-art technology to engineer and lead data science projects. Detailed case studies and invited industrial speakers will be used to provide supporting real-world context and a basis for interactive seminar discussions.

  • Distributed Artificial Intelligence

    Distributed artificial intelligence is fundamental in contemporary data analysis. Large volumes of data and computation call for multiple computers in problem solving. Being able to understand and use those resources efficiently is an important skill for a data scientist. A distributed approach is also important for fault tolerance and robustness, as the loss of a single component must not significantly compromise the whole system. Additionally, contemporary and future distributed systems go beyond computer clusters and networks. Distributed systems often comprise multiple agents (software, humans and/or robots) that all interact in problem solving. As data scientists, we may have control of the full distributed system, or we may have control of only one piece, and we have to decide how it must behave in the face of others in order to accomplish our goals.