Term 1: Core modules
Term 1 provides training in core data science knowledge and skills and is divided into three compulsory modules. The modules are delivered by the School of Mathematical Sciences and the School of Computing and Communications.
Choose a specialist pathway that aligns with your interests and career goals. Each pathway offers access to the cutting-edge tools and techniques of data science, allowing you to graduate with the skills to build a career in this area.
You will select 3 of the optional modules from your chosen pathway.
Strong computational skills are essential for data scientists. With the application of AI within commercial settings, this has become increasingly important. Enhance your technical skills (through advanced Python programming and the design of Big Data Systems) and develop a wider understanding of AI (exploring topics including Natural Language Processing, Reinforcement Learning, and the design of distributed algorithms). The skills and knowledge gained on this pathway provide an excellent technical grounding for a career in any sector.
The main goal of this module is to explore the development and optimisation of intelligent, autonomous agents capable of outperforming human capabilities in various tasks. You'll learn the core concepts of intelligent agents, from fundamental AI paradigms like rule-based systems, planning, and learning, to advanced decision-making algorithms. The module emphasises both classical and modern AI techniques, showing how traditional ideas continue to inspire powerful innovations. Through practical exercises, you'll design, implement, and validate AI algorithms, enhancing your skills in problem-solving, critical thinking, and translating complex algorithms into functional code.
The main goal of this module is to equip students with the expertise to design and implement robust technology platforms essential for effective AI and Data Science systems. You’ll explore a range of technologies like Hadoop, Spark, and PyTorch Distributed, learning how to select, configure, and optimise them for large-scale, high-performance computing. The module focuses on principles of system architecture, distributed machine learning, and scalability, with real-world case studies and industry insights. By the end, you'll be able to architect and engineer data-driven systems, critically evaluate enterprise-scale IT solutions, and implement distributed machine learning models effectively.
This module aims to equip students with a deep understanding of the rapidly evolving field of Artificial Intelligence (AI), focusing on both cutting-edge methods and applications. You'll explore AI's role in areas like cybersecurity, ethical considerations, human-AI interaction, and emerging technologies like Quantum AI. The module prepares you to apply AI to real-world challenges or pursue research in innovative AI techniques, ensuring alignment with current industry and academic trends. Additionally, you'll develop skills in implementing AI solutions, making ethical decisions, and effectively communicating your findings in professional settings.
The main goal of this module is to provide students with cutting-edge knowledge in natural language processing (NLP) as applied in both industry and research. You'll learn how to collect, clean, and analyse language data at scale, using methods ranging from rule-based to deep learning techniques. The module covers key applications like machine translation, sentiment analysis, and summarisation, alongside discussions on language models, ethics, and bias in NLP. By the end, you'll be able to create scalable solutions for language data challenges, understand current NLP research trends, and enhance your skills in independent study, critical thinking, and effective communication.
Study how data science tools enable organisations to derive insight from their data and to make better decisions. You will develop advanced technical skills in data handling and machine learning methodologies (such as Time Series Forecasting, Supervised Learning, and Optimisation), and the capacity to communicate your findings in a business setting. Develop a range of technical and professional skills that are highly sought after in commercial data scientists.
Optimisation, sometimes called mathematical programming, has applications in many fields, including operational research, computer science, statistics, finance, engineering and the physical sciences. Commercial optimisation software is now capable of solving many industrial-scale problems to proven optimality.
The module is designed to enable students to apply optimisation techniques to business problems. Building on the introduction to optimisation in the first term, students will be introduced to different problem formulations and algorithmic methods to guide decision-making in business and other organisations.
The module aims to teach the elements of time series and econometric forecasting so that students who pass can prepare technical and non-technical reports for clients, based on these methods, that are methodologically competent, understandable and concisely presented.
On successful completion of the module students should be able to:
The module introduces time series and causal forecasting methods so that students who pass will be able to prepare methodologically competent, understandable and concisely presented reports for clients. By the end of the course you should be able to build causal and time series models, assess their accuracy and robustness, and apply them in a real-world problem domain.
The course provides an introduction to the fundamental methods and approaches from the interrelated areas of data mining, statistical/machine learning, and intelligent data analysis. The course covers the entire data analysis process, starting from the formulation of a project objective, developing an understanding of the available data and other resources, up to the point of statistical modelling and performance assessment. The focus of the course is classification. The course content covers:
The course uses the R programming language and more specifically the RStudio integrated programming environment. The course makes extensive use of online video lectures from top scientists in the field, and (I hope) will be supported by DataCamp (I am currently in the process of enrolling students to DataCamp for the classroom to allow them free access to a large number of otherwise non-free DataCamp courses).
1. G. James, D. Witten, T. Hastie and R. Tibshirani (2013) An Introduction to Statistical Learning: with Applications in R, Springer
2. M.R. Berthold, C. Borgelt, F. Höppner and F. Klawonn (2010) Guide to Intelligent Data Analysis, Springer
3. P.-N. Tan, M. Steinbach and V. Kumar (2005) Introduction to Data Mining, Pearson Addison Wesley, Boston
The purpose of this course is to understand and use mathematical models in making strategic, tactical, and operational logistics decisions. Emerging logistical concepts will be introduced and the associated mathematical modelling needs will be discussed. Algebraic formulations will be used as a vehicle for describing models and discussing their relationships. There will be a focus on modelling, the use of professional software, and the understanding of results. For problems where exact solutions are hard to achieve even for simple instances of the problem, heuristics will be discussed.
Depending on students' needs and level of programming skill, the computer workshops will focus on solver languages (e.g. GAMS, AMPL, MPL), programming interfaces (PYOMO, CPLEX Concert, Gurobi Python Interface), or both.
The biodiversity crisis is one of the most urgent challenges facing ecologists today. Study how modern data capture tools and the application of AI have revolutionised research in environmental and ecological applications. You will develop the skills and understanding necessary to use advanced statistical, machine learning and AI methods to address biodiversity challenges in a robust and scientifically valid way.
This module introduces students to the fundamental principles of Geographical Information Systems (GIS) and Remote Sensing (RS) and shows how these complementary technologies may be used to capture/derive, manipulate, integrate, analyse and display different forms of spatially-referenced environmental data. The module is highly vocational with theory-based lectures complemented by hands-on practical sessions using state-of-the-art software (ArcGIS & ERDAS Imagine).
In addition to the subject-specific aims, the module provides students with a range of generic skills to synthesise geographical data, develop suitable approaches to problem-solving, undertake independent learning (including time management) and present the results of the analysis in novel graphical formats.
This module will immerse students in advances in ecological research and conservation that provide key skills for working as an ecologist in the era of Big Data. Teaching is delivered by world-leading researchers who are experts in biodiversity from coral reefs to tropical forests and freshwater lakes, ensuring a deep understanding of how data science can generate actionable insights for global conservation. Throughout the course, students will gain an understanding of the principles behind data science tools and techniques at the forefront of developing both a fundamental understanding of the natural world and urgent solutions to the global biodiversity crisis. The curriculum is dynamic and will adapt annually to address contemporary issues.
Indicative topics include:
1. Why do we need data science for biodiversity?
2. Big Data: advantages, challenges and solutions
3. Automating species ID for citizen science (AI, machine learning)
4. Tracking animal movements underwater (acoustic telemetry)
5. Quantifying 3D habitat structure (photogrammetry)
6. Biodiversity soundscapes in a noisy world (bioacoustics)
7. The ecological role of colour (machine learning)
8. Scaling up: from animal behaviour to global species distributions (geospatial)
9. Extended reality for ocean empathy (XR)
10. Responsible data science for biodiversity
Workshops offer in-depth exploration of advanced topics, such as AI’s role in predictive ecology, cutting-edge ecological technologies, biodiversity beyond species richness, data visualisation strategies, and innovative data-driven solutions to the biodiversity crisis. Our interdisciplinary approach blends ecological and computational perspectives, preparing you for in-demand roles in the evolving ecology sector.
This module will equip the student with the understanding and skills required to use statistical methods to solve current ecological challenges in a robust and well-considered manner, translating statistical uncertainty into decision-making processes. These skills are highly sought after by conservation charities and non-governmental organisations.
Over the course of the module, the student will become familiar with the principles of statistical inference, including likelihood theory and Bayesian inference, and by the end of the module, they will be confident in justifying the use of one approach over the other and comparing and contrasting the results from the two methods. Students will experiment with different ecological data types and examine these through the lens of different visualisation and descriptive analyses.