Learning from Data in post-Foundation Models Era: bringing learning and reasoning together.



Professor Plamen Angelov will give a keynote talk at the IEEE World Congress on Computational Intelligence (WCCI) 2024 in Yokohama, Japan, in July 2024.

Deep Learning continues to attract the attention and interest not only of the wider scientific community and industry, but also of society and policy makers. Fuelled by the remarkable generalisation and separability capabilities offered by transformers (e.g. ViT), Foundation Models (FM) offer unparalleled feature extraction opportunities. However, the mainstream approach of end-to-end iterative training of a hyper-parametric, cumbersome, and opaque model architecture has led some authors to brand such models “black boxes”. This approach degrades their generalisation and requires large amounts of labelled data and compute power, with the related energy and other costs. Cases have been reported in which such models give wrong predictions with high confidence, something that jeopardises safety and trust. Deep Learning is focused on accuracy and overlooks explainability, the semantic meaning of the internal model representations, reasoning, and its link with the problem domain. In fact, it takes a shortcut from large amounts of (labelled) data to predictions, bypassing causality and substituting it with correlation and error minimisation. It relies on assumptions about the data distributions that are often not satisfied, and it suffers from catastrophic forgetting when faced with continual and open set learning. Once trained, such models are inflexible to new knowledge: they are good only for what they were originally trained for. Indeed, the ability to detect the unseen and unexpected, and to start learning the new class or classes in real time with no or very little supervision (zero- or few-shot learning), is critically important but is still an open problem. The challenge is to bridge the gap between high levels of accuracy and semantically meaningful solutions.

This talk will focus on “getting the best from both worlds”: the powerful latent feature spaces formed by pre-trained deep architectures such as transformers, combined with interpretable-by-design models (in linguistic, visual, semantic, and similarity-based form). One can see this as a fully interpretable frontend and a powerful backend working in harmony. Examples will be demonstrated from the latest projects in autonomous driving, Earth Observation and health, as well as on a set of well-known benchmarks.
