Lancaster University data science experts are working with linguists to develop a free bilingual online platform to analyse survey data.
Involving academics from Cardiff and Lancaster Universities as well as Cadw and the National Trust Wales, the FreeTxt | TestunRhydd project will allow large organisations to quickly respond to opinions from both Welsh and English consumers.
Free-text qualitative comments, which are captured in feedback from surveys and questionnaires, pose a particular challenge to a range of private and public companies and institutions, who may not have the skills and expertise to process and analyse these comments with ease.
As a result of the Welsh Language Act, survey respondents in Wales have the opportunity to respond to surveys in English or Welsh. This makes the resulting data even harder to analyse if adequate Welsh language expertise does not exist within the workforce or if some of the responses submitted are a mixture of both Welsh and English.
Although a range of sophisticated digital tools for the analysis of text-based data are available, particularly for researchers working in academia, marketing and public relations contexts, these tools are costly and do not fully support the task of systematically processing free-text responses in Welsh.
Project lead Dr Dawn Knight, based at Cardiff University, said: “In our modern consumer-led culture, the process of obtaining and responding to feedback permeates all walks of life. Feedback from surveys, focus groups and questionnaires can include language-rich ‘free-text’ comments which often prove to be a challenge to manually process and analyse with ease, due to the volume of comments.
“In this project we aim to build a free online toolkit, FreeTxt | TestunRhydd, which will be made accessible to anyone in any sector, to support the analysis of multiple forms of open-ended, free-text data in both English and Welsh. The platform will analyse multiple comments quickly, pulling out common themes, so that important feedback can be acted upon efficiently.
“We hope this project will be particularly helpful for organisations as they work to ensure those who prefer to communicate through the Welsh language are given equal opportunities to provide comments and feedback.”
Users of the tool will be able to enter survey data. The platform will then quickly process the text to provide an easy-to-interpret visualisation of the words that appear most prominently.
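As a rough illustration of the kind of processing described above, the following minimal sketch counts the most frequent words across a mixed set of English and Welsh comments. It is not the FreeTxt | TestunRhydd implementation: the function name `top_words`, the sample comments and the tiny stopword list are all assumptions made for this example, and a real tool would need far fuller language resources.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption for this sketch, far from complete).
STOPWORDS = {
    "the", "and", "was", "is", "to", "of", "very",   # English
    "y", "yr", "a", "ac", "yn", "mae", "i", "o", "iawn",  # Welsh (and English "a")
}

def top_words(comments, n=5):
    """Return the n most frequent non-stopword tokens across all comments."""
    counts = Counter()
    for comment in comments:
        # \w is Unicode-aware in Python 3, so accented Welsh letters are kept.
        tokens = re.findall(r"\w+", comment.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)

# Hypothetical survey comments in English and Welsh.
comments = [
    "The castle was beautiful and the staff were friendly",
    "Roedd y castell yn hardd iawn",
    "Friendly staff, beautiful gardens",
    "Castell hardd a staff cyfeillgar",
]

for word, count in top_words(comments):
    print(word, count)
```

The resulting word counts are the sort of data that could feed a word cloud or bar chart. Note that this sketch treats "castle" and "castell" as unrelated words; linking responses across the two languages is one of the harder problems a bilingual tool has to address.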
Academics will work closely with project partners Cadw and National Trust Wales to co-design, co-construct and test FreeTxt | TestunRhydd to ensure that the resource is fit for purpose and handles Welsh and English-language responses fairly and consistently.
This project builds on previous work by Dr Knight, the CorCenCC (National Corpus of Contemporary Welsh), which is a freely accessible collection of multiple language samples, gathered from real-life communication.
Project Co-Investigator, Professor Paul Rayson, from Lancaster University’s School of Computing and Communications, Data Science Institute and UCREL (University Centre for Computer Corpus Research on Language), said: “We look forward to getting to grips with Welsh language data again following our successful previous collaboration with Cardiff University in the CorCenCC project.
“Along with my colleagues, Dr Mo El-Haj and Dr Ignatius Ezeani, we will build a set of openly available Welsh and English language analysis tools and make them accessible through the free online tool for survey data analysis. Working with partners Cadw and National Trust Wales will ensure strong stakeholder input to the design of the user tool, and testing with them will enable us to judge how useful it will be for scaling up their own analyses of free text survey responses.”
Rebecca Williams, Assistant Director of Consultancy at National Trust Cymru, said: “We’re looking forward to working with Cardiff and Lancaster Universities on the FreeTxt/TestunRhydd project to create a tool that can be used by a range of organisations across Wales to analyse their data. There isn’t a tool on the market that allows us to measure sentiment in Welsh, therefore we hope this tool will enable us to understand written feedback more accurately as well as opening the door for organisations to interpret and act on Welsh language data.
“With 185,000 members and over 1.8 million visitors to our places in Wales every year, we capture a wealth of data, both quantitative and qualitative. The latter being harder to analyse due to the language it’s written in and its format such as written comment cards, online comments and surveys. Feedback from our supporters help guide our priorities, programming, events and more, and by having a tool in place that helps us identify key themes will allow us to improve people’s understanding of the Trust and experience at our places.”