About Me

NLP Senior Research Associate. Working on the ESRC-funded Understanding Corporate Communications and Corporate Financial Information Environment (CFIE) projects at the School of Computing and Communications and the Linguistics and English Language department - Lancaster University.


-- Research Associate at the School of Computing and Communications - Lancaster University.

-- Worked as a Developmental Systems and Data Mining Developer at the UK Data Archive at Essex University.


PhD in Computer Science, Essex University 2012.
MSc in Information Systems, Jordan University 2008.
BSc in Computer Information Systems, Jordan University 2005.

Research Interests

Natural Language Processing (NLP), mainly multi-document text summarisation for both Arabic and English, information retrieval, question answering, machine translation, text classification, crowd-sourcing, information extraction and creating NLP resources.

PhD Thesis

Thesis Topic: Multi-document Arabic Text Summarisation.
Candidacy: Researching and investigating the field of Arabic Natural Language Processing for both single and multi-document text summarisation, and providing resources and corpora that could help advance research in this field.

Download my PhD Thesis


1- Understanding Corporate Communications (UCC), Lancaster Uni, UK

A comprehensive analysis of the form, content and impact of communications between large, publicly traded corporations and their key stakeholder groups concerning the following three key aspects of corporate governance: i) compliance with governance requirements and recommendations (e.g. the Combined Code in the UK); ii) executive remuneration; and iii) senior management turnover.

Professor Tony McEnery, LAEL Lancaster University.
Professor Steven Young, LUMS Lancaster University.
Dr Paul Rayson, SCC Lancaster University.
Dr Andrew Hardie, LAEL Lancaster University.
Dr Mahmoud El-Haj, SCC/LAEL Lancaster University.

2- VardSourcing and SenseSourcing, Lancaster Uni, UK

The use of crowd-sourcing to build lexicons and check spelling variation in historical data.
Dr Paul Rayson, SCC Lancaster University.
Dr Mahmoud El-Haj, SCC/LAEL Lancaster University.
Dr Alistair Baron, SCC Lancaster University.

3- Corporate Financial Information Environment (CFIE), Lancaster Uni, UK

To advance research on the lexical properties and narrative aspects of corporate disclosures by developing a suite of statistical natural language processing (NLP) tools for analysing firms' narrative communication practices.

Professor Martin Walker, The University of Manchester.
Professor Steven Young, Lancaster University.
Dr Paul Rayson, Lancaster University.
Dr Mahmoud El-Haj, Lancaster University.
Dr Vasiliki Athanasakou, London School of Economics.
Dr Thomas Schleicher, The University of Manchester.

4- SKOS-HASSET Project at the UK Data Archive, Essex, UK.
The objective of this project is to bring HASSET, the leading and well-respected English language social science thesaurus, into the Linked Data web. Its aims are twofold: firstly, it will apply SKOS to HASSET, thus creating SKOS-HASSET, a Linked Open Data product for the use of the wider social science community; secondly, it will test SKOS-HASSET's automatic indexing capabilities in relation to survey data resources. The project is funded by the Joint Information Systems Committee (JISC).
My role is to automatically index the HASSET thesaurus, publications and questionnaires, and to evaluate the automatic indexing against manual human indexing.
I also apply Natural Language Processing tools to connect the thesaurus index terms with the related terms in the index of the publications and questionnaires, to improve the retrieval of these documents.

5- Updating Digital Preservation and Systems (DPS) at the UK Data Archive, Essex, UK
The objective of this project is to build applications to help organise and manage the current DPS systems. The project is funded by the Economic and Social Research Council (ESRC).
My role is to write PowerShell scripts to improve the process of organising the Archive's studies and to manage the creation and downloading of the studies' zip bundles. This requires security and validation checks to ensure that the uploaded studies and zip bundles meet the Archive's required specifications and standards.

Professional Services

- Programme Committee for Corpus Linguistics 2015.

- Organiser of the UCREL Corpus Research Seminars (CRS)

- Coordinator of the MultiLing Workshop at the ACL 2013 Conference in Sofia, Bulgaria.

- Coordinator of the MultiLing Pilot at the Text Analysis Conference (TAC) 2011 in Maryland, USA.

- Organiser of the disciplinary Language And Computation (LAC) group at Essex University.

- Organiser of the FlatLands 2012 Workshop on Natural Language Processing Research for postgraduate students at Cambridge, Essex, Open, and Oxford Universities, held on Friday, 29th June 2012 at Essex University, Wivenhoe Park, Colchester, Essex, UK.

Journal and Conference Reviewing

- Reviewer for the ESRC Research Project Proposal (RCUK) 2015.

- Reviewer for the International Journal of Corpus Linguistics 2015.

- Reviewer for the Journal of Natural Language Engineering 2014, 2015.

- Reviewer for the MDPI Future Internet Journal 2014.

- Reviewer for the 15th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing) 2014.

- Reviewer for the LRE-Rel Workshop at the Eighth International Conference on Language Resources and Evaluation (LREC) 2012.

- Reviewer for the Fourth Computer Science and Electronic Engineering Conference (CEEC) 2012.

- Reviewer for the 2nd IEEE International Conference on Computer and Communication Technology (ICCCT) 2011.

- Reviewer for the 32nd European Conference on Information Retrieval (ECIR) 2010.

Voluntary Work

GCSE Examiner of Unit 2 of the Arabic Speaking exam at Morecambe High School, Morecambe, UK, 2015.


-- Fully funded Internship at the National Institute of Informatics, Tokyo, Japan, 2011-2012.

-- Best Paper Award at the 4th LTC Conference, Poznan, Poland, 2009. The paper was then selected to appear in Springer's Lecture Notes in Computer Science.