Text Analysing my Facebook Data

Using the Wmatrix corpus analysis tool, plus my own Java word-frequency program to handle Arabic words, I text-analysed my Facebook data from 2007 to 2013.

The data contains around 600,0000 words (~50,000 distinct words) in Arabic, English and Franco-Arabic (romanised Arabic, i.e. Arabic written in the Latin alphabet).

Here are some interesting facts about me (this includes my wall, comments, activities, places and private messages). I excluded stop-words (e.g. in, an, it, at) and my own names from the statistics:
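
For anyone curious how such counts can be produced, here is a minimal sketch in Java of a word-frequency counter that skips stop-words. It is not my actual program: the class name WordFrequency, the method names and the tokenising rule are all assumptions made for illustration. The regular expression uses Unicode letter classes, so Arabic and Latin-script words are both picked up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class WordFrequency {
    // \p{L} matches a letter in any script and \p{M} matches combining marks
    // (e.g. Arabic diacritics), so a token is a run of letters in any language.
    private static final Pattern TOKEN = Pattern.compile("[\\p{L}\\p{M}]+");

    // Counts every token in the text, lower-cased, ignoring the given stop-words.
    static Map<String, Integer> count(String text, Set<String> stopWords) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            String word = m.group().toLowerCase(Locale.ROOT);
            if (!stopWords.contains(word)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns the n most frequent words; ties are broken alphabetically.
    static List<String> top(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```

Calling WordFrequency.top(WordFrequency.count(text, stopWords), 5) would give a top-5 list like the ones below. The stop-word set must be lower-case, because tokens are lower-cased before the lookup.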


------------------

Top 5 most frequent English words:
1-"good"
2-"friends"
3-"university"
4-"thanks"
5-"happy"
------------------

Top 5 most frequent Arabic words:
1-"الله" --> God
2-"ابوي" --> Dad
3-"بخير" --> Fine
4-"حبيبي" --> Darling
5-"الإضراب" --> Protest
------------------

Top 5 most frequent Franco-Arabic words:
1-"walak"
2-"wallah"
3-"eid"
4-"kteer"
5-"yalla"
------------------

I'm most active on:
"Sundays" and "July"
------------------

I'm least active on:
"Saturdays" and "December"
------------------

Most active year:
"2011"
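
The activity figures above come down to grouping timestamps by a calendar field and picking the largest group. A minimal sketch in Java for the day-of-week case follows, assuming the activity times are already parsed into LocalDateTime values. The class ActivityByDay and its methods are hypothetical names, not part of Wmatrix or of my program; the same idea works for months or years by swapping getDayOfWeek() for getMonth() or getYear().

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class ActivityByDay {
    // Counts how many activities fall on each day of the week.
    static Map<DayOfWeek, Integer> countByDay(List<LocalDateTime> times) {
        Map<DayOfWeek, Integer> counts = new EnumMap<>(DayOfWeek.class);
        for (LocalDateTime t : times) {
            counts.merge(t.getDayOfWeek(), 1, Integer::sum);
        }
        return counts;
    }

    // Returns the day with the highest count, or null if there is no activity.
    // With the EnumMap above, ties go to the earlier day in Monday-to-Sunday order.
    static DayOfWeek mostActive(Map<DayOfWeek, Integer> counts) {
        DayOfWeek best = null;
        int bestCount = -1;
        for (Map.Entry<DayOfWeek, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}
```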
------------------

Most frequent Emoticons:
1-":)"
2-":D"
3-":S"
4-":P"
5-";)"
------------------

I found that I don't swear a lot!! The top F-word appeared only 14 times in 6 years!
I swear a bit more in Arabic!!
------------------

Places I mentioned the most:
1-"London"
2-"Jordan"
3-"Colchester"
4-"Essex"
5-"Tokyo"
------------------

Attractions I mentioned the most:
1-"Museum"
2-"Castle"
3-"Lake"
4-"Monument"
5-"Gallery"
------------------

Sports I mentioned the most:
1-"Football"
2-"Tennis"
3-"Badminton"
4-"Cycling"
5-"Swimming"
------------------

Football team I mentioned the most:
"Arsenal" [well, of course]
------------------

Names I mentioned the most (English):
Male-Names:
1-"Moutaz"
2-"Radwan"
3-"Anas"
4-"Dyaa"
5-"Fahed"

Female-Names:
1-"Farah"
2-"Seetha"
3-"Amal"
4-"Dareen"
5-"Noor"
------------------

Names I mentioned the most (Arabic):
Male-Names:
1-"علاء"
2-"عبود"
3-"ضياء"
4-"محمد"
5-"فهد"

Female-Names:
1-"فرح"
2-"الهام"
3-"أمل"
4-"دارين"
5-"دندوش"
------------------

Between 2007 and 2013 I was active at least once during every minute of the day, except for the following times:

3:21am
3:53am
3:59am
4:21am
4:22am
4:25am
4:36am
5:56am
6:25am
8:24am
9:19am
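
A list like the one above can be found by marking every minute of the day (0 to 1439) that has at least one activity and then reading off the unmarked ones. Here is a minimal sketch in Java, assuming the timestamps are available as LocalDateTime values; QuietMinutes is a hypothetical name used only for this example.

```java
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper, for illustration only.
class QuietMinutes {
    private static final int MINUTES_PER_DAY = 24 * 60;

    // Returns the minutes of the day on which no activity was ever recorded,
    // in chronological order. The date part of each timestamp is ignored.
    static List<LocalTime> find(List<LocalDateTime> activityTimes) {
        BitSet seen = new BitSet(MINUTES_PER_DAY);
        for (LocalDateTime t : activityTimes) {
            seen.set(t.getHour() * 60 + t.getMinute());
        }
        List<LocalTime> quiet = new ArrayList<>();
        for (int m = seen.nextClearBit(0); m < MINUTES_PER_DAY; m = seen.nextClearBit(m + 1)) {
            quiet.add(LocalTime.of(m / 60, m % 60));
        }
        return quiet;
    }
}
```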
------------------

Top 5 Positive words:
1-"good"
2-"hope"
3-"thanks"
4-"great"
5-"best"
------------------

Top 5 Negative-emotion words:
1-"worry"
2-"sad"
3-"attacks"
4-"upset"
5-"hate"
------------------

Top 5 -ing words:
1-"something"
2-"doing"
3-"going"
4-"everything"
5-"working"
------------------

Top 5 -ed words:
1-"added"
2-"tagged"
3-"posted"
4-"shared"
5-"received"
------------------

Top 5 governmental keywords:
1-"soldiers"
2-"blitz"
3-"military"
4-"army"
5-"siege"
------------------

Top 5 part-of-speech tags:
1-"NN1" singular common noun (e.g. book, girl)
2-"MC" cardinal number, neutral for number (e.g. two, three)
3-"NP1" singular proper noun (e.g. London, Jane, John)
4-"II" general preposition
5-"FO" formula
------------------

Top 5 collocations:
1-"Sub-Zero"
2-"Sabra-Shatila"
3-"Optimality-Theory"
4-"MoneyGram-Reference"
5-"Graduate-Teaching"