Natural Language Processing of a Facebook Page


After my S2DS course ended, I decided to keep looking into the organisation’s Facebook page to try some Natural Language Processing using Python.

I divided the Facebook posts into two groups: posts from the organisation and posts from its readers.

I describe the analysis in full on my GitHub page, but here is its conclusion:
I used several Natural Language Processing tools:

  • Most frequent words
  • Collocations
  • Unique expressions
  • Concordance
  • Bigrams
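As a rough illustration of what three of these tools compute, here is a minimal sketch using only the Python standard library. It is not the code from my GitHub analysis: the example posts are invented, and the names `tokenize`, `word_counts`, `bigram_counts` and `concordance` are hypothetical helpers chosen for this sketch.

```python
import re
from collections import Counter

# Hypothetical example posts, invented for illustration only.
posts = [
    "My dad was diagnosed with Parkinson's last year.",
    "Raising money for research into Parkinson's.",
    "My dad and my mum are raising awareness.",
]

def tokenize(text):
    """Lowercase a post and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())

tokens_per_post = [tokenize(p) for p in posts]

# Most frequent words across all posts.
word_counts = Counter(t for toks in tokens_per_post for t in toks)

# Bigrams: pairs of adjacent words, counted within each post.
bigram_counts = Counter(
    pair for toks in tokens_per_post for pair in zip(toks, toks[1:])
)

def concordance(word, width=2):
    """Return each occurrence of `word` with `width` tokens of context on each side."""
    lines = []
    for toks in tokens_per_post:
        for i, t in enumerate(toks):
            if t == word:
                lines.append(" ".join(toks[max(0, i - width): i + width + 1]))
    return lines

print(word_counts.most_common(3))
print(bigram_counts.most_common(2))
print(concordance("dad"))
```

On real posts you would normally also remove stop words such as "my" and "and" before counting, otherwise they dominate the frequency lists.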

These tools helped me analyse Facebook posts from two groups: Parkinson’s UK and their readers. They highlighted the interests and priorities the two groups share and those on which they diverge, as well as an interesting gender imbalance in how people are affected by Parkinson’s.

Interests and priorities

Parkinson’s UK and their readers shared similar interests: diagnoses, raising awareness, raising money, research, and medication. However, their priorities differed slightly: Parkinson’s UK focused on research, while their readers focused on medication. Parkinson’s UK also tended to present positive stories, while their readers talked positively about their parents but used more negative expressions. I did not conduct any sentiment analysis on this text, but I would expect it to show a similar pattern.
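One simple way to put a number on this kind of difference in priorities is to compare how large a share of each group's words a given term takes up. The sketch below assumes this approach; it is not necessarily how the comparison was made in my analysis, the posts are invented, and `relative_frequencies` and `frequency_gap` are hypothetical names.

```python
import re
from collections import Counter

# Hypothetical posts, invented for illustration only.
organisation_posts = [
    "New research funding announced.",
    "Research news: a positive story.",
]
reader_posts = [
    "Has anyone changed medication?",
    "The medication helps my dad.",
]

def relative_frequencies(posts):
    """Map each word to its share of all word tokens in `posts`."""
    counts = Counter(w for p in posts for w in re.findall(r"[a-z']+", p.lower()))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

org_freq = relative_frequencies(organisation_posts)
reader_freq = relative_frequencies(reader_posts)

def frequency_gap(word):
    """Positive: more prominent in organisation posts; negative: in reader posts."""
    return org_freq.get(word, 0.0) - reader_freq.get(word, 0.0)

for word in ("research", "medication"):
    print(word, round(frequency_gap(word), 3))
```

Using shares rather than raw counts matters here because the two groups do not write the same number of posts.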

Gender imbalance

Two main factors influence the gender of those who are talked about on Facebook: first, the number of people with Parkinson’s; and second, the people who post on Facebook. Recent research finds a higher prevalence of Parkinson’s in men than in women, possibly due to the role of hormones in the development of the condition. Therefore more ‘dads’ might be talked about. On the other hand, as often stressed by Parkinson’s UK, those who are affected by Parkinson’s are not just those who develop it, but also their families. This is supported by a reading of the text, which shows that when a person talks about their ‘dad’, they also mention their ‘mum’: in this case the dad is the person with Parkinson’s, while the mum is the carer or family member. During the project, this was very quickly mentioned as a possible issue in applying Natural Language Processing tools. When comparing men and women of the same generation, we should remember that women might be sharing more than men on Facebook. Therefore, more husbands are likely to be mentioned, regardless of Parkinson’s prevalence in the population.
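The ‘dad’ and ‘mum’ observation came from reading the text, but a co-mention count is one way such a reading could be checked. The following is a minimal sketch under that assumption, with invented posts and a hypothetical helper `mentions`:

```python
import re

# Hypothetical reader posts, invented for illustration only.
reader_posts = [
    "My dad has Parkinson's and my mum looks after him.",
    "Dad was diagnosed in 2010.",
    "My husband takes his medication every morning.",
]

def mentions(post, word):
    """True if `word` appears as a whole word in `post` (case-insensitive)."""
    return re.search(rf"\b{re.escape(word)}\b", post, flags=re.IGNORECASE) is not None

# Posts that mention 'dad', and the subset that also mention 'mum'.
dad_posts = [p for p in reader_posts if mentions(p, "dad")]
dad_and_mum = [p for p in dad_posts if mentions(p, "mum")]

share = len(dad_and_mum) / len(dad_posts) if dad_posts else 0.0
print(f"{len(dad_and_mum)} of {len(dad_posts)} 'dad' posts also mention 'mum' ({share:.0%})")
```

A count like this only shows that the two words appear together; deciding who has Parkinson’s and who is the carer still requires reading the posts.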


I’ve now manually sorted the posts that were not from the Parkinson’s UK page, and I’m looking into the different types of posts (asking for advice, fundraising, ...).