Blog Post #4


Hasan Can Biyik, MS Student in Computational


News articles, which serve as primary sources of information about current events, have the power to influence the beliefs, conceptions, and behaviors of individuals. In Turkey, LGBTQ+ representation in mainstream media contributes significantly to the perception that LGBTQ+ individuals are unnecessary or live on the margins of society. In Turkish media specifically, these individuals are often denied a platform to represent themselves accurately and defend their rights. In fact, Turkish media tends to promote hate speech against LGBTQ+ individuals. In response, people in the LGBTQ+ community turn to alternative media to advocate for their rights, express themselves openly, and address the challenges they face. This research investigated the extent to which news articles in Turkish social media have a positive or negative view of LGBTQ+ individuals.

Literature Review

Turkey supports heteronormativity while marginalizing same-sex sexualities and gender-nonconforming identities (Atalay & Doan, 2019). The influence of the AKP’s (Justice and Development Party) rhetoric has pushed the country towards conservatism, religiosity, and greater societal oppression. Ozbay (2015) states that homophobia has been widespread, with specific sexualities considered ‘deviations’ and ‘illnesses’ by public authorities and military organizations. Selma Aliye Kavaf, Turkish Minister of State responsible for Women and Family Affairs, stated in 2010 that homosexuality was a ‘biological disorder’, an ‘illness’ that should be treated (Amnesty International, 2011, p. 5). After the 1980 coup and through the 1990s, LGBTQ+ people, and trans individuals in particular, were represented in a sexist and homophobic context on private media channels. This period witnessed the cultivation of a national “fear of the queer” ideology (Gurel, 2017). Despite the growing visibility of the LGBT community, the depiction was negative, characterizing its members as sinful individuals, outcasts, and even monsters (Atalay & Doan, 2019). In this blog post, I explore how the Turkish media represents the LGBTQ+ community.


To achieve this goal, I obtained a dataset of Turkish news articles from Kaggle. This dataset included 42 thousand unlabeled news articles. Using this large corpus, I first selected only the news articles that mentioned LGBTQ+ people. In order to do this, I used AntConc and searched for the following keywords:

Then, only news articles that contained at least one of these words were kept in the corpus. The next step was to conduct a sentiment analysis using Python. As the news articles dataset did not have labels, I used pretrained sentiment analysis models found on Hugging Face: one based on BERTurk, which is specific to Turkish, and one based on the multilingual XLM-RoBERTa. The XLM-RoBERTa model labeled and scored each sentence as negative, positive, or neutral, while the BERTurk model used only two labels, negative and positive. The datasets were visualized in R with the ggplot2 package.
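The filtering and labeling steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the exact procedure used here: the filtering in this study was done in AntConc, the keywords below are placeholders rather than the actual search terms, and `stub_classifier` is a hypothetical stand-in for a real model so that the sketch runs without downloading anything.

```python
import re

# Placeholder keywords (assumption): illustrative terms only,
# not the list that was actually searched for in AntConc.
KEYWORDS = ["lgbt", "eşcinsel"]


def build_pattern(keywords):
    """Compile a case-insensitive pattern matching any keyword plus suffixes."""
    alternatives = "|".join(re.escape(k) for k in keywords)
    # \w* lets the match extend over Turkish suffixes (e.g. "eşcinseller").
    return re.compile(rf"\b(?:{alternatives})\w*", re.IGNORECASE)


def filter_articles(articles, keywords=KEYWORDS):
    """Keep (file_number, text) pairs whose text contains at least one keyword."""
    pattern = build_pattern(keywords)
    return [(number, text) for number, text in articles if pattern.search(text)]


def label_articles(articles, classify):
    """Label each text with a classifier and return one row per text.

    `classify` is any callable that takes a list of strings and returns a
    list of {"label": ..., "score": ...} dicts, which is the output shape of
    a Hugging Face text-classification pipeline.
    """
    texts = [text for _, text in articles]
    results = classify(texts) if texts else []
    return [
        {
            "File Number": number,
            "Text": text,
            "Sentiment Label": result["label"],
            "Sentiment Score": result["score"],
        }
        for (number, text), result in zip(articles, results)
    ]


def stub_classifier(texts):
    """Hypothetical stand-in for a real model: labels everything neutral."""
    return [{"label": "neutral", "score": 0.5} for _ in texts]


# Invented example texts, not taken from the Kaggle dataset.
sample = [
    (1, "LGBTİ+ bireyler yürüyüş düzenledi."),
    (2, "Ekonomi haberleri bugün açıklandı."),
    (3, "Eşcinsel evlilik tartışması sürüyor."),
]
kept = filter_articles(sample)
rows = label_articles(kept, stub_classifier)
```

To use a real model instead of the stub, a classifier created with `transformers.pipeline("sentiment-analysis", model=...)` can be passed as `classify`, with the model argument set to whichever BERTurk-based or XLM-RoBERTa-based checkpoint is chosen.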


After conducting a sentiment analysis using the aforementioned models, two datasets were created. Both datasets had ‘File Number’, ‘Text’, ‘Sentiment Label’, and ‘Sentiment Score’ columns. The first dataset (the first three rows will be displayed), whose sentiment analysis was based on the XLM-RoBERTa model, is as follows:

The second dataset (the first three rows will be displayed), whose sentiment analysis was based on the BERTurk model, is as follows:


According to the BERTurk model:

On the other hand, according to the XLM-RoBERTa model:

The following excerpt was classified as ‘negative’ by both models. BERTurk scored the sentence ‘0.868’ and XLM-RoBERTa scored it ‘0.903’.

The following excerpt was classified as ‘positive’ by both models. BERTurk scored the sentence ‘0.954’ and XLM-RoBERTa scored it ‘0.674’.


The overall results are plotted with sentiment scores on the x-axis and the corresponding sentiment labels on the y-axis (see figures below).


According to the findings of this study, the tone and framing of the LGBTQ+ community in Turkish media are mainly neutral and negative. Comparing the results of the two sentiment analyses, the XLM-RoBERTa model produced a better classification than BERTurk. Even though BERTurk classified more sentences as positive, its sentiment scores indicated that the model was not confident in those labels. In other words, had a neutral label been available to BERTurk, most of these sentences would likely have been classified as neutral.
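The argument about low-confidence labels can be made concrete with a small sketch. It maps a two-label prediction to three labels by treating any prediction below a confidence cutoff as neutral. This was not part of the analysis itself: the cutoff of 0.75 is an arbitrary assumption chosen for illustration, and the third prediction in the example is invented (the first two scores are the BERTurk scores of the excerpts quoted above).

```python
from collections import Counter


def to_three_way(label, score, threshold=0.75):
    """Return the original label if the model is confident, else "neutral".

    `threshold` is a hypothetical cutoff, not a value used in this study.
    """
    return label if score >= threshold else "neutral"


def summarize(predictions, threshold=0.75):
    """Count three-way labels for a list of (label, score) predictions."""
    return Counter(to_three_way(label, score, threshold) for label, score in predictions)


predictions = [
    ("negative", 0.868),  # BERTurk score of the negative excerpt above
    ("positive", 0.954),  # BERTurk score of the positive excerpt above
    ("positive", 0.55),   # invented low-confidence prediction
]
counts = summarize(predictions)
```

With a cutoff like this, a positive label backed by a score only slightly above 0.5 would no longer count as positive, which is the effect described in the paragraph above.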

The results of this initial analysis suggest that Turkish media avoids describing the LGBTQ+ community positively; instead, words with neutral or negative connotations are preferred. As a result, it is likely that media organizations negatively influence people’s opinions, beliefs, and conceptions of the LGBTQ+ community, or at least that they make no attempt to change the current view of LGBTQ+ people in Turkish society.

Future studies should confirm the results of the sentiment analysis and use larger datasets containing more news articles related to the LGBTQ+ community. As an alternative, it would be interesting to compare the results of various sentiment analysis models.