Newswise — As regulators and providers struggle to protect younger social media users from harassment and bullying, a team of researchers from four prominent universities has proposed a machine learning method to identify potentially harmful conversations on Instagram without invading users' privacy. The approach could give platforms and parents new tools to safeguard vulnerable young users while respecting their privacy.

Researchers from Drexel University, Boston University, Georgia Institute of Technology, and Vanderbilt University published the work in the Proceedings of the ACM on Human-Computer Interaction. Their investigation set out to determine which data inputs (metadata, text, or image features) are most effective for machine learning models that identify risky conversations on Instagram. The team found that metadata characteristics, such as conversation length and participant engagement, can be used to detect risky conversations.

Their work addresses a pressing issue on Instagram, a platform widely used by 13-to-21-year-olds in the United States. Studies have found that harassment on Instagram has contributed to a significant increase in depression among its youngest users, with teenage girls experiencing rising rates of eating disorders and other mental health problems.

"The popularity of Instagram among young people, which stems from its perceived safety in fostering open connections with others, is concerning in light of the pervasive harassment, abuse, and bullying by malicious users," expressed Afsaneh Razi, PhD, an assistant professor in Drexel's College of Computing & Informatics and a co-author of the research.

At the same time, platforms face growing pressure to safeguard users' privacy, driven by events such as the Cambridge Analytica scandal and the European Union's privacy protection laws. In response, Meta, the parent company of Facebook and Instagram, is rolling out end-to-end encryption for all messages on its platforms, meaning message content can be read only by the participants in the conversation.

But this added level of security also makes it more difficult for the platforms to employ automated technology to detect and prevent online risks — which is why the group’s system could play an important role in protecting users.

"An approach to tackle this rise in malicious users, with the ability to safeguard vulnerable users, is through automated risk-detection programs," stated Razi. "However, the challenge lies in designing them ethically to ensure accuracy without invading privacy. When implementing security features like end-to-end encryption in communication platforms, it is crucial to prioritize the safety and privacy of the younger generation."

The machine learning system devised by Razi and her team uses layered algorithms to build a metadata profile of risky conversations, drawing on factors such as conversation length and how one-sided the exchange is, along with contextual cues such as whether images or links were shared. In testing, the program identified risky conversations with 87% accuracy using only these limited, anonymous details.

To train and test the system, the team collected more than 17,000 private chats, comprising over 4 million messages, from 172 Instagram users aged 13-21 who volunteered their conversations to support the research. Participants were asked to review their conversations and label each one as "safe" or "unsafe." About 3,300 conversations were labeled "unsafe" and further sorted into five risk categories: harassment, sexual message/solicitation, nudity/porn, hate speech, and sale or promotion of illegal activities.
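
For readers who want a concrete picture of that labeling scheme, the sketch below shows one plausible way to represent the labeled data in Python. The five risk categories come from the article; the field names and the dataclass itself are illustrative assumptions, not the study's actual data format.

```python
# Hypothetical representation of the labeled dataset described above.
# The five risk categories are the ones named in the article; everything
# else is an illustrative assumption.
from dataclasses import dataclass, field
from enum import Enum


class RiskType(Enum):
    HARASSMENT = "harassment"
    SEXUAL_SOLICITATION = "sexual message/solicitation"
    NUDITY_PORN = "nudity/porn"
    HATE_SPEECH = "hate speech"
    ILLEGAL_SALES = "sale or promotion of illegal activities"


@dataclass
class LabeledConversation:
    messages: list                # the raw messages a participant reviewed
    unsafe: bool = False          # the participant's "safe"/"unsafe" judgment
    risks: list = field(default_factory=list)  # RiskType values, if unsafe
```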

The team applied several machine learning models to conversations randomly sampled from each risk category to extract a set of metadata features, including average conversation length, number of participants, number of messages exchanged, response time, number of images shared, and mutual connections between participants on Instagram. The analysis showed that these metadata features are closely associated with risky conversations.
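
As a rough illustration of how a metadata-only detector could work, the sketch below derives the features named above from a conversation record and trains a standard classifier on them. The conversation fields, helper names, and the random-forest model are assumptions made for the example; the paper's actual feature extraction and models may differ.

```python
# Minimal metadata-only risk classifier, in the spirit of the approach
# described above. Conversation fields and the choice of model are
# illustrative assumptions, not the team's published pipeline.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def metadata_features(convo: dict) -> dict:
    """Derive privacy-preserving metadata features from one conversation.

    `convo` is a hypothetical record with a `messages` list (each message
    carrying `sender`, `timestamp` in seconds, and a `has_image` flag) and
    a `mutual_connections` count for the participants.
    """
    msgs = convo["messages"]
    senders = [m["sender"] for m in msgs]
    times = sorted(m["timestamp"] for m in msgs)
    gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
    return {
        "num_messages": len(msgs),                 # conversation length
        "num_participants": len(set(senders)),
        "duration_seconds": times[-1] - times[0],
        "avg_response_time": sum(gaps) / len(gaps),
        "num_images": sum(bool(m.get("has_image")) for m in msgs),
        "mutual_connections": convo.get("mutual_connections", 0),
        # One-sidedness: share of messages sent by the most active participant.
        "one_sidedness": max(senders.count(s) for s in set(senders)) / len(msgs),
    }


def train_metadata_model(conversations, labels):
    """Fit a classifier on metadata alone and report held-out accuracy."""
    X = pd.DataFrame([metadata_features(c) for c in conversations])
    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=0.2, stratify=labels, random_state=0
    )
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)
    print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
    return model
```

Because nothing in these features exposes what was said, a detector of this shape could run on data that remains visible even when message content is encrypted.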

This data enabled the team to create a program that operates on metadata alone, some of which would still be available even if Instagram conversations were end-to-end encrypted.

"The findings from our research present intriguing possibilities for future investigations and implications for the industry at large," the team stated. "Firstly, utilizing metadata features alone for risk detection enables lightweight methods that do not necessitate resource-intensive analysis of text and images. Secondly, developing systems that do not analyze content addresses privacy and ethical concerns, prioritizing user protection in this domain."

To enhance the effectiveness of the program and enable identification of specific risk types, the team explored the option of users or parents voluntarily sharing additional conversation details for security purposes. The team conducted a similar machine learning analysis, this time incorporating linguistic cues and image features, using the same dataset to further refine the program.

In this approach, machine learning programs analyzed the text of the conversations, focusing on the words and word combinations that appeared most often in the conversations users had identified as risky. These terms were then used as triggers to flag risky conversations, enabling more precise identification of potential risks.
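
A minimal sketch of this linguistic step might look like the following, under the assumption that a TF-IDF representation and a linear classifier stand in for the team's text models; the article says only that prevalent words and word combinations were used as flags.

```python
# Illustrative sketch: find the words and word combinations (n-grams)
# most associated with conversations users labeled as risky.
# TF-IDF + logistic regression is an assumption for this example.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def risky_ngrams(conversation_texts, labels, top_k=25):
    """Return the n-grams whose learned weights most indicate risk.

    `labels` is 1 for conversations labeled "unsafe" and 0 otherwise.
    """
    vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=5)
    X = vectorizer.fit_transform(conversation_texts)  # one row per conversation
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    weights = clf.coef_[0]                            # positive = leans "unsafe"
    top = np.argsort(weights)[::-1][:top_k]
    terms = vectorizer.get_feature_names_out()
    return [(terms[i], float(weights[i])) for i in top]
```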

To analyze the images and videos shared in Instagram conversations, the team used a pair of programs: one to identify and extract any text overlaid on images and videos, and another to generate a caption for each image. As with the textual analysis, the machine learning programs then built a profile of words indicative of images and videos shared in risky conversations, allowing a more comprehensive assessment of potential risks.
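
The press release does not name the tools, so the sketch below substitutes off-the-shelf components: pytesseract to pull overlaid text out of an image and a BLIP captioning model from Hugging Face transformers to describe what the image shows. Both choices are assumptions for illustration; the resulting text could then feed into the same kind of analysis shown above.

```python
# Rough sketch of the image step using off-the-shelf tools (an assumption;
# the study's actual programs may differ): OCR for text overlaid on an
# image, plus a generated caption describing its content.
from PIL import Image
import pytesseract
from transformers import pipeline

# Image-captioning pipeline; the specific model is an illustrative choice.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")


def describe_image(path: str) -> str:
    """Turn a shared image into text: overlaid text plus a generated caption."""
    image = Image.open(path)
    overlay_text = pytesseract.image_to_string(image)   # text on top of the image
    caption = captioner(image)[0]["generated_text"]     # what the image depicts
    return f"{overlay_text.strip()} {caption.strip()}".strip()
```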

The machine learning system, trained on these characteristics of risky conversations, was then tested on a random sample of conversations from the larger dataset that had not been used for profile generation or training. By combining metadata traits, linguistic cues, and image features, the program identified risky conversations with up to 85% accuracy, demonstrating its potential for flagging risky conversations on Instagram.
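
As a sketch of what such a combined evaluation could look like in code, the example below stacks the metadata features from the earlier sketch with TF-IDF features over each conversation's text (which could include the image descriptions), fits a classifier on one split, and scores it on held-out conversations. The classifier and the `texts` preprocessing are again assumptions, not the team's published setup.

```python
# Sketch of a combined evaluation on held-out conversations. Reuses
# metadata_features() from the earlier sketch; `texts` holds each
# conversation's concatenated message text and image descriptions
# (a hypothetical preprocessing step).
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def evaluate_combined(conversations, texts, labels):
    labels = np.asarray(labels)
    idx_train, idx_test = train_test_split(
        np.arange(len(labels)), test_size=0.2, stratify=labels, random_state=0
    )

    # Metadata block, converted to a sparse matrix so it can be stacked.
    meta = csr_matrix(
        pd.DataFrame([metadata_features(c) for c in conversations]).values
    )

    # Text block: fit the vocabulary on the training conversations only.
    vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=5)
    text_train = vectorizer.fit_transform([texts[i] for i in idx_train])
    text_test = vectorizer.transform([texts[i] for i in idx_test])

    X_train = hstack([meta[idx_train], text_train]).tocsr()
    X_test = hstack([meta[idx_test], text_test]).tocsr()

    clf = LogisticRegression(max_iter=2000).fit(X_train, labels[idx_train])
    accuracy = accuracy_score(labels[idx_test], clf.predict(X_test))
    print(f"held-out accuracy: {accuracy:.2%}")
    return clf
```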

The researchers noted that while metadata can provide high-level cues that a conversation may be unsafe for youth, detecting and responding to specific types of risk still requires linguistic cues and image data. This finding raises important philosophical and ethical questions, especially in the context of Meta's push toward end-to-end encryption, because such contextual cues could be valuable to well-designed, AI-driven risk mitigation systems. It underscores the need to weigh privacy and ethical implications carefully when developing and deploying automated risk detection on platforms like Instagram.

The researchers acknowledged certain limitations: the study looked only at messages exchanged on Instagram. They believe, however, that the system could be adapted to analyze messages on other platforms that employ end-to-end encryption, and that its accuracy could be further improved by continuing to train it on a larger and more diverse sample of messages.

The researchers emphasized that their findings demonstrate the feasibility of effective automated risk detection, and that while privacy protection is a valid concern, it should not stall progress in this area. They argue that such risk detection systems should be developed and deployed to protect the most vulnerable users of popular communication platforms, balancing privacy concerns against the need to safeguard those most susceptible to online risks, particularly younger users.

"Our analysis marks an important initial step towards enabling automated detection of online risk behavior using machine learning techniques. Our system currently relies on reactive characteristics of conversations, but our research also sets the stage for more proactive approaches to risk detection that are likely to be more applicable in real-world settings, considering their high ecological validity."

Journal Link: Proceedings of the ACM on Human-Computer Interaction