Researchers from Carnegie Mellon University, North Carolina State University, and Socket developed a tool for identifying GitHub projects with artificially inflated star ratings. Using it, they uncovered 3.1 million fake stars across 15,835 repositories, placed by 278 thousand accounts.
Star inflation served several purposes. It promoted repositories that spread malicious code disguised as pirated copies of commercial software, cryptocurrency bots, and game cheats. It was also used to promote products, pad developer profiles, undermine competitors, and gain credibility among users. The study identified 7 commercial services selling fake stars, with prices ranging from $0.10 to $1.62 per star.
Using the StarScout tool, the researchers analyzed 6 billion events from the GitHub activity archive maintained by the GH Archive project (gharchive). They looked for anomalies such as synchronized starring across groups of projects, sudden spikes in stars for otherwise inactive projects, and patterns of user activity typical of star inflation. StarScout, powered by CLUSTIC ANSWER, detects the recurring patterns typical of star-inflation services and is published under the Apache 2.0 license.
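To make the "sudden spike" heuristic more concrete, here is a minimal Python sketch. It is not StarScout's actual algorithm. It assumes events in the GH Archive JSON format, where a star is recorded as a `WatchEvent` with a `repo.name` field and an ISO 8601 `created_at` timestamp. The function names and the `factor` and `min_stars` thresholds are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict
from statistics import median


def daily_star_counts(events):
    """Count star events (GH Archive type "WatchEvent") per repository per day."""
    counts = defaultdict(Counter)
    for ev in events:
        if ev.get("type") != "WatchEvent":
            continue  # ignore pushes, forks, issues, and other activity
        repo = ev["repo"]["name"]
        day = ev["created_at"][:10]  # "2024-01-04T12:00:00Z" -> "2024-01-04"
        counts[repo][day] += 1
    return counts


def find_spikes(counts, factor=10, min_stars=50):
    """Return (repo, day, stars) for days far above the repo's typical day.

    The baseline is the median over days that had at least one star;
    days with no stars are not part of the input and are not counted.
    Both thresholds are illustrative assumptions, not values from the study.
    """
    spikes = []
    for repo, per_day in counts.items():
        baseline = median(per_day.values())
        for day, n in sorted(per_day.items()):
            if n >= min_stars and n >= factor * baseline:
                spikes.append((repo, day, n))
    return spikes
```

In practice, `events` could be produced by reading an hourly GH Archive dump line by line and parsing each line with `json.loads`. A spike alone proves nothing, since legitimate projects also go viral, which is why the researchers combined several signals.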
The tool flagged anomalies in 4.53 million stars across 22,915 repositories, placed by 1.32 million accounts. To reduce false positives, the results were then filtered further, keeping only repositories with suspicious star patterns and sudden bursts of stars. As of October 2024, GitHub had removed 90.75% of the flagged repositories and 61.95% of the implicated accounts.