cross-posted from: https://beehaw.org/post/17683690

Archived version

Download study (pdf)

GitHub, the de facto platform for open-source software development, provides a set of social-media-like features to signal high-quality repositories. Among them, the star count is the most widely used popularity signal, but it is also at risk of being artificially inflated (i.e., faked), decreasing its value as a decision-making signal and posing a security risk to all GitHub users.

In a recent paper published on arXiv (the preprint server run by Cornell University), researchers present a systematic, global, and longitudinal measurement study of fake stars on GitHub. The study is built on StarScout, a scalable tool that detects anomalous starring behaviors (i.e., low activity and lockstep) across the entire GitHub metadata.
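
To make the two signals more concrete, here is a rough Python sketch of what "low activity" and "lockstep" checks could look like. This is not StarScout's actual method, which the summary above does not describe in detail. The event tuple format, the thresholds, and the function names are all assumptions made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical thresholds, chosen only for illustration (not from the paper).
MAX_TOTAL_EVENTS = 2          # "low activity": the account did almost nothing else
WINDOW = timedelta(hours=1)   # "lockstep": stars land close together in time
MIN_SHARED_REPOS = 2          # ...on at least this many of the same repositories


def low_activity_users(events):
    """Users with at most MAX_TOTAL_EVENTS events overall, at least one a star."""
    counts = defaultdict(int)
    starred = set()
    for user, kind, _repo, _ts in events:
        counts[user] += 1
        if kind == "star":
            starred.add(user)
    return {u for u in starred if counts[u] <= MAX_TOTAL_EVENTS}


def lockstep_pairs(events):
    """Pairs of users who starred MIN_SHARED_REPOS+ common repos within WINDOW."""
    stars_by_repo = defaultdict(list)
    for user, kind, repo, ts in events:
        if kind == "star":
            stars_by_repo[repo].append((user, ts))

    shared = defaultdict(set)  # (user_a, user_b) -> repos starred close together
    for repo, stars in stars_by_repo.items():
        for (u1, t1), (u2, t2) in combinations(stars, 2):
            if u1 != u2 and abs(t1 - t2) <= WINDOW:
                shared[tuple(sorted((u1, u2)))].add(repo)
    return {pair for pair, repos in shared.items() if len(repos) >= MIN_SHARED_REPOS}


# Made-up sample data: (user, event kind, repository, timestamp).
t0 = datetime(2024, 7, 1, 12, 0)
events = [
    ("alice", "push", "alice/app", t0),
    ("alice", "star", "org/lib", t0 + timedelta(days=3)),
    ("alice", "issue", "org/lib", t0 + timedelta(days=4)),
    ("bot1", "star", "x/cheat", t0),
    ("bot2", "star", "x/cheat", t0 + timedelta(minutes=5)),
    ("bot1", "star", "y/crack", t0 + timedelta(days=1)),
    ("bot2", "star", "y/crack", t0 + timedelta(days=1, minutes=10)),
    ("ghost", "star", "x/cheat", t0 + timedelta(days=9)),
]

print(sorted(low_activity_users(events)))
print(sorted(lockstep_pairs(events)))
```

The pairwise comparison is quadratic in the number of stars per repository, so it only works as a toy. Anything running over all of GitHub's metadata would need a far more scalable approach.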

Analyzing the data collected using StarScout, they find that:

(1) fake-star-related activities have rapidly surged since 2024

(2) the user profile characteristics of fake stargazers are not distinct from average GitHub users, but many of them have highly abnormal activity patterns

(3) the majority of fake stars are used to promote short-lived malware repositories masquerading as pirated software, game cheats, or cryptocurrency bots

(4) some repositories may have acquired fake stars for growth hacking, but fake stars only have a promotion effect in the short term (i.e., less than two months) and become a burden in the long term.

The study has implications for platform moderators, open-source practitioners, and supply chain security researchers.

  • mox@lemmy.sdf.org (OP) · 25 days ago

    IMHO, GitHub has been steadily getting worse ever since Microsoft bought it.

    The first things I noticed were minor UI annoyances. Later on, it started hijacking some of my browser’s keyboard shortcuts and controls. Then there was the continual nagging: to give them more email addresses, to re-re-re-re-download my TOTP recovery keys, etc. Unilaterally deciding to use all of our creative works to train their LLM hasn’t made them many friends. And now there’s this issue, which might not be Microsoft’s fault (at least not entirely), but it is a consequence of the global software community using a single, centralised service for so much of what we do.

    I put my most recently published project on Codeberg. If it goes well, I’ll probably move my GitHub projects there. The UI is familiar and comfortable, and I think their work toward federated software forges is important.

    It’s worth noting that Codeberg requires most projects to be open-source. I think they make exceptions in some cases.