Liveblog: Open-sourcing Twitter's algorithm

Sourcegraph team
March 31, 2023

Sourcegraph devs (and our Discord community) will be liveblogging the most interesting things we see once the code is published. Follow along here for updates!


02:02pm PDT

We are signing off for now. Check out the following:

01:43pm PDT

Government requests for intervention on Twitter must have been so pervasive Twitter Engineers even have a class for it in the Twitter Algorithm pic.twitter.com/F05sD5h9Lk

— Alec Sears (@alec_sears) March 31, 2023

Link to code


01:19pm PDT

Twitter just released source code for "the algorithm"

Oh, what file is this? Predicates for tweets on the home timeline?

Oh what is that 2nd image? pic.twitter.com/UE3dU8e3Os

— Ólafur Waage (@olafurw) March 31, 2023

What is this?

(
"author_is_elon",
candidate =>
candidate
.getOrElse(AuthorIdFeature, None).contains(candidate.getOrElse(DDGStatsElonFeature, 0L))),
https://t.co/mLdjWWYHrF

— David Mander (@davmander) March 31, 2023
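
For readers unfamiliar with the pattern, the snippet quoted above pairs a metric name with a function from a candidate tweet to a Boolean: it returns true when the candidate's author ID equals the ID stored under `DDGStatsElonFeature`. Below is a minimal, self-contained sketch of that shape. The `Feature` trait, the `Candidate` class, and the ID values are hypothetical stand-ins invented for illustration; they are not Twitter's actual classes, and only the predicate body mirrors the quoted code.

```scala
// Hypothetical, simplified stand-ins for the types used in the quoted tweet.
// These are NOT the real classes from twitter/the-algorithm.
object AuthorPredicateSketch {
  // A typed feature key (illustrative only).
  sealed trait Feature[V]
  case object AuthorIdFeature extends Feature[Option[Long]]
  case object DDGStatsElonFeature extends Feature[Long]

  // A candidate tweet carrying a bag of feature values.
  final case class Candidate(features: Map[Feature[_], Any]) {
    // Look up a feature value, falling back to `default` when it is absent.
    def getOrElse[V](feature: Feature[V], default: => V): V =
      features.get(feature).map(_.asInstanceOf[V]).getOrElse(default)
  }

  // A named predicate with the same body as the quoted snippet:
  // true when the author ID equals the ID stored under DDGStatsElonFeature.
  val authorIsElon: (String, Candidate => Boolean) = (
    "author_is_elon",
    candidate =>
      candidate
        .getOrElse(AuthorIdFeature, None)
        .contains(candidate.getOrElse(DDGStatsElonFeature, 0L))
  )

  // Helper for building example candidates; the IDs are made up.
  def candidate(authorId: Option[Long], trackedId: Long): Candidate =
    Candidate(Map[Feature[_], Any](
      AuthorIdFeature -> authorId,
      DDGStatsElonFeature -> trackedId
    ))

  def main(args: Array[String]): Unit = {
    val (name, predicate) = authorIsElon
    println(s"$name (same ID): ${predicate(candidate(Some(42L), 42L))}")
    println(s"$name (different ID): ${predicate(candidate(Some(7L), 42L))}")
    println(s"$name (no author): ${predicate(candidate(None, 42L))}")
  }
}
```

In this sketch the predicate only labels a candidate; nothing here shows how the label is used downstream, and the feature name suggests it feeds stats rather than ranking, though that is an inference from the name alone.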


12:43pm PDT

The 4 types of Twitter posters, according to the just open-sourced algorithm 😯https://t.co/xTLX77vJ75 pic.twitter.com/SaQN03P9eK

— Amjad Masad ⠕ (@amasad) March 31, 2023

Link to code


A quick search in Twitter's Recommendation Algorithm for Ukraine. 🇺🇦 topic is on the same list as:
Do not amplify, do not public publish, medical misinformation, NSFW, and violence. What do you think it means? 🤔 pic.twitter.com/PYqm8pZjI4

— Mykhailo (@mxpoliakov) March 31, 2023

Link to code


12:24pm PDT

  • Precise code navigation is now on! Example
  • Cody codebase exploration
  • Using Cody to explore the codebase; it pretty quickly found the search indexer, which handles about half of the tweets


12:14pm PDT: LOC

```

Language            Files    Lines    Blank  Comment     Code

Scala                3007   234531    26038    21493   187000
Java                 1043   135517    19944    18259    97314
Python                152    21817     3561     5681    12575
C++                    51    10614     1630      466     8518
Rust                   30     7360      404      275     6681
Protobuf               90     9456     1484     4514     3458
C/C++ Header           41     2868      482      377     2009
Markdown               63     2136      538        0     1598
SQL                    23     1262       98       82     1082
YAML                    7     1446      376       19     1051
XML                     8     1263      175      190      898
Bourne Shell            9      267       65       29      173
Toml                    4      124        7        3      114
reStructuredText        1      132       36        0       96
CMake                   2      115       21        7       87
INI                     8       76       15       21       40
Docker                  1       34        3        6       25
JSON                    1        5        0        0        5

Total                4541   429023    54877    51422   322724

```


12:04pm PDT: communications

Twitter recommendation source code now available to all on GitHub https://t.co/9ozsyZANwa

— Elon Musk (@elonmusk) March 31, 2023

The real magic of Twitter is in our recommendations algorithm, which powers the hit Tweets you see in your For You timeline. We broke down how it all works here: https://t.co/2s5Hk57JPe

— Twitter Engineering (@TwitterEng) March 31, 2023

Blog post TL;DR (thank you Cody)

  • Twitter is releasing source code for parts of its platform, including its recommendations algorithm
  • The source code is being released on GitHub in two repositories: main repo and ml repo
  • The release aims for maximum transparency while excluding code that could compromise safety/privacy or enable bad actors
  • Training data and model weights for the recommendations algorithm are not being released at this time
  • This is Twitter's first step towards more transparency and they plan to release more code in the future that does not pose significant risks
  • The community is invited to submit GitHub issues and pull requests to suggest improvements to the recommendations algorithm
  • Twitter is working on tools to manage community suggestions and sync changes to internal repositories
  • Security concerns or issues should be reported through Twitter's official bug bounty program on HackerOne
  • Twitter hopes the global community can help identify issues and suggest improvements to lead to a better Twitter
  • Twitter is doing this to increase transparency and build trust with users, customers, and the public



11:50am PDT: code pushed

The code is now live: https://sourcegraph.com/github.com/twitter/the-algorithm

We are digging in!


2:15am PDT: start the countdown

@sqs: A little under 10 hours to go until it's open source. We'll be back closer to 12pm PDT (unless Twitter unexpectedly releases it early, which might happen!). If you want to start exploring the rest of Twitter's open-source code in the meantime, here's a starting point.
