There's a yearly programming contest called Advent of Code (AoC).
If you haven't heard about it, I'd recommend reading betaveros's post explaining what makes it unique.
This was my third attempt at AoC; I previously tried it in 2019 (made it to day 5) and 2021 (day 6). This year I made it to... drumroll ...day 14! I had a good time this year, primarily because a group of friends (read: wiki folks on Mastodon) were doing it
every day, so I was motivated to finish and compare my solution with theirs.
Then on day 15 at midnight I looked at the puzzle and said "nope." and went to sleep.
AoC definitely messed with my sleep schedule, since I'm now on EST and the puzzles start at midnight rather than the 9 p.m. I was used to back in PST. Once I finished each puzzle, it always took a while to calm down from the rush, and by then I was going to sleep at least an hour later than I should have.
But since I was starting as soon as the puzzle came out on most days, the leaderboard accurately reflects how long it took me on those puzzles:
Day 5 was my best performance. I attribute that to the input format requiring a more-complex-than-usual parser, which I sidestepped by cleaning up the input
in my editor first.
I posted a link to each day's solution and some commentary on a Mastodon thread. All of my solutions
are available in a Git repo.
Overall I enjoyed doing the challenges in Rust. I feel that a good number of the puzzles just required basic string/array manipulation, which is faster to
do in a dynamically typed language like Python, but there were plenty of times I felt Rust's match statement (which Python now sort of has...) and sum types
came in handy. Specifically with Rust's match statement, the compiler will complain if you don't cover every case, which helped when e.g. implementing the rock-paper-scissors state machine.
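To illustrate the exhaustiveness check, here is a minimal sketch of rock-paper-scissors scoring. It is not the code from my solution; the `Shape`, `Outcome`, and `play` names are made up for this example.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Shape {
    Rock,
    Paper,
    Scissors,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Win,
    Draw,
    Loss,
}

/// Outcome of one round from the point of view of `me`.
fn play(me: Shape, opponent: Shape) -> Outcome {
    use Shape::*;
    // Matching on the tuple means all nine pairings must be covered.
    // Removing any arm below makes the compiler reject the match as
    // non-exhaustive instead of letting the bug through to runtime.
    match (me, opponent) {
        (Rock, Scissors) | (Paper, Rock) | (Scissors, Paper) => Outcome::Win,
        (Rock, Rock) | (Paper, Paper) | (Scissors, Scissors) => Outcome::Draw,
        (Rock, Paper) | (Paper, Scissors) | (Scissors, Rock) => Outcome::Loss,
    }
}

fn main() {
    println!("{:?}", play(Shape::Paper, Shape::Rock));
}
```

There is deliberately no `_ =>` catch-all arm: adding one would silence the compiler and give up the check that makes this useful.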
As far as learning goes, I picked up some CS concepts like Dijkstra's algorithm. I'm not sure I really learned any more Rust, just got more comfortable with the
concepts I already knew and likely faster at applying them. For the past few months I feel like I'm now thinking in Rust, rather than thinking in Python and writing it in Rust.
Past puzzles are available indefinitely, so you can do them whenever you want. I don't plan on finishing the rest; I mostly lost the incentive now that it's no longer a daily thing. But I'll probably try again in December and see
how far I go :-)
I set some goals for myself at the beginning of the year.
Here's how it went:
Move out of my parents' house. ✔️ I live in New York City now. This was definitely my biggest goal and accomplishment of the year.
Contribute something meaningful to SecureDrop. ✔️ I think so. I need to write up some of the stuff I worked on in the past year.
Contribute something meaningful to MediaWiki. ✔️ Slightly more mixed because I contributed a lot less this year than in the past, but I still consider myself having contributed in a meaningful way.
Not get COVID. ❌ Got it in June :(
Continue contributing to Mailman. ❌ Didn't really find the motivation this year. I'm hoping to spend more time on this in 2023, Wikimedia's Mailman install is showing that it needs more love.
Continue working on mwbot-rs,
while having fun and learning more Rust. ✔️ I posted updates on the News page.
Get more stickers (lack of in-person meetups has really been hurting my
sticker collecting). ✔️ Definitely. New stickers include FIRE, California poppies, Pacific Northwest, Qubes, HOPE 2022, Pinnacles National Park, a corgi, and a very nice yellow and blue bird that says, "There is more power in peace than in violence".
Port the rest of my wiki bots to Rust. ❌ Still running in good old PHP. I don't really think this is worth it anymore, people are too used to the current bugs that introducing a different set of bugs would be more disruptive than helpful.
Travel outside the US (COVID-permitting). ❌ I probably had the opportunity, but just didn't feel comfortable because of COVID. Planning at least two international trips in 2023!
Finish in the top half of our Fantasy Football league and Pick 'em pool. I
did pretty well in 2020 and really regressed in 2021. ❓ Too early to say. Currently doing well in the Pick 'em pool, but not in Fantasy.
Keep track of TV show reviews/ratings. I've been pretty good about tracking
movies I watch, but don't yet do the same for TV. ❌ Started, but didn't finish.
Publishing a list of goals publicly was nice; every few months I'd re-read the post to see if I was on track or not. But I don't intend to publish my 2023 goals, since I expect they'll be more personal than these were.
Toolforge is a free cloud computing platform designed for and used by the Wikimedia movement to host various tools and bots. One of the coolest parts of using Toolforge
is that you get access to redacted copies of the MediaWiki MySQL database replicas, aka the wiki replicas.
(Note that whenever I say "MySQL" in this post I actually mean "MariaDB".)
In web applications, it's pretty common to use a connection pool, which keeps a set of open connections ready so there's less overhead when a new request comes in. But the wiki replicas
are a shared resource and more importantly the database servers don't have enough connection slots for every tool that uses them to maintain idle connections. To quote from the Toolforge connection handling policy:
Usage of connection pools (maintaining open connections without them being in use), persistent connections, or any kind of connection pattern that maintains several connections open even if they are unused is not permitted on shared MySQL instances (Wiki Replicas and ToolsDB).
The memory and processing power available to the database servers is a finite resource. Each open connection to a database, even if inactive, consumes some of these resources. Given the number of potential users for the Wiki Replicas and ToolsDB, if even a relatively small percentage of users held open idle connections, the server would quickly run out of resources to allow new connections. Please close your connections as soon as you stop using them. Note that connecting interactively and being idle for a few minutes is not an issue—opening dozens of connections and maintaining them automatically open is.
But using a connection pool in code has benefits beyond just having idle connections open and ready to go. A connection pool manages the max number of open connections, so we can wait for a connection slot to be available rather
than showing the user an error that the connection limit for our user has already been reached. A pool also allows us to reuse open connections, instead of closing them, if we know something is waiting for them. (Both of those are
real issues Enterprisey ran into with their new fast-ec tool, T325501 and T325511, which caused
me to finally investigate this.)
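To make the "wait for a slot instead of erroring" idea concrete, here is a std-only sketch. It is not how mysql_async implements its pool; `SlotLimiter`, `Slot`, and `peak_usage` are hypothetical names, and a `Slot` merely stands in for an open connection.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Caps how many "connections" may be open at once (hypothetical type).
struct SlotLimiter {
    in_use: Mutex<usize>,
    freed: Condvar,
    max: usize,
}

/// Holding a `Slot` stands in for holding an open connection.
struct Slot {
    limiter: Arc<SlotLimiter>,
}

impl SlotLimiter {
    fn new(max: usize) -> Arc<Self> {
        Arc::new(SlotLimiter { in_use: Mutex::new(0), freed: Condvar::new(), max })
    }

    /// Block until a slot is free rather than failing immediately.
    fn acquire(limiter: &Arc<SlotLimiter>) -> Slot {
        let mut in_use = limiter.in_use.lock().unwrap();
        while *in_use >= limiter.max {
            in_use = limiter.freed.wait(in_use).unwrap();
        }
        *in_use += 1;
        Slot { limiter: Arc::clone(limiter) }
    }
}

impl Drop for Slot {
    fn drop(&mut self) {
        // Give the slot back right away and wake one waiter, if any.
        *self.limiter.in_use.lock().unwrap() -= 1;
        self.limiter.freed.notify_one();
    }
}

/// Run `workers` threads against a limiter of size `max` and report the
/// highest number of slots observed in use at the same time.
fn peak_usage(max: usize, workers: usize) -> usize {
    let limiter = SlotLimiter::new(max);
    let peak = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let limiter = Arc::clone(&limiter);
            let peak = Arc::clone(&peak);
            thread::spawn(move || {
                let _slot = SlotLimiter::acquire(&limiter);
                let current = *limiter.in_use.lock().unwrap();
                {
                    let mut best = peak.lock().unwrap();
                    *best = (*best).max(current);
                }
                // Pretend to run a query while holding the slot.
                thread::sleep(Duration::from_millis(5));
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *peak.lock().unwrap();
    result
}

fn main() {
    println!("peak slots in use: {}", peak_usage(2, 6));
}
```

Six workers compete for two slots here; the extra four simply wait their turn, which is the behavior we want from the pool instead of a "too many connections" error.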
With that in mind, let's set up a connection pool using the mysql_async crate that doesn't keep any idle connections open. You can pass pool options programmatically using a builder, or as part of the
URL connection string. I was already using the connection string method, so that's the direction I went in because it was trivial to tack more options on.
    impl fmt::Display for DBConnectionInfo {
        fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
            // pool_min=0 means the connection pool will hold 0 active connections at minimum
            // pool_max=? means the max number of connections the pool will hold (should be no more than
            // the max_connections_limit for your user (default 10))
            // inactive_connection_ttl=0 means inactive connections will be dropped immediately
            // ttl_check_interval=30 means it will check for inactive connections every 30sec
            write!(
                f,
                "mysql://{}:{}@{}:3306/{}?pool_min=0&pool_max={}&inactive_connection_ttl=0&ttl_check_interval=30",
                self.user, self.password, self.host, self.database, self.pool_max
            )
        }
    }
In the end, it was pretty simple to configure the pool to immediately close unused connections, while still getting us the other benefits! This was released as part of toolforge 5.3.0.
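For a quick look at the connection string this produces, here is a self-contained sketch. The struct's field names come from the `Display` impl above, but the field types are my assumption, and the credentials, host, and `example_info` helper are placeholders made up for the example.

```rust
use std::fmt;

// Field types are assumptions; only the field names appear in the impl.
struct DBConnectionInfo {
    user: String,
    password: String,
    host: String,
    database: String,
    pool_max: usize,
}

impl fmt::Display for DBConnectionInfo {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(
            f,
            "mysql://{}:{}@{}:3306/{}?pool_min=0&pool_max={}&inactive_connection_ttl=0&ttl_check_interval=30",
            self.user, self.password, self.host, self.database, self.pool_max
        )
    }
}

/// Placeholder values, not real Toolforge credentials or hostnames.
fn example_info() -> DBConnectionInfo {
    DBConnectionInfo {
        user: "tool_user".to_string(),
        password: "hunter2".to_string(),
        host: "db.example".to_string(),
        database: "enwiki_p".to_string(),
        pool_max: 10,
    }
}

fn main() {
    // Display gives us to_string() for free, so the URL can be handed
    // to whatever accepts a connection string.
    println!("{}", example_info());
}
```

One caveat with this approach: the values are interpolated as-is, so a password containing characters like `@` or `/` would presumably need percent-encoding before it goes into the URL.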
This is only half of the solution though, because this pool only works for connecting to a single database server. If your tool wants to support all the Wikimedia wikis, you're out of luck since the wikis are split across 8 different database servers ("slices").
Ideally our pool would automatically open
connections on the correct database server, reusing them when appropriate. For example, the "enwiki" (English Wikipedia) database is on "s1", while "s2" has
"fiwki" (Finnish Wikipedia), "itwiki" (Italian Wikipedia), and a few more. There is a "meta_p" database that contains information about which wiki is on which server:
(Most of the wikis are on s3, so I excluded it so we'd actually get some variety.)
Essentially we want 8 different connection pools, and then a way to route a connection request for a database to the server that contains the database. We can get the mapping of database to slice from the meta_p.wiki table.
This is what the new WikiPool type aims to do (again, in the toolforge crate).
At construction, it loads the username/password from the
my.cnf file. Then when a new connection is requested, it lazily loads the mapping, and opens a connection to the corresponding server, switches to the desired database and returns the connection.
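The routing half of that idea can be sketched with just the standard library. This is not the actual `WikiPool` implementation: `SliceRouter` and `sample_mapping` are hypothetical, and the loader function stands in for the real query against meta_p.wiki. The sample data reuses the examples from above (enwiki on s1, fiwki and itwiki on s2).

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Hypothetical stand-in for the routing part of a multi-server pool.
struct SliceRouter {
    /// Stand-in for querying meta_p.wiki; only called on first use.
    load: fn() -> HashMap<String, String>,
    /// Database name -> slice, filled in lazily.
    mapping: OnceLock<HashMap<String, String>>,
}

impl SliceRouter {
    fn new(load: fn() -> HashMap<String, String>) -> Self {
        SliceRouter { load, mapping: OnceLock::new() }
    }

    /// Return the slice hosting `dbname`, loading the mapping if needed.
    fn slice_for(&self, dbname: &str) -> Option<&str> {
        self.mapping
            .get_or_init(self.load)
            .get(dbname)
            .map(String::as_str)
    }
}

/// Sample data standing in for the result of the meta_p.wiki query.
fn sample_mapping() -> HashMap<String, String> {
    [("enwiki", "s1"), ("fiwki", "s2"), ("itwiki", "s2")]
        .into_iter()
        .map(|(db, slice)| (db.to_string(), slice.to_string()))
        .collect()
}

fn main() {
    let router = SliceRouter::new(sample_mapping);
    for db in ["enwiki", "fiwki", "nosuchwiki"] {
        println!("{} -> {:?}", db, router.slice_for(db));
    }
}
```

With the slice name in hand, the remaining step would be to pick (or lazily create) the pool for that slice, take a connection from it, and switch to the requested database, so two wikis on the same slice can share connections.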
I've done some limited local testing of this, mostly using ab to fire off a bunch of concurrent requests and watching SHOW PROCESSLIST in another tab to observe all connection slots being used with no idle connections
staying open. But it's not in a state where I feel comfortable declaring the API stable, so it's currently behind an unstable-pool feature, with the understanding that breaking changes may be made in the future, without a
semver major bump. If you don't mind that, please try out toolforge 5.4.0 and provide feedback! T325951 tracks stabilizing this feature.
If this work interests you, the mwbot-rs project is always looking for more contributors. Please reach out, either on-wiki or in the #wikimedia-rust:libera.chat room (Matrix or IRC).
There were two prominent stories this week about how rich and famous people tried to influence Wikipedia's coverage, and depending on your point of view, got their way. I think the coverage of both stories missed the mark
so I'd like to dive into them a bit deeper.
But first, Canada is currently discussing enacting a new gun control law, known as Bill C-21. A prominent ice hockey player, Montreal Canadiens goalie Carey Price, spoke out in opposition to the bill, aligning himself with the
Canadian Coalition for Firearm Rights. At the same time the CCFR was under fire for creating an online coupon code, "POLY", which people assumed referred to the 1989 École Polytechnique massacre (the group denies this).
If you had wanted to look up the Canadian Coalition for Firearm Rights on Wikipedia prior to December 7, you wouldn't have found anything. You probably wouldn't have learned that in 2019 they asked members to file complaints against a doctor who called for a ban on assault rifles, or that their CEO shot his first firearm in...the United States.
I'm not very in tune with Canadian politics, so it's unclear to me how prominent this group actually is (it doesn't seem to be on the level of the NRA in the US).
But Price put them on the map and now there's a Wikipedia article that will educate people on its history. (It's even been
approved to go on the Main Page, just pending scheduling.) 1 point for rich and famous people influencing Wikipedia's coverage
for the better.
OK, so now onto author Emily St. John Mandel, who is divorced and wanted Wikipedia to not falsely say she was married.
She posted on Twitter,
"Friends, did you know that if you have a Wikipedia page and you get a divorce, the only way to update your Wikipedia is to say you’re divorced in an interview?"
She then did an interview in Slate, where she was specifically asked and answered that she was divorced.
The thing is, that probably wasn't necessary. Yes, Wikipedia strongly prefers independent, reliable sources as the "Wikipedia:Reliable sources" policy page goes into great detail
about. But in certain cases, using the person themselves as a source is fine. In the section "Self-published and questionable sources as sources on themselves",
the policy lists 5 criteria that should be met:
The material is neither unduly self-serving nor an exceptional claim.
It does not involve claims about third parties (such as people, organizations, or other entities).
It does not involve claims about events not directly related to the subject.
There is no reasonable doubt as to its authenticity.
The Wikipedia article is not based primarily on such sources.
On top of this, Wikipedia has a strict policy regarding biographies of living persons (BLP), which would lend more weight to using the self-published source.
If Mandel had just tweeted, "I'm divorced now.", that would've been fine. In fact, the first person to update her article with a citation about her divorce used her tweet, not the Slate interview!
In the past I've also used people's tweets to remove incorrect information from Wikipedia.
(That said, people do lie about their age, height, etc. So far the worst case I've ever run into was Taio Cruz, who reached the level of sending in a fake birth certificate.
You can read the talk page, it's a giant mess.)
And then there's Elon Musk (sigh), who tweeted about how Wikipedia is biased, right after an "Articles for deletion" discussion was started on the Twitter Files article.
I cast a vote in the discussion, stating it was easily notable and an obvious keep. By the time it was closed, the tally was 73 keep votes, 27 delete votes, and 23 merge votes. Wikipedians will tell you that these discussions are not a vote,
rather the conclusion is based on the strength of the arguments. But in this case, I want to focus on the direction of the discussion rather than the final result.
At the time Musk tweeted (Dec 6, 18:46 UTC), the vote count was 12 delete votes, 4 keep votes, 4 merge votes (I should say that I'm relying on Enterprisey's vote-history analysis for these numbers).
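The votes post-tweet were 69 keep, 15 delete, 19 merge. That's a pretty big shift: keep went from 20% of the votes cast before the tweet (4 of 20) to 67% of those cast after it (69 of 103)!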
I would like to think that Wikipedians would have reached the same (and IMO correct) conclusion regarding the existence of the Twitter Files article without Musk's "intervention", but it's hard to say that for sure.
But, as I've hopefully demonstrated, Musk is not alone in trying to influence Wikipedia. Rich and famous people do it all the time, for entirely different goals, and sometimes without even realizing it!
The Wikimedia Foundation, the non-profit that hosts and provides other support for Wikipedia and its sibling projects, has been under fire recently for the messaging it uses in the infamous donation banners and the disconnect with how
that funding is used. These criticisms are not particularly new, but the tension rose to a new level last month with a "Request for comment" on the English Wikipedia on whether the planned fundraising campaign banners were appropriate.
I didn't end up participating in the RfC because it coincided with a heavy travel period for me and I just didn't have time to read through it all. I also don't find arguing about random parts of the WMF's fundraising strategy to
be super useful; I think it's all part of a larger picture of how the WMF allocates resources, and whether those goals and projects are in line with what editors want. (There is also the question of whether editors alone should be
deciding what the WMF works on, or whether someone needs to speak up for the silent readers. So like I said, much larger picture.) I used to work at the WMF, and I'd like to think that most of the work I did was valuable and that my
compensation was appropriate. A bunch of my
former coworkers and friends still work there and I do think that the work they do is also valuable, and they should be compensated appropriately for it.
Anyways, there is one point I want to make, and that's the title of this post: the best way to support Wikipedia is with your time. Yes, if you give $5 or whatever to the Wikimedia Foundation, it's a reasonable investment
in humanity's collective future...and there are way worse ways to spend $5. But if you give 30 minutes of your time to Wikipedia by contributing to articles, that's worth significantly more than any cash donation!
You can look through the English Wikipedia's backlog for yourself. There are currently 442,000+ articles tagged as needing more references, 98,000+ that need geographic coordinates, etc. This doesn't even
include articles that have fallen out of date and need someone to update them. Over the weekend I was looking up demographics on various U.S. cities and noticed that the majority of articles I looked at were still using 2010 census
data instead of the newer 2020 dataset! It was frustrating.
One major criticism of the fundraising banners tends to be that they say your money is going to supporting Wikipedia, when it's actually going to a non-profit that does support Wikipedia[1], in addition to doing some other things.
So if you want to be sure your contribution is going directly to Wikipedia, donate your time. You will see firsthand where your efforts go, and it'll be way more valuable than any financial donation.
P.S. Editing Wikipedia can become addicting; you've been warned.
[1] Critics tend to downplay how much money is actually needed to support Wikipedia on a regular basis. And the WMF has done itself no favors by being less and less transparent over the years on what it's up to!
The majority of posts on Mastodon right now are about how to get started, discussions about various features or making fun of the dumpster fire that is the birdsite. This is mostly unavoidable
as new people sign up, but I've tried to keep using Mastodon as an actual social network by not posting about "meta" things unless necessary. That said, I have enough thoughts that I should say something, so here it is.
I first wrote about Mastodon in February 2020 in "Twitter's tools help online harassers" (I was probably one of the first people to ever get their Mastodon handle in newsprint!),
examining the flaws of Twitter that Mastodon has attempted to fix. I think this framing, "a better Twitter", is a good first introduction but misses the bigger picture.
In 2005, Wikipedia co-founder Jimmy Wales gave a talk in which he outlined 10 areas for us to free.
(His slide concludes by promoting Wikicities, later renamed Wikia, later renamed Fandom. Given that communities had to escape Wikia, I'd say that didn't
end up freeing them. A topic for another day...)
I've been unable to find a working video of his talk, but the general point is clear: people should be in charge of their communities, not companies. We should dictate the terms of who we include and
exclude, what we find acceptable for people to say in our spaces, and most importantly, how we make those decisions. We shouldn't need to report trolls to opaque content moderators who can't make correct decisions because
they lack sufficient context; we should just boot them ourselves. It's incredibly empowering to be in communities that have agency to make these decisions for themselves.
(Tangent: this is a good time to plug Mako's 2018 LibrePlanet keynote, "How markets coopted free software’s most powerful weapon", discussing how companies monetized "peer production"
features. In this case Twitter is monetizing our posts, thoughts, experiences, commentary, etc., relying on the masses for content and curation.)
Mastodon kind of gets us to running our own communities, though it's far from perfect. I think it's a much better representation of how online communities have historically worked: you have groups of people who are tied together by
some common interest (some project, geographic location, etc.) but have open doors so you can easily be in multiple communities at the same time. Shoving everyone into one space... I don't think it really worked out that well.
It will take some time for people to unlearn the bad habits that Twitter continually reinforced. There's some meta discussion happening on how journalists should engage on Mastodon (some instances have already started blocking the new
journa.host). I attribute this friction to the switch from using social media to drive up engagement to Mastodon's established culture of actually engaging with people! I think it's entirely doable;
in the past I reported on Googleville developments, Elsevier negotiations, and a bunch of other things on Mastodon without getting a single complaint.
The hardest part of Mastodon is finding the "right" server (read: community) to join. There's probably a good chance the server doesn't even exist yet! Given that you hit this problem as you try to sign up and don't actually know
how anything works yet, the UX is baaaaaad. (No, I don't have any proposals to fix this, I just think it's important to acknowledge that this is a significant hurdle to onboarding new people.)
Like most other community-based projects, I expect the UX will improve gradually over time through careful refinement and feedback from a large and diverse group of users. Getting through the poor UX now is merely an investment in
the future. Many servers have also been struggling with the rapid increase in people signing up and posting, so some performance/scaling improvements are hopefully in order.
Is Mastodon ready for the masses? Probably not yet, but now is a great time to try.
Techdirt covered how Twitter previously had a very strong free speech stance, especially when
it came to protecting users' anonymity. One of the downsides of having small community-run instances is that they have much less legal infrastructure and protection. How many Mastodon server administrators would have simply given in
when faced with state demands for private user data? Or been able to assemble a legal team to put up a winning defense?
I sometimes forget how ingrained Twitter is in our current society and infrastructure. I went to look up the Caltrain timetable yesterday and to get service alerts if a train will be more than 5 minutes late you have to check
@CaltrainAlerts on Twitter. Or to get updates on whether you should evacuate because of a fire, you check Twitter.
Because of its federated nature, I don't think Mastodon can (currently) replace something that's so dependent on real-time updates. And I doubt most organizations/sites that are currently using Twitter can implement their own website
or app or whatever to provide instant notifications in a manner that is as usable as Twitter.
Back to Mastodon
I'm very excited to see where Mastodon goes next. More than the software, I have thrived in free communities for years now and hope even more people can experience the liberation that comes from joining one.
I'm putting my money and time
where my mouth is by co-adminning a Mastodon server for wiki enthusiasts. We're growing rather slowly (about 1 new account per day), which I hope will help build a real community instead
of just importing one from somewhere else. If you need help, contact me by whatever means we normally use; I'm very happy to help.