By the time you read this, you'll probably have seen Wikipedia's new layout ("skin"), dubbed "Vector 2022". You can read about the changes it brings.
As with most design changes, some people will like it and some people won't. But me? I just feel sad, because years ago we had a popular, volunteer-driven skin proposal that was shut down by arguments we now know were made in bad faith and were hypocritical.
Back in 2012, then-Wikimedia Foundation senior designer Brandon Harris aka Jorm pitched a new idea: "The Athena Project: being bold",
outlining his vision for what Wikipedia should look like.
During the question-and-answer period, I was asked whether people should think of Athena as a skin, a project, or something else. I responded, "You should think of Athena as a kick in the head" – because that's exactly what it's supposed to be: a radical and bold re-examination of some of our sacred cows when it comes to the interface.
His proposal had some flaws, but it was ambitious and different, and it forced people to think about what the software could be like.
By 2013-2014, focus pivoted to "Winter", an actual prototype that people could play with and conduct user testing on. I was originally unable to find any screenshots or videos of the prototype, but it has since been resurrected and you can play with the original prototype (thanks to Izno for pointing that out). Jorm would leave the WMF in 2015 and it seemed like the project had effectively died.
But later in 2015, Isarra (a volunteer, and a good friend of mine) unexpectedly dropped a mostly functional skin implementing the Winter prototype, named "Timeless". You can
try it yourself on Wikipedia today. (I'll wait.)
By the end of 2016, there was a request for it to be deployed to Wikimedia sites. It underwent a security review, multiple rounds of developers
poking at it, filing bugs and most importantly, fixing those bugs. The first set of French communities volunteered to test Timeless in February and March 2017. Finally in August 2017 it was deployed as an opt-in user preference to
test.wikipedia.org, then iteratively deployed to wikis that requested it in the following weeks before being enabled everywhere in November.
I've been using Timeless ever since, on both my wide monitor and my (relatively) tiny phone, and it works great. I regularly show it to people as a better alternative to the current mobile interface and they're usually blown away. On my desktop, I can't imagine going back to a single-sidebar layout.
In January 2021, I interviewed Jorm for a Signpost story, and asked him about Timeless. He said, "I love Timeless and it absolutely should
replace Vector. Vector is a terrible design and didn't actually solve any of the problems that it was trying to; at best it just swept them under rugs. I think the communities should switch to Timeless immediately."
At the end of 2017, following Timeless being deployed everywhere as opt-in, Isarra applied for a grant to continue supporting and developing
Timeless (I volunteered as one of the advisors). Despite overwhelming public support from community members and WMF staff, it was
rejected for vague reasons that I'm comfortable describing as in bad faith. Eventually she
applied yet again and received approval midway through 2018. This time I provided some of the
"official WMF feedback" publicly. But the constant delays and secret objections took a lot of steam out of the
project.
Despite all of that, people were still enthusiastic about Timeless! In March 2019, the French Wiktionary requested Timeless to become their default skin. This is a much bigger
deal than just allowing it as an opt-in choice, and led to discussion of whether Wikimedia wikis need to have a consistent brand identity, how much extra work developers would need to do to ensure they fully support the
now-two default skins, and so on. You can read the full statement on why the task was declined - I don't disagree with most of it, including the conclusion. If Timeless was going to become the default, it really needed to be the default for everyone.
Of course, this principle of consistency would be thrown out in the 2022 English Wikipedia discussion on whether to switch the default to the new "Vector 2022" skin: had the community voted against it, the English Wikipedia would have been allowed to opt out of the interface everyone else was using.
Had the French Wiktionary been allowed to switch their default to Timeless, it would've continued to get more attention from users and developers, likely leading to more wikis asking for it to become the default.
You can skim through how Vector 2022 came about. Just imagine if even a fraction of those resources had gone
toward moving Timeless forward, backing a volunteer-driven project. It's just sad to think about now.
I started this story with Jorm's op-ed rather than a history of MediaWiki skins because I think he accurately captured that the skin is just a subset of the broader workflows that Wikipedians go through that desperately need
improvement. Unfortunately that focus on workflows has been lost, and it shows: we're all still using the same gadgets for critical workflows that we were 10 years ago. (I won't go into detail on the various Timeless features
that make workflows easier rather than more difficult.)
Vector 2022, coming 12 years after the original Vector, is a rather narrow subset of fixes to the largest problems Vector had (lack of responsiveness, collapsed personal menu, sticky header, etc.). It's just not the bold
change we need. Timeless, far from perfect, was certainly a lot closer.
There's a yearly programming contest called Advent of Code (AoC).
If you haven't heard about it, I'd recommend reading betaveros's post explaining what makes it unique.
This was my third attempt at AoC, previously trying it in 2019 (made it to day 5) and 2021 (day 6). This year I made it to... drumroll ...day 14! I had a good time this year, primarily because a group of friends (read: wiki folks on Mastodon) were doing it
every day, so I'd be motivated to compare my solutions with theirs.
Then on day 15 at midnight I looked at the puzzle and said "nope." and went to sleep.
AoC definitely messed with my sleep schedule: being on EST now, the puzzles came out at midnight rather than at 9 p.m. like they did back in PST. Once I finished each puzzle, it always took a while to calm down from the rush, and by then I was going to sleep at least an hour later than I should have.
But since I was starting as soon as the puzzle came out on most days, the leaderboard accurately reflects how long it took me on those puzzles:
--------Part 1--------- --------Part 2---------
Day Time Rank Score Time Rank Score
14 00:35:44 2411 0 00:40:21 1977 0
13 00:30:11 1920 0 00:38:08 1735 0
12 23:09:41 34803 0 23:24:54 33874 0
11 00:28:01 1435 0 01:01:03 2707 0
10 00:15:40 2657 0 00:27:38 1841 0
9 02:34:24 15092 0 02:56:58 11213 0
8 00:36:38 6896 0 >24h 61768 0
7 00:34:54 2671 0 00:45:38 2924 0
6 00:08:31 5046 0 00:10:01 4555 0
5 00:16:09 1720 0 00:17:34 1375 0
4 00:08:33 3667 0 00:10:10 2539 0
3 14:34:00 82418 0 22:00:31 92084 0
2 14:27:16 100430 0 14:47:19 94770 0
1 17:13:27 112294 0 17:16:09 107095 0
Day 5 was my best performance; I attribute that to the input format requiring a more-complex-than-usual parser, which I sidestepped by cleaning up the input
in my editor first.
I posted a link to each day's solution and some commentary on a Mastodon thread. All of my solutions
are available in a Git repo.
Overall I enjoyed doing the challenges in Rust. I feel that a good number of the puzzles just required basic string/array manipulation, which is faster to do in a dynamically typed language like Python, but there were plenty of times when Rust's match statement (which Python now sort of has...) and sum types came in handy. Specifically with Rust's match statement, the compiler will complain if you don't cover every branch, which helped when e.g. implementing the rock-paper-scissors state machine.
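To illustrate (a minimal sketch, not my actual solution): model the shapes as an enum and write the scoring as exhaustive matches with no wildcard arm, and the compiler will refuse to build if any combination is missing.

enum Shape {
    Rock,
    Paper,
    Scissors,
}

// Score one round: 1/2/3 for the shape we played, plus 0/3/6 for a loss/draw/win.
// Both matches are exhaustive with no wildcard arm, so deleting any arm is a
// compile error rather than a silent bug.
fn score(ours: Shape, theirs: Shape) -> u32 {
    let shape_score = match ours {
        Shape::Rock => 1,
        Shape::Paper => 2,
        Shape::Scissors => 3,
    };
    let outcome_score = match (ours, theirs) {
        // Draws
        (Shape::Rock, Shape::Rock)
        | (Shape::Paper, Shape::Paper)
        | (Shape::Scissors, Shape::Scissors) => 3,
        // Wins
        (Shape::Rock, Shape::Scissors)
        | (Shape::Paper, Shape::Rock)
        | (Shape::Scissors, Shape::Paper) => 6,
        // Losses
        (Shape::Rock, Shape::Paper)
        | (Shape::Paper, Shape::Scissors)
        | (Shape::Scissors, Shape::Rock) => 0,
    };
    shape_score + outcome_score
}

fn main() {
    // Paper beats rock: 2 for the shape + 6 for the win.
    assert_eq!(score(Shape::Paper, Shape::Rock), 8);
    // Scissors vs scissors is a draw: 3 + 3.
    assert_eq!(score(Shape::Scissors, Shape::Scissors), 6);
}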
As far as learning goes, I picked up some CS concepts like Dijkstra's algorithm. I'm not sure I really learned any more Rust, just got more comfortable with the
concepts I already knew and likely faster at applying them. For the past few months I feel like I'm now thinking in Rust, rather than thinking in Python and writing it in Rust.
Past puzzles are available indefinitely, so you can do them whenever you want. I don't plan on finishing the rest; I mostly lost the incentive now that it's no longer a daily thing. But I'll probably try again in December and see
how far I go :-)
I set some goals for myself at the beginning of the year.
Here's how it went:
- Move out of my parents' house.
✔️ I live in New York City now. This was definitely my biggest goal and accomplishment of the year.
- Contribute something meaningful to SecureDrop.
✔️ I think so. I need to write up some of the stuff I worked on in the past year.
- Contribute something meaningful to MediaWiki.
✔️ Slightly more mixed because I contributed a lot less this year than in the past, but I still think I contributed in a meaningful way.
- Not get COVID.
❌ Got it in June :(
- Continue contributing to Mailman.
❌ Didn't really find the motivation this year. I'm hoping to spend more time on this in 2023; Wikimedia's Mailman install is showing that it needs more love.
- Continue working on mwbot-rs,
while having fun and learning more Rust.
✔️ I posted updates on the News page.
- Get more stickers (lack of in-person meetups has really been hurting my
sticker collecting).
✔️ Definitely. New stickers include FIRE, California poppies, Pacific Northwest, Qubes, HOPE 2022, Pinnacles National Park, a corgi, and a very nice yellow and blue bird that says, "There is more power in peace than in violence".
- Port the rest of my wiki bots to Rust.
❌ Still running in good old PHP. I don't really think this is worth it anymore; people are so used to the current bugs that introducing a different set of bugs would be more disruptive than helpful.
- Make progress on moving wiki.debian.org to MediaWiki.
❌ No real progress :(
- Write at least one piece of recognized content (DYK/GA/FA) for Wikipedia.
✔️ I racked up 4 DYKs this year: List of United States Supreme Court leaks (May 2022), Eleanor Bellows Pillsbury (June 2022), 2022 University of California academic workers' strike (Dec. 2022), and Canadian Coalition for Firearm Rights (Dec. 2022). Hitting the DYK threshold feels pretty straightforward now, so I should probably aim for a GA!
- Travel outside the US (COVID-permitting).
❌ I probably had the opportunity, but just didn't feel comfortable because of COVID. Planning at least two international trips in 2023!
- Finish in the top half of our Fantasy Football league and Pick 'em pool. I
did pretty well in 2020 and really regressed in 2021.
❓ Too early to say. Currently doing well in the Pick 'em pool, but not in Fantasy.
- Keep track of TV show reviews/ratings. I've been pretty good about tracking
movies I watch, but don't yet do the same for TV.
❌ Started, but didn't finish.
Publicly publishing a list of goals was nice; every few months I'd re-read the post to see if I was on track or not. But I don't intend to publish my 2023 goals, as I expect they'll be more personal than these were.
Toolforge is a free cloud computing platform designed for and used by the Wikimedia movement to host various tools and bots. One of the coolest parts of using Toolforge
is that you get access to redacted copies of the MediaWiki MySQL database replicas, aka the wiki replicas.
(Note that whenever I say "MySQL" in this post I actually mean "MariaDB".)
In web applications, it's pretty common to use a connection pool, which keeps a set of open connections ready so there's less overhead when a new request comes in. But the wiki replicas
are a shared resource and more importantly the database servers don't have enough connection slots for every tool that uses them to maintain idle connections. To quote from the Toolforge connection handling policy:
Usage of connection pools (maintaining open connections without them being in use), persistent connections, or any kind of connection pattern that maintains several connections open even if they are unused is not permitted on shared MySQL instances (Wiki Replicas and ToolsDB).
The memory and processing power available to the database servers is a finite resource. Each open connection to a database, even if inactive, consumes some of these resources. Given the number of potential users for the Wiki Replicas and ToolsDB, if even a relatively small percentage of users held open idle connections, the server would quickly run out of resources to allow new connections. Please close your connections as soon as you stop using them. Note that connecting interactively and being idle for a few minutes is not an issue—opening dozens of connections and maintaining them automatically open is.
But using a connection pool in code has benefits beyond just keeping idle connections open and ready to go. A connection pool manages the maximum number of open connections, so we can wait for a connection slot to become available rather than showing the user an error that our user's connection limit has already been reached. A pool also allows us to reuse open connections when we know something is waiting for them, instead of closing them. (Both of those are real issues Enterprisey ran into with their new fast-ec tool: T325501, T325511; which caused me to finally investigate this.)
With that in mind, let's set up a connection pool using the mysql_async
crate that doesn't keep any idle connections open. You can pass pool options programmatically using a builder, or as part of the URL connection string. I was already using the connection string method, so that's the direction I went in, since it was trivial to tack more options on.
Here's the annotated Rust code I ended up with, from the toolforge
crate (source code):
impl fmt::Display for DBConnectionInfo {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // pool_min=0 means the connection pool will hold 0 active connections at minimum
        // pool_max=? sets the max number of connections the pool will hold (it should be no
        // more than the max_connections_limit for your user, which defaults to 10)
        // inactive_connection_ttl=0 means inactive connections will be dropped immediately
        // ttl_check_interval=30 means it will check for inactive connections every 30sec
        write!(
            f,
            "mysql://{}:{}@{}:3306/{}?pool_min=0&pool_max={}&inactive_connection_ttl=0&ttl_check_interval=30",
            self.user, self.password, self.host, self.database, self.pool_max
        )
    }
}
In the end, it was pretty simple to configure the pool to immediately close unused connections, while still getting us the other benefits! This was released as part of toolforge 5.3.0.
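For completeness, here's a rough sketch of what using a connection string like that looks like from a tool's perspective. This isn't code from the toolforge crate; the credentials, host and query below are placeholders, with the real values coming from your my.cnf file and the DBConnectionInfo above.

use mysql_async::prelude::*;

#[tokio::main]
async fn main() -> Result<(), mysql_async::Error> {
    // Placeholder connection string; in practice it's generated by DBConnectionInfo.
    let url = "mysql://exampleuser:examplepass@enwiki.analytics.db.svc.wikimedia.cloud:3306/enwiki_p?pool_min=0&pool_max=10&inactive_connection_ttl=0&ttl_check_interval=30";
    let opts = mysql_async::Opts::from_url(url).expect("invalid connection URL");
    let pool = mysql_async::Pool::new(opts);

    // A connection is only opened when something asks for one...
    let mut conn = pool.get_conn().await?;
    let articles: Option<i64> = conn
        .query_first("SELECT COUNT(*) FROM page WHERE page_namespace = 0")
        .await?;
    println!("articles: {:?}", articles);

    // ...and with inactive_connection_ttl=0, a dropped connection is closed
    // instead of sitting idle in the pool.
    drop(conn);
    pool.disconnect().await?;
    Ok(())
}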
This is only half of the solution though, because this pool only works for connecting to a single database server. If your tool wants to support all the Wikimedia wikis, you're out of luck since the wikis are split across 8 different database servers ("slices").
Ideally our pool would automatically open
connections on the correct database server, reusing them when appropriate. For example, the "enwiki" (English Wikipedia) database is on "s1", while "s2" has
"fiwki" (Finnish Wikipedia), "itwiki" (Italian Wikipedia), and a few more. There is a "meta_p" database that contains information about which wiki is on which server:
MariaDB [meta_p]> select dbname, url, slice from wiki where slice != "s3.labsdb" order by rand() limit 10;
+---------------+--------------------------------+-----------+
| dbname | url | slice |
+---------------+--------------------------------+-----------+
| mniwiktionary | https://mni.wiktionary.org | s5.labsdb |
| labswiki | https://wikitech.wikimedia.org | s6.labsdb |
| dewiki | https://de.wikipedia.org | s5.labsdb |
| igwiktionary | https://ig.wiktionary.org | s5.labsdb |
| viwiki | https://vi.wikipedia.org | s7.labsdb |
| cswiki | https://cs.wikipedia.org | s2.labsdb |
| enwiki | https://en.wikipedia.org | s1.labsdb |
| mniwiki | https://mni.wikipedia.org | s5.labsdb |
| wawikisource | https://wa.wikisource.org | s5.labsdb |
| fiwiki | https://fi.wikipedia.org | s2.labsdb |
+---------------+--------------------------------+-----------+
10 rows in set (0.006 sec)
(Most of the wikis are on s3, so I excluded it so we'd actually get some variety.)
Essentially we want 8 different connection pools, and then a way to route a connection request for a database to the server that contains the database. We can get the mapping of database to slice from the meta_p.wiki
table.
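For instance, a small helper along these lines (a hedged sketch; load_mapping is a hypothetical function and not the exact query the toolforge crate runs) could turn that table into a lookup map:

use std::collections::HashMap;
use mysql_async::prelude::*;

// Build a dbname -> slice mapping from meta_p.wiki, e.g. "enwiki" -> "s1".
async fn load_mapping(
    conn: &mut mysql_async::Conn,
) -> Result<HashMap<String, String>, mysql_async::Error> {
    let rows: Vec<(String, String)> = conn
        .query("SELECT dbname, SUBSTRING_INDEX(slice, '.', 1) FROM meta_p.wiki")
        .await?;
    Ok(rows.into_iter().collect())
}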
This is what the new WikiPool
type aims to do (again, in the toolforge
crate).
At construction, it loads the username/password from the
my.cnf file. Then when a new connection is requested, it lazily loads the mapping, opens a connection to the corresponding server, switches to the desired database, and returns the connection.
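To make that concrete, here's a simplified sketch of the idea. This is not the actual WikiPool implementation: the MultiSlicePool name and the host naming are made up, and it takes the mapping and credentials as plain arguments instead of loading them lazily from my.cnf and meta_p.

use std::collections::HashMap;
use mysql_async::prelude::*;
use mysql_async::{Conn, Opts, Pool};

// Hypothetical multi-slice pool: one mysql_async pool per database server,
// plus a dbname -> slice-host mapping used to route connection requests.
struct MultiSlicePool {
    mapping: HashMap<String, String>, // e.g. "enwiki_p" -> "s1.web.db.svc.wikimedia.cloud"
    pools: HashMap<String, Pool>,     // slice host -> pool for that server
    user: String,
    password: String,
}

impl MultiSlicePool {
    fn new(mapping: HashMap<String, String>, user: String, password: String) -> Self {
        Self { mapping, pools: HashMap::new(), user, password }
    }

    // Get a connection to whichever server hosts `dbname`, already switched to it.
    async fn get_conn(&mut self, dbname: &str) -> Result<Conn, mysql_async::Error> {
        let host = self.mapping.get(dbname).expect("unknown wiki").clone();
        if !self.pools.contains_key(&host) {
            // Same "no idle connections" options as before, just without a database in
            // the URL, since this pool is shared by every wiki on the server.
            let url = format!(
                "mysql://{}:{}@{}:3306/?pool_min=0&pool_max=10&inactive_connection_ttl=0&ttl_check_interval=30",
                self.user, self.password, host
            );
            let opts = Opts::from_url(&url).expect("invalid connection URL");
            self.pools.insert(host.clone(), Pool::new(opts));
        }
        let mut conn = self.pools[&host].get_conn().await?;
        // Switch to the requested database (a real implementation would validate the name).
        conn.query_drop(format!("USE {}", dbname)).await?;
        Ok(conn)
    }
}

The routing idea is the same as described above: look up the slice for the requested wiki, reuse (or create) that slice's pool, and hand back a connection already pointed at the right database.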
I've done some limited local testing of this, mostly using ab
to fire off a bunch of concurrent requests and watching SHOW PROCESSLIST
in another tab to observe all connection slots being used with no idle connections
staying open. But it's not at a state where I feel comfortable declaring the API stable, so it's currently behind an unstable-pool
feature, with the understanding that breaking changes may be made in the future, without a
semver major bump. If you don't mind that, please try out toolforge 5.4.0 and provide feedback! T325951 tracks stabilizing this feature.
If this work interests you, the mwbot-rs project is always looking for more contributors; please reach out, either on-wiki or in the #wikimedia-rust:libera.chat
room (Matrix or IRC).
There were two prominent stories this week about how rich and famous people tried to influence Wikipedia's coverage, and depending on your point of view, got their way. I think the coverage of both stories missed the mark
so I'd like to dive into them a bit deeper.
But first, Canada is currently discussing enacting a new gun control law, known as Bill C-21. A prominent ice hockey player, Montreal Canadiens goalie Carey Price, spoke out in opposition to the bill, aligning himself with the
Canadian Coalition for Firearm Rights. At the same time, the CCFR was under fire for creating an online coupon code, "POLY", which people assumed referred to the 1989 École Polytechnique massacre (the group denies this).
If you had wanted to look up the Canadian Coalition for Firearm Rights on Wikipedia prior to December 7, you wouldn't have found anything. You probably wouldn't have learned that in 2019 they asked members to file complaints against a doctor who called for a ban on assault rifles, or that their CEO shot his first firearm in...the United States.
I'm not very in tune with Canadian politics, so it's unclear to me how prominent this group actually is (it doesn't seem to be on the level of the NRA in the US).
But Price put them on the map and now there's a Wikipedia article that will educate people on its history. (It's even been
approved to go on the Main Page, just pending scheduling.) 1 point for rich and famous people influencing Wikipedia's coverage
for the better.
OK, so now onto author Emily St. John Mandel, who is divorced and wanted Wikipedia to not falsely say she was married.
She posted on Twitter,
"Friends, did you know that if you have a Wikipedia page and you get a divorce, the only way to update your Wikipedia is to say you’re divorced in an interview?"
She then did an interview in Slate, where she was specifically asked and answered that she was divorced.
The thing is, that probably wasn't necessary. Yes, Wikipedia strongly prefers independent, reliable sources, as the "Wikipedia:Reliable sources" policy page explains in great detail. But in certain cases, using the person themselves as a source is fine. In the section "Self-published and questionable sources as sources on themselves",
the policy lists 5 criteria that should be met:
- The material is neither unduly self-serving nor an exceptional claim.
- It does not involve claims about third parties (such as people, organizations, or other entities).
- It does not involve claims about events not directly related to the subject.
- There is no reasonable doubt as to its authenticity.
- The Wikipedia article is not based primarily on such sources.
On top of this, Wikipedia has a strict policy regarding biographies of living persons (BLP), that would lend more weight to using the self-published source.
If Mandel had just tweeted, "I'm divorced now.", that would've been fine. In fact, the first person to update her article with a citation about her divorce used her tweet, not the Slate interview!
In the past I've also used people's tweets to remove incorrect information from Wikipedia.
(That said, people do lie about their age, height, etc. So far the worst case I've ever run into was Taio Cruz, who reached the level of sending in a fake birth certificate.
You can read the talk page, it's a giant mess.)
And then there's Elon Musk (sigh), who tweeted about how Wikipedia is biased, right after an "Articles for deletion" discussion was started on the Twitter Files article.
Vice covered it with: "We Are Watching Elon Musk and His Fans Create a Conspiracy Theory About Wikipedia in Real Time".
It goes into good detail about the Wikipedia deletion process, but I don't fully agree with the conclusion that this is how the process is supposed to work, and how it usually works.
I cast a vote in the discussion, stating it was easily notable and an obvious keep. By the time it was closed, the tally was 73 keep votes, 27 delete votes, and 23 merge votes. Wikipedians will tell you that these discussions are not a vote,
rather the conclusion is based on the strength of the arguments. But in this case, I want to focus on the direction of the discussion rather than the final result.
At the time Musk tweeted (Dec 6, 18:46 UTC), the vote count was 12 delete votes, 4 keep votes, 4 merge votes (I should say that I'm relying on Enterprisey's vote-history analysis for these numbers).
The votes post-tweet were 69 keep, 15 delete, 19 merge. That's a pretty big shift!
I would like to think that Wikipedians would have reached the same (and IMO correct) conclusion regarding the existence of the Twitter Files article without Musk's "intervention", but it's hard to say that for sure.
But, as I've hopefully demonstrated, Musk is not alone in trying to influence Wikipedia. Rich and famous people do it all the time, for entirely different goals, and sometimes without even realizing it!
The Wikimedia Foundation, the non-profit that hosts and provides other support for Wikipedia and its sibling projects, has been under fire recently for the messaging it uses in the infamous donation banners and the disconnect with how
that funding is used. These criticisms are not particularly new, but the tension rose to a new level last month with a "Request for comment" on the English Wikipedia on whether the planned fundraising campaign banners were appropriate.
I didn't end up participating in the RfC because it coincided with a heavy travel period for me and I just didn't have time to read through it all. I also don't find arguing about random parts of the WMF's fundraising strategy to be super useful; I think it's all part of a larger picture of how the WMF allocates resources, and whether those goals and projects are in line with what editors want. (There is also the question of whether editors alone should be deciding what the WMF works on, or whether someone needs to speak up for the silent readers. So like I said, much larger picture.) I used to work at the WMF, and I'd like to think that most of the work I did was valuable and that my
compensation was appropriate. A bunch of my
former coworkers and friends still work there and I do think that the work they do is also valuable, and they should be compensated appropriately for it.
Anyways, there is one point I want to make, and that's the title of this post: the best way to support Wikipedia is with your time. Yes, if you give $5 or whatever to the Wikimedia Foundation, it's a reasonable investment
in humanity's collective future...and there are way worse ways to spend $5. But if you give 30 minutes of your time to Wikipedia by contributing to articles, that's worth significantly more than any cash donation!
You can look through the English Wikipedia's backlog for yourself. There are currently 442,000+ articles tagged as needing more references, 98,000+ that need geographic coordinates, etc. This doesn't even
include articles that have fallen out of date and need someone to update them. Over the weekend I was looking up demographics on various U.S. cities and noticed that the majority of articles I looked at were still using 2010 census
data instead of the newer 2020 dataset! It was frustrating.
One major criticism of the fundraising banners tends to be that they say your money goes toward supporting Wikipedia, when it actually goes to a non-profit that does support Wikipedia[1], in addition to doing some other things.
So if you want to be sure your contribution is going directly to Wikipedia, donate your time. You will see firsthand where your efforts go, and it'll be way more valuable than any financial donation.
P.S. Editing Wikipedia can become addicting; you've been warned.
[1] Critics tend to downplay how much money is actually needed to support Wikipedia on a regular basis. And the WMF has done itself no favors by being less and less transparent over the years on what it's up to!