MediaWiki packages for Ubuntu 20.04 Focal available

Packages for the MediaWiki 1.31 LTS release are now available for the new Ubuntu 20.04 LTS "Focal Fossa" release in my PPA. Please let me know if you run into any errors or issues.

In the future these packages will be upgraded to the MediaWiki 1.35 LTS release, whenever that's ready. It's currently delayed because of the pandemic, but I expect that it'll be ready for the next Debian release.


Inside Scoop - Breaking down our award-winning special issue

Inside Scoop is a column about the operation of the Spartan Daily, San Jose State's student newspaper.

Each semester the Spartan Daily staff puts out a special issue that focuses on a specific topic or concept. It allows for an in-depth exploration of something that the San Jose State community is interested in. Previous topics have included race, gender and home. The special issue normally comes toward the end of the semester, but we thought we'd do things a bit differently.

Rather than doing one special issue, we planned on doing three. The first of them is what I want to break down today: Fighting 'fake news'. It won second place for best special issue at the 2020 ACP San Francisco convention's Best of Show competition (here's what won first place), two steps up from our fourth-place finish for Heroic in 2019.

You can download a PDF copy of the issue to follow along.

Fighting 'fake news'

The way we picked the special topics went differently this time around, and I think for the better. In the past topics were chosen during the semester based on what was going on as well as having the staff vote on it. Sometimes this worked, but I felt that if we spent more time selecting the topics, we could do a better job ensuring the content was cohesive and complete without significant duplication.

When we picked the topic over the summer, we thought we had the title too: Combatting 'fake news'. The word "combatting" is tricky though, as it can also be spelled "combating". I definitely did not feel comfortable using a word in the title that people could (incorrectly) interpret as misspelled. Misspellings, or even the perception of one, destroy our credibility ("if you can't even get the spelling right, how can we trust you to get the facts right?"). So we switched the title to: Fighting 'fake news'.

The title takes a clear stance - "fake news" is bad and we need to fight it. There is an expectation that journalists will objectively cover their subjects, trying to stay as neutral as possible. But I think anything that threatens our mission of informing our audiences (restricting our First Amendment rights is a common example) is appropriate for a newspaper to take a stance on, even outside of the opinion/editorial pages.

The second problem with the title of the issue was whether to put "fake news" in quotes or not. We decided that upon first reference we would put it in quotes to indicate that we were referring to it as the term rather than literal fake news. Later references wouldn't use quotes because by then the reader should be able to understand what the meaning/intent is.

That leads us into the first story - what exactly does "fake news" mean?

Misinformation, page 2

I'm not sure how it happened, but John ended up writing the intro/lead story for two of the three special papers in addition to the lead story in our soccer season preview. And like usual, he did a pretty great job with balancing his own research with what SJSU professors (our experts) told him. Typically we have a blanket prohibition on interviewing journalism professors/students because it's "incestuous" (as said by Professor Craig) to be interviewing ourselves, but because journalism professors are subject-matter experts in this case, writers were given an exception this one time.

This was also the time we fully realized that John was designing his infographics in InDesign rather than Illustrator/Photoshop. I really like how the timeline visualizes all of the events covered in the story, starting with Gutenberg's invention of the printing press. We struggled a bit with how to represent two different timescales, as a significant number of events happened very recently. Eventually we settled on the zig-zag line that you see in graphs to indicate a shift.

My one regret is that the quote, "Honestly, it is a fire hose of shit," didn't make it into a pull quote.

ICE, page 2

The next story about false ICE rumors spreading over Twitter was easily the most important story in this entire issue. As seen in the screenshot (or on Twitter itself), people falsely claimed on Twitter that ICE was on the SJSU campus, obviously scaring undocumented students. It's trivial to find other tweets that falsely claim this too.

There's no better case study for "fake news" than something concerning our own campus. Given that the Spartan Daily already covered the incident when the rumors first spread, Vicente's story retells the tale from a different angle: focusing on how the university fought the "fake news". This is the theme that all the remaining stories keep in mind - how do we as individuals and as a society fight "fake news". I do wish that we could have had an interview with someone who actually spread the rumors on social media though.

Echo chambers, page 3

Echo chambers seem rather relevant to SJSU as most students lean progressive/liberal/left/etc. but there are definitely some people holding conservative/right/etc. views on campus. After defining a jargon-y term, Christian included insight from students on how they obtain their news, really priming the reader for the next section on Pages 4 & 5.

Melody's illustration was absolutely beautiful...and then we couldn't afford to run it in color. :( I also think the rest of the page ends up becoming a sea of gray - it would have benefited from a pull quote to break up the solid columns of text. That would have also forced us to shorten the story a bit and maybe make it slightly more concise.

Where do students get their news? pages 4 & 5

These two pages discussing where students get their news from are a beautiful mess that did not turn out the way any of us planned. The original idea was that we would survey students on where they get their news (e.g. CNN, NY Times, BuzzFeed News), and provide a ranking and analysis (e.g. AllSides media bias ratings) for each source. But that's not at all what happened. Most students returned the survey with the platforms (Twitter, Facebook, Instagram, etc.) they got their news from rather than the actual sources.

The resulting dataset was messy and overlapping, calling for one type of chart: a Venn diagram. Chelsea and Ed did a great job putting it together that night as people were still returning survey slips. Some really fantastic things can come out of being forced to work under pressure, but we should have had a better plan for this. We should have started the survey earlier and had a backup plan for what to do with that space if the data didn't come out the way we wanted.

The second half answered the natural follow-up question: does it actually matter? It was kind of a leading question, so unsurprisingly everyone gave answers favorable to the point we were trying to push. We also messed up the layout: the headline with the question should have been above the speech bubbles.

The main takeaway I had was that if you run a free-response survey, you should expect the responses to be all over the place. Also, run a controlled test of the questions with friends before giving the survey to everyone.

Guide, page 6

The idea was that our guide would be the "if you take one thing away from this issue, it should be this." I think the content did that, but the framing/location in the issue wasn't ideal even though it's a logical follow-up from the previous infographic content. The infographic is unsigned as if it's an editorial statement, but it wasn't something the entire board worked on; just 3-4 editors did. Also, putting the main takeaway on Page 6 probably wasn't that great. I think it would have worked better on the back page (Page 8) with the editorial. We would have needed majority approval from the editorial board on the concept/text, but I think it would have been a good idea regardless.

Sports, page 6

The evolution of this story was unexpected but great. Going into it, Brendan was going to reach out to coaches/players at SJSU as well as professional journalists to capture the story from both sides of reporting. We expected that it would be straightforward for him to get interviews at SJSU because we already have a good relationship with our athletics media relations team and a bit more difficult to get a hold of professional sports journalists ... but the exact opposite happened. CJ was able to put him in touch with multiple fantastic sources while we were struggling to get good quotes from SJSU coaches. Coincidentally we visited The Mercury News on Monday (the issue came out on Thursday) and while we took a tour around the building, a few of us hung back to interview two of their sports reporters.

Our biggest failing in this story was the lack of good artwork. The infographic was a last-minute thing because we needed some graphical element for the story, but at least to me it really looks like an afterthought. Had we moved the guide to another page, this story would have gotten an entire page and I think we could have done some kind of cool photo illustration with Durant and headlines.

Deepfakes, page 7

Perfection is elusive. One of the hardest lessons for me to learn as executive editor has been that no matter how many checks or reviews we do, some things will just slip by everyone.

Somehow a faint gray border was accidentally set on the text box of Olivia's story, which pushed everything down a line so her story did not finish. It was fixed for the digital PDF copy, so I took a picture of the physical print copy to show it.

It's not productive for us to blame any specific person given the number of eyes that went over it and didn't notice (also blaming people in general is a terrible practice). But it's still frustrating enough that I don't really have any comments on the story itself. For a later special paper I had one of our editors go through all of the pages again before we sent them to the printer to double check stuff like this.

Satire, page 7

This story was not planned. I don't remember where we went wrong, but when we were laying out which story was going to go on each page, we ended up short one story in the opinion section. While normally I would have preferred a staff writer to fill in the gap, because we were very short on time Jonathan, the opinion editor, wrote the story himself. Plus he likes satire so it was a great fit.

I would also like everyone to know that there is a Wikipedia article titled 'No Way To Prevent This,' Says Only Nation Where This Regularly Happens, about the recurring Onion stories of the same name.

Editorial, page 8

Before I get into discussing this page, I want to share an excerpt that explains my position on editorials from one of my favorite books, The Landry News.

Editorials are the heart of a newspaper, from The Landry News

(Maybe another time I'll write about how much I love editorials.)

While most of the special issue addresses how consumers of news can do a better job avoiding "fake news", I think we tackle the hard issue that many journalists may not want to admit - the news media is also culpable in the rise of "fake news". As we explain how our correction policy works, I hope that came across as the heart of the Spartan Daily - as student journalists we do our best to get it right the first time around, but we will always publicly admit our mistakes and rectify them with the aim of doing a better job the next time.

For laying out the page, Melody really came in clutch and did a great job. We were expecting to have a half-page ad and then it fell through rather late in the day. Melody stepped up and had the attitude of "I get to draw something how big?" And finishing the page with the one correction we had to run that day just sealed the deal.


Overall, I do wonder how much this issue resonated with our primary audience, San Jose State students. I think we did well among one of our secondary audiences, SJSU professors, but that doesn't surprise me given this is a topic many of them are already interested in.

We tried to push the principle of doing things early and planning to give a better result and I think it mostly worked. Nearly all of the content finished going through the entire editing process by the weekend, which really helped Marci put together skeletons/templates of each page for editors to just paste in the final content. This allowed us to spend more time making minor tweaks to get the details right rather than having to just settle for the basics (of course, some stuff still slipped through - "founding founders" still gets to me).

One oddity of this paper was that there was only a single image in the entire issue (the teaser photo of Kevin Durant on the bottom right of Page 1). "Fake news" isn't really a photograph-able subject I suppose.

Because of the name of this issue, it's frequently referred to as "the fake news issue" (even I'm totally guilty of this), which is going to be a great way to be remembered in Spartan Daily history: "Yep, that was the semester they put out a fake news issue."

Thanks to Victoria for reviewing and editing this post before publication.


Fixing npm security issues immediately in MediaWiki projects

For the past 5ish years, I've been working on a project called libraryupgrader (LibUp for short) to semi-automatically upgrade dependency libraries in the 900+ MediaWiki extension and related git repositories. For those that use GitHub, it's similar to the new Dependabot tool, except LibUp is free software.

One cool feature that I want to highlight is how we are able to fix npm security issues in generally under 24 hours across all repositories with little to no human intervention. The first time this feature came into use was to roll out the eslint RCE fix (example commit).

This functionality is all built around the npm audit command that was introduced in npm 6. It has a JSON output mode, which made it straightforward to create a npm vulnerability dashboard for all of the repositories we track.
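A dashboard like that can be driven entirely from that JSON output. Here's a minimal sketch in Python of what summarizing it might look like, using a trimmed-down sample of the npm 6 `advisories` schema (the embedded JSON is illustrative, not a real audit response):

```python
import json

# Trimmed-down sample of `npm audit --json` output in the npm 6 format.
# A real response contains many more fields (findings, paths, etc.).
audit_output = """
{
  "advisories": {
    "1179": {
      "module_name": "minimist",
      "severity": "low",
      "title": "Prototype Pollution",
      "cves": ["CVE-2020-7598"]
    }
  }
}
"""

def summarize_advisories(raw):
    """Collect (module, severity, title) tuples for a vulnerability dashboard."""
    data = json.loads(raw)
    return [
        (adv["module_name"], adv["severity"], adv["title"])
        for adv in data.get("advisories", {}).values()
    ]

print(summarize_advisories(audit_output))
# [('minimist', 'low', 'Prototype Pollution')]
```

In practice you'd feed in the captured output of `npm audit --json` for each repository and aggregate the results across all of them.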

The magic happens in the npm audit fix command, which automatically applies semver-safe upgrades. The one thing I'm not super happy about is that we're basically blindly trusting the response given to us by the npm server, but I'm not aware of any free software alternative.

LibUp then writes a commit message by mostly analyzing the diff, fixes up some changes since we tend to pin dependencies and then pushes the commit to Gerrit to pass through CI and be merged. If npm is aware of the CVE ID for the security update, that will also be mentioned in the commit message (example). In addition, each package upgrade is tagged, so if you want to e.g. look for all commits that bumped MediaWiki Codesniffer to v26, it's a quick search away.

Lately LibUp has been occupied fixing the minimist prototype pollution advisory through a bunch of dependencies: gonzales-pe, grunt, mkdirp and postcss-sass. It's a rather low priority security issue, but it now requires very little human attention because it has been automated away.

There are some potential risks - someone could install a backdoor by putting an intentional vulnerability in the same version as fixing a known/published security issue. LibUp would then automatically roll out the new version, making us more vulnerable to the backdoor. This is definitely a risk, but I think our strategy of pulling in new security fixes automatically protects us more than the potential downside of malicious actors abusing the system (also because I wouldn't absolutely trust any code pulled down from npm in the first place!).

There are some errors we see occasionally, and could use help resolving them: T228173 and T242703 are the two most pressing ones right now.


Inside Scoop - The best student newspaper in California

tl;dr: The Spartan Daily picked up best student newspaper honors for the first time and had its best awards season ever. Inside Scoop is a column about the operation of the Spartan Daily, San Jose State's student newspaper.

In 2016, I made the decision to go back to school and pursue a degree in journalism. I had briefly dabbled in it in middle school, but really had no idea what I was getting myself into.

I started at De Anza College, as a member of the La Voz staff. After a quarter covering the student government beat, I moved up to serve as news editor. I regularly felt that putting out a paper every two weeks was incredibly difficult ... not realizing what was waiting for me at San Jose State University.

I spent the Fall 2018 semester on SJSU's broadcast program, Update News, mostly getting familiar with the campus. And then, in January 2019, I began my stint as a staff writer on SJSU's flagship publication, the Spartan Daily. I quickly learned that putting out a paper 3 days a week was basically a real job. Every moment I wasn't in class, I'd be running off to conduct an interview or finish typing up a story before my deadline. I started staying late as the editors put together the paper - I was fully hooked.

The Daily basically rotates staff every semester, so in April the advisers and some of the outgoing editors selected me as the next executive editor (our fancy name for the editor-in-chief). I wasn't actually present in class when they played Taylor Swift to announce my selection - I was at a robotics tournament in Houston. Oops.

I spent the summer interning in New York, slowly plotting how exactly to run the Spartan Daily. There were some things we had done great while I was a writer, but some things I wanted to redo entirely.

Thankfully, I wasn't embarking on this journey alone. Victoria, my managing editor, was technically #2 in the leadership hierarchy, but it ended up becoming a partnership. Early on I disregarded her advice a few times - and generally came to regret it. I'd like to think I very much learned my lesson.

We were backed up by a great team of editors. I've previously written how we put the team together, but the main thing I want to emphasize is that the editors were picked to create a cohesive team, rather than picking the most skilled person for each role. Add in our staff writers and it really felt like we were a family. Most everyone understood that we won or lost as a team AND THAT'S EXACTLY WHAT HAPPENED.

For the 2019 calendar year, the Spartan Daily was recognized as the best student newspaper in California by the California College Media Association (CCMA) and then again by the California News Publishers Association (CNPA).

"The best newspaper editors"

Left to right: Nick (Spring 2019 executive editor), Jana (Spring 2019 managing editor), Victoria (Fall 2019 managing editor), me (Fall 2019 executive editor). Photo by Professor Craig.

This is probably one of the most team-based awards that I've had my individual name on. It's impossible for me to overstate how much every single person on the Daily staff contributed to this award. It felt incredibly fulfilling and validating, with a bit of vindication mixed in, to know that all of the work we put in paid off in being named the best student newspaper in the state.

On top of that, the Daily picked up a host of individual awards, wrapping up basically our best awards season ever. Here's the full list:

  • Pinnacle Awards: 2nd place best sports investigative story (Lindsey)
  • ACP: 2nd place best in-depth news story (Lindsey)
  • ACP: 5th place best breaking news photo (Lindsey)
  • ACP: honorable mention best newspaper inside page (Marci)
  • ACP San Francisco Best of Show: 2nd place best newspaper special edition (for Fighting 'fake news')
  • ACP San Francisco Best of Show: 4th place people's choice: newspaper
  • ACP San Francisco Best of Show: 4th place people's choice: overall
  • Hearst Journalism Awards: 2nd place Hearst Enterprise Reporting (Lindsey)
  • CCMA: 1st place best newspaper (Nick, Jana, Kunal, Victoria)
  • CCMA: 1st place best podcast (Vicente)
  • CCMA: 2nd place best news series (Erica, Brendan, Jozy, Nathan, Chris)
  • CCMA: 2nd place best editorial (Jonathan, Kunal)
  • CCMA: 2nd place best news photograph (Lindsey)
  • CCMA: 3rd place best sports photograph (Melody)
  • CCMA: 3rd place best photo series (Brendan)
  • CCMA: 3rd place best newspaper inside spread design (Lindsey, Kunal, Marci)
  • CCMA: 3rd place best social media reporting (Spartan Daily staff)
  • CNPA: 1st place general excellence (Spartan Daily staff)
  • CNPA: 1st place best enterprise news story (Lindsey, Jana, Mauricio, Kunal)
  • CNPA: 1st place best illustration (Nachaela)
  • CNPA: 3rd place best enterprise news story (Christian)
  • CNPA: 4th place best enterprise news story (Chelsea, Vicente)
  • CNPA: 4th place best news photo (Mauricio)
  • CNPA: 4th place best illustration (Cindy)

The list has never been this long before. And while the CCMA and CNPA awards are only statewide, for ACP, Pinnacle and Hearst we competed against colleges all across the country.

I would be remiss if I didn't thank our two advisers, Richard Craig and Mike Corpos, for supporting us throughout this entire experience. I knew that both of them would always have our backs, no matter what. Even that one time I walked into the newsroom and told them, "I'm going to be served sometime this week." The same applies to my adviser from La Voz, Cecilia Deck, who really helped me get started in the first place.


mwparser on wheels

mwparserfromhell is now fully on wheels. Well...not those wheels - Python wheels!

If you're not familiar with it, mwparserfromhell is a powerful parser for MediaWiki's wikitext syntax with an API that's really convenient for bots to use. It is primarily developed and maintained by Earwig, who originally wrote it for their bot.

Nearly 7 years ago, I implemented opt-in support for using mwparserfromhell in Pywikibot, which is arguably the most used MediaWiki bot framework. About a year later, Merlijn van Deen added it as a formal dependency, so that most Pywikibot users would be installing it...which inadvertently was the start of some of our problems.

mwparserfromhell is written in pure Python with an optional C speedup, and to build that C extension, you need to have the appropriate compiler tools and development headers installed. On most Linux systems that's pretty straightforward, but not exactly for Windows users (especially not for non-technical users, which many Pywikibot users are).

This brings us to Python wheels, which allow for easily distributing built C code without requiring users to have all of the build tools installed. Starting with v0.4.1 (July 2015), Windows users could download wheels from PyPI so they didn't have to compile it themselves. This resolved most of the complaints (along with John Vandenberg's patch to gracefully fallback to the pure Python implementation if building the C extension fails).
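The graceful-fallback pattern is roughly the following sketch. The module and class names here are hypothetical stand-ins for illustration, not mwparserfromhell's actual internals:

```python
class PyTokenizer:
    """Pure-Python stand-in used when the C extension isn't available."""

def load_tokenizer():
    """Prefer the compiled C tokenizer, falling back to pure Python.

    `_tokenizer_c` is a hypothetical C-extension module name; importing it
    fails here, which exercises the fallback path.
    """
    try:
        from _tokenizer_c import CTokenizer  # compiled extension, may be absent
        return CTokenizer, True
    except ImportError:
        return PyTokenizer, False

tokenizer_cls, using_c = load_tokenizer()
print(using_c)  # False in this sketch, since no C extension is built
```

The same idea applies at build time: attempt to compile the extension, and if the toolchain is missing, install the pure-Python version instead of failing the whole installation.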

In November 2016, I filed a bug asking for Linux wheels, mostly because it would be faster. I thought it would be just as straightforward as Windows, until I looked into it and found PEP 513, which specified that basically, the wheels needed to be built on CentOS 5 to be portable enough to most Linux systems.

With the new GitHub Actions, it's actually pretty straightforward to build these manylinux1 wheels - so a week ago I put together a pull request that did just that. On every push it will build the manylinux1 wheels (to test that we didn't break the manylinux1 compatibility) and then on tag pushes, it will upload those wheels to PyPI for everyone to use.

Yesterday I did the same for macOS because it was so straightforward. Yay.

So, starting with the 0.6.0 release (no date set yet), mwparserfromhell will have pre-built wheels for Windows, macOS and Linux users, giving everyone faster install times. And, nearly everyone will now be able to use the faster C parser without needing to make any changes to their setup.


My new tech column in the Spartan Daily

After a pretty hectic last semester, I'm taking a much more backseat role on the Spartan Daily for hopefully my final semester at San Jose State. I'm going to be the new "Science & Tech Editor" - yes, I invented my own position. I am currently planning for a science & tech section every month as a special feature.

Every two weeks though, I'm going to be publishing a column, titled "Binary Bombshells", analyzing the values imbued in different technologies, explaining what effects they have upon us and suggesting avenues for improvement.

You can read the first installment of my column now: Values exist in all technologies.