Inside Scoop - The best student newspaper in California

tl;dr: The Spartan Daily picked up best student newspaper honors for the first time and had its best awards season ever. Inside Scoop is a column about the operation of the Spartan Daily, San Jose State's student newspaper.

In 2016, I made the decision to go back to school and pursue a degree in journalism. I had briefly dabbled in it in middle school, but really had no idea what I was getting myself into.

I started at De Anza College, as a member of the La Voz staff. After a quarter covering the student government beat, I moved up to serve as news editor. I regularly felt that putting out a paper every two weeks was incredibly difficult ... not realizing what was waiting for me at San Jose State University.

I spent the Fall 2018 semester in SJSU's broadcast program, Update News, mostly getting familiar with the campus. Then, in January 2019, I began my stint as a staff writer at SJSU's flagship publication, the Spartan Daily. I quickly learned that putting out a paper three days a week was basically a real job. Every moment I wasn't in class, I'd be running off to conduct an interview or finish typing up a story before my deadline. I started staying late as the editors put together the paper - I was fully hooked.

The Daily basically rotates staff every semester, so in April the advisers and some of the outgoing editors selected me as the next executive editor (our fancy name for the editor-in-chief). I wasn't actually present in class when they played Taylor Swift to announce my selection - I was at a robotics tournament in Houston. Oops.

I spent the summer interning in New York, slowly plotting exactly how to run the Spartan Daily. There were some things we had done well while I was a writer, and some things I wanted to redo entirely.

Thankfully, I wasn't embarking on this journey alone. Victoria, my managing editor, was technically #2 in the leadership hierarchy, but it ended up becoming a partnership. Early on, I disregarded her advice a few times - and generally came to regret it. I'd like to think I very much learned my lesson.

We were backed up by a great team of editors. I've previously written about how we put the team together, but the main thing I want to emphasize is that the editors were picked to create a cohesive team, rather than by choosing the most skilled person for each role. Add in our staff writers and it really felt like we were a family. Most everyone understood that we won or lost as a team - and that's exactly what happened.

For the 2019 calendar year, the Spartan Daily was recognized as the best student newspaper in California by the California College Media Association (CCMA) and then again by the California News Publishers Association (CNPA).

"The best newspaper editors"

Left to right: Nick (Spring 2019 executive editor), Jana (Spring 2019 managing editor), Victoria (Fall 2019 managing editor), me (Fall 2019 executive editor). Photo by Professor Craig.

This is probably one of the most team-based awards that I've had my individual name on. It's impossible for me to overstate how much every single person on the Daily staff contributed to this award. It felt incredibly fulfilling and validating, with a bit of vindication mixed in, to know that all of the work we put in paid off in being named the best student newspaper in the state.

On top of that, the Daily picked up a host of individual awards, wrapping up our best awards season ever. Here's the full list:

  • Pinnacle Awards: 2nd place best sports investigative story (Lindsey)
  • ACP: 2nd place best in-depth news story (Lindsey)
  • ACP: 5th place best breaking news photo (Lindsey)
  • ACP: honorable mention best newspaper inside page (Marci)
  • ACP San Francisco Best of Show: 2nd place best newspaper special edition (for Fighting 'fake news')
  • ACP San Francisco Best of Show: 4th place people's choice: newspaper
  • ACP San Francisco Best of Show: 4th place people's choice: overall
  • Hearst Journalism Awards: 2nd place Hearst Enterprise Reporting (Lindsey)
  • CCMA: 1st place best newspaper (Nick, Jana, Kunal, Victoria)
  • CCMA: 1st place best podcast (Vicente)
  • CCMA: 2nd place best news series (Erica, Brendan, Jozy, Nathan, Chris)
  • CCMA: 2nd place best editorial (Jonathan, Kunal)
  • CCMA: 2nd place best news photograph (Lindsey)
  • CCMA: 3rd place best sports photograph (Melody)
  • CCMA: 3rd place best photo series (Brendan)
  • CCMA: 3rd place best newspaper inside spread design (Lindsey, Kunal, Marci)
  • CCMA: 3rd place best social media reporting (Spartan Daily staff)
  • CNPA: 1st place general excellence (Spartan Daily staff)
  • CNPA: 1st place best enterprise news story (Lindsey, Jana, Mauricio, Kunal)
  • CNPA: 1st place best illustration (Nachaela)
  • CNPA: 3rd place best enterprise news story (Christian)
  • CNPA: 4th place best enterprise news story (Chelsea, Vicente)
  • CNPA: 4th place best news photo (Mauricio)
  • CNPA: 4th place best illustration (Cindy)

The list has never been this long before. And while the CCMA and CNPA awards are only statewide, for ACP, Pinnacle and Hearst we competed against colleges all across the country.

I would be remiss if I didn't thank our two advisers, Richard Craig and Mike Corpos, for supporting us throughout this entire experience. I knew that both of them would always have our backs, no matter what. Even that one time I walked into the newsroom and told them, "I'm going to be served sometime this week." The same applies to my adviser from La Voz, Cecilia Deck, who really helped me get started in the first place.



mwparser on wheels

mwparserfromhell is now fully on wheels. Well...not those wheels - Python wheels!

If you're not familiar with it, mwparserfromhell is a powerful parser for MediaWiki's wikitext syntax with an API that's really convenient for bots to use. It is primarily developed and maintained by Earwig, who originally wrote it for their bot.
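
To give a flavor of that API, here's a small example along the lines of the project's own documentation (the wikitext snippet is made up):

```python
import mwparserfromhell

# Parse some wikitext into a Wikicode object
text = "I have a template! {{foo|bar|baz|eggs=spam}}"
wikicode = mwparserfromhell.parse(text)

# Wikicode can be filtered by node type, e.g. to pull out templates
for template in wikicode.filter_templates():
    print(template.name)                   # foo
    if template.has("eggs"):
        print(template.get("eggs").value)  # spam
```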

Nearly 7 years ago, I implemented opt-in support for using mwparserfromhell in Pywikibot, which is arguably the most used MediaWiki bot framework. About a year later, Merlijn van Deen added it as a formal dependency, so that most Pywikibot users would be installing it...which inadvertently was the start of some of our problems.

mwparserfromhell is written in pure Python with an optional C speedup, and to build that C extension, you need the appropriate compiler tools and development headers installed. On most Linux systems that's pretty straightforward, but it's not so simple on Windows (especially for non-technical users, which many Pywikibot users are).

This brings us to Python wheels, which allow for easily distributing built C code without requiring users to have all of the build tools installed. Starting with v0.4.1 (July 2015), Windows users could download wheels from PyPI so they didn't have to compile the extension themselves. This resolved most of the complaints (along with John Vandenberg's patch to gracefully fall back to the pure Python implementation if building the C extension fails).
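
A nice side effect of the graceful fallback is that you can check at runtime which tokenizer you actually got - if I remember the module layout right, the parser exposes a `use_c` flag:

```python
from mwparserfromhell import parser

# use_c should be True when the compiled C tokenizer imported
# successfully, and False when the import failed and the parser
# fell back to the pure Python implementation.
if parser.use_c:
    print("Using the fast C tokenizer")
else:
    print("Falling back to the pure Python tokenizer")
```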

In November 2016, I filed a bug asking for Linux wheels, mostly because installs would be faster. I thought it would be just as straightforward as Windows, until I looked into it and found PEP 513, which specified that, in essence, the wheels needed to be built on CentOS 5 to be portable across most Linux systems.

With the new GitHub Actions, it's actually pretty straightforward to build these manylinux1 wheels - so a week ago I put together a pull request that does just that. On every push it builds the manylinux1 wheels (to verify that we haven't broken manylinux1 compatibility), and on tag pushes it uploads those wheels to PyPI for everyone to use.

Yesterday I did the same for macOS because it was so straightforward. Yay.

So, starting with the 0.6.0 release (no date set yet), mwparserfromhell will have pre-built wheels for Windows, macOS and Linux users, giving everyone faster install times. And, nearly everyone will now be able to use the faster C parser without needing to make any changes to their setup.


Binary Bombshells: Twitter’s tools help online harassers

Originally posted on mastodon.technology.

The second installment of my Spartan Daily tech column, Binary Bombshells, is out! I discuss design flaws in Twitter that lead to harassment and how Mastodon addresses some of them: sjsunews.com/article/binary-bo


My new tech column in the Spartan Daily

After a pretty hectic last semester, I'm taking much more of a backseat role on the Spartan Daily for what is hopefully my final semester at San Jose State. I'm going to be the new "Science & Tech Editor" - yes, I invented my own position. I'm currently planning a monthly science & tech section as a special feature.

Every two weeks, though, I'm going to be publishing a column titled "Binary Bombshells" about the values embedded in different technologies: analyzing what those values are, explaining what effects they have on us and suggesting avenues for improvement.

You can read the first installment of my column now: Values exist in all technologies.


Celebrating 2 years of MediaWiki codesearch

MediaWiki codesearch logo

It's been a little over 2 years since I announced MediaWiki codesearch, a fully free software tool that lets people make regex searches across all the MediaWiki-related code in Gerrit and much more. While I expected it to be useful to others, I didn't anticipate how popular it would become.

My goal was to replace the usage of the proprietary GitHub search that many MediaWiki developers were using due to lack of a free software alternative, but doing so meant that it needed to be a superior product. One of the biggest complaints about searching via GitHub was that it pulled in a lot of extraneous repositories, making it hard to search just MediaWiki extensions or skins.

codesearch is based on hound, a code search engine written in Go, originally maintained by Etsy. It took me all of 10 minutes to get an initial prototype working using the upstream Docker image, but I ran into an issue pretty quickly: the repository selector didn't scale to our then-500+ git repositories (now we're at more like 900!), so it wasn't really possible to easily search just extensions.

After searching around for other upstream code search engines and not having much luck finding anything I liked, I went back to hound and instead tried running multiple instances at once, which more or less worked. I wrote a small, ~50 line Python proxy to wrap the different hound instances and provide a unified UI. The proxy was sketchy enough that I wrote "Please don't hurt me." in the commit message!
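
The real proxy has grown a bit since that first commit, but the core idea fits in a few lines: map each search profile to its own hound instance and forward requests to the right one. Here's a minimal sketch of that approach using Flask and requests - the profile names and ports are hypothetical, not the actual codesearch configuration:

```python
# Minimal sketch of a proxy that fans requests out to per-profile
# hound instances, each listening on its own port. The ports and
# profiles below are made up for illustration.
import requests
from flask import Flask, Response, request

BACKENDS = {
    "core": "http://localhost:6080",
    "extensions": "http://localhost:6081",
    "skins": "http://localhost:6082",
}

app = Flask(__name__)

@app.route("/<profile>/<path:path>")
def proxy(profile, path):
    backend = BACKENDS.get(profile)
    if backend is None:
        return "Unknown search profile", 404
    # Forward the request (including the query string) to the
    # hound instance that indexes this profile's repositories.
    resp = requests.get(f"{backend}/{path}", params=request.args)
    return Response(resp.content, status=resp.status_code,
                    content_type=resp.headers.get("Content-Type"))

if __name__ == "__main__":
    app.run(port=8080)
```

Fanning out at the HTTP layer like this lets each hound instance keep a manageable repository list while users still see a single UI.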

But it has held up surprisingly well over time. I attribute that to having systemd manage everything, and to the fact that hound is abandoned/unmaintained/dead upstream, creating a very stable platform - for better or worse. We've worked around most of the upstream bugs, so I usually pretend that's a feature. But if hound doesn't get adopted by new maintainers sometime this year, I expect we'll create our own fork or adopt someone else's.

I recently used the anniversary to work on puppetizing codesearch so there would be even less manual maintenance work in the future. Shoutout to Daniel Zahn (mutante) for all of his help in reviewing, fixing up and merging all the puppet patches. All of the package installation, systemd units and cron jobs are now declared in puppet - it's really straightforward.

For those interested, I've documented the architecture of codesearch, and started writing more comprehensive docs on how to add a new search profile and how to add a new instance.

Here's to the next two years of MediaWiki codesearch.