Preparing for LWP Hack Night

I’ve had a couple of people ask me how they can prepare for LWP Hack Night, so I thought I’d just give a quick introduction to the set of modules.

I whipped up a graph of the various GitHub repositories to give you an idea of which are the most popular and which have the most open issues. Those stats seem to roughly correspond.

If you want to poke around the repositories on GitHub, that will give you an idea of where you can start.

Now, a lot of these modules have open issues in RT as well, so don’t let the GitHub numbers fool you. You can find the RT bugs for the various libraries here:

If you’ve looked at RT and GitHub, you can see there’s a monster amount of work to be done here, including but not limited to:

  • Establishing which bugs are still bugs
  • Establishing which patches could possibly be applied
  • Adding tests for existing pull requests
  • Rebasing some existing pull requests which have merge conflicts
  • Identifying bugs which are possibly in the wrong queue
  • Performing code review of existing pull requests which look like they are close to a state where they could be merged

How you might like to go about this is entirely up to you. If you have time before the meeting to identify some bugs which you may like to approach or comment on, please feel free to get started now. When we’re at the meeting, we can work out a plan to divide and conquer. There’s more than enough work to go around. We won’t (and can’t) clear this all up in one evening, but the point here is to make incremental improvements and learn something useful in the process.

Please feel free to get in touch with me in advance if you have any questions about this. The best way to do this would be via the Toronto Perl Mongers email list.

Introducing LWP::ConsoleLogger::Everywhere

In an earlier post, I introduced you to LWP::ConsoleLogger. I’ve been using it heavily since then, but one thing I didn’t tackle was how to debug a user agent you can’t easily get at. Some modules don’t provide a public API which allows you to access their user agent. Or, maybe the user agent which you want to debug is so far removed from your code that you can’t easily access its public API. Previously, this was not an easy problem to solve. However, this is no longer the case. simbabque was kind enough to write LWP::ConsoleLogger::Everywhere.

It’s quite simple to use.

Simply add this line to your code and run it. Any objects of the LWP::UserAgent family should now dump extensive logging information to your terminal. It can get a bit fancier than that, but this is really all you need to know in order to get started debugging 3rd party LWP::UserAgent-based HTTP requests.
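For reference, the line in question is presumably a plain use statement. Here is a minimal sketch of how it might sit in a script. The command-line URL handling is my own scaffolding for illustration and is not part of the module:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Loading this module is the only change needed. User agents in the
# LWP::UserAgent family created after this point get a logger attached.
use LWP::ConsoleLogger::Everywhere ();
use LWP::UserAgent ();

my $ua = LWP::UserAgent->new;

# Illustration only: fetch a URL if one was passed on the command line.
if ( my $url = shift @ARGV ) {
    my $response = $ua->get($url);
    print $response->status_line, "\n";
}
```

In real use you would not create the user agent yourself. The point is that the use line also catches user agents created deep inside third party code.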

meta::hack Wrap-up Report

meta::hack v1

Earlier this month (Thu, Nov 17 – Sun, Nov 20) I had the pleasure of meeting up with 7 other Perl hackers at ServerCentral’s downtown Chicago offices, in order to hack on MetaCPAN. Before I get started, I’d like to thank our sponsors.

This hackathon wouldn’t have been possible without the overwhelming support of our sponsors. Our platinum sponsors were Booking.com and cPanel. Our gold sponsors were Elastic, FastMail, and Perl Careers. Our silver sponsors were ActiveState, Perl Services, ServerCentral and Advance Systems. Our bronze sponsors were Vienna.pm, Easyname, and the Enlightened Perl Organisation (EPO). Please take a moment to thank them for helping our Perl community.

For the past 2.5 years, we’ve been working off and on at porting MetaCPAN from Elasticsearch 0.20.2 to 1.x and (eventually) 2.x. There were enough breaking changes between the versions to make this a non-trivial task. We had made very good progress over the past two QA hackathons, but the job was just too big to finish in the hours that we had available.

After the QA Hackathon in Rugby, I spoke to Neil Bowers about how we might go about doing some fundraising. Neil was so kind as to offer to help. His offer to help soon evolved into him taking on all of the work (thanks Neil)! Neil worked his magic and got the event fully funded. I know there was a lot of work involved, but he made it look easy. Mark Keating and the Enlightened Perl Organisation kindly took on the financial side of things, invoicing and accepting payment from sponsors. Without EPO and Neil, this event never would have taken place. (Please do take a moment to thank them).

While this was going on, we began searching for a venue. Joel Berger offered to host us at ServerCentral in Chicago and we immediately took him up on the offer. After that it was just a matter of folks booking plane tickets and getting approval from employers for the time off.

The final list of invitees was:

  • Brad Lhotsky (San Francisco)
  • Doug Bell (Chicago)
  • Graham Knop (Baltimore)
  • Joel Berger (Chicago)
  • Leo Lapworth (London)
  • Mickey Nasriachi (Amsterdam)
  • Olaf Alders (Toronto)
  • Thomas Sibley (Seattle)

The event was invitation only. We did this in order to maximize the amount of work we’d be able to finish at the event. [Insert reference to “The Mythical Man Month”]. Everyone who participated was already up to speed on the internals of the project or had an area of expertise which we needed in order to complete our goal of launching fully with v1 of the API. Because everyone already had a working VM and working knowledge of the project, we were able to tackle the problems at hand right from the first morning.

As far as living space goes, we initially had looked at renting hotel rooms, but the cost would have made it almost prohibitive to meet in Chicago. After doing some research, we booked two apartments (each with 3 bedrooms) on the same floor of a condo building in the Lincoln Park area of Chicago. We booked the accommodations via booking.com of course. 🙂 I think we were happy with the housing. Everyone had their own room and we had big enough living rooms for all of us to meet up mornings and some evenings. At the end of the day the rental was a fraction of the price of a Chicago hotel. I’ve also made a mental note not to be the last one to arrive in town. Apparently it also means you get the smallest room.

Each day we took the subway downtown to ServerCentral. We had a dedicated boardroom in the office with a large TV that we could use for sharing presentations, IRC chat or error logs. ServerCentral also sponsored lunch each day of the event. Extra monitors were also available for those who wanted them. (Lots of Roost laptop stands were to be seen. Also lots of people who couldn’t figure out how to open them after having collapsed them for the first time in forever).

After settling in at the office we’d discuss our plans for the day and map out goals for that day. We had breakout discussions where appropriate but the time spent not writing code was minimal. Generally, as a group, we worked well into the evenings. We didn’t get the full Chicago experience, but we got a lot done. We did make it to the Chicago Christkindlmarkt, which was a few blocks from the office and we went out for a breakfast and a dinner as well. Minimal downtime, but the breaks we had were lots of fun.

Day one was spent removing anything which was blocking the API upgrade. Wishlist items were ignored and as a group we worked really well. Lots of pull requests were created, reviewed and merged.

By day two of the hackathon we flipped the switch and went live with the new API. We could have waited a bit longer, but we opted to make the change earlier so that we could troubleshoot any issues as a group and watch the error logs in real time. There were no showstopping bugs and the transition was actually pretty smooth.

Day three was spent squashing some of the bugs which came up after the upgrade. We also started to tackle some wishlist items.

Day four was a slightly shorter day. We wrapped around 4 PM. Some of us went to check out “the Bean” before flying out while Leo and I headed right for our respective airports.

This list is by no means exhaustive, but over this long weekend we:

  • moved ++ data to v1 of the API.
  • moved https://metacpan.org to v1 of the API.
  • implemented load balancing via Fastly, our CDN sponsor.
  • reduced noise in the logs by squashing bugs which generated warnings or exceptions.
  • updated our API documentation as well as the metacpan-examples GitHub repository from v0 to v1.
  • published an upgrade document which explains how to upgrade your query syntax and configuration for v1.
  • moved http://explorer.metacpan.org to v1 of the API.
  • began work on streaming logs to Elasticsearch.
  • began moving the query logic that metacpan.org uses over to the API so that other clients can use this same logic.
  • began porting author queries from metacpan.org to the API as well.
  • added a meta::hack event page along with sponsor info to metacpan.org.
  • continued work on adding a /permission endpoint which will provide access to the data in 06perms.txt.
  • added more tests for the /download_url endpoint which translates module names into download URLs. Specifically this is meant to be used by cpanm.
  • added snapshotting of Elasticsearch indices in v1 so that we can easily restore from backup.

/permission is something I spent a fair bit of my time working on over the last two days. Having 06perms.txt data in the API will mean that we can display a list of all authors who have maint on a module on metacpan.org. This will make it easier to track down authors who can release a module, particularly for those who aren’t familiar with the way PAUSE works. I think this branch is probably about 1.5 years old, so I was happy to get the time to try to finish it off. I didn’t quite get there, but that’s okay. It was a wishlist item and it’s actually quite close to being released.

Also of note is the fact that we’ve now officially deprecated the v0 API. There is a 6 month runway to move clients over to v1 and v0 will be taken offline on or after June 1, 2017.

Since https://metacpan.org now uses v1 of the API, results for v0 are no longer available. If you have a client which uses v0 of the API, please feel free to reach out to us with any concerns you may have about making the switch.

If you rely on updated ++ data, you’ll need to switch to v1 now, as ++ data in v0 is no longer being updated. The indexer is, however, still running on v0, so it will still find and index new CPAN uploads. v0 development is officially closed. Any v0 bugs (barring catastrophic issues) will likely not be addressed. v0 has been around for just over 6 years now. It has served us well, but it’s time to let it go. [Insert musical scene with a talking snowman, an ice queen and her loyal sister.]

UserAgent Debugging Made Easy

Earlier today I saw a recent blog post from Gabor Szabo. In it, he shows a very concise way to handle Basic Authentication using LWP::UserAgent. Now, what if you had a problem running the script? How might you go about debugging it? You could add a bunch of print statements. Maybe dump the request and the response objects. That’s entirely valid, but I want to show you a slightly simpler way of going about it, using LWP::ConsoleLogger::Easy.

Gabor’s original script looks like this:

Let’s run it to see what the output looks like.

Here’s the debugging version. Note the important changes are on lines 4 and 9.

The output we get is:

You can see that the debugging version is just one line longer. I added 2 lines and removed a print statement. It prints out a whole pile of (nicely?) formatted information. Let’s try running it with valid credentials. (Brace yourself, there’s going to be a lot of output.)

You can see that I ran the script with LWPCL_REDACT_HEADERS='Authorization'. That’s a handy flag to use if you want to copy/paste an example when asking for help publicly. It replaced the Authorization header value with [REDACTED]. That’s maybe not a big deal here, but there are cases where it’s more important. See also LWPCL_REDACT_PARAMS.

Let’s make it prettier. We’ll do this by installing HTML::FormatText::Lynx.

Let’s run it again. I’ll only show you the changed part. Instead of just displaying the text with the HTML stripped away, we get something nicer to look at.

Now, we can also turn down the verbosity of the script by passing a flag to debug_ua(). Any integer from 0-8 will do the trick. Let’s try 6.
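As a minimal sketch of that call: the user agent setup and the command-line URL below are my own assumptions for illustration and are not taken from Gabor’s script, so treat this as the general shape rather than the exact code.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use LWP::ConsoleLogger::Easy qw( debug_ua );
use LWP::UserAgent ();

my $ua = LWP::UserAgent->new;

# The optional second argument dials back the verbosity.
# Leave it off to get the full firehose.
debug_ua( $ua, 6 );

# Illustration only: fetch a URL if one was passed on the command line.
if ( my $url = shift @ARGV ) {
    my $response = $ua->get($url);
    print $response->status_line, "\n";
}
```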

Let’s see what we get:

That’s far easier to read now.

This is just a very basic example of what you can do with LWP::ConsoleLogger::Easy. There’s a lot more you can do with it and it’s all laid out for you in the documentation. It really shines when you have a user agent which is going through multiple links or if you’re debugging someone else’s API calls. Have fun with it. It beats inserting arbitrary print statements and it could save you from pulling a lot of your own hair out someday.

Make libwww-perl Great Again ™

You may have noticed that WWW::Mechanize has seen some releases over the last couple of months. No big, breaking changes, but bugs have been fixed and enhancements have been shipped. This module is part of the libwww-perl ecosystem and also a part of the libwww-perl GitHub organization, to which I now also belong. I started pestering people to get involved because these modules, although quite important in the CPAN scheme of things, aren’t really on a regular release cycle. I don’t have the backstory on everything and this is not a complaint about anybody who has commit bits, maint or co-maint. It’s just an observation that a lot of modules on CPAN depend on the modules in this organization. The issue queues are slowly growing and pull requests are going unmerged.

I think there’s a fairly simple solution to all of this and my hope is that we can crowdsource enough mindshare to get this done. (I’m hoping that previous sentence is fully buzzword compliant).

Now, I don’t have a lot of hours of spare time to devote to this stuff in any given week, but this doesn’t all fall to me anyway. What I’d like to see is more eyeballs on the code. If you’d like to get involved or you have an interest in seeing things move along with any of these modules, please go through any outstanding pull requests and issues. Even comments such as “LGTM” (looks good to me) are very helpful. If enough people who know what they’re doing stamp a “LGTM” on a pull request, then that signals that this code is less risky to merge. If people can look into open issues and identify what is or is not a bug and what is or is not RFC-compliant, then that can speed up the issue review cycle as well.

If you’d like to join the libwww-perl org, then that would be great as well. Probably a good first step for that would be to get involved with reviewing open pull requests and issues or even contributing some code.

Here’s a quick summary of the repositories which are currently in the org:

velocity indicates how likely a pull request is to get merged. You can see that WWW::Mechanize is the worst offender of the bunch, despite my minimal cleanup attempts. You can mostly ignore WWW::Mechanize::Cached for these purposes as that’s a module I’ve been actively maintaining for a lot of years.

However, you can see that the libwww-perl (LWP::UserAgent) repo, for instance, takes about 874 days per pull request before that pull is merged. It takes an average of 165 days before a PR is closed and the remaining open pulls have been open for 801 days each. If you’re looking at over 2 years before the average pull request is merged, you can see how this probably isn’t encouraging people to get involved.

For my part, I’ve added a Travis CI config to all of the repositories and I’ve also converted WWW::Mechanize to use Dist::Zilla. Not all of the repositories are in a passing state, but at least now we have a baseline for passing and failing tests.

Now, I don’t have co-maint on most of the remaining modules, but I’m willing to pester people who do. People can also help by releasing TRIAL distributions so that CPAN testers can smoke the dist before we pester someone to release a module.

So, that’s my plea for today. Please feel free to pitch in and help clean this up. If you rely on these modules at your $work, please find a way to donate a few hours here and there to the upkeep of these modules.

For those of you who are bound to say “what about Mojo::UserAgent or module X”, I have two responses:

1) TIMTOWTDI
2) It’s easier to maintain these modules than to update and re-release all of the CPAN modules which currently use them

Fortunately, I don’t know of any really terrible bugs which have gone unfixed, but I think if these modules do see active development and releases, then any terrible bugs will be easier to patch and ship if and when they do rear their ugly heads.

Edit: I neglected to mention that there is #lwp on irc.perl.org for libwww-perl discussion.

Announcing meta::hack

Every so often, someone asks if they can donate money to MetaCPAN. I usually direct them to CPAN Testers, since (due to our generous hosting sponsors) we’ve generally not had a need for money. You can probably see where I’m going with this. Times have changed. We’re no longer turning financial sponsors away.

Back at the QA Hackathon in Rugby, we had a great group of hackers together and we got a lot of work done. However, as we worked together, it became clear that the size of our job meant that we wouldn’t be able to finish everything we had set out to do over that four day period. There are times when there’s no replacement for getting everyone in the same room together.


The first dedicated MetaCPAN hackathon will be held at the offices of ServerCentral in Chicago, from November 17th through 20th. The primary goal for this hackathon is to complete MetaCPAN’s transition to Elasticsearch version 2. This will enable the live service to run on a cluster of machines, greatly improving reliability and performance. The hackathon will also give the core team a chance to plan work for the coming 18 months.

The meta::hack event is a hackathon where we’re bringing together key developers to work on the MetaCPAN search engine and API. This will give core team members time to work together to complete the transition to Elasticsearch version 2, and time to discuss gnarly issues and plan the roadmap beyond the v1 upgrade.

MetaCPAN is now one of the key tools in a Perl developer’s toolbox, so supporting this event is a great way to support the Perl community and raise your company’s profile at the same time. This hackathon is by invitation only. It’s a core group of MetaCPAN hackers. We are keeping the group small in order to maintain focus on the v1 API and maximize the productivity of the group.

Why sponsor the MetaCPAN Hackathon?

• If your company uses Perl in any way, then your developers almost certainly use MetaCPAN to find CPAN modules, and they probably use other tools that are built on the MetaCPAN API.
• The MetaCPAN upgrade will improve the search engine and the API for all Perl developers. As a critical tool, we need it to be always available, and fast. This upgrade is a key step in that direction.
• This is a good way to establish your company as a friend of Perl, for example if you’re hiring.

Participants

There will be 8 people taking part, including me. Everyone taking part is an experienced senior-level software engineer, and most of them have already spent a lot of time working on MetaCPAN. As noted above, this is an invitational event with a very specific focus.

What is meta::hack?

MetaCPAN was created in late 2010. Version 0 of the MetaCPAN API was built on a very early version of Elasticsearch. For the first 5 years, most of the work on MetaCPAN focussed on improving the data coverage, and the web interface. In that time Elasticsearch has moved on, and we’re now well behind.

The work to upgrade Elasticsearch began in May of 2014. It continued in early Feb of 2015. Later, at the 2015 QA Hackathon in Berlin, Clinton Gormley (who works for Elastic) and I worked on moving MetaCPAN to Elasticsearch version 2. This work was continued at the 2016 QA Hackathon in Rugby, and as a result we now have a beta version in live usage.

The primary goal of meta::hack is to complete the port to Elasticsearch version 2, so the public API and search engine can be switched over. There are a number of benefits:

• Switching from a single server to a cluster of 3 servers, giving a more reliable service and improved performance.
• Once we decommission the old service, we’ll be able to set up a second cluster of 3 machines in a second data centre, for further improvements.
• We’ll be able to take advantage of new Elasticsearch features, like search suggesters.
• We’ll be able to use a new endpoint that has been developed specifically to speed up cpanminus lookups. Cpanminus is probably the most widely used CPAN client these days, so improving this will benefit a large percentage of the community.
• If and when search.cpan.org is decommissioned, we’ll be able to handle the extra traffic that will bring with it, and we’ll also have the redundancy to do this safely.
• We’ll be able to shift focus back to bug fixes and new MetaCPAN features.

Becoming a Sponsor

Neil Bowers has kindly taken on the task of shepherding the sponsorship process. (He also wrote the sponsorship prospectus from which I cribbed most of this post.) Please contact Neil or contact me for a copy of the meta::hack sponsorship prospectus. It contains most of the information listed above as well as the various sponsorship levels which are available. Thank you for your help in making this event happen. We’re looking forward to getting the key people together in one room again and making this already useful tool even better.

Getting to Travis and GitHub Pages Quickly

Disclaimer: I’m sure this functionality exists elsewhere, but this was a fun little thing for me to work on. Also, you’ll need a minimum of git 2.7 for this to work.

Often, when I’m working locally I like to bounce right over to a GitHub repository URL to check something. I ended up writing a bit of code to make this easier. While I was at it, I decided it would be nice to have the same thing for Travis URLs. So, I’ve released this as part of Git::Helpers.

When you’re inside a Git repository, you can use gh-open to open a browser window with the GitHub URL of your repository. gh-open also accepts a remote name as an argument, so running gh-open upstream would open a tab in your default browser containing your upstream’s URL, assuming you have a remote by that name. If you don’t specify a remote name, it will assume origin.

It doesn’t currently care which branch you’re on, but patches welcome (in the kindest sense of the expression).

If you want to check your Travis page for the repository, then travis-open will do the same kind of thing. It also accepts a remote name, just as gh-open does (for example, travis-open upstream), and defaults to origin if you don’t provide one.

Don’t Forget about URI::Heuristic

Imagine you’ve got some user input that is supposed to be a valid URL, but it’s user input, so you can’t be sure of anything. It’s not very consistent data, so you at least make sure to prepend a default scheme to it. It’s a fairly common case. Sometimes I see it solved this way:

This converts example.com to http://example.com, but it can be error prone. For instance, what if I forgot to make the regex case insensitive? Actually, I’ve already made a mistake. Did you spot it? In my haste I’ve neglected to deal with https URLs. Not good. URI::Heuristic can help here.
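To make the comparison concrete, here is a minimal sketch. The naive_url helper is a hypothetical reconstruction of the kind of hand-rolled check described above, not code from any real project. uf_uristr is the string-returning function exported by URI::Heuristic.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use URI::Heuristic qw( uf_uristr );

# Hypothetical hand-rolled version: prepend a scheme unless one is present.
# It only knows about http, so an https URL gets mangled.
sub naive_url {
    my $url = shift;
    $url = "http://$url" unless $url =~ m{^http://}i;
    return $url;
}

for my $input ( 'example.com', 'https://example.com' ) {
    print "input:     $input\n";
    print "naive:     ", naive_url($input), "\n";
    print "heuristic: ", uf_uristr($input), "\n\n";
}
```

The naive version turns https://example.com into http://https://example.com, while uf_uristr leaves it alone because it already has a scheme.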

This does exactly the same thing as the example above, but I’ve left the logic of checking for an existing scheme to the URI::Heuristic module. If you like this approach, but you’d rather get a URI object back then try this:
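A minimal sketch of the object-returning variant, using the uf_uri function which URI::Heuristic also exports. The input string is just an illustrative example:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use URI::Heuristic qw( uf_uri );

# uf_uri applies the same guessing as uf_uristr, but hands back a URI object.
my $uri = uf_uri('example.com/some/path');

print 'scheme: ', $uri->scheme, "\n";
print 'host:   ', $uri->host,   "\n";
print 'path:   ', $uri->path,   "\n";
```

Having an object makes it easy to inspect the scheme afterwards and reject anything you don’t want to accept.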

Caveats

Are we sure this is what we want? Checking the scheme is helpful and even if we weren’t using this module, we’d probably want to do this anyway.

That’s it! This module has been around for almost 18 years now, but it still solves some of today’s problems.

How to Get a CPAN Module Download URL

Every so often you find yourself requiring the download URL for a CPAN module. You can use the MetaCPAN API to do this quite easily, but depending on your use case, you may not be able to do this in a single query. Well, that’s actually not entirely true. Now that we have v1 of the MetaCPAN API deployed, you can test out the shiny new (experimental) download_url endpoint. This was an endpoint added by Clinton Gormley at the QA Hackathon in Berlin. Its primary purpose is to make it easy for an app like cpanm to figure out which archive to download when a module needs to be installed. MetaCPAN::Client doesn’t support this new endpoint yet, but if you want to take advantage of it, it’s pretty easy.
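Here is a minimal sketch of what such a script could look like, using only HTTP::Tiny and JSON::PP from core. The base URL is an assumption on my part (check the current API documentation for the right host), and the two helper functions are named purely for illustration. HTTPS support in HTTP::Tiny also requires IO::Socket::SSL to be installed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use HTTP::Tiny ();
use JSON::PP qw( decode_json );

# Assumption: the v1 API is reachable at this base URL.
my $API_BASE = 'https://fastapi.metacpan.org/v1';

# Build the endpoint URL for a module name such as "Plack".
sub endpoint_for {
    my $module = shift;
    return "$API_BASE/download_url/$module";
}

# Pull the download_url field out of the JSON response body.
sub extract_download_url {
    my $json = shift;
    return decode_json($json)->{download_url};
}

# Only hit the network when a module name was given on the command line.
if ( my $module = shift @ARGV ) {
    my $res = HTTP::Tiny->new->get( endpoint_for($module) );
    die "Request failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    print extract_download_url( $res->{content} ), "\n";
}
```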

Now invoke your script:


olaf$ perl download_url.pl Plack
https://cpan.metacpan.org/authors/id/M/MI/MIYAGAWA/Plack-1.0039.tar.gz

Update!

After I originally wrote this post, MICKEY stepped up and actually added the functionality to MetaCPAN::Client. A huge thank you to him for doing this. 🙂 Let’s try this again:
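A minimal sketch of the MetaCPAN::Client version, assuming a release of the module recent enough to include the download_url method. The command-line handling is just scaffolding for the example:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use MetaCPAN::Client ();

# Only hit the network when a module name was given on the command line.
if ( my $module = shift @ARGV ) {
    my $mcpan = MetaCPAN::Client->new;

    # download_url() on the client returns a result object, which in turn
    # has a download_url accessor holding the archive URL.
    print $mcpan->download_url($module)->download_url, "\n";
}
```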

That cuts the lines of code almost in half and is less error prone than crafting the query ourselves. I’d encourage you to use MetaCPAN::Client unless you have a compelling reason not to.

Caveats

This endpoint is experimental.  It might not do what you want in all cases.  See this GitHub issue for reference.  Please add to this issue if you find more cases which need to be addressed.  Having said that, this endpoint should do the right thing for most cases.  Feel free to play with it to see if it suits your needs.

Easy Perl OAuth Integration with Runkeeper and Spotify

I’ve been tooling around with a fun little app that I’m building on evenings and weekends. As part of that work I figured I’d let users authenticate via Runkeeper. Luckily Runkeeper uses OAuth2 and it’s all pretty easy to get going with. I’ve published my very small contribution as Mojolicious::Plugin::Web::Auth::Site::Runkeeper.

On a similar note, earlier this year I also released Mojolicious::Plugin::Web::Auth::Site::Spotify.

If you’re already using Mojolicious::Plugin::Web::Auth, then these modules will make it trivial for you to connect with the Runkeeper and/or Spotify web services.