Open Hub in 2017

Hail Hubbites!

We’d love to share some of the things that have been going on and will be going on here in Open Hub Land. We accomplished some very significant work in 2016 and would like to take a moment to lay it out and then talk about what we’d like to accomplish in 2017.

2016 Review

Please recall from our 2016 Review what we did in 2015: rebuilt the UI, addressed spam account creation, improved back-end performance (5X in some cases), and started inventing new security data features. The plan for 2016 was to create a new Project Vulnerability Report and Project Security Pages, run the Spammer Cleanup Program, virtualize the back end (the FISbot project), switch to Ohcount4J, and connect to other sites related to OSS. Here’s how we did:

  • Invented the Project Vulnerability Report algorithms and presentations
  • Prototyped Project Security Pages with the (now closed) pages
  • Deployed FISbots and Ohloh Analysis onto virtual servers (this involved migrating some 10TB of OSS project data from multiple servers to a single SAN)
  • Started running batches of accounts through the Spammer Cleanup Program.  To date, we’ve cleared out some 350,000 spam accounts (YAY!!)
  • Designed and implemented a Prototype Project Security Page to report known vulnerabilities in OSS projects. Collected user feedback from that experiment
  • Explored using Ohcount4J instead of Ohcount.  Decided to stay with Ohcount.
  • Added a feature to add an entire GitHub account to a single Open Hub project
  • Made numerous back end improvements and defect resolutions to consistently deliver web pages in under 200 ms (6X faster than 2015 on average)
  • Defended against a number of malicious attacks against our API service and web site (comes with the territory of running a non-trivial web application, amirite?)

There’s more though!

The FISbot was implemented as a stopgap measure to address issues we had with the back end bare metal crawlers. We were waiting for another project to provide a central set of Fetch, Import, and SLOC services to the Black Duck enterprise. The plan was to shut down the FISbots and use this other service. However, after deploying our FISbots, it was decided that we should expand the FISbot to handle the additional enterprise scenarios. So, completely unplanned at the beginning of the year, we implemented the eFISbot Project, which we also delivered last year.

Last point: as we talked about in the Detail on the Infrastructure post, the migration of that 10TB collection of OSS project data onto the production server ran into serious issues that forced us to re-fetch every one of the nearly 600K code locations we monitor. This was a serious multi-month disruption, from which we have almost entirely recovered. We have re-fetched all the repositories, but there are lingering issues in getting all those repositories and corresponding projects refreshed in the 24 – 72 hour window we’ve set for ourselves.

So, in summary, we’ll add to our 2016 Review:

  • Implemented and delivered eFISbot
  • Survived the treacherous NFS SNAFU and the Great Code Location ReFetch

I feel it is also important that we mention again the passing of our friend and colleague Pugalraj Inbasekaran in February. I still feel his absence as an ache near my heart and miss him.

2017 Plan

We have a few main focuses for 2017:

  1. Make the back end screamingly fast
  2. Make it wicked easy to add projects from GitHub to the Open Hub and get data from the Open Hub into your GitHub pages
  3. Continue the UI update with wider pages and more responsive layouts
  4. Add new languages to Ohcount

For that back end, we’ve been given permission to obtain a new set of servers. Currently, the Open Hub runs off a single database (we’ve talked about that over and over again). We’ve put in a purchase request for 2 database servers that have over 4X the CPU cores and 9X the RAM. One server will be the master and the other the replica. These servers will support only Fetch, Import, SLOC and Analysis operations (write intensive), so we’re calling this the FISA DB. The current database will remain, with the sole purpose of presenting generated analysis (read intensive) through the Ohloh-UI application, so that will be the UI DB. We are SO VERY EXCITED!!! SQUEEEEEE!!! Ah. Sorry; sorry. Please excuse the author (but it’s SOO exciting!)
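For the curious, here is a rough sketch of that split in Python. It is only an illustration of the idea, not our actual code: the names `Operation`, `FISA_DB`, `UI_DB` and `database_for` are made up for this example, and the real routing lives in the back end and the Ohloh-UI application.

```python
from enum import Enum


class Operation(Enum):
    """Kinds of work mentioned in the plan (names are illustrative)."""
    FETCH = "fetch"
    IMPORT = "import"
    SLOC = "sloc"
    ANALYSIS = "analysis"
    PRESENT = "present"  # serving generated analysis through the UI


# Hypothetical labels for the two databases; not real connection names.
FISA_DB = "fisa_db"
UI_DB = "ui_db"

# The write-intensive operations that would go to the new servers.
WRITE_INTENSIVE = {
    Operation.FETCH,
    Operation.IMPORT,
    Operation.SLOC,
    Operation.ANALYSIS,
}


def database_for(operation: Operation) -> str:
    """Return the label of the database that would handle this operation."""
    return FISA_DB if operation in WRITE_INTENSIVE else UI_DB
```

With this sketch, `database_for(Operation.SLOC)` picks the FISA DB, while `database_for(Operation.PRESENT)` picks the UI DB. The point of the split is that heavy writes no longer compete with page reads on one machine.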

As always, thank you so very much for being part of the Open Source Software community and your continued support of the Open Hub.

About Peter Degen-Portnoy

Mars-One Round 3 Candidate. Engineer on the Open Hub development team at Black Duck Software. Family man, athlete, inventor
