Four Months. 1/3 of a Year. About 123 Days Have Passed

Yeah, but it’s been only around 87 work days. On the other hand, we keep strange hours and regularly work Sunday mornings to perform upgrades and improvements. We’ve done a lot and would like to share some of it with you.

At the end of May, we announced our FIS Ohloh Database Split (FODS) project (see About the FODS Architecture) and noted that there was still work to do. We got busy doing it. Here’s a quick punch list:

  1. Fixed an issue in SVN where a “REPLACED” file was incorrectly counted
  2. Did a round of performance improvements on the FODS architecture
  3. Post-FODS, Ohloh UI website performance was poor; we brought server response times back to pre-deployment speeds
  4. Aggressively hunted down expensive queries and improved them
  5. Maintained 99.7% test coverage on our Ohloh UI
  6. Updated our automated Selenium scripts to verify post-FODS functionality
  7. Improved language support in Ohcount: Grace, AMPL, and shell script detection; Puppet versus Pascal disambiguation; Objective-C detection
  8. Overhauled our Job Scheduler Logic to better identify Code Locations in need of update
  9. Leveraged Machine Learning to identify spam accounts
  10. Fixed issue that caused bloated Ruby processes in production web servers
  11. Deployed incremental FODS improvements to address DB contention and back-end process deaths
  12. Fixed issue that was blocking AnalyzeJobs and updated over 300,000 Project Analyses in a few days
  13. Added 60,000 Go Code Locations for the Black Duck Knowledge Base
  14. Added a new Enhanced License feature to better illustrate the rules around a license type (we have more data to populate)
  15. Even more fixes to job logic, job execution, job death, jobs, jobs, jobs, jobs, jobs!!!!
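The scheduler overhaul in item 8 comes down to prioritizing code locations by staleness. As a minimal sketch, assuming a simple staleness-plus-failure-count heuristic (all names here are illustrative, not the actual Open Hub code):

```ruby
# Hypothetical sketch of staleness-based scheduling; the struct fields and
# thresholds are assumptions, not the real Open Hub job scheduler.
CodeLocation = Struct.new(:url, :last_updated_at, :failed_attempts)

# Pick the code locations most in need of an update: oldest first,
# with repeatedly failing locations pushed to the back of the queue
# and chronically broken ones skipped entirely.
def next_batch(code_locations, batch_size)
  code_locations
    .reject { |cl| cl.failed_attempts > 5 } # give up on chronically broken repos
    .sort_by { |cl| [cl.failed_attempts, cl.last_updated_at] }
    .first(batch_size)
end
```

The key design point is that a dead repository shouldn’t keep consuming scheduler slots — which is exactly why clearing out the Google Code and CodePlex projects (below) mattered so much.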

Our goal over the past few months was to Make Our Back End Screamingly Fast. And we’ve achieved that.

About a year ago, we were coming off of the massive Back-End Background issue, and project updates were in the double digits per hour. Like “10” and “20”. With the work we’ve done, we’re now consistently completing 5,000 updated analyses per hour.

We also dealt with the 200,000+ projects whose only enlistments were at the long-defunct Google Code forge: we deleted them. We plan to search for those projects on GitHub and re-add any we can find, but it was really important to clear out projects that would never be able to update again. That also let us purge all the failed jobs related to those projects.

Up next is Microsoft’s CodePlex. In October, the site switches to read-only mode; in December, it shuts down. We’ll follow the same process: delete the projects and clear out the jobs.

Right now, we’re updating 68% of our projects every 3 days. When we drop the CodePlex projects, which are mostly broken because CodePlex broke their SVN implementation ages ago (Google it; it’s too painful for me to talk about), we expect to push that number over 80%.
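That jump from 68% to over 80% is mostly denominator math: removing projects that can never update again raises the fraction that does. With made-up round numbers (illustrative only, not our actual project counts):

```ruby
# Illustrative numbers only; the real Open Hub project counts differ.
total    = 350_000
updating = (0.68 * total).round # ~238,000 projects refreshed every 3 days
dead     = 60_000               # hypothetical count of broken CodePlex projects

new_fraction = updating.to_f / (total - dead)
puts format("%.1f%%", new_fraction * 100) # prints "82.1%"
```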

Oh, yeah, on May 11, 2017, after I presented at OSCON, we switched the Ohloh UI repository from private to public. So yeah, the Open Hub is OSS. While I was at OSCON, I also had a chance to sit down and chat with the indomitable Randal Schwartz for TWiT’s FLOSS Weekly.

We’ve got more wonderful things planned, so expect more blog posts. As always, thank you for being a member of the Open Source Community and the Open Hub.

About Peter Degen-Portnoy

Mars-One Round 3 Candidate. Engineer on the Open Hub development team at Black Duck Software. Family man, athlete, inventor.
  • Thanks for the update; as always, it is appreciated!

    As always, the question: with all the improvements you’ve done, things should be getting reliable, yes? So when projects don’t update while they’re “just” on GitHub, should I ask on the forum or wait until the system gets to them? (Projects in question: Nextcloud and ownCloud, ownCloud being 4 months old, Nextcloud 1.)