As we talked about in our Open Hub in 2016 post, we have recently made a major step forward in addressing significant infrastructure concerns. Down in the “More Infrastructure” section, we mentioned, “So we started an effort to virtualize our crawlers and are pilot testing that work now.” The FISbot servers are now out of the pilot test and the old crawlers are being decommissioned and un-racked. That’s not to say that there are no problems, but the problems we have are not worth switching horses back to the old infrastructure. No, we’d rather take care of the horse we’re riding now.
However, the issues we are having are impacting data on the site and, while we are moving quickly to address them, we’d like to share what we know with you so that everyone can be kept up to date:
- There is an issue that after a Fetch, Import, SLOC cycle is completed, the follow-on Analysis Job is not always being generated. This is leaving some projects with fresh raw data, but no updated analysis.
- There is an issue that the Job Scheduler is not always detecting projects with out-of-date analysis and scheduling new jobs. This is leaving some projects with no fresh raw data.
- We’ve changed the way we are doing some internal tracking and accounting of when jobs were executed. This switch has resulted in a mismatch between the fields where we are tracking job progress and the data we are presenting on the site so that some projects either show the wrong date the data were collected or do not show that value at all.
- There are some new low-level issues with local copies of repositories. Since we’ve switched from 18 crawlers with dedicated local storage to virtual servers with an NFS mount to a SAN, we are seeing new file system level issues. These issues typically cause Fetch jobs to fail.
To address these, we are combing through projects and repositories repeatedly throughout the day and scheduling jobs to try to keep everything up to date. Please let us know if your project has fallen behind so we can address it while we work on the code fixes to bring the new FISbot infrastructure up to snuff.
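For the curious, the kind of sweep described above can be sketched roughly as follows. This is only an illustration, not our actual scheduler code: the `Project` record, its field names, the job labels, and the three-day freshness threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Project:
    """Hypothetical record; the field names are illustrative only."""
    name: str
    last_fetch: Optional[datetime]     # when raw data was last collected
    last_analysis: Optional[datetime]  # when the analysis was last generated


def jobs_needed(project: Project, now: datetime,
                max_age: timedelta = timedelta(days=3)) -> List[str]:
    """Return the job types a project needs (max_age is an assumed threshold)."""
    # Raw data is missing or too old: the project needs a new fetch cycle.
    if project.last_fetch is None or now - project.last_fetch > max_age:
        return ["fetch"]
    # Raw data is fresh but the analysis is missing or older than the fetch:
    # the follow-on analysis job was never generated.
    if project.last_analysis is None or project.last_analysis < project.last_fetch:
        return ["analyze"]
    return []


def sweep(projects: List[Project], now: datetime) -> Dict[str, List[str]]:
    """Map each project that has fallen behind to the jobs it needs."""
    result = {}
    for project in projects:
        jobs = jobs_needed(project, now)
        if jobs:
            result[project.name] = jobs
    return result
```

Running something like `sweep` several times a day would catch both the projects with fresh raw data but no updated analysis and the projects whose raw data has gone stale.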
In other news, the Spammer Cleanup program is also out of the Pilot phase and is chugging through our accounts and inviting account holders to verify their accounts. We are focusing on those accounts that were created and then showed no activity on the Open Hub. If you get one of these re-verification emails, please simply log on to the site and provide one of the requested forms of verification. If you have been an active member of the Open Hub, you should not be part of this email re-verification process. We will, however, still ask for verification when you log in if you’ve not logged in since these new security checks were put in place.
The “Invention Process” for our new security pages has started and is very exciting. We are looking at what we can produce and deploy quickly that will help illustrate the security landscape for OSS projects. After the initial deployment of fact-based data presentation, we will look towards adding additional elements that provide a broader overview of OSS security. Oh, and look forward to a new Project page layout that will begin moving throughout the site and will take advantage of the larger screen size of modern day browsers.
Final point: Such Perform. Wow Speed.
In the post GitHub, Performance, and Crawlers (Oh My!) from October 2015, we talked about the People Index page performance improving from 18-60 seconds to less than 1 second, the Explore Projects page improving from 100 seconds (!) to less than 1 second, and widget performance improving to 1.5 seconds. We were very pleased that we restored the average web server response times to under 1.2 seconds, or 1200 milliseconds.
Ladies and Gentlemen, Boys and Girls, Things and Its: for the past few months, average web server response time has been under 400 milliseconds, a 3X improvement in speed. Since the deployment of FISbot, average web server response time has been around 200 milliseconds, a 6X improvement in speed. Because a number of FIS jobs and Analysis jobs are currently going unscheduled, the site is carrying less load than it normally would, so we expect some impact to site performance when we fix these code defects. Never fear; the next infrastructure project will separate the analysis database from the web application database and result in consistently speedy web application performance.
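The 3X and 6X figures come from dividing the earlier average of roughly 1200 milliseconds by the newer averages. A tiny sketch of that arithmetic, where the `speedup` helper is a name made up for this example and the inputs are the rounded figures quoted above:

```python
def speedup(before_ms: float, after_ms: float) -> float:
    """Return how many times faster the new response time is."""
    if after_ms <= 0:
        raise ValueError("after_ms must be positive")
    return before_ms / after_ms


# Rounded averages quoted in this post, in milliseconds.
baseline_ms = 1200     # restored average reported in October 2015
recent_ms = 400        # upper bound for the past few months
post_fisbot_ms = 200   # approximate average since FISbot was deployed

print(speedup(baseline_ms, recent_ms))       # 3.0
print(speedup(baseline_ms, post_fisbot_ms))  # 6.0
```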
I know it’s been a tough process and at times the site was nigh unusable. Thanks for sticking in there with us. You guys are the best (I’m getting teary over here). And we’re continuing to work hard to bring you an unparalleled set of freely available analyses of ALL the OSS projects. Thank you so very much for being part of the OSS community and a member of the Open Hub.