Bad Disk == Bad Performance

[Screen grab: “This website is under heavy load”]

We found the root cause of the horrible performance of the Black Duck Open Hub. Our primary database has a disk in a failed state, which is causing massive degradation. Queries that typically take 60 to 100 ms to execute are taking 2,000 to 3,000 ms and more. The application server queue, which typically holds fewer than 10 requests waiting to be handled, is regularly climbing to 100, which causes our web server to declare “This website is under heavy load.”
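For a rough sense of scale, the query times quoted above can be turned into a slowdown factor. This is only a back-of-the-envelope sketch in Python: the function name `slowdown_range` is made up for illustration, and the inputs are just the figures from this post, not fresh measurements.

```python
# Query latencies quoted in the post, in milliseconds (low end, high end).
NORMAL_MS = (60, 100)
DEGRADED_MS = (2000, 3000)


def slowdown_range(normal, degraded):
    """Return the (smallest, largest) slowdown factor implied by two latency ranges.

    The smallest factor compares the fastest degraded query with the slowest
    normal one; the largest compares the slowest degraded query with the
    fastest normal one.
    """
    smallest = degraded[0] / normal[1]
    largest = degraded[1] / normal[0]
    return smallest, largest


low, high = slowdown_range(NORMAL_MS, DEGRADED_MS)
print(f"Queries are roughly {low:.0f}x to {high:.0f}x slower than normal")
```

Since some queries take even longer than 3,000 ms, the upper figure is a floor rather than a ceiling. With each request holding the database that much longer, it is not surprising that the application server queue backs up.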

The problem was masked by a known, occasional over-temperature error that would shut down the database. Digging into the problem, we throttled back the bots that were eagerly crawling the new “openhub.net” domain, ensured the replicated database was properly synchronized, and then dug further to find the bad disk.

On the Good News Front, we have a spare disk in-house and will be swapping the disks within the next half-hour. The RAID array will need to rebuild, but performance should start improving shortly thereafter.

As always, thanks for being part of the Open Source community.

About Peter Degen-Portnoy

Mars-One Round2 Candidate. Engineer on the Ohloh development team at Black Duck Software. Family man, athlete, inventor.
  • http://www.optaplanner.org/ Geoffrey De Smet

    On a project page, the “Summary -> Contributors” link at the bottom always ends with an nginx time-out. It might be the same cause, but at least the other pages show up after some time.

    • http://degenportnoy.blogspot.com/ Peter Degen-Portnoy

      Thanks for the heads-up. The new disk is in place and is rebuilding, but database response times remain very poor.