What is taking so long?

In the “Update on Infrastructure” post, we talked about how we needed to get disks that would adequately handle our performance needs and were designed for our hardware (i.e. “supported by the manufacturer”). Those arrived on Tuesday and we brought them to our data center straight away and built the new RAID array.

Yesterday, as of the writing of this post, we put the Open Hub into Read Only mode. Our first plan was to use our replicated database to run a pg_basebackup from slave to master, but the replicated database was a few hours out of sync. It should not have been more than a few tens of milliseconds behind. So we decided to do a pg_dumpall of the primary database, change the mounts so that the database server pointed to the new RAID array, and restore the cluster from the dump.
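For readers wondering how lag like this can be caught early, here is a minimal sketch of a lag check. It is not our actual monitoring. It assumes the standby's last replay time has already been fetched, for example with SELECT pg_last_xact_replay_timestamp() run on the standby, and the one-second threshold and function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed alert threshold (hypothetical). The post only says that lag
# should normally be a few tens of milliseconds.
MAX_LAG = timedelta(seconds=1)


def replication_lag(last_replay, now):
    """Return how far the standby is behind, never less than zero."""
    return max(now - last_replay, timedelta(0))


def lag_is_acceptable(last_replay, now, max_lag=MAX_LAG):
    """True if the standby is within max_lag of the current time."""
    return replication_lag(last_replay, now) <= max_lag


if __name__ == "__main__":
    # Illustrative timestamps only, not measurements from our servers.
    now = datetime.now(timezone.utc)
    healthy = now - timedelta(milliseconds=40)
    stale = now - timedelta(hours=3)
    print("healthy standby ok:", lag_is_acceptable(healthy, now))
    print("stale standby ok:", lag_is_acceptable(stale, now))
```

One caveat with this approach: on a primary that receives no writes, the time since the last replayed transaction keeps growing even when the standby is fully caught up, so a real check would need to account for idle periods.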

We were optimistic that this would take 4 – 6 hours, 10 at the far outside.

Ladies and Gentlemen, Girls and Boys: this restore has been running for 18 and three-quarter hours, and it is very difficult to determine precisely how long it will take. The restore process, being a straight file load into pgsql, has no progress indicators.
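One way to get a rough progress signal for a plain SQL restore is to feed the dump file to the loader yourself and count the bytes as they go by. The sketch below shows the idea under stated assumptions; it is not what we ran. The function name feed_with_progress and its parameters are hypothetical, and the sink can be any writable binary stream, such as the stdin pipe of a psql process.

```python
import io
import time


def feed_with_progress(src, sink, total_bytes, report,
                       chunk_size=1 << 20, clock=time.monotonic):
    """Copy src to sink in chunks and report progress after each chunk.

    report is called as report(done_bytes, total_bytes, eta_seconds);
    eta_seconds is None until a transfer rate can be measured.
    Returns the number of bytes copied.
    """
    start = clock()
    done = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        done += len(chunk)
        elapsed = clock() - start
        rate = done / elapsed if elapsed > 0 else 0.0
        eta = (total_bytes - done) / rate if rate > 0 else None
        report(done, total_bytes, eta)
    return done


if __name__ == "__main__":
    # Stand-in for a dump file and a loader pipe, for demonstration only.
    dump = io.BytesIO(b"-- fake dump line\n" * 1000)
    total = len(dump.getvalue())
    loader = io.BytesIO()

    def show(done, total_bytes, eta):
        print(f"{100 * done / total_bytes:5.1f}% fed to the loader")

    feed_with_progress(dump, loader, total, show, chunk_size=4096)
```

Bytes fed is only a proxy for time remaining. In a dump produced by pg_dumpall the table data comes first and the index and constraint statements come near the end, and those statements are tiny in bytes but can take a long time to execute, so the estimate will look optimistic toward the end of the restore.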

Again, we apologize for the inconvenience this is causing. On the plus side, we will be examining all aspects of our replication implementation, have added appropriate new monitoring and reporting, and are planning architectural changes that will let us continue to serve the API even if the rest of the website has to be put into Read Only mode. Oh, and when we’re done, we’ll have brand new indexes on our database. That will be nice.

About Peter Degen-Portnoy

Mars-One Round2 Candidate. Engineer on the Ohloh development team at Black Duck Software. Family man, athlete, inventor.
  • Lukas

    Unless the project refresh gets quicker, I suggest changing the text “# days since last commit”, somehow, to clarify that it could just as well be # days since openhub last refreshed the project.

    • http://degenportnoy.blogspot.com/ Peter Degen-Portnoy

      Thanks Lukas. Each project has a time stamp that shows the age of the analysis and the age of the last refresh. Is that what you were thinking of?

      https://drive.google.com/file/d/0B4oiRbAN9CThOEd2QzRjOEhBMmM/edit?usp=sharing

      • Lukas

        Sorry for being unclear, I was thinking more of how it appears on the search page:

        http://i.imgur.com/rMyo5Sp.png

        That states “about one month since last commit” very clearly, and the analysis age note is much weaker and detached from the commit info.

        Perhaps adding some refresh comment near the commit count or rephrasing to e.g. “last analyzed commit # days ago”.

        • http://degenportnoy.blogspot.com/ Peter Degen-Portnoy

          Thanks for the clarification. When our crawlers have been upgraded and caught up to the analysis load, then this won’t be an issue. However, your point is well taken. Thank you.

  • Per

    Will reading the committers list, such as https://www.openhub.net/p/12126/contributors/summary, work once this is done?

    I also second the note about “# of days since last commit”: it’s really not relevant any more, and “# days since last index” seems more useful. Perhaps move the ‘last detected commit’ to some other place?