  1. FreeNAS / TrueNAS
  2. NAS-109026

investigate using ctdb event scripts and a lock for overall ctdb health checks


    Details

      Description

      ctdb has a facility for calling external scripts during a cluster event. For example, it can call external scripts if the cluster health changes.
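
A minimal sketch of what such an external script could look like, assuming the event name arrives as the first command-line argument and that health changes are reported as `healthy` and `unhealthy` (the names ctdb's notification mechanism uses; verify against the installed ctdb version). The function names `health_from_event` and `handle` are hypothetical.

```python
# Assumed event names; check them against the ctdb version in use.
HEALTH_EVENTS = {"healthy": True, "unhealthy": False}


def health_from_event(event):
    """Return True/False for a health-change event, None for any other event."""
    return HEALTH_EVENTS.get(event)


def handle(argv):
    """Entry point for the script: argv[1] is assumed to be the event name."""
    if len(argv) < 2:
        return 1  # no event name supplied
    state = health_from_event(argv[1])
    if state is not None:
        # A real script would take the lock and record the transition here.
        print("cluster is now", "healthy" if state else "unhealthy")
    return 0
```

A wrapper script would call `handle(sys.argv)` and exit with its return value; events other than the two health changes are ignored.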

      Adding SMB shares and/or adding private peers is explicitly prohibited unless the cluster is 100% healthy. This protects against split-brain or partitioning scenarios in the cluster.

      I've written a `healthy()` method in the ctdb_general module; however, it does not account for the fact that the cluster can be in the middle of a transition from healthy to unhealthy, or vice versa, when it is called. To fix this, I'd like to investigate doing the following (in pseudocode):

      writer does this:
          try to grab lock on <file-on-disk> for a maximum of 10 seconds
          if 10 seconds expire, then exit

      reader does this:
          wait while <file-on-disk> is locked (for a maximum of 10 seconds)
          if not locked:
              return ctdb command for health of cluster
          if still locked after 10 seconds:
              return unhealthy (to be safe)
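
The writer/reader protocol above could be sketched in Python with `fcntl.flock`, under a few assumptions: the lock path is a placeholder for `<file-on-disk>`, the writer is the event script and holds an exclusive lock while it handles the transition, and the reader detects "locked" by trying to take a shared lock. `update` and `check_health` are hypothetical callables; `check_health` stands in for whatever ctdb command reports cluster health.

```python
import fcntl
import os
import time

LOCK_PATH = "/var/run/ctdb_health.lock"  # hypothetical stand-in for <file-on-disk>
TIMEOUT = 10  # seconds, as in the pseudocode


def acquire(path, mode, timeout=TIMEOUT, interval=0.1):
    """Poll for a flock on path; return the open fd, or None on timeout."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    deadline = time.monotonic() + timeout
    while True:
        try:
            fcntl.flock(fd, mode | fcntl.LOCK_NB)
            return fd
        except BlockingIOError:
            if time.monotonic() >= deadline:
                os.close(fd)
                return None
            time.sleep(interval)


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


def writer(update, path=LOCK_PATH, timeout=TIMEOUT):
    """Event-script side: hold an exclusive lock while the transition is handled."""
    fd = acquire(path, fcntl.LOCK_EX, timeout)
    if fd is None:
        return False  # could not get the lock in time: give up
    try:
        update()
        return True
    finally:
        release(fd)


def reader(check_health, path=LOCK_PATH, timeout=TIMEOUT):
    """Health-check side: report unhealthy if the file stays locked too long."""
    fd = acquire(path, fcntl.LOCK_SH, timeout)
    if fd is None:
        return False  # still locked after the timeout: assume unhealthy
    try:
        return bool(check_health())
    finally:
        release(fd)
```

Using a shared lock on the reader side lets several health checks run concurrently while still blocking behind a writer; the trade-off is that a reader briefly delays a writer that arrives mid-check.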

                  People

                  Assignee:
                  caleb Caleb St. John
                  Reporter:
                  caleb Caleb St. John