Operator Dashboard & Workflow

The Operator is responsible for:
  • Managing the DKube cluster
  • Managing Pools, Groups, & Users

Operator Menu

The Operator screen provides a GUI-based mechanism to navigate through the workflow.

_images/Operator_Menu.jpg
  • Dashboard: Overview of the current status of the cluster
  • OAuth: Activate/update GitHub backend authorization
  • Pools: Create, view, modify, & delete device Pools
  • Groups: Create, view, modify, & delete User Groups
  • Users: On-board, view, and delete Users
  • Nodes: View status of cluster nodes
  • Devices: View status of cluster devices
  • Storage: NFS storage usage (does not include local node storage)

Operator Dashboard

_images/Operator_Dashboard.jpg

The Operator dashboard provides a snapshot of the cluster. This enables the Operator to see the status and health of the cluster, and provides the information needed for proper load balancing. It provides details on:

  • The number of cluster nodes, pools, groups, & users
  • The number of CPUs and GPUs across the cluster
  • A summary of the current cluster-wide resource usage
  • Graphs of the historical cluster-wide resource usage
  • A list of the top cluster-wide resource usage by User

Cluster Management

The cluster management function of the Operator is organized through 2 main menus.

Cluster Credentials

Cluster access is managed from the “OAuth” menu selection on the left-hand side of the screen.

_images/Operator_OAuth_Active.jpg

After installation, the DKube credentials are handled through local authorization. The OAuth menu has 2 functions:

  • Activate the GitHub authorization for general DKube use
  • Update the GitHub credentials when required

The OAuth popup is the same for both functions. This is discussed in more detail in section Credentials & Roles.

DKube Status Information

The rest of the cluster management functions are accessed from the right-hand menu.

_images/Operator_Right_Menu_Open.jpg

Status information about the DKube installation is available from the “About” menu item.

_images/Operator_Right_Menu_About.jpg

The “About” popup provides information on the DKube installation version and on the status of the license. If the license is close to expiring, contact your cluster administrator to update it.



_images/Operator_About_Popup.jpg

DKube Cluster Upgrade

DKube can be upgraded to the latest version through the Operator menu on the top right-hand side.

_images/Operator_Right_Menu_Upgrade.jpg

Selecting “Upgrade” asks for confirmation, then upgrades DKube to the latest version. DKube is unavailable while the upgrade is in progress. The progress of the upgrade can be tracked at the following URL. The IP address is that of the node that DKube is running on, and is available from the browser URL.

https://<node-IP-Address>:32323/installer
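As a minimal sketch, the progress URL can be derived from whatever address is currently shown in the browser. The function name `installer_url` and the example dashboard address are hypothetical; only the port and the `/installer` path come from the URL above.

```python
from urllib.parse import urlsplit

INSTALLER_PORT = 32323  # port taken from the upgrade-progress URL above


def installer_url(browser_url: str) -> str:
    """Build the upgrade-progress URL from the URL shown in the browser."""
    # The host part of the browser URL is the address of the DKube node.
    host = urlsplit(browser_url).hostname
    if host is None:
        raise ValueError(f"no host found in {browser_url!r}")
    return f"https://{host}:{INSTALLER_PORT}/installer"


# Hypothetical dashboard address, used only for illustration.
print(installer_url("https://192.0.2.10:32222/#/operator/dashboard"))
```

The sketch assumes an IPv4 address or host name; an IPv6 literal would need to be wrapped in square brackets when the URL is rebuilt.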


DKube OAuth Token

DKube allows access through the use of an OAuth token. The token is used when access to the node is required, such as for migration between platforms, discussed in section Migrating User Data Between Platforms. The token is provided through the Developer Settings menu item, accessible from the top right-hand menu.

_images/Operator_Right_Menu_Developer.jpg




_images/Operator_Developer_Settings_Popup.png

Node & Device Status

The nodes on the cluster can be viewed from the “Nodes” menu selection on the left-hand side of the screen. Selecting a node provides a list of the devices on that node, along with their status & utilization.

_images/Operator_Nodes.jpg

Details on each Device in the cluster can be viewed through the “Devices” menu selection. This shows the cluster-wide utilization of each Device, as well as its health.

_images/Operator_Devices.jpg

Pool, Group, & User Management

Pools, Groups, & Users are managed from their respective menu selections on the left.

There is no requirement for the Operator to allocate Pools, Groups, and Users for single User operation. At installation time, the Operator User is specified, and that User is authenticated and assigned to the Default Group, which is assigned the Default Pool. The Default Pool contains all of the Devices recognized in the cluster. As such, the single Operator/Data Scientist User has access to all of the GPUs while logged in as a Data Scientist.

If this scenario applies to your situation, you can skip ahead to Data Scientist Dashboard & Workflow.

If more than one User, Group, or Pool is required, then the following sections apply. It is recommended that the following flow is used to add an additional Pool, Group, or User.

  • Create a new Pool and assign Devices to the Pool
  • Create a new Group and assign a Pool to the new Group
  • On-board new Users, and assign them to a Group

Pool Management

_images/Operator_Pool_Management.jpg

The Pool Management screen allows the Operator to view all the Devices, and manage Pools. Initially, all of the Devices on the system are assigned to the Default Pool. The devices in the Pool Management screen are shown for the entire cluster.

Create Pool

_images/Operator_Create_Pool_Popup.jpg
  • Select the “+” icon in the top right-hand part of the Pools section
  • The Create Pool popup will appear
  • Input the name of the Pool
  • Assign the Devices to the Pool
  • Select the Create button

Note

Assigning a Device to a Pool will remove it from the Default Pool

Note

Only a single type of GPU can be assigned to a Pool
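The two rules in the notes above can be illustrated with a small model. This is only a sketch of the behaviour described here, not DKube code: the `Cluster` class, its method names, and the device and GPU names are all hypothetical, and it assumes that new Pools draw their Devices from the Default Pool.

```python
DEFAULT_POOL = "default"  # hypothetical name for the Default Pool


class Cluster:
    """Toy model of Pools; not the DKube implementation."""

    def __init__(self, devices):
        # devices: mapping of device id -> GPU type
        self.gpu_type = dict(devices)
        # Initially every Device belongs to the Default Pool.
        self.pools = {DEFAULT_POOL: set(devices)}

    def create_pool(self, name, device_ids):
        device_ids = set(device_ids)
        if name in self.pools:
            raise ValueError(f"Pool {name!r} already exists")
        # Assumption: only Devices still in the Default Pool can be assigned.
        unavailable = device_ids - self.pools[DEFAULT_POOL]
        if unavailable:
            raise ValueError(f"not in the Default Pool: {sorted(unavailable)}")
        # Rule: only a single type of GPU can be assigned to a Pool.
        types = {self.gpu_type[d] for d in device_ids}
        if len(types) > 1:
            raise ValueError(f"mixed GPU types: {sorted(types)}")
        # Rule: assigning a Device removes it from the Default Pool.
        self.pools[DEFAULT_POOL] -= device_ids
        self.pools[name] = device_ids


cluster = Cluster({"gpu0": "V100", "gpu1": "V100", "gpu2": "T4"})
cluster.create_pool("training", ["gpu0", "gpu1"])
print(sorted(cluster.pools[DEFAULT_POOL]))  # ['gpu2']
```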

Edit Pool

  • Select the Pool to be edited
  • The Edit Pool popup will appear
  • The Devices assigned to the Pool can be modified
  • When complete, select the Save button

Delete Pool

  • Select the Pool to be deleted from the left-hand checkbox
  • Select the delete icon from the top right-hand side of the section
  • A confirmation box will pop up
  • Accept the confirmation, and the Pool will be deleted
    • Devices assigned to the Pool will be available for selection by other Pools

Note

If any Device in the Pool has an active job running, the Pool cannot be deleted. An error popup will be displayed to alert the Operator that this is the case.
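The deletion rule can be sketched as a simple guard. The function `delete_pool` and its arguments are hypothetical and only model the behaviour described above; they are not part of DKube.

```python
def delete_pool(pools, busy_devices, name):
    """Delete a Pool unless one of its Devices has an active job.

    pools:        mapping of Pool name -> set of device ids
    busy_devices: set of device ids that currently run a job
    Returns the Devices freed by the deletion.
    """
    busy = pools[name] & busy_devices
    if busy:
        # Corresponds to the error popup shown to the Operator.
        raise RuntimeError(f"Pool {name!r} has active jobs on {sorted(busy)}")
    # The freed Devices become available for selection by other Pools.
    return pools.pop(name)


pools = {"training": {"gpu0", "gpu1"}, "inference": {"gpu2"}}
freed = delete_pool(pools, busy_devices={"gpu2"}, name="training")
print(sorted(freed))  # ['gpu0', 'gpu1']
```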

Group Management

_images/Operator_Group_Management.jpg

The Group management screen shows information about the current Groups, and allows Groups to be created, modified, and deleted.

Create Group

_images/Operator_Group_Create_Popup.jpg
  • Select the “+” icon from the top right-hand side of the Groups section
  • The Group creation popup will appear
  • Input the name of the Group
  • Select the Users that will be part of the Group, and the Pool that the Group will be associated with
    • Only Users & Pools that are not part of another Group will appear at this time
    • Multiple Users can be assigned, but only a single Pool
  • Select the Create button

Note

Only free Users (Users not already assigned to another Group) can be assigned to a Group. If a User is already assigned to another Group, including the Default Group, the User must first be removed from that Group, then assigned to the new Group as described above.
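The constraints on Group creation (free Users only, and a single Pool that is not already used by another Group) can be sketched as follows. The function `create_group` and the data layout are hypothetical and serve only to illustrate the rules above.

```python
def create_group(groups, name, users, pool):
    """Create a Group from free Users and one free Pool.

    groups: mapping of Group name -> {"users": set of names, "pool": Pool name}
    """
    if name in groups:
        raise ValueError(f"Group {name!r} already exists")
    taken_users = set().union(*(g["users"] for g in groups.values()))
    taken_pools = {g["pool"] for g in groups.values()}
    # Only Users that are not part of another Group can be selected.
    clash = set(users) & taken_users
    if clash:
        raise ValueError(f"already in another Group: {sorted(clash)}")
    # Multiple Users are allowed, but only a single, unassigned Pool.
    if pool in taken_pools:
        raise ValueError(f"Pool {pool!r} is assigned to another Group")
    groups[name] = {"users": set(users), "pool": pool}


groups = {"default": {"users": {"alice"}, "pool": "default"}}
create_group(groups, "vision", users=["bob", "carol"], pool="training")
print(sorted(groups["vision"]["users"]))  # ['bob', 'carol']
```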

Edit Group

_images/Operator_Group_Edit_Popup.jpg
  • Select the Group to be edited from the left-hand checkbox
    • Only a single Group can be edited at one time
  • Select the Edit icon at the top right
  • The Edit Group popup will appear
  • Users and Pools can be assigned or removed
    • Only available Pools and Users will be shown in the edit box
    • To move a User between Groups, the User must first be removed from its current Group, then assigned to the new Group
  • Select the Save button

Delete Group

  • Remove all Users & Pools from the Group
  • Select the Group to be deleted from the left-hand checkbox
  • Select the delete icon from the top right-hand side of the section
  • A confirmation box will pop up
  • Accept the confirmation, and the Group will be deleted
    • Users assigned to the Group will be available for selection by other Groups

Note

If the Group has Users or Pools assigned, it cannot be deleted

User Management

_images/Operator_User_Management.jpg

The User Management screen allows Users to be brought on board and assigned to Groups, as well as deleted from DKube.

Add (On-Board) User

_images/Operator_User_Popup.jpg
  • Select the “+” icon in the top right-hand part of the User section
  • The On-Board User popup will appear
  • Select the Users to be added using the checkboxes, and optionally select the Group on the right that they will be added to
    • By default, the Default Group will be selected in the new User popup. If the User is going to be assigned to another Group later (a Group that does not yet exist), it is better to uncheck the Default Group assignment in this step. The User will then show up as available when the new Group is created.
  • When the selection has been made, select “On Board”
  • Once the User has been added, the only other supported capability is to delete the User
    • To change which Group a User belongs to, the edit Group function must be used

Note

If the list of Users does not appear, refresh the User list at the top of the On-Board User popup. This can happen when the GitHub authorization has been activated or changed.

Move User Between Groups

At the initial on-board time, a User can be assigned to a specific Group, or not assigned to any Group (for later assignment). After that, a User can be moved from one Group to another by editing the Group.

  • The User must first be removed from its current Group
  • The target Group can then add the User
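The two-step move can be illustrated with a short sketch. The function `move_user` and the dictionary layout are hypothetical; in DKube itself both steps are performed through the Edit Group popup.

```python
def move_user(groups, user, target):
    """Move a User to the target Group.

    groups: mapping of Group name -> set of User names
    """
    if target not in groups:
        raise KeyError(f"unknown Group {target!r}")
    # Step 1: remove the User from its current Group, if any.
    for members in groups.values():
        members.discard(user)
    # Step 2: the now-free User is added to the target Group.
    groups[target].add(user)


groups = {"default": {"alice", "bob"}, "vision": set()}
move_user(groups, "alice", "vision")
print(sorted(groups["default"]), sorted(groups["vision"]))  # ['bob'] ['alice']
```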

Delete User

  • Select the User to be deleted from the left-hand checkbox
    • Multiple Users can be selected
  • Select the delete icon on the top right part of the section
  • A confirmation box will pop up
  • Accept the confirmation, and the User will be deleted from the DKube environment

Note

If the User has an active job running, the User cannot be deleted. An error popup will be displayed to alert the Operator that this is the case.