Changelog - MMCloud Half Moon Bay (2.5)

Memory Machine Cloud Half Moon Bay (2.5) Release Info

New in the Half Moon Bay 2.5 Release

The Half Moon Bay 2.5 release adds major features and improves the overall reliability and scalability of the platform.

  • SurfZone is a cost management feature that allows an administrator to configure a monthly budget (quota) for a group of users. If the spending limit is reached, jobs are canceled immediately or suspended until the budget is replenished. The choice to cancel or suspend is a configuration option. If the job is suspended, the SurfZone configuration determines whether the job resumes automatically when the budget is replenished or waits for user input.
  • Workflow view of jobs allows the user to examine all the tasks grouped in a single workflow. For example, there can be hundreds of tasks in a single Nextflow pipeline. In the web interface, the Workflow Details screen includes a summary of the entire pipeline, for example, wall time, CPU time, and the numbers of on-demand and spot instances created. A Timeline tab shows when individual jobs start and, if completed, when they stop. Current status is color-coded to indicate success, failure, running, and so on. This feature applies automatically to Nextflow pipelines and can be applied to other workflows by including identifying tags.
  • Multiple Machine Images is a feature (called Quiver) that enables the OpCenter to create VMs using VMIs (virtual machine images) that are specialized for the task, for example, to support an instance with a GPU. Previous versions of OpCenter software used the same VMI for all instances (based on CPUs with x86 architecture).
  • NVIDIA GPU support allows users to submit jobs that take advantage of the NVIDIA drivers and hardware. The NVIDIA GPU support in the Half Moon Bay release does not include checkpoint and restore, so SpotSurfer and WaveRider are not available for these jobs.

New Features in Half Moon Bay 2.5 Release

Type Domain Description
Feature Platform

SurfZone is a cost management feature that solves two difficult challenges facing cloud users.

  • Proactively stopping a job before the budget is exceeded
  • Saving the state of a job so that the job can resume when more budget is available

With this feature, users are assigned to a SurfZone, which is a policy that defines the quota (also known as the limit or budget), the threshold for issuing a warning, the action taken when the quota is exhausted, and the action taken when the budget is replenished. The quota is checked periodically at a configurable interval; the default interval is one hour, which is also the minimum.
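The policy fields described above can be pictured with a minimal Python sketch. This is an illustration only, not the OpCenter implementation: the class, field, and function names are hypothetical, and treating the warning threshold as a fraction of the quota is an assumption (the release notes do not say how the threshold is expressed).

```python
from dataclasses import dataclass
from enum import Enum


class ExhaustedAction(Enum):
    CANCEL = "cancel"    # cancel jobs immediately
    SUSPEND = "suspend"  # suspend jobs until the budget is replenished


class ReplenishedAction(Enum):
    AUTO_RESUME = "auto_resume"      # suspended jobs resume automatically
    WAIT_FOR_USER = "wait_for_user"  # suspended jobs wait for user input


# From the release notes: one hour is both the default and the minimum.
MIN_CHECK_INTERVAL_HOURS = 1.0


@dataclass
class SurfZonePolicy:
    """Hypothetical model of a SurfZone policy."""
    monthly_quota: float      # budget for the group of users
    warning_threshold: float  # ASSUMPTION: fraction of the quota, e.g. 0.8
    on_exhausted: ExhaustedAction
    on_replenished: ReplenishedAction
    check_interval_hours: float = MIN_CHECK_INTERVAL_HOURS

    def __post_init__(self):
        if self.check_interval_hours < MIN_CHECK_INTERVAL_HOURS:
            raise ValueError("check interval cannot be shorter than one hour")
        if not 0 < self.warning_threshold <= 1:
            raise ValueError("warning threshold must be in the range (0, 1]")


def evaluate(policy, spent):
    """Return what one periodic quota check would decide for the amount spent."""
    if spent >= policy.monthly_quota:
        return policy.on_exhausted.value
    if spent >= policy.warning_threshold * policy.monthly_quota:
        return "warn"
    return "ok"
```

In this sketch, a zone with a quota of 1000 and a threshold of 0.8 would warn once spending reaches 800 and would apply the configured cancel or suspend action at 1000.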

Feature UX

Nextflow is a framework for assembling a collection of tasks into a coordinated pipeline or workflow. One Nextflow pipeline can generate hundreds of tasks that appear to the OpCenter as hundreds of independent jobs. When a workflow is started, Nextflow assigns it a name, for example, furious-mandelbrot. The OpCenter web interface now supports a Workflows screen that displays all the workflows by name. Clicking a workflow name brings up the Workflow Details screen that displays all the tasks started as part of that workflow as well as aggregate metrics of the entire workflow.

Similar information (without the aggregate metrics) is available using the CLI, for example, float list -f "tags=nextflow-io-run-name:furious-mandelbrot".

The workflow view is available for any group of jobs (not just Nextflow pipelines) as long as the user provides a workflow name and a run name (analogous to the pipeline name in Nextflow) for each job in the group.
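As a rough illustration of how a workflow view can be derived from tags, the Python sketch below groups job records by the nextflow-io-run-name tag used in the CLI example and totals a few per-workflow figures. The job record layout (tags, cpu_seconds, status) and the function names are hypothetical; only the tag key and the filter string format come from the example above.

```python
from collections import Counter, defaultdict

# Tag key taken from the CLI example in these release notes.
RUN_NAME_TAG = "nextflow-io-run-name"


def run_name_filter(run_name):
    """Build the filter string passed to `float list -f` for one workflow run."""
    return f"tags={RUN_NAME_TAG}:{run_name}"


def summarize_workflows(jobs):
    """Group job records by run-name tag and aggregate simple metrics.

    Each job is a dict with hypothetical keys: 'tags' (dict),
    'cpu_seconds' (float), and 'status' (str).
    """
    summary = defaultdict(
        lambda: {"jobs": 0, "cpu_seconds": 0.0, "status": Counter()}
    )
    for job in jobs:
        run_name = job.get("tags", {}).get(RUN_NAME_TAG)
        if run_name is None:
            continue  # job does not belong to a tagged workflow
        entry = summary[run_name]
        entry["jobs"] += 1
        entry["cpu_seconds"] += job["cpu_seconds"]
        entry["status"][job["status"]] += 1
    return dict(summary)
```

Jobs without the tag are skipped, which mirrors the requirement that a job must carry identifying tags to appear in a workflow.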

Feature Platform

Multiple Machine Images (also called Quiver) is a feature that enables the OpCenter to differentiate among the virtual machine images (VMIs) used to instantiate worker nodes. In earlier releases, the OpCenter used a single VMI (based on an x86 CPU) for all worker nodes. With Quiver, the OpCenter can select a VMI that is customized for the intended task. For example, the OpCenter can select a VMI that supports an NVIDIA GPU or a VMI custom-built to support specific customer requirements.
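The selection idea can be sketched in a few lines of Python. The release notes do not describe how Quiver matches images to jobs, so the matching rule here (architecture plus GPU vendor) and all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MachineImage:
    """Hypothetical description of one registered VMI."""
    name: str
    arch: str = "x86_64"
    gpu_vendor: Optional[str] = None  # None means a CPU-only image


def select_image(images, arch="x86_64", gpu_vendor=None):
    """Return the first image matching the job's architecture and GPU needs."""
    for image in images:
        if image.arch == arch and image.gpu_vendor == gpu_vendor:
            return image
    raise LookupError(f"no image for arch={arch!r}, gpu_vendor={gpu_vendor!r}")
```

With a registry that holds a default x86 image and an NVIDIA image, a job that requests a GPU would resolve to the NVIDIA image, and every other job would fall back to the default.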

Feature Platform

Although originally developed for graphics applications, the architecture of a GPU (graphics processing unit) is ideal for accelerating the type of mathematical calculations involved in machine learning and artificial intelligence. The Half Moon Bay release supports jobs that are submitted with the drivers to use NVIDIA GPUs. The Half Moon Bay release does not support GPU checkpoint and restore functions, so jobs can run on GPUs, but SpotSurfer and WaveRider are not available for these jobs.

Feature Security

The Dana Point 2.1.1 release introduced LDAP and local Linux password file as authentication methods (in addition to the built-in method) to use when logging in to the OpCenter. Half Moon Bay keeps the built-in method, removes the local Linux password file method (due to the lack of customer demand), and enhances the LDAP support. The LDAP enhancements are extensive; for example, security for LDAP bind credentials is improved as follows.

  • Removal of LDAP-related parameters from the OpCenter configuration list: The LDAP bind credentials are removed from the visible configuration list to prevent unauthorized access or exposure.
  • Dedicated command line and OpenAPI for LDAP configuration: A specialized command line tool and OpenAPI endpoints are introduced (for configuring and testing the LDAP connection).
  • Mutual TLS verification for LDAP connection: Users can upload their own certificates to enable anonymous access in a trusted environment.

Feature Platform

When OpCenter is used in a high-volume production environment, for example, a company delivering genomic analyses as a service, thousands of jobs may be submitted per day. Each of these jobs generates extensive logs that must be retained. In previous releases, logs were stored on the OpCenter's root volume. To improve scalability, Half Moon Bay allows the use of an NFS server to provide storage volumes for the logs.

Feature UX

For a busy OpCenter supporting hundreds or even thousands of jobs, viewing a complete list of jobs is not practical. The Half Moon Bay release introduces extensive filtering capabilities (both in the CLI and the web interface) so that a user can narrow down the list of jobs in a display (or in a report) to only the jobs of interest.

Feature UX

For each job, the OpCenter compiles a number of logs. In addition, the OpCenter compiles logs related to the OpCenter itself. For troubleshooting, it is useful to download some or all of the logs. The Half Moon Bay release provides the capability (in the web interface and CLI) to download user-selected logs in a single zip file.
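For readers who want to picture the bundling step, the sketch below packs a user-selected set of log files into one zip archive using the Python standard library. It does not call the OpCenter or reflect its implementation; the function name and the flat archive layout are assumptions.

```python
import zipfile
from pathlib import Path


def bundle_logs(log_paths, archive_path):
    """Write the selected log files into one zip archive and return its path."""
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for log_path in map(Path, log_paths):
            # Store each file under its base name so the archive has a flat layout.
            archive.write(log_path, arcname=log_path.name)
    return archive_path
```

Because files are stored by base name, two logs with the same file name from different directories would collide; a real tool would likely keep part of the directory structure.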

Feature Platform

Users can now submit jobs with the --output /path/to/dir option so that stdout and stderr are permanently saved (in a persistent EBS volume or on an NFS server).

Feature Platform

The --errPolicy flag, introduced as a preview feature in the Goa 2.4 release, is now generally available.

Feature Preview

Similar to the App Library, the Job Template library now includes a private repository to store job templates created by the user. In this release, the user creates a private template by saving a submitted job. This capability is only available for jobs that were submitted to an OpCenter running Half Moon Bay or a later release. After the template is created, the user can use the template to start additional jobs. Private templates are created using the web interface or the CLI.

Bug fix Platform

This release fixes an issue where charges for jobs run in AWS regions where the local currency is not US dollars were incorrectly reported.

For more information, see New Features in MMCloud Half Moon Bay 2.5 Release in our documentation center.

MMCloud Dev Team
Copyright © 2022-2024 MemVerge Inc. All Rights Reserved.