Google Compute Engine does regular maintenance of its infrastructure. This page describes the types and approximate frequencies of these maintenance events, and how you can set instance availability options to configure the behavior of VM instances when these maintenance events occur. This page also describes how to set an instance to live migrate when a maintenance event occurs.
Before you begin
- If you want to use the command-line examples in this guide:
- Install or update to the latest version of the gcloud command-line tool.
- Set a default region and zone.
- If you want to use the API examples in this guide, set up API access.
Maintenance events
Compute Engine maintenance events include hardware and software updates. Some of these events require Google to move your VM away from the host that is undergoing maintenance, and Compute Engine automatically manages the scheduling behavior of the affected instances. If you configured an instance's availability policy to use live migration, Compute Engine live migrates the instance, which prevents your applications from experiencing disruptions during these events. Alternatively, you can choose to terminate your instances during these events rather than live migrating them.
The following table groups Compute Engine maintenance events into two categories, gives examples of each, and indicates which category requires live migration of your VM to a different host.
| Maintenance event type | Examples | Approximate frequency * | Requires live migration to new host |
|---|---|---|---|
| Host maintenance | Host kernel upgrade, hardware repair or upgrade | Once per month | Yes |
| Lightweight | Hypervisor-level upgrade, networking stack upgrade | 1-2 times per week | No |
* Note that these frequencies are approximations, not guarantees. Compute Engine may occasionally perform maintenance more frequently than mentioned here.
Choosing availability policies
A VM instance's availability policy determines how it behaves when there is a maintenance event where Google must move your VM instance to another host machine. You can configure your VM instances to continue running while Compute Engine live migrates them to another host or you can choose to terminate your instances instead. You can update an instance's availability policy at any time to control how you want your VM instances to behave.
You can change an instance's availability policy by configuring the following two settings:
- The VM instance's maintenance behavior, which determines whether the instance is live migrated or terminated when there is a maintenance event.
- The instance's restart behavior, which determines whether the instance automatically restarts if it crashes or gets terminated.
The default maintenance behavior for instances is to live migrate, but you can change the behavior to terminate your instance during maintenance events instead.
Live migrate
By default, standard instances are set to live migrate, where Google Compute Engine automatically migrates your instance away from an infrastructure maintenance event, and your instance remains running during the migration. Your instance might experience a short period of decreased performance, although generally most instances should not notice any difference. This is ideal for instances that require constant uptime, and can tolerate a short period of decreased performance.
When Google Compute Engine migrates your instance, it reports a system event that is published to the list of zone operations. You can review this event by viewing the Compute Engine operations for a specific zone. Live migration events have the following operation type:
compute.instances.migrateOnHostMaintenance
Terminate and (optionally) restart
If you do not want your instance to live migrate, you can choose to terminate and optionally restart your instance. With this option, Google Compute Engine signals your instance to shut down, waits a short period of time for the instance to shut down cleanly, terminates the instance, and restarts it away from the maintenance event. This option is ideal for instances that demand constant, maximum performance, provided that your overall application is built to handle instance failures or reboots.
When Google Compute Engine terminates and reboots your instances, it reports a system event that is published to the list of zone operations. You can review this event by viewing the Compute Engine operations for a specific zone. Termination events have the following operation type:
compute.instances.terminateOnHostMaintenance
When your instance reboots, it uses the same persistent boot disk and reattaches any secondary persistent disks that you configured. The data on those disks persists through instance migration and restart.
Local SSD data does not persist through instance termination. When the instance restarts, it creates a new Local SSD that you must format and mount.
Automatic restart
If your instance is set to terminate when there is a maintenance event, or if your
instance crashes because of an underlying hardware issue, you can set
up Google Compute Engine to automatically restart the instance by setting the
automaticRestart field to true. This setting does not apply if the
instance is taken offline through a user action, such as calling
sudo shutdown, or during a zone outage.
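The rules in the paragraph above can be expressed as a small decision function. This is a minimal sketch for reasoning about the setting, not Compute Engine's implementation; the `StopCause` labels and the `will_restart` helper are hypothetical names introduced for illustration.

```python
from enum import Enum


class StopCause(Enum):
    """Why an instance stopped (hypothetical labels for this sketch)."""
    MAINTENANCE_TERMINATION = "maintenance_termination"
    HARDWARE_CRASH = "hardware_crash"
    USER_ACTION = "user_action"  # for example, running `sudo shutdown`
    ZONE_OUTAGE = "zone_outage"


# Causes for which the automaticRestart setting applies, per the text above.
_RESTARTABLE = {StopCause.MAINTENANCE_TERMINATION, StopCause.HARDWARE_CRASH}


def will_restart(automatic_restart: bool, cause: StopCause) -> bool:
    """Return True if the instance is expected to restart automatically."""
    return automatic_restart and cause in _RESTARTABLE
```

For example, `will_restart(True, StopCause.USER_ACTION)` is `False`, because the setting does not apply to user-initiated shutdowns.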
When Google Compute Engine automatically restarts your instance, it reports a system event that is published to the list of zone operations. You can review this event by viewing the Compute Engine operations for a specific zone. Automatic restart events have the following operation type:
compute.instances.automaticRestart
Viewing Compute Engine operations
You can view a list of completed operations through the
Google Cloud Platform Console, the
gcloud command-line tool, or the
Compute Engine API.
Console
To view a list of operations for your project, go to the Operations page.
- For more details on an operation, click on the operation summary. For
example, to view the migration details for the
my-instance instance, click on the Automatically migrate an instance operation.
gcloud
To view a list of operations for your project
using gcloud compute, use the operations list
sub-command.
To view the list of operations in a specified zone, add the --filter flag.
gcloud compute operations list --filter="zone:(ZONE)"
For example, to view the list of operations in us-central1-c, run the
following command:
gcloud compute operations list --filter="zone:(us-central1-c)"
NAME TYPE TARGET HTTP_STATUS STATUS TIMESTAMP
systemevent-1543845145000... compute.instances.migrateOnHostMaintenance us-central1-c/instances/my-instance 200 DONE 2018-12-03T05:52:25.000-08:00
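If you want to process the operations list in a script, one option is to ask gcloud for JSON output and parse it. The following is a sketch under assumptions: it relies on gcloud's general `--format=json` output flag, and the helper names `operations_list_command` and `list_zone_operations` are hypothetical. Running it requires an installed and authenticated gcloud tool.

```python
import json
import subprocess


def operations_list_command(zone: str) -> list:
    """Build the argument list for the zone-filtered operations list command."""
    return [
        "gcloud", "compute", "operations", "list",
        f"--filter=zone:({zone})",
        "--format=json",  # assumption: request machine-readable output
    ]


def list_zone_operations(zone: str) -> list:
    """Run the command and parse its JSON output (needs gcloud and credentials)."""
    result = subprocess.run(
        operations_list_command(zone),
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```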
API
API requests for operations must be specified at the global, region, or zone level. Live migration, instance termination, and automatic restarts are all zone-level operations.
For zone operations, make a GET request to the zoneOperations.list method.
GET https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/operations
where:
- [PROJECT_ID] is the project ID for this request.
- [ZONE] is the name of the zone for this request.
Leave the request body empty.
The following is a sample output for a zone operation request. This output shows the details of a host migration.
{
"kind": "compute#operation",
"id": "3216798767364213712",
"name": "systemevent-1543845145000-57c1e7574b840-a195b637-5ff74d9b",
"zone": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-c",
"operationType": "compute.instances.migrateOnHostMaintenance",
"targetLink": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-c/instances/my-instance",
"targetId": "3070988523247098025",
"status": "DONE",
"statusMessage": "Instance migrated during Compute Engine maintenance.",
"user": "system",
"progress": 100,
"insertTime": "2018-12-03T05:52:25.000-08:00",
"startTime": "2018-12-03T05:52:25.000-08:00",
"endTime": "2018-12-03T05:52:25.000-08:00",
"selfLink": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-c/operations/systemevent-1543845145000-57c1e7574b840-a195b637-5ff74d9b"
}
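Once you have a list response, you can pick out the maintenance-related system events by their operation type. The sketch below assumes the list response holds operations shaped like the sample above in an `items` array; the `summarize_system_events` helper and its description strings are hypothetical.

```python
# Operation types named in this guide, mapped to short descriptions.
SYSTEM_EVENTS = {
    "compute.instances.migrateOnHostMaintenance": "live migrated",
    "compute.instances.terminateOnHostMaintenance": "terminated for maintenance",
    "compute.instances.automaticRestart": "automatically restarted",
}


def summarize_system_events(response: dict) -> list:
    """Return (instance, description, endTime) for each maintenance-related event."""
    events = []
    # Assumption: operations are listed under the "items" key of the response.
    for op in response.get("items", []):
        description = SYSTEM_EVENTS.get(op.get("operationType"))
        if description is None:
            continue  # skip operations that are not covered by this guide
        # The instance name is the last path segment of targetLink.
        instance = op.get("targetLink", "").rsplit("/", 1)[-1]
        events.append((instance, description, op.get("endTime")))
    return events
```

Applied to a response containing the sample operation above, this yields one entry for my-instance with the description "live migrated".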
Setting availability policies
Configure an instance's maintenance behavior and automatic restart
setting using the onHostMaintenance and automaticRestart properties.
All instances are configured with default values unless you explicitly
specify otherwise.
- onHostMaintenance: Determines the behavior when a maintenance event occurs that might cause your instance to reboot.
  - [Default] migrate, which causes Compute Engine to live migrate an instance when there is a maintenance event.
  - terminate, which terminates an instance instead of migrating it.
- automaticRestart: Determines the behavior when an instance crashes or is terminated by the system.
  - [Default] true, so Compute Engine restarts an instance if the instance crashes or is terminated.
  - false, so Compute Engine does not restart an instance if the instance crashes or is terminated.
You can change the availability policies of an instance when you first
create an instance
or after the instance is created,
using the setScheduling
method.
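As an illustration of the two properties and their defaults, the following sketch models them as a small validated value object. The `Scheduling` class and its `to_body` method are hypothetical helpers, not part of any Google client library; the lowercase values follow the wording used in this guide.

```python
from dataclasses import dataclass

# Values for onHostMaintenance as written in this guide.
VALID_ON_HOST_MAINTENANCE = ("migrate", "terminate")


@dataclass(frozen=True)
class Scheduling:
    """Availability policy settings, with the documented defaults."""
    on_host_maintenance: str = "migrate"  # default: live migrate
    automatic_restart: bool = True        # default: restart automatically

    def __post_init__(self):
        if self.on_host_maintenance not in VALID_ON_HOST_MAINTENANCE:
            raise ValueError(
                f"onHostMaintenance must be one of {VALID_ON_HOST_MAINTENANCE}"
            )

    def to_body(self) -> dict:
        """Return the settings using the API property names."""
        return {
            "onHostMaintenance": self.on_host_maintenance,
            "automaticRestart": self.automatic_restart,
        }
```

Calling `Scheduling().to_body()` returns the default policy, and an unknown maintenance behavior is rejected before any request is built.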
Setting options during instance creation
Console
- In the GCP Console, go to the VM Instances page.
- Click Create instance.
- On the Create a new instance page, fill in the properties for your instance.
- Expand the Management, security, disks, networking, sole tenancy option.
- Under Availability policy, set the Automatic restart and On host maintenance options.
- Click Create to create the instance.
gcloud
To specify the availability policies of a new instance in gcloud compute, use
the --maintenance-policy flag to specify whether the instance is
migrated or terminated. By default, instances are automatically set
to restart unless you provide the --no-restart-on-failure flag.
gcloud compute instances create INSTANCE .. \
[--maintenance-policy MAINTENANCE_POLICY] \
[--no-restart-on-failure]
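The flag logic above can be sketched as a function that builds the command's argument list. This is an illustrative helper, not an official tool: `create_instance_command` is a hypothetical name, the uppercase `MIGRATE` and `TERMINATE` values are an assumption about what `--maintenance-policy` accepts, and any other flags your instance needs are omitted.

```python
def create_instance_command(name, maintenance_policy="MIGRATE", automatic_restart=True):
    """Build a `gcloud compute instances create` argument list."""
    # Assumption: the flag takes MIGRATE or TERMINATE.
    if maintenance_policy not in ("MIGRATE", "TERMINATE"):
        raise ValueError("maintenance_policy must be MIGRATE or TERMINATE")
    cmd = [
        "gcloud", "compute", "instances", "create", name,
        "--maintenance-policy", maintenance_policy,
    ]
    # Instances restart automatically unless this flag is provided.
    if not automatic_restart:
        cmd.append("--no-restart-on-failure")
    return cmd
```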
API
In the API, make a POST request to the following URL, replacing the
project and zone with your project ID and the zone of the instance:
https://www.googleapis.com/compute/v1/projects/myproject/zones/us-central1-f/instances
with the onHostMaintenance and automaticRestart parameters as part of the
request body:
{
"name": "example-instance",
"description": "Front-end for real-time ingest; don't migrate.",
...
// User options for influencing this Instance’s life cycle.
"scheduling": {
"onHostMaintenance": "migrate",
"automaticRestart": true // specifies that Google Compute Engine should automatically restart your instance
}
}
For more information, see the Instances reference documentation.
Updating options for an instance
Console
- Go to the VM Instances page in the Google Cloud Platform Console.
- Click the instance for which you want to change settings. The instance details page displays.
- From the instance details page, complete the following steps:
- Click the Edit button at the top of the page.
- Under Availability policies, set the Automatic restart and On host maintenance options as needed.
- Click Save.
gcloud
To update the availability policies of an instance, use the
instances set-scheduling
command with the same parameters and flags used in the instance
creation command above:
gcloud compute instances set-scheduling INSTANCE \
[--maintenance-policy BEHAVIOR] \
[--no-restart-on-failure | --restart-on-failure]
API
In the API, make a POST request to the following URL, replacing the project, zone, and instance name with your own project ID, the zone of the instance, and the name of the instance:
https://www.googleapis.com/compute/v1/projects/example-project/zones/us-central1-f/instances/example-instance/setScheduling
The body of your request must contain the new value for the availability policies:
{
"onHostMaintenance": "migrate",
"automaticRestart": true # specifies that Google Compute Engine should automatically restart your instance
}
For more information, see the
instances().setScheduling
reference documentation.
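The request described above could be assembled with the Python standard library as follows. This is a sketch under assumptions: the method path is assumed to include the instance name, the access token is assumed to come from your own authentication setup, and `set_scheduling_request` is a hypothetical helper. The function only builds the request and does not send it.

```python
import json
import urllib.request

BASE_URL = "https://www.googleapis.com/compute/v1"


def set_scheduling_request(project, zone, instance,
                           on_host_maintenance, automatic_restart, access_token):
    """Build (but do not send) a setScheduling POST request."""
    # Assumption: the method is addressed per instance, so the name is in the path.
    url = (f"{BASE_URL}/projects/{project}/zones/{zone}"
           f"/instances/{instance}/setScheduling")
    body = json.dumps({
        "onHostMaintenance": on_host_maintenance,
        "automaticRestart": automatic_restart,  # a JSON boolean, not a string
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",  # token obtained separately
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` with a valid token.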
Testing your availability policies
After you set your availability policies, you can simulate maintenance events to test the effects of these availability policies on your applications. For example, you might simulate a maintenance event on your instances in one of the following situations:
- You have instances that are configured to live migrate during maintenance events and you need to test the effects of live migration on your applications.
- You have batch jobs running on preemptible VM instances and you need to test how your applications handle preemption and shutdown of one or more instances.
- Your instances are configured to terminate and restart during maintenance events rather than live migrate, and you need to test how your applications handle this shutdown and restart process.
Simulated maintenance events are subject to specific API Rate Limits.
You can simulate a maintenance event on an instance using either the
gcloud command-line tool or an API request.
gcloud
Run the
instances simulate-maintenance-event
command to force an instance to activate its configured maintenance policy
action:
gcloud compute instances simulate-maintenance-event [INSTANCE_NAME] \
--zone [ZONE]
where:
- [INSTANCE_NAME] is the name of the instance where you want to simulate the maintenance event. You can specify multiple instance names to simulate maintenance events on more than one instance in the same zone.
- [ZONE] is the zone where the instance is located.
API
In the API, make a request to the
compute.instances.simulateMaintenanceEvent method in the
Compute Engine API:
POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances/[INSTANCE_NAME]/simulateMaintenanceEvent
where:
- [PROJECT_ID] is the project ID for this request.
- [INSTANCE_NAME] is the name of the instance where you want to simulate the maintenance event.
- [ZONE] is the zone where the instance is located.
For more information about this method, see the
instances().simulateMaintenanceEvent reference documentation.
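The API URL names a single instance, so simulating maintenance on several instances in a zone means one POST request per instance. The following sketch builds those URLs; `simulate_maintenance_urls` is a hypothetical helper, and sending the requests with valid credentials is left to your own client code.

```python
from urllib.parse import quote

BASE_URL = "https://www.googleapis.com/compute/v1"


def simulate_maintenance_urls(project, zone, instances):
    """Return one simulateMaintenanceEvent URL per instance name."""
    prefix = (f"{BASE_URL}/projects/{quote(project, safe='')}"
              f"/zones/{quote(zone, safe='')}/instances")
    # Each instance gets its own request URL.
    return [
        f"{prefix}/{quote(name, safe='')}/simulateMaintenanceEvent"
        for name in instances
    ]
```

Because simulated maintenance events are rate limited, a caller might want to space these requests out rather than send them all at once.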
Scheduled maintenance windows
Compute Engine offers the ability to choose a window of time each day when Google can perform maintenance events for your VM instances. By designating a window of time for maintenance events, you can protect performance-sensitive workloads from any interference caused by a maintenance event during critical times of the day. For example, you can:
- Schedule maintenance windows outside of peak hours. If there is a consistent time frame where your VMs are heavily utilized by users, you can schedule maintenance windows outside of those times.
- Schedule maintenance windows around special events. If there is an upcoming event where your VMs must remain running, you can schedule a maintenance window around that event.
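When planning a window, it can help to check whether your peak hours fall inside a candidate daily window, including windows that cross midnight. The sketch below is purely illustrative planning logic. This page does not describe how the feature is configured, so `in_daily_window` is a hypothetical helper and not part of any Compute Engine API.

```python
from datetime import time


def in_daily_window(moment: time, start: time, end: time) -> bool:
    """Return True if `moment` falls in the daily window [start, end).

    Windows where start > end are treated as wrapping past midnight.
    A window with start == end is treated as empty.
    """
    if start <= end:
        return start <= moment < end
    # The window wraps midnight, for example 22:00 to 02:00.
    return moment >= start or moment < end
```

For example, with a candidate window of 22:00 to 02:00, a peak period around 12:00 falls outside the window, while 23:30 falls inside it.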
This feature is invite-only. Sign up to be invited to use this feature.
What's next
- Learn more about live migration.
- Learn how to detect a live migration event.


