Under certain conditions, a Google Compute Engine instance might stop accepting SSH connections. There are many possible reasons for this. Some common causes of SSH connection issues are as follows:
- The instance has a full disk. Check your disk space and clean it up as needed.
- The sshd daemon is not configured properly. Review the user guide for your operating system to ensure that your sshd_config file is set up correctly.
- OS Login is enabled on the instance. You cannot use both metadata-based SSH keys and OS Login to connect to an instance. If OS Login is enabled, connecting with metadata-based SSH keys is disabled.
This topic describes a number of tips and approaches to help troubleshoot and resolve some of the most common SSH issues.
Requirements
You can run most of your troubleshooting steps from your local workstation. To use a local Linux or Windows workstation to troubleshoot a VM instance, you must first prepare the workstation.
Prepare your workstation with the following steps:
- Install or update to the latest version of the gcloud command-line tool.
- Install the nmap network discovery and security auditing tool for your operating system. You will use this tool to test the network connection to your VM instance.
- Set environment variables.
Set environment variables
You can set environment variables for any parameters that are used frequently in this troubleshooting guide, such as the instance name and the name of the persistent boot disk for the affected instance.
Set environment variables on your local workstation.
Linux or macOS
On a Linux or macOS workstation, use the export command.
export PROB_INSTANCE='[INSTANCE_NAME]'
export BOOT_DISK='[BOOT_DISK_NAME]'
where:
- [INSTANCE_NAME] is the name of the instance that you are troubleshooting.
- [BOOT_DISK_NAME] is the name of the persistent boot disk for the instance that you are troubleshooting.
For example, if your instance is named instance1 and your boot disk is named
disk1, run the following commands:
export PROB_INSTANCE='instance1'
export BOOT_DISK='disk1'
Windows
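On a Windows workstation, use the set command. Do not wrap the values in quotation marks, because the Command Prompt stores the quotation marks as part of the value.
set PROB_INSTANCE=[INSTANCE_NAME]
set BOOT_DISK=[BOOT_DISK_NAME]
where:
- [INSTANCE_NAME] is the name of the instance that you are troubleshooting.
- [BOOT_DISK_NAME] is the name of the persistent boot disk for the instance that you are troubleshooting.
For example, if your instance is named instance1 and your boot disk is named disk1, run the following commands:
set PROB_INSTANCE=instance1
set BOOT_DISK=disk1
The commands in this guide use the Linux and macOS $VARIABLE syntax. In the Windows Command Prompt, reference the variables as %PROB_INSTANCE% and %BOOT_DISK% instead.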
Test connectivity
You might not be able to SSH to a VM instance because of connectivity issues linked to firewalls, network connection, or the user account. Follow the steps in this section to identify any connectivity issues.
Check your firewall rules
Google Compute Engine provisions each project with a default set of firewall
rules which permit SSH traffic. If the default firewall rule that permits SSH
connections is somehow removed, you'll be unable to access your instance. Check
your list of firewalls with the gcloud compute command-line tool and ensure
the default-allow-ssh rule is present.
On your local workstation, run the following command:
gcloud compute firewall-rules list
If the firewall rule is missing, add it back:
gcloud compute firewall-rules create default-allow-ssh --allow tcp:22
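The check for the default-allow-ssh rule can also be scripted rather than read by eye. The following is a minimal sketch, not part of the official tooling: has_firewall_rule is a hypothetical helper name, and it assumes the rule names arrive one per line on standard input, which is what the value(name) output format of gcloud produces. A differently named rule that allows tcp:22 would also permit SSH, so a missing name here is a hint and not proof of a firewall problem.

```shell
# Hypothetical helper: succeeds only if the rule name passed as $1 appears
# as a whole line on standard input.
has_firewall_rule() {
  grep -Fxq -- "$1"
}

# Usage on the workstation (assumes gcloud is installed and authenticated):
#   gcloud compute firewall-rules list --format='value(name)' \
#     | has_firewall_rule default-allow-ssh \
#     && echo "rule present" || echo "rule missing"
```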
Test the network connection
You can use the nmap tool to connect to your
instance on port 22, and see if the network connection is working. If you connect
and see 22/tcp open ssh, your network
connection is working, and you can rule out firewall problems.
Use the gcloud tool to obtain the external natIP for your instance:
gcloud compute instances describe $PROB_INSTANCE --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
198.51.100.1
Test the network connection to your instance. Run the nmap command:
nmap [EXTERNAL_IP]
where [EXTERNAL_IP] is the external IP of the instance.
For example, if the instance has the external IP 198.51.100.1, run the following command:
user@local:~$ nmap 198.51.100.1
Starting Nmap 7.70 ( https://nmap.org ) at 2019-03-18 16:04 Greenwich Standard Time
Nmap scan report for 229.30.196.35.bc.googleusercontent.com (198.51.100.1)
Host is up (0.0061s latency).
Not shown: 998 filtered ports
PORT STATE SERVICE
22/tcp open ssh
Nmap done: 1 IP address (1 host up) scanned in 6.22 seconds
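If you want to use the nmap result in a script, you can test the output for the open port instead of reading it. This is a rough sketch under stated assumptions: ssh_port_open is a hypothetical helper, EXTERNAL_IP is a hypothetical variable holding the address obtained above, and the pattern assumes the default nmap port table layout shown in the example output.

```shell
# Hypothetical helper: succeeds if nmap output on standard input reports
# TCP port 22 in the "open" state.
ssh_port_open() {
  grep -Eq '^22/tcp[[:space:]]+open([[:space:]]|$)'
}

# Usage on the workstation (scans only port 22 to keep the check fast):
#   nmap -p 22 "$EXTERNAL_IP" | ssh_port_open \
#     && echo "port 22 reachable" || echo "port 22 not reachable"
```

A state of filtered or closed makes the helper fail, which points back to firewall rules or to sshd not listening on the instance.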
Connect as a different user
The issue that prevents you from logging in might be limited to your user
account. For example, the permissions on the ~/.ssh/authorized_keys file on the
instance might not be set correctly for the user.
Try logging in as a different user with the gcloud tool by specifying another
username with the SSH request. The gcloud tool will update the project's
metadata to add the new user and allow SSH access.
user@local:~$ gcloud compute ssh [USER]@$PROB_INSTANCE
where [USER] is a new username to log in with.
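Once you are logged in as the other user, you can inspect the permissions of the affected account. The sketch below is illustrative only: check_mode is a hypothetical helper, it relies on the GNU form of stat found on most Linux images, and the modes 700 for the .ssh directory and 600 for authorized_keys are the conventional values, not ones stated in this guide. Other modes can also work as long as the files are not writable by other users.

```shell
# Hypothetical helper: compares the octal mode of a path with an expected
# value and prints a message when they differ or the path is unreadable.
check_mode() (
  path=$1
  expected=$2
  # stat -c is the GNU coreutils form, which most Linux images ship.
  actual=$(stat -c '%a' "$path" 2>/dev/null) || {
    echo "cannot stat: $path"
    exit 1
  }
  if [ "$actual" != "$expected" ]; then
    echo "$path has mode $actual, expected $expected"
    exit 1
  fi
)

# Usage on the instance (run with sudo if the home directory belongs to
# another user; [USER] is the account that cannot log in):
#   check_mode /home/[USER]/.ssh 700
#   check_mode /home/[USER]/.ssh/authorized_keys 600
```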
Debug the issue in the serial console
In some cases you might be able to find connection errors by reviewing the logs from the serial console. You can access the serial console from your local workstation using a browser.
Enable read-write access to an instance's serial console so you can log into the console and troubleshoot problems with the instance. This is particularly useful when you cannot log in with SSH or if the instance has no connection to the network. The serial console remains accessible in both of these situations.
To learn how to enable interactive access and connect to an instance's serial console, read Interacting with the Serial Console.
Inspect the VM instance without shutting it down
You might have an instance that you cannot connect to that continues to correctly serve production traffic. In this case, you might want to inspect the disk without interrupting the instance.
To inspect the disk, take a snapshot of the boot disk, create a new disk from that snapshot, create a temporary instance, and then attach and mount the new persistent disk on the temporary instance.
Create a new VPC network to host your cloned instance:
gcloud compute networks create debug-network
Add a firewall rule to allow SSH connections to the network:
gcloud compute firewall-rules create debug-network-allow-ssh --network debug-network --allow tcp:22
Create a snapshot of the boot disk:
gcloud compute disks snapshot $BOOT_DISK --snapshot-names debug-disk-snapshot
Create a new disk with the snapshot you just created:
gcloud compute disks create example-disk-debugging --source-snapshot debug-disk-snapshot
Create a new debugging instance without an external IP address:
gcloud compute instances create debugger --network debug-network --no-address
Attach the debugging disk to the instance:
gcloud compute instances attach-disk debugger --disk example-disk-debugging
Follow the instructions to connect to an instance without an external IP address.
Once logged into the debugger instance, troubleshoot the instance. For example, you can look at the instance logs:
$ sudo su -
$ mkdir /mnt/$PROB_INSTANCE
$ mount /dev/disk/by-id/scsi-0Google_PersistentDisk_example-disk-debugging /mnt/$PROB_INSTANCE
$ cd /mnt/$PROB_INSTANCE/var/log
# Identify the issue preventing ssh from working
$ ls
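To narrow down the log review, you can search the mounted log directory for sshd entries. The following is a sketch and makes assumptions the guide does not state: find_sshd_log_lines is a hypothetical helper, and the file names cover common defaults only (auth.log on Debian and Ubuntu, secure on Red Hat based images, plus the general messages and syslog files). Your image might log elsewhere or rotate these files under different names.

```shell
# Hypothetical helper: prints sshd-related lines from the common log files
# that exist in the directory passed as $1.
find_sshd_log_lines() (
  log_dir=$1
  for name in auth.log secure messages syslog; do
    file="$log_dir/$name"
    if [ -f "$file" ]; then
      echo "== $file =="
      grep 'sshd' "$file" || echo "(no sshd entries)"
    fi
  done
)

# Usage on the debugger instance, from the mounted log directory:
#   find_sshd_log_lines . | less
```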
Use a startup script
If none of the above helped, you can create a startup script to collect information right after the instance starts. Follow the instructions for running a startup script.
Afterward, reset the instance with gcloud compute instances reset so that the
new metadata takes effect.
Alternatively, you can also recreate your instance with a diagnostic startup
script:
Run gcloud compute instances delete with the --keep-disks flag:
gcloud compute instances delete $PROB_INSTANCE --keep-disks boot
Add a new instance with the same disk and specify your startup script:
gcloud compute instances create new-instance --disk name=$BOOT_DISK,boot=yes --metadata startup-script-url=URL
As a starting point, you can use the compute-ssh-diagnostic script to collect diagnostics information for most common issues.
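If you prefer to write your own script, a diagnostic startup script can be quite small. The sketch below is an illustration and is not the compute-ssh-diagnostic script: collect_ssh_diagnostics is a hypothetical function name, and the three checks (disk usage, a listener on port 22, and an sshd configuration test) are assumptions chosen to match the common causes listed at the top of this guide. Every command is written to tolerate failure so that the script still finishes on images where a tool is missing.

```shell
#!/bin/sh
# Hypothetical diagnostic startup script. Startup scripts run as root, and
# their output can be read from the serial console output of the instance.
collect_ssh_diagnostics() {
  echo "=== disk usage ==="
  df -h 2>&1 || true

  echo "=== listeners on port 22 ==="
  # Use ss where available and fall back to netstat on older images.
  (ss -tln 2>/dev/null || netstat -tln 2>/dev/null) | grep ':22 ' \
    || echo "nothing appears to be listening on port 22"

  echo "=== sshd configuration test ==="
  if command -v sshd >/dev/null 2>&1; then
    # sshd -t checks the configuration and keys without starting the daemon.
    sshd -t 2>&1 && echo "sshd configuration test passed"
  else
    echo "sshd binary not found in PATH"
  fi
  return 0
}

collect_ssh_diagnostics
```

After the instance restarts, you can review the collected output from your workstation with gcloud compute instances get-serial-port-output $PROB_INSTANCE.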
Use your disk on a new instance
If the other steps in this document do not work for you and you need to recover data from your persistent boot disk, you can delete the instance while keeping its boot disk, and then attach that disk as a secondary disk on a new instance.
gcloud compute instances delete $PROB_INSTANCE --keep-disks=boot
gcloud compute instances create new-instance --disk name=$BOOT_DISK,boot=no,auto-delete=no
gcloud compute ssh new-instance


