This page explains how to export a table from HBase or Cloud Bigtable as a series of Hadoop sequence files.
If you're migrating from HBase, you can export your table from HBase, then import the table into Cloud Bigtable.
If you're backing up or moving a Cloud Bigtable table, you can export your table from Cloud Bigtable, then import the table back into Cloud Bigtable.
You can also use a Cloud Dataflow template to export sequence files to Cloud Storage.
Exporting a table from HBase
Identifying the table's column families
When you export a table, you should record a list of column families that the table uses. You will need this information when you import the table into Cloud Bigtable.
To get a list of column families in your table:
1. Log into your HBase server.

2. Start the HBase shell:

   hbase shell

3. Use the describe command to get information about the table you plan to export, replacing [TABLE_NAME] with the appropriate value:

   describe '[TABLE_NAME]'

   The describe command prints detailed information about the table's column families.
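For example, for a table with two column families, the shell prints one block per family. The table and family names below are hypothetical, and the exact attributes shown vary by HBase version; this is an illustrative sketch, not literal output:

```shell
describe 'my-table'
# Table my-table is ENABLED
# COLUMN FAMILIES DESCRIPTION
# {NAME => 'cf1', VERSIONS => '1', ...}
# {NAME => 'cf2', VERSIONS => '1', ...}
```

Record the NAME values (here, cf1 and cf2); these are the column families you will recreate when you import the table.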
Exporting sequence files
The HBase server provides a utility that exports a table as a series of Hadoop sequence files. See the HBase documentation for instructions on using this utility.
To reduce transfer time, you can export compressed sequence files from HBase. The Cloud Bigtable importer supports both compressed and uncompressed sequence files.
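The utility in question is HBase's Export MapReduce job. A sketch of both the plain and compressed invocations, assuming the hbase binary is on your PATH; the table name and output directories are hypothetical, and the -D properties are standard Hadoop output-compression settings shown as one possible configuration:

```shell
# Export the table as Hadoop sequence files into HDFS.
hbase org.apache.hadoop.hbase.mapreduce.Export 'my-table' /export/my-table

# Optionally compress the output to reduce transfer time.
hbase org.apache.hadoop.hbase.mapreduce.Export \
    -D mapreduce.output.fileoutputformat.compress=true \
    -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec \
    'my-table' /export/my-table-compressed
```

See the HBase documentation for the utility's full set of options, such as limiting the export by version count or time range.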
Copying sequence files to Cloud Storage
Use the gsutil tool to copy the exported sequence files to a Cloud Storage
bucket, replacing values in brackets with the appropriate values:
gsutil cp [SEQUENCE_FILES] gs://[BUCKET_PATH]
See the gsutil documentation for details about the gsutil cp
command.
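For example, to copy a directory of exported sequence files in parallel (the local directory and bucket names are hypothetical):

```shell
# -m parallelizes the transfer across files; -r recurses into the directory.
gsutil -m cp -r /export/my-table gs://my-export-bucket/my-table
```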
Exporting a table from Cloud Bigtable
Cloud Bigtable provides a utility that uses a Cloud Dataflow job to export a table as a series of Hadoop sequence files. The Cloud Dataflow job runs on Google Cloud Platform.
Identifying the table's column families
When you export a table, you should record a list of column families that the table uses. You will need this information when you import the table.
To get a list of column families in your table:
1. Install the cbt tool:

   gcloud components update
   gcloud components install cbt

2. Use the ls command to get a list of column families in the table you plan to export:

   cbt -instance [INSTANCE_ID] ls [TABLE_NAME]
Creating a Cloud Storage bucket
You can store your exported table in an existing Cloud Storage bucket
or in a new bucket. To create a new bucket, use the gsutil tool, replacing
[BUCKET_NAME] with the appropriate value:
gsutil mb gs://[BUCKET_NAME]
See the gsutil documentation for details about the gsutil mb
command.
Exporting sequence files
To export the table as a series of sequence files:
1. Download the import/export JAR file, which includes all of the required dependencies:

   curl -f -O http://repo1.maven.org/maven2/com/google/cloud/bigtable/bigtable-beam-import/1.10.0/bigtable-beam-import-1.10.0-shaded.jar

2. Run the following command to export the table, replacing values in brackets with the appropriate values. Make sure that [EXPORT_PATH] and [TEMP_PATH] are paths that do not yet exist in your Cloud Storage bucket:

   java -jar bigtable-beam-import-1.10.0-shaded.jar export \
       --runner=dataflow \
       --project=[PROJECT_ID] \
       --bigtableInstanceId=[INSTANCE_ID] \
       --bigtableTableId=[TABLE_ID] \
       --destinationPath=gs://[BUCKET_NAME]/[EXPORT_PATH] \
       --tempLocation=gs://[BUCKET_NAME]/[TEMP_PATH] \
       --maxNumWorkers=[10x_NUMBER_OF_NODES] \
       --zone=[DATAFLOW_JOB_ZONE]

   For example, if the clusters in your Cloud Bigtable instance have 3 nodes:

   java -jar bigtable-beam-import-1.10.0-shaded.jar export \
       --runner=dataflow \
       --project=my-project \
       --bigtableInstanceId=my-instance \
       --bigtableTableId=my-table \
       --destinationPath=gs://my-export-bucket/my-table \
       --tempLocation=gs://my-export-bucket/jar-temp \
       --maxNumWorkers=30 \
       --zone=us-east1-c

The export job saves your table to the Cloud Storage bucket as a set of Hadoop sequence files. You can use the Google Cloud Platform Console to monitor the export job while it runs.

When the job is complete, it prints the message Job finished with status DONE to the console.
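The --maxNumWorkers value is ten times the number of nodes in the instance's clusters; a minimal shell sketch of that rule of thumb, plus a listing of the exported files (the bucket and path are hypothetical):

```shell
# Compute the recommended --maxNumWorkers value: 10x the node count.
NODES=3
MAX_WORKERS=$((NODES * 10))
echo "--maxNumWorkers=${MAX_WORKERS}"
# --maxNumWorkers=30

# After the job finishes, verify the sequence files landed in the bucket
# (hypothetical bucket and export path):
# gsutil ls gs://my-export-bucket/my-table/
```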
What's next
Learn how to import sequence files into Cloud Bigtable.


