
Generating The Export

As Google is ending the legacy Google Apps Free edition, it was time for one of the organisations I manage to move on. One part of this move is to export all the data held at Google. This is done as follows:

  1. Open the Google Admin Console.
  2. Go to Data export.
  3. Start the export.
  4. Wait for the export to complete (you will receive an email).
  5. View the export (select View in the Archive column).
  6. This will take you to Google Cloud Storage.
  7. Select both the HTML file and the folder.
  8. Select Download.
  9. A popup will open, telling you that you have to use the gsutil utility to download the export.

As this is not a utility I use very often, and I don’t have Python installed on my local host, I will set up an environment in Docker to download the export. Note that the popup includes the command to run once the environment has been set up.

Downloading data

Start an interactive Python container with a bash shell. I will be using the folder W:\Takeout on my host as my working folder, mounted at /opt inside the container.

docker run --rm -it -v W:\Takeout:/opt python bash
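On a Linux or macOS host the bind mount takes a POSIX path instead of a drive letter. As a minimal sketch, the helper below only prints the equivalent command so it can be reviewed before running it. The function name takeout_docker_cmd and the example folder are my own assumptions, not part of Docker.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the docker command for a given host folder.
# It does not start a container; copy the printed line to run it.
takeout_docker_cmd() {
  local host_dir="$1"
  # Mount the host folder at /opt, as in the command above.
  printf 'docker run --rm -it -v "%s:/opt" python bash\n' "$host_dir"
}

# Example with an assumed folder below the current directory.
takeout_docker_cmd "$PWD/Takeout"
```

Quoting the volume argument keeps the command intact if the host folder contains spaces.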

First, we need to install the gsutil utility by installing the google-cloud-sdk package.

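# Register the Google Cloud SDK apt repository.
echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
# Import the repository signing key (apt-key is deprecated on recent Debian releases, so this step may need adapting on newer images).
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key --keyring /usr/share/keyrings/cloud.google.gpg  add -
# Refresh the package index and install the SDK.
apt-get update -y
apt-get install google-cloud-sdk -y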
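Before continuing, it can be worth confirming that the install actually put both tools on the PATH. A small sketch, where require_tools is a hypothetical helper name of my own:

```shell
#!/usr/bin/env bash
# Hypothetical helper: report any of the named commands missing from PATH.
# Returns 0 when all are found, 1 otherwise.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# gcloud and gsutil are both shipped in the google-cloud-sdk package.
if require_tools gcloud gsutil; then
  echo "gcloud and gsutil are available"
else
  echo "install did not complete; re-run the apt-get steps" >&2
fi
```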

We then need to authenticate. This is done with the gcloud utility, which is part of the google-cloud-sdk.

gcloud auth login

This will generate a link.

  1. Navigate to the link in your web browser (where you are already signed in to the Google Admin console).
  2. Select your account and authorise Google Cloud SDK to access it.
  3. This will generate an authentication code.
  4. Copy the authentication code.
  5. Paste it into the Docker session.
  6. You are now authenticated.

The complete session looks like this:

root@3ba12edc8937:/# gcloud auth login
Go to the following link in your browser:

    https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=xxxxxxxxxxx.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=openid+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fappengine.admin+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcompute+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Faccounts.reauth&state=wcEMJ90xxxxxxxxxxxxxxxxxxxKdj&prompt=consent&access_type=offline&code_challenge=qE9ArxxxxxxxxxxxxxxxxxxxxGFIOVn4N5S5_A7Ku9g&code_challenge_method=S256

Enter verification code: 4/1AX4XfxxxxxxxxxxxxxxxxxxxzaRLc2uIEVxd6b_hMibVsFhs2ZzKDpq5s4Q

You are now logged in as [admin@example.com].
Your current project is [None].  You can change this setting by running:
  $ gcloud config set project PROJECT_ID
root@3ba12edc8937:/#

Finally, navigate to the working folder and run the command shown in the popup from the Data export download step. The -m flag runs the copy in parallel, and -r copies the export folder recursively.

cd /opt
gsutil -m cp -r \
  "gs://takeout-export-0ac203dd-f8dd-4bef-96a5-xxxxxxxxxxxx/202201xxT170600Z/" \
  "gs://takeout-export-0ac203dd-f8dd-4bef-96a5-xxxxxxxxxxxx/Status Report.html" \
  .
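Once the copy finishes, a quick sanity check is to count what landed in the working folder and compare it against the Status Report. A rough sketch, where summarise_export is a hypothetical helper of my own and the argument is whichever folder you downloaded into:

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise what a download left in a folder.
summarise_export() {
  local dir="${1:-.}"
  local files
  # Count regular files at any depth below the folder.
  files=$(find "$dir" -type f | wc -l | tr -d ' ')
  echo "files: $files"
  # Total size on disk, human readable.
  du -sh "$dir"
}

# Run against the current folder (/opt in the container above).
summarise_export .
```

Because /opt is a bind mount, the same files should also be visible in W:\Takeout on the host.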