Exporting Organizational Data From Google Apps
Generating The Export
As Google is ending the legacy Google Apps Free edition, it was time for one of the organisations I manage to move on. One part of this step is to export all the data held at Google. The steps are:
- Open the Google Admin Console.
- Go to Data export.
- Start the export.
- Wait for the export to complete (you will receive an email).
- View the export (select View in the Archive column).
- This will take you to Google Cloud Storage.
- Select both the html file and the folder.
- Select Download.
- A popup will open, telling you that you have to use the gsutil utility to download the export.
As this is not a utility I use very often, and I don’t have Python installed on my local host, I will set up an environment in Docker to download the export. Note that the popup includes the command to run once the environment has been set up.
Downloading data
Start an interactive Python container with a bash shell. I will be using the folder W:\Takeout on my host as my working folder, mounted at /opt inside the container.
docker run --rm -it -v W:\Takeout:/opt python bash
First we need to install the gsutil utility, which is done by installing the google-cloud-sdk package.
echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
apt-get update -y
apt-get install google-cloud-sdk -y
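Before moving on, it can be worth confirming that the install actually put both tools on the PATH. The following is a minimal sketch rather than part of the original walkthrough; the helper name missing_tools is my own invention and not something the SDK provides.

```shell
# Hypothetical helper (not part of google-cloud-sdk): print every command
# from the argument list that cannot be found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the two utilities used in the rest of this post.
missing=$(missing_tools gcloud gsutil)
if [ -n "$missing" ]; then
  echo "Still missing:" $missing >&2
else
  echo "gcloud and gsutil are installed"
fi
```

If anything is reported as missing, re-running the apt-get steps is the first thing to try.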
We then need to authenticate. This is done with the gcloud utility, which is part of google-cloud-sdk.
gcloud auth login
This will generate a link.
- Navigate to the link in your web browser (where you are already signed in to the Google Admin console).
- Select your account and authorise Google Cloud SDK to access it.
- This will generate an authentication code.
- Copy the authentication code.
- Paste it into the Docker session.
- You are now authenticated.
The whole exchange looks like this:
root@3ba12edc8937:/# gcloud auth login
Go to the following link in your browser:
https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=xxxxxxxxxxx.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=openid+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fappengine.admin+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcompute+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Faccounts.reauth&state=wcEMJ90xxxxxxxxxxxxxxxxxxxKdj&prompt=consent&access_type=offline&code_challenge=qE9ArxxxxxxxxxxxxxxxxxxxxGFIOVn4N5S5_A7Ku9g&code_challenge_method=S256
Enter verification code: 4/1AX4XfxxxxxxxxxxxxxxxxxxxzaRLc2uIEVxd6b_hMibVsFhs2ZzKDpq5s4Q
You are now logged in as [admin@example.com].
Your current project is [None]. You can change this setting by running:
$ gcloud config set project PROJECT_ID
root@3ba12edc8937:/#
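If you come back to the container later and are unsure whether the login is still in place, the active account can be queried instead of logging in again. This is a small sketch under the assumption that gcloud is on the PATH; the wrapper name active_account is hypothetical, while the gcloud auth list flags are the ones shown in gcloud's own help.

```shell
# Hypothetical helper: print the account gcloud currently treats as active.
# Prints nothing if no account is active; fails if gcloud is not installed.
active_account() {
  command -v gcloud >/dev/null 2>&1 || return 1
  gcloud auth list --filter=status:ACTIVE --format="value(account)"
}

if account=$(active_account) && [ -n "$account" ]; then
  echo "Authenticated as $account"
else
  echo "No active gcloud account found" >&2
fi
```

The account printed should be the same one shown in the "You are now logged in as" line above.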
Finally, navigate to the working folder and run the command that the popup showed when trying to download from Google Data Export.
cd /opt
gsutil -m cp -r \
"gs://takeout-export-0ac203dd-f8dd-4bef-96a5-xxxxxxxxxxxx/202201xxT170600Z/" \
"gs://takeout-export-0ac203dd-f8dd-4bef-96a5-xxxxxxxxxxxx/Status Report.html" \
.
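Once the copy finishes, a quick look at what actually landed in the working folder can catch an interrupted download. The sketch below only inspects the local side and is not from the original steps; count_files is a hypothetical helper, and it assumes you are still in the working folder.

```shell
# Hypothetical helper: count the regular files below a directory
# (defaults to the current directory).
count_files() {
  find "${1:-.}" -type f | wc -l | tr -d ' '
}

# Summarise the current working folder (/opt inside the container).
echo "Files downloaded: $(count_files .)"
du -sh .
```

If you want something to compare against, gsutil ls -r or gsutil du -s can be run on the same gs:// paths to inspect the remote side.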