Getting Started

When you work with your sales or support contact to set up access to Data Transfer v2.0, you will be provided with a bucket name. You will also need to give your sales contact a Google Group, which lets you control access to your data files in Google Cloud Storage.

You can access your data using a utility, or you can write your own code.

Access data using gsutil

The gsutil tool is a command-line application, written in Python, that lets you access your data without having to do any coding. You could, for example, use gsutil as part of a script or batch file instead of creating custom applications.

To get started with gsutil, read the gsutil documentation. The tool prompts you for your credentials the first time you use it and then stores them for later use.

gsutil examples

You can list all of your files using gsutil as follows:

gsutil ls gs://[bucket_name]/

gsutil uses much of the same syntax as UNIX, including the wildcard asterisk (*), so you can list all NetworkImpression files:

gsutil ls gs://[bucket_name]/dcm_account6837_impression_*

It's also easy to download a file:

gsutil cp gs://[bucket_name]/dcm_account6837_impression_2015120100.log.gz .

You can copy your files from the dispersed DT Google buckets to your own Google Cloud Storage bucket using a Unix shell script. There are two options:

  • If you are using a Unix system, run the following gsutil command for all of your buckets daily:

    $ day=$(date --date="1 days ago" +"%m-%d-%Y")
    $ gsutil -m cp gs://{<dcmhashid_A>,<dcmhashid_B>,etc.}/*$day*.log.gz gs://<client_bucket>/
  • Alternatively, a slightly more involved option is to use a bash script:

    #!/bin/bash
    
    set -x
    
    # Include all of your DT bucket hash IDs here.
    buckets=(dfa_-hashid_A dfa_-hashid_B)
    day=$(date --date="1 days ago" +"%m-%d-%Y")
    for b in "${buckets[@]}"; do
        gsutil -m cp gs://$b/*$day*.log.gz gs://<client_bucket>/
    done

Access data programmatically

Google Cloud Storage provides APIs and samples for many programming languages that let you access your data programmatically. Below are the steps specific to Data Transfer v2.0 that you must take to build a working integration.

Get a service account

To get started using Data Transfer v2.0, you first need to use the setup tool, which guides you through creating a project in the Google API Console, enabling the API, and creating credentials.

To set up a new service account, do the following:

  1. Click Create credentials > Service account key.
  2. Choose whether to download the service account's public/private key as a standard P12 file, or as a JSON file that can be loaded by a Google API client library.

Your new public/private key pair is generated and downloaded to your machine; it serves as the only copy of this key. You are responsible for storing it securely.

Be sure to keep this window open; you will need the service account email in the next step.

Add a service account to your group

  • Go to Google Groups
  • Click My Groups and select the group you use to manage access to your DT v2.0 Cloud Storage bucket
  • Click Manage
  • Do not click Invite Members!
  • Click Direct add members
  • Copy the service account email from the previous step into the members box
  • Select No email
  • Click the Add button

Scope

Any scopes passed to Cloud Storage must be read-only.

For example, when using the Java client library, the correct scope to use is:

StorageScopes.DEVSTORAGE_READ_ONLY
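
As a rough illustration of how these pieces fit together, the sketch below (not an official sample) authenticates as the service account and lists the contents of your Data Transfer bucket using the Java API client library for Cloud Storage. The key file name, application name, and class name are hypothetical placeholders; [bucket_name] is the bucket your contact provided. It assumes the google-api-services-storage and google-auth-library-oauth2-http libraries are on the classpath.

import java.io.FileInputStream;
import java.util.Collections;

import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.json.gson.GsonFactory;
import com.google.api.services.storage.Storage;
import com.google.api.services.storage.StorageScopes;
import com.google.api.services.storage.model.Objects;
import com.google.api.services.storage.model.StorageObject;
import com.google.auth.http.HttpCredentialsAdapter;
import com.google.auth.oauth2.GoogleCredentials;

public class DataTransferReader {
  public static void main(String[] args) throws Exception {
    // Load the JSON key downloaded when you created the service account.
    // "service-account-key.json" is a placeholder for your own key file.
    GoogleCredentials credentials = GoogleCredentials
        .fromStream(new FileInputStream("service-account-key.json"))
        .createScoped(Collections.singletonList(StorageScopes.DEVSTORAGE_READ_ONLY));

    // Build a Storage client that authenticates as the service account.
    Storage storage = new Storage.Builder(
            GoogleNetHttpTransport.newTrustedTransport(),
            GsonFactory.getDefaultInstance(),
            new HttpCredentialsAdapter(credentials))
        .setApplicationName("dt-v2-reader")
        .build();

    // List the Data Transfer files in the bucket provided by your contact.
    Objects listing = storage.objects().list("[bucket_name]").execute();
    if (listing.getItems() != null) {
      for (StorageObject object : listing.getItems()) {
        System.out.println(object.getName());
      }
    }
  }
}

A production integration would also follow the page token returned by objects().list(), since a Data Transfer bucket can hold more files than fit in a single response.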