Wednesday, November 9, 2016

Replacing text & images with the Google Slides API with Python

NOTE: The code covered in this post is also available in a video walkthrough; however, the code here differs slightly, featuring some minor improvements over the code in the video.

Introduction

One of the critical things developers have not been able to do previously was access Google Slides presentations programmatically. To address this "shortfall," the Slides team pre-announced their first API a few months ago at Google I/O 2016—also see full announcement video (40+ mins). In early November, the G Suite product team officially launched the API, finally giving all developers access to build or edit Slides presentations from their applications.

In this post, I'll walk through a simple example featuring an existing Slides presentation template with a single slide. On this slide are placeholders for a presentation name and company logo, as illustrated below:

One of the obvious use cases that will come to mind is to take a presentation template replete with "variables" and placeholders, and auto-generate decks from the same source but created with different data for different customers. For example, here's what a "completed" slide would look like after the proxies have been replaced with "real data:"

Using the Google Slides API

We need to edit/write into a Google Slides presentation, meaning we need the read-write scope from the Slides API scopes below:
  • 'https://www.googleapis.com/auth/presentations' — Read-write access to Slides and Slides presentation properties
  • 'https://www.googleapis.com/auth/presentations.readonly' — View-only access to Slides presentations and properties
  • 'https://www.googleapis.com/auth/drive' — Full access to users' files on Google Drive
Why is the Google Drive API scope listed above? Well, think of it this way: APIs like the Google Sheets and Slides APIs were created to perform spreadsheet and presentation operations. However, importing/exporting, copying, and sharing are all file-based operations, which is where the Drive API fits in. If you need a review of its scopes, check out the Drive auth scopes page in the docs. Copying a file requires the full Drive API scope, which is why it's listed above. If you're not going to copy any files and are only performing actions with the Slides API, you can of course leave it out.
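As a rough illustration of that choice, here is a minimal sketch that builds the scope tuple depending on whether the script touches Drive at all. The helper name pick_scopes and its uses_drive flag are made up for this example and are not part of any Google library; the scope strings are the ones listed above.

```python
SLIDES_RW = 'https://www.googleapis.com/auth/presentations'
DRIVE_FULL = 'https://www.googleapis.com/auth/drive'


def pick_scopes(uses_drive):
    """Return the scopes to request (hypothetical helper, not a Google API)."""
    scopes = [SLIDES_RW]            # always needed to edit a presentation
    if uses_drive:
        # Only request Drive access when the script copies or looks up files.
        scopes.insert(0, DRIVE_FULL)
    return tuple(scopes)


print(pick_scopes(uses_drive=True))
```

The script in this post copies a template with the Drive API, so it asks for both scopes.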

Since we've fully covered the authorization boilerplate in earlier posts and videos, we're going to skip that here and jump right to the action.

Getting started

What are we doing in today's code sample? We start with a slide template file that has "variables" or placeholders for a title and an image. The application code will then replace these proxies with the actual desired text and image. The goal is that this scaffolding lets you automatically generate multiple slide decks, each "tweaked" with the "real" data that gets substituted into it.

The name of the title slide template file is stored in TMPLFILE, and the image we're using as the company logo is the Google Slides product icon, a file in my Google Drive whose filename is stored in the IMG_FILE variable. Be sure to use your own image and template files! These definitions plus the scopes to be used in this script are defined like this:
IMG_FILE = 'google-slides.png'     # use your own!
TMPLFILE = 'title slide template'  # use your own!
SCOPES = (
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/presentations',
)
Skipping past most of the OAuth2 boilerplate, let's move ahead to creating the API service endpoints. The Drive API name is (of course) 'drive', currently on 'v3', while the Slides API is 'slides' and 'v1' in the following call to create a signed HTTP client that's shared with a pair of calls to the apiclient.discovery.build() function to create the API service endpoints:
HTTP = creds.authorize(Http())
DRIVE =  discovery.build('drive',  'v3', http=HTTP)
SLIDES = discovery.build('slides', 'v1', http=HTTP)

Copy template file

The first step of the "real" app is to find and copy the template file TMPLFILE. To do this, we'll use DRIVE.files().list() to query for the file, then grab the first match found. Then we'll use DRIVE.files().copy() to copy the file and name it 'Google Slides API template DEMO':
rsp = DRIVE.files().list(q="name='%s'" % TMPLFILE).execute().get('files')[0]
DATA = {'name': 'Google Slides API template DEMO'}
print('** Copying template %r as %r' % (rsp['name'], DATA['name']))
DECK_ID = DRIVE.files().copy(body=DATA, fileId=rsp['id']).execute().get('id')
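The one-liner above assumes the filename contains no single quotes and that at least one file matches. A slightly more defensive variant might look like the sketch below; build_name_query and first_file are hypothetical helpers written for this post, not Drive API calls. They only prepare the q string and unpack the dictionary that files().list().execute() returns.

```python
def build_name_query(name):
    """Return a Drive `q` string matching files with exactly this name."""
    # Escape backslashes first, then single quotes, so a name such as
    # "Bob's deck" does not terminate the quoted string early.
    escaped = name.replace('\\', '\\\\').replace("'", "\\'")
    return "name='%s'" % escaped


def first_file(response):
    """Return the first file in a files().list() response, or None."""
    files = response.get('files', [])
    return files[0] if files else None


# Intended use with the DRIVE endpoint from this post (not executed here):
#   rsp = first_file(DRIVE.files().list(q=build_name_query(TMPLFILE)).execute())
#   if rsp is None:
#       raise SystemExit('Template %r not found' % TMPLFILE)
print(build_name_query('title slide template'))
```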

Find image placeholder

Next, we'll ask the Slides API to get the data on the first (and only) slide in the deck. Specifically, we want the dimensions of the image placeholder. Later on, we will use those properties when replacing it with the company logo, so that it will be automatically resized and centered into the same spot as the image placeholder.
The SLIDES.presentations().get() method is used to read the presentation metadata. Returned is a payload consisting of everything in the presentation: the masters, layouts, and of course, the slides themselves. We only care about the slides, so we get that from the payload. And since there's only one slide, we grab it at index 0. Once we have the slide, we loop through all of the elements on that page and stop when we find the rectangle (image placeholder):
print('** Get slide objects, search for image placeholder')
slide = SLIDES.presentations().get(presentationId=DECK_ID
       ).execute().get('slides')[0]
obj = None
for obj in slide['pageElements']:
    if obj['shape']['shapeType'] == 'RECTANGLE':
        break
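The loop above works for this template because every element on the slide is a shape and one of them is a rectangle. If your own template also contains images or tables, or has no rectangle at all, a lookup along the lines of this sketch may be safer. The function name find_shape and the sample_slide data are invented for illustration; the dictionary keys are the same ones the loop above reads.

```python
def find_shape(slide, shape_type='RECTANGLE'):
    """Return the first page element that is a shape of the given type.

    Elements without a 'shape' key (images, tables, ...) are skipped, and
    None is returned when nothing matches.
    """
    for element in slide.get('pageElements', []):
        if element.get('shape', {}).get('shapeType') == shape_type:
            return element
    return None


# Made-up slide payload containing only the keys this sketch reads.
sample_slide = {
    'objectId': 'p',
    'pageElements': [
        {'objectId': 'img1', 'image': {}},
        {'objectId': 'title1', 'shape': {'shapeType': 'TEXT_BOX'}},
        {'objectId': 'rect1', 'shape': {'shapeType': 'RECTANGLE'}},
    ],
}
placeholder = find_shape(sample_slide)
print(placeholder['objectId'])
```

Checking the result for None before building any requests avoids deleting the wrong element when the template has no placeholder.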

Find image file

At this point, the obj variable points to that rectangle. What are we going to replace it with? The company logo, which we now query for using the Drive API:
print('** Searching for icon file')
rsp = DRIVE.files().list(q="name='%s'" % IMG_FILE).execute().get('files')[0]
print(' - Found image %r' % rsp['name'])
img_url = '%s&access_token=%s' % (
        DRIVE.files().get_media(fileId=rsp['id']).uri, creds.access_token) 
The query code is similar to when we searched for the template file earlier. The trickiest thing about this snippet is that we need a full URL that points directly to the company logo. We use the DRIVE.files().get_media() method to create that request but don't execute it. Instead, we dig inside the request object itself and grab the file's URI and merge it with the current access token so what we're left with is a valid URL that the Slides API can use to read the image file and create it in the presentation.
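The string formatting above relies on the request URI already having a query string, so that '&access_token=' can simply be appended. A sketch that does not depend on that could use the standard library's URL helpers, as below. The function name with_access_token is hypothetical, the example URI is only my assumption of what the media URI looks like, and urllib.parse is Python 3 only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_access_token(uri, token):
    """Return uri with an access_token query parameter added."""
    parts = urlsplit(uri)
    # Keep whatever parameters are already present, then add the token.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(('access_token', token))
    return urlunsplit(parts._replace(query=urlencode(query)))


# FILE_ID and TOKEN are placeholders, not real values.
print(with_access_token(
    'https://www.googleapis.com/drive/v3/files/FILE_ID?alt=media', 'TOKEN'))
```

Whichever way the URL is built, it carries a live access token, so treat it like a credential and keep it out of logs.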

Replace text and image

Back to the Slides API for the final steps: replace the title (text variable) with the desired text, add the company logo with the same size and transform as the image placeholder, and delete the image placeholder as it's no longer needed:
print('** Replacing placeholder text and icon')
reqs = [
    {'replaceAllText': {
        'containsText': {'text': '{{NAME}}'},
        'replaceText': 'Hello World!'
    }},
    {'createImage': {
        'url': img_url,
        'elementProperties': {
            'pageObjectId': slide['objectId'],
            'size': obj['size'],
            'transform': obj['transform'],
        }
    }},
    {'deleteObject': {'objectId': obj['objectId']}},
]
SLIDES.presentations().batchUpdate(body={'requests': reqs},
        presentationId=DECK_ID).execute()
print('DONE')
Once all the requests have been created, send them to the Slides API then let the user know everything is done.
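If you plan to generate several decks from the same template, it may help to build the request list in a function rather than inline. The sketch below assembles the same three kinds of requests from its arguments; build_requests, text_vars, and the sample placeholder are hypothetical names and data for this example, and the request bodies mirror the ones shown above.

```python
def build_requests(slide_id, placeholder, img_url, text_vars):
    """Build batchUpdate requests for one slide (illustrative helper).

    text_vars maps a variable name such as 'NAME' to its replacement text;
    each name is wrapped as {{NAME}} to match the template's convention.
    """
    reqs = [
        {'replaceAllText': {
            'containsText': {'text': '{{%s}}' % name},
            'replaceText': value,
        }}
        for name, value in sorted(text_vars.items())
    ]
    # Add the logo using the placeholder's size and transform ...
    reqs.append({'createImage': {
        'url': img_url,
        'elementProperties': {
            'pageObjectId': slide_id,
            'size': placeholder['size'],
            'transform': placeholder['transform'],
        },
    }})
    # ... then remove the placeholder itself.
    reqs.append({'deleteObject': {'objectId': placeholder['objectId']}})
    return reqs


# Made-up placeholder with only the keys the function reads.
sample_placeholder = {'objectId': 'rect1', 'size': {}, 'transform': {}}
reqs = build_requests('p', sample_placeholder,
                      'https://example.com/logo.png', {'NAME': 'Hello World!'})
print(len(reqs))
```

The returned list can be passed as the 'requests' value in the batchUpdate() body exactly as in the snippet above.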

Conclusion

That's the entire script, just under 60 lines of code. If you watched the video, you may notice a few minor differences in the code. One is use of the fields parameter in the Slides API calls. They represent the use of field masks, which is a separate topic on its own. As you're learning the API now, it may cause unnecessary confusion, so it's okay to disregard them for now. The other difference is an improvement in the replaceAllText request—the old way in the video is now deprecated, so go with what we've replaced it with in this post.

If your template slide deck and image are in your Google Drive, and you've modified the filenames and run the script, you should get output that looks something like this:
$ python3 slides_template.py
** Copying template 'title slide template' as 'Google Slides API template DEMO'
** Get slide objects, search for image placeholder
** Searching for icon file
 - Found image 'google-slides.png'
** Replacing placeholder text and icon
DONE
Below is the entire script for your convenience which runs on both Python 2 and Python 3 (unmodified!). If I were to divide the script into major sections, they would be:
  • Get creds & build API service endpoints
  • Copy template file
  • Get image placeholder size & transform (for replacement image later)
  • Get secure URL for company logo
  • Build and send Slides API requests to...
    • Replace slide title variable with "Hello World!"
    • Create image with secure URL using placeholder size & transform
    • Delete image placeholder
Here's the complete script—by using, copying, and/or modifying this code or any other piece of source from this blog, you implicitly agree to its Apache2 license:
from __future__ import print_function

from apiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools

IMG_FILE = 'google-slides.png'      # use your own!
TMPLFILE = 'title slide template'   # use your own!
SCOPES = (
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/presentations',
)
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
HTTP = creds.authorize(Http())
DRIVE  = discovery.build('drive',  'v3', http=HTTP)
SLIDES = discovery.build('slides', 'v1', http=HTTP)

rsp = DRIVE.files().list(q="name='%s'" % TMPLFILE).execute().get('files')[0]
DATA = {'name': 'Google Slides API template DEMO'}
print('** Copying template %r as %r' % (rsp['name'], DATA['name']))
DECK_ID = DRIVE.files().copy(body=DATA, fileId=rsp['id']).execute().get('id')

print('** Get slide objects, search for image placeholder')
slide = SLIDES.presentations().get(presentationId=DECK_ID,
        fields='slides').execute().get('slides')[0]
obj = None
for obj in slide['pageElements']:
    if obj['shape']['shapeType'] == 'RECTANGLE':
        break

print('** Searching for icon file')
rsp = DRIVE.files().list(q="name='%s'" % IMG_FILE).execute().get('files')[0]
print(' - Found image %r' % rsp['name'])
img_url = '%s&access_token=%s' % (
        DRIVE.files().get_media(fileId=rsp['id']).uri, creds.access_token)

print('** Replacing placeholder text and icon')
reqs = [
    {'replaceAllText': {
        'containsText': {'text': '{{NAME}}'},
        'replaceText': 'Hello World!'
    }},
    {'createImage': {
        'url': img_url,
        'elementProperties': {
            'pageObjectId': slide['objectId'],
            'size': obj['size'],
            'transform': obj['transform'],
        }
    }},
    {'deleteObject': {'objectId': obj['objectId']}},
]
SLIDES.presentations().batchUpdate(body={'requests': reqs},
        presentationId=DECK_ID).execute()
print('DONE')
As with our other code samples, you can now customize it to learn more about the API or integrate it into other apps for your own needs, whether that's a mobile frontend, a sysadmin script, or a server-side backend!

Code challenge

Add more slides and/or text variables and modify the script to replace them too. EXTRA CREDIT: Change the image-based image placeholder to a text-based image placeholder, say a textbox with the text, "{{COMPANY_LOGO}}" and use the replaceAllShapesWithImage request to perform the image replacement. By making this one change, your code should be simplified from the image-based image replacement solution we used in this post.
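As a starting point for the extra credit, the request body might look roughly like the sketch below. The field names (imageUrl, replaceMethod, containsText) are as I recall them from the Slides API reference, so check them against the current docs before relying on this; logo_request is a hypothetical helper name.

```python
def logo_request(img_url, tag='{{COMPANY_LOGO}}'):
    """Build one replaceAllShapesWithImage request (sketch, verify fields)."""
    return {'replaceAllShapesWithImage': {
        'imageUrl': img_url,
        # CENTER_INSIDE scales the image to fit within the shape's bounds.
        'replaceMethod': 'CENTER_INSIDE',
        'containsText': {'text': tag},
    }}


req = logo_request('https://example.com/logo.png')
print(sorted(req['replaceAllShapesWithImage']))
```

With this approach there is no need to fetch the slide, search for the rectangle, or delete it afterwards.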

4 comments:

  1. Very interesting - I have been looking for this! Thanks.

  2. Hi, thanks for writing. I'm not on the team any more but checked the internal bug, and it looks like the product team knows about the issue, however I don't know if there are any resources available to work on it. There seems to be some activity though, so it's definitely on the radar. The best you can do is to star that issue and add any additional examples of failures in as technical detail as possible. There is another similar issue that you should take a look at too: https://issuetracker.google.com/148814758 Both this and your bug are tied to the same internal bug.

  3. Thank you a lot for your answer and the link to the other issue.
    Do you know if a potential quick fix would be to store the images in another cloud storage like dropbox ?
    Thanks again !

  4. I'm not 100% sure, but I think all that's necessary is that your app can reach it via URL (without having to go through authentication, etc.) If the content is sensitive, make them available temporarily while you perform the import, then either shut it down from external access or delete the files since you've imported them into your presentation(s).
