Thursday, August 6, 2015

Accessing Gmail from Python (plus BONUS)

NOTE: The code covered in this blogpost is also available in a video walkthrough here.

Introduction

The last several posts have illustrated how to connect to public/simple and authorized Google APIs. Today, we're going to demonstrate accessing the Gmail (another authorized) API. Yes, you read that correctly... "API." In the old days, you accessed mail services with standard Internet protocols such as IMAP/POP and SMTP. However, while they are standards, they haven't kept up with modern-day email usage and the developer needs that go along with it. In comes the Gmail API, which provides CRUD access to email threads and drafts along with messages, search queries, management of labels (like folders), and domain administration features that are an extra concern for enterprise developers.

Earlier posts demonstrate the structure and "how-to" use Google APIs in general, so the most recent posts, including this one, focus on solutions and apps, and use of specific APIs. Once you review the earlier material, you're ready to start with Gmail scopes then see how to use the API itself.

    Gmail API Scopes

    Below are the Gmail API scopes of authorization. We're listing them in most-to-least restrictive order because that's the order you should consider using them in: use the most restrictive scope you possibly can while still allowing your app to do its work. This makes your app more secure and may prevent inadvertently going over any quotas, or accessing, destroying, or corrupting data. Also, users are less hesitant to install your app if it asks only for more restricted access to their inboxes.
    • 'https://www.googleapis.com/auth/gmail.readonly' — Read-only access to all resources + metadata
    • 'https://www.googleapis.com/auth/gmail.send' — Send messages only (no inbox read nor modify)
    • 'https://www.googleapis.com/auth/gmail.labels' — Create, read, update, and delete labels only
    • 'https://www.googleapis.com/auth/gmail.insert' — Insert and import messages only
    • 'https://www.googleapis.com/auth/gmail.compose' — Create, read, update, delete, and send email drafts and messages
    • 'https://www.googleapis.com/auth/gmail.modify' — All read/write operations except for immediate & permanent deletion of threads & messages
    • 'https://mail.google.com/' — All read/write operations (use with caution)

    Using the Gmail API

    We're going to create a sample Python script that goes through your Gmail threads and looks for those which have more than 2 messages, for example, if you're seeking particularly chatty threads on mailing lists you're subscribed to. Since we're only peeking at inbox content, the only scope we'll request is 'gmail.readonly', the most restrictive scope. The API string is 'gmail' which is currently on version 1, so here's the call to apiclient.discovery.build() you'll use:

    GMAIL = build('gmail', 'v1', http=creds.authorize(Http()))

    Note that all the lines of code above that call are predominantly boilerplate (explained in earlier posts). Anyway, once you have an established service endpoint with build(), you can use the list() method of the threads service to request the thread data. The one required parameter is the user's Gmail address. A special value of 'me' has been set aside for the currently authenticated user.
    threads = GMAIL.users().threads().list(userId='me').execute().get('threads', [])
    If all goes well, the (JSON) response payload will (not be empty or missing and) contain a sequence of threads that we can loop over. For each thread, we need to fetch more info, so we issue a second API call for that. Specifically, we care about the number of messages in a thread:
    for thread in threads:
        tdata = GMAIL.users().threads().get(userId='me', id=thread['id']).execute()
        nmsgs = len(tdata['messages'])
    
    We're seeking only threads with more than 2 (that means at least 3) messages, discarding the rest. If a thread meets that criterion, scan the first message and cycle through the email headers looking for the "Subject" line to display to users, skipping the remaining headers as soon as we find it:
        if nmsgs > 2:
            msg = tdata['messages'][0]['payload']
            subject = ''
            for header in msg['headers']:
                if header['name'] == 'Subject':
                    subject = header['value']
                    break
            if subject:
                print('%s (%d msgs)' % (subject, nmsgs))
    
    If you're on many mailing lists, this may give you more messages than desired, so feel free to up the threshold from 2 to 50, 100, or whatever makes sense for you. (In that case, you should use a variable.) Regardless, that's pretty much the entire script save for the OAuth2 code that we're so familiar with from previous posts. The script is posted below in its entirety, and if you run it, you'll see an interesting collection of threads... YMMV depending on what messages are in your inbox:
    $ python3 gmail_threads3.py
    [Tutor] About Python Module to Process Bytes (3 msgs)
    Core Python book review update (30 msgs)
    [Tutor] scratching my head (16 msgs)
    [Tutor] for loop for long numbers (10 msgs)
    [Tutor] How to show the listbox from sqlite and make it searchable? (4 msgs)
    [Tutor] find pickle and retrieve saved data (3 msgs)
    

    BONUS: Python 3!

    You may have noticed above that I named the script gmail_threads3.py... why the "3"? Well, as of Mar 2015 (formally in Apr 2015 when the docs were updated), support for Python 3 was added to Google APIs Client Library (3.3+)! This update was a long time coming (relevant GitHub thread), and allows Python 3 developers to write code that accesses Google APIs. If you're already running 3.x, you can use its pip command (pip3) to install the Client Library:

    $ pip3 install -U google-api-python-client

    Because of this, unlike previous blogposts, we're deliberately going to avoid use of the print statement and switch to the print() function instead. If you're still running Python 2, be sure to add the following import so that the code will also run in your 2.x interpreter:

    from __future__ import print_function

    Conclusion

    To find out more about the input parameters as well as all the fields that are in the response, take a look at the docs for threads().list(). For more information on what other operations you can execute with the Gmail API, take a look at the reference docs and check out the companion video for this code sample. That's it!

    Below is the entire script for your convenience which runs on both Python 2 and Python 3 (unmodified!):
    #!/usr/bin/env python
    
    from __future__ import print_function
    from apiclient.discovery import build
    from httplib2 import Http
    from oauth2client import file, client, tools
    
    CLIENT_SECRET = 'client_secret.json'
    SCOPES = 'https://www.googleapis.com/auth/gmail.readonly'
    
    store = file.Storage('storage.json')
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET, SCOPES)
        creds = tools.run(flow, store)
    GMAIL = build('gmail', 'v1', http=creds.authorize(Http()))
    
    threads = GMAIL.users().threads().list(userId='me').execute().get('threads', [])
    for thread in threads:
        tdata = GMAIL.users().threads().get(userId='me', id=thread['id']).execute()
        nmsgs = len(tdata['messages'])
    
        if nmsgs > 2:
            msg = tdata['messages'][0]['payload']
            subject = ''
            for header in msg['headers']:
                if header['name'] == 'Subject':
                    subject = header['value']
                    break
            if subject:
                print('%s (%d msgs)' % (subject, nmsgs))
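
One thing to keep in mind when customizing the script: threads().list() returns results one page at a time, so the loop above only sees the first page. Below is a hedged sketch of walking every page with the list_next() helper that the client library generates for paginated collections. iter_threads() is a hypothetical name, and service stands for the GMAIL endpoint built earlier:

```python
def iter_threads(service, user_id='me'):
    """Yield every thread stub across all result pages (a sketch)."""
    threads_api = service.users().threads()
    request = threads_api.list(userId=user_id)
    while request is not None:
        response = request.execute()
        for thread in response.get('threads', []):
            yield thread
        # list_next() returns None once there is no further page
        request = threads_api.list_next(request, response)
```

With that in place, "for thread in iter_threads(GMAIL):" could stand in for the single list() call, at the cost of more API requests on large mailboxes.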
    
    You can now customize this code for your own needs, for a mobile frontend, a server-side backend, or to access other Google APIs. If you want to see another example of using the Gmail API (displaying all your inbox labels), check out the Python Quickstart example in the official docs or its equivalent in Java (server-side, Android), iOS (Objective-C, Swift), C#/.NET, PHP, Ruby, JavaScript (client-side, Node.js), or Go. That's it... hope you find these code samples useful in helping you get started with the Gmail API!

    EXTRA CREDIT: To test your skills and challenge yourself, try writing code that allows users to perform a search across their email, or perhaps creating an email draft, adding attachments, then sending them! Note that to prevent spam, there are strict Program Policies that you must abide by... any abuse could rate limit your account or get it shut down. Check out those rules plus other Gmail terms of use here.

    Tuesday, April 7, 2015

    Migrating from tools.run() to tools.run_flow()

    This mini-tutorial slash migration guide slash PSA (public service announcement) is aimed at Python developers using the Google APIs Client Library (to access Google APIs from their Python applications) currently calling oauth2client.tools.run(), and likely getting deprecation warnings and/or considering a migration to oauth2client.tools.run_flow(), its replacement.

    Prelude

    We're going to continue our look at accessing Google APIs from Python. In addition to the previous pair of posts (http://goo.gl/57Gufk and http://goo.gl/cdm3kZ), as part of my day job, I've been working on corresponding video content which is part of a developer series called the Launchpad Online. The goal of the series is to help introduce or "launch" developers into using Google APIs, dev tools, or specific API features. (The Google Developers Startups team runs the Launchpad bootcamp events featuring this content delivered live to help entrepreneurs get their startup companies off the ground!) Specifically tied to these blogposts, check out episodes 2 (Creating new apps using Google APIs) and 3 (Accessing Google APIs: common code walkthrough).

    Here in this follow-up, we're going to specifically address the sidebar in the previous post, where we bookmarked an item for future discussion (IOW, the future is now): in the oauth2client package, tools.run() has been deprecated by tools.run_flow(). As explained, use of tools.run() is "easier," meaning less code on-screen (or in a blogpost), which is why I've been using it for code samples. But truth be told, it is outdated. Another problem is that tools.run() requires users to install another package (python-gflags), typically with a command like: "pip install -U python-gflags".

    Now it's time to look at tools.run_flow(), so that you can see the better alternative and can code accordingly, even if the code samples in the videos or blogposts use tools.run(). Yes, tools.run_flow() does require a recent version of Python.

    Command-line argument processing, or "Why argparse?"

    Python has had several modules in the Standard Library that allow developers to process command-line arguments. The original one was getopt which mirrored the getopt() function from C. In Python 2.3, optparse was introduced, featuring more powerful processing capabilities. However, it was deprecated in 2.7 in favor of a similar module, argparse. (To find out more about their similarities, differences and rationale behind developing argparse, see PEP 389 and this argparse docs page.)

    For the purposes of using Google APIs, you're all set if using Python 2.7 as argparse is included in the Standard Library. Otherwise, Python 2.3-2.6 users can install it with: "pip install -U argparse".  NOTE: while argparse is available in 3.x starting with 3.2, the Google APIs Client Library hasn't been ported to 3.x yet at the time of this writing.


    Replacing tools.run() with tools.run_flow()

    Now let's methodically convert the authorized access to Google APIs code from the previous blogpost from using oauth2client.tools.run() to oauth2client.tools.run_flow(). As a courtesy, this is the code I'm talking about that needs the upgrade:

    from apiclient.discovery import build
    from httplib2 import Http
    from oauth2client import file, client, tools

    SCOPES = # 1 or more scopes, i.e., 'https://www.googleapis.com/auth/youtube'
    CLIENT_SECRET_FILE = 'client_secret.json' # downloaded JSON file

    store = file.Storage('storage.json')
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        creds = tools.run(flow, store)

    # API information, i.e., (API='youtube', VERSION='v3')
    SERVICE = build(API, VERSION, http=creds.authorize(Http()))

    If we wanted to make it easy, we'd simply direct users to add the requisite import argparse line (after 2.3-2.6 users install it). However, one practice I'm apt to follow, if options avail themselves, is to hedge my bets. Why force users one direction or the other? Why can't we use tools.run_flow() if argparse is available, and fallback to tools.run() otherwise? Rather than a required import, I'll check and set a sentinel like this:

    try:
        import argparse
        flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
    except ImportError:
        flags = None

    If 'flags' comes back as None, we use tools.run(); otherwise, we use tools.run_flow(). The line this affects is the assignment of credentials in the if-block where we have to run through the flow. If argparse is indeed available, 'flags' will hold the parsed command-line arguments (an argparse.Namespace) produced by an ArgumentParser created with the tools.argparser object as a parent. We now only need to use the sentinel to choose which "run" function to run, so to do that, replace this original line of code:

        creds = tools.run(flow, store)

    With:

        if flags:
            creds = tools.run_flow(flow, store, flags)
        else:
            creds = tools.run(flow, store)

    Those obsessed with Python's ternary (? :) operation can do this instead, although it's not as easy to read/understand as the above (and thus, less "Pythonic"):

        creds = tools.run_flow(flow, store, flags) if flags else tools.run(flow, store)
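
To see what the sentinel actually holds, here is a small sketch that runs without oauth2client installed. The parent parser is a hypothetical stand-in for tools.argparser (which defines flags such as --noauth_local_webserver), and pick_runner() is an illustrative helper that only reports which function would be chosen:

```python
import argparse

# Hypothetical stand-in for oauth2client's tools.argparser; parent parsers
# must be created with add_help=False so -h is not defined twice.
parent = argparse.ArgumentParser(add_help=False)
parent.add_argument('--noauth_local_webserver', action='store_true',
                    default=False)

# Passing [] keeps this demo from reading the real sys.argv.
flags = argparse.ArgumentParser(parents=[parent]).parse_args([])


def pick_runner(flags):
    """Name the function the sentinel would select (illustration only)."""
    return 'run_flow' if flags else 'run'


print(type(flags).__name__)  # the result of parse_args() is a Namespace
print(pick_runner(flags))
print(pick_runner(None))
```

Because a Namespace object is always truthy, the "if flags:" test distinguishes only between "argparse imported fine" and "argparse missing"; it does not depend on which options the user passed.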

    Conclusion

    That's it, and all the other lines stay the same. The complete updated source is here:

    from apiclient.discovery import build
    from httplib2 import Http
    from oauth2client import file, client, tools

    try:
        import argparse
        flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
    except ImportError:
        flags = None

    SCOPES = # 1 or more scopes, i.e., 'https://www.googleapis.com/auth/youtube'
    CLIENT_SECRET_FILE = 'client_secret.json' # downloaded JSON file

    store = file.Storage('storage.json')
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        if flags:
            creds = tools.run_flow(flow, store, flags)
        else:
            creds = tools.run(flow, store)

    # API information, i.e., (API='youtube', VERSION='v3')
    SERVICE = build(API, VERSION, http=creds.authorize(Http()))

    Look for future video episodes where we explore this change as well as use it to access a variety of Google APIs. For now, use this as a guide to modernizing any older code you see in one of my blogposts, videos, or in your own code base. If you want to see some sample usage of the "new way" of doing things, check out the Python Guide for Google Compute Engine.

    Thursday, November 6, 2014

    Authorized Google API access from Python

    NOTE: You can also watch a video walkthrough of the common code covered in this blogpost here.

    Introduction

    In this final installment of a (currently) two-part series introducing Python developers to building on Google APIs, we'll extend from the simple API example from the first post (part 1) just over a month ago. Those first snippets showed some skeleton code and a short real working sample that demonstrate accessing a public (Google) API with an API key (that queried public Google+ posts). An API key however, does not grant applications access to authorized data.

    Authorized data, including user information such as personal files on Google Drive and YouTube playlists, require additional security steps before access is granted. Sharing of and hardcoding credentials such as usernames and passwords is not only insecure, it's also a thing of the past. A more modern approach leverages token exchange, authenticated API calls, and standards such as OAuth2.

    In this post, we'll demonstrate how to use Python to access authorized Google APIs using OAuth2, specifically listing the files (and folders) in your Google Drive. In order to better understand the example, we strongly recommend you check out the OAuth2 guides (general OAuth2 info, OAuth2 as it relates to Python and its client library) in the documentation to get started.

    The docs describe the OAuth2 flow: making a request for authorized access, having the user grant access to your app, and obtaining a(n access) token with which to sign and make authorized API calls. The steps you need to take to get started begin nearly the same way as for simple API access. The process diverges when you arrive on the Credentials page when following the steps below.

    Google API access

    In order to get authorized Google API access, follow these instructions (the first three of which are roughly the same for simple API access):
    • Go to the Google Developers Console and login.
      • Use your Gmail or Google credentials; create an account if needed
    • Click "Create Project" button
      • Enter a Project Name (mutable, human-friendly string only used in the console)
      • Enter a Project ID (immutable, must be unique and not already taken)
    • Once project has been created, click "Enable an API" button
    • Select "Credentials" in left-nav under "APIs & auth"
      • In the top half labeled "OAuth2", click "Create new Client ID"
      • In the new dialog, select your application type — we're building a command-line script which is an "Installed application"
      • In the bottom part of that same dialog, specify the type of installed application; choose "Other" (cmd-line scripts are not web nor mobile)
      • Click "Create Client ID" to generate your credentials
    • Finally, click "Download JSON" to save the new credentials to your computer... perhaps choose a shorter name like "client_secret.json"
    NOTEs: Instructions from the previous blogpost were to get an API key. This time, in the steps above, we're creating and downloading OAuth2 credentials. You can also watch a video walkthrough of this app setup process of getting simple or authorized access credentials in the "DevConsole" here.

      Accessing Google APIs from Python

      In order to access authorized Google APIs from Python, you still need the Google APIs Client Library for Python, so in this case, do follow those installation instructions from part 1.

      We will again use the apiclient.discovery.build() function, which is what we need to create a service endpoint for interacting with an API, authorized or otherwise. However, for authorized data access, we need additional resources, namely the httplib2 and oauth2client packages. Here are the first five lines of the new boilerplate code for authorized access:

      from apiclient.discovery import build
      from httplib2 import Http
      from oauth2client import file, client, tools
      
      CLIENT_SECRET = 'client_secret.json' # downloaded JSON file
      SCOPES = # one or more scopes (strings)
      
      After the imports are some global variables, starting with CLIENT_SECRET. This is the credentials file you saved when you clicked "Download JSON" in the instructions above. SCOPES is a critical variable: it represents the set of scopes of authorization an app wants to obtain (then access) on behalf of user(s). What does a scope look like?

      Each scope is a single character string, specifically a URL. Here are some examples:
      • 'https://www.googleapis.com/auth/plus.me' — access your personal Google+ settings
      • 'https://www.googleapis.com/auth/drive.metadata.readonly' — read-only access to your Google Drive file or folder metadata
      • 'https://www.googleapis.com/auth/youtube' — access your YouTube playlists and other personal information
      You can request one or more scopes, given as a single space-delimited string of scopes or an iterable (list, generator expression, etc.) of strings.  If you were writing an app that accesses both your YouTube playlists as well as your Google+ profile information, your SCOPES variable could be either of the following:
      SCOPES = 'https://www.googleapis.com/auth/plus.me https://www.googleapis.com/auth/youtube'

      That is space-delimited and made tiny by me so it doesn't wrap in a regular-sized browser window; or it could be an easier-to-read, non-tiny, and non-wrapped tuple:

      SCOPES = (
          'https://www.googleapis.com/auth/plus.me',
          'https://www.googleapis.com/auth/youtube',
      )
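
The two forms are interchangeable because a sequence of scopes can always be flattened into the space-delimited string. A minimal sketch of that equivalence follows; normalize_scopes() is a hypothetical helper written for illustration, not a function from the client library:

```python
def normalize_scopes(scopes):
    """Return scopes as one space-delimited string (illustrative helper)."""
    if isinstance(scopes, str):
        return scopes
    return ' '.join(scopes)


AS_STRING = ('https://www.googleapis.com/auth/plus.me '
             'https://www.googleapis.com/auth/youtube')
AS_TUPLE = (
    'https://www.googleapis.com/auth/plus.me',
    'https://www.googleapis.com/auth/youtube',
)

print(normalize_scopes(AS_TUPLE) == normalize_scopes(AS_STRING))
```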

      Our example command-line script will just list the files on your Google Drive, so we only need the read-only Drive metadata scope, meaning our SCOPES variable will be just this:
      SCOPES = 'https://www.googleapis.com/auth/drive.metadata.readonly'
      The next section of boilerplate represents the security code:
      store = file.Storage('storage.json')
      creds = store.get()
      if not creds or creds.invalid:
          flow = client.flow_from_clientsecrets(CLIENT_SECRET, SCOPES)
          creds = tools.run(flow, store)
      
      Once the user has authorized access to their personal data by your app, a special "access token" is given to your app. This precious resource must be stored somewhere local for the app to use. In our case, we'll store it in a file called "storage.json". The lines setting the store and creds variables are attempting to get a valid access token with which to make an authorized API call.

      If the credentials are missing or invalid, such as being expired, the authorization flow (using the client secret you downloaded along with a set of requested scopes) must be created (by client.flow_from_clientsecrets()) and executed (by tools.run()) to ensure possession of valid credentials. If you don't have credentials at all, the user must explicitly grant permission. I'm sure you've all seen the OAuth2 dialog describing the type of access an app is requesting (remember those scopes?). Once the user clicks "Accept" to grant permission, a valid access token is returned and saved into the storage file (because you passed a handle to it when you called tools.run()).

      Note: tools.run() deprecated by tools.run_flow()
      At the time of this writing, the tools.run() function has been deprecated by tools.run_flow(). We'll explain this in more detail in a future blogpost, but for now, you can use either. The caveats for both: use of tools.run() is "easier" but is outdated and requires another package to download while tools.run_flow() requires more code and a recent version of Python.
      Why is using tools.run() "easier?" Well, it does mean less code, but it also requires the 'gflags' library, so if you need that, install it with "pip install -U python-gflags". The good news with tools.run_flow() is that it does not need this library; the bad news is that you do need to create an argparse.ArgumentParser object (which proxies for the missing 'gflags'), meaning you need Python 2.7. If you wish to be modern and use tools.run_flow(), read more here in the docs.

      Once the user grants access and valid credentials are saved, you can create one or more endpoints to the secure service(s) desired with apiclient.discovery.build(), just like with simple API access. Its call will look slightly different, mainly that you need to sign your HTTP requests with your credentials rather than passing an API key:

      DRIVE = build(API, VERSION, http=creds.authorize(Http()))

      In our example, we're going to list your files and folders in your Google Drive, so for API, use the string 'drive'. The API version is currently on version 2 so use 'v2' for VERSION:

      DRIVE = build('drive', 'v2', http=creds.authorize(Http()))

      If you want to get comfortable with OAuth2, what its flow is and how it works, we recommend that you experiment at the OAuth Playground. There you can choose from any number of APIs to access and experience first-hand how your app must be authorized to access personal data.

      Going back to our working example, once you have an established service endpoint, you can use the list() method of the files service to request the file data:

      files = DRIVE.files().list().execute().get('items', [])

      If all goes well, the (JSON) response payload will (not be empty or missing and) contain a sequence of files that we can loop over, displaying file names and types:

      for f in files:
          print f['title'], f['mimeType']

      Just like in the previous blogpost, we're using the print statement here in Python 2, but a pro tip to start getting ready for Python 3 is to add this import to the top of your script (which has no effect in 3.x) so you can use the print() function instead:

      from __future__ import print_function

      Conclusion

      To find out more about the input parameters as well as all the fields that are in the response, take a look at the docs for files().list(). For more information on what other operations you can execute with the Google Drive API, take a look at the reference docs and check out the companion video for this code sample. That's it!

      Below is the entire script for your convenience:
      #!/usr/bin/env python
      
      from apiclient.discovery import build
      from httplib2 import Http
      from oauth2client import file, client, tools
      
      CLIENT_SECRET = 'client_secret.json'
      SCOPES = 'https://www.googleapis.com/auth/drive.metadata.readonly'
      
      store = file.Storage('storage.json')
      creds = store.get()
      if not creds or creds.invalid:
          flow = client.flow_from_clientsecrets(CLIENT_SECRET, SCOPES)
          creds = tools.run(flow, store)
      DRIVE = build('drive', 'v2', http=creds.authorize(Http()))
      
      files = DRIVE.files().list().execute().get('items', [])
      for f in files:
          print f['title'], f['mimeType']
      
      When you run it, you should see pretty much what you'd expect, a list of file or folder names followed by their MIMEtypes — I named my script drive_list.py:
      $ python drive_list.py
      Google Maps demo application/vnd.google-apps.spreadsheet
      Overview of Google APIs - Sep 2014 application/vnd.google-apps.presentation
      tiresResearch.xls application/vnd.google-apps.spreadsheet
      6451_Core_Python_Schedule.doc application/vnd.google-apps.document
      out1.txt application/vnd.google-apps.document
      tiresResearch.xls application/vnd.ms-excel
      6451_Core_Python_Schedule.doc application/msword
      out1.txt text/plain
      Maps and Sheets demo application/vnd.google-apps.spreadsheet
      ProtoRPC Getting Started Guide application/vnd.google-apps.document
      gtaskqueue-1.0.2_public.tar.gz application/x-gzip
      Pull Queues application/vnd.google-apps.folder
      gtaskqueue-1.0.1_public.tar.gz application/x-gzip
      appengine-java-sdk.zip application/zip
      taskqueue.py text/x-python-script
      Google Apps Security Whitepaper 06/10/2010.pdf application/pdf
      
      Obviously your output will be different, depending on what files are in your Google Drive. But that's it... hope this is useful. You can now customize this code for your own needs and/or to access other Google APIs. Thanks for reading!

      EXTRA CREDIT: To test your skills, add functionality to this code that also displays the last modified timestamp, the file (byte)size, and perhaps shave the MIMEtype a bit as it's slightly harder to read in its entirety... perhaps take just the final path element? One last challenge: in the output above, we have both Microsoft Office documents as well as their auto-converted versions for Google Apps... perhaps only show the filename once and have a double-entry for the filetypes!
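
As a starting point for that exercise, here is one possible sketch of the formatting step only. It assumes the Drive v2 file resource fields 'modifiedDate' and 'fileSize' (the latter is absent for native Google documents), and describe() is a hypothetical helper name. The sample entries are made up, including the timestamp and size:

```python
def describe(f):
    """Format one Drive v2 file resource as a single line (a sketch)."""
    # Keep only the final path element of the MIME type.
    short_type = f['mimeType'].rsplit('/', 1)[-1]
    modified = f.get('modifiedDate', '?')
    size = f.get('fileSize', '-')  # native Google docs report no size
    return '%s [%s] modified=%s size=%s' % (
        f['title'], short_type, modified, size)


# Made-up entries shaped like items from files().list().
files = [
    {'title': 'out1.txt', 'mimeType': 'text/plain',
     'modifiedDate': '2014-09-20T12:00:00.000Z', 'fileSize': '42'},
    {'title': 'Pull Queues',
     'mimeType': 'application/vnd.google-apps.folder'},
]
for f in files:
    print(describe(f))
```

Merging the duplicate Office and Google Apps entries is left as the remaining part of the challenge.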

      Saturday, September 20, 2014

      Simple Google API access from Python

      NOTE: You can also watch a video walkthrough of the common code covered in this blogpost here.

      Introduction

      Back in 2012 when I published Core Python Applications Programming, 3rd ed., I posted about how I integrated Google technologies into the book. The only problem is that I presented very specific code for Google App Engine and Google+ only. I didn't show a generic way how, using pretty much the same boilerplate Python snippet, you can access any number of Google APIs; so here we are.

      In this multi-part series, I'll break down the code that allows you to leverage Google APIs to the most basic level (even for Python), so you can customize as necessary for your app, whether it's running as a command-line tool or something server-side in the cloud backending Web or mobile clients. If you've got the book and played around with our Google+ API example, you'll find this code familiar, if not identical — I'll go into more detail here, highlighting the common code for generic API access and then bring in the G+-relevant code later.

      We'll start in this first post by demonstrating how to access public or unauthorized data from Google APIs. (The next post will illustrate how to access authorized data from Google APIs.) Regardless of which you use, the corresponding boilerplate code stands alone. In fact, it's probably best if you saved these generic snippets in a library module so you can (re)use the same bits for any number of apps which access any number of modern Google APIs.

      Google API access

      In order to access Google APIs, follow these instructions:
      • Go to the Google Developers Console and login.
        • Use your Gmail or Google credentials; create an account if needed
      • Click "Create Project" button
        • Enter a Project Name (mutable, human-friendly string only used in the console)
        • Enter a Project ID (immutable, must be unique and not already taken)
      • Once project has been created, click "Enable an API" button
        • You can toggle on any API(s) that support(s) simple API access (not authorized).
        • For the code example below, we use the Google+ API.
        • Other ideas: YouTube Data API, Google Maps API, etc.
        • Find more APIs (and version#s which you need) at the OAuth Playground.
      • Select "Credentials" in left-nav under "APIs & auth"
        • Go to bottom half and click "Create new Key" button
        • Grab long "API KEY" cryptic string and save to Python script
        NOTE: You can also watch a video walkthrough of this app setup process in the "DevConsole" here.

        Accessing Google APIs from Python

        Now that you're set up, everything else is done on the Python side. To talk to a Google API, you need the Google APIs Client Library for Python, specifically the apiclient.discovery.build() function. Download and install the library in your usual way, for example:

        $ pip install -U google-api-python-client
        NOTE: If you're building a Python App Engine app, you'll need something else, the Google APIs Client Library for Python on Google App Engine. It's similar but has extra goodies (specifically decorators — brief generic intro to those in my previous post) just for cloud developers that must be installed elsewhere. As App Engine developers know, libraries must be in the same location on the filesystem as your source code.
        Once everything is installed, make sure that you can import apiclient.discovery:

        $ python
        Python 2.7.6 (default, Apr  9 2014, 11:48:52)
        [GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.38)] on darwin
        Type "help", "copyright", "credits" or "license" for more information.
        >>> import apiclient.discovery
        >>>

        In discovery.py is the build() function, which is what we need to create a service endpoint for interacting with an API. Now craft the following lines of code in your command-line tool:

        from apiclient.discovery import build

        API_KEY = 'YOUR_API_KEY' # replace with the key copied from project credentials page
        SERVICE = build(API, VERSION, developerKey=API_KEY)

        Take the API key you copied from the credentials page and assign it to the API_KEY variable as a string. Obviously, embedding an API key in source code isn't something you'd do in practice as it's not secure whatsoever (stick it in a database, key broker, encrypt it, or at least have it in a separate byte code [.pyc/.pyo] file that you import), but we'll allow it now solely for illustrative purposes of a simple command-line script.
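
        One lightweight way to keep the key out of your source is to read it from an environment variable. This is only a minimal sketch of that idea, not something the script in this post does; the variable name GOOGLE_API_KEY and the helper get_api_key() are hypothetical names chosen for illustration.

```python
import os

def get_api_key(env_var='GOOGLE_API_KEY'):
    """Return the API key stored in the named environment variable.

    The variable name is an assumption made for this sketch; use
    whatever name fits your own setup.
    """
    key = os.environ.get(env_var)
    if not key:
        # fail early with a readable message instead of sending a bad key
        raise RuntimeError('Set the %s environment variable first' % env_var)
    return key
```

        With that in place, you would set the variable in your shell before running the script and write API_KEY = get_api_key() rather than pasting the key into the file.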

        In our short example we're going to do a simple search for "python" in public Google+ posts, so for the API variable, use the string 'plus'. The API version is currently on version 1 (at the time of this writing), so use 'v1' for VERSION. (Each API will use a different name and version string... again, you can find those in the OAuth Playground or in the docs for the specific API you want to use.) Here's the call once we've filled in those variables:

        GPLUS = build('plus', 'v1', developerKey=API_KEY)

        We need a template for the results that come back. There are many fields in a Google+ post, so we're only going to pick three to display... the user name, post timestamp, and a snippet of the post itself:

        TMPL = '''
            User: %s
            Date: %s
            Post: %s
        '''

        Now for the code. Google+ posts are activities (known as "notes;" there are other activities as well). One of the methods you have access to is search(), which lets you query public activities; so that's what we're going to use. Add the following call using the GPLUS service endpoint you already created using the verbs we just described and execute it:

        items = GPLUS.activities().search(query='python').execute().get('items', [])

        If all goes well, the (JSON) response payload will contain a set of 'items' (else we assign an empty list for the for loop). From there, we'll loop through each matching post, do some minor string manipulation to replace all whitespace characters (including NEWLINEs [ \n ]) with spaces, and display if not blank:

        for data in items:
            post = ' '.join(data['title'].strip().split())
            if post:
                print TMPL % (data['actor']['displayName'],
                              data['published'], post)
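
        To see what the strip/split/join idiom in that loop does on its own, here is a small sketch run against made-up title strings (the helper name normalize() and the sample text are just for illustration):

```python
def normalize(text):
    """Collapse every run of whitespace (spaces, tabs, newlines) to one space."""
    # strip() trims both ends; split() with no argument splits on any
    # run of whitespace, so join() puts back exactly one space per gap
    return ' '.join(text.strip().split())

sample = '  Monty\tPython\n\nquotes   mug \n'    # made-up title text
print(normalize(sample))             # Monty Python quotes mug
print(repr(normalize(' \n\t ')))     # '' (falsy, so such a post is skipped)
```

        An all-whitespace title comes back as the empty string, which is why the "if post:" check is enough to filter out blank posts.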

        We're using the print statement here in Python 2, but a pro tip to start getting ready for Python 3 is to add this import to the top of your script (which has no effect in 3.x) so you can use the print() function instead:

        from __future__ import print_function

        Conclusion

        To find out more about the input parameters as well as all the fields that are in the response, take a look at the docs. Below is the entire script missing only the API_KEY which you'll have to fill in yourself.

        #!/usr/bin/env python

        from apiclient.discovery import build

        TMPL = '''
            User: %s
            Date: %s
            Post: %s
        '''

        API_KEY = 'YOUR_API_KEY' # replace with the key copied from project credentials page
        GPLUS = build('plus', 'v1', developerKey=API_KEY)
        items = GPLUS.activities().search(query='python').execute().get('items', [])
        for data in items:
            post = ' '.join(data['title'].strip().split())
            if post:
                print TMPL % (data['actor']['displayName'],
                              data['published'], post)

        When you run it, you should see pretty much what you'd expect, a few posts on Python, some on Monty Python, and of course, some on the snake — I called my script plus_search.py:

        $ python plus_search.py 

            User: Jeff Ward
            Date: 2014-09-20T18:08:23.058Z
            Post: How to make python accessible in the command window.


            User: Fayland Lam
            Date: 2014-09-20T16:40:11.512Z
            Post: Data Engineer http://findmjob.com/job/AB7ZKitA5BGYyW1oAlQ0Fw/Data-Engineer.html #python #hadoop #jobs...


            User: Willy's Emporium LTD
            Date: 2014-09-20T16:19:33.851Z
            Post: MONTY PYTHON QUOTES MUG Take a swig to wash down all that albatross and crunchy frog. Featuring 20 ...


            User: Doddy Pal
            Date: 2014-09-20T15:49:54.405Z
            Post: Classic Monty Python!!!


            User: Sebastian Huskins
            Date: 2014-09-20T15:33:00.707Z
            Post: Made a small python script to get shellcode out of an executable. I found a nice commandlinefu.com oneline...

        EXTRA CREDIT: To test your skills, check the docs and add a fourth line to each output which is the URL/link to that specific post, so that you (and your users) can open a browser to it if of interest.

        If you want to build on from here, check out the larger app using the Google+ API featured in Chapter 15 of the book; it adds some brains to this basic code where the Google+ posts are sorted by popularity using a "chatter" score. That just about wraps up this post. Once you're good to go, you're ready to learn how to perform authorized Google API access in part 2 of this two-part series!

        Saturday, July 26, 2014

        Introduction to Python decorators

        In this post, we're going to give you a user-friendly introduction to Python decorators. (The code works on both Python 2 [2.6 or 2.7 only] and 3 so don't be concerned with your version.) Before jumping into the topic du jour, consider the usefulness of the map() function. You've got a list with some data and want to apply some function [like times2() below] to all its elements and get a new list with the modified data:

        def times2(x):
            return x * 2

        >>> list(map(times2, [0, 1, 2, 3, 4]))
        [0, 2, 4, 6, 8]

        Yeah yeah, I know that you can do the same thing with a list comprehension or generator expression, but my point was about an independent piece of logic [like times2()] and mapping that function across a data set ([0, 1, 2, 3, 4]) to generate a new data set ([0, 2, 4, 6, 8]). However, since mapping functions like times2() aren't tied to any particular chunk of data, you can reuse them elsewhere with other unrelated (or related) data.
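
        As a quick sketch of both points, the snippet below shows the list comprehension equivalent and then reuses times2() on unrelated data (the extra sample values are made up for illustration):

```python
def times2(x):
    return x * 2

data = [0, 1, 2, 3, 4]

mapped = list(map(times2, data))          # map() applies times2 to each item
listcomp = [times2(x) for x in data]      # equivalent list comprehension
assert mapped == listcomp == [0, 2, 4, 6, 8]

# times2() is not tied to that list, so it can be reused on other data
print(list(map(times2, ['ab', 'cd'])))    # ['abab', 'cdcd']
print(list(map(times2, (1.5, 2.5))))      # [3.0, 5.0]
```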

        Along similar lines, consider function calls. You have independent functions and methods in classes. Now, think about "mapped" execution across functions. What are things that you can do with functions that don't have much to do with the behavior of the functions themselves? How about logging function calls, timing them, or some other introspective, cross-cutting behavior? Sure, you can implement that behavior in each of the functions you care about; however, since it's so generic, it would be nice to write that logging code just once.

        Introduced in 2.4, decorators modularize cross-cutting behavior so that developers don't have to implement near duplicates of the same piece of code for each function. Rather, Python gives them the ability to put that logic in one place and use decorators with its at-sign ("@") syntax to "map" that behavior to any function (or method). This compartmentalization of cross-cutting functionality gives Python an aspect-oriented programming flavor.

        How do you do this in Python? Let's take a look at a simple example, the logging of function calls. Create a decorator function that takes a function object as its sole argument, and implement the cross-cutting functionality. In logged() below, we're just going to log function calls by making a call to the print() function each time a logged function is called.

        def logged(_func):
            def _wrapped():
                print('Function %r called at: %s' % (
                    _func.__name__, ctime()))
                return _func()
            return _wrapped

        In logged(), we use the function's name (given by _func.__name__) plus a timestamp from time.ctime() to build our output string. Make sure you get the right imports, time.ctime() for sure, and if using Python 2, the print() function:

        from __future__ import print_function # 2.6 or 2.7 only
        from time import ctime

        Now that we have our logged() decorator, how do we use it? On the line above the function which you want to apply the decorator to, place an at-sign in front of the decorator name. That's followed immediately on the next line with the normal function declaration. Here's what it looks like, applied to a boring generic foo() function which just print()s it's been called.

        @logged
        def foo():
            print('foo() called')

        When you call foo(), you can see that the wrapper created by the logged() decorator runs first, which then calls the original foo() on your behalf:

        $ log_func.py
        Function 'foo' called at: Sun Jul 27 04:09:37 2014
        foo() called

        If you take a closer look at logged() above, the way the decorator works is that the decorated function is "wrapped": it is passed as _func to the decorator, then the newly-wrapped function _wrapped() is (re)assigned as foo(). That's why it now behaves the way it does when you call it.

        The entire script:

        #!/usr/bin/env python
        'log_func.py -- demo of decorators'

        from __future__ import print_function # 2.6 or 2.7 only
        from time import ctime

        def logged(_func):
            def _wrapped():
                print('Function %r called at: %s' % (
                      _func.__name__, ctime()))
                return _func()
            return _wrapped

        @logged
        def foo():
            print('foo() called')

        foo()


        That was just a simple example to give you an idea of what decorators are. If you dig a little deeper, you'll discover one caveat is that the wrapping isn't perfect. For example, the attributes of foo() are lost, i.e., its name and docstring. If you ask for either, you'll get _wrapped()'s info instead:

        >>> print("My name:", foo.__name__) # should be 'foo'!
        My name: _wrapped
        >>> print("Docstring:", foo.__doc__) # _wrapped's docstring!
        Docstring: None

        In reality, the "@" syntax is just a shortcut. Here's what you really did, which should explain this behavior:

        def foo():
            print('foo() called')

        foo = logged(foo) # returns _wrapped (and its attributes)

        So as you can tell, it's not a complete wrap. A convenience function that ties up these loose ends is functools.wraps(). If you use it and run the same code, you will get foo()'s info. However, if you're not going to use a function's attributes while it's wrapped, it's less important to do this.
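
        Here is a minimal sketch of the same logged() decorator with functools.wraps() applied. The docstring given to foo() is added here purely so there is something to display:

```python
from __future__ import print_function   # 2.6 or 2.7 only
from functools import wraps
from time import ctime

def logged(_func):
    @wraps(_func)        # copy _func's name, docstring, etc. onto _wrapped
    def _wrapped():
        print('Function %r called at: %s' % (
            _func.__name__, ctime()))
        return _func()
    return _wrapped

@logged
def foo():
    'foo() docstring'
    print('foo() called')

print('My name:', foo.__name__)      # My name: foo
print('Docstring:', foo.__doc__)     # Docstring: foo() docstring
```

        Note that wraps() is itself used as a decorator, this time on the inner _wrapped() function.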

        There's also support for additional features, such as calling decorated functions with parameters, applying more complex decorators, applying multiple levels of decorators, and also class decorators. You can find out more about (function and method) decorators in Chapter 11 of Core Python Programming or live in my upcoming course which starts in just a few days near the San Francisco airport... there are still a few seats left!
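
        To give a taste of two of those features, the sketch below lets the wrapper accept arbitrary arguments and stacks two decorators on one function. The doubled() decorator and the add() function are hypothetical examples, not code from the book:

```python
from __future__ import print_function   # 2.6 or 2.7 only
from functools import wraps
from time import ctime

def logged(_func):
    @wraps(_func)
    def _wrapped(*args, **kwargs):       # accept any call signature
        print('Function %r called at: %s' % (
            _func.__name__, ctime()))
        return _func(*args, **kwargs)    # pass the arguments through unchanged
    return _wrapped

def doubled(_func):                      # hypothetical second decorator
    @wraps(_func)
    def _wrapped(*args, **kwargs):
        return 2 * _func(*args, **kwargs)
    return _wrapped

@logged          # applied last (outermost wrapper)
@doubled         # applied first (innermost wrapper)
def add(x, y=0):
    return x + y

print(add(3, y=4))    # logs the call, then prints 14
```

        Decorators closest to the def line are applied first, so the stacked form above is shorthand for add = logged(doubled(add)).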

        Wednesday, September 4, 2013

        Learning Programming

        Two years ago, I wrote a post on "learning Python" to launch this blog dedicated to Python. While useful, it doesn't address beginners' needs as much, so it's time for a revisit. Because Python is such a user-friendly language for beginners, I'm often asked whether Python is the "best first language" for those new to programming. While tempted to respond in the affirmative, my answer really is, "it depends." It depends on your experience, age, level of exposure, etc.

        Yes, there are indeed plenty of resources out there, such as courses from online learning brands such as Khan Academy, Udacity, Coursera, Codecademy, CodeSchool, and edX, but most certainly don't come with an instructor, instead relying on live or recorded videos and possibly supplemental study groups, or "cohort learning," as a colleague of mine has branded it. Whatever the mechanism, it's surely better than pure online tutorials or slaving away over a book, neither of which come with instructors either.

        Stepping back a bit, before jumping into hardcore C/C++, Java, PHP, Ruby, or Javascript lessons to learn tools that are used in industry today, there are better stepping stones to get you there. You may be a kid or a professional who either doesn't code much or did so long ago. Say you're the type of user who is "insulted" by the move "left" or "right" commands for controlling a turtle and desire something more complex. The good news is that there are more tools out there which allow you to venture further without an instructor.

        One of them is Scratch, a "jigsaw puzzle"-like programming language created at MIT (or Tynker or Blockly, Scratch-like derivatives). Yes, you will do left, right, up, down, etc., but you'll also get to play audio, video, repeat commands, draw graphics, and make sounds. This tool is great for teaching young learners, who don't need any of the advanced features at first, though those features are available when they're ready to take the next step. It can be used to teach children the concepts of programming without all the syntax of text-based programming languages, which may make learning those concepts a burden.

        If you wish to proceed, go to the website to get started. They've got videos there as well as projects you can copy. As you can see, you snap together puzzle pieces that teach you coding. Better yet, to get started even more quickly, clone one (or more) of the projects, and "tweak" the code a bit to "do your own thing." In time, you may even develop your own fun applications or real games. Another similar graphical learning tool to consider is Alice from the University of Virginia and now Carnegie-Mellon University.

        Once you're comfortable with that type of working environment, there's a similar tool from MIT called App Inventor. Leverage your Scratch skills and start building applications that run on Android devices! There's an emulator, so you don't really need an Android device, but it's certainly more rewarding when you can use an app that you built running on a tablet or phone! (Try a family friend who may have an old device they don't use any more.)

        Once you're ready to move beyond block-like languages, there are 2 good choices (or better yet, do both!). One of which is the de facto language of the web: Javascript. Unfortunately, there are so many online tutorials out there that I wouldn't know which to suggest, so I'm looking forward to your comments below. The ones which are the most effective, however, have you learning then coding directly into the browser and seeing results immediately, requiring you to write successful Javascript before allowing you to move on.

        The thing about Javascript is that code typically only runs within the browser, to control web pages (i.e., "DOM manipulation") and actions you can take on a single page -- it can also be AJAX code that makes an external call to update a page without requiring a page load. Nevertheless, browser-only execution can be somewhat limiting, so there are now 2 additional ways you can use it.

        One is to write "server-side" applications via Node.js. This type of Javascript allows you to write code that executes on the remote machine serving your web pages (generally) after you've entered information in a form and clicked submit. For every web page that users see and interact with, there's also got to be code on the server side that does all the work! This code will also end up returning the final HTML that users see in their browsers once the form has been submitted and results returned.

        Another place you can use Javascript is in Google's cloud. The tool there is called Google Apps Script. Using Apps Script, you can create applications that interact with various Google Apps, automate repetitive tasks, or write glue code that lets you connect and share data between different Google services. Try some of their tutorials to get started!

        The other option besides Javascript is Python. No doubt you already know what it is since you're here. Python's syntax is extremely approachable for beginners and is widely considered "executable pseudocode." That's right, a programming language that doesn't require you to have a Computer Science degree to make good use of it! It's also one of those rare languages that can be used by adults in the professional world as well as by kids learning how to code. Sure there are many online learning systems out there, a sampling of which are here:
        See if you like any of them or have your new coder friends try them out. However, I think kids (and even adults) learn programming best when they get to write cool games (leveraging the amazing PyGame library). There are several books written just for kids, including "Hello World" which was actually written by an engineer and his son! Along with that book there are two more you should consider:
        Two of the three books above are in the beginners list I created over a year ago along with two other Python reading lists in this post. (The third book should be added to the list as well.) Those of you who are already programmers probably know which one I would recommend. :-) Seriously though, those reading lists show that I can toot other horns too. :P

        Here are other online projects and learning resources, including book websites, that you can also try (many are for kids):
        In conjunction with a good learning system, book, or project-based learning above, you should also try out one of many free online courses to validate things you've picked up but to also build other knowledge you haven't learned yet. There are a pair from Coursera and one from Udacity:
        For existing programmers who are still questioning why Python, check out Udacity's motivational blogpost.

        That's it! Hopefully I've given you enough resources you can pass along to friends and family members who are intrigued by your passion for computer programming and wish to see what all the excitement is about. A young man I met on vacation this summer motivated this post... good luck Mitchell! I hope to see the rest of you on the road as well, perhaps at a developers' conference or sitting in one of my upcoming Python courses!

        Tuesday, May 29, 2012

        Tuples aren't what you think they're for

        While I'm happy that the number of Python users continues to grow at a rapid pace and that there are many tutorials added each day to support all the newbies, there are a few things that make me cringe when I see them.

        One example of this is seeing a Python college textbook (you can tell by its retail price) produced by a big-name publisher (one of the largest in the world which shall remain unnamed) that instructs users (of Python 2) to get user command-line input using the input() function! Clearly, this is a major faux pas, as most Python users know that it's a security risk and that raw_input() should always be used instead (and the main reason why raw_input() replaces and is renamed as input() in Python 3).
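
        The risk comes from the fact that Python 2's input() is equivalent to eval(raw_input()): whatever the user types is evaluated as a Python expression. The sketch below simulates that with eval() on a fixed string so it runs on either version; the "typed" text is a made-up and deliberately harmless example, and using ast.literal_eval() for parsing is my suggestion rather than something the textbook or this post prescribes.

```python
import ast

typed = "__import__('os').getcwd()"    # what a user could type at the prompt

# Python 2's input() behaves like eval(raw_input()): the text is executed
result = eval(typed)
print(type(result).__name__)           # str, because os.getcwd() really ran

# raw_input() (input() in Python 3) just returns the text, untouched.
# If you need a number or other literal, parse it explicitly instead:
print(ast.literal_eval('42') + 1)      # 43
try:
    ast.literal_eval(typed)            # function calls are not literals
except ValueError:
    print('rejected')
```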

        Another example is this recent article on lists and tuples. While I find the content useful in teaching new Python developers various useful ways of using slicing, I disagree with the premise that tuples...
        1. along with lists are two of Python's most popular data structures
        2. are mostly immutable but there are workarounds, and
        3. should be used for application data manipulation

        I would say lists and dictionaries are the two most popular Python data structures; tuples shouldn't even be in that group. In fact, I would even argue that tuples shouldn't be used to manipulate application data at all, as that wasn't what they were generally created for. (If this was the case, then why not have lists with a read-only flag?)

        The main reason why tuples exist is to get data to and from function calls. [UPDATE: two other strong use cases: 1) "constructed" dictionary keys (I would've turned such N-tuples into a delimited string) and from that use comes 2) a data structure with positional semantics, aka indices with implied meaning... both of these view such tuples as an individual entity (made up of multiple components), again, not a data structure for manipulating objects. Named tuples are a related alternative. See the debate in the commentary below.]
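
        The two use cases from that update, plus named tuples, can be sketched as follows. All the data (cities, years, numbers) and the Point type are made up for illustration:

```python
from collections import namedtuple

# 1) a "constructed" dictionary key: (city, year) is treated as one entity
rainfall = {('Paris', 2011): 480, ('Paris', 2012): 610}   # made-up numbers
print(rainfall[('Paris', 2012)])        # 610

# 2) positional semantics: index 0 is always x, index 1 is always y
point = (3, 4)
x, y = point

# named tuples give those positions names but remain immutable tuples
Point = namedtuple('Point', ['x', 'y'])
p = Point(3, 4)
print(p.x + p.y, p == point)            # 7 True
```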

        Calling a foreign API or 3rd-party function and want to pass in a data structure you know can't be altered? Check. Calling any function where you want to pass in only one data structure (instead of separate variables)? Use "*" and you're good to go. Previously worked with a programming language that only allowed you to return a single value? Tuples are that one object (think of it as a single shopping bag for all your groceries).
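
        A short sketch of tuples moving data in and out of function calls, using two hypothetical helpers, min_max() and span():

```python
def min_max(seq):
    'return two results bundled in a single tuple'
    return min(seq), max(seq)       # the comma builds the tuple

def span(low, high):
    return high - low

result = min_max([4, 8, 15, 16, 23, 42])
print(result)            # (4, 42)
print(span(*result))     # 38 ("*" unpacks the tuple into two arguments)

low, high = result       # or unpack into separate names on assignment
```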

        All of the manipulations in the post on getting around the immutability are superfluous and not adhering to the best practice of not using tuples as a data structure. I mean, this is not a strict rule. If you're needing a data structure where you're not going to make any modifications and desire slightly better performance, sure a tuple can be used in such cases. This is why in Python 2.6, for the first time "evar," tuples were given methods!

        There was never any need for tuples to have methods because they were immutable. "Just use lists," is what we would all say. However, lists had a pair of read-only methods (count() and index()) that led to inefficiencies (and poor practices) where developers used tuples for the reason we just outlined but needed to either get a count on how many times an object appeared in that sequence or wanted to find the index of the first appearance of an object. They would have to convert that tuple to a list, just to call those methods. Starting in 2.6, tuples now have those (and only those) methods to avoid this extra nonsense.
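
        For example, with a made-up tuple of weekday names, the old workaround and the direct method calls look like this:

```python
weekdays = ('Mon', 'Tue', 'Wed', 'Tue')

# before 2.6: convert to a list just to call the read-only methods
print(list(weekdays).count('Tue'))    # 2

# 2.6 and later: tuples have count() and index() themselves
print(weekdays.count('Tue'))          # 2
print(weekdays.index('Tue'))          # 1 (first appearance)
```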

        So yes, you can use tuples as user-land data structures in such cases, but that's really it. For manipulation, use lists instead. As stated at the top, I'm generally all for more intro posts and tutorials out there. However, there may be some that don't always impart the best practices out there. Readers should always be alert and question whether there are more "Pythonic" ways of doing things. In this case, tuples should not be one of the "[two] of the most commonly used built-in data types in Python...."