Interact with other programs using Python

Asked 11/1, 2013 at 23:13 Answered 15/1, 2013 at 12:42

Solved python automation interop concept

I have an idea for a Python program that finds the lyrics of a song whose name I provide. I think the whole process boils down to the steps below. This is what I want the program to do when I run it:

prompt me to enter a name of a song
copy that name
open a web browser (google chrome for example)
paste that name in the address bar and find information about the song
open a page that contains the lyrics
copy those lyrics
run a text editor (like Microsoft Word for instance)
paste the lyrics
save the new text file with the name of the song

I am not asking for code, of course. I just want to know the concepts or ideas behind using Python to interact with other programs.

To be more specific, I want to know, for example, how we point out where the address bar is in Google Chrome and tell Python to paste the name there, or how we tell Python to copy the lyrics, paste them into a Microsoft Word document, and then save it.

I've been reading (and am still reading) several books on Python: Byte of Python, Learn Python the Hard Way, Python for Dummies, Beginning Game Development with Python and Pygame. However, it seems that I only (or almost only) learn to create programs that work on their own; I can't tell my program to do things with other programs that are already installed on my computer.

I know that my question sounds rather silly, but I really want to know how it works: how we tell Python to recognize that this part of the Google Chrome browser is the address bar and that it should paste the name of the song into it. The whole idea of making Python interact with another program is really vague to me, and I very much want to grasp it.

Thank you to everyone who spends their time reading my very long question.

ttriet204

Plangent answered 11/1, 2013 at 23:13 Comment(9)

Look at code.google.com/p/pywinauto. I've accomplished some decent automation with it. – Turbulent 11/1, 2013 at 23:15

You are approaching the problem from a "user perspective" when you should approach it from a "programmer perspective"; you don't need to open a browser, copy the text, open Word or whatever, you need to perform the appropriate HTTP requests, parse the relevant HTML, extract the text and write it to a file from inside your Python script. All the tools to do this are available in Python (in particular you'll need urllib2 and BeautifulSoup). – Dacy 11/1, 2013 at 23:15

@Matteo, true, but product testers need to think of things from a "user perspective" – Rubetta 11/1, 2013 at 23:17

And, in full agreement with Matteo if you don't go the automation route, look at wwwsearch.sourceforge.net/mechanize. – Turbulent 11/1, 2013 at 23:17

@CameronSparr: of course, but that's a very specialized need - they need to simulate user interaction; here, instead, what the user needs is to get some job done, and he naively thinks of replicating what a human would do, while there are much more straightforward ways to do it directly from code. – Dacy 11/1, 2013 at 23:18

Could be even simpler using a service that provides an API for .. would leave out the parsing and extracting – Purgatory 11/1, 2013 at 23:23

Is the point of this question "I need to write a program that does this", or "I want to use this as an excuse to learn how to interact with other apps"? – Heritage 11/1, 2013 at 23:33

you could use one of the sites that provide API to search for lyrics e.g., LyricWiki API (I haven't tried it). – Chirography 12/1, 2013 at 1:12

@MatteoItalia: I've posted implementation in Python of your comment – Chirography 15/1, 2013 at 12:43

If what you're really looking into is a good excuse to teach yourself how to interact with other apps, this may not be the best one. Web browsers are messy, the timing is going to be unpredictable, etc. So, you've taken on a very hard task—and one that would be very easy if you did it the usual way (talk to the server directly, create the text file directly, etc., all without touching any other programs).

But if you do want to interact with other apps, there are a variety of different approaches, and which is appropriate depends on the kinds of apps you need to deal with.

Some apps are designed to be automated from the outside. On Windows, this nearly always means they have a COM interface, usually with an IDispatch interface, for which you can use pywin32's COM wrappers; on Mac, it means an AppleEvent interface, for which you use ScriptingBridge or appscript; on other platforms there is no universal standard. IE (but probably not Chrome) and Word both have such interfaces.
Some apps have a non-GUI interface—whether that's a command line you can drive with popen, or a DLL/SO/DYLIB you can load up through ctypes. Or, ideally, someone else has already written Python bindings for you.
Some apps have nothing but the GUI, and there's no way around doing GUI automation. You can do this at a low level, by crafting WM_ messages to send via pywin32 on Windows, using the accessibility APIs on Mac, etc., or at a somewhat higher level with libraries like pywinauto, or possibly at the very high level of selenium or similar tools built to automate specific apps.
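As a rough sketch of the second case above (an app with a command-line interface), something like the following could drive a program through its standard input and output using the standard `subprocess` module (Python 3.7+). The helper name `run_tool` is made up for illustration, and the upper-casing child process is only a stand-in for whatever real tool you would call.

```python
import subprocess
import sys


def run_tool(command, input_text):
    """Run a command-line program, feed it text on stdin, return its stdout."""
    completed = subprocess.run(
        command,
        input=input_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout


# Stand-in for a real tool (an assumption for this sketch):
# a child Python process that upper-cases whatever it reads from stdin.
upper_case_tool = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

print(run_tool(upper_case_tool, "we are the champions"))
```

No window is ever opened here: the other program is started, given its input, and read back entirely from code.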

So, you could do this with anything from selenium for Chrome and COM automation for Word, to crafting all the WM_ messages yourself. If this is meant to be a learning exercise, the question is which of those things you want to learn today.

Let's start with COM automation. Using pywin32, you directly access the application's own scripting interfaces, without having to take control of the GUI from the user, figure out how to navigate menus and dialog boxes, etc. This is the modern version of writing "Word macros"—the macros can be external scripts instead of inside Word, and they don't have to be written in VB, but they look pretty similar. The last part of your script would look something like this:

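import win32com.client  # provided by the pywin32 package

word = win32com.client.Dispatch('Word.Application')
word.Visible = True
doc = word.Documents.Add()
# Selection belongs to the Application object, not to the Document
word.Selection.TypeText(my_string)
doc.SaveAs(r'C:\TestFiles\TestDoc.doc')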

If you look at Microsoft Word Scripts, you can see a bunch of examples. However, you may notice they're written in VBScript. And if you look around for tutorials, they're all written for VBScript (or older VB). And the documentation for most apps is written for VBScript (or VB, .NET, or even low-level COM). And all of the tutorials I know of for using COM automation from Python, like Quick Start to Client Side COM and Python, are written for people who already know about COM automation, and just want to know how to do it from Python. The fact that Microsoft keeps changing the name of everything makes it even harder to search for. How would you guess that googling for OLE automation, ActiveX scripting, Windows Scripting Host, etc. would have anything to do with learning about COM automation? So, I'm not sure what to recommend for getting started. I can promise that it's all as simple as it looks from that example above, once you do learn all the nonsense, but I don't know how to get past that initial hurdle.

Anyway, not every application is automatable. And sometimes, even if it is, describing the GUI actions (what a user would click on the screen) is simpler than thinking in terms of the app's object model. "Select the third paragraph" is hard to describe in GUI terms, but "select the whole document" is easy—just hit control-A, or go to the Edit menu and Select All. GUI automation is much harder than COM automation, because you either have to send the app the same messages that Windows itself sends to represent your user actions (e.g., see "Menu Notifications") or, worse, craft mouse messages like "go (32, 4) pixels from the top-left corner, click, mouse down 16 pixels, click again" to say "open the File menu, then click New".

Fortunately, there are tools like pywinauto that wrap up both kinds of GUI automation to make it a lot simpler. And there are tools like swapy that can help you figure out what commands you want to send. If you're not wedded to Python, there are also tools like AutoIt and Actions that are even easier than using swapy and pywinauto, at least when you're getting started. Going this way, the last part of your script might look like:

# Illustrative pseudocode: the exact names depend on the library you choose
word.Activate()
word.MenuSelect('File->New')
word.KeyStrokes(my_string)
word.MenuSelect('File->Save As')
word.Dialogs[-1].FindTextField('Filename').Select()
word.KeyStrokes(r'C:\TestFiles\TestDoc.doc')
word.Dialogs[-1].FindButton('OK').Click()

Finally, even with all of these tools, web browsers are very hard to automate, because each web page has its own menus, buttons, etc. that aren't Windows controls, but HTML. Unless you want to go all the way down to the level of "move the mouse 12 pixels", it's very hard to deal with these. That's where selenium comes in—it scripts web GUIs the same way that pywinauto scripts Windows GUIs.

Heritage answered 11/1, 2013 at 23:32 Comment(3)

Hi Abarnert! Thank you so much. Though I do not understand much of what you explained due to my lack of knowledge, I think I have now grasped the basic idea of how to make Python work with other programs. I'm currently digging into some of the terminology you mentioned, like "GUI", "DLL/SO/DYLIB", "Python bindings" and "COM". I have looked them up on Google and Wikipedia; however, the explanations seemed rather "academic" to me. If you can, please point me to some reading material on these items. Thank you – Plangent 12/1, 2013 at 11:47

I'll edit the answer to add a bit more info. Also, now that I know you're on Windows, I can be more specific. – Heritage 12/1, 2013 at 20:25

Thanks again, Abarnert! You've been bearing with me all this time! I really appreciate what you've done and tried to do to break this very vague idea down into simple "chunks" for me! I learnt a lot from your answers. You're making this website a really useful and friendly place for a newbie like me. – Plangent 14/1, 2013 at 16:14

The following script uses Automa to do exactly what you want (tested on Word 2010):

def find_lyrics():
    print 'Please minimize all other open windows, then enter the song:'
    song = raw_input()
    start("Google Chrome")
    # Disable Google's autocompletion and set the language to English:
    google_address = 'google.com/webhp?complete=0&hl=en'
    write(google_address, into="Address")
    press(ENTER)
    write(song + ' lyrics filetype:txt')
    click("I'm Feeling Lucky")
    press(CTRL + 'a', CTRL + 'c')
    press(ALT + F4)
    start("Microsoft Word")
    press(CTRL + 'v')
    press(CTRL + 's')
    click("Desktop")
    write(song + ' lyrics', into="File name")
    click("Save")
    press(ALT + F4)
    print("\nThe lyrics have been saved in file '%s lyrics' "
          "on your desktop." % song)

To try it out for yourself, download Automa.zip from its Download page and unzip into, say, c:\Program Files. You'll get a folder called Automa 1.1.2. Run Automa.exe in that folder. Copy the code above and paste it into Automa by right-clicking into the console window. Press Enter twice to get rid of the last ... in the window and arrive back at the prompt >>>. Close all other open windows and type

>>> find_lyrics()

This performs the required steps.

Automa is a Python library: To use it as such, you have to add the line

from automa.api import *

to the top of your scripts and the file library.zip from Automa's installation directory to your environment variable PYTHONPATH.

If you have any other questions, just let me know :-)

Paolo answered 13/1, 2013 at 14:35 Comment(11)

Hi Michael Herrmann. Thanks a lot. I had never thought that anyone would spend time writing the whole code for me (I'll use your code as a reference to study, since I really want to put what I've learnt to some practical use :D). I'm sure your code will make an excellent example for me as well as a guide. I'm downloading Automa now and I just can't wait to try out your work! Thank you so much. I'm sure I'll have more questions to ask later. – Plangent 14/1, 2013 at 16:9

First, this is going to copy and paste all of the text of the lyrics results into a Word file, not just the lyrics. Which is almost certainly not what you want. Many lyrics sites are designed to not let you copy-paste the lyrics; even if you're lucky, you'll get a whole mess of navigation, links, etc. surrounding the lyrics. (Also, this will quit both apps when it's done, which is probably not what you want either, but that's easier to fix.) – Heritage 14/1, 2013 at 20:22

Second, the blog post http://www.getautoma.com/blog/find-song-lyrics-with-automa about this SO question is highly misleading. "… the other answers consist of… the general tone 'it's too difficult'"? Really, showing the 5 lines of win32com code to create a new file in Word is too difficult, but showing the 7 lines of Automa code to do the same thing is not? The hard part is extracting the lyrics text—which this answer just punts on by not actually solving it. – Heritage 14/1, 2013 at 20:23

Finally, you should probably mention that Automa only comes as a trial download, and neither the trial limits nor the purchase price are available anywhere on the site. – Heritage 14/1, 2013 at 20:25

Glad you like it ttriet :-) If you think it's the best answer it'd be great if you could accept it as such here! – Paolo 14/1, 2013 at 21:53

Abarnert: My solution follows the steps asked for in the question to the dot. With those steps - using Google to search for and open any lyrics site that may appear as a search result - it is impossible to copy just the lyrics in the general case. My solution does sometimes copy unwanted text, however the addition of the filetype:txt parameter reduces this to a small extent. For instance for the input The times they are a changin, it returns precisely the required result. ... – Paolo 14/1, 2013 at 22:17

P.S.: About the "trial" @ttriet204: I talked to my colleagues - We think it's cool how you want to learn something new and we'd be happy to give you a free license, if you want :-) – Paolo 14/1, 2013 at 23:21

Hi Michael Herrmann. I tried downloading Automa but I just can't get it. I used IDM to download it and it always stopped at 80%. Maybe my PC has problems downloading files... It's very kind of you (and your colleagues as well) to offer me a free license. That is, without any doubt, a great deed. However, I can't take your offer (though I really want to :D) since I don't want to take advantage of you and your colleagues. Thank you. – Plangent 15/1, 2013 at 11:34

Oh, and actually, Abarnert's answer to me was helpful. I wrote "though I do not understand much of what you said", it was due to my very limited programming and general computer understanding, not because of his answer. One more thing, he did answer into my question, which was the "concept" of how Python interacted with other programs. – Plangent 15/1, 2013 at 11:37

It's a pity that the website only lets me accept one answer as the so-called "official" answer. Please do believe that all of the answers on this page were helpful to me, and I want to thank you all. Since Abarnert has been bearing with me from the beginning and his answers have helped me a lot, I will choose his as the "accepted answer" – Plangent 15/1, 2013 at 11:42

Automa appears to no longer be available or maintained. Links for automa are no longer valid. – Jeddy 28/3, 2022 at 14:23

Here's an implementation in Python of @Matteo Italia's comment:

You are approaching the problem from a "user perspective" when you should approach it from a "programmer perspective"; you don't need to open a browser, copy the text, open Word or whatever, you need to perform the appropriate HTTP requests, parse the relevant HTML, extract the text and write it to a file from inside your Python script. All the tools to do this are available in Python (in particular you'll need urllib2 and BeautifulSoup).

#!/usr/bin/env python
import codecs
import json
import sys
import urllib
import urllib2

import bs4  # pip install beautifulsoup4

def extract_lyrics(page):
    """Extract lyrics text from given lyrics.wikia.com html page."""
    soup = bs4.BeautifulSoup(page)
    result = []
    for tag in soup.find('div', 'lyricbox'):
        if isinstance(tag, bs4.NavigableString):
            if not isinstance(tag, bs4.element.Comment):
                result.append(tag)
        elif tag.name == 'br':
            result.append('\n')
    return "".join(result)

# get artist, song to search
artist = raw_input("Enter artist:")
song = raw_input("Enter song:")

# make request
query = urllib.urlencode(dict(artist=artist, song=song, fmt="realjson"))
response = urllib2.urlopen("http://lyrics.wikia.com/api.php?" + query)
data = json.load(response)

if data['lyrics'] != 'Not found':
    # print short lyrics
    print(data['lyrics'])
    # get full lyrics
    lyrics = extract_lyrics(urllib2.urlopen(data['url']))
    # save to file
    filename = "[%s] [%s] lyrics.txt" % (data['artist'], data['song'])
    with codecs.open(filename, 'w', encoding='utf-8') as output_file:
        output_file.write(lyrics)
    print("written '%s'" % filename)
else:
    sys.exit('not found')

Example

$ printf "Queen\nWe are the Champions" | python get-lyrics.py

Output

I've paid my dues
Time after time
I've done my sentence
But committed no crime

And bad mistakes
I've made a few
I've had my share of sand kicked [...]
written '[Queen] [We are the Champions] lyrics.txt'

Chirography answered 15/1, 2013 at 12:42 Comment(3)

@J.F. Sebastian I know this post/answer is old, but I tested your code and it failed. It seems the API format at lyrics.wikia.com/api.php has changed. Could you point me in the right direction or to a link for getting this to work again? I searched that API site and there are content/wiki APIs, but I didn't find a lyrics API anywhere. – Tat 21/3, 2016 at 13:37

@MichaelSM: the specific API is not the point of the answer. The point is expressed in Matteo Italia's comment and the code is used as a mere illustration (to demystify the effort required), though the code did work at the time of posting. If lyrics.wikia.com's API is discontinued, you could try some other similar service. – Chirography 21/3, 2016 at 15:12

@J.F. Sebastian: Thanks for your reply. I am particularly interested in the lyrics API. could you recommend a good lyrics API? – Tat 23/3, 2016 at 13:19

If you really want to open a browser, etc., look at selenium. But that's overkill for your purposes. Selenium is used to simulate button clicks, etc., for testing the appearance of websites in various browsers. Mechanize is less of an overkill for this.

What you really want to do is understand how a browser (or any other program) works under the hood i.e. when you click on the mouse or type on the keyboard or hit Save, what does the program do behind the scenes? It is this behind-the-scenes work that you want your python code to do.

So, use urllib, urllib2 or requests (or heck, even scrapy) to request a web page (learn how to put together the url to a google search or the php GET request of a lyrics website). Google also has a search API that you can take advantage of, to perform a google search.
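To make "putting together the URL" concrete, here is a minimal Python 3 sketch using the standard library (`urllib.parse.urlencode` is the Python 3 home of what Python 2 called `urllib.urlencode`). The host `lyrics.example` and the parameter names `artist` and `song` are placeholders invented for this sketch, not a real service, and `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_search_url(base_url, **params):
    """Return base_url with params appended as a URL-encoded query string."""
    return base_url + "?" + urlencode(params)


# Placeholder endpoint and parameter names (assumptions, not a real API).
url = build_search_url(
    "https://lyrics.example/api",
    artist="Queen",
    song="We Are the Champions",
)
print(url)  # https://lyrics.example/api?artist=Queen&song=We+Are+the+Champions

# With a real service, the page could then be fetched with something like:
#   import urllib.request
#   page = urllib.request.urlopen(url).read()
# (not run here, because the host above is fictional)
```

This is the code equivalent of typing into the address bar: the browser builds the same kind of query string from whatever you enter in a search form.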

Once you have the results of your page request, parse them with xml, beautifulsoup, lxml, etc. and find the section of the response that has the information you're after.
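As a sketch of that "find the section" step without any third-party package, the standard library's `html.parser` can pull the text out of one element. The sample page and the class name `lyricbox` are assumptions made up for the example (real lyrics sites each use their own markup), and `DivTextExtractor` and `extract_text` are hypothetical names.

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collect the text inside the first <div> that has a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0       # > 0 while we are inside the target div
        self.done = False    # True once the target div has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            if tag == "div":
                self.depth += 1          # nested div inside the target
            elif tag == "br":
                self.parts.append("\n")  # line breaks become newlines
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.css_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_text(html, css_class):
    """Return the text of the first div with css_class, <br> as newlines."""
    parser = DivTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


# Made-up sample page standing in for a downloaded lyrics page.
sample_page = """
<html><body>
<div class="nav">Home | Search</div>
<div class="lyricbox">First line<br>Second line<!-- ad marker --><br/>Third line</div>
</body></html>
"""
print(extract_text(sample_page, "lyricbox"))
```

Navigation text and HTML comments are skipped, which is exactly the part a blind select-all-and-copy in a browser cannot do.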

Now that you have your lyrics, the simplest thing to do is open a text file and dump the lyrics in there and write to disk. But if you really want to do it with MS Word, then open a doc file in notepad or notepad++ and look at its structure. Now, use python to build a document with similar structure, wherein the content will be the downloaded lyrics.
If this method fails, you could look into pywinauto or similar to automate pasting the text into an MS Word doc and clicking on Save.
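The "open a text file and dump the lyrics in there" option needs only a few lines. This is a minimal sketch: `save_lyrics` is a hypothetical helper, and replacing the characters Windows forbids in file names with underscores is just one possible policy for song titles such as "AC/DC".

```python
import re
from pathlib import Path


def save_lyrics(song, lyrics, folder="."):
    """Write lyrics to '<song> lyrics.txt' inside folder and return the path."""
    # Replace characters that Windows does not allow in file names.
    safe_name = re.sub(r'[<>:"/\\|?*]', "_", song).strip()
    path = Path(folder) / (safe_name + " lyrics.txt")
    path.write_text(lyrics, encoding="utf-8")
    return path
```

Calling `save_lyrics("We Are the Champions", text)` would create "We Are the Champions lyrics.txt" in the current directory, with no text editor involved.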

Citation: Matteo Italia, g.d.d.c from the comments on the OP

Boxboard answered 11/1, 2013 at 23:28 Comment(4)

doc is a binary format, it makes no sense to open it in notepad – Purgatory 12/1, 2013 at 0:37

@Esailija: I meant that output could be stored as a .txt file instead of .doc file – Boxboard 12/1, 2013 at 7:15

Hi InspectorG4dget! Thanks a lot for your answer. It really helped! I have downloaded pywinauto and am having fun with it. I originally picked your first suggestion, which is Selenium. However, I cannot download the file via the link you provided; maybe there's something wrong with my computer or my internet connection. I kept trying to download the file but couldn't manage to get it. I then looked for it on the internet but I just can't find a link (or one that works!). Please give me another link to Selenium if you can. I really appreciate your help – Plangent 12/1, 2013 at 11:56

seleniumhq.org is the homepage for Selenium. The instructions there should explain how to get the various bits of Selenium WebDriver that you need (and which bits those are). – Heritage 12/1, 2013 at 21:20

You should look into a package called selenium for interacting with web browsers

Rubetta answered 11/1, 2013 at 23:16 Comment(1)

Thank you! I'm still looking for Selenium to download. Can you give me the link? – Plangent 12/1, 2013 at 11:57
