Pymerase Docs - LinkDB Tutorial

Brandon King
Copyright © 2003 California Institute of Technology

Version 0.1.7
Apr 30, 2003


1  The Goal
2  Tutorial Requirements
    2.1  Software Requirements
    2.2  Previous Knowledge Requirements
    2.3  Optional Example Files
3  Getting Help
    3.1  Pymerase Web Site
    3.2  Pymerase-devel Mailing List
    3.3  Trackers: Bugs, Feature Request, etc.
4  Tutorial: LinkDB Schema
    4.1  Description
    4.2  ArgoUML
5  Tutorial: Prepare Driver Scripts
    5.1  Extract XMI File
    5.2  Running Pymerase
    5.3  Driver Program Template
    5.4  Todo for All Templates
    5.5  Create SQL for Database
    5.6  Create Python Database API (DBAPI)
    5.7  Create Python Tkinter Widgets
    5.8  Create Python Tkinter Database Widgets
    5.9  Other Output Modules
6  Tutorial: Generating Files with Pymerase
    6.1  Description
    6.2  CreateSQL
    6.3  CreateDBAPI
    6.4  CreatePyTkWidgets
    6.5  CreatePyTkDBWidgets
7  Tutorial: Setting Up the Database
    7.1  PostgreSQL Setup
    7.2  Create the linkdb Database
    7.3  Input Data Using DB Widgets
8  Tutorial: Command Line Program
    8.1  Creating the Program
    8.2  Usage
9  Tutorial: Testing The Program
    9.1  Prepare the HTML File
    9.2  Process the HTML File
10  Making Pymerase Better
    10.1  How You Can Help
    11.1  Command Line Program

1  The Goal

The goal of this tutorial is to use Pymerase to create a simple application that keeps the keyword links on a web page up to date. Why might we want such a program? Suppose you are writing a web page about all your favorite Python modules and what you've done with them. If you decide to create a link every time the word 'Python' appears, adding a link at every occurrence would consume much of your time. Add ten more keywords like 'Python' that you would like to link, and updating your web page with links could become a nightmare.
It is true that this Python program would be fairly easy to write by creating your own file format and parser. Nevertheless, it is a good example of how Pymerase works and of its potential benefits.

2  Tutorial Requirements

2.1  Software Requirements

2.2  Previous Knowledge Requirements

2.3  Optional Example Files

All the files created in this tutorial can be found in the linkDB example provided with pymerase.

3  Getting Help

3.1  Pymerase Web Site

One of the first places you should visit if you need help is the Pymerase web site [10]. Most of the documentation you will need, if it's available, will be in the Docs section of the web site.

3.2  Pymerase-devel Mailing List

Pymerase developers and users use the pymerase-devel mailing list as a means of communication. If you need help or would like to track the development status of Pymerase, subscribing to the mailing list is a good idea. You can do this by clicking on the 'Mail Lists' button on the project's development site. Note that everyone is welcome to post about Pymerase, not just developers.

3.3  Trackers: Bugs, Feature Request, etc.

Currently, there are four trackers in use by the Pymerase project on the development site [11]. Please submit bugs and feature requests using these trackers. If you have a patch for Pymerase but do not want to become a developer, please use the Patch tracker. Support requests can be submitted to the tracker or to the pymerase-devel mailing list.

4  Tutorial: LinkDB Schema

4.1  Description

Now that we know what we want to do, we have to design the schema. If you haven't read section 1, you may want to do so now. Pymerase currently supports two ways of defining a schema. The first is an XML format defined by table.dtd; if you decide to use this format, you need to use the parseGenexSchemaXML input module for Pymerase. The other, which we are going to use, is UML, in this case drawn with ArgoUML. For it we will be using the parseXMI input module for Pymerase.
Before we actually make the schema in ArgoUML, let's describe it first. We want a simple object called "NameLinkPair" which will store a keyword and the URL we want to associate with this keyword in our html document. This would be good enough for our program, but if we want one set of keyword/URL pairs for one html file, and another set for a second html file, then this won't work very well. So, let's add a 'Group' object with a name attribute. This will be used in our command line program to define which group of keywords we want to use on a given file. We will need to define an association between our 'Group' object and our 'NameLinkPair' object. For the sake of simplicity, we will limit each 'NameLinkPair' object to exactly one 'Group' object (1 - 1), but a 'Group' object can have 0 - N 'NameLinkPair' objects. Many-to-many associations are possible, but are a little more complicated, so I will leave them out of this tutorial.
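Before drawing the schema, it can help to see the same description as plain Python. The sketch below is only an illustration of the two objects and their association. It is not code that Pymerase generates, the attribute names 'group' and 'name_link_pairs' and the example.org URLs are made up for the example, and it is written for a current Python 3 interpreter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Group:
    """A named set of keyword/URL pairs (holds 0 - N NameLinkPair objects)."""
    name: str
    name_link_pairs: List["NameLinkPair"] = field(default_factory=list)


@dataclass
class NameLinkPair:
    """A keyword and the URL it should link to; belongs to exactly one Group."""
    name: str
    url: str
    group: Group

    def __post_init__(self):
        # Register this pair with its group so both ends of the
        # association stay in sync.
        self.group.name_link_pairs.append(self)


# Placeholder data, not the tutorial's real links.
tutorial = Group("tutorial")
NameLinkPair("Python", "http://example.org/python", tutorial)
NameLinkPair("PostgreSQL", "http://example.org/postgresql", tutorial)
print([pair.name for pair in tutorial.name_link_pairs])
```

Every NameLinkPair must be given a Group when it is created, while a Group may start out empty, which is the 1 - 1 and 0 - N limit described above.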

4.2  ArgoUML

We should now create an ArgoUML file based on the description in section 4.1. I assume you either already know how to use ArgoUML, or you have read 'Pymerase Docs - ArgoUML' [12]. If you haven't, you should do so now.
When you finish creating the ArgoUML file it should look like Figure 1.
Figure 1: LinkDB Schema in ArgoUML
Make sure you change the namespace property to 'LinkAPI' by clicking on the 'Group' class and then on the namespace textbox. You should then be able to change the name. CreateDBAPI will use this as the package name of your generated DBAPI.
Note that if you change the names of the classes or attributes in this tutorial, you will have to do so throughout the whole tutorial.
Create a directory in which you will work on this tutorial and save your ArgoUML file as linkDB.zargo.

5  Tutorial: Prepare Driver Scripts

5.1  Extract XMI File

Pymerase cannot read the .zargo files created by ArgoUML, but it can read the .xmi files stored within the .zargo files. To extract the files, run the command below.
unzip linkDB.zargo linkDB_.xmi

If that command fails, try this command:
unzip linkDB.zargo

You should find a .xmi file in your directory. This is the file which you are going to point pymerase to in a variable called 'schema' in the driver program in section 5.3.
This is the reason why the input module for UML is called parseXMI and not parseUML.
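A .zargo file is an ordinary zip archive, which is why unzip works on it. If unzip is not available on your system, the same extraction can be sketched with Python's standard zipfile module. The function name 'extract_xmi' is made up for this example, and the code is written for a current Python 3 interpreter.

```python
import zipfile


def extract_xmi(zargo_path, dest_dir="."):
    """Extract every .xmi member of a .zargo archive into dest_dir.

    Returns the list of extracted member names.
    """
    extracted = []
    with zipfile.ZipFile(zargo_path) as archive:
        for member in archive.namelist():
            if member.lower().endswith(".xmi"):
                archive.extract(member, dest_dir)
                extracted.append(member)
    return extracted
```

Calling extract_xmi('linkDB.zargo') from your tutorial directory should leave the .xmi file next to the .zargo file, just as the unzip command does.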

5.2  Running Pymerase

If you look at the documentation "Pymerase Docs - Running Pymerase", you will notice that there are four methods of running Pymerase. At the time this tutorial was written, the only method for using UML/XMI with Pymerase was to use the Driver Program Template and Jython [13]. For this reason, we use the Driver Program Template for this tutorial. If you would like to use one of the other methods for running Pymerase, feel free. Note that you can also use the table.dtd XML [14] format (parseGenexSchemaXML input module) to define your schema.

5.3  Driver Program Template

Below you will find the Driver Program Template from "Pymerase Docs - Running Pymerase"
------------Driver Program Template-----------

#!/usr/bin/env python

import sys
import os
import pymerase

if __name__ == '__main__':
  #Path to schema
  schema = os.path.abspath('./path2schema/schema') 
  #Output Path
  outputPath = os.path.abspath('./outputPath')
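  #Run pymerase (call reconstructed from the placeholder names used in
  # section 5.4; the argument order is an assumption)
  pymerase.run(schema, 'nameOfInputModule', outputPath, 'nameOfOutputModule')
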
---End Driver Program Template---
For this tutorial, all driver programs will be using '.../path2schema/linkdb_.xmi' for the schema variable and 'parseXMI' for the input module.

5.4  Todo for All Templates

For each of the driver programs, create a .py file and paste in the template from section 5.3. Change the schema path to the path of the .xmi file we extracted in section 5.1, and replace 'nameOfInputModule' with 'parseXMI'.

5.5  Create SQL for Database

The CreateSQL output module is used to generate the SQL statements you need to create your database in PostgreSQL.
Create a file in your tutorial directory called '' and paste in the template from section 5.3. Make sure you've changed the schema path to the .xmi file we extracted in section 5.1 and replaced 'nameOfInputModule' with 'parseXMI'.
In addition to these changes, which apply to every driver program we create, change 'nameOfOutputModule' to 'CreateSQL' and './outputPath' to './linkDB.sql'.
Once you generate your linkDB.sql file later in the tutorial, it should look like the following:
CREATE SEQUENCE "group_pk_seq" start 1 increment 1
  maxvalue 2147483647 minvalue 1 cache 1;

CREATE TABLE "group" (
  "group_pk" integer DEFAULT
    nextval('"group_pk_seq"'::text) PRIMARY KEY,
  "name" varchar(128)
) ;

CREATE SEQUENCE "name_link_pair_pk_seq" start 1 increment 1
  maxvalue 2147483647 minvalue 1 cache 1;

CREATE TABLE "name_link_pair" (
  "name_link_pair_pk" integer DEFAULT
    nextval('"name_link_pair_pk_seq"'::text) PRIMARY KEY,
  "name" varchar(128),
  "url" varchar(128),
  "group_fk" integer
) ;
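
To get a feel for how 'group_fk' ties each name_link_pair row to one group, you can try the same two tables in an in-memory SQLite database from Python. This is only a sketch under stated assumptions: SQLite stands in for PostgreSQL purely for convenience, the sequences are replaced by SQLite's INTEGER PRIMARY KEY, the rows are made-up placeholder data, and the code is written for a current Python 3 interpreter.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "group" (
  "group_pk" INTEGER PRIMARY KEY,
  "name" varchar(128)
);
CREATE TABLE "name_link_pair" (
  "name_link_pair_pk" INTEGER PRIMARY KEY,
  "name" varchar(128),
  "url" varchar(128),
  "group_fk" integer
);
""")

# One group, then two keyword/URL pairs that point back at it.
cur = conn.execute('INSERT INTO "group" ("name") VALUES (?)', ("tutorial",))
group_pk = cur.lastrowid
conn.executemany(
    'INSERT INTO "name_link_pair" ("name", "url", "group_fk") '
    'VALUES (?, ?, ?)',
    [("Python", "http://example.org/python", group_pk),
     ("Pymerase", "http://example.org/pymerase", group_pk)],
)

# Fetch every pair that belongs to the group named 'tutorial'.
rows = conn.execute(
    'SELECT p."name", p."url" FROM "name_link_pair" p '
    'JOIN "group" g ON p."group_fk" = g."group_pk" '
    'WHERE g."name" = ? ORDER BY p."name_link_pair_pk"',
    ("tutorial",),
).fetchall()
print(rows)
```

The join on group_fk is the lookup the command line program needs: given a group name, find every keyword/URL pair that belongs to it.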

5.6  Create Python Database API (DBAPI)

The CreateDBAPI output module is used to generate a Python Database API to the database you will generate from the SQL in section 5.5.
Create a new file in your tutorial directory named '' and paste in the driver program template from section 5.3. Change the output path from './outputPath' to './LinkAPI' and change 'nameOfOutputModule' to 'CreateDBAPI'.

5.7  Create Python Tkinter Widgets

The CreatePyTkWidgets output module is used to generate a library of Python Tkinter widgets based on your schema. You would use this if you would like to quickly create a GUI for your application.
Create a new file in your tutorial directory named '' and paste in the driver program template from section 5.3. Change the output path from './outputPath' to './widgets' and change 'nameOfOutputModule' to 'CreatePyTkWidgets'.

5.8  Create Python Tkinter Database Widgets

The CreatePyTkDBWidgets output module is used to add a few more widgets to the library of Python Tkinter widgets. These widgets have been linked up to the DBAPI for your database, and will allow you to view, edit, and create new records in your database.
Create a new file in your tutorial directory named '' and paste in the driver program template from section 5.3. Change the output path from './outputPath' to './widgets' and change 'nameOfOutputModule' to 'CreatePyTkDBWidgets'.

5.9  Other Output Modules

You may wish to generate other output modules in addition to the four used in this tutorial. For a complete list of output modules, visit the Pymerase Docs - Output Modules web page [15].

6  Tutorial: Generating Files with Pymerase

6.1  Description

Now you should have at least the following files in your tutorial directory.
In this section we will generate all the files we will need to make our command line program that will accomplish our goal from section 1.

6.2  CreateSQL

It is time to create the SQL for the database. Execute the following command.

If everything goes well, you should find a file called linkDB.sql in your tutorial directory. If something goes wrong, you probably don't have Pymerase set up correctly. Read the Pymerase Installation Documentation [16] or e-mail the pymerase-devel mailing list mentioned in section 3.2.

6.3  CreateDBAPI

Now execute the following command to create the DBAPI.

If everything went well, you should find a python package called 'LinkAPI'. We will use this later to access the data from our database.
Here is a quick example of how to use the DBAPI [17].
#!/usr/bin/env python

from LinkAPI import DBSession

if __name__ == '__main__':
  dbs = DBSession(dsn='localhost', database='linkdb',
                  user='userName', password=None)

  #get all name link pairs
  nameLinkPairList = dbs.getAllObjects(dbs.NameLinkPair)

  #get all groups
  groupList = dbs.getAllObjects(dbs.Group)

  #get group with primary key of 1
  groupId1 = dbs.getObject(dbs.Group, '1')

  #get name link pairs with primary keys 1, 3, 4
  nlpKeys134 = dbs.getObject(dbs.NameLinkPair,
                             ['1', '3', '4'])

  #get group by database field 'name'
  nameGroup = dbs.getObjectsWhere(dbs.Group, 'name = \'myGroup\'')

  #get NameLinkPairs associated with 'myGroup'
  myGroup = nameGroup[0]

  nlp4myGroupList = myGroup.getNameLinkPair()

6.4  CreatePyTkWidgets

Execute the following command to generate the Python Tkinter Widget library for your schema.

You will be prompted for the name of the DBAPI package you want to use with these widgets. In this case, you will enter 'LinkAPI'.
If everything went well, you should have a directory named 'widgets' in your tutorial directory. In that directory you should find widgets that do the following:
    Adds functionality to the OptionMenu widget.
    Entry widgets for user entry of data.
    Controls entry widget navigation.
    Tells a given entry widget to save itself.
    Adds validation to Tkinter Entry widgets.
    Session object for passing information around. Also contains a generic DB connection widget.
    Static variables for choosing widget mode.
All files can be executed to see if they were constructed properly. They don't do much in this state, but they have functions for getting and setting the ValidatingEntry widgets. Each of the widgets can be subclassed and given save() and load() functions, which will then allow them to be hooked up to the SaveWidget and NavBar widgets. In the next section we will generate Python Tkinter Database Widgets which use these features.
For more information on the entry widgets, please visit the Pymerase Docs web site or send e-mail to the mailing list mentioned in section 3.2.
The widgets will need a copy of the LinkAPI package you generated in section 6.3. You can copy, move, or create a symbolic link for this purpose. Execute the following command to copy the package from your tutorial directory into the widget directory.
cp -r LinkAPI/ widgets/

6.5  CreatePyTkDBWidgets

Execute the following command to generate the Python Tkinter Database Widget library for your schema.

You will be prompted for the name of the DBAPI package you want to use with these widgets. In this case, you will enter 'LinkAPI'.
If all went well, you should find files in your widget directory. Each of these programs can be executed upon creation. Each one is hooked up to the generic DB connection widget and will prompt for the information necessary to connect to the database. It can become annoying to enter that information every time you want to connect to the same database, so there is a 'db.cfg' file you can drop into the widget directory. Here is an example.
dsn: localhost
database: linkdb
user: userName

If the connection to the database fails using the config file, it will load the generic connection widget and load the config file values into it.
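The db.cfg example above is a plain list of 'key: value' lines. As a rough illustration of how such a file could be read, here is a minimal sketch. It is not the code the generated widgets actually use, the function name 'read_db_cfg' is made up for this example, and it is written for a current Python 3 interpreter.

```python
def read_db_cfg(text):
    """Parse 'key: value' lines into a dict, skipping lines without a colon."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


example = """\
dsn: localhost
database: linkdb
user: userName
"""
print(read_db_cfg(example))
```

A dictionary like this holds the same values you would otherwise type into the connection widget.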
We will use these DbWidgets to input data into our database later in the tutorial.

7  Tutorial: Setting Up the Database

7.1  PostgreSQL Setup

If you have not installed PostgreSQL 7.2 or higher, you should do so now. If you have installed PostgreSQL and have configured it to your liking, move on to the next section. If you get stuck, the mailing list in section 3.2 is there for you.

7.2  Create the linkdb Database

The first thing we need to do is create the linkdb database. If you are running Linux, you can use the following command.
createdb -h localhost -U userName linkdb

If you get the message 'CREATE DATABASE', the command succeeded. Next you need to feed the SQL statements into the database to generate the proper tables. Use the following command.
psql -h localhost -U userName linkdb < linkDB.sql

If you get the following message, everything went well.
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit
   index 'group_pkey' for table 'group'
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit
   index 'name_link_pair_pkey' for table 'name_link_pair'

7.3  Input Data Using DB Widgets

Now that we have our database, let's put some data in it for us to use. Go to the widgets directory in your tutorial directory. Execute the following command.
python ./

Unless you configured the db.cfg file mentioned in section 6.5, you should see the generic database connection widget shown in Figure 2.
Figure 2: Generic Database Connection Widget
If you are using the widget on the same computer as the database, use 'localhost' as the host. Otherwise you can enter a domain name or IP address, and it should work as long as the database has been configured to accept external connections. Enter 'linkdb', the database name we used when creating the database, and then enter your username and password if needed.
Once you've connected to the database, you should be able to add groups to your database. These groups will define groups of links you will use for a specific web site. Enter 'tutorial' as the name of the first group, as shown in Figure 3, and then press the 'Save' button.
    |<<   First Record          >>|   Last Record
    |<    Previous Record       >|    Next Record
    >*    New Record
Figure 3: Group Database Widget
Close the GroupDbWidget and then execute the following command to load up the NameLinkPair Entry Widget.
python ./

Connect to the database using the generic connection widget and then we are ready to add Name Link Pairs and associate them with a group. Later on we will create a simple web page with the following keywords on it.
Keyword URL
Enter each of the Name Link Pairs into the database as shown in Figure 4. After typing each entry, press the 'Save' button and then the 'New Record' button shown in Figure 3.
Figure 4: NameLinkPair Database Widget

8  Tutorial: Command Line Program

8.1  Creating the Program

Once the data has been entered into the database, it's time to make the command line program which you will use. You can use the file included in the linkDB example included with Pymerase, or you can reference the same program in the appendix (section 11.1). Or, if you're really daring, you can write it yourself from scratch.
If you decide to use the appendix or the example code provided, the Python code is commented and should be fairly self-explanatory. If you don't agree or you run into problems, posts to the mailing list mentioned in section 3.2 are always welcome.

8.2  Usage

The following is a print out of the instructions you get when you run:
python ./ --help
    Looks for names in an html file that matches the database
    and replaces them with a link.
  Useage: [options] -g group -f file [options] --group=group --file=file

    -h, --host=foo       Name of host. Default: localhost
    -d, --database=foo   Name of database. Default: linkdb
    -u, --user=foo       User login for DB. Default: ENV USERNAME
    -p, --password=foo   User password for DB. Default: None

    -g, --group=foo      Name of group to use for processing
    -f, --file=foo       File to be processed

        --help           Displays this help page

9  Tutorial: Testing The Program

9.1  Prepare the HTML File

Before we can test the program we need an html file to use. Create a new text file called 'tutorial.html' and paste in the html code below.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

    <title>linkDB Tutorial</title>

9.2  Process the HTML File

The big question... is it going to work? Remember, the program is case sensitive, so 'Pymerase' is different from 'pymerase'. If you'd like to expand on this example, adding an option for ignoring case would be a good thing to do.
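As a starting point for such an option, here is a minimal sketch of the replacement step with an optional ignore-case switch. It is not part of the tutorial program: the function name 'add_links' and its 'ignore_case' parameter are made up for this example, the example.org URL is a placeholder, and the code is written for a current Python 3 interpreter. Unlike the tutorial program, it also escapes the keyword with re.escape so that punctuation in a keyword is matched literally.

```python
import re


def add_links(html, name, url, ignore_case=False):
    """Wrap occurrences of name in a link to url.

    A match that is immediately followed by '</a>' is assumed to be
    linked already and is left alone.
    """
    flags = re.IGNORECASE if ignore_case else 0
    pattern = re.compile(re.escape(name) + r"(?!</a>)", flags)
    # A function is used as the replacement so that the capitalisation
    # found in the page is kept and backslashes in url are not
    # treated as escapes.
    return pattern.sub(
        lambda match: '<a href="%s">%s</a>' % (url, match.group(0)), html)


page = "Pymerase and pymerase<br>"
print(add_links(page, "Pymerase", "http://example.org/", ignore_case=True))
```

With ignore_case left at its default of False, only the exact spelling 'Pymerase' is linked.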
You may want to reference section 8.2 when executing the following command. You may wish to change the options based on your database setup.
python ./ -h localhost -d linkdb -u king -g tutorial -f tutorial.html

In case something went wrong, the program makes a backup of your 'tutorial.html' file in a file called 'tutorial.html.bak'.
If everything went well, the end result should look like the following.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

    <title>linkDB Tutorial</title>
      <a href="">Pymerase</a><br>
      <a href="">Postgresql</a><br>
      <a href="">Python</a><br>

10  Making Pymerase Better

10.1  How You Can Help

If you have any questions, comments, ideas, spelling corrections, etc., please voice them. Open Source projects thrive on feedback from the community. If you would like to contribute to any part of the project, please send a message to the mailing list mentioned in section 3.2. We could use all the help we can get. =o)


11.1  Command Line Program

#!/usr/bin/env python
#                                                                         #
# C O P Y R I G H T   N O T I C E                                         #
#  Copyright (c) 2003 by:                                                 #
#    * California Institute of Technology                                 #
#                                                                         #
#    All Rights Reserved.                                                 #
#                                                                         #
# Permission is hereby granted, free of charge, to any person             #
# obtaining a copy of this software and associated documentation files    #
# (the "Software"), to deal in the Software without restriction,          #
# including without limitation the rights to use, copy, modify, merge,    #
# publish, distribute, sublicense, and/or sell copies of the Software,    #
# and to permit persons to whom the Software is furnished to do so,       #
# subject to the following conditions:                                    #
#                                                                         #
# The above copyright notice and this permission notice shall be          #
# included in all copies or substantial portions of the Software.         #
#                                                                         #
# SOFTWARE.                                                               #
#       Authors: Brandon King
# Last Modified: $Date: 2003/04/30 23:51:51 $

import os
import sys
import getopt
import re

from LinkAPI import DBSession

def parseCommandLine():
  """
  Processes Command Line Arguments

  return (groupName, fileName, dsn, database, user, password)
  """

  #Set default values for command line
  groupName = None
  fileName = None
  dsn = "localhost"
  database = "linkdb"
  try:
    user = os.environ["USERNAME"]
  except KeyError:
    user = None
  password = None

  try:
    #define command line args
    # (long option names taken from the usage text)
    opts, args = getopt.getopt(sys.argv[1:], "h:d:u:p:g:f:",
                               ["help", "host=", "database=", "user=",
                                "password=", "group=", "file="])
  except getopt.GetoptError:
    print "-------------------------------"
    print "- Invalid Command Line Option -"
    print "-------------------------------"
    printUseage()
    sys.exit(2)

  if len(sys.argv) <= 1:
    printUseage()
    sys.exit(2)

  for arg, val in opts:

    #Display Help ('-h' is taken by --host, so only the long form works)
    if arg == '--help':
      printUseage()
      sys.exit(0)

    #Set the group name to use when accessing the database
    if arg in ('-g', '--group'):
      groupName = val

    #Set file name to use
    if arg in ('-f', '--file'):
      fileName = val

    #host to connect to. Default = localhost
    if arg in ('-h', '--host'):
      dsn = val

    #name of database. Default = linkdb
    if arg in ('-d', '--database'):
      database = val

    #user name. Default = $USERNAME environment var
    if arg in ('-u', '--user'):
      user = val

    #password. Default = None
    if arg in ('-p', '--password'):
      password = val

  #Report missing required arguments and quit
  msg = ""
  if groupName is None:
    msg += "Error: No group provided"
    msg += os.linesep
  if fileName is None:
    msg += "Error: No file provided"
    msg += os.linesep
  if msg != "":
    msg += "Suggestion: see \' --help\'"
    msg += os.linesep
    print msg
    sys.exit(2)

  return (groupName, fileName, dsn, database, user, password)

def printUseage():
  """
  Prints the useage information
  """
  useage = """
    Looks for names in an html file that matches the database
    and replaces them with a link.
  Useage: [options] -g group -f file [options] --group=group --file=file

    -h, --host=foo       Name of host. Default: localhost
    -d, --database=foo   Name of database. Default: linkdb
    -u, --user=foo       User login for DB. Default: $USERNAME
    -p, --password=foo   User password for DB. Default: None
    -g, --group=foo      Name of group to use for processing
    -f, --file=foo       File to be processed

        --help           Displays this help page
  """
  print useage

def getFileData(fileName):
  """
  opens file if exists
  returns data
  """
  if os.path.isfile(fileName):
    f = open(fileName, 'r')
    data = f.read()
    f.close()
    return data
  else:
    print "Invalid file name \"%s\"" % (fileName)
    return None

if __name__ == '__main__':
  #retrieve command line arguments if valid
  groupName, fileName, dsn, database, user, password = parseCommandLine()

  #set fileName to absolute path
  fileName = os.path.abspath(fileName)

  #Connects, or dies... Good luck!
  dbs = DBSession(dsn=dsn, database=database, user=user, password=password)

  #get the group object with a given name
  # this allows you to sort your links in groups
  groupList = dbs.getObjectsWhere(dbs.Group, 'name = \'%s\'' % (groupName))

  #If you entered a name that doesn't exist (CASE sensitive),
  # the program quits
  if len(groupList) == 0:
    print "Group \"%s\" not found" % (groupName)
    sys.exit(2)

  #If one group exists
  if len(groupList) == 1:
    #get the group
    group = groupList.pop()
    #get all the links in the group
    linkerList = group.getNameLinkPair()

    #get the html file in string form for processing
    fileData = getFileData(fileName)
    if fileData is None:
      sys.exit(2)

    #save a backup copy of the file as fileName.bak
    f = open(fileName + '.bak', 'w')
    f.write(fileData)
    f.close()

    print "Processing Links:"
    for nameLink in linkerList:
      name = nameLink.getName()
      url = nameLink.getUrl()

      print "%s\t%s" % (name, url)

      #replace keyword with link using regexp
      # (keywords already followed by </a> are left alone)
      fileData = re.sub(name+"(?!</a>)", "<a href=\"%s\">%s</a>" % \
                        (url, name), fileData)

    #save html file
    f = open(fileName, 'w')
    f.write(fileData)
    f.close()
  else:
    #more than one group matched the name
    print "%s groups with name %s!" % (len(groupList), groupName)


[5] iporres/html/smw.html
[8] See for ArgoUML information
[9] Python Tutorial:
[12] Current version of this is in the tutorial directory in Pymerase CVS module Docs
[13] As Jython is no longer needed, support has been dropped
[14] If no documentation exists on the table.dtd XML format, please e-mail the mailing list mentioned in section 3.2.
[17] Check the Pymerase docs or e-mail the mailing list mentioned in section 3.2 for more help.

File translated from TEX by TTH, version 3.33.
On 30 Apr 2003, 16:47.