Exporting

The export process connects to the database defined by the properties files of the portal project you are exporting from and generates a number of XML files in your export directory.  The export procedure for uPortal versions earlier than 3.0 differs from the one for 3.0 and later.  Read the appropriate section for the portal you are exporting from.

 uPortal >= 3.0

 To export simply run:

ant crn-export -Dtype=<type> -Ddir=<export_directory>

This will generate a number of XML files in the specified directory. The available types are printed if you call crn-export without a type, or can be examined in the export_internal.crn file.  Some types take a sysid parameter.  For example, to export a single layout, run:

ant crn-export -Dtype=layout -Ddir=<export_directory> -Dsysid=<username>

uPortal < 3.0

Exporting from an earlier version of uPortal is a bit of an adventure.  Some of the branches have had the export system backported, but they carry old versions of the scripts.  Eric Dalquist made a major commit to uPortal/trunk with r44591 on 1/13/2009.  As of January 2009, the improvements that came with this commit have yet to be backported.  For example, 2.5-patches had the export system committed with r42716 on 12/12/2007 and is well out of date.  The backports have an import-export.xml file that you run ant against.  For example:

ant -f import-export.xml export -Ddir=<export_dir> -Dtype=all

You could use this file, or else just add a crn-export task to your build.xml file.

When CalPoly performed its migration, the cernunnos scripts were under heavy development.  I first applied the 2.5-patches branch export commit to our portal, then merged in Eric Dalquist's export.zip contents and some of the newer scripts from uPortal/trunk.  Presently, it may be cleaner to simply work off uPortal/trunk.

 Recommended Procedure

Here is what I would recommend doing:

  1. Branch your portal project in your repository.  You may have an undeployable portal after all of this due to upgraded jars and changes to build.properties.  Better to keep this work cleanly separated.
  2. Copy the files from https://www.ja-sig.org/svn/uPortal/trunk under uportal-impl/src/main/resources/org/jasig/portal/io into your portal project.
  3. Hook into the scripts by copying import-export.xml from a backported branch or by just adding a crn-export task to your build.xml similar to the uPortal/trunk build.xml file.
  4. Get the latest cernunnos jar from uPortal/trunk and put it in your classpath.
  5. You may need to upgrade other jar files.  I used all of the new jars from Eric's export.zip patch found in this JIRA issue. I had to upgrade to Spring as well. In the end, instead of setting every jar in build.properties, I just added lib/*.jar to the ant classpath.
  6. Develop an export procedure as you troubleshoot the problems you encounter.

Defining your data migration procedure

Being very meticulous about the data migration will save you time and a lot of headaches.  Basically there are three stages:  pre-export, export, and post-export.  You should document in detail what should be done during each stage.

 Pre-Export

These are steps you need to perform while the database is disconnected before running crn-export.  CalPoly had these pre-export steps:

  1. Delete old groups in UP_GROUP
  2. Rename "DLM Nested tables" to "DLM XHTML" in UP_SS_THEME
  3. Rename "ASI" Channel category to "ASI Channels" in UP_GROUP entity type 4

I'm not sure if step 2 is strictly necessary because the import script should force the new theme name (see import.properties).  However, we included this step to be safe and to ensure that everything in the export files referenced the correct theme name.  Steps 1 and 3 exist because, at the time, the export scripts looked up groups by name, not id, so no two groups could have the same name.  In our case, we had a channel group with the same name as a person group and renamed one of them.

 Export

This is where you run the actual export command.  This process takes around 30 to 60 minutes.

  1. Check out and/or cd to the export branch of your portal.
  2. The portal needs to compile before exporting.  To do this we had to point our JAVA_HOME at 1.5 instead of 1.6.
  3. ant compile
  4. Run the export command appropriate for your project:
    1. ant -f import-export.xml export -Ddir=<export_dir> -Dtype=all
    2. ant crn-export -Ddir=<export_dir> -Dtype=all

Post-Export

In this step you will clean up your export files.  This involves removing users and layouts you don't want imported, fixing export errors, and so on.

At the time, we had an issue with UP-2246, and one post-export step was to edit a group file that had this issue.

In our case, we hand-edited our channels and fragment-layouts, so we deleted these files in the export directory and copied the hand-edited ones in their place.

The export process will export your template user, channel and entity types and some other import database entities.  However, we elected to use the ones from uPortal3's entities directory.  So we first removed the system, defaultTemplateUser, and fragmentTemplate user and layout files, removed the entity-type, channel-type, theme, and structure directories, and then copied all of them from entities.

If your process becomes complex you may want to create a script that makes the process more repeatable, which is nice if you test the process on dev and test environments a few times.  I've attached the script CalPoly used.

You'll notice this script calls movePlainUsers.py.  This is a python script I created that removes all .user files that don't have a corresponding layout file or entry in a group-membership.  We did this because we had over 50,000 users in UP_USER, many of which were no longer active.  After running this script, we brought the number down to 11,000.  New users will get a default layout just like we want.
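For readers without access to the attachment, the idea behind that cleanup can be sketched roughly as follows.  This is not the actual movePlainUsers.py; it is a minimal illustration that assumes export files are named <username>.user and <username>.layout, and that you have already collected the set of user names referenced by group-membership files (the format of those files is not covered here).  The function name and the holding directory are hypothetical, and the holding directory should live outside the export directory.

```python
import shutil
from pathlib import Path


def move_plain_users(export_dir, holding_dir, group_members):
    """Move .user files that have no layout file and no group membership.

    export_dir    -- directory holding the exported files (searched recursively)
    holding_dir   -- where "plain" .user files are moved rather than deleted
    group_members -- set of user names found in group-membership files
                     (gathering this set is left to the caller)
    Returns the sorted list of user names that were moved.
    """
    export_dir = Path(export_dir)
    holding_dir = Path(holding_dir)
    holding_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: a user's layout file is named <username>.layout
    with_layout = {p.stem for p in export_dir.rglob("*.layout")}

    moved = []
    # sorted() builds the full list first, so moving files is safe while looping
    for user_file in sorted(export_dir.rglob("*.user")):
        name = user_file.stem
        if name in with_layout or name in group_members:
            continue  # this user has data worth importing
        shutil.move(str(user_file), str(holding_dir / user_file.name))
        moved.append(name)
    return moved
```

Moving the files aside instead of deleting them makes it easy to restore a user if the check turns out to be too aggressive.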

 Export Tweaks

You may find that the export process logs to your portal.log file. In my case, it grew to over 40 GB!  If this is the case, you can reduce the number of log messages by adding the following line to the log4j.properties file:

log4j.logger.org.danann.cernunnos=WARN

On my machine I encountered issues with threading.  Certain threads seemed to die, resulting in large numbers of missing layout files and users.  Using just 1 thread resolved this.  To set the number of threads, edit export.crn and modify the line that looks like:

<attribute key="THREAD_COUNT">1</attribute>

 Analyzing Export Results

You'll want to follow up on the errors from the export process during your testing.  As the export process runs, it outputs DLM node errors when it fails to look up a node.  I added a debug line to print out every DLM node lookup cernunnos did so I could get the number of nodes for any layout.  I recorded the output from the export process and wrote a script to analyze it.  The output looks like this:

  5870/52597 (11%) users with DLM Warnings:
(sorted by number of warnings and nodes)

Warning Distribution
 1 warning:  4229 users
 2 warning:  1063 users
 3 warning:  330 users
 4 warning:  136 users
 5 warning:  53 users
 6 warning:  31 users
 7 warning:  10 users (pfoster, psouth, ascott, rglenn, mnojunas, wchalmer, sneill, kcashier, tholmgre, kbrauer)
 8 warning:  7 users (clavin, mrdumont, koconnel, atlee, hchavarr, sgburke, tmasek)
 9 warning:  1 users (lguggemo)
10 warning:  6 users (ylabiaga, sanguyen, adixit, skohan, tdorshor, dcoronaz)
11 warning:  0 users
12 warning:  3 users (jgalvan, tkempton, mgeis)
13 warning:  1 users (csermers)

The report script then prints the users in order of most errors. We used this during testing to pick different users to log in with and compare their production and uPortal 3.1 layouts. We tested a few of the common users with just one warning, and the users with lots of warnings.
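The analysis script itself is not reproduced here, but the distribution part of the report is simple to rebuild once you have a warning count per user.  The sketch below assumes you have already parsed your export log into a dict of user name to warning count; that parsing depends on your own debug output and is not shown.  The function names are hypothetical, and the cutoff of 10 for listing names is a guess based on the sample output above.

```python
from collections import defaultdict


def warning_distribution(warnings_per_user):
    """Group user names by how many DLM warnings each one produced."""
    by_count = defaultdict(list)
    for user, count in warnings_per_user.items():
        if count > 0:
            by_count[count].append(user)
    return by_count


def format_report(warnings_per_user, total_users, name_threshold=10):
    """Build a text report similar to the sample output shown above."""
    by_count = warning_distribution(warnings_per_user)
    affected = sum(len(users) for users in by_count.values())
    percent = round(100 * affected / total_users) if total_users else 0
    lines = [
        f"{affected}/{total_users} ({percent}%) users with DLM Warnings:",
        "Warning Distribution",
    ]
    for count in range(1, max(by_count, default=0) + 1):
        users = by_count.get(count, [])
        line = f"{count:2d} warning:  {len(users)} users"
        # Only list names for small groups; these are the interesting test users
        if 0 < len(users) <= name_threshold:
            line += " (" + ", ".join(users) + ")"
        lines.append(line)
    return "\n".join(lines)
```

Calling print(format_report(counts, total_users)) with your own counts gives a quick list of candidate users to log in as during testing.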

Verify all users with customized layouts were exported

If the export script chokes on exporting a layout it skips it.  I'm not sure if it prints a message or not.  You'll want to try to verify that every user that should have a layout file has one.  I did this by running:

SELECT user_name FROM up_user WHERE user_id IN (SELECT DISTINCT USER_ID FROM up_layout_struct WHERE type LIKE 'dlm:%') ORDER BY USER_NAME;

I exported these results to a text file and wrote some scripts to compare the layout directory users with the SQL results.  Note that this SQL query doesn't actually get all users with customized layouts. Some don't have "dlm:" in UP_LAYOUT_STRUCT even though they have customized layouts.  However, it helped me find a certain set of users that were missed, see: UP-2247
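The comparison amounts to a set difference.  Here is a minimal sketch, not the scripts I actually used, assuming the SQL results were saved with one user name per line and that exported layout files are named <username>.layout in a single directory.  The function name is hypothetical.

```python
from pathlib import Path


def users_missing_layouts(sql_results_file, layout_dir):
    """Return user names from the SQL results that have no exported layout file."""
    # Assumption: one user name per line; blank lines are ignored
    text = Path(sql_results_file).read_text()
    expected = {line.strip() for line in text.splitlines() if line.strip()}

    # Assumption: layout files are named <username>.layout
    exported = {p.stem for p in Path(layout_dir).glob("*.layout")}

    return sorted(expected - exported)
```

Any name this returns is a user the query says has a customized layout but for whom no layout file was exported.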