SharePoint Document Converter to PDF Using Oracle Outside In (How-to)

I’ve seen several examples of creating document converters in SharePoint that convert documents to PDF format, using tools like Aspose and other print-based solutions.

I recently did a proof of concept for a customer that used a different product, Outside In PDF Export from Oracle. This is an interpretive converter that can take a slew of file formats (including MS Office) and automatically convert them into a PDF file. If you aren’t familiar with PDF conversion, there are two basic kinds:

  • Print-based conversion – This is the Acrobat Distiller approach. It uses a printer driver that accepts the PostScript output from an application (such as Microsoft Word) and converts it to PDF. This approach results in the most accurate conversions, but is the hardest to automate because it involves opening the native application and automating its “Print” functionality.
  • Interpretive conversion – This is an approach that reads the contents of a native file format and translates the contents into PDF format. This results in potentially less accurate conversions, but is much easier to automate.

Outside In, and in particular the PDF Export SDK, is an EXE and a set of DLLs that contain the logic to convert over 400 different file types to PDF. In combination with Transformation Server (a web services wrapper for PDF Export), you can create SharePoint Document Converters that will convert your documents to PDF format.

How I Did It

For the proof of concept, I followed these steps to get it working:

  • Download PDF Export, Transformation Server, and SrvAny.exe
  • Install and Configure the Transformation Server and Web Service
  • Develop and Install the Document Converter

Download PDF Export 8.3.0 and Transformation Server 8.2.0

You can download both products here:

http://www.oracle.com/technology/products/content-management/oit/oit_dl_otn.html

You’ll have to register with an Oracle account to get it, but it is a freely available trial version, and does not appear to have any trial expiration limitations.

You will also need SrvAny.exe to run Transformation Server as a Windows Service. I found this site that had a download:

http://www.tacktech.com/display.cfm?ttid=197

Install and Configure the Transformation Server and Web Service

Extract the contents of the Transformation Server zip you downloaded to a folder on the server that will do the document conversions (e.g. “C:\PDFConverter”). There will be a bunch of DLLs, some WSDL files (we’ll use those later), some EXEs (which are run as a service), and some XML config files.

Transformation Server doesn’t really know anything about PDF generation; it is just a web service wrapper, so we’ll need to place the PDF Export library into our folder so it can use it. Extract the contents of PDF Export 8.3.0 to a separate folder on your machine. Grab every DLL in the root folder, as well as exporter.exe, px.cfg, and cmmap000.bin, and copy those files into “C:\PDFConverter”.

Open the file called server_startup.xml in “C:\PDFConverter”. This configures the service to listen on a specific port and hostname/IP address. I went ahead and changed the port, but you can leave it as is if you wish; just make a note of the port for later.

Go ahead and double-click tsmanager.exe to make sure it is working. It should open a console window and show you what host/port it is listening on.

[Screenshot: tsmanager console]

Press Ctrl + C to quit.

Now we need to run this program as a service, so that it is always available, even when nobody is logged on to the server. Place instsrv.exe and srvany.exe in your C:\PDFConverter folder. Then, from a command prompt, use instsrv.exe to install srvany.exe as a service named Transformation Server.

Open regedit, and browse to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Transformation Server.

Create a new key called Parameters, and add a new string value inside this key called Application, with a value of C:\PDFConverter\tsmanager.exe.

[Screenshot: Transformation Server registry entries]

Go to the Services console and start the Transformation Server service. Optionally, you can also open the Properties for this service and change the account that the service runs under.

Develop the Document Converter

Create a new .NET C# console application project and solution from within Visual Studio 2008, called PDFConverter.

Right-click the project in Solution Explorer and choose Add Service Reference…

Since the WSDL files for Transformation Server are not compliant with WCF services, choose the Advanced… button on the Add Service Reference dialog, and then choose the Add Web Reference… button to add a .NET 2.0 style web service reference. In the Add Web Reference dialog, enter the path to the WSDL file: C:\PDFConverter\transform_net_2005.wsdl. Hit Go.

Once your web reference has resolved, enter TransformationServer in the Web reference name box and click the Add Reference button.

Open your Properties/Settings.settings file. There will be one setting in the file for the web service URL. Change this to match the hostname and port configured in server_startup.xml for Transformation Server. Make sure to include a trailing “/transform” on the end, e.g.:

http://www.lifeonplanetgroove.com/transform

A document converter is just a console EXE that SharePoint calls with command line arguments, so we’ll set up our console app to handle the arguments. In Program.cs, before static void Main, add the following code (taken from the WSS SDK):

Add the following code inside static void Main:
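As a minimal sketch of the argument handling, assuming SharePoint invokes the converter with the -in, -out, -config and -log switches described in the WSS SDK's document converter documentation, the parsing could look something like this. The ConverterArgs class and its member names are illustrative and are not taken from the SDK sample.

```csharp
using System;

// Hypothetical helper: holds the file paths SharePoint passes to the converter.
public sealed class ConverterArgs
{
    public string InputFile;   // value of -in (required)
    public string OutputFile;  // value of -out (required)
    public string ConfigFile;  // value of -config (optional)
    public string LogFile;     // value of -log (optional)

    // Parses "-in <file> -out <file> [-config <file>] [-log <file>]".
    // Throws ArgumentException when a switch is unknown, a value is missing,
    // or a required switch is absent.
    public static ConverterArgs Parse(string[] args)
    {
        ConverterArgs result = new ConverterArgs();
        for (int i = 0; i < args.Length; i += 2)
        {
            if (i + 1 >= args.Length)
                throw new ArgumentException("Missing value for " + args[i]);

            string value = args[i + 1];
            switch (args[i].ToLowerInvariant())
            {
                case "-in": result.InputFile = value; break;
                case "-out": result.OutputFile = value; break;
                case "-config": result.ConfigFile = value; break;
                case "-log": result.LogFile = value; break;
                default:
                    throw new ArgumentException("Unknown switch: " + args[i]);
            }
        }
        if (result.InputFile == null || result.OutputFile == null)
            throw new ArgumentException("Both -in and -out are required.");
        return result;
    }
}

public static class Program
{
    public static int Main(string[] args)
    {
        try
        {
            ConverterArgs parsed = ConverterArgs.Parse(args);
            // The actual conversion call would go here.
            Console.WriteLine("Converting {0} to {1}", parsed.InputFile, parsed.OutputFile);
            return 0;
        }
        catch (ArgumentException ex)
        {
            // A non-zero exit code is used here by convention to signal failure.
            Console.Error.WriteLine(ex.Message);
            return 1;
        }
    }
}
```

Keeping the parsing in its own class makes it easy to exercise from a command prompt, e.g. PDFConverter.exe -in test.docx -out test.pdf, before SharePoint is involved at all.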

Now that the command line arguments can be parsed, it is time to use the web service and do the document conversion. The Transformation Server download contains some C# code samples, and I was able to use them essentially as-is, modifying only the input and output file paths. After the argument handling code in static void Main, add the following code:

Most of this code is cut-and-paste from the sample, and we are just swapping the input file path (source.spec.str) and output file path (sink.spec.str) with the arguments from the command line.

Installing the Converter

To install the converter, you need to create a feature file, install the feature, and place the converter EXE in the proper directory.

Create a feature.xml file and add the following:

Create an Elements.xml file, and add the following code. You can create as many document converter nodes as you need, one for each file type you want to convert (just make sure each has a unique GUID).
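The shape of these two files can be sketched with a small generator that emits one DocumentConverter node per source extension, each with a fresh GUID. This is only a sketch under assumptions: the Feature and DocumentConverter attributes used here (Id, Title, Scope, Name, App, From, To) follow the WSS SDK feature schema as I understand it, the feature title is made up, and the FeatureFiles class is a hypothetical helper, not part of SharePoint.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical generator for the two feature files.
public static class FeatureFiles
{
    // Namespace used by SharePoint feature definition files.
    private static readonly XNamespace Sp = "http://schemas.microsoft.com/sharepoint/";

    // feature.xml: a web-application-scoped feature pointing at Elements.xml.
    public static XDocument BuildFeature(string title)
    {
        return new XDocument(
            new XElement(Sp + "Feature",
                new XAttribute("Id", Guid.NewGuid().ToString("D")),
                new XAttribute("Title", title),
                new XAttribute("Scope", "WebApplication"),
                new XElement(Sp + "ElementManifests",
                    new XElement(Sp + "ElementManifest",
                        new XAttribute("Location", "Elements.xml")))));
    }

    // Elements.xml: one DocumentConverter node per extension, each with its own GUID.
    public static XDocument BuildElements(string converterExe, params string[] extensions)
    {
        XElement root = new XElement(Sp + "Elements");
        foreach (string ext in extensions)
        {
            root.Add(new XElement(Sp + "DocumentConverter",
                new XAttribute("Id", Guid.NewGuid().ToString("B")),
                new XAttribute("Name", "From " + ext.ToUpperInvariant() + " to PDF"),
                new XAttribute("App", converterExe),
                new XAttribute("From", ext),
                new XAttribute("To", "pdf")));
        }
        return new XDocument(root);
    }
}

public static class Program
{
    public static void Main()
    {
        // Print both documents so they can be pasted into feature.xml and Elements.xml.
        Console.WriteLine(FeatureFiles.BuildFeature("PDF Document Converters"));
        Console.WriteLine();
        Console.WriteLine(FeatureFiles.BuildElements("PDFConverter.exe", "doc", "docx", "xls", "vsd"));
    }
}
```

Generating the nodes in a loop is just a convenience for the unique-GUID requirement; writing the XML by hand works equally well for a handful of file types.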

Place both these files in a folder called PDFConverter in your …/12/TEMPLATE/Features folder, and use stsadm to install the feature and activate it on a particular web application.

After activating the feature, go to Central Administration, Applications tab, and click the Document Conversions link. Ensure that your web application has document conversions turned on. You should see your document converters in the list at the bottom of the screen:

[Screenshot: Document Conversion Settings]

Since you are using a Web Service, a proxy class needs to be generated dynamically, so you’ll need to give your SharePoint Document Conversion user account rights to create this in C:\Windows\Temp (or whatever your system temp directory is). Grant List Folder Contents and Read permissions to the Document Conversion account (usually a local machine account that starts with “HVU_”) on your Temp folder.

The Document Conversion service stores temporary documents in the following folder: C:\Program Files\Microsoft Office Servers\12.0\BIN\HtmlTrLauncher. By default, this folder only grants the Document Conversion user account access. Since the Transformation Server Windows service is running under its own account, you need to give this account access to the folder. Grant the account that your Transformation Server Windows service is running under Read and Write access to this folder.
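If a conversion fails and you suspect these permissions, one way to check is a small probe program run under the account in question. The sketch below only tests whether the current account can create and delete a file in a given folder; FolderProbe and CanWriteTo are hypothetical names, and the program has to be launched as the service account for the result to mean anything.

```csharp
using System;
using System.IO;

// Hypothetical diagnostic: checks write access by creating and deleting a probe file.
public static class FolderProbe
{
    // Returns true if the current account can create and delete a file in the folder.
    // Returns false on access-denied or I/O errors (including a missing folder).
    public static bool CanWriteTo(string folder)
    {
        try
        {
            string probe = Path.Combine(folder, Path.GetRandomFileName());
            using (File.Create(probe)) { }   // create an empty file and close it
            File.Delete(probe);
            return true;
        }
        catch (UnauthorizedAccessException) { return false; }
        catch (IOException) { return false; }
    }
}

public static class Program
{
    public static int Main(string[] args)
    {
        // Folder to test; defaults to the current account's temp folder.
        string folder = args.Length > 0 ? args[0] : Path.GetTempPath();
        bool ok = FolderProbe.CanWriteTo(folder);
        Console.WriteLine("{0}: {1}", folder, ok ? "writable" : "NOT writable");
        return ok ? 0 : 1;
    }
}
```

Pass the HtmlTrLauncher folder path as the first argument to test it; with no argument the probe checks the temp folder of the account it runs under.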

The last step involves copying over your compiled console EXE and exe.config file into the following SharePoint folder (the actual path may vary depending on your setup, but you should be able to find the TransformApps folder):

C:\Program Files\Microsoft Office Servers\12.0\TransformApps

Results

Navigate to your SharePoint site. Upload a document into your document library (make sure it has a file extension that you’ve configured in your Elements.xml file). Open the drop-down menu and choose Convert Document. Your document converter should appear in the list. Run the conversion. If all went well, refreshing your document library after about a minute should show your PDF file in the library.

[Screenshot: Generated PDF in document library]

Here is a picture of a generated Visio document in PDF format, not too bad:

[Screenshot: Exported Visio file]

6 comments on “SharePoint Document Converter to PDF Using Oracle Outside In (How-to)”
  1. Fantastic article/how-to/tutorial! Brilliant! But… 🙂

    I am having a few problems with permissions although I followed your instructions to the letter 🙂

    Here is the event log:

    Event Type: Warning
    Event Source: Userenv
    Event Category: None
    Event ID: 1509
    Date: 14-10-2009
    Time: 11:06:01
    User: I00841\HVU_I00841
    Computer: I00841
    Description:
    Windows cannot copy file C:\Documents and Settings\Default User\Application Data\Microsoft\CLR Security Config\v2.0.50727.42\security.config.cch.4940.305000 to location C:\Documents and Settings\HVU_I00841\Application Data\Microsoft\CLR Security Config\v2.0.50727.42\security.config.cch.4940.305000. Possible causes of this error include network problems or insufficient security rights. If this problem persists, contact your network administrator.

    DETAIL – Access is denied.

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

    And … the error:

    Event Type: Error
    Event Source: Windows SharePoint Services 3
    Event Category: Timer Job
    Event ID: 5448
    Date: 14-10-2009
    Time: 11:06:46
    User: N/A
    Computer: I00841
    Description:
    The document converter was not able to convert the file.

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

    Any insight would be greatly appreciated. Thanks.

    /Kevin

  2. Adam, GREAT ARTICLE. I have a question
    1. Can it take any office file (doc, docx, xls, xlsx, ppt, visio, pdf, jpg, jpeg, gif, tiff) and output to PDF/A with this method? If yes, please confirm or test as I dont have anything setup according to your instruction. It would REALLY REALLY help me.

    Thanks
    JB

  3. Loved it… until i tried using dynamic content in my document (xml binding) and fell out of love.. all in under a day.. I need to get my company to accept a sharepoint 2010 upgrade so i can make use of Word Automation Services 🙁
