How to Convert Office File to PDF File Format in C#

The code below works with Microsoft Office products that have the Save As PDF add-in installed.

Note that you will need to add a reference to the Microsoft.Office.Interop assembly for each application you use (Word, Excel, or PowerPoint).

<Word To PDF>

// Assumes: using Microsoft.Office.Interop.Word;
public string ConvertWordToPdf(string inputFile)
{
    string outputFileName = "Desired Output File Path";
    Microsoft.Office.Interop.Word.Application wordApp =
        new Microsoft.Office.Interop.Word.Application();
    Microsoft.Office.Interop.Word.Document wordDoc = null;
    object inputFileTemp = inputFile;

    try
    {
        // Open the source document and export it as a PDF.
        wordDoc = wordApp.Documents.Open(ref inputFileTemp);
        wordDoc.ExportAsFixedFormat(outputFileName, WdExportFormat.wdExportFormatPDF);
    }
    finally
    {
        // Always close the document and quit Word, even if the export fails.
        if (wordDoc != null)
        {
            wordDoc.Close(WdSaveOptions.wdDoNotSaveChanges);
        }
        if (wordApp != null)
        {
            wordApp.Quit(WdSaveOptions.wdDoNotSaveChanges);
            wordApp = null;
        }
    }

    return outputFileName;
}

<Excel To PDF>

// Assumes: using System; using System.Runtime.InteropServices;
//          using Microsoft.Office.Interop.Excel;
public static string ConvertExcelToPdf(string inputFile)
{
    string outputFileName = "Desired Output File Path";
    Microsoft.Office.Interop.Excel.Application excelApp =
        new Microsoft.Office.Interop.Excel.Application();
    excelApp.Visible = false;
    Workbook workbook = null;
    Workbooks workbooks = null;
    try
    {
        workbooks = excelApp.Workbooks;
        workbook = workbooks.Open(inputFile);
        workbook.ExportAsFixedFormat(XlFixedFormatType.xlTypePDF, outputFileName,
            XlFixedFormatQuality.xlQualityStandard, true, true, Type.Missing, Type.Missing,
            false, Type.Missing);
    }
    finally
    {
        // Release each COM object so the Excel process can actually exit.
        if (workbook != null)
        {
            // SaveChanges is a boolean-style argument; false discards changes.
            workbook.Close(false);
            Marshal.FinalReleaseComObject(workbook);
            workbook = null;
        }
        if (workbooks != null)
        {
            workbooks.Close();
            Marshal.FinalReleaseComObject(workbooks);
            workbooks = null;
        }
        if (excelApp != null)
        {
            excelApp.Quit();
            Marshal.FinalReleaseComObject(excelApp);
            excelApp = null;
        }
    }

    return outputFileName;
}

<PowerPoint To PDF>

// Assumes: using System; using System.Runtime.InteropServices;
//          using Microsoft.Office.Core; using Microsoft.Office.Interop.PowerPoint;
public static string ConvertPowerPointToPdf(string inputFile)
{
    string outputFileName = "Desired Output File Path";
    Microsoft.Office.Interop.PowerPoint.Application powerPointApp =
        new Microsoft.Office.Interop.PowerPoint.Application();
    Presentation presentation = null;
    Presentations presentations = null;
    try
    {
        presentations = powerPointApp.Presentations;
        // Open without marking read-only, without an untitled copy, and without a window.
        presentation = presentations.Open(inputFile, MsoTriState.msoFalse, MsoTriState.msoFalse,
            MsoTriState.msoFalse);

        presentation.ExportAsFixedFormat(outputFileName, PpFixedFormatType.ppFixedFormatTypePDF,
            PpFixedFormatIntent.ppFixedFormatIntentScreen, MsoTriState.msoFalse,
            PpPrintHandoutOrder.ppPrintHandoutVerticalFirst, PpPrintOutputType.ppPrintOutputSlides,
            MsoTriState.msoFalse, null, PpPrintRangeType.ppPrintAll, string.Empty,
            false, true, true, true, false, Type.Missing);
    }
    finally
    {
        // Release each COM object so the PowerPoint process can exit.
        if (presentation != null)
        {
            presentation.Close();
            Marshal.ReleaseComObject(presentation);
            presentation = null;
        }
        if (presentations != null)
        {
            Marshal.ReleaseComObject(presentations);
            presentations = null;
        }
        if (powerPointApp != null)
        {
            powerPointApp.Quit();
            Marshal.ReleaseComObject(powerPointApp);
            powerPointApp = null;
        }
    }

    return outputFileName;
}

Kyoungsu Do
Software Quality Engineer
ImageSource, Inc.

Deadly sins of software development

InfoWorld published a really good article on the 7 sins of software development.  While this is a good list, I would add the following sins as well:

8) Not picking the right technology

This one is mainly due to staying in a comfort zone and failing to keep up with better tooling or proven techniques in the industry. It’s the old saying about forcing a square peg into a round hole, or: if all you have is a hammer, then everything looks like a nail.

9) Not thinking about performance

If performance is critical for a project, test it early and test it often. Don’t wait until you are about to ship to start making performance changes, because by then it will be too little, too late.

10) Not considering security

Isn’t this item obvious? :)

11) Not talking to users

Unless you are writing the software for yourself, which I do every once in a while ;) , get up and go talk to the users. They will love you for it and appreciate your software even more.

If they tell you your software is terrible, thank them, think about what they said and improve on it.  If you get too attached to your software, you will never be able to make it better.

Building Out Distributed Apps (Big Data)

Yesterday, I attended a webinar by O’Reilly on how to reduce the pain of building out distributed applications. The focus was on scalability, which makes sense, since this is why you would want to distribute your applications.

Apart from the host’s unfortunate resemblance to Little Lord Fauntleroy, there were some interesting observations to be made. To wit:

Engineers versus Ops

When there’s an issue affecting your customer in large systems, it is most likely an engineering issue, especially in emerging products. You need to staff up on Engineering talent for your projects at a much greater rate than Ops.

Data is not always relational

Data these days is more than OLAP stuff. Things being captured and crunched include data graphs, key-value pairs, etc. So, something non-SQL based might be called for as a datastore. Only a handful of SQL features are used in most large data projects. As the data sets get larger, SQL gets less useful.

Real-time versus Batch Processing

Something to consider: how is your data being created, in onesie-twosie fashion online or in large batches? The answer will affect your underlying architecture.

Cost of Research

It is very easy to underestimate the cost of research when moving into a new area. Executive management wants hard numbers to be able to plan and manage costs, but anybody who’s developed new systems knows that costs tend to be unpredictable because you just don’t know what you don’t know yet.

What is your experience involving Big Data and Distributed Applications?

Using the ‘using’ keyword in C#

In the System.Data.SqlClient namespace, SqlConnection and SqlCommand are two examples of managed types that use unmanaged resources beneath the runtime. Microsoft says that all such types must implement the IDisposable interface.

Also from Microsoft:
As a rule, when you use an IDisposable object, you should declare and instantiate it in a using statement. The using statement calls the Dispose method on the object in the correct way, and …it also causes the object itself to go out of scope as soon as Dispose is called. Within the using block, the object is read-only and cannot be modified or reassigned.

using (SqlConnection conn = GetConnection())
{
    // do work
}

The using statement ensures that Dispose is called even if an exception occurs. This is the same as wrapping it in a try/finally block:

// Acquire the resource before the try block so conn is definitely assigned.
SqlConnection conn = GetConnection();
try
{
    // do work
}
finally
{
    if (conn != null)
    {
        conn.Dispose();
    }
}

What I like to do is to nest my SqlClient objects in using statements:

string sqlString = "select * from myTable";
using (SqlConnection conn = GetConnection())
{
    using (SqlCommand cmd = new SqlCommand(sqlString, conn))
    {
        // do work, like add params, execute the statement
        // read the results, etc

        // even if you return from inside this nested using statement
        // both the cmd and conn objects are disposed properly

    } // cleans up cmd
} // cleans up conn
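
To see the disposal behavior without a database, here is a minimal sketch. TrackedResource and UsingDemo are hypothetical names invented for this illustration; TrackedResource simply stands in for SqlConnection and SqlCommand and records when Dispose is called.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for SqlConnection/SqlCommand: logs its own disposal.
public sealed class TrackedResource : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public TrackedResource(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public void Dispose()
    {
        _log.Add("disposed " + _name);
    }
}

public static class UsingDemo
{
    // Returns from inside the nested using blocks; both resources are still disposed.
    public static string ReturnEarly(List<string> log)
    {
        using (TrackedResource conn = new TrackedResource("conn", log))
        {
            using (TrackedResource cmd = new TrackedResource("cmd", log))
            {
                return "result";
            }
        }
    }

    // Throws from inside the using blocks; Dispose still runs as the exception unwinds.
    public static void ThrowInside(List<string> log)
    {
        using (TrackedResource conn = new TrackedResource("conn", log))
        using (TrackedResource cmd = new TrackedResource("cmd", log))
        {
            throw new InvalidOperationException("simulated failure");
        }
    }
}
```

After either call, the list passed in contains "disposed cmd" followed by "disposed conn": the inner resource is cleaned up first, then the outer one.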

Very nice cheatsheet for C#

I came across this the other day.

http://www.fincher.org/tips/Languages/csharp.shtml

Don’t forget to come see us at Nexus 2010

  

JavaScript escape() and C#

Recently, I needed to transmute a Web form into a Windows form for a client. There was a subtle issue involving parameters to a SQL stored proc; the stored proc returned matches and near matches to the input values. The return values were missing special characters like ampersand (&).
I kept looking at the output routine trying to figure out where the data was getting cooked. I walked back up the execution chain, checking the inputs until I found a call to escape() in the JavaScript on the original web form.
The solution was to import the Microsoft.JScript assembly and call the Microsoft.JScript.GlobalObject.escape() method.
Another good day at ImageSource, Inc.
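
For readers curious about what escape() actually does to a string, here is a hand-written sketch of the encoding as I understand it from the ECMAScript description: letters, digits and the characters @ * _ + - . / pass through, other characters below 256 become %XX, and everything else becomes %uXXXX. JsEscape is a hypothetical class written for illustration; it is not the Microsoft.JScript implementation, and the fix described above is still to call Microsoft.JScript.GlobalObject.escape().

```csharp
using System;
using System.Text;

// Hypothetical helper that mimics JavaScript's legacy escape() encoding.
public static class JsEscape
{
    // Characters that escape() leaves untouched.
    private const string Unescaped =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@*_+-./";

    public static string Escape(string input)
    {
        if (input == null) throw new ArgumentNullException("input");

        StringBuilder sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (Unescaped.IndexOf(c) >= 0)
            {
                sb.Append(c);                                     // safe character, copy as-is
            }
            else if (c < 256)
            {
                sb.Append('%').Append(((int)c).ToString("X2"));   // e.g. '&' becomes %26
            }
            else
            {
                sb.Append("%u").Append(((int)c).ToString("X4"));  // e.g. U+4E2D becomes %u4E2D
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, JsEscape.Escape("Tom & Jerry") gives Tom%20%26%20Jerry, which shows why an unescaped ampersand on the Windows form side no longer matched what the stored proc had been receiving from the web form.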

  

Adventures with Oracle IPM 11g (part 1)

Part 1.

Oracle started shipping its latest IPM product, 11g, earlier this quarter. As a company, ImageSource has been working as a partner with Oracle on this next step in their ECM evolution. I, personally, am a little late to the 11g game, so I will be sharing with you my impressions as a relative newcomer to the 11g world.

First off, 11g is a complete rewrite from IPM 7.7.x and the 11g install is big. The download is well over a gigabyte and I ended up downloading two other components in excess of 700 MB apiece. Installation is no picnic either. It’s been my experience over the years that Oracle produces products which are hard to install and configure properly, but once properly set up, they can outperform anything else in their class.

I will admit that I was unable to get my 11g system installed properly (I still say it’s Oracle’s fault, but that’s a long story; it also had to do with the undersized server I was attempting to use), but the guys in the Tech Support and Services department are letting me use one of their installs to get my dev system booted.

11g  is written in Java and lives on top of WebLogic and the rest of Oracle’s Fusion Middleware products, so it’s time to dust off the old Java coding skills. The Fusion Middleware stuff is good news for customers and developers who want to integrate with any number of Oracle back-end systems (PeopleSoft, JD Edwards, etc). It will be much more tightly integrated.

My first impression of the new web interface for the Imaging piece is that it’s nice: not “wow!”, but better than before. Some of the icons look a little clunky, but I suppose we will get used to them. The controls I’ve seen have both Tooltips and annoying Ajax popup menus. I’ve not had a chance to look very deeply into the Process Management piece, which is built on BPEL, but I’ll be blogging more on it as I get deeper into it.

The API into the system is kind of the reverse from the IPM 7.7 way of doing things. There, the WebService API called down into the COM layer. Here, the Java client-side calls all go into the WebServices on the server. There’s no actual “client” codebase or DLL to call into. You log into one of several webservices (ApplicationService, ConnectionService, DocumentContentService, DocumentService, ImportExportService, LoginService, PreferenceService, SearchServiceService, SecurityService, TicketService) and proceed from there.

  

Distributed Capture & Document Capture

Capture is only a part of the ECM universe, but a crucial part nonetheless. Once a document is captured into an Enterprise Content Management system, it must be stored, perhaps put into a workflow process, archived, and made available for retrieval. Retrieval is in many ways the main thrust of an ECM system (no point putting it in there if you can’t ever see it again); retrieval is dependent on the index values associated with the document, which brings us back to capture.

Capture is the process of getting documents (and their data) into the system. Distributed Capture is the mechanism by which documents from a variety of locations (near and far) enter the system. The easiest way to do this is to utilize the file system. When different offices (or locations — work from home, anyone?) of a company are on the same network, specific locations on the shared file system can be designated for various purposes. Different directories can be used to input different kinds of documents.
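
As a rough sketch of the directory-per-document-type idea, the snippet below classifies a file by the name of the folder it was dropped into. The folder names, document types, and the CaptureRouter class are all hypothetical, made up for this illustration; a real deployment would define its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical router: maps a drop-folder name to a document type.
public static class CaptureRouter
{
    // Assumed folder-to-type mapping; real systems would load this from configuration.
    private static readonly Dictionary<string, string> FolderToDocType =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "invoices", "Invoice" },
            { "contracts", "Contract" },
            { "hr", "HR Record" }
        };

    // Returns the document type for a file based on its containing folder,
    // or null when the folder is not a known capture location.
    public static string ClassifyByFolder(string filePath)
    {
        string directory = Path.GetDirectoryName(filePath);
        if (string.IsNullOrEmpty(directory))
        {
            return null;
        }

        string folderName = Path.GetFileName(directory);
        string docType;
        return FolderToDocType.TryGetValue(folderName, out docType) ? docType : null;
    }
}
```

A file saved to a share such as /capture/invoices/scan001.pdf would be classified as an Invoice, while a file in an unmapped folder returns null and can be held for manual indexing.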

I thought we were going to be paperless by now

This type of taxonomy works okay for existing electronic documents (Word files, spreadsheets, PDFs, etc); but what about hard-copy? The seemingly ubiquitous paper which exists in our so-called paperless office? Well, it needs to be scanned in. You want documents classified in a consistent manner, and the metadata (index values and other interesting info about the document) as accurate and as consistent as possible.

Consistency is key. When setting up a company-wide ECM system, a key success factor is that everybody follows the same set of procedures and guidelines for getting documents into the system. This can be accomplished by having a distributed capture system available.

The company I work for makes and sells a distributed capture system today. As we go through our roadmap discussions for where we want to take the product to solve customers’ future problems, we developers have to grapple with some fundamental issues, mainly: what is the best technology to use as a platform?

It’s easy to imagine using the web to provide distributed document capture throughout your enterprise. You have centrally managed web servers. Everyone has a web browser on their computer (and cell phone, for that matter). In fact, anyone who’s ever attached a document using an html-based email program has already exercised the base technology necessary for a distributed capture system. One key advantage of Distributed Capture is that you get rid of paper at the source; take a moment to think about the implications of that. It’s okay, I’ll wait.

What else is needed…
There are two main improvements to simply uploading a document by way of a web page. One is the acquisition of the paper document, the other is the user-experience and business process to build into the hosting program. I’ll go into the physical acquisition in a later post, but the user-experience of a distributed capture system has to provide two things to be successful. It must be Dead Simple to Use and it must provide the functionality necessary to get good data into the system.

Our checking with users shows again and again that a single button is an attractive interface, with more functionality exposed as needed. One key question developers raise is which technology to build the interface in.

Technology: HTML
Pros: Standards compliant, supported by all browsers.
Cons: Primarily a static user interface. AJAX can add some Zing to the interface, but is problematical in certain situations (back-button, anybody?)

Technology: Flash
Pros: Ubiquitous; Flash player in something like 90% of all browsers.
Cons: Began life as an animation scripting language, although ActionScript 3.0 is more sophisticated. IDE support is poor. Hard to get my head wrapped around the timeline model.

Technology: Silverlight
Pros: Microsoft integration and toolset. Microsoft has an army of developers working on tools and technologies; big changes in how Microsoft handles internet computing are emerging.
Cons: Current market adoption is a little slow. Microsoft talks the big talk about cross-platform now, but has a history of embracing, extending, then co-opting technology (in my opinion).

Technology: JavaFX
Pros: Ubiquitous. Many very good VMs out there. Java itself is well suited to backend, server-side development.
Cons: UI is not Java’s strong suit; AWT ring a bell?

Technology: Platform Specific Code
Pros: Leverage native functionality, look and feel.
Cons: Lots of code bases to implement and maintain. Cross-platform toolkits and libraries tend to dumb down the functionality to the lowest common denominator.

I’m sure anybody reading this has ideas of their own about the pros and cons of the platforms listed out, and perhaps other ideas to add to the list. I welcome your comments.
