
27/03/2019 Add Images and Textboxes to PDF - CodeProject

Add Images and Textboxes to PDF


pmpdesign, 2 May 2007

A lightweight C# library to add images and 'round rectangles' to a PDF on the fly and then securely embed the PDF in a web page

Download source and examples - 44.0 KB

Introduction
I needed a method to create PDF documents on the fly, specifically invoices and similar financial documents. This software also
had to integrate seamlessly into an application I was developing.

Obviously there is a lot of software available, ranging from open source to fairly expensive commercial applications. However, I
wanted something that I could incorporate into a commercial application with minimal or no license fees and no support issues
arising from not having access to the source code.

Another criterion was to have minimal code. I saw no reason to have 3MB of code when I probably only needed 5% of it.

After searching the web for a long time, I finally came across an article on CodeProject (PDF Library for creating PDF with tables
and text, in C#). This excellent article by Zainu introduced me to the concepts behind creating a PDF.

Zainu's article introduces the basic concepts required to create a PDF structure and add text either as a single line/sentence or
formatted in a tabular fashion.

With Zainu's permission, I have extended his codebase to add JPG images, to add textboxes in the form of 'rounded rectangles'
(as seen in the image to the right) and finally, to display the finished PDF inside a web page rather than linking to it with the
usual <a href="abc.pdf">.

Background
In order to fully understand the code behind this library, you should read the article by Zainu as I will not cover the same topics
again here.

Some of the original code has been modified; these changes are all commented in the code itself.

https://www.codeproject.com/Articles/18623/Add-Images-and-Textboxes-to-PDF?display=Print 1/10

The workings of a PDF


Although I am not going to revisit the original article concepts, I will revisit the concept behind the PDF to show what is required to
add images and 'round rectangles'.

This is an example of the PDF markup that will be generated if you download and run the attached code.

Don't forget that even though you can read the markup in a text editor, a PDF is in fact a binary file and (excluding the simplest
cases) must be treated as such.

You can find a much more detailed explanation of the PDF by downloading the Adobe PDF Manual. It is only 1300 pages or so....

PDF Markup:
%PDF-1.5
%??

What it means:
This is a PDF version 1.5 file. The double question mark line is simply there so that FTP and similar packages know that this is a
binary file when transferring.

PDF Markup:
8 0 obj
<<
/Type /Page/Parent 2 0 R
/Rotate 0
/MediaBox [0 0 595 842]/CropBox [0 0 595 842]
/Resources<</ProcSet[/PDF/Text]
/Font<</T1 3 0 R/T2 4 0 R/T3 5 0 R/T4 6 0 R>>
/XObject <</I1 10 0 R >>>>
/Contents 9 0 R
>>
endobj

What it means:
The 'X 0 obj' means that this is an object in the PDF - the X is its unique number.
This object (8 0) describes one page, setting the page size and then defining the resources that will be needed, in this case the
fonts called T1, T2, T3 and T4 which are described in the objects 3 0, 4 0, 5 0 and 6 0 respectively.
The XObject refers in this case to an image called I1. The data that describes it can be found in the object 10 0.
Finally, the 'Contents' (essentially the markup which tells the PDF what to display) can be found in the object numbered 9 0.
The 'Parent' reference is to object 2 0, which records the number of pages in the entire document (in this case only 1) and shows
which object describes the contents of each page.

PDF Markup:
9 0 obj<</Length 989
>>stream
q
144 0 0 100 300 700 cm
1 0 0 1 0 0 cm
/I1 Do
Q
BT/T3 12 Tf
105 699 Td
(Round Rectangle Header) Tj
ET
endstream
endobj

What it means:
This object contains markup to describe the actual page and means such things as 'place the text "XYZ" at location x,y in font Z'
or 'draw an image located at x,y of size w,h' etc. In the code itself, this is SetStream.

PDF Markup:
1 0 obj<</Type /Catalog/Lang(EN-US)/Pages 2 0 R>>
endobj

What it means:
Root of the file. It says that the index (or pagetree) to the document can be found in object 2 0.

PDF Markup:
2 0 obj<</Count 1/Kids [ 8 0 R ]>>
endobj

What it means:
The document index - this document has one page (Kids) and the information can be found in object 8 0.

PDF Markup:
3 0 obj<</Type/Font/Name /T1/BaseFont/Times-Roman
/Subtype/Type1/Encoding /WinAnsiEncoding>>
endobj
4 0 obj<</Type/Font/Name /T2/BaseFont/Times-Italic
/Subtype/Type1/Encoding /WinAnsiEncoding>>
endobj
5 0 obj<</Type/Font/Name /T3/BaseFont/Times-Bold
/Subtype/Type1/Encoding /WinAnsiEncoding>>
endobj
6 0 obj<</Type/Font/Name /T4/BaseFont/Courier
/Subtype/Type1/Encoding /WinAnsiEncoding>>
endobj

What it means:
Describes the fonts used in the document.

PDF Markup:
10 0 obj
<</Name /I1
/Type /XObject
/Subtype /Image
/Width 144
/Height 100
/Length 29779
/Filter /DCTDecode
/ColorSpace /DeviceRGB
/BitsPerComponent 8
>> stream
[ byte data to represent the jpg image ]
endstream
endobj

What it means:
Describes an image.
Note that the actual byte data is missing - you will find out how to add that later.

PDF Markup:
7 0 obj<</ModDate(D:20070501024237+10'00')
/CreationDate(D:20070501024237+10'00')
/Title(Title)/Creator(Your App Name)
/Author(System Generated)
/Producer(www.My New App.com.au)/Company(My Company Name)>>
endobj

What it means:
Properties of the document, e.g. who created it and when.

PDF Markup:
xref
0 11
0000000000 65535 f
0000001275 00000 n
0000001332 00000 n
0000001374 00000 n
0000001473 00000 n
0000001573 00000 n
0000001671 00000 n
0000031745 00000 n
0000000014 00000 n
0000000234 00000 n
0000001766 00000 n

What it means:
The byte offsets of each object in the document. This is explained in the original article by Zainu.

PDF Markup:
trailer
<</Size 11
/Root 1 0 R
/Info 7 0 R
/ID[<5181383ede94727bcb32ac27ded71c68>
<5181383ede94727bcb32ac27ded71c68>]
>>

What it means:
'Root' refers to the starting point of the document (object 1 0 in this case), which in turn points to the pagetree.

PDF Markup:
startxref
31959
%%EOF

What it means:
End of the file. The number after 'startxref' is the byte offset of the xref table.
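Each xref entry shown above has a fixed layout: a 10-digit byte offset, a 5-digit generation number and a flag (n for an object in use, f for a free entry), which makes every entry exactly 20 bytes including the line ending. As a rough sketch only, assuming the byte offset of each object has already been recorded while writing it, the table could be produced along these lines. XrefWriter and BuildXref are hypothetical names for illustration and are not part of PDFLibrary.cs.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of PDFLibrary.cs.
public static class XrefWriter
{
    // offsets[i] is assumed to hold the byte offset of object number i + 1.
    public static string BuildXref(IList<long> offsets)
    {
        var sb = new StringBuilder();
        sb.Append("xref\r\n");
        // First object number in the table, then the number of entries.
        sb.Append("0 " + (offsets.Count + 1) + "\r\n");
        // Object 0 is the head of the free list and is never in use.
        sb.Append("0000000000 65535 f\r\n");
        foreach (long offset in offsets)
        {
            // 10-digit offset, 5-digit generation number, 'n' = in use.
            sb.Append(offset.ToString("D10") + " 00000 n\r\n");
        }
        return sb.ToString();
    }
}
```

Calling XrefWriter.BuildXref(new long[] { 1275, 1332 }) would give the header lines followed by the entries for objects 1 and 2 from the example above.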

Using the code


Download the zip file above and extract it to a suitable location (or create a new web application in Visual Studio). The zip file
contains six files:

PDFLibrary.cs
Default.aspx
Default.aspx.cs
streampdf.aspx
streampdf.aspx.cs
myimage.jpg

The file PDFLibrary.cs should be placed in the App_Code folder, the rest in the root of the application.

Point your browser at the default.aspx file and you should get a button displayed. Clicking this button should create the PDF and
display it within the web page.

How it all works


Let's first take a look at the 'round rectangles'.

These are based on 'Cubic Bezier Curves'. If you have ever used Photoshop or similar graphical software, you may have used this
method without even knowing it.

Essentially, all we do to create the 'round rectangle' is to use eight path segments to enclose an area: four corner curves, each
based on a Bezier curve, plus four straight lines which connect them. This path is then stroked to form the border, and the bounded
area is filled to form the background. A simple rectangle is then drawn on top of the Bezier area to form the text box.

If you are interested in the full details, have a look in the Adobe PDF Manual which describes the mathematics behind it. For the rest
of us, all we need to know is that it works!

Let's have a look at the actual code now. First we create a new object to represent the rectangle in code

RoundRectangle rr = new RoundRectangle();


and then specify the colours for the border, main background and background colour for the textbox.

ColorSpec rrBorder = new ColorSpec(0, 0, 0); //main border colour
ColorSpec rrMainBG = new ColorSpec(204, 204, 204); //background colour of the round rectangle
ColorSpec rrTBBG = new ColorSpec(255, 255, 255); //background colour of the rectangle on top of the round rectangle

Finally, as this is only markup as far as the PDF is concerned, we add the markup to the PDF content stream.

content.SetStream("q\r\n"); //save the PDF graphics state
content.SetStream(rr.DrawRoundRectangle(45, 582, 240, 130, 20, 0.55, 20, 90,
1, rrBorder, rrMainBG, rrTBBG)); //draw the rectangle
content.SetStream("Q\r\n"); //restore the PDF graphics state

There are twelve parameters for the method DrawRoundRectangle

LLX
LLY
rrWidth
rrHeight
CornerRadius
Circularity
HeaderHeight
TextBoxHeight
Border
BorderColor
MainBG
TextBoxBG


LLX and LLY are the horizontal and vertical coordinates of the lower left of the box, rrWidth and rrHeight are the width and height of
the box (remember all coordinates are in 1/72" rather than pixels).

The CornerRadius parameter is as shown in Figure 2. The HeaderHeight parameter is the vertical height of the area at the top where
you can later place text. It cannot be less than the radius, otherwise the text area rectangle placed over the top will overlap the
rounded corners.

TextBoxHeight is the height of the text box, which will be centred vertically. The last three colour parameters are the three
ColorSpec values we created earlier.

Finally, there is the Circularity parameter. This is used to change the actual shape of the corners of the box.

As I wanted each corner to be a mirror image of the others, I decided to calculate the (x2,y2) and (x3,y3) values shown in
Figure 1 (the control points which describe the curve in PDF markup) based on the radius of the corner and a constant which I called
Circularity. The value for (x1,y1) is the current graphics cursor position in the PDF and the (x4,y4) value is the end point of the
curve and also the new graphics cursor position.

At a value of 0, you get a straight line (in effect, an octagonal shape). If you increase the value to 0.55, you get a very close
approximation to a circular arc (the usual constant is about 0.5523). As the value increases towards 1, the corner gets tighter /
smaller. Once the value starts to go above 1, some other interesting corner shapes start to form.
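To make the role of Circularity more concrete, here is a minimal sketch of how the control points for one corner might be derived. It assumes the top-right corner of the box, with the curve starting at (right - radius, top) and ending at (right, top - radius), and it emits the PDF 'c' (curve to) operator. CornerSketch and TopRightCorner are hypothetical names, and the actual DrawRoundRectangle code may calculate things differently.

```csharp
using System.Globalization;

// Hypothetical helper for illustration; not the library's DrawRoundRectangle.
public static class CornerSketch
{
    // Returns the PDF 'c' operator for the top-right corner of a box.
    // The current point (x1,y1) is assumed to be (right - radius, top).
    public static string TopRightCorner(double right, double top,
                                        double radius, double circularity)
    {
        // Distance each control point is pushed from its end point
        // towards the sharp corner of the box.
        double k = circularity * radius;
        double x2 = right - radius + k, y2 = top;   // first control point
        double x3 = right, y3 = top - radius + k;   // second control point
        double x4 = right, y4 = top - radius;       // end point of the curve
        return string.Format(CultureInfo.InvariantCulture,
            "{0:0.##} {1:0.##} {2:0.##} {3:0.##} {4:0.##} {5:0.##} c\r\n",
            x2, y2, x3, y3, x4, y4);
    }
}
```

For the box in the example (right edge 45 + 240 = 285, top edge 582 + 130 = 712, radius 20), a circularity of 0.55 gives the operator "276 712 285 703 285 692 c". With a circularity of 0 both control points sit on the end points, so the curve collapses to a straight line, and with 1 they both sit on the sharp corner itself.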

So that's all there is to it. This code assumes that the final document is one page and has a fixed number of lines of text in a text
box, however, it would not be too hard to combine the textAndtable.AddRow method with the DrawRoundRectangle
method to dynamically create the vertical dimensions of the textbox and wrap it across multiple pages if you needed to.

Drawing the straight lines


Drawing the lines inside a box (perhaps to designate columns) can be done using the textAndtable class if the text is tabular,
or you can use the line.DrawLine method. This simply accepts the start of line (xs,ys), end of line (xe,ye) coordinates plus the
line width and colour and adds the markup to draw a line to the PDF content stream. This is useful for separating individual text
elements.
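For reference, the markup needed for a straight line is short. The following is only a sketch of what a line method could emit, not the library's line.DrawLine itself. LineSketch and LineMarkup are hypothetical names, and the colour is assumed to arrive as 0-255 values which are scaled to the 0-1 range that PDF expects.

```csharp
using System.Globalization;

// Hypothetical helper for illustration; not the library's line.DrawLine.
public static class LineSketch
{
    // Builds PDF markup for a stroked line from (xs,ys) to (xe,ye).
    public static string LineMarkup(double xs, double ys, double xe, double ye,
                                    double width, int r, int g, int b)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "q\r\n" +                                   // save graphics state
            "{0:0.###} {1:0.###} {2:0.###} RG\r\n" +    // stroke colour, 0-1 per channel
            "{3:0.##} w\r\n" +                          // line width
            "{4:0.##} {5:0.##} m\r\n" +                 // move to start of line
            "{6:0.##} {7:0.##} l\r\n" +                 // line to end of line
            "S\r\n" +                                   // stroke the path
            "Q\r\n",                                    // restore graphics state
            r / 255.0, g / 255.0, b / 255.0, width, xs, ys, xe, ye);
    }
}
```

The result would be passed to content.SetStream in the same way as the round rectangle markup.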

Adding an image to the PDF document


Displaying an image on a PDF page is a much more involved process than creating a 'round rectangle'. As an image cannot be
mathematically specified, we need to provide the PDF with more information before it can be rendered.

There are three parts to adding an image to a PDF. These are shown in the table describing a simple PDF markup at the start of this
article.

Create the index
Create the parameters and byte data that describes the image
Draw the image to the document

Let's look in more detail at what this involves.

PDF Markup:
8 0 obj
<<
/Type /Page/Parent 2 0 R
/Rotate 0
/MediaBox [0 0 595 842]/CropBox [0 0 595 842]
/Resources<</ProcSet[/PDF/Text]
/Font<</T1 3 0 R/T2 4 0 R/T3 5 0 R/T4 6 0 R>>
/XObject <</I1 10 0 R >>>>
/Contents 9 0 R
>>
endobj

What it means as far as adding an image is concerned:
Firstly we need to tell the PDF where to find the data that describes the image. This is the CreateImageDict method. In this case
we are telling the PDF that the data that describes the image called 'I1' can be found in the object numbered 10 0.

PDF Markup:
9 0 obj<</Length 989
>>stream
q
144 0 0 100 300 700 cm
1 0 0 1 0 0 cm
/I1 Do
Q
BT/T3 12 Tf
105 699 Td
(Round Rectangle Header) Tj
ET
endstream
endobj

What it means as far as adding an image is concerned:
Now we need to give the PDF some information as to where on the page to place the image. The q & Q mean we are working with a
graphics cursor (see PDF manual for full details of syntax).
The next three lines describe where to place the image relative to the page, its page width and height.
If you look in the PDF manual, there is also a whole host of transformations that you can apply to an image, for example rotation,
scaling, skewing and many other more advanced features. These would need to be added to the code if you wished to use them.
This markup is added to the content stream using the AddImageResource method as a part of the GetPageDict method.

PDF Markup:
10 0 obj
<</Name /I1
/Type /XObject
/Subtype /Image
/Width 144
/Height 100
/Length 29779
/Filter /DCTDecode
/ColorSpace /DeviceRGB
/BitsPerComponent 8
>> stream
[ byte data to represent the jpg image ]
endstream
endobj

What it means as far as adding an image is concerned:
Finally we need to describe the actual image.
The PDF needs details such as the name, pixel dimensions, data compression type (jpg, gif, png, tif etc all have different
compression methods), colour space (e.g. RGB or CMYK), the number of bits required to describe each pixel colour component
and finally the byte data that makes up the image.

There are only a few lines of code to generate the PDF.

String ImagePath = Server.MapPath("myimage.jpg"); //file path to image source
ImageDict I1 = new ImageDict(); //new image dictionary object
I1.CreateImageDict("I1", ImagePath); //create the object which describes the image
page.AddImageResource(I1.PDFImageName, I1, content.objectNum); //which object within the PDF contains the image data
PageImages pi = new PageImages();
content.SetStream(pi.ShowImage("I1", 300, 700, 144, 100)); //draw an image called 'I1', where and what size

Once we have created the data, we need to write it to the physical PDF file:

file.Write(I1.GetImageDict(file.Length, out size), 0, size);


The code behind adding an image


First we need to define where the image is on the file system.

Secondly we need to add this data to the PDF in the form of an object (10 0 in the table above). This is perhaps the most complex
part of the process. Luckily for me, Zainu had already done most of the hard work as far as creating a framework which keeps track
of object numbers and the other main parts of a PDF. I have simply added in some more methods specifically to handle images.

In order to create the object which contains the data for the image, we must remember that a PDF file is in fact binary by nature.
The markup is created as unicode (16 bit), whereas the actual data representing the image is only 8 bit in the case of my example.
This means that we have to handle the byte output slightly differently to create the object bytes.

In essence we do this in three parts. We send the first part of the object (X 0 obj .... stream), converted to byte data
(imageDictStart), followed by the actual byte data of the image (imagebytes), followed by the last part of the object
(endstream endobj) (imageDictEnd), to the PDF stream.

This is coded in the methods GetImageDict and GetImageBytes.

CreateImageDict opens the jpg as a bitmap to get the pixel dimensions and then puts the byte data into an array. Finally it
adds the parameters such as the image name, pixel dimensions etc into the string imageDictStart ready for writing to the
page later.
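The three-part assembly described above can be sketched roughly as follows. This is an illustration under assumptions rather than the code in PDFLibrary.cs: ImageObjectSketch and BuildImageObject are hypothetical names, the pixel dimensions are passed in rather than read from a bitmap, and the markup is encoded as ASCII so that each character becomes a single byte.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper for illustration; the library uses
// CreateImageDict / GetImageDict / GetImageBytes instead.
public static class ImageObjectSketch
{
    public static byte[] BuildImageObject(int objectNum, string name,
                                          int width, int height, byte[] jpegBytes)
    {
        // Part one: the object header and image dictionary, as text.
        string start =
            objectNum + " 0 obj\r\n" +
            "<</Name /" + name + "\r\n" +
            "/Type /XObject\r\n/Subtype /Image\r\n" +
            "/Width " + width + "\r\n/Height " + height + "\r\n" +
            "/Length " + jpegBytes.Length + "\r\n" +
            "/Filter /DCTDecode\r\n/ColorSpace /DeviceRGB\r\n" +
            "/BitsPerComponent 8\r\n>>\r\nstream\r\n";
        // Part three: the end of the stream and of the object.
        string end = "\r\nendstream\r\nendobj\r\n";

        using (var ms = new MemoryStream())
        {
            byte[] startBytes = Encoding.ASCII.GetBytes(start);
            byte[] endBytes = Encoding.ASCII.GetBytes(end);
            ms.Write(startBytes, 0, startBytes.Length);
            // Part two: the JPEG file bytes, copied through untouched.
            ms.Write(jpegBytes, 0, jpegBytes.Length);
            ms.Write(endBytes, 0, endBytes.Length);
            return ms.ToArray();
        }
    }
}
```

The important point is that the image bytes never pass through a string, so no character encoding can alter them on the way into the file.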

The next thing we need to do is to add the reference to this object into the page index. This is the markup /XObject <</I1
10 0 R >> in the example above.
This is written in AddImageResource to the string imageRef which is later used by GetPageDict to create the PDF page
index.

Now all the hard work is done, we just need to let the PDF know that we would like to display the image on the page. This is done
using markup such as

q
144 0 0 100 300 700 cm
1 0 0 1 0 0 cm
/I1 Do
Q

and is added to the PDF using PageImages.ShowImage

content.SetStream(pi.ShowImage("I1", 300, 700, 144, 100));

This could be written direct to the content stream, however, I implemented it as a separate class in case I wanted to add in some
transformations to the image later.

The first parameter is the image name, the second pair are the (x,y) coordinates where the lower left of the image should be placed
and the last pair are the width and height of the image on the page.
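As a sketch of what such a method might produce: an image in PDF is drawn into a unit square, so the 'cm' matrix both scales it to the required width and height and moves it to (x,y). ShowImageSketch and ImageMarkup are hypothetical names, and this version leaves out the extra identity matrix (1 0 0 1 0 0 cm) that appears in the example markup.

```csharp
using System.Globalization;

// Hypothetical helper for illustration; not the library's PageImages.ShowImage.
public static class ShowImageSketch
{
    // Builds the markup that draws image 'name' with its lower left
    // corner at (x,y), scaled to width x height (all in 1/72").
    public static string ImageMarkup(string name, double x, double y,
                                     double width, double height)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "q\r\n" +                                            // save graphics state
            "{0:0.##} 0 0 {1:0.##} {2:0.##} {3:0.##} cm\r\n" +   // scale and position
            "/{4} Do\r\n" +                                      // draw the named image
            "Q\r\n",                                             // restore graphics state
            width, height, x, y, name);
    }
}
```

Calling ShowImageSketch.ImageMarkup("I1", 300, 700, 144, 100) gives the same 'cm' line as the example: 144 0 0 100 300 700 cm.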

You should now have an image added to your PDF.

Other image types


I have only tested this on RGB JPGs. Other image types have different compression methods and hence different decompression
methods, different colour spaces and different bit component levels. If you want to try the method with different image types, you
could start by changing the decode method in CreateImageDict. I think you can find the required information in the PDF
manual. Feel free to post your findings in the discussion below.

Displaying the image as part of a Web Application


Now that we have a PDF, we probably want to show it somewhere.


In many cases, we can simply do this by using a hyperlink to the file itself. This will be fine for many applications, but what if we
want to restrict the file to certain users of a system?

You could set up file permissions on a network, but this is not an option on a public website (for instance).

I decided to use a very simple technique using an <iframe>. This can easily be used to display an object inside a web page simply
by providing the file name and the details of the application that will open the file (in this case application/PDF).

However, there is another way to use this. Instead of specifying the actual pdf file, we can specify an .aspx file which will serve the
byte data of the file. In this case, we can now determine who the requestor is and determine if they have permission to view the file
before serving it.

If you look towards the bottom of default.aspx, you will see that the <iframe> calls streampdf.aspx for the file data
source.

Have a look at the structure of the output that streampdf.aspx generates. It does not send any HTML headers etc, only the
application type followed by the bytes of the file specified.

This version simply sends the bytes from the hardcoded file name, however, you could specify a reference to perhaps a primary key
in a database which contains the actual file name/path to serve. You could even store the binary data of the PDF in the database
itself if you wished. In this way you can control who sees the file.
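The permission check itself can be kept separate from ASP.NET. The following is a minimal sketch under assumptions, not the code in streampdf.aspx.cs: PdfStreamSketch and TryServePdf are hypothetical names, and how the user is identified and authorised is left to a delegate supplied by the caller.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not the code in streampdf.aspx.cs.
public static class PdfStreamSketch
{
    // Writes the PDF to 'output' only if the user passes the check.
    // Returns false (and writes nothing) otherwise.
    public static bool TryServePdf(string user, string pdfPath,
                                   Func<string, bool> isAuthorised, Stream output)
    {
        if (!isAuthorised(user) || !File.Exists(pdfPath))
        {
            return false;
        }
        byte[] bytes = File.ReadAllBytes(pdfPath);
        output.Write(bytes, 0, bytes.Length);
        return true;
    }
}
```

In an .aspx code-behind, the output stream would typically be Response.OutputStream, with Response.ContentType set to "application/pdf" before writing. If the method returns false, the page can return an error message or redirect instead of the file.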

Note that .NET 2.0 web apps have a special folder called App_Data. This is specifically designed for storage of such files as anyone
browsing to a file in this folder will be returned a message that The system cannot find the file specified.

Further development?
I have tried to use flate compression to reduce the size of the page dictionary, so far unsuccessfully. I gather that the MS
implementation of the flate compression algorithm is not the same as the Adobe version. If anyone manages to work out how to
use it, please post it!
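One possible explanation, which has not been verified against this library: the /FlateDecode filter in a PDF expects data in the zlib format (a 2-byte header, the deflate data, then an Adler-32 checksum of the uncompressed data), whereas .NET's DeflateStream writes only the raw deflate data. A minimal sketch of wrapping the DeflateStream output by hand is shown below. FlateSketch, Adler32 and ZlibCompress are hypothetical names, and the stream dictionary would also need /Filter /FlateDecode with /Length set to the compressed size.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration; not part of PDFLibrary.cs.
public static class FlateSketch
{
    // Adler-32 checksum of the uncompressed data, as required by zlib.
    public static uint Adler32(byte[] data)
    {
        const uint Mod = 65521;
        uint a = 1, b = 0;
        foreach (byte value in data)
        {
            a = (a + value) % Mod;
            b = (b + a) % Mod;
        }
        return (b << 16) | a;
    }

    // Wraps raw deflate output in a zlib header and checksum.
    public static byte[] ZlibCompress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // zlib header: deflate method, 32K window, default compression.
            output.WriteByte(0x78);
            output.WriteByte(0x9C);
            // leaveOpen = true so the checksum can be appended afterwards.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            }
            // The checksum is stored most significant byte first.
            uint checksum = Adler32(data);
            output.WriteByte((byte)((checksum >> 24) & 0xFF));
            output.WriteByte((byte)((checksum >> 16) & 0xFF));
            output.WriteByte((byte)((checksum >> 8) & 0xFF));
            output.WriteByte((byte)(checksum & 0xFF));
            return output.ToArray();
        }
    }
}
```

On newer versions of .NET (6 and later), System.IO.Compression.ZLibStream produces the zlib format directly and would make the manual header and checksum unnecessary.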

The ability to use other image formats (gif, png, tif etc) would also be a bonus. If anyone manages to add that in successfully, please
post it here.

If you would like to use this code...


In the spirit of CodeProject - go ahead and use the code, there are no licensing conditions. Just remember the code is provided 'as
is' and if it breaks your application I take no responsibility. If you successfully use the code in an application, I would appreciate a
mention or just send me an email and let me know it works!

History
May 2007 - Article first published.

License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author


pmpdesign
Web Developer
Australia

PMP Design is based in Newcastle, Australia and specialises in designing and implementing custom business management
systems and websites.

Owner Geoff is currently working on TimeSuite, a business management system for project based organisations.


Article Copyright 2007 by pmpdesign. Last Updated 2 May 2007. Everything else Copyright © CodeProject, 1999-2019.
