
[VB.NET] Extract Pages and Split Pdf Files Using iTextSha...

http://www.vbforums.com/showthread.php?490456-VB-...


2/19/2014 10:34 AM

Thread: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp

Sep 28th, 2007, 11:11 AM #1

stanav
Thread Starter, PowerPoster
Join Date: Jul 2006, Location: Providence, RI, USA, Posts: 9,198

[VB.NET] Pdf Manipulation Class Using iTextSharp

This thread was originally about extracting and merging pdf files using iTextSharp. However, as time goes by, I have added a lot more code to do other stuff and put them all together into a handy class called PdfManipulation. There are 2 classes as below (choose the one that matches the iTextSharp version you're using):

1. The original PdfManipulation.vb class is coded based on itextsharp version 4. This class is obsolete and no longer maintained.
2. The updated PdfManipulation2.vb class is for the newer itextsharp version 5. This class also contains a lot more methods than the original one and I highly recommend it over the old one. I will update this class from time to time to fix bugs and/or add more functionality. Consider it a work in progress.

Last updated on 4/9/2012

Please verify the version of iTextSharp you're using and download the correct class. The current version of the PdfManipulation2 class supports AES-256 encryption provided that your itextsharp.dll version is 5.1.x or higher. Below is the list of public methods in the new PdfManipulation2 class:

vb.net Code:

    'Remove all restrictions from a pdf file
    Public Shared Function RemoveRestrictions(ByVal restrictedPdf

    'Parse text from a specified range of pdf pages
    Public Shared Function ParsePdfText(ByVal sourcePDF
                                        Optional ByVal fromPageNum
                                        Optional ByVal toPageNum

    'Parse all text from a pdf
    Public Shared Function ParseAllPdfText(ByVal sourcePDF

    'Page to page comparison of 2 pdf files and write the differe
    Public Shared Sub ComparePdfs(ByVal pdf1 As String
                                  ByVal resultFile As
                                  Optional ByVal fromPageNum
                                  Optional ByVal toPageNum

    'Extract specified pages from a pdf to create a new pdf
    Public Shared Sub ExtractPdfPages(ByVal sourcePdf

    'Split a pdf into specified number of pdfs
    Public Shared Sub SplitPdfByParts(ByVal sourcePdf

    'Split a pdf into multiple pdfs each containing a specified nu
    Public Shared Sub SplitPdfByPages(ByVal sourcePdf

    'Extract pages from multiple source pdfs and merge into a fina
    Public Shared Sub ExtractAndMergePdfPages(ByVal sourceTable

    'Set security password on an existing pdf file
    Public Shared Sub SetSecurityPasswords(ByVal sourcePdf

    'Add watermark to pdf pages using an image
    Public Shared Sub AddWatermarkImage(ByVal sourceFile

    'Add water mark to all pdf pages using text
    Public Shared Sub AddWatermarkText(ByVal sourceFile
                                       Optional ByVal
                                       Optional ByVal
                                       Optional ByVal
                                       Optional ByVal
                                       Optional ByVal

    'Merge multiple pdfs into a single one.
    Public Shared Function MergePdfFiles(ByVal pdfFiles
                                         Optional ByVal
                                         Optional ByVal
                                         Optional ByVal
                                         Optional ByVal
                                         Optional ByVal

    'Merge multiple pdf's into one with all bookmarks preserved
    Public Shared Function MergePdfFilesWithBookmarks

    'Add document outline (bookmarks) to a pdf
    Public Shared Sub AddDocumentOutline(ByVal sourcePdf

    'Extract urls from a pdf
    Public Shared Function ExtractURLs(ByVal sourcePdf

    'Extract images from a pdf
    Public Shared Function ExtractImages(ByVal sourcePdf

    'Fill a form
    Public Shared Sub FillAcroForm(ByVal sourcePdf As
    Public Shared Sub FillMyForm(ByVal sourcePdf As String

    'Add annotation
    Public Shared Sub AddTextAnnotation(ByVal sourcePdf

    Public Shared Function GetAcroFieldData(ByVal sourcePdf

    Public Shared Function GetPdfSummary(ByVal sourcePdf

    Public Shared Function ReplacePagesWithBlank(ByVal
                                                 ByVal
                                                 ByVal
                                                 Optional

    Public Shared Function InsertPages(ByVal sourcePdf
                                       ByVal pagesToInsert
                                       ByVal outPdf As

    Public Shared Function RemovePages(ByVal sourcePdf

    'A demo on how to draw various shapes in itextsharp
    Public Shared Sub DrawShapesDemo(ByVal sourcePdf

    Public Shared Sub AddImageToPage(ByVal sourcePdf

Any comments are welcomed.
Happy coding,
Stanav.

Attached Files:
PdfManipulation.vb (19.4 KB, 5531 views)
PdfManipulation2.vb (91.7 KB, 1677 views)

Last edited by stanav; Apr 9th, 2012 at 03:36 PM. Reason: New version of PdfManipulation2 class now supports AES-256 encryption
Dec 14th, 2007, 11:32 AM #2

nbrege
Frenzied Member
Join Date: Jul 2006, Location: MI, Posts: 1,570

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Stanav ... thanks for posting these code samples. They helped me on a project that I am currently working on. I would like to request that you post another sample: I need to be able to extract specified pages from multiple documents & save them to one combined PDF. ie. take pages 3 & 7 from Doc1.pdf, 4-6 from Doc2.pdf & 1, 5 & 12 from Doc3.pdf and save them in Doc4.pdf. Is this "do-able"?

Last edited by nbrege; Dec 14th, 2007 at 11:36 AM.


Dec 17th, 2007, 09:09 AM #3

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Yes, it's doable. However, I'm on vacation right now and I do not have access to my work computer which has all the needed tools to write code. What you can do right now is to create a function that returns a hashtable or a dictionary with the file names (string) being the keys and the pages to extract (integer array) being the values. Once you have this hashtable/dictionary, you can modify the ExtractPdfPage sub such that it will create a single new pdf file and then loop through the hashtable/dictionary to extract the pages and add them to the output pdf. It's just a matter of setting up the loop right such that in each loop, you read an entry and extract pages from that file. If you can wait until later this week when I return to work, I can try to come up with something for you in code.
Best regards,
Stanav.
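The loop described above could be sketched roughly as follows, assuming iTextSharp 5.x. The module name `ExtractAndMergeSketch` and the sub `ExtractAndMerge` are hypothetical names chosen for illustration; this is not the method that was later attached to the thread. Only `PdfReader`, `Document`, `PdfCopy`, `GetImportedPage` and `AddPage` are taken from code that appears in this thread.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf

Public Module ExtractAndMergeSketch

    ' Hypothetical helper: copies the requested pages of every source pdf
    ' into one output pdf. Keys are source file paths, values are the
    ' 1-based page numbers to take from that file.
    Public Sub ExtractAndMerge(ByVal pagesByFile As Dictionary(Of String, Integer()), _
                               ByVal outPdf As String)
        Dim readers As New List(Of PdfReader)()
        Dim doc As New Document()
        Dim fs As New FileStream(outPdf, FileMode.Create)
        Try
            Dim pdfCpy As New PdfCopy(doc, fs)
            doc.Open()
            ' One pass per dictionary entry: open the file, copy its pages.
            For Each entry As KeyValuePair(Of String, Integer()) In pagesByFile
                Dim reader As New PdfReader(entry.Key)
                readers.Add(reader)
                For Each pageNum As Integer In entry.Value
                    pdfCpy.AddPage(pdfCpy.GetImportedPage(reader, pageNum))
                Next
            Next
            doc.Close()
        Finally
            ' Readers are kept open until the output has been written.
            For Each reader As PdfReader In readers
                reader.Close()
            Next
            fs.Dispose()
        End Try
    End Sub

End Module
```

For the request in post #2, the dictionary would hold `{3, 7}` for Doc1.pdf, `{4, 5, 6}` for Doc2.pdf and `{1, 5, 12}` for Doc3.pdf, with Doc4.pdf as `outPdf`. A `Dictionary` does not formally guarantee enumeration order, so a list of pairs may be a safer choice when the order of the source files matters.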


Dec 17th, 2007, 09:19 AM #4

nbrege
Frenzied Member

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
If you could post a quick code example when you get back that would help me immensely and may be of help to others trying to do the same thing. Enjoy the rest of your vacation...

Dec 20th, 2007, 09:26 AM #5

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp

Originally Posted by nbrege
    If you could post a quick code example when you get back that would help me immensely and may be of help to others trying to do the same thing. Enjoy the rest of your vacation...

I've added a method to do what you need. Since the total text is more than 1000 characters, I had to put all the code into a class (PdfManipulation.vb) and post it as an attachment. Hope it helps.

Jul 31st, 2008, 11:26 AM #6

gaig!i11)
New Member
Join Date: Jul 2008, Posts: 2

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi Stanav,
Do you have any code sample that will convert pdf to multipage tiff? thanks

Jul 31st, 2008, 12:36 PM #7

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp

Originally Posted by gaig!i11)
    Hi Stanav, Do you have any code sample that will convert pdf to multipage tiff? thanks

It's impossible to use iTextSharp to convert pdf to multipage tiff. However, you can use PDFBox to convert each pdf page to an image file (it only outputs to jpg's or png's), then merge these images into a multipage tiff.
To download PDFBox, go here: http://www.pdfbox.org/index.html
To merge multiple images into 1 multipage tiff, check out this codeproject article: http://www.codeproject.com/KB/GDI-pl...ipageTiff.aspx
And good luck
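The second step (merging page images into one multipage tiff) could look roughly like the sketch below. It uses the GDI+ encoder in `System.Drawing`, which is an assumption on my part and not necessarily what the linked article does; `MultiPageTiffSketch` and `MergeImagesToTiff` are hypothetical names. It assumes the per-page jpg/png files have already been produced by some other tool.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Drawing
Imports System.Drawing.Imaging

Public Module MultiPageTiffSketch

    ' Hypothetical helper: writes the given image files, in order, as the
    ' pages of a single multipage tiff.
    Public Sub MergeImagesToTiff(ByVal imageFiles As IList(Of String), ByVal outTiff As String)
        If imageFiles Is Nothing OrElse imageFiles.Count = 0 Then
            Throw New ArgumentException("At least one image file is required.")
        End If

        ' Locate the built-in TIFF encoder.
        Dim tiffCodec As ImageCodecInfo = Nothing
        For Each codec As ImageCodecInfo In ImageCodecInfo.GetImageEncoders()
            If codec.MimeType = "image/tiff" Then tiffCodec = codec
        Next
        If tiffCodec Is Nothing Then Throw New NotSupportedException("No TIFF encoder found.")

        Dim saveFlag As System.Drawing.Imaging.Encoder = System.Drawing.Imaging.Encoder.SaveFlag
        Using parameters As New EncoderParameters(1)
            Using firstPage As Image = Image.FromFile(imageFiles(0))
                ' The first Save call creates the file in multi-frame mode.
                parameters.Param(0) = New EncoderParameter(saveFlag, CLng(EncoderValue.MultiFrame))
                firstPage.Save(outTiff, tiffCodec, parameters)

                ' Every further image is appended as a new page.
                parameters.Param(0) = New EncoderParameter(saveFlag, CLng(EncoderValue.FrameDimensionPage))
                For i As Integer = 1 To imageFiles.Count - 1
                    Using nextPage As Image = Image.FromFile(imageFiles(i))
                        firstPage.SaveAdd(nextPage, parameters)
                    End Using
                Next

                ' Flush closes the multi-frame file.
                parameters.Param(0) = New EncoderParameter(saveFlag, CLng(EncoderValue.Flush))
                firstPage.SaveAdd(parameters)
            End Using
        End Using
    End Sub

End Module
```

The output path must differ from every input path, because each input file stays locked while it is open.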

Let us have faith that right makes might, and in that faith, let us, to the end, dare to do our duty as we understand it. - Abraham Lincoln

Jul 8th, 2009, 06:13 AM #8

MasterRipper
New Member
Join Date: Jul 2009, Posts: 2

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi all.

I know this thread is old, but I am using the iTextSharp library in this exact way. I have a PDF with 4 pages and use this code to extract page 3 in a quick example prog I made. However, the original PDF has text fields I can edit (acrofields) and after extraction the 3rd page loses these fields. Any idea(s) what I can change / do to keep these editable fields in the resulting page 3?

Thanks.

Mar 5th, 2010, 01:59 PM #9

cthai
Member
Join Date: Mar 2007, Posts: 34

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi,

I'm trying to extract a single page from a multi page pdf and I'm using the code below; however, I'm getting an error that it's not recognizing <param name>. Any help would be great. Thanks.

Code:

    ''' <summary>
    ''' Extract a single page from source pdf to a new pdf
    ''' </summary>
    <param name="sourcePdf">"C:\Documents and Settings\rch\Desktop\psm2010\v
    <param name="pageNumberToExtract">"P1T1"</param>
    <param name="outPdf">"C:\Documents and Settings\rch\Desktop\psm2010\vent
    ''' <remarks></remarks>
    Public Shared Sub ExtractPdfPage(ByVal sourcePdf As String, ByVal pageNu
        Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
        Dim doc As iTextSharp.text.Document = Nothing
        Dim pdfCpy As iTextSharp.text.pdf.PdfCopy = Nothing
        Dim page As iTextSharp.text.pdf.PdfImportedPage = Nothing
        Try
            reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
            doc = New iTextSharp.text.Document(reader.GetPageSizeWithRotatio
            pdfCpy = New iTextSharp.text.pdf.PdfCopy(doc, New IO.FileStream(
            doc.Open()
            page = pdfCpy.GetImportedPage(reader, pageNumberToExtract)
            pdfCpy.AddPage(page)
            doc.Close()
            reader.Close()
        Catch ex As Exception
            Throw ex
        End Try
    End Sub

Mar 5th, 2010, 05:19 PM #10

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Why are you putting your arguments in the code comments? That's not how you do it. You need to call the sub and pass in your arguments, something like this:

vb.net Code:

    'Specify the path to the source pdf file
    Dim sourcePdf As String = "C:\Documents and Settings\rch\Desktop\

    'Extract page # 2 off this above pdf file
    Dim pageNumberToExtract As Integer = 2

    'And then save it to a new pdf named 'table40_page2.pdf'
    Dim outputPdf As String = "C:\Documents and Settings\rch\Desktop\p

    'Call the sub somewhere in your program passing in the above argum
    PdfManipulation.ExtractPdfPage("C:\Documents and Settings\rch\Desk

Mar 24th, 2010, 05:09 PM #11

sl!*+stead%
New Member
Join Date: Oct 2009, Posts: 5

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Stanav :

I have tried itextsharp for putting a watermark on pdfs. It worked fine. Now I am trying to edit the header on existing pdf files to a desired header. Is it possible? If it's possible, then I have to try to use it on a bunch of pdf files in one single folder.
Thanks for the help
Sri

Mar 24th, 2010, 08:44 PM #12

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp

Originally Posted by sl!*+stead%
    Stanav : i have tried itextsharp for putting watermark on pdfs. It worked fine. Now i am trying to edit Header on existing pdf files to desired header. Is it possible. if its possible then i have to try to use it on the bunch of pdf files in one single folder Thanks for the help Sri

Yes, it's possible to add/change the header/footer of an existing pdf file and save the result to a new file. Please post your question in the VB.Net forum because it's a different subject and doesn't belong to this code bank thread.

Apr 6th, 2010, 03:13 AM #13

#i,%
Fanatic Member
Join Date: May 2007, Location: India, Posts: 539

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi Stanav, its possible to extract the PDF pages with bookmarks?

Visual Studio.net 2010
If this post is useful, rate it

Apr 16th, 2010, 11:11 AM #14

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp

Originally Posted by #i,%
    Hi Stanav, its possible to extract the PDF pages with bookmarks?

Yes, I THINK it is quite possible, but it would involve much more work (obviously). I gave it a shot as seen in the code below but frankly, the method I was using only works to some extent. It only preserves the 1st level bookmarks. My approach was to export the bookmarks in the original pdf to a collection, select the pages to be extracted from the reader, and use pdfstamper to copy the original pdf (with now only the selected pages) to a new pdf. Since pdfstamper automatically preserves ALL the bookmarks from the original, I had to edit the bookmark collection to remove the unused ones. This approach should work but I don't know why it only preserves 1st level bookmarks. Some more work is needed to work that bug out, but I don't have the time right now. I will post just what I have so far.

vb.net Code:

    ''' <summary>
    ''' Extract pages from an existing pdf file to create a new pd
    ''' </summary>
    ''' <param name="sourcePdf">full path to the source pdf</para
    ''' <param name="pageNumbersToExtract">an integer array contai
    ''' <param name="outPdf">the full path to the output pdf</para
    ''' <remarks></remarks>
    Public Shared Sub ExtractPdfPages(ByVal sourcePdf

        Dim raf As iTextSharp.text.pdf.RandomAccessFileOrArray
        Dim reader As iTextSharp.text.pdf.PdfReader =
        Dim outlines As System.Collections.ArrayList =
        Dim page As iTextSharp.text.pdf.PdfImportedPage
        Dim stamper As iTextSharp.text.pdf.PdfStamper
        Dim hshTable As System.Collections.Hashtable =
        Try
            raf = New iTextSharp.text.pdf.RandomAccessFileOrArray
            reader = New iTextSharp.text.pdf.PdfReader
            outlines = iTextSharp.text.pdf.SimpleBookmark
            reader.SelectPages(New System.Collections
            stamper = New iTextSharp.text.pdf.PdfStamper
            RemoveUnusedBookmarks(outlines, pageNumbersToExtract
            stamper.Outlines = outlines
            stamper.Close()
            reader.Close()
        Catch ex As Exception
            MessageBox.Show(ex.Message)
        End Try
    End Sub

Another approach I thought of was to export the original bookmarks to an XML file and edit that file. Once done, import it back to the new pdf file (which contains only the extracted pages). But like I said, I currently do not have a lot of free time to play with it. So I leave it to you to try. Good luck.
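Since the body of `RemoveUnusedBookmarks` is not shown in the thread, the following is only a guess at how the filtering step might handle nested levels; it is not stanav's code and it does not claim to explain the bug. It assumes the iTextSharp 5.x bookmark layout, where each entry is a `Dictionary(Of String, Object)`, child entries sit in a list under the `"Kids"` key, and the `"Page"` value is a string that starts with the target page number (for example `"3 Fit"`). `BookmarkFilterSketch` and `FilterBookmarks` are hypothetical names, and the code has no dependency on iTextSharp itself.

```vbnet
Imports System
Imports System.Collections.Generic

Public Module BookmarkFilterSketch

    ' Hypothetical helper: returns a new outline tree that keeps only the
    ' entries pointing at kept pages, renumbered to their new position.
    ' keptPages lists the original page numbers in output order.
    Public Function FilterBookmarks(ByVal outlines As IList(Of Dictionary(Of String, Object)), _
                                    ByVal keptPages As IList(Of Integer)) _
                                    As List(Of Dictionary(Of String, Object))
        Dim result As New List(Of Dictionary(Of String, Object))()
        If outlines Is Nothing Then Return result

        For Each entry As Dictionary(Of String, Object) In outlines
            ' Recurse first so that every nested level is filtered too.
            Dim keptKids As New List(Of Dictionary(Of String, Object))()
            If entry.ContainsKey("Kids") Then
                keptKids = FilterBookmarks( _
                    TryCast(entry("Kids"), IList(Of Dictionary(Of String, Object))), keptPages)
            End If

            Dim newIndex As Integer = keptPages.IndexOf(GetTargetPage(entry))
            If newIndex >= 0 Then
                Dim kept As New Dictionary(Of String, Object)(entry)
                kept("Page") = ReplacePageNumber(CStr(entry("Page")), newIndex + 1)
                If keptKids.Count > 0 Then
                    kept("Kids") = keptKids
                Else
                    kept.Remove("Kids")
                End If
                result.Add(kept)
            Else
                ' The parent's page is gone: promote surviving children.
                result.AddRange(keptKids)
            End If
        Next
        Return result
    End Function

    ' Reads the leading page number of the "Page" value, or -1 if absent.
    Private Function GetTargetPage(ByVal entry As Dictionary(Of String, Object)) As Integer
        If Not entry.ContainsKey("Page") Then Return -1
        Dim firstToken As String = CStr(entry("Page")).Trim().Split(" "c)(0)
        Dim pageNum As Integer
        If Integer.TryParse(firstToken, pageNum) Then Return pageNum
        Return -1
    End Function

    ' Swaps the leading page number and keeps the rest (e.g. " Fit").
    Private Function ReplacePageNumber(ByVal pageValue As String, ByVal newPage As Integer) As String
        Dim text As String = pageValue.Trim()
        Dim spaceAt As Integer = text.IndexOf(" "c)
        If spaceAt < 0 Then Return newPage.ToString()
        Return newPage.ToString() & text.Substring(spaceAt)
    End Function

End Module
```

With a top-level bookmark on page 1 whose children point at pages 2 and 3, keeping pages 1 and 3 leaves the parent with a single child that now points at page 2 of the new file.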

Apr 21st, 2010, 05:50 AM #15

#i,%
Fanatic Member

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Thanks stanav... yep I tried and I get...

Splitting Code:

    Public Function SplitPdfFiles(ByVal iStartPage As String, ByVal iE
        Try
            'Variables to hold the split file informations

            Dim reader As PdfReader = New PdfReader(sPDFPath)
            reader.RemoveUnusedObjects()
            reader.ConsolidateNamedDestinations()

            Dim importedPage As PdfImportedPage = Nothing
            Dim currentDocument As New Document
            Dim pdfWriter As PdfSmartCopy = Nothing

            Dim bIsFirst As Boolean = True
            For j As Integer = iStartPage To iEndPage
                If bIsFirst Then
                    bIsFirst = False
                    currentDocument = New Document(reader.GetPageS
                    pdfWriter = New PdfSmartCopy(currentDocument,
                    pdfWriter.SetFullCompression()
                    ' pdfWriter.CompressionLevel = PdfStream.BEST_
                    pdfWriter.PdfVersion = reader.PdfVersion
                    currentDocument.Open()
                End If

                importedPage = pdfWriter.GetImportedPage(reader, j
                pdfWriter.AddPage(importedPage)
            Next

This one is working fine, and the pdf is extracted with the actual bookmarks.

Originally Posted by stanav
    This approach should work but I don't know why it only preserves 1st level bookmarks

The problem is that it preserves only the first level bookmarks. Stanav, is it possible to get at least the child bookmarks collection?

May 19th, 2010, 01:14 PM #16

selnah*%
New Member
Join Date: May 2010, Posts: 1

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Has anyone found a code example on how to convert PDF to image using iTextSharp or PDFBox?

Jul 27th, 2010, 07:18 AM #17

-pires
New Member
Join Date: Feb 2009, Posts: 5

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi Stanav,

First, nice work; your example helped me a lot. But I have a question: I'm using "SplitPdfByPages" and it is working OK, but is there any reason why the extracted pdfs end up larger than the original, which has 5 pages?

Ex.: The original pdf with 5 pages is 72 KB. I extract the 5 pages with your example code, and each page ends up at 85 KB. Is there any way to compress the extracted pages, or is there some reason for this?

Regards,

Aug 5th, 2010, 09:57 AM #18

pra(a.aran.
Junior Member
Join Date: Aug 2010, Posts: 19

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi, I have used the "SplitPdfByPages" method. But I pass a URL (http://localhost:1870/PDFWCFService/1.pdf) for splitting... It returns the following error: "Uri format is not supported". Please give the solution for the above problem. Please do the needful.

Aug 5th, 2010, 12:43 PM #19

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp

Originally Posted by pra(a.aran.
    Hi, I have used "SplitPdfByPages" method. But i pass URL (http://localhost:1870/PDFWCFService/1.pdf) for splitting... It returns following error "Uri format is not supported". Please give the solutions for the above problem. Please do the needful.

You download the file and save it to a temp location 1st. After that, you can split it as usual. If you don't need the original pdf after done splitting, you can delete it. To download a file from an url, you can use a WebClient or simply use My.Computer.Network.DownloadFile(url, saveLocation).
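A minimal sketch of that download-then-split sequence, using `System.Net.WebClient`. The module and sub names are hypothetical, and the split step is passed in as a delegate so that the sketch does not depend on the exact signature of `SplitPdfByPages`, which is truncated in the listing in post #1.

```vbnet
Imports System
Imports System.IO
Imports System.Net

Public Module DownloadThenSplitSketch

    ' Hypothetical helper: downloads the pdf at sourceUrl to a temp file,
    ' hands the local path to splitAction, then deletes the temp copy.
    Public Sub ProcessRemotePdf(ByVal sourceUrl As String, ByVal splitAction As Action(Of String))
        Dim tempPdf As String = Path.Combine(Path.GetTempPath(), _
                                             Guid.NewGuid().ToString("N") & ".pdf")
        Try
            Using client As New WebClient()
                client.DownloadFile(sourceUrl, tempPdf)
            End Using
            ' iTextSharp only ever sees a physical file path here.
            splitAction(tempPdf)
        Finally
            If File.Exists(tempPdf) Then File.Delete(tempPdf)
        End Try
    End Sub

End Module
```

A call might look like `ProcessRemotePdf("http://localhost:1870/PDFWCFService/1.pdf", Sub(localPath) PdfManipulation2.SplitPdfByPages(localPath, 1, "C:\Temp\1.pdf"))`, where the three-argument form of `SplitPdfByPages` and the output path are assumptions. The output files must be written to a local folder, not to a URL.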


Aug 6th, 2010, 03:37 AM #20

pra(a.aran.
Junior Member

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi, I need to pass the parameters like this ("http://localhost:1870/PDFWCFService/1.pdf",1,"http://localhost:1870/PDFWCFService/2.pdf") to the SplitPdfByPages method, with the output file given as a URL. It returns the following error: "Uri format is not supported". Please give the solution for the above problem. Please do the needful.

Aug 6th, 2010, 08:44 AM #21

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
You need to supply the physical file paths... There's no way around it because we rely on iTextSharp to do the work, and if iTextSharp doesn't support it, there's not much we can do. However, that is not a problem. The problem is with your methodology of doing things. While you can access (download) a file from an url, you cannot upload the file using an url. If you are to run the splitting task on any PC, you will need to download the file to the local PC, split it and then upload it back. If you're to run that splitting task on the server that hosts your web site, you have to give it the direct physical paths and not the url's. You cannot treat an url the same as a conventional file path.

Aug 6th, 2010, 11:13 AM #22

pra(a.aran.
Junior Member

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi, I got the error below:
Unable to cast object of type 'iTextSharp.text.pdf.PdfArray' to type 'iTextSharp.text.pdf.PRIndirectReference'.
What is the reason I got that error? How do we avoid this type of error? Is there any solution for this problem?


Aug 6th, 2010, 11:17 AM #23

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Show the code where the error occurred...

Aug 6th, 2010, 11:21 AM #24

pra(a.aran.
Junior Member

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Below is the code. I converted from Vb.net to C#.

    iTextSharp.text.pdf.PdfReader reader = null;
    iTextSharp.text.Document doc = null;
    iTextSharp.text.pdf.PdfCopy pdfCpy = null;
    iTextSharp.text.pdf.PdfImportedPage page = null;
    int pageCount = 0;
    try
    {
        reader = new iTextSharp.text.pdf.PdfReader(sourcePdf);
        pageCount = reader.NumberOfPages;
        if (pageCount < numOfPages)
        {
            return -1;
            throw new ArgumentException("Not enough pages in source pdf to split");
        }
        else
        {
            string ext = System.IO.Path.GetExtension(baseNameOutPdf);
            string outfile = string.Empty;
            int n = Convert.ToInt32(Math.Ceiling(Convert.ToDouble(pageCount) / Convert.ToDouble(numOfPages)));
            int currentPage = 1;
            for (int i = 1; i <= n; i++)
            {
                outfile = baseNameOutPdf.Replace(ext, "-" + i + ext);
                doc = new iTextSharp.text.Document(reader.GetPageSizeWithRotation(currentPage));
                //pdfCpy = new iTextSharp.text.pdf.PdfCopy(doc, new System.IO.FileStream(outfile, System.IO.FileMode.Create));
                pdfCpy = new iTextSharp.text.pdf.PdfCopy(doc, new System.IO.FileStream(outfile, System.IO.FileMode.Create));
                //pdfCpy = new iTextSharp.text.pdf.PdfCopy(doc, System.Net.HttpWebRequest.Create(outfile).GetResponse().GetResponseStream());
                doc.Open();
                if (i < n)
                {
                    for (int j = 1; j <= numOfPages; j++)
                    {
                        page = pdfCpy.GetImportedPage(reader, currentPage);
                        pdfCpy.AddPage(page); // <-------- Here only error is happen.
                        currentPage += 1;
                    }
                }
                else
                {
                    for (int j = currentPage; j <= pageCount; j++)
                    {
                        page = pdfCpy.GetImportedPage(reader, j);
                        pdfCpy.AddPage(page);
                    }
                }
                doc.Close();
            }
        }
        reader.Close();
        return 1;
    }
    catch (Exception ex) // When i see the exception it will that error.
    {
        return -1;
        throw ex;
    }

is this error happen because of particular PDF????

Aug 6th, 2010, 11:54 AM #25

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp

Originally Posted by pra(a.aran.
    is this error happen because of particular PDF????

Probably... Can you upload a copy of that particular pdf file so that I can use it to investigate further?


Aug 9th, 2010, 12:45 AM #26

pra(a.aran.
Junior Member

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi, I uploaded the pdf file. Please check the application with this PDF file. It is a 3 page pdf file. The first page is split successfully. When the second page is split, it gives the following error: "Unable to cast object of type 'iTextSharp.text.pdf.PdfArray' to type 'iTextSharp.text.pdf.PRIndirectReference'." Please let me know how we can solve the issue.

Attached Images:
2.pdf (174.6 KB, 376 views)

Aug 11th, 2010, 09:40 AM #27

#i,%
Fanatic Member

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
I passed your pdf to the method below; it is splitting all pages exactly.

Code:

    SplitPdfByParts("E:\Vijay\E-Pub RandE\ComparedEPubPDF\ComparedEPubPDF\bin\De

vb Code:

    Public Shared Sub SplitPdfByParts(ByVal sourcePdf As
        Dim reader As iTextSharp.text.pdf.PdfReader =
        Dim doc As iTextSharp.text.Document = Nothing
        Dim pdfCpy As iTextSharp.text.pdf.PdfCopy = Nothing
        Dim page As iTextSharp.text.pdf.PdfImportedPage
        Dim pageCount As Integer = 0
        Try
            reader = New iTextSharp.text.pdf.PdfReader
            pageCount = reader.NumberOfPages
            If pageCount < parts Then
                Throw New ArgumentException("Not enough pages in s
            Else
                Dim n As Integer = pageCount \ parts
                Dim currentPage As Integer = 1
                Dim ext As String = IO.Path.GetExtension
                Dim outfile As String = String.Empty
                For i As Integer = 1 To parts
                    outfile = baseNameOutPdf.Replace(
                    doc = New iTextSharp.text.Document
                    pdfCpy = New iTextSharp.text.pdf.
                    doc.Open()

                    If i < parts Then
                        For j As Integer = 1 To n
                            page = pdfCpy.GetImportedPage
                            pdfCpy.AddPage(page)
                            currentPage += 1
                        Next j
                    Else
                        For j As Integer = currentPage
                            page = pdfCpy.GetImportedPage
                            pdfCpy.AddPage(page)
                        Next j
                    End If
                    doc.Close()
                Next
            End If
            reader.Close()
        Catch ex As Exception
            Throw ex
        End Try
    End Sub
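To make the page arithmetic of the listing above easier to follow, here is a small sketch that computes only the page ranges, with no pdf access at all. `SplitRangesSketch` and `GetPartRanges` are hypothetical names; the logic mirrors `n = pageCount \ parts`, with the last part taking whatever pages remain.

```vbnet
Imports System
Imports System.Collections.Generic

Public Module SplitRangesSketch

    ' Hypothetical helper: returns the first and last page (1-based,
    ' inclusive) of each output part as Key/Value pairs.
    Public Function GetPartRanges(ByVal pageCount As Integer, ByVal parts As Integer) _
            As List(Of KeyValuePair(Of Integer, Integer))
        If parts < 1 Then Throw New ArgumentOutOfRangeException("parts")
        If pageCount < parts Then
            Throw New ArgumentException("Not enough pages in source pdf to split")
        End If

        Dim ranges As New List(Of KeyValuePair(Of Integer, Integer))()
        Dim n As Integer = pageCount \ parts   ' integer division, as in the listing
        Dim currentPage As Integer = 1
        For i As Integer = 1 To parts
            ' Every part but the last gets n pages; the last gets the rest.
            Dim lastPage As Integer = If(i < parts, currentPage + n - 1, pageCount)
            ranges.Add(New KeyValuePair(Of Integer, Integer)(currentPage, lastPage))
            currentPage = lastPage + 1
        Next
        Return ranges
    End Function

End Module
```

For a 10 page pdf split into 3 parts this gives pages 1-3, 4-6 and 7-10. Because the remainder always goes to the last part, that part can be noticeably larger than the others (11 pages in 4 parts gives 2, 2, 2 and 5 pages).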


Aug 11th, 2010, 10:31 AM #28

pra(a.aran.
Junior Member

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi, for me it's not working.. Please tell me which version of the iTextSharp dll you have used? I have used "itextsharp-5.0.2-dll". Please check once again whether it's working or not, and please be sure that all the split pdf files are created.

Oct 6th, 2010, 02:56 AM #29

pra(a.aran.
Junior Member

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi.. I have one question. Is it possible to set a password for each split pdf file? Please tell me how we can do this.

Oct 6th, 2010, 09:17 AM #30

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp

Originally Posted by pra(a.aran.
    Hi for me its not working.. Please tell me which version of iTextsharp dll u have used? I have used "itextsharp-5.0.2-dll". Please check with once again whether its working or not.. please be sure that all splitted pdf files are created.

I've uploaded the new PdfManipulation2 class which works with itextsharp 5.0.2.

Oct 6th, 2010, 09:59 AM #31

stanav
Thread Starter, PowerPoster

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp

Originally Posted by pra(a.aran.
    Hi.. I have one question. Is there any possible to set password for the each splitted pdf file. Please tell me how we can do this.

I don't know of any way to set passwords on the split pdfs on the fly. However, you can certainly do it on a 2nd pass. 1st pass: split the pdf as usual. 2nd pass: use the PdfEncryptor.Encrypt method to set the user and/or owner passwords on those newly split pdfs. You can do this in a separate method after done splitting or you can set the password on each split pdf right after it is created. The 2nd approach is preferred. It's just a few extra lines of code. If you have trouble figuring it out, let me know.
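A rough sketch of that 2nd pass for a single file, assuming iTextSharp 5.x (where the permission constant is spelled `PdfWriter.ALLOW_PRINTING`). `EncryptSplitPdfSketch` and `EncryptPdf` are hypothetical names, and the choice to allow printing only is an arbitrary example, not something stated in the thread.

```vbnet
Imports System.IO
Imports iTextSharp.text.pdf

Public Module EncryptSplitPdfSketch

    ' Hypothetical helper: writes a password-protected copy of sourcePdf
    ' to encryptedPdf. The two paths must be different files.
    Public Sub EncryptPdf(ByVal sourcePdf As String, ByVal encryptedPdf As String, _
                          ByVal userPassword As String, ByVal ownerPassword As String)
        Dim reader As New PdfReader(sourcePdf)
        Try
            Using output As New FileStream(encryptedPdf, FileMode.Create)
                ' True asks for 128-bit rather than 40-bit encryption.
                PdfEncryptor.Encrypt(reader, output, True, _
                                     userPassword, ownerPassword, _
                                     PdfWriter.ALLOW_PRINTING)
            End Using
        Finally
            reader.Close()
        End Try
    End Sub

End Module
```

Calling this once per output file, right after each split pdf has been closed, corresponds to the preferred "right after it is created" variant; the unencrypted split file can then be deleted.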

Oct 6th, 2010, 10:10 AM #32

nbrege
Frenzied Member

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
stanav ... what functions are included in your new class?


Oct 6th, 2010, 12:46 PM

stanav
Thread Starter PowerPoster
Join Date: Jul 2006
Location: Providence, RI USA
Posts: 9,198

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Originally Posted by nbrege

stanav ... what functions are included in your new class?

I updated my original post to include a list of public methods in the new class.

Oct 7th, 2010, 02:43 PM #34

blofvendahl
New Member
Join Date: Oct 2010
Posts: 3

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Does the MergePdfFiles routine also merge bookmarks?

Oct 7th, 2010, 05:03 PM #35

stanav
Thread Starter PowerPoster
Join Date: Jul 2006
Location: Providence, RI USA
Posts: 9,198

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Originally Posted by blofvendahl

Does the MergePdfFiles routine also merge bookmarks?

No, it doesn't...

Oct 8th, 2010, 01:19 AM #36

prabakarank
Junior Member
Join Date: Aug 2010
Posts: 19

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi,

I got the below error: "PdfReader not opened with owner password". What do we have to do to resolve the issue? Thanks

Oct 8th, 2010, 01:40 AM #37

prabakarank
Junior Member
Join Date: Aug 2010
Posts: 19

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hi, can you give me the code to set a password for each of the split pdf files? Thanks

Oct 8th, 2010, 08:48 AM #38

stanav
Thread Starter PowerPoster
Join Date: Jul 2006
Location: Providence, RI USA
Posts: 9,198

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Originally Posted by prabakarank

Hi, can you give me the code to set a password for each of the split pdf files? Thanks

It's already in the PdfManipulation2 class. The method is:

Code:

SetSecurityPasswords(ByVal sourcePdf As String, ByVal outputPdf As String, B

Oct 8th, 2010, 09:09 AM #39

stanav
Thread Starter PowerPoster
Join Date: Jul 2006
Location: Providence, RI USA
Posts: 9,198

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Originally Posted by prabakarank

Hi, I got the below error: "PdfReader not opened with owner password". What do we have to do to resolve the issue? Thanks

1. You need to know the owner password of the pdf you're working on.
2. Use the 2nd overload of the PdfReader class constructor, which allows you to supply the owner password as a byte array when you create a PdfReader object. Something like this:

Code:

Dim ownerPwd As String = "put the owner password here"
Dim pwdBytes() As Byte = System.Text.Encoding.Default.GetBytes(o
Dim reader As New iTextSharp.text.pdf.PdfReader(sourcePDF, pwdBy

The rest of the code is the same.
3. If you forget the owner password for some reason, you will have to remove all restrictions on that pdf using the RemoveRestrictions method and save the new unrestricted pdf to a temp location. You can then work on that temporary unrestricted pdf as normal. When done, delete it if you don't want to keep it.
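
Since the posted snippet is cut off at the right margin, here is a self-contained sketch of step 2. It assumes the truncated arguments are the ownerPwd and pwdBytes variables declared on the preceding lines, and it assumes iTextSharp 5.x is referenced. The module and function names are hypothetical, and returning the page count is only a simple way to confirm that the reader opened.

```vbnet
Imports System.Text
Imports iTextSharp.text.pdf

Module OwnerPasswordSketch
    ' Hypothetical helper: opens a protected pdf with its owner password
    ' and returns the page count as a basic check that the reader opened.
    Public Function CountPagesWithOwnerPassword(ByVal sourcePdf As String, ByVal ownerPwd As String) As Integer
        ' The PdfReader(String, Byte()) overload takes the owner password as a byte array.
        Dim pwdBytes() As Byte = Encoding.Default.GetBytes(ownerPwd)
        Dim reader As New PdfReader(sourcePdf, pwdBytes)
        Try
            Return reader.NumberOfPages
        Finally
            reader.Close()
        End Try
    End Function
End Module
```

A reader opened this way can then be handed to the splitting or encrypting code in place of one created from the file name alone.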

Last edited by stanav; Oct 8th, 2010 at 09:13 AM.

Oct 8th, 2010, 01:15 PM #40

blofvendahl
New Member
Join Date: Oct 2010
Posts: 3

Re: [VB.NET] Extract Pages and Split Pdf Files Using iTextSharp
Hey Stanav,

Which method in your class, if any, can be used to extract bookmark info from a pdf? Thanks, Brian


Copyright 2014 QuinStreet Inc. All Rights Reserved. All times are GMT -5.
