I'm not sure whether this is a bug in PDFKit or something wrong with my use of it. I'm adding PDFOutlines (a Table of Contents) programmatically to an existing PDF file using Python.
The code 'appears' to work: the ToC is shown in Preview's sidebar, in Acrobat's Bookmarks pane, and in other PDF readers. However, when I Preflight the PDF in Acrobat, it flags syntax errors, which show that something is badly wrong in the data. For three PDFOutlines, I get the following:
2 0 obj
<< /First 57 0 R /Last 58 0 R >>
endobj
58 0 obj
<< /Prev 59 0 R /Count 0 /Title (Page 3) /Dest [ 31 0 R /XYZ 0 842 null ] /Parent 60 0 R >>
endobj
60 0 obj
<< >>
endobj
59 0 obj
<< /Parent 61 0 R >>
endobj
61 0 obj
<< >>
endobj
57 0 obj
<< /Dest [ 4 0 R /XYZ 0 842 null ] /Count 0 /Title (Page 1) /Next 62 0 R /Parent 63 0 R >>
endobj
63 0 obj
<< >>
endobj
62 0 obj
<< /Prev 64 0 R /Count 0 /Title (Page 2) /Dest [ 25 0 R /XYZ 0 842 null ] /Next 65 0 R /Parent 61 0 R >>
endobj
65 0 obj
<< /Prev 59 0 R /Count 0 /Title (Page 3) /Dest [ 31 0 R /XYZ 0 842 null ] /Parent 60 0 R >>
endobj
64 0 obj
<< /Parent 63 0 R >>
endobj
For anyone not familiar with the insides of PDF: the Outline for Page 3 appears twice, and each Outline points to an empty Parent object instead of object number 2. Acrobat flags objects 60, 61 and 63 as missing Parent and Title fields; 64 and 59 lack Title fields only. In a correct file, the three Outlines would have object 2 as their Parent, and none of the other objects would be there.
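To double-check Acrobat's report, the dump can be scanned with a few lines of plain Python. This is only a rough sketch: `parse_objects` and `check_outline_items` are hypothetical helpers written for this post (they are not part of PDFKit), the regex assumes the simple one-dictionary-per-object layout shown above, and `SAMPLE` is just an excerpt of that dump.

```python
import re

# Excerpt of the dump above (assumption: one flat dictionary per object).
SAMPLE = """
2 0 obj
<< /First 57 0 R /Last 58 0 R >>
endobj
58 0 obj
<< /Prev 59 0 R /Count 0 /Title (Page 3) /Dest [ 31 0 R /XYZ 0 842 null ] /Parent 60 0 R >>
endobj
60 0 obj
<< >>
endobj
59 0 obj
<< /Parent 61 0 R >>
endobj
61 0 obj
<< >>
endobj
"""

OBJ_RE = re.compile(r"(\d+) 0 obj\s*<<(.*?)>>\s*endobj", re.S)
PARENT_RE = re.compile(r"/Parent (\d+) 0 R")


def parse_objects(dump):
    """Map each object number to the text of its dictionary body."""
    return {int(num): body for num, body in OBJ_RE.findall(dump)}


def check_outline_items(objects, root):
    """Report objects (other than the root) lacking /Title or a /Parent equal to root."""
    problems = {}
    for num, body in objects.items():
        if num == root:
            continue
        issues = []
        if "/Title" not in body:
            issues.append("missing Title")
        parent = PARENT_RE.search(body)
        if parent is None:
            issues.append("missing Parent")
        elif int(parent.group(1)) != root:
            issues.append("Parent is %d, expected %d" % (int(parent.group(1)), root))
        if issues:
            problems[num] = issues
    return problems


# Treat object 2 as the outline root, as in the dump above.
for num, issues in sorted(check_outline_items(parse_objects(SAMPLE), root=2).items()):
    print(num, issues)
```

Run against the excerpt, this flags the same kind of problems Acrobat does: empty objects with neither Title nor Parent, and items whose Parent is not object 2.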
Here's my code in Python:
import Quartz
from Foundation import NSURL, NSString

# infile and outfile (paths to the input and output PDFs) are assumed to be defined elsewhere.

def getOutline(page, label):
    # Create Destination at the top-left corner of the page
    myPage = myPDF.pageAtIndex_(page)
    pageSize = myPage.boundsForBox_(Quartz.kCGPDFMediaBox)
    x = 0
    y = Quartz.CGRectGetMaxY(pageSize)
    pagePoint = Quartz.CGPointMake(x, y)
    myDestination = Quartz.PDFDestination.alloc().initWithPage_atPoint_(myPage, pagePoint)
    # Create the Outline and attach its label and destination
    myLabel = NSString.stringWithString_(label)
    myOutline = Quartz.PDFOutline.alloc().init()
    myOutline.setLabel_(myLabel)
    myOutline.setDestination_(myDestination)
    return myOutline

if __name__ == "__main__":
    pdfURL = NSURL.fileURLWithPath_(infile)
    print(pdfURL)
    myPDF = Quartz.PDFDocument.alloc().initWithURL_(pdfURL)
    print(myPDF)
    if myPDF:
        # Create Outlines. Add the Page Index (from 0) and label in pairs here:
        myTableOfContents = [
            (0, 'Page 1'),
            (1, 'Page 2'),
            (2, 'Page 3')
        ]
        allMyOutlines = []
        for index, outline in myTableOfContents:
            allMyOutlines.append(getOutline(index, outline))
        # Create a root Outline and add each outline
        rootOutline = Quartz.PDFOutline.alloc().init()
        for index, value in enumerate(allMyOutlines):
            rootOutline.insertChild_atIndex_(value, index)
        myPDF.setOutlineRoot_(rootOutline)
        myPDF.writeToFile_(outfile)
There's a function at the top which creates an outline from a page index and a label string. Each outline gets put into a list, and then inserted as a child of the root Outline.
Am I doing it wrong, or is it a bug?