Bug or problem with indexing PDFOutline in PDFKit

Question

Created Mar ’19

I'm not sure whether this is a bug in PDFKit or something wrong with my use of it. I'm adding PDFOutlines (a table of contents) programmatically to an existing PDF file using Python.

The code 'appears' to work: the ToC is shown in Preview's sidebar, in Acrobat's Bookmarks pane, and in other PDF readers. However, when I Preflight the PDF in Acrobat, it flags syntax errors, which show that there's something mad going on in the data. For three PDFOutlines, I get the following objects:

2 0 obj
<< /First 57 0 R /Last 58 0 R >>
endobj
58 0 obj
<< /Prev 59 0 R /Count 0 /Title (Page 3) /Dest [ 31 0 R /XYZ 0 842 null ] /Parent 60 0 R >>
endobj
60 0 obj
<< >>
endobj
59 0 obj
<< /Parent 61 0 R >>
endobj
61 0 obj
<< >>
endobj
57 0 obj
<< /Dest [ 4 0 R /XYZ 0 842 null ] /Count 0 /Title (Page 1) /Next 62 0 R /Parent 63 0 R >>
endobj
63 0 obj
<< >>
endobj
62 0 obj
<< /Prev 64 0 R /Count 0 /Title (Page 2) /Dest [ 25 0 R /XYZ 0 842 null ] /Next 65 0 R /Parent 61 0 R >>
endobj
65 0 obj
<< /Prev 59 0 R /Count 0 /Title (Page 3) /Dest [ 31 0 R /XYZ 0 842 null ] /Parent 60 0 R >>
endobj
64 0 obj
<< /Parent 63 0 R >>
endobj

For anyone not familiar with the insides of PDF: the Outline for Page 3 appears twice (objects 58 and 65), and every Outline points to a blank Parent (object 60, 61 or 63) instead of object 2. Acrobat flags objects 60, 61 and 63 as missing Parent and Title fields; 64 and 59 lack Title fields only. Correct syntax would have the 3 Outlines with object 2 as their Parent, and none of the other objects present.
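To make the diagnosis above reproducible without Acrobat, here is a minimal pure-Python sketch that parses the dump and lists each object's problems. The helper names `parse_objects` and `check_outline` are hypothetical, the regular expressions only handle the simple dictionary layout shown in this post, and the rule "every outline item must have a /Title and a /Parent equal to the root" is my reading of the errors Acrobat reported, not a full PDF validator.

```python
import re

# Object dump copied from the Preflight output in this post.
DUMP = """\
2 0 obj
<< /First 57 0 R /Last 58 0 R >>
endobj
58 0 obj
<< /Prev 59 0 R /Count 0 /Title (Page 3) /Dest [ 31 0 R /XYZ 0 842 null ] /Parent 60 0 R >>
endobj
60 0 obj
<< >>
endobj
59 0 obj
<< /Parent 61 0 R >>
endobj
61 0 obj
<< >>
endobj
57 0 obj
<< /Dest [ 4 0 R /XYZ 0 842 null ] /Count 0 /Title (Page 1) /Next 62 0 R /Parent 63 0 R >>
endobj
63 0 obj
<< >>
endobj
62 0 obj
<< /Prev 64 0 R /Count 0 /Title (Page 2) /Dest [ 25 0 R /XYZ 0 842 null ] /Next 65 0 R /Parent 61 0 R >>
endobj
65 0 obj
<< /Prev 59 0 R /Count 0 /Title (Page 3) /Dest [ 31 0 R /XYZ 0 842 null ] /Parent 60 0 R >>
endobj
64 0 obj
<< /Parent 63 0 R >>
endobj
"""

OBJ_RE = re.compile(r"(\d+) 0 obj\s*<<(.*?)>>\s*endobj", re.S)
REF_RE = re.compile(r"/(First|Last|Prev|Next|Parent) (\d+) 0 R")
TITLE_RE = re.compile(r"/Title \((.*?)\)")


def parse_objects(dump):
    """Map object number -> dict of outline keys found in that object."""
    objects = {}
    for num, body in OBJ_RE.findall(dump):
        entry = {key: int(ref) for key, ref in REF_RE.findall(body)}
        title = TITLE_RE.search(body)
        if title:
            entry["Title"] = title.group(1)
        objects[int(num)] = entry
    return objects


def check_outline(objects, root):
    """Return {object number: [problems]} for every non-root object."""
    report = {}
    for num, entry in sorted(objects.items()):
        if num == root:
            continue
        problems = []
        if "Title" not in entry:
            problems.append("missing /Title")
        if "Parent" not in entry:
            problems.append("missing /Parent")
        elif entry["Parent"] != root:
            problems.append("/Parent is %d, expected %d" % (entry["Parent"], root))
        if problems:
            report[num] = problems
    return report


for num, problems in check_outline(parse_objects(DUMP), root=2).items():
    print(num, ", ".join(problems))
```

Run against the dump, this reports the same split Acrobat does: objects 60, 61 and 63 lack both fields, 59 and 64 lack only a Title, and the titled items all have a Parent other than object 2.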

Here's my code in Python:

import sys

import Quartz
from Foundation import NSURL, NSString

def getOutline(page, label):
  # Create a Destination at the top-left corner of the page.
  # Note: relies on the module-level myPDF set in the main block below.
  myPage = myPDF.pageAtIndex_(page)
  pageSize = myPage.boundsForBox_(Quartz.kCGPDFMediaBox)
  x = 0
  y = Quartz.CGRectGetMaxY(pageSize)
  pagePoint = Quartz.CGPointMake(x, y)
  myDestination = Quartz.PDFDestination.alloc().initWithPage_atPoint_(myPage, pagePoint)
  # Create the Outline and attach the label and Destination
  myLabel = NSString.stringWithString_(label)
  myOutline = Quartz.PDFOutline.alloc().init()
  myOutline.setLabel_(myLabel)
  myOutline.setDestination_(myDestination)
  return myOutline

if __name__ == "__main__":
  # Assumption: the excerpt does not show where infile and outfile are
  # defined, so they are taken from the command line here.
  infile, outfile = sys.argv[1], sys.argv[2]

  pdfURL = NSURL.fileURLWithPath_(infile)
  print(pdfURL)
  myPDF = Quartz.PDFDocument.alloc().initWithURL_(pdfURL)
  print(myPDF)
  if myPDF:
    # Create Outlines. Add the Page Index (from 0) and label in pairs here:
    myTableOfContents = [
      (0, 'Page 1'),
      (1, 'Page 2'),
      (2, 'Page 3')
    ]
    allMyOutlines = []
    for index, label in myTableOfContents:
      allMyOutlines.append(getOutline(index, label))

    # Create a root Outline and add each outline as a child, in order
    rootOutline = Quartz.PDFOutline.alloc().init()
    for index, value in enumerate(allMyOutlines):
      rootOutline.insertChild_atIndex_(value, index)
    myPDF.setOutlineRoot_(rootOutline)
    myPDF.writeToFile_(outfile)

There's a function at the top which creates an outline from a page number and a label string. Each outline gets put into a list, and then inserted as a Child of the root Outline.

Am I doing it wrong, or is it a bug?

Answer 1

benwiggy OP

Jan ’21

Just adding to this problem. I'm confused by Apple's documentation for PDFOutline. It says:

"The index of the outline object is relative to its siblings and from the perspective of the parent of the outline object. The root outline object, and any outline object without a parent, has an index value of 0."

So that suggests that Outline objects added to the root Outline should have a non-zero index. But, when I try to add the first child with an index of 1, I get an NSArray Beyond Bounds error.
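One way to read the documentation quote and the error together is sketched below with a toy model in plain Python. `ToyOutline` is a hypothetical stand-in, not the PDFKit class; it assumes the children are kept in an array that only accepts insertion indexes from 0 up to the current child count, which is how NSMutableArray behaves and what the Beyond Bounds error hints at. Under that assumption the first child of the root also has index 0, since the index counts position among siblings.

```python
class ToyOutline:
    """Stand-in for an outline node; NOT the PDFKit class."""

    def __init__(self, label=None):
        self.label = label
        self.parent = None
        self.children = []

    def insert_child_at_index(self, child, index):
        # Assumed NSMutableArray-style bounds check: the index may be at
        # most the current number of children (the append position).
        if index < 0 or index > len(self.children):
            raise IndexError(
                "index %d beyond bounds [0 .. %d]" % (index, len(self.children))
            )
        child.parent = self
        self.children.insert(index, child)

    @property
    def index(self):
        # Position among siblings; 0 for a node without a parent.
        if self.parent is None:
            return 0
        return self.parent.children.index(self)


root = ToyOutline()
first = ToyOutline("Page 1")
second = ToyOutline("Page 2")

try:
    # The root has no children yet, so index 1 is out of range.
    root.insert_child_at_index(first, 1)
except IndexError as error:
    print("rejected:", error)

root.insert_child_at_index(first, 0)
root.insert_child_at_index(second, 1)
print(root.index, first.index, second.index)
```

In this model the root and its first child both report index 0, which is consistent with the quoted sentence: it says parentless outlines have index 0, not that children must have a non-zero one.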

Answer 2

benwiggy OP

Feb ’21

I've sent Feedback, so an Apple technician should be responding and fixing this right away! Ahahaahahahaha.
