How did you make the forms from Word? As far as I know, Word will make a tagged PDF. Of course, it often needs work, but there should be no need to start from scratch.
I did not make the forms. They were done by a Committee and someone on the Committee made the forms in Word. After I converted to PDF, it showed up as a tagged PDF. I then added the fillable fields and 3 buttons. I then tested everything and it tested fine. Then I ran everything thru the Accessibility Full Check. The report produced indicates hundreds of content items are NOT contained in the tree structure. When I look individually at the list in the report of what is not in the tree structure, it is showing blank areas in the forms, line segments from boxes which were around areas of the forms, and points where each of the line segments either meets another line segment or ends. There are hundreds of these in each form.
Am I to understand my only recourse is to find and manually add each of these things to the tree structure? And then I have to manually change each of these to an artifact or put it in the background so that it is not read by a screen reader?
Some additional information about your workflow would be helpful, in order to determine at what point the problem -- i.e., the lines and points around the fields becoming "untagged" -- was introduced.
What version of Word and what version of Acrobat Professional are you using? How did you create the PDF -- save As PDF or using the Acrobat plugin?
After converting the Word file to a tagged Acrobat PDF, did you run the Accessibility Checker at that point -- before adding the form fields to the document? If so, what were the results? It would be helpful to know if the lines/points on the page were properly tagged at that point.
Did you use Adobe LiveCycle to add form fields to the PDF or did you add the fields using the Tools / Forms palette?
Forms were created in Word 2007, I am using Acrobat X Pro. PDF was created using Word's menu item: Acrobat - Create PDF. (Not sure whether this would be considered the plugin or not....)
After converting to PDF, I checked that it was a tagged PDF, and I added the language attribute. I did not run Accessibility Checker at that point.
I went ahead and added the form fields to the documents using Acrobat Pro's Forms palette - Create - Use an existing file. Then I cleaned up all the form field stuff that Acrobat got wrong (boxes which were not form fields, lines which were not part of the form), and changed 1 field to a drop-down list. Then I added the 3 buttons to Print, Save and Reset. Then did a test under Preview. Saved. Then I ran the Accessibility Checker at that point.
I just took your suggestion about running accessibility checker right after conversion and before adding the form fields. I reconverted to a new file and ran accessibility checker. The only problems were 5 list items that were not children of a list. I manually fixed these in the tags section. Saved and rechecked. Reported 0 problems.
Then I added the form fields using Acrobat Pro's Forms palette again. Made the same changes (cleaned up, changed 1 field to a drop-down list, added 3 buttons). Saved. Ran Accessibility Checker. Lo and behold, now I have the "comments and annotations not in tree structure" report. I go to Tags - Find - Unmarked Annotations: all my form fields have to be added to the tree structure. Then I go to Tags - Find - Unmarked Content: here is where all the little blank spaces, line points, and line segments begin showing up.
So it appears everything becoming "untagged" is happening AFTER adding the form fields in Acrobat Pro. And this is where I have to start marking hundreds of things as artifacts.
Would adding form fields in LiveCycle provide better results? I have avoided using LiveCycle unless absolutely necessary because staff is not well versed in LiveCycle and these forms need to be turned around very quickly.....
Do not mix n match Acrobat forms and XFA (LiveCycle) forms. Do note that LiveCycle Designer is no longer bundled with Acrobat. This change takes effect with Acrobat XI.
Form fields added to a tagged PDF are untagged content. You have to add the proper tags, alt text, etc. You have to ensure proper placement in the structure tree.
Adding untagged content to a tagged PDF has no impact on the existing structure tree's content or topological 'build'. Inadvertently inappropriate post-processing is what fraggles the structure tree.
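To make that distinction concrete, here is a minimal Python sketch. It is a toy model only, not Acrobat's actual data model or any real PDF library: the names PageItem, ToyPdf, add_item, tag, and unmarked are all hypothetical. It illustrates that a newly added field shows up as unmarked content until it is tagged, while the tags that already existed stay exactly as they were.

```python
from dataclasses import dataclass, field


@dataclass
class PageItem:
    # Hypothetical stand-in for anything drawn on the page.
    name: str
    is_artifact: bool = False  # artifacts are meant to be skipped by screen readers


@dataclass
class ToyPdf:
    # Toy model: a flat list of page items plus the set of names
    # that are present in the "structure tree".
    items: list = field(default_factory=list)
    tagged: set = field(default_factory=set)

    def add_item(self, item, tag=False):
        """Add an item to the page; it is only tagged if requested."""
        self.items.append(item)
        if tag:
            self.tagged.add(item.name)

    def tag(self, name):
        """Place an existing item in the structure tree."""
        self.tagged.add(name)

    def unmarked(self):
        """Items that are neither tagged nor marked as artifacts."""
        return [i.name for i in self.items
                if i.name not in self.tagged and not i.is_artifact]


pdf = ToyPdf()
pdf.add_item(PageItem("Heading"), tag=True)
pdf.add_item(PageItem("Box border", is_artifact=True))
tags_before = set(pdf.tagged)

# Adding a form field introduces untagged content...
pdf.add_item(PageItem("Name field"))
unmarked_after_add = pdf.unmarked()
tags_after_add = set(pdf.tagged)  # ...but existing tags are unchanged.

# Tagging the field clears it from the unmarked list.
pdf.tag("Name field")
unmarked_after_tagging = pdf.unmarked()
```

In this model the box border never appears as unmarked, because it keeps its artifact flag. The symptom described in this thread would correspond to that flag being lost on items that were fine before.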
Be well ...
Okay, I won't mix n match; I will just stick with Acrobat.
What does "inadvertently inappropriate post-processing" mean? I am trying to learn what the appropriate order for post-processing is so I don't have to go thru hours of clean up.
The last processing order I tried was to run accessibility check earlier in the process instead of at the end:
Converted to PDF
Ran accessibility check - 1 minor error.
Fixed minor error using Tags panel.
Ran accessibility check - 0 errors.
Added form fields and buttons
Ran accessibility check again. Now shows 100s of errors (including about 30 form fields on the form which I know I must tag and ensure placement in tree).
The majority of those errors in the last accessibility check are untagged blank spaces, line segments around boxes, and end points where line segments end. Those things did not show up in accessibility check 1 or 2. They show up under the final accessibility check.
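One way to pin down which step introduces the new errors is to compare the checker results from one run to the next. The following is a minimal Python sketch of that comparison, assuming the issues from each report have been copied out by hand into lists of category labels. The function name new_issues, the labels, and the counts are all made up for illustration; Acrobat does not produce data in this form.

```python
from collections import Counter


def new_issues(previous, current):
    """Count issues, by category, that appear in `current` but not in `previous`."""
    # Counter subtraction keeps only categories with a positive difference.
    return Counter(current) - Counter(previous)


# Hypothetical, hand-entered results (illustrative numbers only).
check_after_fix = []  # second check: 0 errors
check_after_fields = (["unmarked annotation"] * 3
                      + ["unmarked content"] * 5)  # check after adding fields

introduced = new_issues(check_after_fix, check_after_fields)
for category, count in sorted(introduced.items()):
    print(f"{category}: {count} new")
```

Running the same comparison after each individual step (adding fields, cleaning up, adding buttons, saving) would narrow down where the unmarked content first appears.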
I am just trying to figure out what I am doing that is "fraggling" the structure tree.....
If I understand you correctly, it seems this is a corrupt file.
CT Dave is right, adding untagged content to a tagged PDF should have no impact on the existing structure. But documents do become corrupt and I have seen similar things to what you are describing happen in PDFs... it's very frustrating. Word to PDF conversion is not without its quirks.
I don't think it is possible to mix / match LiveCycle and the Forms Editor -- once you work in LiveCycle you can't "go back". And you said you manually created the form fields and that you are aware they need to be tagged.
The problem is that, all of a sudden, all the lines and other "artifacts" in the document lost their artifact attribute, if you will, and are now showing up as untagged content.
You could start over and hope for the best, or you could just click Find Unmarked Annotation a hundred times, select all the resulting non-form tags and convert them back to artifact. No guarantee either will work on this file, which may have other "invisible" internal flaws that will cause a repeat.
If you choose to start over, I recommend trying to Repair the Acrobat Installation before you start.
With Word files that repeatedly cause conversion errors, we sometimes find it necessary to Save as RTF, or even Save as Text and reformat the document.
NONE of these things should be necessary. However, given what you are saying I don't see where you have erred. You seem to have a corrupt document -- whether it went corrupt while you were working in Acrobat or the Word file and/or conversion introduced problems that manifested after you made the PDF and worked on it, I can't tell.
In my experience this bug -- i.e., lines and other artifacts "dropping out" of the structure -- isn't something that happens often, but it can and does happen, particularly with complex documents. It may not be due to something you did -- and there is no guaranteed method I know of avoiding it.
If anyone can recreate the problem, please share what user error is causing it.