Commit [r8650]

pdfbox-1.1.0.jar instead of PDFBox-0.7.2.jar

When exceptions are thrown during text extraction, they are now caught: an empty index field is returned and a warning is logged.
Previously, exceptions were passed back to the indexing stylesheet, which cancelled the indexing;
now the index document simply lacks the full-text index field.
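
The catch-and-return-empty behaviour described above can be sketched roughly as follows. This is not the code from GenericOperationsImpl.java or TransformerToText.java; the class name SafeTextExtraction, the Extractor interface, and the use of java.util.logging are assumptions made only for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of FedoraGenericSearch.
final class SafeTextExtraction {

    private static final Logger LOG =
            Logger.getLogger(SafeTextExtraction.class.getName());

    /** Hypothetical stand-in for the real extraction call (for example a PDFBox-based one). */
    @FunctionalInterface
    interface Extractor {
        String extract() throws Exception;
    }

    /**
     * Runs the extractor. On any exception, logs a warning and returns an
     * empty string, so the caller can still build the index document
     * without the full-text field.
     */
    static String extractOrEmpty(String sourceName, Extractor extractor) {
        try {
            String text = extractor.extract();
            return text == null ? "" : text;
        } catch (Exception e) {
            LOG.log(Level.WARNING,
                    "Text extraction failed for " + sourceName
                            + "; full-text field left empty", e);
            return "";
        }
    }
}
```

A caller would wrap whatever extraction it performs, e.g. `SafeTextExtraction.extractOrEmpty("doc.pdf", () -> extractWithPdfBox(stream))`, where `extractWithPdfBox` is again a hypothetical name.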
Text extraction from PDF documents now replaces characters 00-31 with spaces, because they caused exceptions during indexing.
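
A minimal sketch of the character replacement, assuming "characters 00-31" means the code points below 32 (which includes tab, line feed, and carriage return). The class and method names are hypothetical and do not come from TransformerToText.java.

```java
// Hypothetical helper, not part of FedoraGenericSearch.
final class ControlCharCleaner {

    /** Returns a copy of text with every character in the range 0-31 replaced by a space. */
    static String replaceControlChars(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Characters 0-31 are control characters; substitute a space so
            // the length and word boundaries of the text are preserved.
            sb.append(c < 32 ? ' ' : c);
        }
        return sb.toString();
    }
}
```

Replacing rather than deleting keeps words that were separated only by a line break or tab from running together in the indexed text.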

gertsp 2010-06-17

removed /services/genericsearch/trunk/FedoraGenericSearch/lib/PDFBox-0.7.2.jar
added /services/genericsearch/trunk/FedoraGenericSearch/lib/pdfbox-1.1.0.jar
changed /services/genericsearch/trunk/FedoraGenericSearch/src/java/dk/defxws/fedoragsearch/server/GenericOperationsImpl.java
changed /services/genericsearch/trunk/FedoraGenericSearch/src/java/dk/defxws/fedoragsearch/server/TransformerToText.java