Hello, everyone.
I'm working on a Solr collection of documents that are stored as BLOBs in an Oracle database. The documents can be .pdf, .txt, .doc(x), or .xls(x) files.
Is there a way of getting the text of these documents so that the collection can be properly indexed?
Thank you,
^_^
Per advice received from Raymond Camden, I'm getting the BLOB, saving the file on the server filesystem, reading it, and storing the data.
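For illustration only, here is a minimal Python sketch of those steps (save the BLOB bytes to the filesystem, then read them back). It is not my actual code. It assumes the BLOB has already been fetched as bytes and that the original file name is stored alongside it; `save_blob`, `read_back`, and `sniff_container` are hypothetical helper names. The leading-bytes check is only a sanity test that the saved file is the kind of container its extension claims.

```python
import os
import tempfile

# Leading bytes of the container formats mentioned in this thread.
MAGIC = {
    b"%PDF": "pdf",
    b"PK\x03\x04": "zip-based Office file (.docx/.xlsx/.pptx)",
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "legacy Office file (.doc/.xls/.ppt)",
}


def sniff_container(data: bytes) -> str:
    """Guess the container type from the first bytes of the BLOB."""
    for magic, label in MAGIC.items():
        if data.startswith(magic):
            return label
    return "unknown or plain text"


def save_blob(data: bytes, original_name: str, directory: str) -> str:
    """Write the BLOB bytes to a new file, keeping the original extension.

    Assumption: original_name comes from a column next to the BLOB.
    """
    _, ext = os.path.splitext(original_name)
    fd, path = tempfile.mkstemp(suffix=ext.lower(), dir=directory)
    with os.fdopen(fd, "wb") as handle:  # binary mode, no newline translation
        handle.write(data)
    return path


def read_back(path: str) -> bytes:
    """Read the saved file as raw bytes."""
    with open(path, "rb") as handle:
        return handle.read()
```

Only .txt content can be treated as text after reading it back. The PDF and Office formats are binary or zipped containers, so their bytes still have to go through a text extractor before indexing.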
However, this isn't working for DOC(x) and PPT(x) files. Any suggestions?
Thank you,
^_^