I think that the problem is in the wildcard.
When I run an XPath search on our site like this:
/jcr:root/content/career//*[jcr:contains(., '到我們')] order by @jcr:score
it returns results, but using
/jcr:root/content/career//*[jcr:contains(., '*到我們*')] order by @jcr:score
it returns nothing. The same happens with SQL1/2. If I understand the Lucene documentation correctly, you can't use wildcards when searching in multi-byte character sets. That is because "words" do not have the same meaning there as in "single-byte" languages. When searching for "any word containing the string abc" in a single-byte language, we would find every word that contains abc,
and that can be described logically as "*abc*". But in multi-byte languages, that sort of logic has another meaning. The standard Lucene analyzer interprets each multi-byte character as a word. That means that 到我們 is actually the same as *到*我*們* from a search perspective. Or not... It is really *到我們*, since the words before and after it have no meaning for the search. As in the sentence "The little dabchick flew away", we don't care about the words around the search criteria.
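To make the point about per-character "words" concrete, here is a rough Python sketch. It is only a toy simulation of the behaviour described above, not Lucene's or Jackrabbit's actual code: the functions tokenize, phrase_match and wildcard_match are hypothetical names made up for this illustration, and the CJK check covers only the basic CJK Unified Ideographs range.

```python
import fnmatch


def is_cjk(ch):
    # Assumption for this sketch: only the basic CJK Unified Ideographs block.
    return "\u4e00" <= ch <= "\u9fff"


def tokenize(text):
    """Toy analyzer: every CJK character becomes its own token;
    runs of other letters/digits become one lowercased token."""
    tokens, word = [], []
    for ch in text:
        if is_cjk(ch):
            if word:
                tokens.append("".join(word).lower())
                word = []
            tokens.append(ch)
        elif ch.isalnum():
            word.append(ch)
        elif word:
            tokens.append("".join(word).lower())
            word = []
    if word:
        tokens.append("".join(word).lower())
    return tokens


def phrase_match(doc_tokens, query):
    """True if the query's tokens appear consecutively in the document."""
    q = tokenize(query)
    if not q:
        return False
    return any(doc_tokens[i:i + len(q)] == q
               for i in range(len(doc_tokens) - len(q) + 1))


def wildcard_match(doc_tokens, pattern):
    """True if any SINGLE token matches the wildcard pattern."""
    return any(fnmatch.fnmatchcase(tok, pattern.lower()) for tok in doc_tokens)


english = tokenize("The little dabchick flew away")
chinese = tokenize("歡迎來到我們的網站")

print(wildcard_match(english, "*abc*"))    # one token, "dabchick", contains "abc"
print(phrase_match(chinese, "到我們"))      # three consecutive one-character tokens
print(wildcard_match(chinese, "*到我們*"))  # no single token can contain all three characters
```

Under these assumptions, the plain search term behaves like a phrase of three one-character words, while the wildcard pattern has to match inside a single word and so can never match a three-character string.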
Google for "Lucene and Chinese language" for more information.
Thanks, but it doesn't work for me when I use the XPath search with '到我们' like yours. SQL1/2 doesn't work either.
Even if you skip the wildcards and only search for a set of words? Does it work in CRXDE Lite (Tools/Query)?
What version of CQ?
Yes, we are querying in CRXDE Lite, and it doesn't work without the wildcards.
We are using CQ5.5 now. Does it have many bugs?
I have tried it with an out of the box CQ5.5 SP2.1 and there it works. It takes a couple of minutes before Lucene has indexed the added content, but it finds it.