Your problem is a bit unclear to me. What do you mean by "CPU memory"?
The files and directories you are referencing are regular directories. The processes which create and maintain these files are regular CQ processes; for example the (Lucene) index update is a regular step when persisting content in the repository.
By default the TarOptimizer runs once a day between 2 and 5 am (configurable via repository.xml), so your observation of high resource usage twice a day might not be linked to it directly. Of course the TarOptimizer might add to the resource consumption, but other factors contribute as well.
So, to analyze your problem more deeply, I suggest that you
* check whether these periods of high resource usage occur every day at the same time.
* check whether all of your systems are affected, or only a small subset, or even only a single machine/instance.
* check whether other external factors (like virtualization) might influence the performance of your systems.
* check the timing of the TarOptimizer and the backup, and whether they run at exactly the time when you observe your resource spike.
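The first and last checks can be sketched in a few lines of Java: count the observed spikes per hour of day, and test whether a spike falls into the TarOptimizer window. This is only a minimal sketch; the class `SpikeHours`, its methods and the sample timestamps are hypothetical, and the 2 to 5 am window is the default mentioned above, so adjust it if your repository.xml says otherwise.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SpikeHours {

    /** Counts how many spikes were observed in each hour of the day (0-23). */
    public static Map<Integer, Integer> countByHour(List<LocalDateTime> spikes) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (LocalDateTime t : spikes) {
            counts.merge(t.getHour(), 1, Integer::sum);
        }
        return counts;
    }

    /** True if the timestamp lies in [startHour, endHour), e.g. 2 and 5 for the default window. */
    public static boolean inWindow(LocalDateTime t, int startHour, int endHour) {
        int hour = t.getHour();
        return hour >= startHour && hour < endHour;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: replace with the times you actually observed.
        List<LocalDateTime> spikes = Arrays.asList(
                LocalDateTime.of(2012, 5, 1, 3, 15),
                LocalDateTime.of(2012, 5, 1, 14, 5),
                LocalDateTime.of(2012, 5, 2, 3, 40));
        System.out.println("Spikes per hour of day: " + countByHour(spikes));
        for (LocalDateTime t : spikes) {
            System.out.println(t + " in TarOptimizer window: " + inWindow(t, 2, 5));
        }
    }
}
```

If the counts cluster in the same one or two hours every day, a scheduled job is a likely candidate; if they are spread out, look at load or external factors first.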
Then you should start digging through the system:
* Check the process table of your operating system and validate that the CQ process is really responsible for the higher resource consumption.
* Observe log files for that specific time (especially error.log and request.log/access.log)
* Get thread dumps and find "unusual" threads which might be consuming the resources.
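As a rough illustration of the last step, the standard `java.lang.management` API can list the threads of a JVM sorted by the CPU time they have used so far. This is a minimal sketch, not a CQ tool: the class `ThreadCpuSnapshot` is hypothetical, it assumes a reasonably recent JDK, and it only inspects the JVM it runs in, so for CQ it would have to run inside the CQ process (or you simply take several dumps with jstack and compare them).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ThreadCpuSnapshot {

    /** One row of the report: thread name, state and accumulated CPU time. */
    public static final class Row {
        public final String name;
        public final Thread.State state;
        public final long cpuNanos; // -1 if the JVM cannot report CPU time

        Row(String name, Thread.State state, long cpuNanos) {
            this.name = name;
            this.state = state;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns up to 'limit' live threads, the ones with the most CPU time first. */
    public static List<Row> topThreads(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean cpuAvailable = mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled();
        List<Row> rows = new ArrayList<>();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // thread ended while the dump was taken
            }
            long cpu = cpuAvailable ? mx.getThreadCpuTime(info.getThreadId()) : -1L;
            rows.add(new Row(info.getThreadName(), info.getThreadState(), cpu));
        }
        rows.sort(Comparator.comparingLong((Row r) -> r.cpuNanos).reversed());
        return new ArrayList<>(rows.subList(0, Math.min(limit, rows.size())));
    }

    public static void main(String[] args) {
        for (Row r : topThreads(10)) {
            System.out.printf("%-40s %-15s %d ms%n", r.name, r.state, r.cpuNanos / 1_000_000);
        }
    }
}
```

Taking two such snapshots a few seconds apart during the spike and comparing them shows which threads are actually busy in that period.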
We have a problem somewhat related to the one mentioned above. We suspect that our problem is around indexes and the way CQ5.4 uses indexes to refer to data in the repository.
1) Does CQ allow reading different branches of a tree in parallel?
2) Our application reads different branches of the tree using multiple threads (102), and we observe that it takes 30% more time compared to reading the whole tree using a single thread.
3) When we profile the application, it indicates that our threads run mutually exclusively. Memory (heap) usage is around 40% of total memory, and CPU utilization is around 17% on average; it oscillates between 10% and 40% and never exceeds 40%. From the application's perspective we just kick off all threads using an executor service and continue processing as each thread returns.
4) We run this application somewhere between 15 and 20 times a day. Our observation is that the first run of the day takes 50% more time than subsequent runs, and this happens very consistently. We can reproduce this by restarting the whole CQ application.
5) Now the problem gets more interesting: we changed the settings in repository.xml to keep the index in memory. For this we need to restart our application twice: once without the index in memory (at midnight) so that all index merging completes, and then in the morning (8am) with the index in memory. Our whole CQ application takes 30 minutes to start without the index in memory and 7 minutes to start with the index in memory. But our application run takes almost twice as long (4.5 hours) to complete with the index in memory, compared to 2.5 hours without.
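To show what we mean by "mutually exclusive", here is a minimal sketch of how the overlap can be measured: each task records when it actually started and finished, and a sweep over those timestamps gives the highest number of tasks that ran at the same instant. The class `OverlapProbe` and its methods are hypothetical names for illustration; the `Runnable` tasks stand in for our branch readers, and nothing here is CQ API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OverlapProbe {

    /** Start and end of one finished task, in nanoseconds relative to a common base. */
    public static final class Span {
        public final long start;
        public final long end;

        public Span(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Runs all tasks on a fixed pool and records when each one actually ran. */
    public static List<Span> run(List<Runnable> tasks, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        final long base = System.nanoTime();
        try {
            List<Future<Span>> futures = new ArrayList<>();
            for (Runnable task : tasks) {
                futures.add(pool.submit(() -> {
                    long start = System.nanoTime() - base;
                    task.run();
                    return new Span(start, System.nanoTime() - base);
                }));
            }
            List<Span> spans = new ArrayList<>();
            for (Future<Span> f : futures) {
                spans.add(f.get()); // waits for the task to finish
            }
            return spans;
        } finally {
            pool.shutdown();
        }
    }

    /** Highest number of tasks that were running at the same instant. */
    public static int maxOverlap(List<Span> spans) {
        List<long[]> events = new ArrayList<>();
        for (Span s : spans) {
            events.add(new long[] {s.start, 1});
            events.add(new long[] {s.end, -1});
        }
        // Sort by time; at equal times an end (-1) is processed before a start (+1).
        events.sort((a, b) -> a[0] != b[0] ? Long.compare(a[0], b[0]) : Long.compare(a[1], b[1]));
        int current = 0;
        int max = 0;
        for (long[] e : events) {
            current += (int) e[1];
            max = Math.max(max, current);
        }
        return max;
    }
}
```

A result close to the pool size means the tasks really run in parallel; a result of 1 with a large pool would suggest that the tasks are serialized on some shared resource rather than limited by CPU.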
Note: all the details above are from testing without the XIV SSD cache, which has only now been introduced.
6) We have newly introduced an XIV SSD cache (6TB) from IBM, which sits on top of our SAN disks. This is a persistent cache, and 80-90% of total hits get their data from this cache. When we run our application with the index in memory it takes 4.5 hours to complete, but when we run it without the index in memory it takes 1.3 hours to complete.
We are clueless now and think that the way CQ5.4 manages indexes is the problem.
Any help will be greatly appreciated.