Expand my Community achievements bar.

Database failover recovery

Avatar

Level 2

For our customers we built robust architecture with LC in horizonal cluster and MS SQL Server database running in Windows cluster (active-passive configuration).

While we were testing database failover (moving all db services from one node to another) some strange things started to happen.

After services were down on one server, LC started to generate a lot of errors in server log. After services were up again on another server errors continued. For 50 minutes! During all that time contenspace application was inaccesible - reporting error (on both servers in a cluster when accesed individuallly).

Result was the same if we just took db down and up again on the same db server in a cluster. Our LC is simply unable to make gracefull recovery if its database connections are dropped at some point.

During that time server.log grew to 200+MB.

Errors that were generated pretty much followed the same pattern.

2009-12-16 15:23:06,729 WARN  [org.hibernate.util.JDBCExceptionReporter] SQL Error: 0, SQLState: null

2009-12-16 15:23:06,729 ERROR [org.hibernate.util.JDBCExceptionReporter] The connection is closed.

2009-12-16 15:23:06,729 ERROR [org.hibernate.transaction.JDBCTransaction] JDBC rollback failed

com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.

at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(Unknown Source)

at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(Unknown Source)

at com.microsoft.sqlserver.jdbc.SQLServerConnection.rollback(Unknown Source)

at org.jboss.resource.adapter.jdbc.BaseWrapperManagedConnection.jdbcRollback(BaseWrapperManagedConnection.java:448)

at org.jboss.resource.adapter.jdbc.WrappedConnection.rollback(WrappedConnection.java:337)

at org.hibernate.transaction.JDBCTransaction.rollbackAndResetAutoCommit(JDBCTransaction.java:183)

at org.hibernate.transaction.JDBCTransaction.rollback(JDBCTransaction.java:162)

at org.springframework.orm.hibernate3.HibernateTransactionManager.doRollback(HibernateTransactionManager.java:581)

at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:745)

at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:722)

at org.springframework.transaction.interceptor.TransactionAspectSupport.completeTransactionAfterThrowing(TransactionAspectSupport.java:325)

at org.alfresco.util.transaction.SpringAwareUserTransaction.rollback(SpringAwareUserTransaction.java:498)

at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:346)

at org.alfresco.repo.node.index.IndexTransactionTracker.reindexImpl(IndexTransactionTracker.java:203)

at org.alfresco.repo.node.index.AbstractReindexComponent$1.execute(AbstractReindexComponent.java:252)

at org.alfresco.repo.node.index.AbstractReindexComponent.reindex(AbstractReindexComponent.java:262)

at org.alfresco.repo.node.index.IndexRecoveryJob.execute(IndexRecoveryJob.java:40)

at org.quartz.core.JobRunShell.run(JobRunShell.java:202)

at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:529)

2009-12-16 15:23:06,729 ERROR [org.alfresco.util.transaction.SpringAwareUserTransaction] Application exception overridden by rollback exception

java.lang.Exception

at org.alfresco.util.transaction.SpringAwareUserTransaction.rollback(SpringAwareUserTransaction.java:498)

at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:346)

at org.alfresco.repo.node.index.IndexTransactionTracker.reindexImpl(IndexTransactionTracker.java:203)

at org.alfresco.repo.node.index.AbstractReindexComponent$1.execute(AbstractReindexComponent.java:252)

at org.alfresco.repo.node.index.AbstractReindexComponent.reindex(AbstractReindexComponent.java:262)

at org.alfresco.repo.node.index.IndexRecoveryJob.execute(IndexRecoveryJob.java:40)

at org.quartz.core.JobRunShell.run(JobRunShell.java:202)

at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:529)

2009-12-16 15:23:06,729 ERROR [org.alfresco.repo.transaction.RetryingTransactionHelper] Rollback failure.  Normal retry behaviour will resume.

org.springframework.transaction.TransactionSystemException: Could not roll back Hibernate transaction; nested exception is org.hibernate.TransactionException: JDBC rollback failed

Caused by:

org.hibernate.TransactionException: JDBC rollback failed

at org.hibernate.transaction.JDBCTransaction.rollback(JDBCTransaction.java:170)

at org.springframework.orm.hibernate3.HibernateTransactionManager.doRollback(HibernateTransactionManager.java:581)

at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:745)

at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:722)

at org.springframework.transaction.interceptor.TransactionAspectSupport.completeTransactionAfterThrowing(TransactionAspectSupport.java:325)

at org.alfresco.util.transaction.SpringAwareUserTransaction.rollback(SpringAwareUserTransaction.java:498)

at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:346)

at org.alfresco.repo.node.index.IndexTransactionTracker.reindexImpl(IndexTransactionTracker.java:203)

at org.alfresco.repo.node.index.AbstractReindexComponent$1.execute(AbstractReindexComponent.java:252)

at org.alfresco.repo.node.index.AbstractReindexComponent.reindex(AbstractReindexComponent.java:262)

at org.alfresco.repo.node.index.IndexRecoveryJob.execute(IndexRecoveryJob.java:40)

at org.quartz.core.JobRunShell.run(JobRunShell.java:202)

at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:529)

Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.

at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(Unknown Source)

at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(Unknown Source)

at com.microsoft.sqlserver.jdbc.SQLServerConnection.rollback(Unknown Source)

at org.jboss.resource.adapter.jdbc.BaseWrapperManagedConnection.jdbcRollback(BaseWrapperManagedConnection.java:448)

at org.jboss.resource.adapter.jdbc.WrappedConnection.rollback(WrappedConnection.java:337)

at org.hibernate.transaction.JDBCTransaction.rollbackAndResetAutoCommit(JDBCTransaction.java:183)

at org.hibernate.transaction.JDBCTransaction.rollback(JDBCTransaction.java:162)

... 12 more

2009-12-16 15:23:06,729 WARN  [org.hibernate.jdbc.AbstractBatcher] exception clearing maxRows/queryTimeout

com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.

at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(Unknown Source)

at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(Unknown Source)

at com.microsoft.sqlserver.jdbc.SQLServerStatement.checkClosed(Unknown Source)

at com.microsoft.sqlserver.jdbc.SQLServerStatement.getMaxRows(Unknown Source)

at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.getMaxRows(CachedPreparedStatement.java:352)

at org.jboss.resource.adapter.jdbc.WrappedStatement.getMaxRows(WrappedStatement.java:244)

at org.hibernate.jdbc.AbstractBatcher.closeQueryStatement(AbstractBatcher.java:272)

at org.hibernate.jdbc.AbstractBatcher.closeQueryStatement(AbstractBatcher.java:209)

at org.hibernate.loader.Loader.getResultSet(Loader.java:1792)

at org.hibernate.loader.Loader.doQuery(Loader.java:662)

at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:224)

at org.hibernate.loader.Loader.doList(Loader.java:2211)

at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2095)

at org.hibernate.loader.Loader.list(Loader.java:2090)

at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:388)

at org.hibernate.hql.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:338)

at org.hibernate.engine.query.HQLQueryPlan.performList(HQLQueryPlan.java:172)

at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1121)

at org.hibernate.impl.QueryImpl.list(QueryImpl.java:79)

at org.alfresco.repo.avm.hibernate.AVMStoreDAOHibernate.getAll(AVMStoreDAOHibernate.java:89)

at org.alfresco.repo.avm.AVMRepository.getAVMStores(AVMRepository.java:1088)

at org.alfresco.repo.avm.AVMServiceImpl.getStores(AVMServiceImpl.java:876)

at sun.reflect.GeneratedMethodAccessor312.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:585)

at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:281)

at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:187)

at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:154)

at org.alfresco.repo.transaction.SingleEntryTransactionResourceInterceptor.invokeInternal(SingleEntryTransactionResourceInterceptor.java:163)

at org.alfresco.repo.transaction.SingleEntryTransactionResourceInterceptor.invoke(SingleEntryTransactionResourceInterceptor.java:138)

at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:176)

at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:210)

at $Proxy50.getStores(Unknown Source)

at org.alfresco.repo.node.index.AVMRemoteSnapshotTracker.processStores(AVMRemoteSnapshotTracker.java:73)

at org.alfresco.repo.node.index.AVMRemoteSnapshotTracker.reindexImpl(AVMRemoteSnapshotTracker.java:64)

at org.alfresco.repo.node.index.AbstractReindexComponent$1.execute(AbstractReindexComponent.java:252)

at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:278)

at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:227)

at org.alfresco.repo.node.index.AbstractReindexComponent.reindex(AbstractReindexComponent.java:258)

at org.alfresco.repo.node.index.IndexRecoveryJob.execute(IndexRecoveryJob.java:40)

at org.quartz.core.JobRunShell.run(JobRunShell.java:202)

at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:529)

Is that maybe expected behaviour? Or it could be something in our architecture/configuration? Maybe tweaking some of datasource properties.

Our system is otherwise stable, and we have several workflows running without problems.

Any suggestion is appreciated!

Zeljko Vranic

1 Reply

Avatar

Level 2

Sorry, I forgot to mention that we have Livecycle ES 8.2.1 SP3. Errors were first encountered on SP2 - SP3 didn't help. We didn't try with SP1 (we were unaware of the problem yet).