I am working on an application that needs to insert thousands of rows of data into a table in a database. I am doing this in an async method. A connection and a single SQLStatement are set up at the beginning. Then, for each row, the statement's parameters are updated and execute() is called. When the row has been written to disk and the success event fires, the parameters are updated again and execute() is called again, until all rows are stored in the DB.
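Simplified, the loop looks roughly like this (table and column names are placeholders, and the words table already exists):

```actionscript
import flash.data.SQLConnection;
import flash.data.SQLStatement;
import flash.events.SQLEvent;
import flash.filesystem.File;

var words:Array = [];      // one entry per word pulled from the text file
var positions:Array = [];  // matching index numbers
var row:int = 0;

var conn:SQLConnection = new SQLConnection();
var insert:SQLStatement = new SQLStatement();
insert.sqlConnection = conn;
insert.text = "INSERT INTO words (word, position) VALUES (:word, :position)";
insert.addEventListener(SQLEvent.RESULT, onRowStored);

conn.addEventListener(SQLEvent.OPEN, onOpen);
conn.openAsync(File.applicationStorageDirectory.resolvePath("index.db"));

function onOpen(e:SQLEvent):void
{
    if (row < words.length) insertNext();
}

function insertNext():void
{
    insert.parameters[":word"] = words[row];
    insert.parameters[":position"] = positions[row];
    insert.execute();                        // one execute() per row
}

function onRowStored(e:SQLEvent):void
{
    row++;                                   // move on only after the row hits disk
    if (row < words.length) insertNext();
}
```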
Performance is very slow: each transaction takes about 50 ms to 300 ms, depending on whether the disk is free.
How can I make this more efficient? Is there a way to build the whole table in memory and then dump it to a file? Can I execute a few transactions together? The bottleneck is the access to the file system.
The application indexes a huge text file. Each word of the text file is stored in a row, along with some index numbers that track where the word appears in the file.
Thank you for your help
There are a couple of solutions you can use:
First, you can wrap the entire INSERT operation in a transaction. That way, all of the changes are made to the in-memory copy of the database first, and nothing is written to disk until you commit the transaction. This is described in the documentation here:
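If you're using the AIR SQL API (flash.data.SQLConnection / SQLStatement), a minimal sketch of the transaction approach might look like this; the words table and the synchronous open() are just assumptions to keep the example short:

```actionscript
import flash.data.SQLConnection;
import flash.data.SQLStatement;
import flash.filesystem.File;

var conn:SQLConnection = new SQLConnection();
conn.open(File.applicationStorageDirectory.resolvePath("index.db")); // synchronous open for brevity

var create:SQLStatement = new SQLStatement();
create.sqlConnection = conn;
create.text = "CREATE TABLE IF NOT EXISTS words (word TEXT, position INTEGER)";
create.execute();

var words:Array = ["alpha", "beta", "gamma"];     // placeholder data
var positions:Array = [1, 2, 3];

var insert:SQLStatement = new SQLStatement();
insert.sqlConnection = conn;
insert.text = "INSERT INTO words (word, position) VALUES (:word, :position)";

conn.begin();                                     // one transaction for the whole batch
for (var i:int = 0; i < words.length; i++)
{
    insert.parameters[":word"] = words[i];
    insert.parameters[":position"] = positions[i];
    insert.execute();                             // changes stay in memory until commit
}
conn.commit();                                    // a single write to disk for all rows
conn.close();
```

With your asynchronously opened connection you would do the same thing, just driven by the BEGIN/RESULT/COMMIT events instead of a plain loop.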
Another alternative that should improve performance is to use an INSERT..SELECT statement to combine the multiple inserts into one statement. With INSERT..SELECT, a SELECT statement defines the data that's inserted into the table, rather than a single set of explicit values. Since a SELECT statement defines a set of rows, INSERT..SELECT lets you insert multiple rows into a table with one SQL statement. For more details about INSERT..SELECT (including code examples) see this thread:
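In SQLite (which is what the AIR SQL API uses under the hood), one common way to batch rows into a single INSERT..SELECT is to chain SELECTs with UNION ALL. A rough sketch, reusing the placeholder words table from above:

```actionscript
import flash.data.SQLConnection;
import flash.data.SQLStatement;
import flash.filesystem.File;

var conn:SQLConnection = new SQLConnection();
conn.open(File.applicationStorageDirectory.resolvePath("index.db")); // assumes the words table exists

var words:Array = ["alpha", "beta", "gamma"];     // placeholder data
var positions:Array = [1, 2, 3];

// Build one INSERT..SELECT that carries every row, chaining the rows with UNION ALL.
var selects:Array = [];
var params:Object = {};
for (var i:int = 0; i < words.length; i++)
{
    selects.push("SELECT :w" + i + " AS word, :p" + i + " AS position");
    params[":w" + i] = words[i];
    params[":p" + i] = positions[i];
}

var batch:SQLStatement = new SQLStatement();
batch.sqlConnection = conn;
batch.text = "INSERT INTO words (word, position) " + selects.join(" UNION ALL ");
for (var key:String in params)
{
    batch.parameters[key] = params[key];
}
batch.execute();                                  // one statement, one implicit transaction
conn.close();
```

SQLite caps the number of bound parameters per statement, so for thousands of rows you'd split the data into chunks of a few hundred rows per statement and still wrap the whole run in one transaction.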