First things first, I haven't slept for 36 hours, brain is
shutting down, so this may sound incoherent. I apologize, please go
Say I have an aerial imagery collection that is about 3.2
petabytes in total. This will work in SQLite since I'm only storing
about 300,000 values and I already tested it. The raw data is
I'll skip the details and go straight to the problem. I have a
database and need to read three things into it: latitude, longitude,
and image location. Latitude and longitude are stored in a metadata.txt
Example metadata location:
The actual images are stored in a completely different folder
that is similar in nature, but doesn't match the metadata folder.
Example Image Location:
The problem is that the image folder and the metadata folders
Normally that wouldn't be a problem, because I could just
reformat one date into the other by manipulating a string. But
the problem is that there is no standard naming convention for
the folders. Sometimes the date folder is named Oct152008, while in
another place it is formatted as oct_152008 or
I need to read the metadata.txt files in each folder into
the database, but I have no idea how to do this given the current
state of things...
I'm thinking the only way to do this is to hand jam it.
Someone would have to make a spreadsheet with two columns. On the
left, the image path, and on the right, the metadata path. Then I
could write code that reads in one path and points to the other.
I would do this using C++ or C. Is this really the only way I
can do this? This would be months of hand jamming that I'm not
looking forward to.
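
For reference, the two-column mapping idea could be sketched roughly like this. It is written in Python only to keep it short; the same thing can be done in C or C++ with the SQLite C API. The table name `image_metadata`, the function names, and the paths in the usage note are all made up for illustration, and the sketch assumes the spreadsheet has been exported as a plain CSV with the image path first and the metadata path second.

```python
import csv
import io
import sqlite3


def load_mapping(conn, csv_file):
    """Load (image_path, metadata_path) rows from a CSV file object into SQLite.

    Assumption: two columns per row, image path first, metadata path second.
    Returns the number of rows read from the CSV.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_metadata ("
        " image_path TEXT PRIMARY KEY,"
        " metadata_path TEXT NOT NULL)"
    )
    # Skip blank or short rows instead of failing on them.
    rows = [(r[0].strip(), r[1].strip())
            for r in csv.reader(csv_file) if len(r) >= 2]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO image_metadata VALUES (?, ?)", rows
        )
    return len(rows)


def metadata_for(conn, image_path):
    """Return the metadata path recorded for an image path, or None."""
    row = conn.execute(
        "SELECT metadata_path FROM image_metadata WHERE image_path = ?",
        (image_path,),
    ).fetchone()
    return row[0] if row else None
```

With an in-memory database, `load_mapping(sqlite3.connect(":memory:"), io.StringIO("images/a.tif,meta/a/metadata.txt\n"))` loads one row (placeholder paths), after which `metadata_for` can look up the metadata path for that image.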
Could you not try to standardize by writing a little filename
parser, iterating through the existing file system, and storing the
standardized version in a new file system? If the date folder
syntax is the only area of concern, then there can only be a limited
number of formats: the ones you listed in your post, plus 921_08. You
could even write a little search tool that picks a metadata folder
and then tries each date combination in the images folders. It may
consume a great deal of CPU time searching or parsing, but I
believe you could save yourself a month or two and carpal tunnel. Good
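
A rough sketch of the parser and search idea, in Python for brevity (the same logic ports to C++ with `std::regex`). The function names are made up for illustration. It assumes only the formats mentioned in this thread: a three-letter month name followed by day and four-digit year (Oct152008, oct_152008), and the numeric 921_08, which I am guessing means month 9, day 21, year 2008. Any other format in the real folders would need its own pattern.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}


def _safe_date(year, month, day):
    """Build a date, returning None for impossible values such as month 13."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


def normalize_date_folder(name):
    """Return a datetime.date for a folder name, or None if nothing matches."""
    s = name.strip().lower()
    # Oct152008 / oct_152008: month name, optional separator, day, 4-digit year.
    m = re.fullmatch(r"([a-z]{3})[_-]?(\d{1,2})[_-]?(\d{4})", s)
    if m and m.group(1) in MONTHS:
        return _safe_date(int(m.group(3)), MONTHS[m.group(1)], int(m.group(2)))
    # 921_08: month digits, 2-digit day, underscore, 2-digit year (assumed 20xx).
    m = re.fullmatch(r"(\d{1,2})(\d{2})_(\d{2})", s)
    if m:
        return _safe_date(2000 + int(m.group(3)),
                          int(m.group(1)), int(m.group(2)))
    return None


def match_folders(metadata_names, image_names):
    """Pair metadata folders with image folders that share a normalized date.

    Returns (matches, unmatched). A metadata folder with zero or several
    candidate image folders is left in `unmatched` for manual review.
    """
    by_date = {}
    for name in image_names:
        d = normalize_date_folder(name)
        if d is not None:
            by_date.setdefault(d, []).append(name)
    matches, unmatched = {}, []
    for name in metadata_names:
        d = normalize_date_folder(name)
        candidates = by_date.get(d, []) if d is not None else []
        if len(candidates) == 1:
            matches[name] = candidates[0]
        else:
            unmatched.append(name)
    return matches, unmatched
```

Whatever ends up in `unmatched` is the part that still needs hand work, which should be far smaller than the whole collection.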