Our business is looking for a contractor with extensive InDesign & scripting experience. Here's a basic idea of the problem we are trying to solve:
Is it possible to create a script which copies text from our blog, and/or other websites into our templates? Could we use a script to pull article headlines, body text, and quotes from the website(s) and place them into the ID templates we already use? This would save a lot of transcription time which could be used to focus on book layout/pictures/etc.
If this does not seem possible, or you need more information, please reply to this post.
If this is possible, and you have the necessary skills to help, please submit your résumé and a link to your website to email@example.com. We will only reply to the applicant we deem most qualified.
One of the easier and more maintainable ways would be to use a FileMaker database web viewer to retrieve the HTML source, parse it into the appropriate records and fields, and then go from there, either setting the InDesign frame content directly or outputting and placing tagged text or XML files. (AppleScript is easiest for this.)
I don't know of a way for an InDesign script to grab the textual data from a web page ... but it is possible using Adobe's Creative Suite Extension Builder technology, which is based on Adobe AIR. Once the Extension has retrieved the HTML from the web page, it can pour the text contents into an InDesign document, just like a script would, but faster.
What you are asking for is indeed possible, but it may not be the most efficient way to achieve what I believe are your actual goals, and here is why:
When you publish data to a website, you usually have to insert a lot of formatting code. The actual content becomes mixed with what, from InDesign's point of view, is garbage, and you have to spend a lot of effort cleaning it up.
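To give a feel for that cleanup effort, here is a minimal sketch in Python (standard library only) that strips markup from a page fragment and keeps the visible text. The class name, the function name, and the list of tags treated as paragraph breaks are my own assumptions for illustration; a real blog page would need considerably more handling for navigation, comments, ads and so on.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    SKIPPED = {"script", "style"}
    # Hypothetical choice: closing one of these tags ends a paragraph.
    BLOCKS = {"p", "div", "h1", "h2", "h3", "li", "blockquote"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCKS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Text inside <script>/<style> is dropped; everything else is kept.
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML fragment, one paragraph per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.parts)
    # Collapse runs of whitespace and drop empty lines.
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)
```

Even this much only recovers plain text; it loses the distinction between headline, body and quote, which is exactly the information you need for styling in InDesign.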
If, as I presume is true for a blog, the content is stored in a database, it is much, much better to retrieve the raw content, without the formatting "garbage", directly from the server. How to do that depends on your blog backend, but usually it involves making calls to some kind of API, or to a server-side script that pulls the requested data.
Once you have the data, it is simply a matter of putting it into InDesign. If you are lucky enough to get the data as XML, you only need to import and style it, possibly with some XSL transformations.
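If the data does not arrive as XML, producing it yourself is not much work. The following is a rough sketch, assuming the articles have already been retrieved as Python dictionaries; the element names (`articles`, `article`, `headline`, `body`, `quote`) and the dictionary keys are made up for the example and would have to match whatever tags your templates expect.

```python
import xml.etree.ElementTree as ET


def articles_to_xml(articles):
    """Serialize article records to an XML string.

    Each record is assumed (hypothetically) to look like:
    {"headline": str, "body": [str, ...], "quotes": [str, ...]}
    """
    root = ET.Element("articles")
    for article in articles:
        node = ET.SubElement(root, "article")
        ET.SubElement(node, "headline").text = article["headline"]
        # One <body> element per paragraph, so each can be styled separately.
        for paragraph in article["body"]:
            ET.SubElement(node, "body").text = paragraph
        for quote in article.get("quotes", []):
            ET.SubElement(node, "quote").text = quote
    # ElementTree escapes &, < and > in text content automatically.
    return ET.tostring(root, encoding="unicode")
```

Write the returned string to a file and import that file into the template; the tags can then be mapped to paragraph styles on the InDesign side.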
If you are really lucky, your blog backend may allow you to publish the content as XML/XHTML/RSS/Atom etc., in which case there is no need for any specific server-side work, as you can simply pull the content directly through a regular HTTP request.
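As an illustration of the feed route, here is a small Python sketch that reads an RSS 2.0 feed and returns one record per item. It assumes a plain RSS 2.0 feed with `title`, `link` and `description` elements; the function names and the record keys are my own, and an Atom feed would need different (namespaced) element names.

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_rss(feed_xml):
    """Extract headline, link and summary from each <item> of an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "headline": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
        })
    return items


def fetch_rss(url, timeout=10):
    """Download a feed over HTTP and parse it (url is whatever your blog exposes)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_rss(response.read())
```

Note that many feeds carry only an excerpt in `description`, and that excerpt may itself contain HTML, so check what your backend actually puts in the feed before relying on it for body text.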
The point I am trying to make is that you should pick up the content before it is littered with "garbage", instead of trying to clean it up afterwards. Cleaning up afterwards is what you do when you have to work by hand, but you really should not try to replicate your manual workflow; go for a good automated solution instead.