8 Replies Latest reply on Jul 11, 2011 10:29 PM by srikanth_mv

    Downloading a binary file using sockets without length known

    srikanth_mv

      Hi all,

       

      I'm trying to download a binary (.exe) file over a socket when the length of the file is not known. Please take a look at the code I'm using:

       

       

            var readBin = socketBin.read();

           // Here comes the code that checks for the HTTP Content-Length header field.

           // If the Content-Length field is available, I use it to download the file; otherwise I proceed with the following code.

       

       

          pBar.reset("Downloading plugin..",null);pBar.hit(readBin.length);       

                  while (1)

                  {

                      binFil.write(readBin);     //'binFil' is the file, into which downloaded file is written.

                      readBin = socketBin.read();

                      pBar.hit(readBin.length);

                      if( socketBin.eof || readBin.length<=0){

                          break;}          

                  }

       

       

                  binFil.write(readBin);

                  binFil.close();

                  socketBin.close();

       

      The problem: I can download the file within 10 seconds when the Content-Length is known. But when it is not known, the download takes about a minute and a half, and the progress bar is stuck for much of that time.

       

      FYI: the socket is opened in binary mode, and the file is downloaded correctly (even though it takes about a minute). BTW, I'm using CS5.5 ExtendScript.

       

      I'm not able to figure out where the bug is.

        • 1. Re: Downloading a binary file using sockets without length known
          srikanth_mv Level 1

          The problem has been resolved. It happened because the read() operation takes a long time after reading the last byte, waiting until the socket timeout. I've set the socket timeout to a small value and it now returns quickly.

          • 2. Re: Downloading a binary file using sockets without length known
            John Hawkinson Level 5

            I think you're doing this wrong.

             

            You haven't given us enough code to be an easily reproducible test case, and I'm not going to write a lot of glue code just to test your code. I assume your socketBin is a new Socket(); object? If so, look at Adobe's sample code:

             

            reply = "";
            conn = new Socket;
            // access Adobe’s home page
            if (conn.open ("www.adobe.com:80")) {
                // send a HTTP GET request
                conn.write ("GET /index.html HTTP/1.0\n\n");
                // and read the server’s reply
                reply = conn.read(999999);
                conn.close();
            }
            

             

            Notice they give a very large number of characters to read, rather than the default, which is zero, which means searching for newline characters.

            I suspect you're reading past the EOF or somesuch.

             

            Also:

            I've set the socket timeout to a small value and it now returns quickly.

            That doesn't sound like a good answer. What if there's a network hiccup?

             

            You could also use an external utility, like curl, to do the download, reaching out with AppleScript or VB.

            • 3. Re: Downloading a binary file using sockets without length known
              srikanth_mv Level 1

              Hi John,

               

              Thanks for your thoughts.

               

              "I assume your socketBin is a new Socket(); object?"

               

              Yes, socketBin is "new Socket;" and the connection is opened the same way as in Adobe's sample code.

               

              "I suspect you're reading past the EOF or somesuch."

               

              Probably yes. When I debug the code using alerts and the progress bar, it spends a lot of time reading the last chunk of data. So it must be either reading past EOF or waiting for more data, since I haven't passed a number-of-bytes argument to socketBin.read().

              But I don't know how to handle this, since I don't know the exact number of bytes for the last read operation. Checking the socketBin.eof flag doesn't help either, as the flag can only be checked after socketBin.read() returns.

              So the question becomes: how do I stop the read() operation from proceeding after it sees EOF?

               

              -Srikanth
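When the Content-Length is available, one way to avoid the stalled final read is to never ask for more characters than remain. A minimal sketch, assuming `sock` behaves like an ExtendScript Socket (a `read(count)` method that returns up to `count` characters) and `sink` is any object with a `write(str)` method, such as an open File. The names `readExactly`, `total`, and `chunkSize` are made up for illustration and are not part of the original script:

```javascript
// Hypothetical helper: copy exactly `total` characters from sock to sink.
// Each read asks for at most the number of characters still outstanding,
// so the final read never waits for data that will not arrive.
function readExactly(sock, sink, total, chunkSize) {
    var received = 0;
    while (received < total) {
        var want = Math.min(chunkSize, total - received);
        var chunk = sock.read(want);
        if (chunk.length === 0) {
            break; // timeout or closed connection: stop rather than spin
        }
        sink.write(chunk);
        received += chunk.length;
    }
    return received; // caller can compare this against `total`
}
```

Any part of the body that arrived together with the headers in the first read would have to be subtracted from `total` before calling this.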

              • 4. Re: Downloading a binary file using sockets without length known
                S.Biancardo

                Is this possible using POST transmission?

                Thanks

                • 5. Re: Downloading a binary file using sockets without length known
                  John Hawkinson Level 5
                  But I don't know how to handle this, since I don't know the exact number of bytes for the last read operation. Checking the socketBin.eof flag doesn't help either, as the flag can only be checked after socketBin.read() returns.

                  So the question becomes: how do I stop the read() operation from proceeding after it sees EOF?

                  Please give me enough of your code that it actually runs, and a test URL that it fails on, and I'll try to see what is wrong with it.

                   

                  Small self-contained problems are good! Snippets of code extracted from something bigger with dependencies on things we can't see are bad! Let's make it good!

                  • 6. Re: Downloading a binary file using sockets without length known
                    srikanth_mv Level 1

                    Try running this code in the ExtendScript editor for CS5.5. It will get stuck for about 10 seconds, as the response header field keeps the connection alive for 15 seconds only.

                    After downloading the exe, remove the headers from the file and you'll be able to run it.

                     

                    function GetData(url)

                    {

                                    var parsedURLBin = ParseURL(url);

                                    if ((parsedURLBin.protocol != "HTTP") && (parsedURLBin.protocol != "HTTPS"))

                                    {

                                        alert("Protocol is not HTTP: [" + parsedURLBin.protocol + "]");

                                        return;

                                    }

                     

                                var socketBin = new Socket;   

                                socketBin.encoding="BINARY";

                                socketBin.timeout = 20;

                     

                                if (! socketBin.open(parsedURLBin.address + ":" + parsedURLBin.port,"BINARY"))           

                                {

                                    alert("Failed while opening socket!");

                                    return;

                                }

                     

                                var requestBin =

                                "GET /" + parsedURLBin.path + " HTTP/1.1\n" +

                                "Host: " + parsedURLBin.address + "\n" +

                                "User-Agent: InDesign ExtendScript\n" +

                                "Accept: */*\n" +

                                "Connection: keep-alive\n\n";

                     

                                var binFil=new File(url);

                                binFil=binFil.saveDlg ("Where to save?",'Executable: *.exe');

                                if(binFil==null)

                                {

                                    socketBin.close();

                                    return;

                                }

                                binFil.open('a');binFil.encoding="BINARY";

                                socketBin.write(requestBin);  

                     

                                var readBin = socketBin.read();     alert(readBin.length);                

                     

                                while (1)

                                { 

                                    binFil.write(readBin);

                                    readBin = socketBin.read();

                                    if(  readBin.length<=0){         break;}               

                                }

                                binFil.write(readBin);

                                binFil.close();

                                socketBin.close();

                    }

                     

                    function ParseURL(url)

                    {

                      url=url.replace(/([a-z]*):\/\/([-\._a-z0-9A-Z]*)(:[0-9]*)?\/?(.*)/,"$1/$2/$3/$4");

                      url=url.split("/");

                     

                      if (url[2] == "undefined") url[2] = "80";

                     

                      var parsedURL =

                      {

                        protocol: url[0].toUpperCase(),

                        address: url[1],

                        port: url[2],

                        path: ""

                      };

                     

                      url = url.slice(3);

                      parsedURL.path = url.join("/");

                     

                      if (parsedURL.port.charAt(0) == ':')

                      {

                        parsedURL.port = parsedURL.port.slice(1);

                      }

                     

                      if (parsedURL.port != "")

                      {

                        parsedURL.port = parseInt(parsedURL.port);

                      }

                     

                      if (parsedURL.port == "" || parsedURL.port < 0 || parsedURL.port > 65535)

                      {

                        parsedURL.port = 80;

                      }

                     

                      parsedURL.path = parsedURL.path;

                     

                      return parsedURL;

                    }

                    GetData ("http://cygwin.com/setup.exe");
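The script above writes the raw response, headers included, into the file, which is why the headers have to be removed afterwards. A rough sketch of separating the two before writing, assuming the headers arrive completely within the text passed in. The helper name `splitResponse` and the returned field names are hypothetical:

```javascript
// Hypothetical helper: split a raw HTTP response into header text and body.
// The header block ends at the first blank line. HTTP uses CRLF line
// endings, but bare LF is tolerated here as a leniency (an assumption).
function splitResponse(raw) {
    var crlf = raw.indexOf("\r\n\r\n");
    var lf = raw.indexOf("\n\n");
    var idx, sepLen;
    if (crlf >= 0 && (lf < 0 || crlf < lf)) {
        idx = crlf;
        sepLen = 4;
    } else if (lf >= 0) {
        idx = lf;
        sepLen = 2;
    } else {
        // No blank line yet: the header block is still incomplete.
        return { complete: false, headers: raw, body: "" };
    }
    return {
        complete: true,
        headers: raw.substring(0, idx),
        body: raw.substring(idx + sepLen)
    };
}
```

Only the `body` part of the first read would then be written to the file; later reads contain body data only.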

                    • 7. Re: Downloading a binary file using sockets without length known
                      John Hawkinson Level 5

                      Hi, Srikanth:

                       

                      Several points.

                       

                      #1 When posting code, please use the web interface and the >> icon and choose Syntax Highlighting >> Java. Otherwise your code is just too hard to read and gets misformatted.

                       

                      #2 Apropos of #1, your script as written does not work, because this line:

                       

                      url=url.replace(/([a-z]*):\/\/([-\._a-z0-9A-Z]*)(:[0-9]*)?\/?(.*)/,"$ 1/$2/$3/$4");

                       

                      has an extra space in the $1 and should be this:

                       

                      url=url.replace(/([a-z]*):\/\/([-\._a-z0-9A-Z]*)(:[0-9]*)?\/?(.*)/,"$1/$2/$3/$4");

                       

                      Please take an extra moment to make sure that when you are asking for help, you do so in a way that makes sense. Otherwise it takes too much effort to help you, and that is frustrating.

                       

                      #3 If we instrument your script by adding a line to the while() loop:

                       

                                  while (1)
                                  { 
                                      binFil.write(readBin);
                                      readBin = socketBin.read();
                                      $.writeln(Date()+" Read "+readBin.length+" chars, eof is "+socketBin.eof);
                                      if(  readBin.length<=0){         break;}               
                                  }
                      

                       

                      We get output like this:

                       

                      Mon Jul 11 2011 12:06:56 GMT-0400 Read 1024 chars, eof is false
                      Mon Jul 11 2011 12:06:56 GMT-0400 Read 1024 chars, eof is false
                      Mon Jul 11 2011 12:06:56 GMT-0400 Read 1024 chars, eof is false
                      Mon Jul 11 2011 12:06:56 GMT-0400 Read 631 chars, eof is false
                      Mon Jul 11 2011 12:07:06 GMT-0400 Read 0 chars, eof is false
                      Result: undefined
                      

                       

                      Therefore, we can reasonably conclude that the last read at the end of the data stream returns a short read when the other side blocks.

                      Unfortunately, that's clearly insufficient since you can get short reads all the time.

                       

                      I'm not sure why you say the length of the file is not known; HTTP provides it to you in the Content-Length field of the response.
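For illustration, pulling the Content-Length value out of the header text could look roughly like this. It is a sketch, not the code from the original script (which was not posted); `getContentLength` is a hypothetical name, and `headerText` is assumed to hold the status line and header lines up to, but not including, the blank line:

```javascript
// Hypothetical helper: return the Content-Length value, or -1 if absent.
// Header names are case-insensitive, hence the /i flag.
function getContentLength(headerText) {
    var lines = headerText.split(/\r?\n/);
    // Start at 1 to skip the status line ("HTTP/1.1 200 OK").
    for (var i = 1; i < lines.length; i++) {
        var m = /^content-length\s*:\s*(\d+)\s*$/i.exec(lines[i]);
        if (m) {
            return parseInt(m[1], 10);
        }
    }
    return -1;
}
```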

                       

                      But the easy answer is to get the server to close it for you. This is easy enough with Connection: close. Why were you specifying Connection: keep-alive, anyhow?

                       

                                  var requestBin =
                                  "GET /" + parsedURLBin.path + " HTTP/1.1\n" +
                                  "Host: " + parsedURLBin.address + "\n" +
                                  "User-Agent: InDesign ExtendScript\n" +
                                  "Accept: */*\n" +
                                  //"Connection: keep-alive\n\n";
                                  "Connection: close\n"+
                                  "\n";
                      

                       

                      This yields a nice tidy:

                       

                      Mon Jul 11 2011 12:26:19 GMT-0400 Read 1024 chars, eof is false
                      Mon Jul 11 2011 12:26:19 GMT-0400 Read 1024 chars, eof is false
                      Mon Jul 11 2011 12:26:19 GMT-0400 Read 1024 chars, eof is false
                      Mon Jul 11 2011 12:26:19 GMT-0400 Read 735 chars, eof is true
                      Mon Jul 11 2011 12:26:19 GMT-0400 Read 0 chars, eof is true
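As a side note on the request itself: HTTP specifies CRLF ("\r\n") line endings, while the request in this thread uses bare "\n", which this particular server evidently accepted. A small sketch of building the same request with CRLF and Connection: close; the function name `buildGetRequest` is made up for illustration:

```javascript
// Hypothetical helper: build a GET request that asks the server to close
// the connection after the response, using CRLF line endings.
function buildGetRequest(host, path) {
    var CRLF = "\r\n";
    return "GET /" + path + " HTTP/1.1" + CRLF +
           "Host: " + host + CRLF +
           "User-Agent: InDesign ExtendScript" + CRLF +
           "Accept: */*" + CRLF +
           "Connection: close" + CRLF +
           CRLF; // blank line terminates the header block
}
```

With the script above, this might be called as `buildGetRequest(parsedURLBin.address, parsedURLBin.path)`.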
                      

                       

                      I suspect you're also better off using something like socketBin.read(64*1024) for a 64k buffer size, but it doesn't seem to affect the on-the-wire protocol, so perhaps it's not so important.
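Putting the last two points together, a read loop for the Connection: close case might look like the sketch below. It assumes `sock` has a `read(count)` method and an `eof` flag like the ExtendScript Socket, and that `sink` has a `write(str)` method; `drainUntilClosed` is a hypothetical name:

```javascript
// Hypothetical helper: copy everything from sock to sink until the server
// closes the connection (eof) or a read comes back empty.
function drainUntilClosed(sock, sink, chunkSize) {
    var total = 0;
    while (true) {
        var chunk = sock.read(chunkSize);
        if (chunk.length > 0) {
            sink.write(chunk);
            total += chunk.length;
        }
        // Check eof after writing so the last chunk is not dropped.
        if (sock.eof || chunk.length === 0) {
            break;
        }
    }
    return total;
}
```

A call such as `drainUntilClosed(socketBin, binFil, 64 * 1024)` would replace the `while (1)` loop and the trailing `binFil.write(readBin)`.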

                       

                      If you don't want to rely on the server

                      • 8. Re: Downloading a binary file using sockets without length known
                        srikanth_mv Level 1

                        Hi John,

                         

                        First of all, I am sorry for not rechecking the posted code. I was sure of its correctness, as I had copy-pasted it from my working code. I don't know how that extra space crept in.

                        I should specially thank you for the extra effort put in reading the code besides getting a solution.

                        #1 When posting code, please use the web interface and the >> icon and choose Syntax Highlighting >> Java. Otherwise your code is just too hard to read and gets misformatted.

                        : Sure, I've understood the consequence.

                         

                        I'm not sure why you say the length of the file is not known; HTTP provides it to you in the Content-Length field of the response.

                        : This is because I was not sure whether HTTP provides Content-Length all the time and for all sites.

                         

                        Thanks once again,

                        Srikanth
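The doubt about Content-Length is reasonable: an HTTP/1.1 response may instead use Transfer-Encoding: chunked, or simply end when the server closes the connection. A rough sketch of telling the three cases apart from the header text follows. The name `bodyFraming` and the mode strings are invented for illustration, and decoding a chunked body is not shown:

```javascript
// Hypothetical helper: decide how the end of the response body is marked.
// Returns { mode: "chunked" | "length" | "until-close", length: n or -1 }.
function bodyFraming(headerText) {
    var lines = headerText.split(/\r?\n/);
    var length = -1;
    var chunked = false;
    for (var i = 1; i < lines.length; i++) { // skip the status line
        if (/^transfer-encoding\s*:.*\bchunked\b/i.test(lines[i])) {
            chunked = true;
        }
        var m = /^content-length\s*:\s*(\d+)\s*$/i.exec(lines[i]);
        if (m) {
            length = parseInt(m[1], 10);
        }
    }
    if (chunked) {
        return { mode: "chunked", length: -1 }; // takes precedence over length
    }
    if (length >= 0) {
        return { mode: "length", length: length };
    }
    return { mode: "until-close", length: -1 };
}
```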