5 Replies Latest reply on Sep 25, 2016 4:12 AM by Vamitul

    Splitting a CSV with a Regular Expression

    Brian Pylant Level 1

      I am working on a script that will read CMYK values from a CSV file and add swatches to the Swatches panel. I can easily split the incoming CSV file by commas (.split(",");) but I cannot seem to get a regular expression working to split by both commas and new lines.

      Here’s the snippet of code I have at the moment that is not working:

      var fileIn = new File("/Users/brianp/Desktop/AVL-PMS-TEMP.csv");
      fileIn.open();
      var csvIn = fileIn.read();
      fileIn.close();
      
      var regEx = "/,|\\n/";
      csvRecords = csvIn.split(regEx);  
      

       

      Am I barking up the wrong tree here?

       

      I could edit the CSV ahead of time to find all newlines and replace them with commas, but it seems more elegant to adapt my code to the CSV file as supplied.

       

      Thanks in advance!

        • 1. Re: Splitting a CSV with a Regular Expression
          Peter Kahrel Adobe Community Professional & MVP

          csvRecords = csvIn.split(/[,\n\r]/);

           

          That's the shortcut. To predefine the regex, leave out the quotes:

           

          var regEx = /,|\n/;

           

          Better to add the carriage return (\r) as well.

           

          Use character classes [...] instead of alternatives (...|...) where you can; they're more efficient (allegedly).

           

          Peter
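A minimal sketch of the difference between the quoted pattern and the regex literal, run against a made-up string rather than a real file. The sample data and variable names other than csvIn are assumptions for illustration. It also shows that if the text contains Windows line endings (\r\n), the character class splits on \r and \n separately and leaves an empty entry, whereas an alternation that matches \r\n first does not.

```javascript
// Hypothetical sample data standing in for the contents of the CSV file.
var csvIn = "Name,C,M,Y,K\r\nRed,0,100,100,0\nBlue,100,100,0,0";

// A quoted pattern is an ordinary string: split() looks for the literal
// characters /,|\n/ in the text, finds none, and returns one big element.
var viaString = csvIn.split("/,|\\n/");

// A character class splits on any single comma, \n, or \r.
// A \r\n pair therefore produces one empty string between the two breaks.
var viaClass = csvIn.split(/[,\n\r]/);

// Matching \r\n as a unit (before \n and \r alone) avoids the empty entry.
var viaAlternation = csvIn.split(/,|\r\n|\n|\r/);
```

Here viaString has 1 element, viaClass has 16 (one of them empty), and viaAlternation has 15.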

          • 2. Re: Splitting a CSV with a Regular Expression
            Loic.Aigon Adobe Community Professional

            For what it's worth, I tend to split records line by line given that the separator is known:

             

            var f = File ( "/myCSV.csv" );
            var headers = [], rows = [];
            var sep = "\t"; //a tab character for separator in this case
            f.open( 'r');
            headers = f.readln().split(sep);
            while (  !f.eof ) {
              rows[ rows.length ] = f.readln().split(sep);
            }
            f.close();
            
            
            alert( "Here are the headers:\r"+headers+"\rand there are "+rows.length+" rows");
            

             

            Loic

            Ozalto | Productivity Oriented - Loïc Aigon

            • 3. Re: Splitting a CSV with a Regular Expression
              Brian Pylant Level 1

              @Peter: Awesome, thank you! I knew it was probably something silly like quotes / no quotes. < #embarrassed > I'm very much a newbie at JavaScript / ExtendScript, learning slowly as I go.

               

              (As an aside, the Adobe documentation is very... obtuse. As a beginner a lot of it makes absolutely no sense to me! The Jongware version is slightly better, but a lot of it is still not very clear / obvious to me... a lot of what I've learned has been through somewhat painful trial and error! )

               

              @Loic: ahhh, I hadn't thought about doing it that way! I know the CSV I'm using has ten records per line, and that [1] is the swatch name and [6][7][8] and [9] are the CMYK values, so I built my loop around that math. But of course I'd have to re-do the code if given a CSV that was structured differently... this is for a specific project that I'm in the middle of right now, but when I get a chance I'll go back and try to rework my code to read the file differently, and hopefully make it more flexible in the process!

               

              Thanks so much, both of you!

              Brian
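The index math described above could look roughly like the following. This is only a sketch: the sample values, the absence of a header line, and the names FIELDS_PER_LINE and swatches are assumptions, and only the positions ([1] for the name, [6] to [9] for CMYK) come from the post.

```javascript
// Hypothetical flat array, standing in for a two-line CSV that has already
// been split on commas and line breaks (ten fields per line, no header line).
var csvRecords = [
  "1", "Sample Red",  "x", "x", "x", "x", "0",   "100", "100", "0",
  "2", "Sample Blue", "x", "x", "x", "x", "100", "80",  "0",   "10"
];

var FIELDS_PER_LINE = 10; // assumed fixed width of every line
var swatches = [];

// Step through the flat array one line (ten fields) at a time.
for (var i = 0; i + FIELDS_PER_LINE <= csvRecords.length; i += FIELDS_PER_LINE) {
  swatches.push({
    name: csvRecords[i + 1],          // field [1] is the swatch name
    cmyk: [
      Number(csvRecords[i + 6]),      // C
      Number(csvRecords[i + 7]),      // M
      Number(csvRecords[i + 8]),      // Y
      Number(csvRecords[i + 9])       // K
    ]
  });
}
```

This approach depends on every line having exactly ten fields. Any stray empty entry, such as the one left by splitting a \r\n pair with /[,\n\r]/, shifts every index after it.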

              • 4. Re: Splitting a CSV with a Regular Expression
                Loic.Aigon Adobe Community Professional

                But of course I'd have to re-do the code if given a CSV that was structured differently...

                That you can easily avoid. If you know the header label, you can reach it without knowing its index with minor adjustments. Here is how I do it generally:

                 

                
                
                
                var f = File ( Folder.desktop+"/sample.csv" );  
                var headers = [], rows = [], header; 
                
                
                //Setting used separator
                var sep = "\t"; //a tab character for separator in this case (use "," for a comma-separated file)
                
                
                //Opening file for reading
                f.open( 'r');  
                
                
                //Getting headers on first line
                headers = f.readln().split(sep);
                
                
                //Storing each header's index in an object so we can later retrieve the index by header name
                var n = headers.length;
                var db = {};
                while ( n-- ) {
                  header = headers[n];
                  db[header] = n;
                }
                
                
                //Storing rows
                while (  !f.eof ) {  
                  rows[ rows.length ] = f.readln().split(sep);  
                }  
                
                
                //Closing file
                f.close();  
                
                
                //Now we can get access to a row[header] value without concern for its index, only its name.
                alert( "The Scientific name for the 3rd record is "+rows[2][db["Scientific name"]] );
                

                 

                Loic
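The same header-lookup idea can be tried without a file on disk. The sketch below works on a string instead of a File object so it can run anywhere; parseDelimited, the sample text, and the blank-line skip are assumptions for illustration, not part of the code above.

```javascript
// Hypothetical helper: splits delimited text into headers, a header-to-index
// map, and an array of rows. Not an ExtendScript API.
function parseDelimited(text, sep) {
  var lines = text.split(/\r\n|\n|\r/);
  var headers = lines[0].split(sep);

  // Map each header label to its column index.
  var db = {};
  for (var i = 0; i < headers.length; i++) {
    db[headers[i]] = i;
  }

  // Collect the remaining lines as rows, skipping blank lines
  // (for example the one left by a trailing line break).
  var rows = [];
  for (var j = 1; j < lines.length; j++) {
    if (lines[j] === "") continue;
    rows.push(lines[j].split(sep));
  }

  return { headers: headers, db: db, rows: rows };
}

// Made-up sample data with a tab separator.
var sample = "Common name\tScientific name\n" +
             "Lion\tPanthera leo\n" +
             "Tiger\tPanthera tigris\n" +
             "Wolf\tCanis lupus\n";

var table = parseDelimited(sample, "\t");
var third = table.rows[2][table.db["Scientific name"]]; // "Canis lupus"
```

If a header label is misspelled or missing, table.db[label] is undefined and the lookup returns undefined, so it may be worth checking for that before using the value.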