16 Replies Latest reply on Dec 14, 2016 12:58 PM by pixxxel schubser

    Trim Filename

    Babymac08 Level 1

      I have a script that I'm running and it's placing an image in a template... All of the files I will be placing will start with a 6 digit number... I need to be able to grab just those six digits and enter them as text... I currently have it set to provide a dialog... but I want to automate it to grab the first 6 characters automatically...

      Any help is greatly appreciated....

       

      Here is an excerpt of the script containing the portion I need to change. It currently removes the extension.

       

      //Place the PS # on PS Line

      var placed_image = (app.activeDocument.placedItems[0].file.name);

      var pointTextRef = app.activeDocument.textFrames.add();

      var trim = placed_image;

      trim = trim.replace(/(.*)\.[^.]+$/, "$1"); // This is the line I need changed to keep only the first 6 characters/digits

      pointTextRef.contents = trim;

      pointTextRef.textRange.characterAttributes.size = 12;

      pointTextRef.top = -661;

      pointTextRef.left = 142;

        • 1. Re: Trim Filename
          moluapple Level 4

          Just change that line to: trim = trim.replace(/(^\d{6}).*/, '$1');

          • 2. Re: Trim Filename
            Babymac08 Level 1

            Thank you so much... Would you be kind enough to explain what each of the elements in the () do... I'm still new to scripting and trying to learn as much as I can...

             

            Thanks

            • 3. Re: Trim Filename
              williamadowling Level 4

              I've been coding for a few years now (i learned exactly the way you're learning right now) and regex still makes my head hurt. So i'll do my best to parse this out with you.

               

              so the outer parentheses just indicate that everything inside of them should be passed to the replace() method. So we'll just look inside of those.

               

              there are two arguments being passed into the replace function:

              /(^\d{6}).*/

              and

              '$1'

               

              The first argument:

              the "regex" or "regular expression" is denoted by all of the characters inside the / / characters. the first / indicates "this is the start of my regular expression" and the second one indicates "this is the end". so let's dig inside of those. We're left with:

               

              (^\d{6}).*

               

              When looking at regular expressions, "groups" (or some other more appropriate technical term) are denoted by parentheses. So it's like saying "find me these elements all in a row". So this will not find those same elements in a different order or if there are other undesirable elements between them. here's an example in english: (abcd) would match "xyzabcdpq" but it would NOT match "xyzabqcd" because that "q" is in there which mucks up the group.
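The (abcd) example can be checked with the regex test() method, which returns true or false. This is only a minimal sketch, and the variable names are made up for illustration:

```javascript
var group = /(abcd)/;

// "abcd" appears intact inside the string, so this is true.
var hit = group.test("xyzabcdpq");

// The "q" splits the sequence, so this is false.
var miss = group.test("xyzabqcd");
```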

               

              so let's take a look at the 'group':

               

              (^\d{6})

               

              the elements of this group are:

              ^ : means "at the beginning" or "nothing can come before this"

              \d :  means any single digit, 0-9

              {6} : means repeat the previous thing 6 times

               

              now let's put it together.

               

              At the very beginning of the string, find any digit 0-9 six times in a row

               

              so let's look at some examples that would be true:

               

              123456.anything.else

              355329

              999999/directory/something

               

              These 3 examples would all be true based on the regex (^\d{6}) because they all begin with 6 digits.

               

              Now some examples that would be false:

               

              X123456.anything.else    //this is false because the 6 digits are not at the beginning

              355Yz293      //this is false because there are alphabetical characters interrupting the 6 digits. they need to be all in a row

              4321/directory/something     //this would be false because there are only 4 digits at the beginning instead of 6

               

              These are just a few of the nearly infinite ways that a regex could not match your pattern, but hopefully this helps you see how it works on a basic level.
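The six example strings above can be run through test() to confirm which ones match. A small sketch, assuming plain JavaScript with no Illustrator objects involved (the array and variable names are hypothetical):

```javascript
var startsWithSixDigits = /^\d{6}/;

var samples = [
    "123456.anything.else",
    "355329",
    "999999/directory/something",
    "X123456.anything.else",
    "355Yz293",
    "4321/directory/something"
];

// Collect one true/false answer per sample, in order.
var results = [];
for (var i = 0; i < samples.length; i++) {
    results.push(startsWithSixDigits.test(samples[i]));
}
// results is [true, true, true, false, false, false]
```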

               

              Now let's look at the last part of the regex:

               

              .*

               

              in regex, a dot/period ( . ) means "any single character"

              and the asterisk/star ( * ) means repeat the previous thing as many times as necessary (if necessary. repeating 0 times is acceptable also).

               

              Essentially .* means "anything else from here on out is fine".

              .* could be nothing at all.

              but it could also be blahabc31.kfjsakjfsASE3  ka295*&#@*&$)etc etc.

               

              In conclusion:

               

              the regex /^\d{6}.*/ means if the string starts with 6 digits and finishes with anything else, then return true.
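One way to see what the .* contributes is to compare what match() returns with and without it. This is just a sketch using a made-up file name; element [0] of the result is the full text the regex matched:

```javascript
var fileName = "123456_otherText.ext";

// With .* the match runs to the end of the string.
var whole = fileName.match(/^\d{6}.*/)[0];      // "123456_otherText.ext"

// Without it the match stops after the six digits.
var digitsOnly = fileName.match(/^\d{6}/)[0];   // "123456"
```

Because the pattern with .* covers the entire string, replace() swaps out the entire string rather than only the digits.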

               

              The second argument in the replace() function:

               

              Unfortunately, Someone else will have to comment on the mysterious "$1" argument. I'm not familiar with that syntax. I don't know where the information for $1 comes from. But i can tell you that the second argument in the replace() method is the "new value".

               

              syntax is as follows:

              trim.replace(searchPattern, newValue);

               

              So an example would be the following. Let's say you wanted to find any strings that start with 6 digits, and change the entire string to the word "firetruck" (don't know why you'd want to do that, but it gets the point across).

               

              var str1 = "123456_otherText.ext"

              var str2 = "2523X32_someFile.ext"

               

              //now lets say we want to run the replace function on each of these strings.

               

              var newStr1 = str1.replace(/(^\d{6}).*/, "firetruck"); //newStr1 will now be "firetruck"

              var newStr2 = str2.replace(/(^\d{6}).*/, "firetruck"); //newStr2 will be "2523X32_someFile.ext" since the regex did not match the string so nothing is replaced and the string simply remains exactly what it was before.

               

              hope all this gibberish helps. =)

              • 4. Re: Trim Filename
                Babymac08 Level 1

                If you're not already you should become a scripting professor... Thanks so  much for taking the time to explain... This helps tremendously...

                • 5. Re: Trim Filename
                  williamadowling Level 4

                  i'm still in school myself learning how to program. It's just that, like i said, i learned how to do all of this exactly how you're doing it now. On these very same forums, trying to do one small task at a time. So I'm trying to explain things how they would have made sense to me the first time i saw them.

                   

                  I vividly remember once being VERY confused when somebody provided a snippet for me that included a for loop that used the loop variable "c". So it looked like this:

                  for(var c=0; c<someVar.length; c++)

                   

                  And i was ready to simply move on because at the time i thought they were using "c++" the programming language, instead of javascript.

                   

                  None of this is any different than learning a foreign language. You need to learn the words.. then you need to learn punctuation.. Then you need to learn the sentence structure. Only then can you write a meaningful paragraph that could be understood by a native speaker of that language. We just happen to be trying to write in a language where the computer is the native speaker, and computers are notoriously stubborn when it comes to adapting. If you were writing to a native spanish speaker and you wrote "arroyo" (a small river) instead of "arrollo" (to run over or knock down) they would be able to discern what you meant by context. Computers can't do that and they'll just call you dumb and stop working. That's why it's so difficult, especially when you're looking at a string of nonsense characters like /(^\d{6}).*/

                   

                  All that to say, you're doing great and learning all of this very quickly. keep your aspirations high and just remember to break things down as small as you can. coding is simply about stringing commands together in the right order. So if the big picture is fuzzy or confusing, simply break it down as far as you can and tackle each step one at a time. You'll be speaking fluently before you know it. ;-)

                  • 6. Re: Trim Filename
                    pixxxel schubser Level 5

                    Hi Babymac08,

                    there is IMHO no need for replace()

                     

                    Give this a try:

                    var aDoc = app.activeDocument;
                    var placed_image = aDoc.placedItems[0].file.name;
                    var trim = placed_image.match(/^\d{6}/);
                    if (trim != null) {
                        var pointTextRef = app.activeDocument.textFrames.add();
                        pointTextRef.contents = trim[0];
                        pointTextRef.top = -661;
                        pointTextRef.left = 142;
                        pointTextRef.textRange.characterAttributes.size = 12;
                    } else {
                        alert("no match");
                    }
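The match() part can also be tried outside Illustrator. The sketch below wraps it in a hypothetical helper (leadingSixDigits is not part of the script above) to show that match() returns an array when the pattern is found and null when it is not:

```javascript
// Hypothetical helper: returns the leading six digits, or null if absent.
function leadingSixDigits(name) {
    var found = name.match(/^\d{6}/);
    return found ? found[0] : null;
}

var ok = leadingSixDigits("123456_logo.jpg");     // "123456"
var none = leadingSixDigits("logo_123456.jpg");   // null
```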
                    

                     

                    Have fun

                     

                    • 8. Re: Trim Filename
                      williamadowling Level 4

                      Ah. i figured it was something like that that was going on.

                       

                      Just so I'm clear, it's referring to the group: (^\d{6})   correct? and the "1" refers to the first group in the test pattern? so if you had 3 different groups involved and you wanted to reinsert the third group you'd use $3 instead?
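For what it's worth, that reading agrees with how JavaScript's replace() treats $1, $2 and $3 in the replacement string: each one stands for the text captured by the group with that number. A small sketch with a made-up date string:

```javascript
var date = "2016-12-14";
var pattern = /(\d{4})-(\d{2})-(\d{2})/;

var yearOnly = date.replace(pattern, "$1");          // "2016"
var dayOnly = date.replace(pattern, "$3");           // "14"
var reordered = date.replace(pattern, "$3/$2/$1");   // "14/12/2016"
```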

                      • 9. Re: Trim Filename
                        Babymac08 Level 1

                        Here's a question taking this a step deeper... Is there a way to factor in a separator of some kind...

                         

                        (ie: File name would be... "123456 - Size 4 x 8 - Description.jpg(or other ext)")

                         

                        Can we break down and display the elements between each "-" So now we have 3 different fields we'd place..?

                        (ie: Field 1 = 123456

                        Field 2 = Size 4 x 8

                        Field 3 = Description

                         

                        file extension being removed?

                        • 10. Re: Trim Filename
                          williamadowling Level 4

                          Of course! That's the beauty of regex. If you can dream it.. you can write a regex for it. However, things are going to look increasingly ugly and difficult to read. but again, as long as you break things down into their component parts, it's all simple. Give me a little bit and i'll write something up and explain what's going on.

                          • 11. Re: Trim Filename
                            williamadowling Level 4

                            You want to find:

                            6 digits in a row at the beginning of your filename (good news, we've already done that part! woop!)

                             

                            the word "Size" followed by a space, one or two digits, followed by a space, and one or two digits

                             

                            some string of letters followed by a dot and an extension.

                             

                             

                            This is a perfect opportunity to look closer at 'groups' of regex. If you know that your file names will always be delimited by "(space)-(space)" then we just need to create a group for each of the fields in between those delimiters. Here's the general layout we'll be looking for.

                             

                             

                            (^group1) - (group2) - (group3$)

                             

                            Notice the caret/hat at the beginning of the line. As we learned, that means there can't be anything before this group/expression. Also notice the $ at the end. This means nothing can be after this group/expression.

                             

                            Here's one way that we can write the regex that you're looking for (though i'm sure someone around here will have a much more concise way to write it):

                            /(^\d{6}) - ([Ss]ize \d{1,2} x \d{1,2}) - ([a-zA-Z0-9]*\.[a-zA-Z]{3,4}$)/

                             

                            So this isn't perfect at all. And it is susceptible to variations in the filename. So you'll need to be sure your filenames are consistently formatted properly.

                             

                            Let's break it down into its component parts:

                             

                            (^\d{6}) - we already know what this one does.

                             

                            After that you see a space, a dash, and another space. This means, "Literally, i want a space, a dash and a space. no funny business."

                            This is one place where things can easily break down. If you encounter an underscore instead of a dash. this breaks. So depending upon your needs, you may need to make this a little more robust to be safe.
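One possible way to make the delimiter more forgiving is a character class plus optional whitespace. This is only a sketch, and it assumes the fields are separated by either a dash or an underscore, with or without surrounding spaces (that assumption is not from the original file names):

```javascript
// Assumption: "-" or "_" separates the fields, spaces around it are optional.
var delimiter = /\s*[-_]\s*/;

var withDash = "123456 - Size 4 x 8".split(delimiter);
// withDash is ["123456", "Size 4 x 8"]

var withUnderscore = "123456_Size 4 x 8".split(delimiter);
// withUnderscore is ["123456", "Size 4 x 8"]
```

The trade-off is that a dash or underscore inside a field would now be treated as a separator too.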

                             

                            the next group is:

                             

                            ([Ss]ize \d{1,2} x \d{1,2}) - Now we're seeing some new stuff.

                             

                            Let's start with the square brackets. We've already seen parentheses (indicates a group) and curly brackets (indicates repetition factor).

                            Square brackets indicate a single character. it means "a single character can match anything inside these brackets".

                            In the first example of square brackets in this line, we see: [Ss]

                            This means, the first letter in this group can either be a capital S or a lowercase s.

                            Then following the s or S, there should be the letters "ize".

                             

                            Next we have a literal space.

                             

                            then the familiar \d followed by a repetition range: {1,2}.

                            We've seen the curly brackets, but last time we were looking explicitly for exactly 6 digits. When you use 2 numbers in the curly brackets separated by a comma. It means "between these two numbers".

                            so imagine we used: \d{1,5}

                            that would match any of the following:

                            2, 63, 832, 4038, or 12456

                            but as a whole it would NOT match (six digits is one too many):

                            636993
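A quick sketch of the range quantifier. Note the anchors: ^ and $ are added here (they are not in the \d{1,5} example above) so that the whole string has to fit the range. Without them, the regex would simply find the first five digits of a longer number:

```javascript
// Anchored: the entire string must be one to five digits.
var upToFive = /^\d{1,5}$/;

var fourDigits = upToFive.test("4038");     // true
var fiveDigits = upToFive.test("12456");    // true
var sixDigits = upToFive.test("636993");    // false

// Unanchored: the match stops after five digits.
var partial = "636993".match(/\d{1,5}/)[0]; // "63699"
```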

                             

                            followed again by a literal space, the lowercase letter "x" and another literal space.

                            You can make this safer and more robust by using [xX] so that it will work for lowercase or capital.

                             

                            Then we have another \d{1,2}

                            So again, this expression matches any number between 0 and 99

                             

                            Now, onto the last group. Wooh! we see some more new stuff!

                             

                            ([a-zA-Z0-9]*\.[a-zA-Z]{3,4}$)

                             

                            First off, remember that square brackets means a single character, even though there's a lot of gibberish inside.

                            [a-zA-Z0-9] simply means that this single character can be any alphanumeric character, but not a special/metacharacter.

                            This is followed by the * which means, repeat as many times as necessary until you reach something that's not an alphanumeric character.

                            In our case, the next time you encounter something that's not alphanumeric should be the dot before the extension..

                            Hence the next character, a literal dot/period (you have to use the escape character/backslash (\) because a single dot simply means "any character").

                            I want to take this opportunity to point out the difference between:

                            .*

                            and

                            *\.

                             

                            Remember in my first explanation where we used .* to indicate "match any characters from here on out". That's because the * modifies the character that came before it.

                            In our case here, the * belongs to the [a-zA-Z0-9] before it, not to the \. that follows, so we only want a single literal dot.
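The difference is easy to see by running both patterns against the same text. A minimal sketch with a made-up sample string:

```javascript
var sample = "Description.jpg";

// Dot then star: any character, repeated, so everything is consumed.
var everything = sample.match(/.*/)[0];              // "Description.jpg"

// Star then escaped dot: alphanumerics repeated, then one literal dot.
var upToDot = sample.match(/[a-zA-Z0-9]*\./)[0];     // "Description."
```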

                             

                            continuing on:

                             

                            [a-zA-Z]{3,4}$

                             

                            again, match a single character between a and z irrespective of case, repeat 3 or 4 times (to account for jpg and jpeg possibilities)

                            Then lastly, we see the dollar sign which simply means "this is the end of the line". Therefore, if everything matched up until the extension, but then there was more text after it, the match would not work.
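Here is a small sketch of just the extension part, showing the effect of {3,4} and the closing $. The sample names are hypothetical:

```javascript
var extension = /\.[a-zA-Z]{3,4}$/;

var jpg = extension.test("Description.jpg");      // true, 3 letters
var jpeg = extension.test("Description.jpeg");    // true, 4 letters
var ai = extension.test("Description.ai");        // false, only 2 letters
var extra = extension.test("Description.jpg2");   // false, text after the extension
```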

                             

                             

                            So, finally, to your last question.. Removing the extension.

                             

                            This is where we get back to the aforementioned "reinserting text by capturing groups". (remember the second argument of the replace() function: "$1"); (thanks moluapple for teaching me something i didn't know =) )

                             

                            Now, i wrote everything above before i really understood what you wanted to do. So I'm going to rewrite the expression slightly to accommodate the end goal.

                             

                            /(^\d{6}) - ([Ss]ize [\d]{1,2} x [\d]{1,2}) - ([a-zA-Z0-9]*)(\.[a-zA-Z]{3,4})$/

                             

                            So all i really did was split up the last group. This way we can capture the extension to remove it.

                             

                            Now, have a look at this code snippet that will pull the desired information out and make a text frame for each.

                            pixxxel schubser was completely right above, if you're just trying to pull the info from the filename, replace is not necessary. you can save the results of the match() function to a variable and then access the elements by index.

                             

                            function test()

                            {

                                var docRef = app.activeDocument;

                                var fileName = "123456 - Size 4 x 8 - Description.jpg";

                             

                                var pat = /(^\d{6}) - ([Ss]ize [\d]{1,2} x [\d]{1,2}) - ([a-zA-Z0-9]*)(\.[a-zA-Z]{3,4}$)/;

                             

                                if(pat.test(fileName))

                                {

                                    var frameContents = fileName.match(pat);

                                    var frame1 = docRef.textFrames.add();

                                    frame1.contents = frameContents[1];

                                    frame1.left = 50;

                                    frame1.top = -50;

                             

                                    var frame2 = docRef.textFrames.add();

                                    frame2.contents = frameContents[2];

                                    frame2.left = 50;

                                    frame2.top = -75;

                             

                                    var frame3 = docRef.textFrames.add();

                                    frame3.contents = frameContents[3];

                                    frame3.left = 50;

                                    frame3.top = -100;

                                }

                            }

                            test();
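The parsing step can also be separated from the Illustrator calls, which makes it testable on its own. This is a sketch under that assumption: parseFileName and its field names are hypothetical, and it calls match() once and checks for null instead of running test() first:

```javascript
// Hypothetical helper: returns the captured fields, or null if the
// file name does not follow the "123456 - Size 4 x 8 - Description.jpg" layout.
function parseFileName(fileName) {
    var pat = /^(\d{6}) - ([Ss]ize \d{1,2} x \d{1,2}) - ([a-zA-Z0-9]*)(\.[a-zA-Z]{3,4})$/;
    var m = fileName.match(pat);
    if (m === null) {
        return null;
    }
    // m[0] is the whole match; m[1] to m[4] are the four groups.
    return { id: m[1], size: m[2], description: m[3], extension: m[4] };
}

var fields = parseFileName("123456 - Size 4 x 8 - Description.jpg");
// fields.id is "123456", fields.size is "Size 4 x 8",
// fields.description is "Description", fields.extension is ".jpg"

var bad = parseFileName("123456_Size 4 x 8_Description.jpg");   // null
```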

                            • 12. Re: Trim Filename
                              pixxxel schubser Level 5

                              Hi Babymac08,

                              there is only one rule:

                              You have to describe exactly all kinds of variations of your file names. Otherwise every regex will be too complicated, or it will produce false positive or false negative hits.

                              For an example:

                              It makes a big difference whether your file name is

                              123456 - Size 4 x 8 - Description.jpg (with spaces)

                              or

                              123456-Size4x8-Description.jpg (without spaces)

                               

                              And IMHO it's never good to use spaces in file names.

                               

                              But anyway - both cases are possible with

                              // for file names like 
                              // 123456 - Size 4 x 8 - Description.jpg
                              // or
                              // 123456-Size4x8-Description.tiff
                              var aDoc = app.activeDocument;
                              var placed_image = aDoc.placedItems[0].file.name;
                              var trim = decodeURI (placed_image).replace (/\..{2,4}$/,"").replace (/ ?- ?/g,"-").split ("-");
                              if (trim != null && trim.length == 3) {
                                  var pointTextRef = app.activeDocument.textFrames.add();
                                  pointTextRef.contents = trim[0]+"\n"+trim[1]+"\n"+trim[2];
                                  pointTextRef.top = -661;
                                  pointTextRef.left = 142;
                                  pointTextRef.textRange.characterAttributes.size = 12;
                              } else {
                                  alert("no match");
                              }
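The string handling in the script above can be tried without Illustrator. In this sketch, splitFileName is a hypothetical wrapper around the same replace/split chain, and the first sample uses %20 on the assumption that the file name arrives URI-encoded (which is presumably why decodeURI is there):

```javascript
// Hypothetical wrapper: returns the three fields, or null if the count is off.
function splitFileName(fileName) {
    var parts = decodeURI(fileName)
        .replace(/\..{2,4}$/, "")   // drop the extension
        .replace(/ ?- ?/g, "-")     // normalize " - " to "-"
        .split("-");
    return parts.length === 3 ? parts : null;
}

var spaced = splitFileName("123456%20-%20Size%204%20x%208%20-%20Description.jpg");
// spaced is ["123456", "Size 4 x 8", "Description"]

var tight = splitFileName("123456-Size4x8-Description.tiff");
// tight is ["123456", "Size4x8", "Description"]

var tooFew = splitFileName("123456-Size4x8.jpg");   // null, only two fields
```

A dash inside the description would produce a fourth part and be rejected, so this approach still depends on consistent file names.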
                              

                               

                              Have fun

                               

                              • 13. Re: Trim Filename
                                Babymac08 Level 1

                                Guys... thanks so much... This scripting thing is starting to become really fun... and as I'm learning the possibilities are endless... I've learned so much in the last couple of weeks... and I'm building scripts to do several tasks that are repetitive in my job... and it's going to save tons of time in the end....

                                 

                                Thanks so much to both of you for taking the time to walk through and explain in so much detail... I'm confident... many more will find the instructions helpful...

                                • 14. Re: Trim Filename
                                  pixxxel schubser Level 5

                                  Sorry williamadowling,

                                  you are new to regex, but this is cumbersome

                                  williamadowling wrote:

                                   

                                  So I'm going to rewrite the expression slightly to accommodate the end goal.

                                   

                                  /(^\d{6}) - ([Ss]ize [\d]{1,2} x [\d]{1,2}) - ([a-zA-Z0-9]*)(\.[a-zA-Z]{3,4})$/

                                   

                                  Better use something like this:

                                  (?i)(^\d{6}) ?- ?(size[/\dx ]*) ?- ?(\w*)(\..{2,4}$)

                                  (?i) ignore case

                                  ? optional (zero or one of the preceding item)

                                  \w word characters (letters, digits, underscore)

                                  {2,4} also covers short file extensions like ai

                                  Have fun
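A note for anyone pasting this into a script: the inline (?i) comes from other regex flavors, and as far as I know JavaScript regex literals do not accept it. The usual equivalent is a trailing i flag. The sketch below is an adaptation under that assumption, with the / inside the character class escaped so it can sit in a regex literal:

```javascript
var pat = /(^\d{6}) ?- ?(size[\/\dx ]*) ?- ?(\w*)(\..{2,4}$)/i;

var spaced = "123456 - Size 4 x 8 - Description.jpg".match(pat);
var tight = "123456-Size4x8-Description.ai".match(pat);

// The character class includes a space, so in the spaced variant
// group 2 ends with a trailing space: "Size 4 x 8 ".
var size = spaced[2].replace(/\s+$/, "");   // "Size 4 x 8"

// tight[2] is "Size4x8" and tight[4] is ".ai"
```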

                                   

                                  • 15. Re: Trim Filename
                                    williamadowling Level 4

                                    no apology necessary. I knew there'd be a better way to write it. That's the tough thing about regex. there are a million "correct" answers.

                                     

                                    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]

                                    works just as well as

                                    \d{10}

                                    but obviously one is easier to type.

                                     

                                    Since Babymac08 is brand new to this, i opted for the longer, but perhaps more clear/understandable syntax. though yours is undoubtedly more thorough and accounts for potential lack of spaces where mine did not.

                                    • 16. Re: Trim Filename
                                      pixxxel schubser Level 5

                                      Everything ok.

                                       

                                       

                                      My favorite solution should be more flexible than a pure regex. See the code in my post #12.