12 Replies Latest reply on Nov 21, 2010 3:33 PM by Marc Autret

    JS: How to remove duplicate items from an Array?

    Marcos Suárez Level 1

      JS: How to remove duplicate items from an Array?

      I guess there will be a simple way ...

        • 1. Re: JS: How to remove duplicate items from an Array?
          Marijan Tompa [tomaxxi] Level 4

          Hey!

           

          Maybe something like this:

          Array.prototype.unique = function (){
              var r = new Array();
              o:for(var i = 0, n = this.length; i < n; i++){
                  for(var x = 0, y = r.length; x < y; x++){
                      if(r[x]==this[i]) continue o;}
                  r[r.length] = this[i];}
              return r;
          }
          

           

          Usage:

           

          var myArray = ["a","b","c","c","a","d","b","b"];
          alert(myArray.unique());
          

           

          Hope that helps.

           

          --

          tomaxxi

          http://indisnip.wordpress.com/

          • 2. Re: JS: How to remove duplicate items from an Array?
            Harbs. Level 6

            I use Peter Kahrel's arrayCompress function for string arrays...

             

            function arrayCompress(array){
                var str = array.sort().join('\r')+'\r';
                str = str.replace(/([^\r]+\r)(\1)+/g,'$1');
                str = str.replace(/\r$/,'');
                return str.split('\r');
            }
            

             

            Harbs

            1 person found this helpful
            • 3. Re: JS: How to remove duplicate items from an Array?
              Marcos Suárez Level 1

              OK...

              Very usefull funtions...!!!

              THANKS...!!!

              • 4. Re: JS: How to remove duplicate items from an Array?
                Marc Autret Level 4

                @Marijan

                 

                Your implementation is OK but it has a quadratic time complexity -- O(n²) -- because of the loop within the loop.

                 

                On large arrays (+1000 items), the "Hash Sieving” method is faster -- O(2n). Here is a functional implementation used for string arrays:

                 

                function unique(/*str[]*/ arr)
                {
                     var o={},
                          r=[],
                          n = arr.length,
                          i;
                
                     for( i=0 ; i<n ; ++i )
                          o[arr[i]] = null;
                
                     for( i in o )
                          r.push(i);
                
                     return r;
                }
                
                // TEST:
                alert( unique(["a","b","c","c","a","d","b","b"]) );
                // => a, b, c, d
                

                 

                 

                @Harbs

                 

                The arrayCompress function is very original, but I am skeptical about its performance on large arrays.

                 

                Also, there is two important issues to mention:

                1) the original array is not preserved, because array.sort() reorders its own elements -- we can fix this using array.concat().sort().

                2) the function does not work if any of the supplied strings already contains "\r".

                 

                @+

                 

                Marc

                1 person found this helpful
                • 5. Re: JS: How to remove duplicate items from an Array?
                  Marijan Tompa [tomaxxi] Level 4

                  Hey Marc!

                   

                  Yes, I know that, and I already have your function,

                  but I didn't wanted to use it without your permission

                   

                  --

                  tomaxxi

                  • 7. Re: JS: How to remove duplicate items from an Array?
                    Marc Autret Level 4

                    Marijan Tompa (tomaxxi) wrote:

                     

                    Yes, I know that, and I already have your function,

                    but I didn't wanted to use it without your permission

                     

                    I appreciate your scruples, but I really didn't invent anything:

                     

                    http://www.google.com/search?q="Hash+Sieving"+JavaScript

                     

                     

                     

                    Marc

                    • 8. Re: JS: How to remove duplicate items from an Array?
                      Harbs. Level 6

                      Good points.

                       

                      Nice function! I like the way it uses native features of Object to solve this problem...

                       

                      It looks like the one you posted has an error. It creates null objects?

                       

                      This looks like it's the correct (prototype) version. No?

                       

                      Array.prototype.unique = function() {
                      
                         var o = {}, i, l = this.length, r = [];
                      
                         for(i=0; i<l;i++) o[this[i]] = this[i];
                      
                         for(i in o) r.push(o[i]);
                      
                         return r;
                      };
                      
                      

                       

                      Or function version:

                       

                      function unique (array) {
                         var o = {}, i, l = array.length, r = [];
                         for(i=0; i<l;i++) o[array[i]] = array[i];
                         for(i in o) r.push(o[i]);
                         return r;
                      };
                      

                       

                      It might be possible to make this work on more types of objects by making this line something like this:

                       

                         for(i=0; i<l;i++) o[array[i].toString()] = array[i];

                       

                      Harbs

                      • 9. Re: JS: How to remove duplicate items from an Array?
                        Harbs. Level 6

                        Harbs. wrote:

                        It looks like the one you posted has an error. It creates null objects?

                        Ah nevermind. I didn't read your code well enough...

                        • 10. Re: JS: How to remove duplicate items from an Array?
                          Marc Autret Level 4

                          Harbs. wrote:

                           

                          Harbs. wrote:

                          It looks like the one you posted has an error. It creates null objects?

                          Ah nevermind. I didn't read your code well enough...

                           

                          * Provided that the function works on array of strings, we don't need to store the elements since they are strictly equal to the keys.

                          In this context, o[key]=null; is faster than o[key]=key;

                           

                          * In you alternative code, you don't actually have to write:

                           

                          o[array[i].toString()] = array[i];

                           

                          Indeed, since o is a key-value object, the syntax o[xxx] expects a string and xxx will be converted into a string if necessary, which means that JS internally calls xxx.toString().

                           

                          E.g.

                          var anything = {};
                          anything.toString = function(){return 'foo';};
                          
                          var anyNumber = 3;
                          
                          var obj = {};
                          obj[anything] = 'bar';  // i.e. obj['foo'] = 'bar'
                          obj[anyNumber] = 'baz'; // i.e. obj['3'] = 'baz'
                          
                          alert( obj.toSource() );
                          // => {foo:"bar", 3:"baz"}
                          

                           

                          In conclusion, you can keep this syntax:

                           

                          o[ array[i] ] = array[i];

                           

                           

                          @+

                          Marc

                          • 11. Re: JS: How to remove duplicate items from an Array?
                            Harbs. Level 6
                            • Provided that the function works on array of strings, we don't need to store the elements since they are strictly equal to the keys.

                            In this context, o=null; is faster than o=key;

                             

                            Yes. I realized that... Is the speed difference noticeable?

                             

                            • In you alternative code, you don't actually have to write:

                             

                            o[array[i].toString()] = array[i];

                             

                            Indeed, since o is a key-value object, the syntax o expects a string and xxx will be converted into a string if necessary, which means that JS internally calls xxx.toString().

                             

                            True. I thought of editing my post but I figured I'll just leave it because toString() is clearer anyway (and possibly more efficient, but I forget those details easily)...

                             

                            Harbs

                            • 12. Re: JS: How to remove duplicate items from an Array?
                              Marc Autret Level 4

                              Harbs. wrote:

                               

                              In this context, okey=null; is faster than okey=key;

                               

                              Yes. I realized that... Is the speed difference noticeable?

                               

                               

                              Hmm, not really...

                               

                              I've just performed an iterative rounds benchmark —in ExtendScript 3.92.114— with arrays of resp. 10,000 and 50,000 random strings. The unique() function using o[key]=null is only 1.1X faster than the same function using o[key]=arr[i]. So the gap is absolutely not noticeable on small tables with hundreds of items.

                               

                              Let's just say it's a better practice to nullify the values as there is no need to duplicate strings in the memory.

                               

                              @+

                              Marc