22 Replies Latest reply: Feb 17, 2012 6:44 AM by ACS LLC RSS

    Retrieving random record set from SQL and also applying an ORDER

    ACS LLC Community Member

      I need a query that is capable of obtaining X amount of random records from a SQL2008 database, but I also want those results to then order by a numberic field in the same table, camp_priority

       

      I have

       

      SELECT TOP #DisplayAmount#,camp_uid,camp_title,camp_text,camp_image_type
      FROM campaigns
      WHERE camp_uid IN

      etc..etc

       

      then at the bottom I have

       

      ORDER BY newid(),camp_priority

       

      The problem seems to be that I'm getting random records, but it's not ordering by the camp_priority

       

      Anybody have any ideas?

       

      Thanks

       

      Mark

        • 1. Re: Retrieving random record set from SQL and also applying an ORDER
          Dan Bracuk Community Member

          I have an idea.  Take the newid() out of the order by clause.

           

          Mind you, that only solves the order by problem.  I don't share your opinion that you are getting random records.  You'll have to give that one some more thought.

          • 2. Re: Retrieving random record set from SQL and also applying an ORDER
            Adam Cameron. Community Member

            I'd wanna check the performance of it before running with it as a solution, but this [sort of thing] should work:

             

            SELECT *

            FROM (

                 -- your actual query here

                 ORDER BY NEWID()

            ) subq

            ORDER BY subq.columnYouWantToOrderBy

             

            --

            Adam

            • 3. Re: Retrieving random record set from SQL and also applying an ORDER
              ACS LLC Community Member

              It looks like I must have the syntax wrong, it's throwing an error.

               

              I've posted the query lower down.

               

              I had to pause for some thought after posting this thread, because the logic that I'm trying to apply to this is probably not going to serve it's purpose. On one hand I want to draw random records from the campaigns table, let's say that's 6 random records that meet the required criteria of correct country, category etc. On the other hand I also want to use my camp_priority to push campaigns up the list.

               

              The issue is - Let's suppose I have 100 campaigns in there with mixed priority levels from 1 to 10, 1 being the highest, and I randomly pick 6 of them and then order them by priority, it won't be doing much justice to the priority set up because it could technically randomly pick out 6 x priority=10. and some 9's... If I have it so it picks 6 random ones of the highest priority then that also causes an issue because you'd never get to see the lower priority campaigns.

               

              I think the best way to handle this is to set all camp_priority to 5 so they are all level, and then anything we want to push up we then set to a higher level 1 to 4, so if we just set 2 campaigns to priority level 1 and 2 then these two would take up the first two slots due to their higher priority and the next 4 slots would be random as they are all same priority, after yapping on so much about this, I think your suggestion would work, if I could get it to function

               

              btw I read that NEWID() has rather slow performance, but finding another solution looked way to complicated for me.

               

               

              <CFQUERY name="GetCamp" DATASOURCE="#datasource#">

                  SELECT TOP #DisplayAmount# camp_uid,camp_title,camp_text,camp_image_type

                  FROM (campaigns

                  WHERE camp_uid IN

               

                  <!--- MATCH COUNTRY --->

               

              (SELECT camptarget_camp_uid FROM CampTarget

              WHERE camptarget_country = '#session.usercountry#')

               

                  AND camp_UID IN

               

                  <!--- ALL CATEGORIES THAT THE CAMPAIGN IS IN --->

               

              (SELECT category_camp_UID FROM category_list

              WHERE category_cat_uid IN

               

                  <!--- BASED ON CATEGORIES REQUIRED FOR THIS TRACKING ACCOUNT UID --->

               

              (SELECT aff_cat_uid FROM aff_cat

              WHERE aff_cat_aff_sub_uid = #val(session.affsub)#)

               

              )

               

              AND camp_status = 1 AND camp_stack = 1

               

              <CFIF #ListLen(session.servedcamps)# GT "0">

              AND camp_uid NOT IN (#session.ServedCamps#)

              </CFIF>

               

              ORDER BY newid()

               

              )

               

              subq ORDER BY subq.camp_priority

               

              </CFQUERY>

              • 4. Re: Retrieving random record set from SQL and also applying an ORDER
                Adam Cameron. Community Member

                Your first subquery doesn't have a SELECT statement.  That'll be one source of errors.

                 

                As for weighting the significance of your records before returning them "randomly", you could order them on the sum or product of two components:  a weighting and a random part.  A very simple (perhaps overly simple) would be to weight the records between 1-5, and then add a random component between 1-5 (or 1-10, whatever... you'll need to decide the significance of each).

                 

                I can't comment on whether newid() is slow or not... have you got a citation for that (out of interest)?

                 

                --

                Adam

                • 5. Re: Retrieving random record set from SQL and also applying an ORDER
                  BKBK CommunityMVP

                  ACS LLC wrote:

                   

                  It looks like I must have the syntax wrong, it's throwing an error.

                   

                  I've posted the query lower down.

                  Are you perhaps aiming for something like this?

                   

                  <CFQUERY name="GetCamp" DATASOURCE="#datasource#">

                  SELECT *

                  FROM

                      (    SELECT TOP #DisplayAmount# camp_uid,camp_title,camp_text,camp_image_type

                          FROM campaigns

                          WHERE camp_uid IN

                              (SELECT camptarget_camp_uid FROM CampTarget

                               WHERE camptarget_country = '#session.usercountry#')

                          AND camp_UID IN

                              (SELECT category_camp_UID FROM category_list

                               WHERE category_cat_uid IN

                                      (SELECT aff_cat_uid FROM aff_cat

                                       WHERE aff_cat_aff_sub_uid = #val(session.affsub)#))

                          AND camp_status = 1

                          AND camp_stack = 1

                          <CFIF #ListLen(session.servedcamps)# GT "0">

                          AND camp_uid NOT IN (#session.ServedCamps#)

                          </CFIF>

                          ORDER BY newid()

                      ) subq

                  ORDER BY subq.camp_priority

                  </CFQUERY>

                  • 6. Re: Retrieving random record set from SQL and also applying an ORDER
                    ACS LLC Community Member

                    Thanks for the feedback.. it works.. after a little modfication .... the only thing I found that was that I had to add camp_priority to the query otherwise it would crash

                     

                    I've given it a good test and it appears to do the job.

                     

                    not sure if it's chewing up the CPU though.

                     

                    Thanks

                     

                    Mark

                    • 7. Re: Retrieving random record set from SQL and also applying an ORDER
                      ACS LLC Community Member

                      Hey Adam,

                       

                      I misread your original post, now I see how it was supposed to work, just posted below... got it working ... works great. hopefully it's not killing the CPU as this query will be use a LOT.. I'll just get a bigger CPU ;-)

                       

                      With regards to the NEWID() performance issue, I don't have a specific link, I was just hunting around different forums etc and found a few posts that had talked about performance issues, and one that had a command I'd never seen that took a percentage of records with a lot of perfomance test graphs showing it was 10 times slow than the other approach.

                       

                      I'm just going to have to see how it goes

                       

                      Thanks for the help

                       

                      Mark

                      • 8. Re: Retrieving random record set from SQL and also applying an ORDER
                        ACS LLC Community Member

                        Although this is working, I am seeing something a little odd with regards to performance:

                         

                        <CFQUERY name="GetCamp" DATASOURCE="#datasource#">

                         

                        SELECT * FROM

                         

                        (

                         

                            SELECT TOP #DisplayAmount#

                         

                        camp_uid,camp_title,camp_text,camp_image_type,camp_priority

                            FROM campaigns

                            WHERE camp_uid IN

                         

                            <!--- MATCH COUNTRY --->

                         

                        (SELECT camptarget_camp_uid FROM CampTarget

                        WHERE camptarget_country = '#session.usercountry#')

                         

                            AND camp_UID IN

                         

                            <!--- ALL CATEGORIES THAT THE CAMPAIGN IS IN --->

                         

                        (SELECT category_camp_UID FROM category_list

                        WHERE category_cat_uid IN

                         

                            <!--- BASED ON CATEGORIES REQUIRED FOR THIS TRACKING ACCOUNT UID --->

                         

                        (SELECT aff_cat_uid FROM aff_cat

                        WHERE aff_cat_aff_sub_uid = #val(session.affsub)#)

                         

                        )

                         

                        AND camp_status = 1 AND camp_stack = 1

                         

                        <CFIF #ListLen(session.servedcamps)# GT "0">

                        AND camp_uid NOT IN (#session.ServedCamps#)

                        </CFIF>

                         

                        ORDER BY newid()

                         

                        )

                         

                        subq ORDER BY subq.camp_priority

                         

                        </CFQUERY>

                         

                        The first time I run the query the session.served.camps is empty, and it executes in 4ms or less

                         

                        CACHEDfalse
                        EXECUTIONTIME4
                        SQLSELECT * FROM  (  SELECT TOP 3 camp_uid,camp_title,camp_text,camp_image_type,camp_priority FROM campaigns WHERE camp_uid IN    (SELECT camptarget_camp_uid FROM CampTarget WHERE camptarget_country = 'us')  AND camp_UID IN    (SELECT category_camp_UID FROM category_list WHERE category_cat_uid IN (SELECT aff_cat_uid FROM aff_cat WHERE aff_cat_aff_sub_uid = 2)  )  AND camp_status = 1 AND camp_stack = 1    ORDER BY newid()  )  subq ORDER BY subq.camp_priority

                         

                        Now when I reload it so that session.served.camps has a value, I get this

                         

                        EXECUTIONTIME329
                        SQLSELECT * FROM  (  SELECT TOP 3 camp_uid,camp_title,camp_text,camp_image_type,camp_priority FROM campaigns WHERE camp_uid IN    (SELECT camptarget_camp_uid FROM CampTarget WHERE camptarget_country = 'us')  AND camp_UID IN    (SELECT category_camp_UID FROM category_list WHERE category_cat_uid IN (SELECT aff_cat_uid FROM aff_cat WHERE aff_cat_aff_sub_uid = 2)  )  AND camp_status = 1 AND camp_stack = 1   AND camp_uid NOT IN (4,3,7,5) ORDER BY newid()  )  subq ORDER BY subq.camp_priority

                         

                        Notice how it goes from 4ms or less, to 329ms.. that seems like a big increase just to add in a NOT IN section

                         

                        The test dbase only has 6 records in it, and camp_UID is the primary key. I even tried to add an extra index but it made no difference

                         

                        What do you think? Normal behaviour?

                         

                        Thanks

                         

                        Mark

                        • 9. Re: Retrieving random record set from SQL and also applying an ORDER
                          Adam Cameron. Community Member

                          > Notice how it goes from 4ms or less, to 329ms.. that seems like a big increase just to add in a NOT IN section

                          > The test dbase only has 6 records in it, and camp_UID is the primary key. I even tried to add an extra index but it made no difference

                          > What do you think? Normal behaviour?

                           

                          I think databases usually come with profiling tools for investigating this sort of thing...

                           

                          --

                          Adam

                          • 10. Re: Retrieving random record set from SQL and also applying an ORDER
                            ACS LLC Community Member

                            Not sure, wouldn't know where to start with that

                            • 11. Re: Retrieving random record set from SQL and also applying an ORDER
                              Adam Cameron. Community Member

                               

                              Not sure, wouldn't know where to start with that

                               

                              Surely the answer to that is the same as with any situation where there's a question of "how do I..."

                               

                              Type this into your browser:

                               

                              http://www.google.com

                               

                              After that, type in a string in the input box that reflects whatyou want to know.  Press "Search"...

                               

                              [etc]

                               

                              --

                              Adam

                              • 12. Re: Retrieving random record set from SQL and also applying an ORDER
                                ACS LLC Community Member

                                I did that, and it lead me back to the Adobe Forum ;-)

                                • 13. Re: Retrieving random record set from SQL and also applying an ORDER
                                  Adam Cameron. Community Member

                                  ACS LLC wrote:

                                   

                                  I did that, and it lead me back to the Adobe Forum ;-)

                                   

                                  You googled for how to do database profiling on whatever DB platform you're using, and the best link was back to these forums?

                                   

                                  [cough]bullsh!t

                                   

                                  Go and do some research.

                                   

                                  --

                                  Adam

                                  • 14. Re: Retrieving random record set from SQL and also applying an ORDER
                                    ACS LLC Community Member

                                    That might have been a joke ;-)

                                    • 15. Re: Retrieving random record set from SQL and also applying an ORDER
                                      ACS LLC Community Member

                                      ahhh. found the performance tools (never used it before), but when I try to open it, I get a fatal error from SQL

                                       

                                      Might have to reinstall.. urgh

                                      • 16. Re: Retrieving random record set from SQL and also applying an ORDER
                                        Dan Bracuk Community Member

                                        "not in " really slows down queries.  A very unintuitive alternative is to do something like this:

                                         

                                        select etc

                                        from etc

                                        where somefield in

                                        (select somefield

                                        from etc

                                        except

                                        select somefield

                                        from etc

                                        where somefield in (that list or subquery)

                                        )

                                         

                                        If your db does not support the keyword except, it might support the keyword minus.  If you are using mssql, it supports both but they behave differently so test early and test often.

                                        • 17. Re: Retrieving random record set from SQL and also applying an ORDER
                                          ACS LLC Community Member

                                          Thanks Dan. I had a feeling it was NOT IN... it's insane just how much of an effect it has on a query, I never knew, just stumbled across this issue

                                           

                                          I'm using MS SQL 2008, so I guess both the commands are available to me, I've never come across or used EXCEPT or MINUS, will try to research that one.

                                           

                                          I will definately test...test....test.. last thing I need is for this to serve the wrong data out!

                                           

                                          From your experience have you found that the EXCEPT is approach that you suggest is much faster than NOT IN?

                                           

                                          Thanks

                                           

                                          Mark

                                          • 18. Re: Retrieving random record set from SQL and also applying an ORDER
                                            Dan Bracuk Community Member

                                            Both sample queries are slow, but it does show the difference.  It's a count of patients who visited yesterday for the first time this year.

                                             

                                            This query takes 43 seconds

                                             

                                            select count(distinct hsc_number)

                                            from patient join visit_fact using (hsc_number)

                                            where registration_date = '20111219'

                                            and hsc_number not in

                                            (select distinct hsc_number

                                            from visit_fact where registration_date between '20110101' and '20111218'

                                            );

                                             

                                            This one takes 9 seconds

                                            select count(distinct hsc_number)

                                            from patient join visit_fact using (hsc_number)

                                            where registration_date = '20111219'

                                            and hsc_number in

                                            (select distinct hsc_number from patient join visit_fact using (hsc_number)

                                            where registration_date = '20111219'

                                            except

                                            select distinct hsc_number

                                            from visit_fact where registration_date between '20110101' and '20111218'

                                            );

                                            They give the same answer.

                                            • 19. Re: Retrieving random record set from SQL and also applying an ORDER
                                              ACS LLC Community Member

                                              I presume thats ms not seconds

                                               

                                              either way, that's one heck of a difference!!!

                                               

                                              I'm going to have to go back and do a lot of changing and testing! I had no idea that NOT IN was such a slow process

                                               

                                              Thanks

                                               

                                              Mark

                                              • 20. Re: Retrieving random record set from SQL and also applying an ORDER
                                                ACS LLC Community Member

                                                I thought I had this down and working, but recently discovered it was not actually working as designed.

                                                 

                                                This is what I have for the SQL

                                                 

                                                <CFQUERY name="GetCamp" DATASOURCE="#datasource#">

                                                 

                                                SELECT * FROM

                                                 

                                                (

                                                 

                                                SELECT TOP 5

                                                camp_uid,camp_title,camp_text,camp_image_type,camp_posting_url,camp_

                                                 

                                                priority

                                                    FROM campaigns

                                                    WHERE

                                                camp_status = 1

                                                AND camp_type = 2

                                                 

                                                ORDER BY newid()

                                                 

                                                )

                                                 

                                                subq ORDER BY subq.camp_priority

                                                 

                                                </CFQUERY>

                                                 

                                                The idea being that I can find the top 5 random records putting priority first.

                                                 

                                                I have records that have the priority of

                                                3

                                                5

                                                6

                                                6

                                                6

                                                8

                                                 

                                                By getting the top 5, random, using priority, I would have expected records with priority

                                                3

                                                5

                                                6

                                                6

                                                6

                                                 

                                                and 8 would not be present.

                                                 

                                                However I just ran it and I get

                                                3

                                                5

                                                6

                                                6

                                                8

                                                 

                                                That's not the desired result

                                                 

                                                I am guessing the random component is somehow overriding the priority

                                                 

                                                Anybody have any ideas?

                                                 

                                                Thanks

                                                 

                                                Mark

                                                • 21. Re: Retrieving random record set from SQL and also applying an ORDER
                                                  BKBK CommunityMVP

                                                  ACS LLC wrote:

                                                   

                                                   

                                                  I have records that have the priority of

                                                  3

                                                  5

                                                  6

                                                  6

                                                  6

                                                  8

                                                   

                                                  By getting the top 5, random, using priority, I would have expected records with priority

                                                  3

                                                  5

                                                  6

                                                  6

                                                  6

                                                   

                                                  and 8 would not be present.

                                                   

                                                  However I just ran it and I get

                                                  3

                                                  5

                                                  6

                                                  6

                                                  8

                                                   

                                                  That's not the desired result

                                                   

                                                  From what you've explained, I think your query is behaving as expected.  If you ask someone to choose 5 numbers at random out of the set {3,5,6,6,6,8}, and to sort the result, then there will 4 possible answers:

                                                   

                                                  {3,5,6,6,6}

                                                  {3,5,6,6,8}

                                                  {3,6,6,6,8}

                                                  {5,6,6,6,8}

                                                   

                                                  Therefore, what you obtained was just one possibility. If you run the query frequently enough, it will produce each of the 4 possibilities.

                                                  • 22. Re: Retrieving random record set from SQL and also applying an ORDER
                                                    ACS LLC Community Member

                                                    got it. So what I am actually doing is ordering the random list, which is not what I wanted to do, what I need to do is take the top 5 ordered based on priority, randomly draw should there be records with the same priority and then order them