Skip navigation
Currently Being Moderated

Combining data from SQL query

Aug 31, 2011 11:31 AM

Hoping there is a SQL guru out there that can help me with this

 

I have two tables that I want to pull data from and display a report, based on a particular date range, for this example let's say 08/01/2011 to 08/31/2011

 

In one table there would be an affiliate ID - AFF_UID and also the date and a counter for that date, IMP_COUNTER , now there may only be a few records not every day might be covered, say

 

TABLE IMP

date,aff_uid,imp_counter

08/01/2011,999,2000

08/02/2011,999,2050

08/20/2011,999,2030

etc

 

I then have another date, similar set up only there are multiple records in here for a single day, again not all days covered, I would need to total up each row for each day to get a count. So it might be:

 

TABLE LEAD

date,aff_uid

08/01/2011,999

08/01/2011,999

08/01/2011,999

08/01/2011,999

08/12/2011,999

 

So we have different dates covered, sometimes the same date, I need the counter from the first table, and from the second table I need to add them up for a total, and then display any date that has a value and put both counter and lead count together

 

Result from above

date,imp total,lead total

08/01/2011,2000,4

08/02/2011,2050,0

08/12,2011,0,1

08/20/2011,2030,0

 

I am sure there must be a SQL command that can gel all of this together with some grouping? Last thing I need is a ton of SQL statements in a loop!

 

My alternative is to add a counter to the IMP table for each lead, so every time I update the lead table with a record, I will also update the counter, but that's unncessary storage of data I already have an an extra update statement that I might be able to do without, but maybe it would make generating reports faster, just pulling to counters with no 'addition' required.

 

Appreciate any feedback

 

Thanks

 

Mark

 
Replies
  • Currently Being Moderated
    Aug 31, 2011 11:53 AM   in reply to ACS LLC

    select imp.date

    , imp.aff_uid

    , count(lead.aff_uid)

    , counter

     

    from imp join lead on imp.date = lead.date and imp.aff_uid = lead.aff_uid

    where imp.date >= somedate

    and imp.date < someotherdate

    group by imp.date, imp.aff_uid

     

    This is pretty basic stuff.  Since you had to ask, I've heard good things about the book, Teach Yourself SQL in 10 Minutes by Ben Forta

     
    |
    Mark as:
  • Currently Being Moderated
    Aug 31, 2011 5:47 PM   in reply to ACS LLC

    We both forgot to put that in the group by clause.

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 8, 2011 9:22 AM   in reply to ACS LLC

    When I expect to get data and don't, I try to isolate the cause by either removing constraints one by one, or removing them all and putting them back one by one. 

    In this case, I'd start by taking away the where clause.  If I got results, I'd put the dates back one at a time to see which one caused the problem.

     

    If I didn't get results by taking away the where clause, I'd start looking at the two fields used in the join. 

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 9, 2011 9:14 AM   in reply to ACS LLC

    Let me see if I understand your requirements, please correct the following statements if they are wrong:

     

    1. You wish to query for a given date range defined by form_from and form_to variables.

     

    2. You wish to include every row in the impressions table that matches the date range criteria.

     

    3. You wish to include the count of matching rows from the leads table for each row from the impressions table based on matching aff_uid and date.

     


    You might try something like the query below.  This uses a subquery to calculate the count from the leads table based on the aff_uid and date from the impressions row.  Note that I have included the date parameters in CFQUERYPARAM tags.

     


    SELECT
        I.imp_date
        , I.imp_aff_uid
        , I.imp_counter
        , (
            SELECT COUNT(*)
            FROM leads AS L
            WHERE L.aff_uid = I.imp_aff_uid
                AND L.lead_date = I.imp_date
            ) AS lead_total
       
    FROM impressions AS I
       
    WHERE I.imp_date BETWEEN <cfqueryparam value="#form_from#" cfsqltype="cf_sql_timestamp"> AND <cfqueryparam value="#form_to#" cfsqltype="cf_sql_timestamp">

     

    ORDER BY I.imp_date, I.imp_aff_uid;

     

     

    You might also consider posting your question to a database specific form.

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 12, 2011 4:15 PM   in reply to ACS LLC

    Do the ID and date field values in the LEADS table exactly match the related values in the IMPRESSIONS table? 

     

    In order to assist you further it would be useful to have:

     

    1. The creation scripts for your tables.

    2. A sample of data for each table.

    3. The desired resultsest based on the sample data.

    4. The database server you are using: vendor and version.

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 12, 2011 4:31 PM   in reply to ACS LLC

    Knowing the data types for each column and the database server (vendor and version) would be useful.

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 12, 2011 5:09 PM   in reply to ACS LLC

    So based on your sample result set I need to assume that not every record will have a row in the IMPRESSIONS table..

     

    You wish to get a resultset containing 3 columns:
    1. An uid from IMPRESSIONS (or LEADS if there is no match in the IMPRESSIONS table).
    2. A date from IMPRESSIONS (or LEADS if there is no match in the IMPRESSIONS table).
    3. The count of records in the LEADS table that math the uid and date.

     

    Is this correct?  It contradicts the assumptions I stated on Sept 9 where I assumed that all records in the results would have a record in the IMPRESSIONS table, to quote "You wish to include every row in the impressions table that matches the date range criteria."

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 12, 2011 5:57 PM   in reply to ACS LLC

    Take the date portion out of the join.  It becomes this:

     

    FROM impressions JOIN leads ON imp_aff_uid = leads.lead_aff_uid

     

    Then, in the where clause do something like this

     

    where (impressions.date >= something and impressions.date < something)

    or (leads.date >= something and leads.date < something)

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 12, 2011 7:38 PM   in reply to ACS LLC

    Seems like what you need is a FULL OUTER JOIN. But I do not think they are supported in MySQL. If not, you could simulate it using a UNION.

     

    The first query should grab all impressions, with or without matching leads. The second should retrieve only leads without matching impression records. In other words, dates not already included in the first query. Something like

     

    SELECT i.imp_date, i.imp_counter, COUNT(l.lead_aff_uid) AS leadCount

    FROM   impressions i LEFT JOIN leads l

            ON i.imp_aff_uid = l.lead_aff_uid

            AND i.imp_date = l.lead_date

    WHERE  i.imp_date BETWEEN ...{your filters} ...

    GROUP BY i.imp_date, i.imp_counter

    UNION ALL

    SELECT l.lead_date, 0 AS imp_counter, COUNT(l.lead_aff_uid) AS leadCount

    FROM   leads l LEFT JOIN impressions i

            ON i.imp_aff_uid = l.lead_aff_uid

            AND i.imp_date = l.lead_date

    WHERE  l.lead_date BETWEEN ...{your filters} ...

    AND    i.imp_aff_uid IS NULL

    GROUP BY l.lead_date

     

    The result should be:

     

    >> 09/01/2011,100,0

    >> 09/03/2011,200,1

    >> 09/07/2011,0,3

     

    However you would not get those totals as long you are including the "uid" column in the SELECT list.

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 12, 2011 7:48 PM   in reply to ACS LLC

    Then take a look at the documentation on FULL OUTER JOIN's. They are certainly much simpler than the UNION I just posted

     

    -Leigh

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 12, 2011 8:14 PM   in reply to ACS LLC

    I'

    m not sure I'd be able to disect and redo what you just did

    !

     

    No reason to redo it. It is a poor substitute anyway. The final query should be much simpler. You already know the results you need. You just need to read up on full outer joins for yourself.

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 13, 2011 5:18 AM   in reply to ACS LLC

    Look at my sample again.  Pay attention to parentheses.

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 13, 2011 8:45 AM   in reply to Dan Bracuk

    ACS LLC,

     

    Below is a sample query that may work for you.  It uses a full join, as previously suggested in this thread.  I also suggest that you research full joins and any other keywords in the query you are not familiar with.

     

     

    SELECT

         COALESCE(IMP.imp_date, LEAD.lead_date) AS report_date

         , COALESCE(IMP.imp_counter,0) AS imp_counter

         , COALESCE(LEAD.lead_count,0) AS lead_count

     

    FROM

         (

         SELECT

              imp_date

              , imp_counter

         FROM impressions

         WHERE imp_date BETWEEN '9/1/2011' AND '09/07/2011'

         ) AS IMP

     

         FULL OUTER JOIN

          

         (

         SELECT

              lead_date

              , COUNT(*) AS lead_count

         FROM leads

         WHERE lead_date BETWEEN '9/1/2011' AND '09/07/2011'

         GROUP BY lead_date

         ) AS LEAD

     

     

         ON ( IMP.imp_date = LEAD.lead_date )

     

    ORDER BY report_date;

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 13, 2011 10:23 AM   in reply to ACS LLC

    I would try removing the COALESCE and looking at how the results differ, that should help you visualize what COALESCE is doing.  I encourage you to do some research and testing so that when you need to troubleshoot the query later you'll understand what each part of the SQL statement is doing.

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 13, 2011 10:44 AM   in reply to ACS LLC

    Not exactly, COALESCE returns the first non-null item in the list.  I would refer to your database documentation.

     
    |
    Mark as:
  • Currently Being Moderated
    Sep 13, 2011 11:26 AM   in reply to ACS LLC

    So which one are you saying should to it??

     

    Either. If you had tested the union, it should have worked under ms sql too. But like I said, that technique is just a poor man's full outer join.

     

    Again, I would strongly recommend you read up on full outer joins and unions so you understand what the sql is doing and are not just using it blindly

     
    |
    Mark as:
  • Currently Being Moderated
    Jan 24, 2012 8:40 AM   in reply to ACS LLC

    Start by looking at why you are getting multiple records per date. Is it due to the data (ie multiple impression records per date) or the JOIN?

     
    |
    Mark as:

More Like This

  • Retrieving data ...

Bookmarked By (0)

Answers + Points = Status

  • 10 points awarded for Correct Answers
  • 5 points awarded for Helpful Answers
  • 10,000+ points
  • 1,001-10,000 points
  • 501-1,000 points
  • 5-500 points