MySQL Subquery with User-Defined Variables -


i'm trying accomplish query requires calculated column using subquery passes date reference via variable. i'm not sure if i'm not "doing right" query never finishes , spins minutes on end. query:

select @groupdate:=date_format(order_date,'%y-%m'), count(distinct customer_email) num_cust, (   select count(distinct cev.customer_email) num_prev   _pj_cust_email_view cev   inner join _pj_cust_email_view prev_purch on (prev_purch.order_date < @groupdate) , (cev.customer_email=prev_purch.customer_email)   cev.order_date > @groupdate ) prev_cust_count _pj_cust_email_view group @groupdate; 

subquery has inner join accomplishes self-join gives me count of people have purchased prior date in @groupdate. explain below:

+----+----------------------+---------------------+------+---------------+-----------+---------+---------------------------+--------+---------------------------------+ | id | select_type          | table               | type | possible_keys | key       | key_len | ref                       | rows   |                           | +----+----------------------+---------------------+------+---------------+-----------+---------+---------------------------+--------+---------------------------------+ |  1 | primary              | _pj_cust_email_view |  | null          | null      | null    | null                      | 140147 | using temporary; using filesort | |  2 | uncacheable subquery | cev                 |  | idx_email     | null      | null    | null                      | 140147 | using                     | |  2 | uncacheable subquery | prev_purch          | ref  | idx_email     | idx_email | 768     | cart_a.cev.customer_email |      1 | using                     | +----+----------------------+---------------------+------+---------------+-----------+---------+---------------------------+--------+---------------------------------+ 

and structure of table _pj_cust_email_view such:

'_pj_cust_email_view', 'create table `_pj_cust_email_view` (   `order_date` varchar(10) character set utf8 default null,   `customer_email` varchar(255) character set utf8 default null,   key `idx_email` (`customer_email`),   key `idx_orderdate` (`order_date`) ) engine=innodb default charset=latin1' 

again, said earlier, i'm not sure best way accomplish this. criticism, direction appreciated!

update

i've made little progress, , i'm doing above procedurally iterating through known months instead of months in database , setting vars ahead of time. don't still. i've got now:

sets user defined vars

set @startdate:='2010-08', @enddate:='2010-09'; 

gets total distinct emails in given range

select count(distinct customer_email) num_cust _pj_cust_email_view order_date between @startdate , @enddate; 

gets total count of customers had purchased prior given range

select count(distinct cev.customer_email) num_prev   _pj_cust_email_view cev   inner join _pj_cust_email_view prev_purch on (prev_purch.order_date < @startdate) , (cev.customer_email=prev_purch.customer_email)   cev.order_date between @startdate , @enddate; 

where @startdate set start of month , @enddate signifies end of month's range.

i feel still can done in 1 full query.

i don't think need use subqueries @ all, nor need iterate on months.

instead, recommend create table store months. if prepopulate 100 years of months, have 1200 rows in it, trivial.

create table months (     start_date date,      end_date date,      primary key (start_date, end_date) ); insert months (start_date, end_date)  values ('2011-03-01', '2011-03-31'); 

store actual start , end dates, can use date data type , index 2 columns properly.

edit: think understand requirement bit better, , i've cleaned answer. following query may right you:

select date_format(m.start_date, '%y-%m') month,   count(distinct cev.customer_email) current,   group_concat(distinct cev.customer_email) current_email,   count(distinct prev.customer_email) earlier,   group_concat(distinct prev.customer_email) earlier_email months m  left outer join _pj_cust_email_view cev   on cev.order_date between m.start_date , m.end_date inner join months mprev   on mprev.start_date <= m.start_date left outer join _pj_cust_email_view prev   on prev.order_date between mprev.start_date , mprev.end_date group month; 

if create following compound index in table:

create index order_email on _pj_cust_email_view (order_date, customer_email); 

then query has best chance of being index-only query, , run lot faster.

below explain optimization report query. note type: index each table.

*************************** 1. row ***************************            id: 1   select_type: simple         table: m          type: index possible_keys: primary           key: primary       key_len: 6           ref: null          rows: 4         extra: using index; using temporary; using filesort *************************** 2. row ***************************            id: 1   select_type: simple         table: mprev          type: index possible_keys: primary           key: primary       key_len: 6           ref: null          rows: 4         extra: using where; using index; using join buffer *************************** 3. row ***************************            id: 1   select_type: simple         table: cev          type: index possible_keys: order_email           key: order_email       key_len: 17           ref: null          rows: 10         extra: using index *************************** 4. row ***************************            id: 1   select_type: simple         table: prev          type: index possible_keys: order_email           key: order_email       key_len: 17           ref: null          rows: 10         extra: using index 

here's test data:

insert months (start_date, end_date) values ('2011-03-01', '2011-03-31'), ('2011-02-01', '2011-02-28'), ('2011-01-01', '2011-01-31'), ('2010-12-01', '2010-12-31');  insert _pj_cust_email_view values ('ron', '2011-03-10'), ('hermione', '2011-03-15'), ('hermione', '2011-02-15'), ('hermione', '2011-01-15'), ('hermione', '2010-12-15'), ('neville', '2011-01-10'), ('harry', '2011-03-19'), ('harry', '2011-02-10'), ('molly', '2011-03-25'), ('molly', '2011-01-10'); 

here's result given data, including concatenated list of emails make easier see.

+---------+---------+--------------------------+---------+----------------------------------+ | month   | current | current_email            | earlier | earlier_email                    | +---------+---------+--------------------------+---------+----------------------------------+ | 2010-12 |       1 | hermione                 |       1 | hermione                         |  | 2011-01 |       3 | neville,hermione,molly   |       3 | hermione,molly,neville           |  | 2011-02 |       2 | hermione,harry           |       4 | harry,hermione,molly,neville     |  | 2011-03 |       4 | molly,ron,harry,hermione |       5 | molly,ron,hermione,neville,harry |  +---------+---------+--------------------------+---------+----------------------------------+ 

Comments

Popular posts from this blog

Javascript line number mapping -

c# - Is it possible to remove an existing registration from Autofac container builder? -

php - Mysql PK and FK char(36) vs int(10) -