Session重叠问题学习(四)--再优化
接前文:
需求描述和第一版解决方案(执行时间90秒)
http://blog.itpub.net/29254281/viewspace-2150229/
优化和修改bug的版本(执行时间25秒)
http://blog.itpub.net/29254281/viewspace-2150259/
我觉得在集合思维处理方式中,前文已经达到最优了.
如果放弃完全的集合处理思维,实际上还可以更加的优化.
前文的几个问题.
1.引入了过多的表结构.
2.写表本身也花费了时间.
3.前文按天批处理,粒度还是细了.应该一把批量全出最快.
4.前文计算最小间隔范围的部分,因为应用集合化思维,不好理解性能还差.
前文计算最小间隔范围的部分如下
该部分使用集合处理方式,不好理解性能还差.
这块可以通过游标写临时表轻易解决。
本质上最小范围就是
每天每个房间每个记录的开始时间和结束时间都扣出来作为一行 排序。
然后找到每个时间最近的下一个时间,作为最小时间范围.
如果使用游标,遍历一遍即可.
都内聚到一个过程之后,不需要创建额外的普通表,直接在过程中创建临时表.实现高内聚,低耦合.
call p
过程返回的结果即为最终结果.
三次测试耗时均低于 10.3秒
分享文章:Session重叠问题学习(四)--再优化
分享地址:http://pcwzsj.com/article/iegeps.html
需求描述和第一版解决方案(执行时间90秒)
http://blog.itpub.net/29254281/viewspace-2150229/
优化和修改bug的版本(执行时间25秒)
http://blog.itpub.net/29254281/viewspace-2150259/
我觉得在集合思维处理方式中,前文已经达到最优了.
如果放弃完全的集合处理思维,实际上还可以更加的优化.
前文的几个问题.
1.引入了过多的表结构.
2.写表本身也花费了时间.
3.前文按天批处理,粒度还是细了.应该一把批量全出最快.
4.前文计算最小间隔范围的部分,因为应用集合化思维,不好理解性能还差.
前文计算最小间隔范围的部分如下
- select roomid,as DATETIME) starttime,as DATETIME) endtime from (
- select @d as starttime,@d:=d,v3.roomid,v3.d endtime from (
- select distinct roomid,
- when nums.id=1 then v1s
- when nums.id=2 then v1e
- when nums.id=3 then v2s
- when nums.id=4 then v2e
- end d from (
- select v1.roomid, v1.s v1s,v1.e v1e,v2.s v2s,v2.e v2e
- from t1 v1
- inner join t1 v2 on ((v1.s between v2.s and v2.e or v1.e between v2.s and v2.e ) and v1.roomid=v2.roomid)
- where v2.roomid in(select distinct roomid from t1 where date(s)=pTime)
- and v2.s>=pTime and v2.s<(pTime+interval '1' and (v2.roomid,v2.userid,v2.s,v2.e)!= (v1.roomid,v1.userid,v1.s,v1.e)
- ) a,nums where nums.id<=4
- order by roomid,d
- ) v3,(select @d:='') vars
- ) v4 where starttime!=''
该部分使用集合处理方式,不好理解性能还差.
这块可以通过游标写临时表轻易解决。
本质上最小范围就是
每天每个房间每个记录的开始时间和结束时间都扣出来作为一行 排序。
然后找到每个时间最近的下一个时间,作为最小时间范围.
如果使用游标,遍历一遍即可.
- DELIMITER $$
- CREATE DEFINER=`root`@`localhost` PROCEDURE `p`()
- BEGIN
- declare done int default 0;
- declare v_roomid bigint;
- declare v_start timestamp;
- declare v_end timestamp;
- declare cur_test CURSOR for select roomid,s,e from t1 ;
- DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
- drop table if exists t1;
- drop table if exists tmp_time_point;
- CREATE temporary TABLE `t1` (
- `roomid` int(11) NOT NULL DEFAULT '0',
- `userid` bigint(20) NOT NULL DEFAULT '0',
- `s` timestamp NOT NULL DEFAULT ON UPDATE timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
- primary KEY `roomid` (`roomid`,`s`,`e`,`userid`)
- ) ENGINE=InnoDB;
- create temporary table tmp_time_point(
- roomid bigint,
- timepoint timestamp,
- primary key(roomid,timepoint)
- ) engine=memory;
- insert into t1
- select distinct
- roomid,
- userid,
- if(date(s)!=date(e) and id>1,date(s+interval id-1 date(s+interval id-1 date(e) ,e,date_format(s+interval id-1 '%Y-%m-%d 23:59:59')) e
- from (
- SELECT DISTINCT s.roomid, s.userid, s.s, (
- SELECT MIN(e)
- FROM (SELECT DISTINCT roomid, userid, roomend AS e
- FROM u_room_log a
- WHERE NOT EXISTS (SELECT *
- FROM u_room_log b
- WHERE a.roomid = b.roomid
- AND a.userid = b.userid
- AND a.roomend >= b.roomstart
- AND a.roomend < b.roomend)
- ) s2
- WHERE s2.e > s.s
- AND s.roomid = s2.roomid
- AND s.userid = s2.userid
- ) AS e
- FROM (SELECT DISTINCT roomid, userid, roomstart AS s
- FROM u_room_log a
- WHERE NOT EXISTS (SELECT *
- FROM u_room_log b
- WHERE a.roomid = b.roomid
- AND a.userid = b.userid
- AND a.roomstart > b.roomstart
- AND a.roomstart <= b.roomend)
- ) s, (SELECT DISTINCT roomid, userid, roomend AS e
- FROM u_room_log a
- WHERE NOT EXISTS (SELECT *
- FROM u_room_log b
- WHERE a.roomid = b.roomid
- AND a.userid = b.userid
- AND a.roomend >= b.roomstart
- AND a.roomend < b.roomend)
- ) e
- WHERE s.roomid = e.roomid
- AND s.userid = e.userid
- ) t1 ,
- nums
- where nums.id<=datediff(e,s)+1
- ;
- open cur_test;
- repeat
- fetch cur_test into v_roomid, v_start,v_end;
- if done !=1 then
- insert ignore into tmp_time_point(roomid,timepoint) values(v_roomid,v_start);
- insert ignore into tmp_time_point(roomid,timepoint) values(v_roomid,v_end);
- end if;
- until done end repeat;
- close cur_test;
- select roomid,date(s) dt,round(second,s,e))/60) ts,max(c) c from (
- select roomid,s,e ,distinct userid) c from (
- select distinct v6.roomid,v6.userid,greatest(s,starttime) s,least(e,endtime) e
- from (
- select distinct roomid,as DATETIME) starttime,as DATETIME) endtime from (
- select
- if(@roomid=roomid,@d,'') as starttime,@d:=timepoint,@roomid:=roomid,p.roomid,p.timepoint endtime
- from tmp_time_point p,(select @d:='',@roomid:=-1) vars
- order by roomid,timepoint
- ) v4 where starttime!='' and date(starttime)=date(endtime)
- ) v5 inner join t1 v6 on(v5.starttime between v6.s and v6.e and v5.endtime between v6.s and v6.e and v5.roomid=v6.roomid)
- ) v6 group by roomid,s,e having distinct userid)>1
- ) v7 group by roomid,date(s);
- END
都内聚到一个过程之后,不需要创建额外的普通表,直接在过程中创建临时表.实现高内聚,低耦合.
call p
过程返回的结果即为最终结果.
三次测试耗时均低于 10.3秒
分享文章:Session重叠问题学习(四)--再优化
分享地址:http://pcwzsj.com/article/iegeps.html