Experimental Resources
  • About Data
  • The data are HTTP packets of online map services, captured at the campus network interface of Tsinghua University around April 2009. They are presented in the format obtained after processing by Bro and IP address randomization; the randomization protects user privacy. We mainly concentrate on three types of data line:

    •  The session beginning line: This is the first line written when a new session begins. An example follows:

    1220169326.485460 %1 start 166.111.132.250:44640 > 209.85.175.147:80

    The fields are, in order, the timestamp, session ID, start indicator, source IP:source port, and destination IP:destination port. In addition, to protect user privacy, any in-campus IP address appearing as source or destination has been randomized to a private IP address, and the same in-campus IP always corresponds to the same private IP.

    •  The header line: This line gives the domain name of the map server and can be used to distinguish between map service providers. The format of the header line is as follows:

    1220169326.485460 %1 > HOST: maps.google.com

    The fields are, in order, the timestamp, session ID, header field name, and header value.

    •  The request and response line: This line contains the query information of the map service. From the GET information we can obtain the exact longitude, latitude, and zoom level of the location the user is retrieving, as the following example shows:

    1220169328.109853 %1 GET /maps/vp?spn=62.186014,113.203125&z=3&vp=37.0625,-95.677068 (200 "OK" [2885] maps.google.com)

    The fields are, in order, the timestamp, session ID, HTTP method (GET or POST), requested URL, response code, response string, and host. The location information can be parsed from the requested URL. Note that the format of this line may differ depending on the response, as in the following examples:

    1220169748.474496 %2 GET /seamless/0/180/717/4/0/829_184.JPG (200 "OK" [8760 (interrupted)] hbpic4.go2map.com)

    1220169375.038764 %3 GET /seamless/0/174/719/1/0/209_67.GIF <no reply>
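The three line types described above can be told apart with regular expressions. What follows is a minimal Python sketch, not the tooling used to produce the data. The names `parse_line` and `google_viewport` are hypothetical, and the patterns are inferred only from the example lines shown here, so real data may need looser matching. Reading `vp` as "latitude,longitude" and `z` as the zoom level is an assumption based on the example URL. The bracketed value (for instance `[2885]`) is not described in the text, so it is kept as an opaque string.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Every line starts with "<timestamp> %<session id> ".
PREFIX = r"(?P<ts>\d+\.\d+) %(?P<sid>\d+) "

# Session beginning line: "... start <src ip:port> > <dst ip:port>"
START_RE = re.compile(PREFIX + r"start (?P<src>\S+) > (?P<dst>\S+)$")

# Header line: "... > <header name>: <header value>"
HEADER_RE = re.compile(PREFIX + r"> (?P<name>[^:\s]+): (?P<value>.*)$")

# Request/response line: either "<no reply>" or a parenthesised reply
# of the form (<code> "<string>" [<bracketed value>] <host>).
REQUEST_RE = re.compile(
    PREFIX + r"(?P<method>GET|POST) (?P<url>\S+) "
    r'(?:<no reply>|\((?P<code>\d+) "(?P<reason>[^"]*)" '
    r"\[(?P<bracket>[^\]]*)\]\s+(?P<host>\S+)\))$"
)

PATTERNS = (("start", START_RE), ("header", HEADER_RE), ("request", REQUEST_RE))


def parse_line(line):
    """Return (kind, fields) for one line, or (None, {}) if unrecognized."""
    line = line.strip()
    for kind, pattern in PATTERNS:
        match = pattern.match(line)
        if match:
            return kind, match.groupdict()
    return None, {}


def google_viewport(url):
    """Return (lat, lon, zoom) from a '/maps/vp?...' URL, or None.

    Assumption: 'vp' holds 'latitude,longitude' and 'z' the zoom level.
    Other map providers encode location differently and are not handled.
    """
    query = parse_qs(urlsplit(url).query)
    if "vp" not in query or "z" not in query:
        return None
    lat, lon = (float(part) for part in query["vp"][0].split(","))
    return lat, lon, int(query["z"][0])
```

For a request line with no reply, the `code`, `reason`, `bracket`, and `host` fields come back as `None`, which lets a caller filter such lines without a separate pattern. Tile-style URLs such as the `/seamless/...` examples carry no query string, so `google_viewport` returns `None` for them.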
