- Timestamp: Aug 14, 2013, 10:38:43 PM
- Author: jazz
- Comment: --
v6 | v7 | |
2 | 2 | |
3 | 3 | * http://blog.cloudera.com/blog/2009/06/analyzing-apache-logs-with-pig/ |
| 4 | |
4 | 5 | * |
| 6 | |
5 | 7 | {{{ |
6 | 8 | #!html |
… |
… |
|
9 | 11 | |
10 | 12 | * |
| 13 | |
11 | 14 | {{{ |
12 | 15 | #!html |
… |
… |
|
15 | 18 | |
16 | 19 | * |
| 20 | |
| 21 | {{{ |
| 22 | # pig -x mapreduce -f scripts/blogparse.pig -param LOGS='/mirror.cloudera.com/logs/access_log.*' |
| 23 | }}} |
| 24 | |
| 25 | * |
| 26 | |
17 | 27 | {{{ |
18 | 28 | #!html |
… |
… |
|
21 | 31 | <p>while (<>) {<br /> chomp;<br /> if (/([^\t]*)\t(.*)/) {<br /> my ($ip, $rest) = ($1, $2);<br /> my ($country_code, undef, $country_name, $region, $city)<br /> = $gi->get_city_record($ip);<br /> print join("\t", $country_code||'', $country_name||'',<br /> $region||'', $city||'', $ip, $rest), "\n";<br /> }<br /> }<br /> |
22 | 32 | }}} |
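The Perl loop in the block above splits each input line at the first tab into an IP address and the rest of the record, looks up the IP's location, and prints the location fields ahead of the original data. A minimal Python sketch of the same step follows, for readers more comfortable with that language. It is an illustration and not part of the wiki page: `annotate` and its `lookup` callback are hypothetical names, and `lookup` stands in for the `$gi->get_city_record($ip)` call, assumed here to return only the four fields the Perl code keeps.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

# Hypothetical record shape: (country_code, country_name, region, city).
# Any field may be None when the lookup has no data for the address.
GeoRecord = Tuple[Optional[str], Optional[str], Optional[str], Optional[str]]


def annotate(lines: Iterable[str],
             lookup: Callable[[str], GeoRecord]) -> Iterator[str]:
    """Prefix each tab-separated record with location fields for its IP.

    `lookup` is an assumed stand-in for the geolocation call used by the
    Perl script; it is not a real library API.
    """
    for line in lines:
        line = line.rstrip("\n")                 # same role as Perl's chomp
        ip, sep, rest = line.partition("\t")     # split at the first tab only
        if not sep:
            continue                             # no tab: the Perl regex would not match
        country_code, country_name, region, city = lookup(ip)
        # Missing fields become empty strings, roughly like `$x || ''` in Perl.
        yield "\t".join([
            country_code or "",
            country_name or "",
            region or "",
            city or "",
            ip,
            rest,
        ])
```

To use it as a stream filter, one could pass `sys.stdin` as `lines` and print each yielded string. Unlike Perl's `||`, Python's `or` keeps the string `"0"`, so the two versions differ for a field with that exact value.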
| 33 | |
23 | 34 | * |
| 35 | |
24 | 36 | {{{ |
25 | 37 | #!html |