- Timestamp: Aug 14, 2013, 10:38:43 PM
- Author: jazz
- Comment: --
| v6 | v7 | |
| 2 | 2 | |
| 3 | 3 | * http://blog.cloudera.com/blog/2009/06/analyzing-apache-logs-with-pig/ |
| | 4 | |
| 4 | 5 | * |
| | 6 | |
| 5 | 7 | {{{ |
| 6 | 8 | #!html |
| … | … | |
| 9 | 11 | |
| 10 | 12 | * |
| | 13 | |
| 11 | 14 | {{{ |
| 12 | 15 | #!html |
| … | … | |
| 15 | 18 | |
| 16 | 19 | * |
| | 20 | |
| | 21 | {{{ |
| | 22 | # pig -x mapreduce -f scripts/blogparse.pig -param LOGS='/mirror.cloudera.com/logs/access_log.*' |
| | 23 | }}} |
| | 24 | |
| | 25 | * |
| | 26 | |
| 17 | 27 | {{{ |
| 18 | 28 | #!html |
| … | … | |
| 21 | 31 | <p>while (<>) {<br /> chomp;<br /> if (/([^\t]*)\t(.*)/) {<br /> my ($ip, $rest) = ($1, $2);<br /> my ($country_code, undef, $country_name, $region, $city)<br /> = $gi->get_city_record($ip);<br /> print join("\t", $country_code||'', $country_name||'',<br /> $region||'', $city||'', $ip, $rest), "\n";<br /> }<br /> }<br /> |
| 22 | 32 | }}} |
| | 33 | |
| 23 | 34 | * |
| | 35 | |
| 24 | 36 | {{{ |
| 25 | 37 | #!html |