Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.
Created attachment 230043: apache-pig-0.6.0.ebuild
In my overlay... http://git.wolf31o2.org/gitweb/?p=overlays/wolf31o2.git;a=commit;h=3450cab566ce946f0f789eef6a98673728329402
Hello guys, as with apache-hadoop, I've reworked this package a bit to make it more straightforward to use in Gentoo. It is in my overlay as dev-lang/apache-pig-bin. Please check it out if you can.
Closing obsolete proposal. Clearly, packaging the Hadoop ecosystem is too much for the manpower we have, I'm afraid.