Channel: PIG how to count a number of rows in alias - Stack Overflow

Answer by Javier Bañez for PIG how to count a number of rows in alias


What you want is to count all the rows in a relation (a dataset, in Pig Latin terms).

This is easy to do with the following steps:

logs = LOAD 'log'; -- relation called logs, using PigStorage with tab as field delimiter
logs_grouped = GROUP logs ALL; -- gives a relation with one row, with logs as a bag
number = FOREACH logs_grouped GENERATE COUNT_STAR(logs);
DUMP number; -- show me the number

Kevin's point is important: if we used COUNT instead of COUNT_STAR, we would get only the number of rows whose first field is not null.
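To make the difference concrete, here is a minimal sketch that computes both counts side by side. The field names (host, msg) and the schema are assumptions for illustration only; the input is assumed to be the same tab-delimited 'log' file, where some rows may have an empty first field (which PigStorage loads as null).

```pig
-- Hypothetical schema: host and msg are illustrative names, not from the question.
logs = LOAD 'log' AS (host:chararray, msg:chararray);
logs_grouped = GROUP logs ALL;
counts = FOREACH logs_grouped GENERATE
    COUNT(logs) AS rows_with_first_field,  -- skips tuples whose first field is null
    COUNT_STAR(logs) AS all_rows;          -- counts every tuple in the bag
DUMP counts;
```

If no row has a null first field, the two numbers should match; any gap between them is the number of rows that COUNT skipped.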

I also like Jerome's one-line syntax, which is more concise, but to be didactic I prefer to split it into separate statements and add some comments.

In general I prefer:

numerito = FOREACH (GROUP CARGADOS3 ALL) GENERATE COUNT_STAR(CARGADOS3);

over

name = GROUP CARGADOS3 ALL;
number = FOREACH name GENERATE COUNT_STAR(CARGADOS3);
