Verbose

I do most of my development in Python and Scala these days. I had totally forgotten how verbose Java is until I saw some Apache Spark coding examples.

The code in Python:

file = spark.textFile("hdfs://...")
errors = file.filter(lambda line: "ERROR" in line)
# Count all the errors
errors.count()
# Count errors mentioning MySQL
errors.filter(lambda line: "MySQL" in line).count()
# Fetch the MySQL errors as a list of strings
errors.filter(lambda line: "MySQL" in line).collect()
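
If you don't have a cluster handy, here is a rough plain-Python sketch of what those three calls compute, using the built-in filter() on an in-memory list. The sample log lines are made up for illustration, and nothing here is distributed or lazy the way Spark is; it only mirrors the filtering logic.

```python
# Hypothetical sample log lines standing in for the HDFS file.
lines = [
    "INFO starting up",
    "ERROR MySQL connection refused",
    "ERROR disk full",
    "WARN slow query",
    "ERROR MySQL timeout",
]

# Same predicate as the Spark example, applied to a local list.
errors = list(filter(lambda line: "ERROR" in line, lines))

# Count all the errors
error_count = len(errors)

# Keep only the errors mentioning MySQL, then count them
mysql_errors = list(filter(lambda line: "MySQL" in line, errors))
mysql_count = len(mysql_errors)

print(error_count, mysql_count)
```

The lambdas are the same ones passed to Spark above, which is a big part of why the Python version stays so short.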

The same code in Scala:

val file = spark.textFile("hdfs://...")
val errors = file.filter(line => line.contains("ERROR"))
// Count all the errors
errors.count()
// Count errors mentioning MySQL
errors.filter(line => line.contains("MySQL")).count()
// Fetch the MySQL errors as an array of strings
errors.filter(line => line.contains("MySQL")).collect()

And finally in Java:

JavaRDD<String> file = spark.textFile("hdfs://...");
JavaRDD<String> errors = file.filter(new Function<String, Boolean>() {
  public Boolean call(String s) { return s.contains("ERROR"); }
});
// Count all the errors
errors.count();
// Count errors mentioning MySQL
errors.filter(new Function<String, Boolean>() {
  public Boolean call(String s) { return s.contains("MySQL"); }
}).count();
// Fetch the MySQL errors as a list of strings
errors.filter(new Function<String, Boolean>() {
  public Boolean call(String s) { return s.contains("MySQL"); }
}).collect();

Nearly twice as many lines of code: 14 in Java versus 8 in Python or Scala.
