Devoxx - Day 1

Some of us ConSol Labs guys are attending this year’s Devoxx, the largest
Java conference in Europe. You can expect some blogging this week about the
state of Java, the newest trends and cool stuff in general.

The first two days of Devoxx are the University Days, with talks
covering a single topic in depth over three hours. The organization is
perfect for a conference of this size (3000+ attendees). The big
rush will start on Wednesday, though, when the “real” conference
begins. These two days are a good chance to warm up.

Productive Programmer (Neal Ford)

In this talk, Neal Ford from ThoughtWorks gave some insights into
good programming practices, approaching them from various angles. The
three-hour presentation was divided into two parts: first Neal
showed some mechanics for increased productivity, and after the break he
presented 10 real-world themes he has encountered in his daily work.

The Mechanics part focussed on four themes: Acceleration, Focus,
Canonicality and Automation. Most of this came out of his book The
Productive Programmer, with quite a few very concrete hints
ranging from tool support to recommendations on how to arrange one’s
environment.

My favourite (because new to me) snippets were:

  • Key promoter plugin
    for IntelliJ IDEA, which pops up the keyboard shortcut as soon as you
    use the equivalent menu item. It can even be configured to forbid
    using the menu if there is a shortcut for it and the
    user has been notified a certain number of times. A rigorous approach for sure, but
    I think it helps you to think in shortcuts.

  • Plugins for Windows Explorer and OS X Finder for blending in a
    command line window.

  • Use screen dimmers to avoid losing your flow, e.g. JediConcentrate
    for Windows.

  • Switch off balloon tips on Windows. Use the registry or Tweak UI.

  • Selenium IDE is
    not only good for end user testing but also for debugging. Since
    bugs usually occur in the middle of a web application, a certain
    number of repeated steps is necessary to reach them. Use Selenium
    IDE to record these steps once and replay them as often as needed
    until the bug is fixed. Selenium scripts are also good for
    reproducing bugs reported by the QA department.

In the second part, Neal talked about best practices collected from real
world experiences. This was a rather entertaining collection of
experiences, with short anecdotes like the angry monkeys experiment (from Dave
Thomas) and cargo cults. A very good start for this year’s Devoxx.

MongoDB (Alvin Richards)

This three-hour crash course on MongoDB started with the basics of
MongoDB, a NoSQL DB implemented in C++. It
is a document-style database which consciously abandons joins and
transactions. This way, scaling can easily be done by replication and
sharding without much synchronisation overhead. All this is done
transparently for the application developer. Documents are formulated
in JSON and can be arbitrarily complex. One of the nicest features is
that MongoDB’s “schema” can be changed dynamically on the fly. There
are many other functional aspects, which were briefly explained:

  • Map-Reduce on the DB layer
  • Indexing
  • Sophisticated queries including support for regular expressions
  • Single document inheritance
  • Many-Many relationships
  • findAndModify for atomic identification and update of documents
  • Monitoring support for Munin, Nagios and Cacti
  • Fine-grained durability options, selectable on the application
    level
  • slaveDelay for letting slaves be updated with a fixed delay,
    which helps in disaster recovery when an application or human does
    stupid things
  • Java bindings, either raw on a basic level or via Morphia with a
    JPA-like interface

Conclusion: If you need to persist large data sets with the
opportunity for horizontal scaling and if you don’t need hard
transaction support, MongoDB seems to be a valuable alternative to the
traditional RDBMS approach.

VisualVM (Jaroslav Bachorik)

This thirty-minute talk gave a crash course on programming plugins
for VisualVM, which is based on the
NetBeans RCP. It was a free-style demo without a safety net, but in the end
Jaroslav managed to get the plugin running, which visualized the CPU
frequency of a local Linux box. I think a VisualVM plugin for
Jolokia would be a nice addition to the
portfolio, and it shouldn’t be that hard.

Intelligent data analysis - Apache Mahout (Isabel Drost)

Mahout is a framework for machine learning that implements different algorithms for data mining applications. Typical use cases include pattern mining and data classification, for example mail classification, news topic discovery or recommendation systems.

Isabel gave a general introduction to the steps and challenges of machine learning, how the basic algorithms work and how Mahout employs Hadoop to deal with large data sets. Unfortunately, the scheduled half hour did not leave any time for a detailed, in-depth presentation of Mahout.

Hadoop Fundamentals: HDFS, MapReduce, Pig, and Hive (Tom White)

The cloud/NoSQL track started with the basics of Hadoop and finished by comparing two data analysis projects from the Hadoop ecosystem.
Hadoop provides redundant storage for massive amounts of data and a computation platform, both running on commodity and potentially unreliable hardware.

The base of Hadoop consists of two core parts:

  • Hadoop File System (HDFS) containing all data
    • Block-oriented (usual sizes: 64 MB or 128 MB)
    • Replicated (default factor: three)
    • Optimized for write once, read sequentially operations
    • Recently extended for ‘append’ operations
    • Original idea based on Google FS
    • Suited for millions of not too small files
    • Basic architecture
      • Highly available master node with primary/secondary name node
        tracking file blocks on data nodes and doing housekeeping
      • Multiple data nodes storing file data blocks
  • Map/Reduce algorithm for doing the crunching
    • Map/Reduce is an algorithm for parallel and distributed data processing, originally published by Google.
    • A job tracker is responsible for housekeeping, such as detecting and respawning potentially dead jobs.
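
To make the Map/Reduce idea a bit more concrete, here is a minimal single-process sketch in plain Java. It is not Hadoop code and uses no Hadoop API; the class name WordCountSketch and its methods are made up for illustration. It only mimics the three logical steps (map, shuffle, reduce) with the classic word count example, whereas Hadoop would distribute the map and reduce calls across the cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: an in-memory imitation of map, shuffle and reduce.
public class WordCountSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse all values of one key into a single result.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Runs the three phases one after another on a list of input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(emitted).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("to be or not to be", "To be")));
    }
}
```

The point of the split is that each map call only sees one line and each reduce call only sees the values of one key, so both can run in parallel on different nodes.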

Hive and Pig target data warehousing with different approaches.
Typical usage includes analyzing large log files, such as those produced by Apache HTTP Server.
Both use HDFS and Map/Reduce underneath.

Hive:

  • SQL-like scripting on structured files in HDFS
  • Schema, validated when reading but not when writing
  • JDBC API or Hive shell

Pig:

  • Tab-delimited data files in HDFS
  • No schema required
  • Custom API or Grunt shell
  • Supports local mode for testing

Groovy update, ecosystem, and skyrocketing to the cloud with App Engine and Gaelyk! (Guillaume Laforge)

This is another three-topics-in-one talk, starting with changes in recent Groovy releases, mentioning a few hot projects from the Groovy ecosystem and finishing with an introduction to Gaelyk.

Groovy Update: Past, Present and Future

A summary of my favorite changes. For details, check the Groovy JIRA.

  • Groovy 1.6
    • Java 5 annotations
    • AST (Abstract Syntax Tree) transformations
      Ability to change what is being compiled by the Groovy compiler, at compile time. Useful for recurring patterns in your code base and for removing boilerplate code.
      Example: @Immutable class Coordinates { Double lat, lng }
      The AST transformation extends the class with equals/hashCode/toString and more.
    • Grape
      @Grab(group = "org.mortbay.jetty", module = "jetty-embedded", version = "6.1.0")
      def startServer() {
         def src = new Server(8080)
         ...
      }
    • ExpandoMetaClass DSL
      Number.metaClass {
         multiply { ... }
      }
    • JMX Builder, in addition to the existing Groovy MBean
    • Multiple assignments and tuples for return values
      Example: Swap values with (a, b) = [b, a]

  • Groovy 1.7
    • Power asserts make testing more fun
    • AST viewer/builder
    • Annotations everywhere: imports, packages, variable declarations.
      Example: @Grab on import statements
    • Customize the truth: class Foo { boolean asBoolean() {..} } ; !new Foo()
    • XML<->String: StreamingMarkupBuilder for XmlSlurper
    • Improved currying (any parameter)
    • Anonymous inner/nested classes, for the sake of Java cut-and-paste compatibility
  • Groovy 1.8, release coming in January 2011
    • Closures: annotation parameters, memoization to remember previous results, trampoline and closure composition
    • Native JSON support including builder and parser
    • Modularization of Groovy (groovy-all.jar). Modules: test, jmx, swing, xml, sql, web, template … tools
    • Alignment with JDK 7/8: automatic resource management, generics diamond, language support for collections etc.
    • Enhanced DSL support: take(2.pills).of(aspirin).after(6.beer) vs. take 2.pills of aspirin after 6.beer
    • AST transformations: @Log, @ToString, @Canonical (equals/hashCode/)
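
For readers who have not met memoization and closure composition before, here is a rough Java analogue of the two ideas. This is not Groovy's implementation and not Groovy syntax; the class ClosureSketch and its memoize helper are made up for illustration, and the sketch assumes Java 8 or later and single-threaded use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: a Java analogue of memoization and closure composition.
public class ClosureSketch {

    // Wraps a function so that repeated calls with the same argument reuse the cached result.
    // Not thread-safe; a plain HashMap is enough for this sketch.
    static <A, R> Function<A, R> memoize(Function<A, R> fn) {
        Map<A, R> cache = new HashMap<>();
        return arg -> {
            if (!cache.containsKey(arg)) {
                cache.put(arg, fn.apply(arg));
            }
            return cache.get(arg);
        };
    }

    public static void main(String[] args) {
        int[] calls = {0}; // counts how often the underlying function really runs

        Function<Integer, Integer> slowSquare = n -> {
            calls[0]++;
            return n * n;
        };
        Function<Integer, Integer> fastSquare = memoize(slowSquare);

        fastSquare.apply(12);
        fastSquare.apply(12); // second call is answered from the cache
        System.out.println("underlying calls: " + calls[0]);

        // Composition: build a new function by chaining two existing ones.
        Function<Integer, Integer> plusOne = n -> n + 1;
        Function<Integer, Integer> squareThenPlusOne = fastSquare.andThen(plusOne);
        System.out.println(squareThenPlusOne.apply(3));
    }
}
```

Memoization only pays off for functions without side effects, since a cached result silently skips the original call.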

Groovy Ecosystem

Gaelyk

Gaelyk is a Groovy-based extension for the Google App Engine (GAE).
The framework uses Groovy’s servlet support (Groovlets) and Groovy templates for the view, and wraps the GAE services (mailing, image manipulation, datastore, memcache, …).

Author: Roland Huß
Categories: devoxx, java, development