Devoxx 2013, Day 3

On Wednesday Torsten, Christoph and Jan joined our team, and the conference kicked off at full blast: full keynote, full rooms, full toilets, empty coffee and much fun.

## “Lambda: A Peek Under the Hood” by Brian Goetz (mario)

On day three the ‘real’ conference started, so I went to the great talk by Brian Goetz
about the internals of lambdas in Java 8.

The funny thing is that I had had a great conversation with Fabian about the lambda implementation in the bar
‘Berlin’ in Antwerp. Brian Goetz was able to clarify our picture of lambdas.
Without going into detail about our views: we were both right on some points and wrong on others.

The truth about lambdas is that when you define a lambda, at first nothing happens at
all. No classes are loaded and no objects are created. In the bytecode, which you
can analyze with javap, you will only see the ‘invokedynamic’ instruction.

Invokedynamic lets the JVM decide how to interact with the lambda. This happens lazily on first
use. The current implementation creates an inner class, but a more efficient implementation is in
the queue for future releases. Mark Reinhold mentioned in a later talk that ‘inlining the code’
is his preferred solution.
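
To see this for yourself, a minimal sketch like the following can be compiled and inspected with `javap -c -p LambdaPeek`. The class and method names are made up for illustration and are not from the talk; the point is only that the lambda definition site shows up as an `invokedynamic` instruction.

```java
import java.util.function.Supplier;

// Hypothetical demo class, not from the talk.
final class LambdaPeek {

    // The lambda definition below compiles to an invokedynamic instruction;
    // javac moves the lambda body into a private synthetic method of LambdaPeek.
    static Supplier<String> shout(String text) {
        return () -> text.toUpperCase();
    }

    public static void main(String[] args) {
        // The invokedynamic site is only linked when shout() runs for the first time.
        System.out.println(shout("devoxx").get());
    }
}
```

Running the class prints the upper-cased text; the interesting part is the `javap` output for `shout`, not the program itself.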

One of the key statements of Brian Goetz in this talk was that the first, obvious solution
is mostly wrong. So he explained how the current specification was created
and why some ideas were used and some were thrown away.

Now I am very much looking forward to Java 8 and I can’t wait to use it in production.

## “Teaching Computer Science with Music” by Sam Aaron (fabian)

After having attended the obligatory Lambda talks this morning, I decided to go off-topic, and went to “Teaching Computer Science with Music” by Sam Aaron.

Sam and his band performed at the opening event this morning, featuring the audio output of their algorithms on Raspberry Pis, and using DIY hardware controllers. So I was very curious what his talk would be about.

Sam’s research is on how to teach children the excitement of programming. We “shouldn’t just be teaching our children office skills such as formatting Word documents”, Sam says. The key to the excitement most software developers feel is that they are able to take their ideas, put them in code, and make them become reality. Sam’s goal is to share this experience with school children.

One of the tools Sam uses is a Clojure-based library for sound programming, allowing children to have fun with synthesized music in just a few lines of code.

I found the talk very refreshing, not only because the room was far less crowded than the rooms for the Lambda talks. Programming Lisp on Emacs is definitely not one of the mainstream topics in today’s IT world, and developing Hungarian minor piano scales is unlikely to generate business value on my next job. However, at a Java conference sponsored by industry giants like Oracle, Microsoft, and Google, it is great to see somebody doing stuff for fun.

Thank you Devoxx for having this talk.

## “Fault tolerance made easy” by Uwe Friedrichsen (christoph)

The code we write has to go into production someday. And production matters! In a production environment your application code faces a whole different type of issue, like high availability and resilience, which leads to fault tolerance. Uwe Friedrichsen shows how to write good production code and provides seven patterns for solving common issues in production environments.

These fault tolerance implementations seem to matter to a majority of developers out there, as the conference room was overcrowded and not everyone was able to grab a seat.

So let’s start with the concrete use case scenarios and patterns that are discussed in this talk.

First use case is the well known situation of unavailable resources, such as database resources.

Pattern 1: Timeouts

This is a standard problem: you have some database lock caused by several threads accessing a blocked resource within the database, so the resource is blocked, too. The conference talk is not about how to solve the database issues and deadlocks, but about how to handle the resulting error on the application server in production.

Showing just an error page with the message “try again later!” is obviously not very good. This is like standing on a train platform waiting for a train not knowing when or if at all the train will actually arrive.

We can handle this error situation of an unavailable blocking resource with timeouts. The blocking action can be designed as callable in combination with an executor service. While running the callable a timeout is set to limit the time we wait for the blocking action to return. This is standard Java platform capability and not very hard to implement.

The Guava library provides an even more convenient API for doing this with less verbose code (e.g. the callWithTimeout method on a TimeLimiter).

Now what should you do when the timeout exception is actually raised? Catch it, for sure, and handle the situation so the caller gets a defined state.
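
A minimal sketch of the plain Java variant, using only `java.util.concurrent`. The class name `TimeoutCall`, the method `callOrFallback` and the idea of returning a fallback value are assumptions made for illustration, not code from the talk.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not from the talk.
final class TimeoutCall {

    // Runs the blocking action on the executor and waits at most timeoutMillis.
    // On timeout or failure the fallback is returned, so the caller gets a defined state.
    static <T> T callOrFallback(ExecutorService executor, Callable<T> action,
                                long timeoutMillis, T fallback) {
        Future<T> future = executor.submit(action);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker that is still blocked
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the action itself threw an exception
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return fallback;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            String result = callOrFallback(executor, () -> {
                Thread.sleep(5_000); // simulates a blocked database call
                return "row";
            }, 100, "unavailable");
            System.out.println(result);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Whether a fallback value, a dedicated exception or an error page is the right "defined state" depends on the caller; the sketch only shows where that decision is made.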

What represents a good timeout setting?

  • Determine timeout duration
  • Make timeouts configurable (dynamic without restart of application)
  • Self-adapting timeouts (measure times and calculate 99.9% of time consumed)
  • Timeouts in JavaEE containers

Pattern 2: Circuit breaker

The circuit breaker pattern describes the fail fast idiom. After failing fast, the code returns to a defined state as soon as possible.

As an example, some client tries to access some kind of resource on a server. We put a circuit breaker between the client and the server resource. The circuit breaker decides when to fail fast by keeping track of the failures of the resource it guards. A possible implementation is a state machine with a failure counter and a “threshold reached” state. To reset the breaker (the half-open state), it checks the resource again and interprets the result (success or failure), so the circuit breaker can recover automatically from the unavailable (open) state back to the normal (closed) state.
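
The state machine can be sketched roughly as follows. This is an illustration under assumptions, not the implementation shown in the talk: the class name, the fallback value and the time-based retry are invented here, and the states use the conventional names (closed = calls pass through, open = failing fast, half-open = one trial call).

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch of a circuit breaker as a small state machine.
final class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long retryAfterNanos;
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAtNanos = 0L;

    CircuitBreaker(int failureThreshold, long retryAfterMillis) {
        this.failureThreshold = failureThreshold;
        this.retryAfterNanos = TimeUnit.MILLISECONDS.toNanos(retryAfterMillis);
    }

    synchronized State state() {
        return state;
    }

    // synchronized keeps the sketch thread safe, at the price of serializing all calls.
    synchronized <T> T call(Supplier<T> action, T fallback) {
        if (state == State.OPEN) {
            if (System.nanoTime() - openedAtNanos < retryAfterNanos) {
                return fallback; // fail fast without touching the resource
            }
            state = State.HALF_OPEN; // retry interval elapsed: allow one trial call
        }
        try {
            T result = action.get();
            failures = 0;
            state = State.CLOSED; // success: back to normal operation
            return result;
        } catch (RuntimeException e) {
            failures++;
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN; // threshold reached or trial call failed
                openedAtNanos = System.nanoTime();
            }
            return fallback;
        }
    }
}
```

A production version would also need to distinguish failure types and expose its state for monitoring, which is exactly what the follow-up points of the talk are about.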

Points to go on from this:

  • Thread safety
  • Failure types
  • Tuning circuit breakers (monitoring, configurable, dynamic measurement)
  • Available implementations

Pattern 3: Fail fast guard

A client calls an expensive action (lots of resources needed). Always check up front that all required resources are available, and fail fast if one of them is not. The talk describes an implementation called fail fast guard, which again stands between the client and the server resource. The fail fast guard works with a set of circuit breakers and checks that all of them report their resource as available (closed state) before the expensive action is performed.
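
As a rough sketch, the guard can be reduced to a list of availability checks that all have to pass before the expensive action runs. The class name and the use of plain `BooleanSupplier` checks (instead of real circuit breakers) are simplifying assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: each check stands in for one circuit breaker
// reporting whether its resource is currently available.
final class FailFastGuard {
    private final List<BooleanSupplier> availabilityChecks;

    FailFastGuard(List<BooleanSupplier> availabilityChecks) {
        this.availabilityChecks = new ArrayList<>(availabilityChecks);
    }

    // Runs the expensive action only if every required resource is available.
    <T> T call(Supplier<T> expensiveAction) {
        for (BooleanSupplier check : availabilityChecks) {
            if (!check.getAsBoolean()) {
                throw new IllegalStateException("required resource unavailable, failing fast");
            }
        }
        return expensiveAction.get();
    }
}
```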

For the next set of patterns we deal with a different type of use case which is called the “site too successful exception”. Simply this is some overload situation on the server caused by heavy load (such as high number of concurrent client requests):

If there is too much action on the server, too much load, every user gets annoyed and keeps hitting the reload button to make the server go faster ;-)

The result is that all users do experience the slow server with possible session timeouts and total failure for users that were fine before the overload situation.

Pattern 4: Shed load

Too many clients should be held back by a gate keeper. The gate keeper is connected to a monitoring server, which it asks for the actual load data. The gate keeper decides to shed new requests immediately, keeping the load away from the server. All requests that would overload the server get shed. Those requests that are within the performance range go through smoothly.

It is important to mention that the gate keeper should not be located on the same application server as the monitored resource! A proxy server would be a better location for it.
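
The core decision of such a gate keeper fits in a few lines. The sketch below assumes a hypothetical load figure between 0 and 1 delivered by some monitoring source and a fixed threshold; both are assumptions for illustration, and a real shedder would likely be session aware and non-linear, as the talk points out.

```java
import java.util.Optional;
import java.util.function.DoubleSupplier;
import java.util.function.Supplier;

// Hypothetical gate keeper: sheds requests while the reported load is too high.
final class LoadShedder {
    private final DoubleSupplier currentLoad; // assumed: load in [0, 1] from monitoring
    private final double maxLoad;

    LoadShedder(DoubleSupplier currentLoad, double maxLoad) {
        this.currentLoad = currentLoad;
        this.maxLoad = maxLoad;
    }

    // Returns an empty Optional if the request was shed; the request must not return null.
    <T> Optional<T> handle(Supplier<T> request) {
        if (currentLoad.getAsDouble() >= maxLoad) {
            return Optional.empty(); // shed immediately, the server is never touched
        }
        return Optional.of(request.get());
    }
}
```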

How to go on from that:

  • Shedding strategy (Session aware, non-linear)
  • Retrieving load (Polling for monitoring load in separate thread)
  • Tuning load shedders (configuration, dynamic measurement)
  • Alternative strategies (not to shed but)

Pattern 5: Deferrable Work

Routine work going on in the background constantly creates load on the server (batch jobs, maintenance jobs). The total server load is always the combination of routine load and client request load. The request load might be fine when monitored in isolation, but in combination with the routine load we have an overload situation which slows down the server. The idea is to dynamically decrease the routine load as soon as increasing client request load brings us closer to the overload situation.

In most cases we can easily delay the routine work without any bad influence on the client request functionality. So when we are close to an overload situation, we decrease or postpone the routine work dynamically.
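
One simple way to express this is a function that maps the current load to a delay for the next piece of routine work. The sketch below uses a linear ramp between two thresholds; the names, the thresholds and the linear shape are assumptions for illustration only.

```java
// Hypothetical sketch: the higher the load, the longer routine work is postponed.
final class DeferrableWork {

    // Returns 0 below startDeferringAt, maxDelayMillis at or above overloadAt,
    // and a linearly growing delay in between.
    static long delayMillis(double load, double startDeferringAt,
                            double overloadAt, long maxDelayMillis) {
        if (load <= startDeferringAt) {
            return 0L;
        }
        if (load >= overloadAt) {
            return maxDelayMillis;
        }
        double fraction = (load - startDeferringAt) / (overloadAt - startDeferringAt);
        return Math.round(fraction * maxDelayMillis);
    }

    public static void main(String[] args) {
        // Halfway between the thresholds the batch job waits half the maximum delay.
        System.out.println(delayMillis(0.75, 0.5, 1.0, 1_000));
    }
}
```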

Points to improve this pattern:

  • Delay strategy (non-linear)
  • Retrieving Load
  • Tuning (configurable, dynamic measurement, manual interaction)

The next use case discussed is the “I can hardly hear you” situation. This leads to another set of patterns that we get to right now.

Mean bastard applications do sometimes not respond as usual. From time to time the performance is weak. A restart solves the problem for a period of time. But the problem is coming back constantly and restarts are expensive and can cause other severe damage (e.g. loss of data). The problem might get worse or it might recover, nobody knows.

Pattern 6: Leaky bucket

Each occurrence of the problem adds content to a leaky bucket. From time to time the leak removes content from the bucket. If the errors occur too often within a short period of time, the bucket overflows and the special error handling takes place.

The bucket has a total capacity and fills up over time while the leak strategy constantly removes content. Overflow only happens in strong error situations, so on overflow you can be confident that you have a serious error situation that has to be handled now.
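
A minimal sketch of such a bucket as an error counter. The class and method names are hypothetical; the assumption is that `leak()` gets called periodically from outside, for example by a scheduled task.

```java
// Hypothetical sketch of a leaky bucket used as an error rate detector.
final class LeakyBucket {
    private final int capacity;
    private int level = 0;

    LeakyBucket(int capacity) {
        this.capacity = capacity;
    }

    // Records one error occurrence; returns true if the bucket is overflowing.
    synchronized boolean fill() {
        if (level <= capacity) {
            level++; // cap the level just above capacity so it cannot grow forever
        }
        return level > capacity;
    }

    // Removes one unit; intended to be called at a fixed rate by a timer.
    synchronized void leak() {
        if (level > 0) {
            level--;
        }
    }
}
```

With a capacity of 2, the third error in a row without a leak in between reports an overflow; sporadic errors that are spaced further apart than the leak interval never do.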

Points to improve this pattern:

  • Thread safety
  • Leaking strategies
  • Tuning Leaky Bucket (configuration, measurement)
  • Available implementations

Pattern 7: Limited retries

In this pattern we deal with transient errors that can be fixed with a retry (e.g. a network error). The action is made recoverable by a simple loop that tries it again and again until it succeeds. We only raise an error when the maximum number of retries is reached.

  • Idempotent actions (no additional side effects on repeated work)
  • Closures/Lambdas
  • Tuning retries (configuration, measurement)
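
The retry loop itself is short, especially with lambdas. The following is a minimal sketch with hypothetical names; it assumes the action is idempotent and treats every `RuntimeException` as transient, which a real implementation would want to narrow down.

```java
import java.util.function.Supplier;

// Hypothetical retry helper, not from the talk.
final class Retry {

    // Tries the action up to maxAttempts times and rethrows the last failure.
    static <T> T withRetries(int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // maximum number of attempts reached
    }
}
```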

Summing up, the presentation stresses that all these fault tolerance patterns are very easy to implement. There are for sure some more complicated patterns, like

  • Complete parameter checking
  • Marked data
  • Routine audits

I leave you with these patterns for evaluation and research on your own, and I hope you enjoyed the thoughts on fault tolerance as much as I did. Great talk!

## “Vert.x 2.0” by Tim Fox (christoph)

I learned about Vert.x at last year’s Devoxx conference and I have to say that this technology was one of my favorite takeaways from Devoxx 2012. Now I am excited to hear Tim Fox speak about Vert.x 2.0 again.

What is Vert.x?

A lightweight, reactive, polyglot, modular application platform. Vert.x believes in asynchronous communication patterns with passion and provides highly available, loosely coupled components.

Superficially Vert.x is similar to Node.js - for the JVM. Vert.x provides asynchronous APIs for high performance. You can embed Vert.x as a library in your Java platform application. It fully supports a significant variety of JVM languages, such as Java, JavaScript, JRuby and Groovy, for writing Vert.x applications. Support for Scala and others is still in beta.

In Vert.x, clients and servers connect via TCP/SSL or HTTP/HTTPS. Furthermore there is full support for WebSockets and SockJS. The Vert.x event bus is probably the most significant feature you want to learn about.

Combining asynchronous processing with OS threads on modern servers to handle many connections simultaneously is still a valuable thing to do. So how do you write a Vert.x application? Everything in Vert.x runs inside a unit of execution called a Verticle.

Verticles are generally single threaded. Vert.x makes sure that a Verticle is never executed by more than one thread at a time, which gives you the confidence of not having to worry about race conditions and stuff like that.

Verticles can be written in any Vert.x supported language (Java, Groovy, JavaScript, etc.). In general Verticles do communicate with each other by exchanging messages over the event bus.

Tim shows some easy demo code, such as a “Hello World” HTTP server setup in Vert.x. It is very impressive how fast and easily you can set up a full HTTP server with Vert.x. The Verticle implementations shown are coded in JavaScript and Java.

Now let’s get to the Vert.x event bus. As already mentioned, Verticles send messages over the event bus. You can pass strings, buffers, primitive types and JSON over the event bus. Things get interesting when the event bus is clustered across multiple Vert.x JVM instances in different networks. Vert.x applications are loosely coupled in the distributed network.

Speaking of this, Vert.x can be used for client and server code. So the event bus can also be extended to the client side, for instance to JavaScript code running in a browser.

The next demo shows Vert.x in cluster mode running with Verticles as sender and receiver for exchanging messages across the network with several Vert.x JVMs using JavaScript, Java and Scala implementations. It is so easy to start a Vert.x server within client JavaScript code running in a browser.

Modules in Vert.x are runnable encapsulations of one or more Verticles that can have dependencies on other modules. A module descriptor tells Vert.x what is included in the module and which Verticles to start when running the module. Modules follow an artifact strategy similar to the one defined in Maven. You can also push Vert.x modules to a Maven repository.

Vert.x module dependencies can be resolved at build time and run time.

Vert.x modules can be converted to a “fat jar”, which is a big Java archive containing everything needed to run the Vert.x module on the JVM. The Vert.x runtime does not have to be installed on the machine; you just need the JDK. This is a cool way to push Vert.x to machines where you do not have the opportunity to install the Vert.x infrastructure.

Another great feature that comes with Vert.x is a high availability mode with automatic failover of Verticles running on Vert.x JVM instances. So if a JVM instance with running Verticles is killed for some reason, another Vert.x JVM instance takes over immediately. As long as one Vert.x JVM instance is alive in the network, the Verticles will be alive.

These are the cool new features in Vert.x 2.0. In combination with the great way of handling asynchronous communication on the event bus in Vert.x I definitely want to do some Vert.x project in the future!

## “The Crazyflie Nano Quadcopters Development Platform” by (torsten)

The talk was about the really tiny Crazyflie quadcopter that measures only about 9cm from rotor to rotor.

The quadcopter started as a hobby project of three Swedish guys. They all worked in a company that was funding such hobby research projects. So they started with a “huge” budget of 1000 Euros.

The first prototype was built with components bought over the internet, such as toy plane motors, a printed circuit board, sensors etc. They literally baked the parts together in an ordinary frying pan. After about six months the goal of creating a flying quadcopter had been reached.

They made a video, put it online and stopped development. The attention and demand created by the video led to the decision to create an open source quadcopter platform and make it available as a dev kit.

It took six months from the idea to the prototype and another two years from the platform idea to actual manufacturing.

The Crazyflie guys then talked about the electronics, PCB, sensors, stabilizing software etc. of the quadcopter.

After that the best part of the talk started: the demo. The Crazyflie quadcopter flying around the cinema at quite a speed, with a cam attached that streamed live to the cinema screen, was really impressive.

For me the biggest achievement is that they managed to turn a prototype into a real product that can actually be bought. I think this is the real challenge of such a project. And of course the demo was absolutely awesome!

## “Cryptographic operations in the browser” by Nick Van den Bleeken (jan)

On day 3 of Devoxx, Nick Van den Bleeken, member of the Web Cryptography Working Group at W3C, gave a one-hour introduction to the upcoming Web Cryptography API.
This native low level API is to be implemented by all major browsers in the near future and provides all kinds of cryptography related functionalities including authentication, encryption/decryption and signing/verifying documents.

This will make it easy to implement secure client-side operations, like storing encrypted documents on the web or providing public-key based message transfer with the keys managed on the clients.

While Internet Explorer 11 only implements an older version of the API, Chrome only implements some of the functions, and Firefox and Safari have no public implementation at all yet, PolyCrypt is a non-native, pure JavaScript implementation of the API in development.
But keep in mind that a JS-only implementation will not be on par with a native one, performance- and security-wise, so it will take a while until the API is established well enough that you can rely on it completely. By then, though, I also expect other JavaScript frameworks to provide an even more user-friendly layer around the API, well adapted to the respective use cases.

## “UI Engineer - the missing profession” by Dierk König (roland)

First Dierk tries to define what makes up a “UI Engineer”. Besides the normal ‘software engineering’ discipline, even as an engineer you need to know the basics of graphic design, or UI design in general. Tools are important as well. And so on. Everything is shown as parts of a mindmap.

Frankly, I don’t like the idea of using a mindmap as a slideshow and guiding the audience through it. Although a mindmap has a certain structure, it’s still some sort of brainstorming model for the author, understandable in depth only by the author. It’s not ‘dense’ enough.

Then Dierk developed a quite elaborate theory about deconstructing the classical MVC pattern, which is quite complex (well, I didn’t get that).

Main UI concepts are given next:

  • No view knows any other views (but there can be exceptions for composed views)
  • No controller knows any view (but communicates with the view via the Model)
  • Views can only “ping” the controller
  • Views always work in the UI thread (but what’s the UI thread in a Web application?)
  • Controllers work outside the UI thread (but again, what is the concept of a UI thread for a Web application?)

One concrete recommendation for an architecture is the “Shared Representation” pattern, where the shared representation is the model (and view and controller run on client and server respectively). For multiple views or clients, an event bus can be used for synchronising the model. This should work for a Web application, too, if you use a dedicated framework for that, like Open Dolphin.

The “every app in the browser” mantra is OK as long as no privacy constraints are involved, he says. Hmm, the given example involving access from Google Maps is IMO not an intrinsic issue of SPAs. You can do this in a perfectly safe way in the browser. And BTW, what is different when your JavaFX client integrates external services like map data for a given address (even when it delegates this to its server)? “Everything displayed in the browser is inherently unsafe”. Ehm. What about rich clients transferring data to an outside service? Isn’t that as unsafe as in the browser? An SPA that keeps only view state on the client (and calls the server for business/application logic) is IMO a perfectly valid and secure approach. And even rich clients need some sort of view state.

The talk suffered from a lack of real-world examples in the first half, and restricted itself almost exclusively to rich clients. Of course, some concepts can be applied to Web apps as well, but others cannot. Probably my expectations of this talk were wrong and I misunderstood the abstract, in that I thought UI is more than FX or Swing.

## “Building Hadoop Big Data Applications” by Tom White (torsten)

Note: This is a Quick Blog Post

Hadoop is an open source system to reliably store and process large volumes of data using commodity hardware - or in short: “Hadoop is a distributed infrastructure for Big Data”

Hadoop has developed from a “batch processing system” to a Hadoop Stack containing various components such as:

  • Avro: Cross-language data serialization library
  • Hive: Data warehouse and SQL engine
  • HCatalog: Metadata storage system (part of Hive)
  • Flume: Streaming log capture and delivery system
  • Oozie: Workflow scheduler system (cron for Hadoop)
  • Crunch: Java API for writing data pipelines
  • Parquet: Column-oriented storage format for nested data
  • Impala: Interactive SQL on Hadoop

There are some typical pain points coming with Hadoop:

  • Choosing a File Format - Hadoop supports a variety of file formats but it is sometimes hard to find the fitting one.
  • Defining a Data Model - Schema on read (Hadoop) vs. Schema on write (relational DB)
  • Defining a file layout on disk

But fortunately there are some best practices to the rescue:

  • Use Avro Schemas for the data model
  • Use Avro Data Files for row-oriented data
  • Use Parquet for column-oriented data
  • Use a Hive/HCatalog compatible file layout
  • Use a library like Crunch for batch analysis
  • Use Impala for interactive analysis

Tom White describes a typical event capture and analysis system:
> A WebApp generates events that are stored. A regular job grabs the event data and processes it into reporting data, which can then be queried for analysis.

With Hadoop this can be implemented this way:
The WebApp logs data using Log4J. A Flume logger writes this data into HDFS (Hadoop Distributed File System).
An Oozie scheduled Crunch Job transforms the event data into reporting data e.g. on an hourly basis.
The reporting data can then be queried using Impala for ad-hoc reporting.

Mr. White showed some simple code examples to implement this system. Doesn’t seem to be too complex to start with.

The talk was good but not very exciting. Anyway, what I took with me is that the next time I come across logfile storage and analysis I will consider Hadoop as a solution.
Actually this could have been a good solution for the request tracking problem with my current project. Too late …

Author: Roland Huß
Categories: devoxx, development