Monday, October 6, 2014

Java 8 Collectors for Guava Collections

Java 8 comes with the Stream API, which divides data processing into two phases: intermediate operations and a terminal operation.
The terminal operation can take a few different forms; I would like to concentrate on reduction to collections, especially to Guava immutable collections.
A reducing terminal operation requires a collector which gathers the data and returns it as the required structure, but Guava (at the time of writing) does not provide such a collector. In order to create a Guava collection out of a stream, we first have to reduce the stream into a temporary collection and then transfer it:
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.toList;
import com.google.common.collect.ImmutableSortedSet;

...stream.map(...).filter(....).
     collect(collectingAndThen(toList(), ImmutableSortedSet::copyOf));
Reduction of our stream stores the results in a temporary List (Collectors.toList()). Once stream processing is done, the finisher function converts the content of this List into a Guava collection (ImmutableSortedSet::copyOf).

The problem with this approach is that we have an extra conversion loop and two arrays in memory (List and Builder). This could be avoided if we had a collector based on Guava's Builder. So I've implemented one; once we use it, the code above can be simplified to:
import static org.cyclop.common.Gullectors.toNaturalImmutableSortedSet;
import com.google.common.collect.ImmutableSortedSet;

...stream.map(...).filter(....).collect(toNaturalImmutableSortedSet());

The code is straightforward, so let's concentrate on the implementation of #toNaturalImmutableSortedSet():
public static <T extends Comparable<?>> 
  Collector<T, ?, ImmutableSortedSet<T>> toNaturalImmutableSortedSet() {
    
  Supplier<ImmutableSortedSet.Builder<T>> supplier = ImmutableSortedSet::naturalOrder;

  BiConsumer<ImmutableSortedSet.Builder<T>, T> accumulator = (b, v) -> b.add(v);

  BinaryOperator<ImmutableSortedSet.Builder<T>> combiner = (l, r) -> l.addAll(r.build());

  Function<ImmutableSortedSet.Builder<T>, ImmutableSortedSet<T>> finisher = 
      ImmutableSortedSet.Builder::build;

  return Collector.of(supplier, accumulator, combiner, finisher);
}

Our collector is created by the factory method Collector#of, which takes four arguments:
  • #supplier - this function creates the structure that will collect the stream results - in our case it's the Builder from ImmutableSortedSet. For a sequential stream it is called only once
  • #accumulator - this function is executed for each element that reaches the terminal operation, meaning each element that went through the stream and should be collected. In our case it calls #add(v) on the Builder created by the first argument (#supplier)
  • #combiner - this one is not used in our example, but it's necessary for processing parallel streams, where it merges the partial results
  • #finisher - this is the final step, executed after stream processing is done. The elements returned by the stream are contained in the Builder (#supplier), and in this last phase we call #build() on it, which results in an ImmutableSortedSet!
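
To try out the four functions without Guava on the classpath, here is a minimal JDK-only sketch. It is not the Gullectors code: a TreeSet stands in for Guava's Builder, Collections#unmodifiableSortedSet stands in for #build(), and the names SortedSetCollectorSketch and toUnmodifiableSortedSet are made up for illustration.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

// Hypothetical helper for illustration only, not part of Gullectors or Guava.
class SortedSetCollectorSketch {

    static <T extends Comparable<? super T>>
            Collector<T, ?, SortedSet<T>> toUnmodifiableSortedSet() {

        // supplier: creates the mutable container (the TreeSet plays the Builder role)
        Supplier<TreeSet<T>> supplier = TreeSet::new;

        // accumulator: called for every element that reaches the terminal operation
        BiConsumer<TreeSet<T>, T> accumulator = TreeSet::add;

        // combiner: merges two partial containers, used by parallel streams
        BinaryOperator<TreeSet<T>> combiner = (left, right) -> {
            left.addAll(right);
            return left;
        };

        // finisher: wraps the container in a read-only view (stand-in for build())
        Function<TreeSet<T>, SortedSet<T>> finisher = Collections::unmodifiableSortedSet;

        return Collector.of(supplier, accumulator, combiner, finisher);
    }
}
```

Collecting Arrays.asList(3, 1, 2, 1).parallelStream() with this collector gives a sorted set containing 1, 2 and 3. Because the stream is parallel, the supplier may be called more than once and the combiner merges the partial sets.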

Based on this pattern we can implement other collectors:
public static <T> Collector<T, ?, ImmutableList<T>> toImmutableList() {
    Supplier<ImmutableList.Builder<T>> supplier = ImmutableList.Builder::new;
    BiConsumer<ImmutableList.Builder<T>, T> accumulator = (b, v) -> b.add(v);
    BinaryOperator<ImmutableList.Builder<T>> combiner = (l, r) -> l.addAll(r.build());
    Function<ImmutableList.Builder<T>, ImmutableList<T>> finisher = 
        ImmutableList.Builder::build;

    return Collector.of(supplier, accumulator, combiner, finisher);
}

public static <T> Collector<T, ?, ImmutableSet<T>> toImmutableSet() {
    Supplier<ImmutableSet.Builder<T>> supplier = ImmutableSet.Builder::new;
    BiConsumer<ImmutableSet.Builder<T>, T> accumulator = (b, v) -> b.add(v);
    BinaryOperator<ImmutableSet.Builder<T>> combiner = (l, r) -> l.addAll(r.build());
    Function<ImmutableSet.Builder<T>, ImmutableSet<T>> finisher = 
        ImmutableSet.Builder::build;

    return Collector.of(supplier, accumulator, combiner, finisher);
}

public static <T, K, V> Collector<T, ?, ImmutableMap<K, V>> toImmutableMap(
        Function<? super T, ? extends K> keyMapper,
        Function<? super T, ? extends V> valueMapper) {

    Supplier<ImmutableMap.Builder<K, V>> supplier = ImmutableMap.Builder::new;
    BiConsumer<ImmutableMap.Builder<K, V>, T> accumulator = 
         (b, t) -> b.put(keyMapper.apply(t), valueMapper.apply(t));
    BinaryOperator<ImmutableMap.Builder<K, V>> combiner = (l, r) -> l.putAll(r.build());
    Function<ImmutableMap.Builder<K, V>, ImmutableMap<K, V>> finisher = 
       ImmutableMap.Builder::build;

    return Collector.of(supplier, accumulator, combiner, finisher);
}

Finally, here is the source code: Gullectors.java
and the unit tests: TestGullectors.java

Friday, May 30, 2014

Wicket's Data Grid based on plain Iterator

In order to use the Data Grid from Wicket you have to implement a Data Provider:
public interface IDataProvider<T> extends IDetachable {

    Iterator<? extends T> iterator(long first, long count);

    long size();

    IModel<T> model(T object);
}

This means that you have to provide a separate Iterator for each page, containing only the elements required to render it. You also have to know the total number of elements, because it is used to calculate the number of pages.

This interface has been designed with an SQL database in mind, and you have to implement all methods as they are. The whole component is not meant to be customised - everything is either final or private.

But I had some different requirements:
  • the model should be based on a plain Java Iterator
  • size information should not be required, not even estimated
  • a single iterator should be used to render all pages
  • no reading in advance - the Iterator should be used only to read elements that are required to render the page being displayed
  • very large data sets, such as an infinite iterator, do not have to be supported. It is legitimate to cache already read elements in order to support bidirectional pager navigation - at least for the elements that have already been displayed
  • and still - the implementation should be based on the original Data Grid from Wicket - at least as far as possible
The reason was that I wanted to display results from Cassandra, and the grid should support every possible query, independent of the data model structure. This means that you have only a one-way Iterator, and skipping elements means reading and discarding them. I gave up size information too, firstly because it turned out that it's not really needed, and most importantly because calculating it means a full table scan. Once again - you could redesign your model so that it delivers such data efficiently, but I wanted a generic Data Grid which works with every possible query and model.

The component that I've developed is called IterableDataProvider and it can be found here: https://github.com/maciejmiklas/cyclop/tree/master/cyclop-wicket-components

It's based on the original GridView, but I had to copy the source code from Wicket in order to remove some final declarations. Here is an example:
final List<String> myGridData = new ArrayList<>();
myGridData.add("value 1");
myGridData.add("value 2");

// the provider needs only a plain Iterator - no size and no offset
IterableDataProvider<String> iterableDataProvider =
   new IterableDataProvider<String>(10) {
    @Override
    protected Iterator<String> iterator() {
        return myGridData.iterator();
    }

    @Override
    public IModel<String> model(String s) {
        return Model.of(s);
    }

    @Override
    public void detach() {
    }
};

IterableGridView<String> myGrid = new IterableGridView<String>("myGrid",
   iterableDataProvider) {
    @Override
    protected void populateEmptyItem(Item<String> item) {
        item.add(new Label("myValue"));
    }

    @Override
    protected void populateItem(Item<String> item) {
        item.add(new Label("myValue", item.getModelObject()));
    }
};

add(myGrid);

myGrid.setItemsPerPage(10);

// you have to use the custom pager and not AjaxPagingNavigator;
// the pager is attached to the grid defined above
IterablePagingNavigator pager = new IterablePagingNavigator("rowNamesListPager",
   myGrid);
add(pager);
This code looks almost as if it had been written with the original Wicket components, so there is not much to say - I will only point out some key differences:
  • GridView has been replaced by IterableGridView. This new class inherits from GridView, but not from the original one - I had to copy it in order to remove a few final modifiers
  • IDataProvider has been replaced with IterableDataProvider. The new provider requires only an Iterator; the size and the offset-based iterator are gone
  • AjaxPagingNavigator has been replaced with IterablePagingNavigator. It inherits from the original class and replaces only the behaviour of the link to the last page. It's always active - I will get into that in the next section
Our Iterable Data Grid has one limitation - it does not know the total number of elements, so it cannot estimate the number of pages. It also cannot skip results, and iterating over all elements in advance is not allowed either. This enforces some functional changes in paging. The user has to go page by page in order to progress through the grid elements - they cannot skip pages or jump to the last page immediately. Once a particular page has been loaded, it is stored in the cache, and from then on direct access is possible.

Let's analyse a practical example.
At the beginning the pager looks like this:
The first page has been rendered, and there is a link to the following page. In this case the grid has read six elements from the Iterator - five in order to render the current page, and one more to verify whether there is another page.
Clicking on "page two" has the following outcome:
and again on "page three":
Now the user can go back to "page two" or "page one", but in order to progress they must go to "page four".
Once the user has gone through all pages, they can navigate as usual without restrictions. The link to the last page is also finally active:

This whole idea has one more catch: since we have a simple Iterator and we would like to provide bidirectional navigation, the results have to be cached somewhere. The default implementation uses memory, so I would suggest using simple POJOs. The number of elements read from the iterator can be limited too.
You can also customise the cache implementation by overriding the following factory method: IterableDataProvider#createElementsCache(). Currently it returns an ArrayList, but you can replace it with an off-heap implementation, like BigList, which uses memory-mapped files, or a database.
The whole thing is not an anti-pattern, as it might appear. We only have access to a plain Iterator, but we have to provide bidirectional navigation - this additional information has to be stored somewhere.
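
The paging idea described above (read only what the current page needs, plus one element to check for a following page, and cache everything that has been read) can be sketched in plain Java. This is only an illustration under my own assumptions, not the code of IterableDataProvider: the class CachingPagerSketch and its methods are hypothetical, and unlike the real grid it does not stop the caller from asking for a later page directly, it simply reads and caches everything up to that page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper for illustration only, not part of cyclop or Wicket.
class CachingPagerSketch<T> {
    private final Iterator<T> source;
    private final int pageSize;
    // already read elements, kept so that the user can navigate backwards
    private final List<T> cache = new ArrayList<>();

    CachingPagerSketch(Iterator<T> source, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    // Reads from the iterator only until the cache holds 'required' elements
    // or the iterator is exhausted - nothing beyond that is read in advance.
    private void fillCache(int required) {
        while (cache.size() < required && source.hasNext()) {
            cache.add(source.next());
        }
    }

    // Returns the elements of the zero-based page, served from the cache.
    List<T> page(int pageIndex) {
        int from = pageIndex * pageSize;
        int to = from + pageSize;
        fillCache(to + 1); // one extra element tells us whether another page follows
        int end = Math.min(to, cache.size());
        if (from >= end) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(new ArrayList<>(cache.subList(from, end)));
    }

    // True if at least one element exists behind the given page.
    boolean hasPageAfter(int pageIndex) {
        int to = (pageIndex + 1) * pageSize;
        fillCache(to + 1);
        return cache.size() > to;
    }

    // Number of elements consumed from the iterator so far.
    int elementsRead() {
        return cache.size();
    }
}
```

With a page size of five, asking for the first page consumes six elements from the iterator, which matches the behaviour described in the example above. Going back to an earlier page reads nothing new, because it is served from the cache.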

Tuesday, May 6, 2014

Cyclop 1.4 has been released!

The latest Cyclop release addresses 23 issues; the full list can be found here. Here are the most important ones:
  • query history, including free text search
  • increased contrast for table headers - both row and column names
  • links showing full column content have been replaced with icons
  • mouse over does not highlight the whole table row, but only column values - the table header on the left has been removed from highlighting
  • the blue horizontal separator between common and dynamic columns appears only if there is something to be separated
  • spell checker is disabled in the query editor
  • the green CQL-Completion-Hint in the top right corner is closed after a query has been executed, so it does not overlap the results table. It reappears after the user starts typing again
  • test coverage has been significantly improved
  • FindBugs is active, including JSR 305 annotations
  • data validation on the service level based on Bean Validation (JSR 303) and a custom AOP framework - here is an example. It validates method input parameters and also returned values, so the service gets only validated data, and the client gets a validated response from the service
  • CSV import

Tuesday, March 4, 2014

Commodore C64 emulator - playing games using laptop keyboard

VICE is finally working on my laptop and games are playable without a joystick - just with the keyboard!
This seems pretty obvious, but I had a hard time configuring it, and I did not find any tutorial, only numerous posts from people complaining that it's not working.

First you have to download the emulator itself, for example from here. After installing it, download game disk images. Clicking on an image file will start the emulator and load the game. You will also notice that the keyboard is not working at all... Let's configure joystick keyboard emulation ;)

First enable joystick emulation by opening the Options menu and selecting "Allow Keyset Joystick":

Now configure the joystick itself by choosing Settings -> Joystick:


  • The joystick is connected to port two (#2) because most games expect it there.
  • Not all keys are set, because they are simply not required. There is no need for a dedicated upper-right key, because you can just press two arrow keys, which is natural anyway.
  • Key codes are taken from the MacBook directional keys, but you can assign any other key codes. Just make sure that you are not using reserved keys like Esc, Space or Enter.
  • In order to set a key code, first press the key that you would like to assign, and then click on the corresponding button. For example, to assign the fire button to key X, press X and then click on the button marked as "Fire" on the picture above
  • I've assigned X as fire, because you will need Space, for example to throw grenades ;)


Saturday, January 18, 2014

Cyclop 1.3.0 has been released!

The Cyclop version 1.3.0 addresses 12 issues:

  • Support completion for keywords with spaces
  • Select statement should support column names in "order by" when query does not contain "when" part (decision table needs another dimension)
  • Code completion suggests only first hit for upper case words
  • Horizontal scroll-bar is always visible
  • Store user preferences as cookie
  • Results table - second click on link opening "column content dialog" breaks ajax
  • Completion support for UPDATE
  • Completion support for TRUNCATE
  • Completion support for DROP KEYSPACE
  • Completion support for DROP INDEX
  • Completion support for DELETE
  • Completion support for CREATE KEYSPACE
The milestone containing details can be found here.



Have fun!