Java One 2017: Open Source Big Data in the Cloud (Hadoop, Hive, Spark, Kafka)

It’s true. I always said “presenting at Java One is like playing in champions league”. Last month I had the great pleasure to present at the Java One 2017 conference in San Francisco together with Edelweiss Kammermann about Open Source Big Data used in the cloud. The presentation included 4 live demos about Apache Hadoop with Map Reduce, Apache Hive, Apache Spark and Kafka all using Oracle Big Data Cloud Service – Compute Edition (aka BDCS-CE) and the Oracle Event Hub Service. The presentation was recorded – so you can enjoy from anywhere in the world.

For your convenience the slides are available on slideshare:

Application Container Cloud (ACCS) supports Java EE

This is my personal entry for the ODC appreciation day that was initiated once more by Tim Hall.

Intro

Oracle’s Application Container Cloud Service (ACCS) is a cloud native, container based runtime for applications and microservices implemented with Java, Node.js, Ruby, Python and PHP. It’s simplicity makes it most attractive. All you need to do is bundle and upload your code, and add a .json file to let ACCS know how to start your application.

Since a couple of days ACCS is now supporting Java EE as well – this is great news! Note, that there is still Java Cloud Service JCS, which gives you a fully fledged WebLogic domain. However, JCS is more complex to set up and to deploy to, so ACCS is a good option for those that only want to deploy and run a Java EE module.

ACCS with Java EE

The provisioning, as shown in the following screenshot, is PaaS-worthy, easy enough as with any other ACCS deployment type.

ACCS: Application Container Cloud Service with Java EE

 

ACCS with Java EE, Some Details

Here are some more facts as they are currently known:

  • You can simply upload a .war file. The .json manifest that is required to tell ACCS how to start a Java application is not mandatory for Java EE since the module is running in WebLogic.
  • The current versions used in ACCS when deploying a Java EE module are: WebLogic 12.2.1.2 (which supports Java EE 7), running on Java 8.
  • ACCS with Java EE currently does not support clustering.
  • ACCS does not let you access the WebLogic admin console. This is fine. It’s PaaS!
  • ACCS with Java EE can make use of Java Flight Recorder. In continuous mode, all profiling and event data is captured. If Java Flight Recorder is not in continuous more, then 60 seconds of data is captured. You can download the files and use Java Mission Control to analyze the recordings.
  • The URL syntax to invoke a deployed web module is as follows:

    https://<ACCSName>-<IDDomain>.apaas.<REGION>.oraclecloud.com/<DocRoot>

  • Typically the URL to call a deployed application is shown in the ACCS service console. Note, that the DocRoot -even if required to call the deployment – is not shown.
  • A Java EE deployment is running across 2 instances as default which requires a load balancer to be provisioned by ACCS. Currently OTD is used.
  • If you feel brave enough, just give it a go and deploy the sample app.
  • If you are lost working with ACCS by yourself, then follow this tutorial, explaining how to deploy a Java EE module to ACCS. The instruction mentioned in the tutorial worked OOTB for me.

Possible Improvements

As you know, I usually write about features and showstoppers. A few minor things that should be improved in my opinion:

  • A few minor details are not visible in the console. E.g. the used Java version is not fully displayed.
  • Not sure if it is documented somewhere, but I would love to read more precisely about current restrictions for Java EE in ACCS. Actually I dropped the hint for some colleagues at Oracle to blog about it.

Is it Java EE or EE4J now?

You may have heard that Java EE went to the Eclipse foundation. The new name will be EE4J. Do you have to change all your slide decks and articles? Actually no. For some more details have a look at this article.

Acknowledgement

Many thanks to the team of Abhishek and Sriram for helping to clarify some questions and providing quick and precise feedback!

Access Oracle Event Hub Kafka from External Kafka Client or Tool

Access Oracle Event Hub from external Tool or Command-Line Client

Oracle Event Hub provides a managed Kafka PaaS solution. To access it from an on-premises client you have to make sure to enable the ports to Event Hub Zookeeper and the Kafka broker.

Access to Kafka Broker

First lets enable access to Kafka broker. To do so, check the OPC Event Hub service for the connect string.

Create Event Hub Broker Access Rule

Then create a new access rule. Warning: In general you should not allow public access to access your Event Hub service! This is just for demo purposes to make the tool work. In case of doubt create a rule with your own IP address and talk your friendly security officer first of all.

The creation of the rule might take a few seconds:

Create Zookeeper Access Rule

Once the rule for the Kafka broker is created, we need to create a rule for Zookeeper which is using port 2181:

Explore Kafka Tool (or other)

Now lets start our Kafka tool (for demonstration purpose) only, configure the connection details for the Zookeeper IP and port, and then try to connect to Oracle Event Hub Service:

Voila, it is working 🙂 You can explore your topics or even create new ones. Note that  Oracle Event Hub uses a special naming convention for topics.

Oracle Event Hub Cloud Service: What you need to know about Topic Names

There are a number of things related to topics in Oracle Event Hub service that everybody should be aware of:

  • Oracle Event Hub topics created with the web console are automatically prefixed with the OPC ID domain.
  • Event hub topics can be created via the Kafka command-line from any host (assuming you allow the clients to access Event Hub CS). These topics are not prefixed with the OPC ID domain.
  • Topics created with the CLI (without ID domain prefix) are not shown in the service console.

IMHO, this is behaviour is not very useful for various reasons:

  • If you are planning to use Event Hub as a drop in replacement for another Kafka installation you won’t be able to create the proper topic names for already existing topics with the service console.
  • You have to add the ID domain prefix in every client. This is particularly bad e.g. for a Java producer. Hard-coded ID domains will show up sooner or later in the source code.
  • Being forced to use the ID domain prefix in every client might turn out to be a security issue. Did you note that most bloggers blacken their ID domains in screenshots when writing about OPC?

 

New Names For Oracle Cloud Services (was: Bare Metal)

Oracle cloud is moving even more towards the generic cloud model that I described in 2011. What used to go with the name “Bare Metal” becomes the new norm now. This is good news: In addition to the regions and availability zones that you already know from AWS, the new stack will provide a flat, non-blocking network and NVMe storage.

The Bare Metal Cloud Services user interface was updated to reflect the new name, Oracle Cloud Infrastructure. Also on September 26th, the Oracle Public Cloud user interface will be updated to reflect the new name, Oracle Cloud Infrastructure Classic.

Oracle Cloud Service Names. Old Name. New Name. Classic.