Tutorials

How To Build Data Apps with the new Genie Framework API

Pere Giménez

Pere Giménez

Published on October 4, 2022

This tutorial will guide you through the development of a web app for the classification of flower samples belonging to several species using the new streamlined Genie Framework API. We’ll be using the Iris Dataset, which is commonly used to showcase machine learning applications. In this dataset, we have tabular data describing 150 flower samples in terms of their petal length, petal width, sepal length, sepal width and species.

Figure 1: Flower types in the Iris dataset

Our objective is to, based only on the petal and sepal dimensions, group together those samples belonging to the same species. This is an unsupervised learning problem that we’ll solve through clustering with the k-means algorithm. Further, we will deliver the solution as an interactive web app built with Genie Framework.

A Genie app is typically developed in three parts:

  1. Data processing logic: the code a data scientist would write to extract insights from data.
  2. Dashboard logic: the layer that connects data structures with the user interface.
  3. User interface: the interactive user-facing part of the app.

The tutorial will go into each of these steps and, by the end, we’ll have a beautiful dashboard like the one in the figure below. The full code can be found here.

Figure 2: Iris clustering dashboard built with Genie Framework

Setup

Regarding the file structure of the project, it is very simple: inside the project folder IrisClustering we’ll have the main executable file dashboard.jl containing the data and dashboard logic, and ui.jl containing the UI.

Project file structure
          IrisClustering
          ├── dashboard.jl
          └── ui.jl

Besides GenieFramework, we require the following packages: RDatasets, DataFrames and Clustering. To add them to the Julia install, enter package mode pressing “]” and input

          add RDatasets DataFrames Clustering GenieFramework

Once the install is finished, we’re ready to start adding code.

Data processing logic

This first part implements the data gathering and analysis in our app. The Iris dataset is readily available in the RDatasets.jl package, so let’s take a quick look at its contents with the code below.

          using DataFrames
          using RDatasets: dataset
          
          irisdata = dataset("datasets", "iris")
          # show 5 random rows
          idx = rand(1:nrow(irisdata), 5)
          irisdata[idx,:]

5 rows × 5 columns

SepalLengthSepalWidthPetalLengthPetalWidthSpecies
Float64Float64Float64Float64Cat…
16.43.24.51.5versicolor
26.23.45.42.3virginica
34.63.61.00.2setosa
47.23.05.81.6virginica
55.63.04.11.3versicolor

Notice that the first four columns in the table have numerical values, which constitute the features in the dataset. The last column has categorical values, corresponding to the data labels. These take one of three values: virginica, setosa or versicolor.

We’ll use the k-means algorithm to classify the flower samples. This algorithm takes as input the features of each sample, that is, the first four columns in the data matrix, and groups together those flower samples with similar features to obtain k=3 separate clusters. We hope that k-means will give samples from the same species the same cluster number, hence resulting in species classification.

For a ready-made implementation using the Clustering.jl package, create the script dashboard.jl with the following code:

          using RDatasets: dataset
          using DataFrames, Clustering
          const data = DataFrames.insertcols!(dataset("datasets", "iris"), :Cluster => zeros(Int, 150))
          const features = [:SepalLength, :SepalWidth, :PetalLength, :PetalWidth]
          
          function cluster(no_of_clusters = 3, no_of_iterations = 10)
            # Put all feature vectors in a matrix, call kmeans to obtain cluster 
            # assignments and store assignments in data matrix
            feats = Matrix(data[:, [c for c in features]])' |> collect
            result = kmeans(feats, no_of_clusters; maxiter = no_of_iterations)
            data[!, :Cluster] = assignments(result)
          end

Note that we have added a column to the data matrix to store the result of k-means. If we execute the cluster function, we obtain

          cluster()
          data[idx,:]

5 rows × 6 columns

SepalLengthSepalWidthPetalLengthPetalWidthSpeciesCluster
Float64Float64Float64Float64Cat…Int64
16.43.24.51.5versicolor1
26.23.45.42.3virginica3
34.63.61.00.2setosa2
47.23.05.81.6virginica3
55.63.04.11.3versicolor1

The column data now contains the cluster assignments made by k-means. It seems that, for the shown samples, each species has a different cluster assignment; this indicates a successful classification.

To explore the algorithm and the accuracy of its results, we will build a dashboard with the following: visualization of the clusters, and controls to rerun the clustering with different parameters.

Dashboard logic

The dashboard logic controls which variables are exposed to the user interface, and what code is executed on user interaction. The UI elements in our app, and the variables or arguments they depend on are:

  • Plot showing the real species of each sample: data, features.
  • Plot showing the assigned cluster number for each sample: data, features.
  • Table showing the contents of the dataset: data.
  • Two selectors to choose the x and y axes of the plots: features.
  • Sliders to change the number of clusters and number of iterations of the algorithm: no_of_clusters, no_of_iterations.

Therefore, we’ll need to expose the listed variables to the UI. There are two macros to set the access properties of a variable:

  • @in: the variable takes its value from an element in the UI.
  • @out: the value of the variable is shown in a UI element.

Furthermore, we have two other macros to control interaction:

  • @handlers: sets up the server-side code that is run when values change.
  • @onchangeany: detects changes on a variable tagged with @in and executes some code.

Putting all of this together, we implement the dashboard logic as


          using GenieFramework
          # set up automatic reload and other dev tools
          @genietools
          
          # make feature names readable from the UI
          @out const features = [:SepalLength, :SepalWidth, :PetalLength, :PetalWidth]
          
          # set up reactive code
          @handlers begin
            # variables taking their value from sliders and selectors
            @in no_of_clusters = 3
            @in no_of_iterations = 10
            @in xfeature = :SepalLength
            @in yfeature = :SepalWidth
            # variables that hold plot and table data
            @out datatable = DataTable()
            @out datatablepagination = DataTablePagination(rows_per_page=50)
            @out irisplot = PlotData[]
            @out clusterplot = PlotData[]
            @out title = "My Iris Dashboard"
          
            # when any of the variables tagged with @in change, re-execute 
            @onchangeany isready, xfeature, yfeature, no_of_clusters, no_of_iterations begin
              cluster(no_of_clusters, no_of_iterations)
              datatable = DataTable(data)
              irisplot = plotdata(data, xfeature, yfeature; groupfeature = :Species)
              clusterplot = plotdata(data, xfeature, yfeature; groupfeature = :Cluster)
            end
          end

Besides re-calculating the cluster assignments when the parameters change, the code in the @onchangeany block updates the plots. Each call to plotdata returns a struct of type PlotData, which is the type accepted by the plotting functions in the UI code we’ll write later.

Now that the dashboard logic is implemented, we just need to add two more instructions to render the UI, and run the server:

          @page("/", "ui.jl")
          Server.isrunning() || Server.up()

The @page macro takes as input the path to the web app, and the file containing the UI definition. The final code in dashboard.jl combining the data and dashboard logic can be viewed in the code fold below.

dashboard.jl
using Clustering
          import RDatasets: dataset
          import DataFrames
          
          using GenieFramework
          @genietools
          
          const data = DataFrames.insertcols!(dataset("datasets", "iris"), :Cluster => zeros(Int, 150))
          @out const features = [:SepalLength, :SepalWidth, :PetalLength, :PetalWidth]
          
          function cluster(no_of_clusters = 3, no_of_iterations = 10)
            feats = Matrix(data[:, [c for c in features]])' |> collect
            result = kmeans(feats, no_of_clusters; maxiter = no_of_iterations)
            data[!, :Cluster] = assignments(result)
          end
          
          @handlers begin
            @in no_of_clusters = 3
            @in no_of_iterations = 10
            @in xfeature = :SepalLength
            @in yfeature = :SepalWidth
            @out datatable = DataTable()
            @out datatablepagination = DataTablePagination(rows_per_page=50)
            @out irisplot = PlotData[]
            @out clusterplot = PlotData[]
            @out title = "My Iris Dashboard"
          
            @onchangeany isready, xfeature, yfeature, no_of_clusters, no_of_iterations begin
              cluster(no_of_clusters, no_of_iterations)
              datatable = DataTable(data)
              irisplot = plotdata(data, xfeature, yfeature; groupfeature = :Species)
              clusterplot = plotdata(data, xfeature, yfeature; groupfeature = :Cluster)
            end
          end
          
          @page("/", "ui.jl")
          
          Server.isrunning() || Server.up()

Finally, we proceed to define the UI.

User interface

To implement the UI we don’t need to write a single line of HTML or Javascript, it can be entirely done in Julia. The static UI elements are created by functions that generate the corresponding HTML code, as in

          row(h5("Hello"))
"<div class=\"row\"><h5>Hello</h5></div>"

Furthermore, we have functions that generate a combination of HTML and Javascript for interactive UI elements. As a first example, let us look at the code for one of the plots:

          cell(class="st-module", [
            h5("k-means clusters")
            plot(:clusterplot)
          ])

The code first defines a cell divider with CSS class st-module; the CSS code is automatically generated. The cell generator function takes as an argument the list of elements that it will hold; in this case a header and the plot. The plot function’s only argument is the symbol name of the variable of type PlotData to be plotted.

As a second example, for one of the sliders we have:

          cell(class="st-module", [
            h6("Number of clusters")
            slider( 1:1:20,
                    :no_of_clusters;
                    label=true)
          ])

The slider function takes as arguments the number range, the symbol name of variable to be changed, and the popup shown on user click.

The full UI is built by stacking the UI elements into an array, as in the full code of ui.jl shown below.

ui.jl
          [
            # insert contents of string var. using double mustache {{}} syntax
            heading("{{title}}")
          
            row([
              # sliders
              cell(class="st-module", [
                h6("Number of clusters")
                slider( 1:1:20,
                        :no_of_clusters;
                        label=true)
              ])
              cell(class="st-module", [
                h6("Number of iterations")
                slider( 10:10:200,
                        :no_of_iterations;
                        label=true)
              ])
              # selectors
              cell(class="st-module", [
                h6("X feature")
                Stipple.select(:xfeature; options=:features)
              ])
          
              cell(class="st-module", [
                h6("Y feature")
                Stipple.select(:yfeature; options=:features)
              ])
            ])
            # plots
            row([
              cell(class="st-module", [
                h5("Species clusters")
                plot(:irisplot)
              ])
          
              cell(class="st-module", [
                h5("k-means clusters")
                plot(:clusterplot)
              ])
            ])
            # table
            row([
              cell(class="st-module", [
                h5("Iris data")
                table(:datatable; dense=true, flat=true, style="height: 350px;", pagination=:datatablepagination)
              ])
            ])
          ] |> string  

Running the app

Now that we have finished coding the app, all is left to do is run it. Open the Julia REPL in the app folder and type in the following:

include("dashboard.jl")

Genie will render and serve the app, which can be accessed via http://localhost:8000.