--- title: "Cluster analysis in OpenBudget.eu" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Cluster analysis in OpenBudget.eu} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- `Cluster.OBeu` is used on [OpenBudgets.eu](http://openbudgets.eu/tools/) data mininig tool platform with [OpenCPU integration of R and JavaScript](https://www.opencpu.org/) to estimate and return the necessary parameters for cluster analysis visualizations for budget or expenditure datasets of Municipality across Europe. The vignette shows the way `Cluster.OBeu` (in R and OpenCPU environment) is fitted with datasets of [OpenBudgets.eu](http://openbudgets.eu) according to the [OpenBudgets.eu data model](https://github.com/openbudgets/data-model). Detailed documentation about OpenBudgets.eu data model can be found [here](http://openbudgets.eu/assets/deliverables/D1.4.pdf) The input and the resulted object are in json format. First you have to load the library ```{r load, warning=FALSE, include=TRUE} # load Cluster.OBeu library(Cluster.OBeu) ``` # Cluster analysis on OpenBudgets.eu platform `open_spending.cl` is designed to estimate and return the clustering model measures of [OpenBudgets.eu](http://openbudgets.eu/) datasets. The available clustering algorithms are hierarchical, kmeans from R base, pam, clara, fuzzy from [cluster package](https://CRAN.R-project.org/package=cluster) and model based algorithms from [mclust package](https://CRAN.R-project.org/package=mclust). It can be used to find the appropriate clustering algorithm and/or the appropriate clustering number of the input data according to the internal and stability measures from [clValid package](https://CRAN.R-project.org/package=clValid). The input data must be a JSON link according to the [OpenBudgets.eu data model](https://github.com/openbudgets/data-model). There are different parameters that a user could specify, e.g. `dimensions`, `measured.dimensions` and `amounts` should be defined by the user, to form the dimensions of the dataset. `open_spending.cl` estimates and returns the json data that are described with the [OpenBudgets.eu data model](https://github.com/openbudgets/data-model), using `cl.analysis` function. +--------------------+-------------------------------------------------------------+ | Input | Description | +====================+=============================================================+ | json_data | The json string, URL or file from Open Spending API | +--------------------+-------------------------------------------------------------+ | dimensions | The dimensions of the input data | +--------------------+-------------------------------------------------------------+ | amounts | The amounts of the input data | +--------------------+-------------------------------------------------------------+ | measured.dimensions| The dimensions to which correspond amount/numeric variables | +--------------------+-------------------------------------------------------------+ | cl.aggregate | Aggregate function of the input data | +--------------------+-------------------------------------------------------------+ | cl.method | clustering algorithm | +--------------------+-------------------------------------------------------------+ | cl.num | Number of clusters | +--------------------+-------------------------------------------------------------+ | cl.dist | Distance metric | +--------------------+-------------------------------------------------------------+ Table: `open_spending.cl` input The following table shows a sort description of the `open_spending.cl` return components: | Component | Description | | :------------: | :--------------------------------: | | cluster.method | Label of the clustering algorithm | | raw.data | Input data | | data.pca | Principal components | | modelparam | Clustering model specifications | | compare | Clustering measures | Table: `open_spending.cl` return # Examples The dataset in the following example is being used in [OpenBudgets.eu platform](http://apps.openbudgets.eu/) and concerns the income of Aragon. in 2007. ## In R environment `open_spending.cl` function's input are data as json link and described with [OpenBudgets.eu data model](https://github.com/openbudgets/data-model). ```{r data} aragon_income = "http://apps.openbudgets.eu/api/3/cubes/aragon-2007-income__3209b/aggregate?drilldown=fundingClassification.prefLabel%7CeconomicClassification.prefLabel&aggregates=amount.sum" ``` ```{r open_spending, eval=FALSE, message=FALSE, warning=FALSE, include=TRUE} results = open_spending.cl( json_data = aragon_income, dimensions ="economicClassification.prefLabel", amounts = "amount.sum", measured.dimensions = "fundingClassification.prefLabel", cl.method="kmeans" ) # Pretty output using prettify of jsonlite library jsonlite::prettify(results) ``` ## In OpenCPU environment ### Select library and function 1. Go to: yourserver/ocpu/test 2. Copy and paste the following function to the endpoint ```{r, eval=FALSE, include=TRUE} ../library/Cluster.OBeu/R/open_spending.cl # library/ {name of the library} /R/ {function} ``` 3. **Select Method**: **`Post`** ### Add parameters Click **add parameters** every time you want to add a new parameters and values. 4. Define the input data: - **Param Name**: **`json_data`** - **Param Value** (*URL* of json data): **`"https://apps.openbudgets.eu//api/3/cubes/aragon-2007-income__3209b/aggregate?drilldown=fundingClassification.prefLabel%7CeconomicClassification.prefLabel&aggregates=amount.sum"`** (or any other json URL with the data) 5. Define the *dimensions* parameter: - **Param Name**: **`dimensions`** - **Param Value**: **`"economicClassification.prefLabel"`** 6. Define the *amount* parameter: - **Param Name**: **`amounts`** - **Param Value**: **`"amount.sum"`** 7. Define the *measured dimension* parameter: - **Param Name**: **`measured.dimensions`** - **Param Value**: **`"fundingClassification.prefLabel"`** You add likewise further parameters and change the default parameters of `cl.method`, `cl.num`, `cl.dist`, see Cluster.OBeu *reference manual* for further details. 8. Ready! Click on **Ajax request**! ## Results 9. copy the **/ocpu/tmp/{this_id_number}/R/.val** (second on the right panel) 10. finally, paste **`yourserver/ocpu/tmp/{this_id_number}/R/.val`** on a new tab. # Further Details + [HTTP in OpenCPU](https://www.opencpu.org/api.html) + [OpenCPU Help](https://www.opencpu.org/help.html) + [OpenCPU JavaScript Client](https://www.opencpu.org/jslib.html) + [OpenCPU on CRAN](https://cran.r-project.org/package=opencpu) # Github + [OpenCPU package *development version*](https://github.com/opencpu/opencpu) + [Cluster.OBeu package *development version*](https://github.com/okgreece/Cluster.OBeu)