| Title: | Cluster Analysis 'OpenBudgets.eu' |
|---|---|
| Description: | Estimate and return the parameters needed for visualisations designed for 'OpenBudgets' <http://openbudgets.eu/> data. Calculate cluster analysis measures in budget data of municipalities across Europe, according to the 'OpenBudgets' data model. Cluster analysis involves a set of techniques and algorithms used to find and divide the data into groups of similar observations. The package can also be used more generally to extract visualisation parameters, convert them to 'JSON' format and use them as input in a different graphical interface. |
| Authors: | Kleanthis Koupidis [aut, cre], Charalampos Bratsas [aut], Jaroslav Kuchar [ctb] |
| Maintainer: | Kleanthis Koupidis <[email protected]> |
| License: | GPL-2 \| file LICENSE |
| Version: | 1.2.3 |
| Built: | 2026-06-06 08:58:06 UTC |
| Source: | https://github.com/okgreece/cluster.obeu |
This dataset is an example data frame of the budget phase data
Administrative_Unit
Approved
Draft
Executed
Revised
A data frame with the previous characteristics as columns
Clustering Analysis for OBEU datasets.
cl.analysis(cl.data, cl_feature = NULL, amount = NULL, cl.aggregate = "sum", cl.meth = NULL, clust.numb = NULL, dist = "euclidean", tojson = FALSE)
cl.data |
The input data |
cl_feature |
The feature to be clustered (nominal variables) |
amount |
The numeric variables |
cl.aggregate |
Select a different aggregation in case of filtering the input data |
cl.meth |
The clustering method algorithm |
clust.numb |
The number of clusters |
dist |
The distance metric |
tojson |
If TRUE the results are returned in json format, default returns a list |
Several clustering models are available and are selected through an evaluation process. The user should define the cl_feature, cl.aggregate and amount parameters to form the structure of the cluster data. By default, the clustering algorithm, the number of clusters and the distance metric of the clustering model are set to the best selection according to internal and stability measures. The end user can also control the cluster analysis by specifying the cl.meth, clust.numb and dist parameters respectively.
The function returns the parameters needed for visualizing the cluster data, depending on the selected algorithm and the specification parameters, as well as some comparison measure matrices.
cluster.method - Label of the clustering algorithm
raw.data - Input data
data.pca - The principal components to visualize the input data
modelparam - The results of this parameter depend on the selected clustering model
compare - Clustering measures
Kleanthis Koupidis, Jaroslav Kuchar
cl.features, clValid, diana, agnes, pam, clara, fanny, Mclust
cl.analysis(city_data, cl.meth = "pam", clust.numb = 3)
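The steps described above (cluster the numeric data, project it onto principal components for plotting, and collect the results in a list) can be sketched with base R and the recommended cluster package. This is only an illustration under stated assumptions, not the package's implementation: USArrests stands in for aggregated budget data, and the fields inside modelparam are hypothetical.

```r
library(cluster)  # recommended package shipped with standard R installations

# Stand-in for aggregated budget data: one row per unit, numeric columns only.
# USArrests is used purely for illustration.
amounts <- scale(USArrests)

# Principal components, used here only to project the data to two dimensions.
data.pca <- prcomp(amounts)$x[, 1:2]

# Partitioning around medoids with 3 clusters and Euclidean distance.
model <- pam(amounts, k = 3, metric = "euclidean")

# A list loosely mirroring the returned components listed above.
# The contents of 'modelparam' are an assumption made for this sketch.
result <- list(
  cluster.method = "pam",
  raw.data       = amounts,
  data.pca       = data.pca,
  modelparam     = list(clusters = model$clustering, medoids = model$medoids)
)
str(result, max.level = 1)
```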
Select clustering characteristic to form the clustering data
cl.features(data, features = NULL, amounts = NULL, aggregate = "sum", tojson = FALSE)
data |
The input data |
features |
The clustering features |
amounts |
The amount measures of the dataset |
aggregate |
The function to aggregate |
tojson |
If TRUE the results are returned in json format, default returns a list |
This function adapts the dataset according to the selected dimension of the dataset and the aggregation function.
This function returns the dataset for cluster analysis adapted to the desired features.
Kleanthis Koupidis
cl.features(city_data, features = 'Administrative_Unit')
# works also for other datasets
cl.features(iris, features = 'Species')
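To make the adaptation step concrete, the aggregation of numeric columns by a selected feature can be approximated with base R's aggregate(). This is a sketch of the idea only; the variable names feature and agg are illustrative and are not arguments of the package.

```r
# Sum every numeric column of iris within each level of the chosen feature.
# 'feature' and 'agg' are illustrative names, not package arguments.
feature <- "Species"
agg <- sum

numeric_cols <- names(iris)[sapply(iris, is.numeric)]
cluster_data <- aggregate(iris[numeric_cols],
                          by = setNames(list(iris[[feature]]), feature),
                          FUN = agg)
print(cluster_data)
```

The result has one row per level of the feature, which is the shape a clustering routine would typically expect.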
cl.plot function plots the clustering model constructed by the cl.analysis function.
cl.plot(clustering.model, parameters = list())
clustering.model |
Object returned by the |
parameters |
List of parameters to indicate plotting of ellipses or convex hulls. Default values: |
Jaroslav Kuchar <https://github.com/jaroslav-kuchar>
inputs.clustering <- cl.analysis(city_data, cl.meth="pam", clust.numb=2)
cl.plot(inputs.clustering, parameters = list(ellipses=TRUE))
Extract the most frequent
cl.summary(clv)
clv |
A clValid object |
This function returns the proposed method, the proposed number of clusters, or both, according to the majority of the clustering indices of a clValid process.
A value that indicates the proposed method and number of clusters.
Kleanthis Koupidis
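The majority rule described for cl.summary can be illustrated with a small hand-made table. The data frame below is hypothetical and only imitates the general shape of a clValid optimal-scores summary (one row per validation index); the helper most_frequent is likewise an illustrative name, not part of the package.

```r
# Hypothetical table: one row per validation index, with the method and
# the number of clusters that index prefers.
optimal <- data.frame(
  Method   = c("pam", "hierarchical", "pam", "pam", "kmeans"),
  Clusters = c(3, 2, 3, 4, 3),
  row.names = c("Connectivity", "Dunn", "Silhouette", "APN", "AD")
)

# Most frequent value of a vector; on ties the first value in sorted order wins.
most_frequent <- function(x) {
  counts <- table(x)
  names(counts)[which.max(counts)]
}

proposed <- list(
  method   = most_frequent(optimal$Method),
  clusters = as.numeric(most_frequent(optimal$Clusters))
)
print(proposed)
```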
Computes points to plot a convex hull for each cluster of the clustering model
convex.hulls(clustering.model, data.pca)
clustering.model |
Object returned by the |
data.pca |
data as result of the |
List of vectors with points for each convex hull.
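As a rough illustration of what per-cluster hull points look like, the sketch below uses chull() from grDevices on a two-dimensional PCA projection. It assumes USArrests as stand-in data and pam() from the cluster package for the cluster labels; it does not reproduce the package's own code or return format.

```r
library(cluster)

amounts  <- scale(USArrests)                 # illustrative data only
data.pca <- prcomp(amounts)$x[, 1:2]         # first two principal components
clusters <- pam(amounts, k = 3)$clustering   # cluster label per observation

# For each cluster, keep the points that lie on its convex hull.
# chull() returns the row indices of the hull vertices in clockwise order.
hulls <- lapply(split(as.data.frame(data.pca), clusters), function(pts) {
  pts[chull(pts$PC1, pts$PC2), ]
})
str(hulls, max.level = 1)
```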
Computes points to plot an ellipse for each cluster of the clustering model
ellipses(clustering.model, data.pca)
clustering.model |
Object returned by the |
data.pca |
data as result of the |
List of vectors with points for each ellipse.
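One common way to obtain such ellipse points is to map the unit circle through the Cholesky factor of each cluster's covariance matrix. The sketch below takes that approach under a bivariate normal assumption with a 95% level; both the method and the helper name ellipse_points are assumptions made for illustration, since the source does not state how the package computes its ellipses.

```r
library(cluster)

# Points on the confidence ellipse of a two-column matrix of points,
# assuming approximately bivariate normal data.
ellipse_points <- function(pts, level = 0.95, n = 100) {
  centre <- colMeans(pts)
  radius <- sqrt(qchisq(level, df = 2))
  angles <- seq(0, 2 * pi, length.out = n)
  circle <- rbind(cos(angles), sin(angles))
  # t(chol(S)) maps the unit circle onto the ellipse defined by covariance S.
  outline <- t(centre + radius * t(chol(cov(pts))) %*% circle)
  colnames(outline) <- colnames(pts)
  outline
}

amounts  <- scale(USArrests)                 # illustrative data only
data.pca <- prcomp(amounts)$x[, 1:2]
clusters <- pam(amounts, k = 3)$clustering

# One matrix of outline points per cluster.
ellipses <- lapply(split(seq_len(nrow(data.pca)), clusters), function(idx) {
  ellipse_points(data.pca[idx, , drop = FALSE])
})
str(ellipses, max.level = 1)
```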
Extract and return a data frame with the columns that include only numeric values
nums(data)
data |
The input data frame or matrix |
This function returns a data frame with the numeric columns of the input dataset.
Kleanthis Koupidis
nums(city_data)
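For readers without the package installed, the same kind of numeric-column filter can be written in a few lines of base R. The function name numeric_only is hypothetical and this is not the package's source.

```r
# Keep only the numeric columns; drop = FALSE preserves the data frame
# even when a single column remains.
numeric_only <- function(data) {
  data <- as.data.frame(data)
  data[, vapply(data, is.numeric, logical(1)), drop = FALSE]
}

names(numeric_only(iris))
```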
Extract and analyze the input data provided from Open Spending API, using the cl.analysis function.
open_spending.cl(json_data, dimensions=NULL, amounts=NULL, measured.dimensions=NULL, cl.aggregate="sum", cl.method=NULL, cl.num=NULL, cl.dist="euclidean")
json_data |
The json string, URL or file from Open Spending API |
dimensions |
The dimensions/feature of the input data |
amounts |
The measures of the input data |
measured.dimensions |
The dimensions to which the amount/numeric variables correspond |
cl.aggregate |
Aggregate function of the input data |
cl.method |
The clustering algorithm |
cl.num |
The number of clusters |
cl.dist |
The distance metric |
This function reads data in json format from the Open Spending API and runs the cluster analysis through the cl.analysis function.
A json string with the resulting parameters of the cl.analysis function.
Kleanthis Koupidis