Generating data set
The main function is sim_groups
, you need to define:
- A number of observations to draw.
- A number of groups to sample.
- An optional argument to define the proportion of each group.
library(klassets)
set.seed(123)
df <- sim_groups(n = 500, groups = 3)
plot(df)
Fit cluster algorithms
K-means stats::kmeans
You can apply the stats::kmeans
using fit_statskmeans_clust
.
dfc1 <- fit_statskmeans(df, centers = 2)
plot(dfc1)
Hierarchical Clustering stats::hclust
dfhc <- fit_hclust(df, k = 2)
plot(dfhc)
K-means: Basic {klassets} implementation
Or use a basic K-means implementation with:
set.seed(234)
dfc2 <- fit_kmeans(df, centers = 2, max_iteration = 6)
plot(dfc2)
What is the benefit? In the second one use a helper function kmeans_iterations
to keep the iteration and see how the algorithm converges.
set.seed(234)
kmi <- kmeans_iterations(df, centers = 2, max_iteration = 6)
plot(kmi)
Now we can use gganimate
package using object result from kmeans_iterations
due have the classification for every point in every step:
kmi
#> $points
#> # A tibble: 2,988 × 6
#> iteration id group x y cluster
#> <int> <int> <chr> <dbl> <dbl> <fct>
#> 1 1 1 1 4.53 8.60 NA
#> 2 1 2 1 5.57 6.42 NA
#> 3 1 3 1 2.62 6.28 NA
#> 4 1 4 1 4.82 7.41 NA
#> 5 1 5 1 0.583 2.50 NA
#> 6 1 6 1 -5.49 8.30 NA
#> 7 1 7 1 3.59 9.44 NA
#> 8 1 8 1 -0.224 3.95 NA
#> 9 1 9 1 -2.62 10.7 NA
#> 10 1 10 1 -0.695 8.74 NA
#> # … with 2,978 more rows
#>
#> $centers
#> # A tibble: 12 × 4
#> iteration cluster cx cy
#> <int> <fct> <dbl> <dbl>
#> 1 1 A -4.67 5.85
#> 2 1 B 3.70 -7.21
#> 3 2 A 0.327 5.85
#> 4 2 B 7.26 -1.35
#> 5 3 A 0.170 5.29
#> 6 3 B 7.65 -1.30
#> 7 4 A 0.132 5.05
#> 8 4 B 7.83 -1.27
#> 9 5 A 0.137 4.76
#> 10 5 B 8.04 -1.24
#> 11 6 A 0.155 4.57
#> 12 6 B 8.19 -1.22
#>
#> attr(,"class")
#> [1] "klassets_kmiterations" "list"
So you can take the output of this function data and use gganimate
to make the animation using in the klassets
home page. The code used in that animation can be found in the package using:
system.file("animation_kmeans_iterations.R", package = "klassets")
#> [1] "/home/runner/work/_temp/Library/klassets/animation_kmeans_iterations.R"