QuadratiK Usage Examples#
Normality Test#
This section contains examples of the parametric and non-parametric normality tests based on kernel-based quadratic distances.
Parametric#
[1]:
import numpy as np
from QuadratiK.kernel_test import KernelTest
np.random.seed(42)
data = np.random.randn(100,2)
normality_test = KernelTest(h=0.4, centering_type="param",random_state=42).test(data)
print("Test : {}".format(normality_test.test_type_))
print("Execution time: {:.3f}".format(normality_test.execution_time))
print("H0 is Rejected : {}".format(normality_test.h0_rejected_))
print("Test Statistic : {}".format(normality_test.test_statistic_))
print("Critical Value (CV) : {}".format(normality_test.cv_))
print("CV Method : {}".format(normality_test.cv_method_))
print("Selected tuning parameter : {}".format(normality_test.h))
Test : Kernel-based quadratic distance Normality test
Execution time: 1.578
H0 is Rejected : False
Test Statistic : -0.004422397826208057
Critical Value (CV) : 0.00495159345113745
CV Method : Empirical
Selected tuning parameter : 0.4
[2]:
print(normality_test.summary())
Time taken for execution: 1.578 seconds
Test Results
-------------- ----------------------------------------------
Test Type Kernel-based quadratic distance Normality test
Test Statistic -0.004422397826208057
Critical Value 0.00495159345113745
Reject H0 False
-------------- ----------------------------------------------
Summary Statistics
Feature 0 Feature 1
------- ----------- -----------
Mean -0.1156 0.034
Std Dev 0.8563 0.9989
Median -0.0353 0.1323
IQR 1.0704 1.3333
Min -2.6197 -1.9876
Max 1.8862 2.7202
Non-parametric#
[3]:
normality_test = KernelTest(h=0.4, centering_type="nonparam").test(data)
print("Test : {}".format(normality_test.test_type_))
print("Execution time: {:.3f}".format(normality_test.execution_time))
print("H0 is Rejected : {}".format(normality_test.h0_rejected_))
print("Test Statistic : {}".format(normality_test.test_statistic_))
print("Critical Value (CV) : {}".format(normality_test.cv_))
print("CV Method : {}".format(normality_test.cv_method_))
print("Selected tuning parameter : {}".format(normality_test.h))
Test : Kernel-based quadratic distance Normality test
Execution time: 0.131
H0 is Rejected : False
Test Statistic : 0.0015387891795935942
Critical Value (CV) : 0.0020181255711485594
CV Method : Empirical
Selected tuning parameter : 0.4
[4]:
print(normality_test.summary())
Time taken for execution: 0.131 seconds
Test Results
-------------- ----------------------------------------------
Test Type Kernel-based quadratic distance Normality test
Test Statistic 0.0015387891795935942
Critical Value 0.0020181255711485594
Reject H0 False
-------------- ----------------------------------------------
Summary Statistics
Feature 0 Feature 1
------- ----------- -----------
Mean -0.1156 0.034
Std Dev 0.8563 0.9989
Median -0.0353 0.1323
IQR 1.0704 1.3333
Min -2.6197 -1.9876
Max 1.8862 2.7202
QQ Plot#
[5]:
from QuadratiK.tools import qq_plot
qq_plot(data)
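A QQ plot of this kind pairs the sorted sample values with the quantiles of a reference distribution. The following is a minimal sketch for a single feature against the standard normal, using only the standard library. The helper name `normal_qq_pairs` and the plotting positions `(i - 0.5) / n` are assumptions made for illustration; `qq_plot` may compute its points differently.

```python
import random
from statistics import NormalDist

def normal_qq_pairs(sample):
    """Return (theoretical, observed) quantile pairs for a 1-D sample."""
    n = len(sample)
    observed = sorted(sample)
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Plotting positions (i - 0.5) / n keep the probabilities strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))

rng = random.Random(42)
feature = [rng.gauss(0.0, 1.0) for _ in range(100)]  # hypothetical 1-D sample
pairs = normal_qq_pairs(feature)
# Points close to the line y = x suggest the feature is roughly standard normal.
print(pairs[:3])
```

Plotting the first element of each pair against the second gives the usual QQ scatter for that feature.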
Two Sample Test#
This section shows an example of the two-sample test using the normal kernel-based quadratic distance.
[6]:
import numpy as np
from QuadratiK.kernel_test import KernelTest
np.random.seed(42)
X = np.random.randn(100,2)
# Re-seeding with the same value makes Y an exact copy of X, so H0 holds by construction
np.random.seed(42)
Y = np.random.randn(100,2)
two_sample_test = KernelTest(h=0.4, random_state=42).test(X,Y)
print("Test : {}".format(two_sample_test.test_type_))
print("Execution time: {:.3f}".format(two_sample_test.execution_time))
print("H0 is Rejected : {}".format(two_sample_test.h0_rejected_))
print("Test Statistic : {}".format(two_sample_test.test_statistic_))
print("Critical Value (CV) : {}".format(two_sample_test.cv_))
print("CV Method : {}".format(two_sample_test.cv_method_))
print("Selected tuning parameter : {}".format(two_sample_test.h))
Test : Kernel-based quadratic distance two-sample test
Execution time: 0.041
H0 is Rejected : False
Test Statistic : -0.018355578706893333
Critical Value (CV) : 0.011282236253872464
CV Method : subsampling
Selected tuning parameter : 0.4
[7]:
print(two_sample_test.summary())
Time taken for execution: 0.041 seconds
Test Results
-------------- -----------------------------------------------
Test Type Kernel-based quadratic distance two-sample test
Test Statistic -0.018355578706893333
Critical Value 0.011282236253872464
Reject H0 False
-------------- -----------------------------------------------
Summary Statistics
Group 1 Group 2 Overall
------------------------ --------- --------- ---------
('Feature 0', 'Mean') -0.1156 -0.1156 -0.1156
('Feature 0', 'Std Dev') 0.8563 0.8563 0.8542
('Feature 0', 'Median') -0.0353 -0.0353 -0.0353
('Feature 0', 'IQR') 1.0704 1.0704 1.0704
('Feature 0', 'Min') -2.6197 -2.6197 -2.6197
('Feature 0', 'Max') 1.8862 1.8862 1.8862
('Feature 1', 'Mean') 0.034 0.034 0.034
('Feature 1', 'Std Dev') 0.9989 0.9989 0.9963
('Feature 1', 'Median') 0.1323 0.1323 0.1323
('Feature 1', 'IQR') 1.3333 1.3333 1.3333
('Feature 1', 'Min') -1.9876 -1.9876 -1.9876
('Feature 1', 'Max') 2.7202 2.7202 2.7202
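To give some intuition for what a kernel-based distance between two samples measures, here is a small sketch of a plug-in (V-statistic) estimate built from a Gaussian kernel with covariance h^2 I. This is a simplified illustration under that assumption, not the centered statistic that `KernelTest` reports (which, as the output above shows, can be negative). The function names are hypothetical.

```python
import math
import random

def normal_kernel(x, y, h):
    """Gaussian kernel with covariance h^2 * I evaluated at the pair (x, y)."""
    d = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    norm_const = (2.0 * math.pi * h * h) ** (d / 2.0)
    return math.exp(-sq_dist / (2.0 * h * h)) / norm_const

def mean_kernel(A, B, h):
    """Average kernel value over all pairs (a, b) with a in A and b in B."""
    return sum(normal_kernel(a, b, h) for a in A for b in B) / (len(A) * len(B))

def plug_in_distance(X, Y, h):
    """Plug-in estimate: mean K(X, X) + mean K(Y, Y) - 2 * mean K(X, Y)."""
    return mean_kernel(X, X, h) + mean_kernel(Y, Y, h) - 2.0 * mean_kernel(X, Y, h)

rng = random.Random(42)
X = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]
Y_same = [row[:] for row in X]                 # exact copy of X
Y_shift = [[a + 1.0, b + 1.0] for a, b in X]   # location-shifted copy of X

print(plug_in_distance(X, Y_same, h=0.4))   # identical samples give zero
print(plug_in_distance(X, Y_shift, h=0.4))  # shifted samples give a positive value
```

The bandwidth h controls how local the comparison is: small values compare fine-scale structure, large values mostly compare overall location and spread.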
K-Sample Test#
This section shows an example of the kernel-based quadratic distance k-sample test with the Normal kernel and bandwidth parameter h.
[8]:
from QuadratiK.kernel_test import KernelTest
np.random.seed(42)
X = np.random.randn(500,2)
np.random.seed(42)
y = np.random.randint(0,5,500)
k_sample_test = KernelTest(h = 1.5, method = "permutation").test(X,y)
print("Test : {}".format(k_sample_test.test_type_))
print("Execution time: {:.3f} seconds".format(k_sample_test.execution_time))
print("H0 is Rejected : {}".format(k_sample_test.h0_rejected_))
print("Test Statistic : {}".format(k_sample_test.test_statistic_))
print("Critical Value (CV) : {}".format(k_sample_test.cv_))
print("CV Method : {}".format(k_sample_test.cv_method_))
print("Selected tuning parameter : {}".format(k_sample_test.h))
Test : Kernel-based quadratic distance K-sample test
Execution time: 0.248 seconds
H0 is Rejected : False
Test Statistic : [0.00140789 0.00035197]
Critical Value (CV) : [0.00431479 0.0010787 ]
CV Method : permutation
Selected tuning parameter : 1.5
[9]:
print(k_sample_test.summary())
Time taken for execution: 0.248 seconds
Test Results
-------------- ---------------------------------------------
Test Type Kernel-based quadratic distance K-sample test
Test Statistic [0.00140789 0.00035197]
Critical Value [0.00431479 0.0010787 ]
Reject H0 False
-------------- ---------------------------------------------
Summary Statistics
Group 0 Group 1 Group 2 Group 3 Group 4 Overall
------------------------ --------- --------- --------- --------- --------- ---------
('Feature 0', 'Mean') 0.033 -0.1227 0.0547 -0.0554 0.1192 0.0036
('Feature 0', 'Std Dev') 1.0563 0.874 0.8279 0.9351 1.1038 0.967
('Feature 0', 'Median') 0.0485 -0.0347 0.0675 -0.0349 0.1958 0.0184
('Feature 0', 'IQR') 1.4214 1.0371 0.9924 1.1388 1.3338 1.239
('Feature 0', 'Min') -2.6969 -2.0819 -1.7787 -2.651 -3.2413 -3.2413
('Feature 0', 'Max') 2.5269 2.2989 2.1898 2.4458 3.0789 3.0789
('Feature 1', 'Mean') 0.0501 0.072 -0.0934 -0.0257 0.1786 0.0351
('Feature 1', 'Std Dev') 1.0116 1.0488 0.9651 0.9411 0.9945 0.992
('Feature 1', 'Median') 0.0481 0.1714 -0.1857 -0.1872 0.2239 0.0283
('Feature 1', 'IQR') 1.2537 1.3063 1.2909 1.3971 1.4369 1.3616
('Feature 1', 'Min') -2.0417 -2.3019 -2.0392 -2.4239 -2.2111 -2.4239
('Feature 1', 'Max') 3.8527 2.6324 2.7202 2.4632 2.1905 3.8527
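The critical value above is obtained with the "permutation" method. As a rough sketch of the general idea, the code below shuffles the group labels many times, recomputes a statistic each time, and takes an upper quantile of those values as the critical value. The toy statistic, the number of permutations and the 0.95 level are all assumptions chosen for illustration; this is not QuadratiK's internal implementation.

```python
import random

def between_group_stat(values, labels):
    """Toy statistic: size-weighted squared deviation of group means from the overall mean."""
    overall = sum(values) / len(values)
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    return sum(len(vs) * (sum(vs) / len(vs) - overall) ** 2 for vs in groups.values())

def permutation_critical_value(values, labels, stat_fn, n_perm=200, level=0.95, seed=42):
    """Empirical `level` quantile of the statistic over random label permutations."""
    rng = random.Random(seed)
    shuffled = list(labels)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any link between values and groups
        null_stats.append(stat_fn(values, shuffled))
    null_stats.sort()
    index = min(int(level * n_perm), n_perm - 1)
    return null_stats[index]

rng = random.Random(0)
values = [rng.gauss(0.0, 1.0) for _ in range(500)]
labels = [rng.randrange(5) for _ in range(500)]  # labels unrelated to the values

observed = between_group_stat(values, labels)
cv = permutation_critical_value(values, labels, between_group_stat)
print("statistic:", observed, "critical value:", cv, "reject H0:", observed > cv)
```

In this generic scheme H0 is rejected when the observed statistic exceeds the permutation critical value.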
Poisson Kernel Test#
This section shows an example of performing the kernel-based quadratic distance goodness-of-fit test for uniformity of spherical data, using the Poisson kernel with concentration parameter rho.
[10]:
from QuadratiK.tools import sample_hypersphere
from QuadratiK.poisson_kernel_test import PoissonKernelTest
np.random.seed(42)
X = sample_hypersphere(100,3, random_state=42)
unif_test = PoissonKernelTest(rho = 0.7, random_state=42).test(X)
print("Execution time: {:.3f} seconds".format(unif_test.execution_time))
print("U Statistic Results")
print("H0 is rejected : {}".format(unif_test.u_statistic_h0_))
print("Un Statistic : {}".format(unif_test.u_statistic_un_))
print("Critical Value : {}".format(unif_test.u_statistic_cv_))
print("V Statistic Results")
print("H0 is rejected : {}".format(unif_test.v_statistic_h0_))
print("Vn Statistic : {}".format(unif_test.v_statistic_vn_))
print("Critical Value : {}".format(unif_test.v_statistic_cv_))
Execution time: 0.041 seconds
U Statistic Results
H0 is rejected : False
Un Statistic : 1.6156682048968174
Critical Value : 0.06155875299050079
V Statistic Results
H0 is rejected : False
Vn Statistic : 22.83255917641962
Critical Value : 23.229486935225513
[11]:
print(unif_test.summary())
Time taken for execution: 0.041 seconds
Test Results
-------------------------- -------------------
Test Type Poisson Kernel-based quadratic
distance test of Uniformity on the Sphere
U Statistic Un 1.6156682048968174
U Statistic Critical Value 0.06155875299050079
U Statistic Reject H0 False
V Statistic Vn 22.83255917641962
V Statistic Critical Value 23.229486935225513
V Statistic Reject H0 False
-------------------------- -------------------
Summary Statistics
Feature 0 Feature 1 Feature 2
------- ----------- ----------- -----------
Mean 0.0451 -0.1206 0.0309
Std Dev 0.509 0.5988 0.6122
Median 0.132 -0.1596 0.0879
IQR 0.8051 1.0063 1.1473
Min -0.9548 -0.9929 -0.9904
Max 0.9772 0.9738 0.9996
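For readers who want to see the two ingredients of this test in isolation, the sketch below generates points uniformly on the unit sphere by normalizing Gaussian vectors and evaluates the Poisson kernel in its commonly stated form, (1 - rho^2) / (1 + rho^2 - 2 rho x·y)^(d/2). Both helper names are hypothetical, and `sample_hypersphere` and `PoissonKernelTest` may be implemented differently.

```python
import math
import random

def sample_unit_sphere(n, d, seed=42):
    """Draw n points uniformly on the unit sphere in R^d by normalizing Gaussian vectors."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        points.append([c / norm for c in v])
    return points

def poisson_kernel(x, y, rho):
    """Poisson kernel on the unit sphere in R^d for unit vectors x and y."""
    d = len(x)
    dot = sum(a * b for a, b in zip(x, y))
    return (1.0 - rho ** 2) / (1.0 + rho ** 2 - 2.0 * rho * dot) ** (d / 2.0)

points = sample_unit_sphere(100, 3)
norms = [math.sqrt(sum(c * c for c in p)) for p in points]
print(min(norms), max(norms))  # both equal 1 up to rounding error

# The kernel is largest for coinciding points and smallest for antipodal ones.
x = points[0]
antipode = [-c for c in x]
print(poisson_kernel(x, x, rho=0.7), poisson_kernel(x, antipode, rho=0.7))
```

Larger values of rho make the kernel more sharply peaked around x·y = 1, which is why rho is described as a concentration parameter.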
QQ Plot#
[12]:
from QuadratiK.tools import qq_plot
qq_plot(X,dist = "uniform")
Poisson Kernel based Clustering#
This section shows an example of the Poisson kernel-based clustering algorithm on the sphere, which models the data with a mixture of Poisson kernel-based densities.
[13]:
from QuadratiK.datasets import load_wireless_data
from QuadratiK.spherical_clustering import PKBC
from sklearn.preprocessing import LabelEncoder
X, y = load_wireless_data(return_X_y=True)
le = LabelEncoder()
le.fit(y)
y = le.transform(y)
cluster_fit = PKBC(num_clust=4, random_state=42).fit(X)
ari, macro_precision, macro_recall, avg_silhouette_Score = cluster_fit.validation(y)
print("Estimated mixing proportions :", cluster_fit.alpha_)
print("Estimated concentration parameters: ", cluster_fit.rho_)
print("Adjusted Rand Index:", ari)
print("Macro Precision:", macro_precision)
print("Macro Recall:", macro_recall)
print("Average Silhouette Score:", avg_silhouette_Score)
Estimated mixing proportions : [0.23590339 0.24977919 0.25777522 0.25654219]
Estimated concentration parameters: [0.97773265 0.98348976 0.98226901 0.98572597]
Adjusted Rand Index: 0.9403086353805835
Macro Precision: 0.9771870612442508
Macro Recall: 0.9769999999999999
Average Silhouette Score: 0.3803089203572107
Elbow Plot using Euclidean Distance and Cosine Similarity based WCSS#
[14]:
import matplotlib.pyplot as plt
wcss_euc = []
wcss_cos = []
# Fit PKBC for each candidate number of clusters and record both WCSS variants
for i in range(2, 10):
    clus_fit = PKBC(num_clust=i).fit(X)
    wcss_euc.append(clus_fit.euclidean_wcss_)
    wcss_cos.append(clus_fit.cosine_wcss_)
# Elbow plot based on the Euclidean-distance WCSS
fig = plt.figure(figsize=(6, 4))
plt.plot(list(range(2, 10)), wcss_euc, "--o")
plt.xlabel("Number of Clusters")
plt.ylabel("Within Cluster Sum of Squares (WCSS), Euclidean")
plt.title("Elbow Plot for Wireless Indoor Localization dataset")
plt.show()
# Elbow plot based on the cosine-similarity WCSS
fig = plt.figure(figsize=(6, 4))
plt.plot(list(range(2, 10)), wcss_cos, "--o")
plt.xlabel("Number of Clusters")
plt.ylabel("Within Cluster Sum of Squares (WCSS), cosine similarity")
plt.title("Elbow Plot for Wireless Indoor Localization dataset")
plt.show()
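As a reference point for reading the elbow plots, here is a minimal sketch of the textbook Euclidean WCSS: the sum, over clusters, of squared distances from each point to its cluster centroid. The function name is hypothetical, and the exact definitions behind `euclidean_wcss_` and `cosine_wcss_` in the library may differ.

```python
def euclidean_wcss(points, labels):
    """Sum over clusters of squared Euclidean distances to the cluster centroid."""
    clusters = {}
    for p, g in zip(points, labels):
        clusters.setdefault(g, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        total += sum(sum((p[j] - centroid[j]) ** 2 for j in range(dim)) for p in members)
    return total

points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
print(euclidean_wcss(points, [0, 0, 1, 1]))  # two tight clusters: 4.0
print(euclidean_wcss(points, [0, 0, 0, 0]))  # one cluster covering everything: 104.0
```

WCSS always decreases as clusters are added, so the elbow plot looks for the point where the decrease levels off rather than for a minimum.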


Density Estimation and Sample Generation from PKBD#
[15]:
from QuadratiK.spherical_clustering import PKBD
pkbd_data = PKBD().rpkb(10,[0.5,0],0.5, "rejvmf", random_state= 42)
dens_val = PKBD().dpkb(pkbd_data, [0.5,0.5],0.5)
print(dens_val)
[0.46827108 0.05479605 0.21163936 0.06195099 0.39567698 0.40473724
0.26561508 0.36791766 0.09324676 0.46847274]
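The density values above can be related to the usual closed form of the Poisson kernel-based density on the unit sphere, f(x | mu, rho) = (1 - rho^2) / (omega_d ||x - rho mu||^d), where omega_d is the surface area of the sphere. The sketch below implements that formula and checks numerically that it integrates to one on the circle. The function name is hypothetical, and the rescaling of `mu` to unit length is an assumption about how a non-unit location vector such as `[0.5, 0.5]` is handled.

```python
import math

def pkbd_density(x, mu, rho):
    """Poisson kernel-based density at a unit vector x in R^d."""
    d = len(x)
    mu_norm = math.sqrt(sum(m * m for m in mu))
    mu_unit = [m / mu_norm for m in mu]  # assumption: mu is rescaled to unit length
    surface_area = 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)
    dist = math.sqrt(sum((a - rho * m) ** 2 for a, m in zip(x, mu_unit)))
    return (1.0 - rho ** 2) / (surface_area * dist ** d)

mu, rho = [0.5, 0.5], 0.5

# Riemann sum over the unit circle: a density should integrate to 1.
n_grid = 10000
step = 2.0 * math.pi / n_grid
total = sum(
    pkbd_density([math.cos(k * step), math.sin(k * step)], mu, rho) * step
    for k in range(n_grid)
)
print(total)

# The density peaks in the direction of mu.
peak = pkbd_density([1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)], mu, rho)
print(peak)
```

With rho = 0.5 on the circle this formula ranges from about 0.053 to about 0.477, which is consistent with the range of values printed above.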
Tuning Parameter \(h\) selection#
The select_h function computes the kernel bandwidth h of the Gaussian kernel for the two-sample and k-sample kernel-based quadratic distance (KBQD) tests.
[16]:
import numpy as np
from QuadratiK.kernel_test import select_h
np.random.seed(42)
X = np.random.randn(200, 2)
np.random.seed(42)
y = np.random.randint(0, 2, 200)
h_selected, all_values, power_plot = select_h(
X, y, alternative='location', power_plot=True, random_state=None)
print("Selected h is: ", h_selected)
Selected h is: 2.8
[17]:
# Show the detailed power vs. h table
all_values
[17]:
| | h | delta | power |
|---|---|---|---|
| 0 | 0.4 | 0.2 | 0.20 |
| 1 | 0.8 | 0.2 | 0.26 |
| 2 | 1.2 | 0.2 | 0.42 |
| 3 | 1.6 | 0.2 | 0.34 |
| 4 | 2.0 | 0.2 | 0.38 |
| 5 | 2.4 | 0.2 | 0.36 |
| 6 | 2.8 | 0.2 | 0.56 |
| 7 | 3.2 | 0.2 | 0.38 |
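One way to read this table is to pick the smallest h whose estimated power reaches a threshold. The sketch below applies that rule to the rows above with a threshold of 0.5; both the rule and the threshold are assumptions made for illustration, although the result agrees with the selected value printed earlier. Because the call uses `random_state=None`, the power estimates (and possibly the selected h) can change between runs.

```python
# (h, delta, power) rows copied from the table above.
power_table = [
    (0.4, 0.2, 0.20),
    (0.8, 0.2, 0.26),
    (1.2, 0.2, 0.42),
    (1.6, 0.2, 0.34),
    (2.0, 0.2, 0.38),
    (2.4, 0.2, 0.36),
    (2.8, 0.2, 0.56),
    (3.2, 0.2, 0.38),
]

def first_h_reaching(power_rows, threshold=0.5):
    """Return the smallest h whose estimated power is at least `threshold`, else None."""
    for h, _delta, power in sorted(power_rows):
        if power >= threshold:
            return h
    return None

print(first_h_reaching(power_table))  # 2.8
```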
[18]:
# Show the power plot
power_plot