pydpc - a Python package for Density Peak-based Clustering

The algorithm, "Clustering by fast search and find of density peaks", was designed by Alex Rodriguez and Alessandro Laio; see their project page for more information.

The pydpc package aims to make this algorithm available for Python users.
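
For orientation, here is a rough, dependency-free sketch of the idea behind density-peak clustering. It is not pydpc's implementation: the function name `density_peaks`, the hard-cutoff density estimate, the tie-breaking rule, and the `>=` threshold comparisons are all assumptions made for illustration. Each point gets a density (how many neighbors lie within a cutoff) and a delta (the distance to the nearest denser point); points where both are large become cluster centers, and every other point inherits the label of its nearest denser neighbor.

```python
import math


def density_peaks(points, cutoff, min_density, min_delta):
    """Toy density-peak clustering; returns (rho, delta, labels)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # rho: how many other points lie closer than the cutoff distance
    rho = [sum(dist[i][j] < cutoff for j in range(n) if j != i) for i in range(n)]

    # visit points from densest to sparsest (ties broken by index)
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta: distance to the nearest point that comes earlier in that order
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # densest point: use its largest distance
        else:
            parent[i] = min(order[:rank], key=lambda k: dist[i][k])
            delta[i] = dist[i][parent[i]]

    # centers are the points with both large rho and large delta;
    # all other points inherit the label of their nearest denser neighbor
    labels = [-1] * n
    n_clusters = 0
    for i in order:
        if rho[i] >= min_density and delta[i] >= min_delta:
            labels[i] = n_clusters
            n_clusters += 1
        elif parent[i] >= 0:
            labels[i] = labels[parent[i]]
    return rho, delta, labels


# two tight groups of five points on a line, far apart from each other
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
points = [(x, 0.0) for x in xs]
rho, delta, labels = density_peaks(points, cutoff=0.25, min_density=4, min_delta=1.0)
print(labels)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

The all-pairs distance table makes this sketch quadratic in the number of points, so it is only meant for reading, not for real data sets.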

Installation

Install pydpc via pip from the Python Package Index:

pip install pydpc

or install the latest version from GitHub:

pip install git+https://github.com/cwehmeyer/pydpc.git@master

Quick start

import numpy as np
from pydpc import Cluster

# a simple bimodal data set: two gaussian blobs centered at x=-4 and x=+4
npoints = 1000
points = np.random.randn(npoints, 2)
points[:, 0] += 4 * np.random.choice([-1, 1], size=npoints)

# computes distances, density, and delta, then shows the decision graph
clu = Cluster(points)

# pick outliers in the decision graph as cluster centers and assign points
clu.assign(min_density=25, min_delta=6)

clu.membership   # cluster index for each point
clu.core_idx     # indices of high-confidence ("core") points
clu.halo_idx     # indices of low-confidence ("halo") points

See docs/examples/Example01.ipynb for a full walkthrough with plots.
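
Once points have been assigned, a common next step is to count how many points each cluster holds and how many of those are core points. The sketch below is one way to do that with the standard library. It assumes that `membership` is a sequence of integer cluster indices and that every point not listed in `halo_idx` is a core point; the helper `summarize` and the sample lists are hypothetical stand-ins, not part of pydpc.

```python
from collections import Counter


def summarize(membership, halo_idx):
    """Return {cluster: (total points, core points)}."""
    halo = set(halo_idx)
    total = Counter(membership)
    # a point counts as core if its index is not in the halo set
    core = Counter(c for i, c in enumerate(membership) if i not in halo)
    return {c: (total[c], core[c]) for c in sorted(total)}


# hypothetical stand-ins for clu.membership and clu.halo_idx
membership = [0, 0, 1, 1, 1, 0]
halo_idx = [2, 5]
print(summarize(membership, halo_idx))  # {0: (3, 2), 1: (3, 2)}
```

If the attributes turn out to be NumPy arrays, converting them first with `.tolist()` keeps the dictionary keys as plain Python integers.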
