ClusterPy

Software Objectives:

1. Make maps by analytically defining regional clusters based on key variables

2. Provide an open source toolkit for testing aggregation algorithms

Scheduled Release Date:

Some day between November 18 and Saturday, November 21, 2009

Release Location:

San Francisco, California, at the 56th North American Regional Science Association meeting: http://www.narsc.org/newsite/

Explanation:

Analytical regionalization is a scientific way to decide how to group a large number of geographic areas or points into a smaller number of regions, based on similarities in one or more variables (e.g., income, ethnicity, or environmental condition) that the researcher believes are important for the topic at hand. Conventional regional groupings may not be relevant to the information one is trying to illustrate (e.g., using political regions to map air pollution), or may even be designed in ways that bias aggregated results.


Special Features:


A. Customized 'Analytical' Regionalizations (\cite) based on the following user specifications/inputs:
  • Key areal attribute to regionalize on: the user regionalizes (or clusters) data based on the variables she considers important for her problem at hand (i.e., use your own 'analytical' regions instead of normative or administrative regions).
  • Maximum or minimum number of regions.
  • Threshold conditions: a maximum or minimum value that every regional cluster must meet for a given variable (e.g., for a social or business project, all regions might need at least 100,000 people; for an ecological project, regions might need an area of at least 100 square miles).
  • Spatial contiguity constraints (W matrix, GAL, GWT formats), or we will create them for you based on the shared geographic borders of your areal units.
  • Time-series signature clustering: areas can be clustered not only by a cross-sectional variable, but also by the correlation of their time-series signatures for that variable.
  • Non-geographic clustering: more generally, our algorithms can also be extended to cluster non-geographic units, given some sort of a priori spatial (or topological) constraint.
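
To make the time-series and contiguity inputs above more concrete, here is a minimal sketch, not ClusterPy code. It assumes contiguity is given as a plain dictionary of neighbor lists and that every area has a time series of equal length; the names pearson, neighbor_dissimilarity, neighbors, and series are hypothetical. It scores each pair of contiguous areas by 1 minus the Pearson correlation of their series, so areas that move together get a dissimilarity near 0.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def neighbor_dissimilarity(neighbors, series):
    """Return {(a, b): 1 - r} for every contiguous pair with a < b.

    neighbors: hypothetical contiguity structure, area id -> list of area ids.
    series:    area id -> list of observations over time.
    """
    out = {}
    for area, adjacent in neighbors.items():
        for other in adjacent:
            if area < other:  # visit each undirected pair once
                out[(area, other)] = 1.0 - pearson(series[area], series[other])
    return out


# Illustrative data only: three areas in a row, A-B-C.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
series = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}
scores = neighbor_dissimilarity(neighbors, series)
```

Only contiguous pairs are scored, which is one simple way a contiguity constraint can limit which areas are candidates for the same region.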

B. Create New Maps:

  • ArcGIS shapefiles
  • Google Maps files (in 2 months)
  • .jpg image files

 
C. Current algorithms:
  • For polygon data & exhaustive assignment & known number of regions --> ARiSEL-GRASP-RANDOM
  • For polygon data & exhaustive assignment & unknown number of regions --> MaxP, MinP
  • For polygon data & non-exhaustive assignment --> Amoeba
  • For point data & known number of regions --> PSA-Kmeans
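
When the number of regions is not known in advance, a threshold condition can determine how many regions emerge. The sketch below is a naive greedy region-growing routine written only for illustration; it is not the MaxP or MinP algorithm and not ClusterPy code. The names grow_regions, neighbors, values, and floor are assumptions. Each region grows from a seed through contiguous areas until its total reaches the floor, and leftover areas are attached to an adjacent region so that the assignment is exhaustive where possible.

```python
from collections import deque


def grow_regions(neighbors, values, floor):
    """Greedy, contiguity-respecting region growing with a minimum threshold.

    neighbors: area id -> list of contiguous area ids (hypothetical input).
    values:    area id -> value of the threshold variable (e.g., population).
    floor:     minimum total every region must reach.
    Returns (label, totals): area id -> region id, and per-region totals.
    """
    label = {}
    totals = []
    for seed in sorted(neighbors):
        if seed in label:
            continue
        members, total = [seed], values[seed]
        taken = {seed}
        frontier = deque(sorted(neighbors[seed]))
        # Add unassigned contiguous areas until the threshold is met.
        while total < floor and frontier:
            area = frontier.popleft()
            if area in label or area in taken:
                continue
            taken.add(area)
            members.append(area)
            total += values[area]
            frontier.extend(sorted(neighbors[area]))
        if total >= floor:
            region = len(totals)
            totals.append(total)
            for area in members:
                label[area] = region
    # Attach leftover areas to an adjacent region; areas with no assigned
    # neighbor at all stay unassigned.
    leftovers = [a for a in sorted(neighbors) if a not in label]
    progress = True
    while leftovers and progress:
        progress = False
        for area in list(leftovers):
            adjacent = sorted(label[n] for n in neighbors[area] if n in label)
            if adjacent:
                label[area] = adjacent[0]
                totals[adjacent[0]] += values[area]
                leftovers.remove(area)
                progress = True
    return label, totals


# Illustrative data only: six areas in a row, 0-1-2-3-4-5.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
values = {0: 40, 1: 70, 2: 30, 3: 50, 4: 60, 5: 20}
label, totals = grow_regions(neighbors, values, floor=100)
```

A greedy pass like this gives a feasible starting point at best; the result depends on the order in which seeds are visited, which is one reason heuristic search is usually applied on top of such an initial solution.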


Future algorithms will focus on the following substantive problems:

- Election redistricting
 

An open source toolkit for testing regionalization algorithms

Python scripts to test your clustering algorithms (open source means we share the code and look forward to contributions from others):

  • Simulated spatial data generating processes
  • Simulated elongated or compact clusters
  • Distance measures to compare different aggregation solutions
  • Timing metrics to compare algorithm speeds
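
As an illustration of the last two items, the sketch below compares two aggregation solutions and times a function call. It is not the toolkit's code: the source does not say which distance measures are included, so the Rand index is used here as one common choice, and the names rand_index, time_call, and repeats are hypothetical.

```python
import time
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of area pairs on which two solutions agree.

    A pair agrees when both solutions put the two areas in the same region,
    or both put them in different regions. Region ids themselves do not
    matter, only the grouping. Both inputs map area id -> region id and
    must cover the same areas (at least two).
    """
    agree = total = 0
    for i, j in combinations(sorted(labels_a), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
        total += 1
    return agree / total


def time_call(func, *args, repeats=5):
    """Best wall-clock time in seconds over several runs of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


# Illustrative data only: two solutions for four areas.
solution_a = {0: 0, 1: 0, 2: 1, 3: 1}
solution_b = {0: 0, 1: 0, 2: 0, 3: 1}
similarity = rand_index(solution_a, solution_b)
elapsed = time_call(rand_index, solution_a, solution_b)
```

A value of 1.0 means the two solutions group the areas identically; 1 minus the index can serve as a simple distance. Taking the best of several runs reduces the influence of background load when comparing algorithm speeds.
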

Cost: Free

Academic References citing our research:

Apologies: site under construction.

 

 

 