Mining significant crisp-fuzzy spatial association rules

Spatial association rule mining (SARM) is an important data mining task for understanding implicit and sophisticated interactions in spatial data. The usefulness of SARM results, represented as sets of rules, depends on their reliability in three respects: the abundance of rules, control over the risk of spurious rules, and the accuracy of rule interestingness measure (RIM) values. This study presents crisp-fuzzy SARM, a novel SARM method that can enhance the reliability of the resultant rules. The method first prunes dubious rules using statistically sound tests and crisp supports for the patterns involved, and then evaluates the RIMs of the accepted rules using fuzzy supports. For the RIM evaluation stage, the study also proposes a Gaussian-curve-based fuzzy data discretization model for SARM with an improved design for spatial semantics. The proposed techniques were evaluated on both synthetic and real-world data. The synthetic data were generated with predesigned rules and RIM values, so that the reliability of SARM results could be evaluated confidently and quantitatively. The proposed techniques proved highly effective in enhancing the reliability of SARM results in all three respects. The abundance of resultant rules improved by 50% or more compared with conventional fuzzy SARM. Statistically sound tests kept the risk of spurious rules minimal: the probability that the entire result contained any spurious rule was below 1%. The RIM values also avoided the large positive errors committed by crisp SARM, which typically exceeded 50% for representative RIMs. The real-world case study on New York City points of interest reconfirms the improved reliability of crisp-fuzzy SARM results and demonstrates that this improvement is critical for practical spatial data analytics and decision support.
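The two-stage idea in the abstract (prune with a statistical test on crisp supports, then score the surviving rule with fuzzy supports) can be illustrated with a minimal sketch. The abstract does not name the test, the parameters of the Gaussian curve, or the RIM, so everything specific here is an assumption made for illustration: a one-sided Fisher exact test, a crisp distance threshold for "near", a Gaussian membership centred at distance 0 with width `sigma`, and confidence as the RIM. The function names and the record layout `(distance, has_b)` are hypothetical, and no multiple-testing correction is applied, which a real analysis over many candidate rules would need.

```python
import math


def gaussian_membership(x, center, sigma):
    """Degree in (0, 1] to which x belongs to a fuzzy set centred at `center`."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null of independence.
    """
    n = a + b + c + d
    row1 = a + b          # records where the antecedent holds
    col1 = a + c          # records where the consequent holds
    denom = math.comb(n, col1)
    upper = min(row1, col1)
    tail = sum(
        math.comb(row1, k) * math.comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


def crisp_fuzzy_rule(records, threshold, sigma, alpha=0.05):
    """Evaluate the rule "near A => B" on records of (distance_to_A, has_b).

    Stage 1 (assumed): crisp counts feed a Fisher exact test; the rule is
    rejected (None) when the p-value exceeds alpha.
    Stage 2 (assumed): fuzzy supports from a Gaussian membership give the
    support and confidence of the accepted rule.
    """
    n = len(records)
    a = sum(1 for dist, has_b in records if dist <= threshold and has_b)
    b = sum(1 for dist, has_b in records if dist <= threshold and not has_b)
    c = sum(1 for dist, has_b in records if dist > threshold and has_b)
    d = n - a - b - c
    p_value = fisher_right_tail(a, b, c, d)
    if p_value > alpha:
        return None

    memberships = [gaussian_membership(dist, 0.0, sigma) for dist, _ in records]
    supp_a = sum(memberships) / n
    supp_ab = sum(m for m, (_, has_b) in zip(memberships, records) if has_b) / n
    if supp_a == 0.0:
        return None
    return {"p_value": p_value, "support": supp_ab, "confidence": supp_ab / supp_a}


# Toy data: B occurs only close to A, so the rule should survive pruning.
associated = [(0.0, True)] * 10 + [(50.0, False)] * 10
# Toy data: B is unrelated to distance, so the rule should be pruned.
independent = [(0.0, True), (0.0, False), (50.0, True), (50.0, False)] * 2

accepted = crisp_fuzzy_rule(associated, threshold=10.0, sigma=5.0)
rejected = crisp_fuzzy_rule(independent, threshold=10.0, sigma=5.0)
```

In this sketch the crisp threshold only decides whether the rule is kept, while the reported support and confidence come from the graded memberships, which is the division of labour the abstract describes.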


Identity

Description: The Identity category includes attributes that support the identification of the resource.

Field Value
PID https://www.doi.org/10.6084/m9.figshare.5873139.v1
PID https://www.doi.org/10.6084/m9.figshare.5873139
URL https://dx.doi.org/10.6084/m9.figshare.5873139
URL https://dx.doi.org/10.6084/m9.figshare.5873139.v1
URL http://dx.doi.org/10.6084/m9.figshare.5873139
URL https://figshare.com/articles/Mining_significant_crisp-fuzzy_spatial_association_rules/5873139
URL http://dx.doi.org/10.6084/m9.figshare.5873139.v1

Access Modality

Description: The Access Modality category includes attributes that report the modality of exploitation of the resource.

Field Value
Access Right Open Access

Attribution

Description: Authorships and contributors

Field Value
Author Shi, Wenzhong
Author Zhang, Anshu
Author Webb, Geoffrey I.

Publishing

Description: Attributes about the publishing venue (e.g. journal) and deposit location (e.g. repository)

Field Value
Collected From figshare; Datacite
Hosted By figshare
Publication Date 2018-02-09
Publisher Figshare

Additional Info
Field Value
Language UNKNOWN
Resource Type Dataset
keyword FOS: Mathematics
keyword FOS: Biological sciences
keyword FOS: Computer and information sciences
keyword FOS: Clinical medicine
system:type dataset

Management Info
Field Value
Source https://science-innovation-policy.openaire.eu/search/dataset?datasetId=dedup_wf_001::9493f6311aee379a9055f3139d1e9b52
Author jsonws_user
Last Updated 13 January 2021, 16:58 (CET)
Created 13 January 2021, 16:58 (CET)