In recent years, satellite remote sensing big data have increased enormously as private companies and governments continue to launch higher-resolution satellites. For example, DigitalGlobe collects over 1 billion km² of high-resolution imagery each year through its constellation of commercial satellites, including the WorldView and GeoEye spacecraft [ 18 ].
Data science involves the application of methods from scientific fields such as artificial intelligence (AI) and data mining. AI refers to machines that make sense of the world, automating processes that create scalable insights from big data [ 5 , 20 ]. Machine learning is a subset of AI that focuses on computers acquiring knowledge to iteratively extract information and learn from patterns in raw data [ 20 , 21 ]. Deep learning is a cutting-edge type of machine learning that draws inspiration from brain function. It is a flexible and powerful way to enable computers to learn from experience and understand the world as a nested hierarchy of concepts, in which the computer learns complicated concepts by building them from simpler ones [ 20 ].
Deep learning has been applied to natural language processing, computer vision, and autonomous driving [ 20 , 22 ]. Data mining refers to techniques to discover new and interesting patterns from large datasets such as identifying frequent itemsets in online transaction records [ 23 ]. Many techniques for data mining were developed as part of machine learning [ 24 ]. Applications of data mining techniques include recommender systems and cohort detection in social networks.
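As a minimal sketch of the frequent-itemset idea mentioned above, the following counts how often each item and item pair occurs across a set of transactions and keeps those that meet a minimum support. The transaction data and the function name `frequent_itemsets` are hypothetical, and the brute-force enumeration shown here is an assumption made for brevity; it is not Apriori or any other specific published algorithm.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: count} for itemsets found in >= min_support transactions.

    Brute-force enumeration of all item combinations up to max_size;
    adequate for tiny examples, not for large datasets.
    """
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # de-duplicate and fix item order
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical online transaction records (illustrative data only).
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]

frequent = frequent_itemsets(transactions, min_support=3)
for itemset, n in sorted(frequent.items(), key=lambda kv: (-kv[1], kv[0])):
    print(itemset, n)
```

With a support threshold of 3, only one pair survives alongside the four single items, which illustrates how the threshold prunes uninteresting combinations.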
Geospatial artificial intelligence (geoAI) is an emerging science that utilizes advances in high-performance computing to apply technologies in AI, particularly machine learning. Featured geoAI applications included deep learning architectures and algorithms for feature recognition in historical maps [ 25 ]; multi-sensor remote sensing image resolution enhancement [ 26 ]; and identification of the semantic similarity in VGI attributes for OpenStreetMap [ 27 ].
For example, AI research has been presented at the International Symposium on Spatial and Temporal Databases, which features research in spatial, temporal, and spatiotemporal data management and related technologies. Given the advances and capabilities on display in recent research, we can begin to connect the dots regarding how geoAI technologies can be specifically applied to environmental epidemiology. To determine the factors to which we may be exposed and that may thus influence health, environmental epidemiologists implement direct methods of exposure assessment, such as biomonitoring.
Exposure modeling involves the development of a model to represent a particular environmental variable using various data inputs such as environmental measurements and statistical methods such as land use regression and generalized additive mixed models [ 28 ]. Exposure modeling is a cost-effective approach to assess the distribution of exposures in particularly large study populations compared to applying direct methods [ 28 ].
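To make the idea of a regression-based exposure model concrete, here is a deliberately simplified sketch: ordinary least squares with a single predictor, fitted to a handful of monitoring sites and then used to predict at an unmonitored location. Real land use regression models use many predictors and careful validation; the predictor (road length within a buffer), the pollutant values, and the function names are all hypothetical.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x_new):
    """Predict the exposure at a location with predictor value x_new."""
    return intercept + slope * x_new


# Hypothetical monitoring sites: km of major road within a buffer (x)
# and the measured pollutant concentration at each site (y).
road_km = [1.0, 2.0, 3.0, 4.0, 5.0]
measured = [12.1, 13.9, 16.2, 17.8, 20.0]

slope, intercept = fit_simple_ols(road_km, measured)
estimate = predict(slope, intercept, 6.0)  # an unmonitored location
print(f"slope={slope:.2f}, intercept={intercept:.2f}, estimate={estimate:.2f}")
```

The appeal for large study populations is visible even here: once the coefficients are fitted, estimating exposure at a new address needs only the predictor value, not a new measurement.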
Exposure models include basic proximity-based measures. Spatial science has been critical in exposure modeling for epidemiologic studies over the past two decades, enabling environmental epidemiologists to use GIS technologies to create and link exposure models to health outcome data using geographic variables. For example, previous exposure modeling efforts have often been associated with coarse spatial resolutions, limiting the extent to which the exposure model is able to accurately estimate individual-level exposure.
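Two common proximity-based measures can be sketched in a few lines: the distance from a residence to the nearest feature of interest, and the number of such features within a buffer. The coordinates below are hypothetical, the function names are illustrative, and the haversine formula on a spherical Earth is an assumption; a GIS would normally work in a projected coordinate system.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_to_nearest(home, features):
    """Distance in meters from home to the closest feature."""
    return min(haversine_m(home[0], home[1], f[0], f[1]) for f in features)


def count_within(home, features, radius_m):
    """Number of features that fall inside a circular buffer around home."""
    return sum(
        1 for f in features
        if haversine_m(home[0], home[1], f[0], f[1]) <= radius_m
    )


# Hypothetical residence and road sample points as (latitude, longitude).
home = (34.0500, -118.2500)
road_points = [
    (34.0500, -118.2490),
    (34.0600, -118.2500),
    (34.1000, -118.2500),
]

nearest = distance_to_nearest(home, road_points)
print(f"nearest road point: {nearest:.0f} m")
print("within 500 m:", count_within(home, road_points, 500))
print("within 2000 m:", count_within(home, road_points, 2000))
```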
Advances in geoAI enable accurate, high-resolution exposure modeling for environmental epidemiologic studies, especially through high-performance computing to handle big data (big in space and time; spatiotemporal) and through developing and applying machine and deep learning algorithms and big data infrastructures to extract the most meaningful and relevant pieces of input information to, for example, predict the amount of an environmental factor at a particular time and location.
A spatial data mining approach using machine learning and OpenStreetMap (OSM) spatial big data was developed to enable selection of the most important OSM geographic features. The algorithm next trained a random forest model (a popular machine learning method using decision trees for classification and regression modeling) to generate the relative importance of each OSM geographic feature.
This was performed by determining the geo-context, or which OSM features and within what distances. Finally, the algorithm trained a second random forest model using the geo-contexts and measured PM2.5. Prediction errors were minimized through incorporating temporality of measured PM2.5.
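The idea of ranking geographic features by how much they help predict a pollutant can be illustrated with a much simpler stand-in than a random forest. The sketch below is not the method of Lin et al.: it scores each feature by the variance reduction of its best single-threshold split (a decision stump), averaged over bootstrap samples. The feature names, the synthetic data, and the function names are all hypothetical.

```python
import random


def sse(values):
    """Sum of squared deviations from the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_gain(column, target):
    """Largest reduction in SSE achievable with one threshold on this column."""
    total = sse(target)
    best = 0.0
    for threshold in sorted(set(column))[:-1]:  # skip max so both sides are non-empty
        left = [t for c, t in zip(column, target) if c <= threshold]
        right = [t for c, t in zip(column, target) if c > threshold]
        best = max(best, total - sse(left) - sse(right))
    return best


def stump_importance(rows, target, names, n_rounds=50, seed=0):
    """Relative importance per feature, normalized to sum to 1."""
    rng = random.Random(seed)
    n = len(rows)
    totals = {name: 0.0 for name in names}
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        y = [target[i] for i in idx]
        for j, name in enumerate(names):
            column = [rows[i][j] for i in idx]
            totals[name] += best_split_gain(column, y)
    grand_total = sum(totals.values()) or 1.0
    return {name: value / grand_total for name, value in totals.items()}


# Synthetic monitoring sites: counts of features within a 500 m buffer.
# By construction, only the road count strongly drives the simulated PM2.5.
data_rng = random.Random(42)
names = ["roads_500m", "parks_500m", "unrelated"]
rows, pm25 = [], []
for _ in range(60):
    roads = data_rng.randint(0, 10)
    parks = data_rng.randint(0, 5)
    unrelated = data_rng.randint(0, 10)
    rows.append((roads, parks, unrelated))
    pm25.append(8.0 + 1.5 * roads - 0.2 * parks + data_rng.gauss(0.0, 1.0))

importance = stump_importance(rows, pm25, names)
ranked = sorted(importance, key=importance.get, reverse=True)
print(ranked)
```

A full random forest would grow deep trees on random feature subsets and accumulate importance across every split; the stump version keeps only the first split, which is enough to show why a strongly predictive buffer count rises to the top.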
The model predictive performance using measured PM2.5 [...]. Through this innovative approach, Lin et al. [...]. The application of geoAI, specifically using machine learning and data mining, to air pollution exposure modeling described in Lin et al. [...].
Beyond incorporating high-resolution big data that are being generated in real time, existing historical big data, such as Landsat satellite remote sensing imagery from 1972 to present, can be used within geoAI frameworks for historical exposure modeling, which is advantageous for studying chronic diseases with long latency periods. This seamless usage and integration of spatial big data is facilitated by high-performance computing capabilities, which provide a computationally efficient approach to exposure modeling using high-dimensional data compared to other existing time-intensive approaches.
Further, the flexibility of geoAI workflows and algorithms can address properties of environmental exposures as spatial processes that are often ignored during modeling, such as spatial nonstationarity and anisotropy [ 32 ]. Spatial nonstationarity occurs when a global model is unsuitable for explaining a spatial process due to local variations in, for example, the associations between the spatial process and its predictors. Lin et al. [...]. Anisotropic spatial processes are characterized by directional effects [ 32 ]; for example, the concentration of an air pollutant may be affected by wind speed and wind direction [ 34 ].
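Spatial nonstationarity can be shown with a toy example in which the association between a predictor and an exposure differs by region, so that a single global fit misrepresents both. The two regions, their values, and the function name below are hypothetical and chosen only to make the contrast obvious.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical data: same predictor values in two regions, but the
# exposure rises with the predictor in the west and falls in the east.
west_x, west_y = [1, 2, 3, 4], [2, 4, 6, 8]
east_x, east_y = [1, 2, 3, 4], [9, 8, 7, 6]

west_slope = ols_slope(west_x, west_y)
east_slope = ols_slope(east_x, east_y)
global_slope = ols_slope(west_x + east_x, west_y + east_y)

print("west:", west_slope, "east:", east_slope, "global:", global_slope)
```

The pooled slope falls between the two local slopes and describes neither region well, which is the situation that locally fitted or otherwise spatially adaptive models are meant to handle.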
The flexibility in geoAI workflows naturally allows for scalability to use and modify algorithms to accommodate more big data. An additional facet of this flexibility is that many machine learning and data mining techniques can be applied without a high degree of feature engineering, enabling the inclusion of large amounts of big data, for example greater numbers of surrogate variables when direct measures are unavailable.
Another potential area of application for geoAI involves algorithm development to quickly and accurately classify and identify objects from remote sensing data that have previously been difficult to capture, for example, features of the built environment based on spectral and other characteristics, to generate detailed 3D representations of city landscapes.
Ultimately, geoAI applications for environmental epidemiology move us closer to providing a highly resolved and more accurate picture of the environmental exposures we encounter, which can be combined with other relevant information regarding health outcomes, confounders, etc.
However, as with any exposure modeling endeavor, there must be careful scrutiny of data quality and consideration of data costs.
In the context of the Lin et al. [...]. Related to data quality is the need to balance data-driven approaches against domain-specific expertise. For example, if a particular variable that is a known predictor of PM2.5 [...].
Finally, as a currently evolving field, geoAI requires the expertise of multiple disciplines, including epidemiology, computer science, engineering, and statistics, to establish best practices for how to approach environmental exposure modeling given the complexities introduced by the biological, chemical, and physical properties of different environmental exposures, wide-ranging algorithms that can be developed and applied, and heterogeneous spatial big data characterized by varying scales, formats, and quality.
Recent research demonstrates movement towards practical applications of geoAI to address real-world problems from feature recognition to image enhancement.

Geospatial big data handling theory and methods: a review and research challenges.
Industry Insights: 2. Accessed 30 Oct
Baker D, Nieuwenhuijsen MJ. Environmental epidemiology: study methods and application.
Mining public datasets for modeling intra-city PM2.
Dietrich D.
Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health information science and systems.
McAfee A, Brynjolfsson E. Big data: the management revolution. Harv Bus Rev.
Dominici F, Parkes D. Harvard in Allston: data science: SoundCloud. Harvard University podcast.
Provost F, Fawcett T. Data science and its relationship to big data and data-driven decision making. Big Data.
Wickham H, Grolemund G. R for data science.
Wang S. CyberGIS and spatial data science.
Anselin L. Spatial data, spatial analysis and spatial data science. University of Illinois Urbana-Champaign.
Goodchild MF. Citizens as sensors: the world of volunteered geography.