Spatial information search: A research agenda

Our special issue on Spatial Approaches to Information Search appeared in Spatial Cognition & Computation. This special issue is the result of the Specialist Meeting on Spatial Search in Santa Barbara, and provides ideas for a research agenda for GIScience / spatial data science.


Abstract: Searching for information is a ubiquitous activity, performed in a variety of contexts and supported by rapidly evolving technologies. As a process, information search often has a spatial aspect: spatial metaphors help users refer to abstract contents, and geo-referenced information grounds entities in physical space. Although information search is a major research topic in computer science, GIScience and cognitive psychology, this intrinsic spatiality has not received enough attention. This article reviews research opportunities at the crossroads of three research strands: (1) computational, (2) geospatial, and (3) cognitive.

Research agenda: The interdisciplinary discussions at the Specialist Meeting in Santa Barbara (Ballatore et al., 2015) identified a number of promising research themes and questions on information search at the intersection of the computational, geospatial and cognitive strands. Hoping to stimulate further interest beyond this special issue, we summarize them here.

Spaces and places. The humanistic notion of place is multi-faceted and complex, and yet we cannot easily search for places beyond a few simplistic thematic dimensions (e.g., “cities with more than a million inhabitants”). Better “platial” models are needed to incorporate the notion of place into geographic information systems, which are traditionally (and successfully) built on topological spaces. The challenges to place computing include the ad hoc, subjective, and mutable nature of place. To a large extent, the information retrieval community still ignores space and place, and more efforts from GIScience are needed to make these perspectives more central to research on information search. In particular, articulating and working on specific problems of place-based search appears to be an opportunity for collaboration.

Visualization of big spatial data. New visualization methods are needed to organize knowledge better than lists of ranked documents and traditional pins-on-maps visualizations do. From a cognitive perspective, knowledge about mental representations of geographic and abstract spaces is essential to devise more effective approaches to exploring, summarizing, and uncovering meaningful patterns in large datasets. This challenge can benefit from developments in database technology, such as non-relational, column, and array database management systems, in addition to research on how humans represent and search both physical and information spaces.

Models of human search behavior. More research in cognitive psychology is needed to further illuminate the strategies and heuristics deployed in search behavior in physical and information spaces, which would deepen our understanding of how humans search for patterns in stimuli and in memory. This knowledge could in turn be used to develop information systems that build on and augment human search abilities.

Benchmarking exploratory search. Compared with task-oriented search, the evaluation of exploratory search is more challenging, because it is difficult to establish objective criteria of success. It would be valuable to design and curate test collections to be used across different research communities. To date, there is a lack of benchmark collections that allow evaluations, hindering reproducibility and comparison of methods to explore informational spaces. The visual dimension, for example through the collection of eye movements, can be used to evaluate users’ search strategies and behavioral patterns.

Georeferencing quality. While commercial and open-source tools for georeferencing are available, their quality varies dramatically. Better benchmarking and evaluations are needed to support search for geographic information effectively. Mainstream search engines need better topological and geographic knowledge bases to produce more meaningful results. For example, a Google search for “distance between Italy and France” returns 1,298 km, a figure computed from the countries’ arbitrary centroids that ignores the fact that the two countries are adjacent. In this sense, deciding when a point location is adequate to solve a problem and when extended footprints are needed is a largely unsolved problem.
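The gap between point locations and extended footprints can be illustrated with a minimal sketch. It assumes toy footprints given as axis-aligned boxes in arbitrary planar units; the regions and the function names are hypothetical and do not represent real country geometries or any search engine's method.

```python
import math

# Hypothetical footprints as axis-aligned boxes (min_x, min_y, max_x, max_y)
# in arbitrary planar units. These are toy shapes, not real countries.
REGION_A = (0.0, 0.0, 4.0, 3.0)
REGION_B = (4.0, 1.0, 9.0, 5.0)  # touches REGION_A along the line x = 4


def centroid(box):
    """Return the center point of an axis-aligned box."""
    min_x, min_y, max_x, max_y = box
    return ((min_x + max_x) / 2.0, (min_y + max_y) / 2.0)


def centroid_distance(a, b):
    """Distance between the two box centers (the point-based answer)."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    return math.hypot(bx - ax, by - ay)


def footprint_distance(a, b):
    """Minimum distance between two boxes; 0.0 when they touch or overlap."""
    # Gap along each axis, clamped to zero when the projections meet.
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return math.hypot(dx, dy)


if __name__ == "__main__":
    print("centroid distance: ", centroid_distance(REGION_A, REGION_B))
    print("footprint distance:", footprint_distance(REGION_A, REGION_B))
```

For these two adjacent boxes the footprint distance is zero while the centroid distance is clearly positive, which is the same kind of mismatch as in the search result above. Real footprints would need polygon geometry and geodesic distances rather than boxes in a plane.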

Vagueness and ambiguity in spatial hierarchies and relations. Geospatial search involves the use of spatial terms, which are often intrinsically vague and context-dependent. Notably, the definition of nearness varies depending on the context, and place name disambiguation is a hard problem, especially for vernacular place names not encoded in a gazetteer. As search in the geographic domain is strongly affected by scale, organizing content in hierarchies is beneficial. However, spatial and thematic hierarchies constitute a challenge for evaluation. These hierarchies should be made more explicit for the user, in order to collect relevance feedback. Similarly, the development of multi-scale, context-sensitive spatial relations has the potential for greatly improving search approaches.

Search in spatio-temporal networks. Many human and natural systems, such as urban transit and social media, can be conceived as networks whose spatial structure changes over time. Their properties are emerging from interdisciplinary research, and novel techniques are needed to search efficiently for paths, events, patterns, clusters, and outliers in these complex networks. Such techniques will bridge established strands of network analysis, such as social network analysis, with spatial and time series analysis.

Effects of search technologies on spatial cognition. The pervasive availability of search technology is redefining the process of retrieval of geographic information, limiting the need for memorization. Beyond anecdotal evidence, little is known about how this new technological landscape impacts spatial cognition. Fruitful investigations might focus on psychological aspects, such as spatial awareness and wayfinding abilities, as well as on more social, cultural, and political dimensions of how the geographic world is collectively imagined and accessed.

Unstructured and subjective spaces. Current spatial search is largely confined to structured spatio-temporal data. Ideally, search should also be possible across large volumes of unstructured spatial data, gathered from social media and other web sources (Hoffart et al. 2013). Thanks to recent advances in natural language processing and machine learning, subjective experiences, emotions, and opinions can become novel search spaces, unlocking new understandings of social and urban dynamics.

Reference systems for abstract spaces. Web maps and time sliders provide a widely used mechanism to consume information structured in geographic space, but what about abstract spaces, such as conceptual spaces (Gärdenfors 2004)? We need more explicit semantic reference systems for better ontological organization of search spaces. In this context, the metaphor of the map projection can be deployed to represent multiple spatial representations of the same abstract spaces, guiding the development of coordinate systems and the assessment of distortions in these culturally embedded informational spaces. Cognitive research on how people conceptualize information spaces may also lead to the development of other usable technologies.

Type instantiation. In geographic information retrieval (as in other searches), queries often refer to instances of geographic entities through their type (e.g., “the beach next to University of California, Santa Barbara” when referring to Goleta Beach). Spatial reasoning and geographic knowledge are needed to resolve this kind of indirect reference, expanding traditional techniques of co-reference resolution.
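One naive way to read such a query is "the nearest entity of the given type within some distance of the named anchor". The sketch below implements only that reading. The gazetteer entries, planar coordinates, the `max_distance` threshold, and the function name `resolve_type_reference` are all hypothetical, and a realistic resolver would need much richer spatial reasoning than a nearest-neighbor lookup.

```python
import math
from typing import List, NamedTuple, Optional


class Place(NamedTuple):
    name: str
    kind: str   # entity type, e.g. "beach"
    x: float    # toy planar coordinates, arbitrary units
    y: float


# Hypothetical toy gazetteer; names and coordinates are invented.
GAZETTEER = [
    Place("Campus", "university", 0.0, 0.0),
    Place("North Beach", "beach", 1.0, 0.5),
    Place("Far Beach", "beach", 12.0, 3.0),
    Place("Harbor Cafe", "cafe", 0.5, 0.2),
]


def resolve_type_reference(
    kind: str,
    anchor_name: str,
    gazetteer: List[Place],
    max_distance: float = 5.0,
) -> Optional[Place]:
    """Resolve 'the <kind> next to <anchor>' to the nearest matching place."""
    anchor = next((p for p in gazetteer if p.name == anchor_name), None)
    if anchor is None:
        return None  # unknown anchor: nothing to reason from
    # Keep places of the requested type that lie within the threshold.
    scored = [
        (math.hypot(p.x - anchor.x, p.y - anchor.y), p)
        for p in gazetteer
        if p.kind == kind and p is not anchor
    ]
    in_range = [(d, p) for d, p in scored if d <= max_distance]
    if not in_range:
        return None
    return min(in_range, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    print(resolve_type_reference("beach", "Campus", GAZETTEER))
```

The fixed threshold is the weakest assumption here: what counts as "next to" depends on scale and context, so a constant cut-off is only a placeholder.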

Search for aggregates and similarities. Searching for individual database records matching a set of criteria is not a notable challenge anymore, even in very large datasets. However, the search for complex aggregates, such as the co-occurrence of events in space and time, is still challenging, particularly when facing very large and diverse data sources. Such aggregates include city neighborhoods, large public events, and trajectories. Spatio-temporal datasets can also be conceptualized as special kinds of aggregates, stored in data catalogues. In an ecological approach to information search, the space to be searched is that of multiple interactions between entities, stressing the need to be able to express and solve complex queries for spatial, temporal, and thematic aggregates that emerge in physical and abstract spaces alike. Searching for similar aggregates also represents a worthwhile challenge, as aggregates rarely present exact structures and need fuzzier mechanisms for comparison.
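To make the simplest of these aggregates concrete, the sketch below searches for pairs of events that co-occur in space and time. It assumes events are points with a timestamp in hours and that co-occurrence means falling within a distance threshold and a time window; the event data, thresholds, and the name `co_occurring_pairs` are hypothetical.

```python
import math
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    label: str
    x: float   # toy planar coordinates, arbitrary units
    y: float
    t: float   # time in hours


def co_occurring_pairs(
    events: List[Event], max_dist: float, max_dt: float
) -> List[Tuple[str, str]]:
    """Return label pairs of events close in both space and time."""
    pairs = []
    # Brute-force comparison of every pair: quadratic in the number of events.
    for a, b in combinations(events, 2):
        close_in_space = math.hypot(a.x - b.x, a.y - b.y) <= max_dist
        close_in_time = abs(a.t - b.t) <= max_dt
        if close_in_space and close_in_time:
            pairs.append((a.label, b.label))
    return pairs


# Hypothetical events, invented for illustration.
EVENTS = [
    Event("concert", 0.0, 0.0, 20.0),
    Event("traffic jam", 0.3, 0.4, 20.5),
    Event("street market", 0.2, 0.1, 9.0),
    Event("parade", 8.0, 8.0, 20.0),
]

if __name__ == "__main__":
    print(co_occurring_pairs(EVENTS, max_dist=1.0, max_dt=1.0))
```

The pairwise loop is only workable for small inputs. At the data volumes discussed above, a spatial or spatio-temporal index would be needed, and hard thresholds would likely give way to the fuzzier comparison mechanisms the text calls for.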

Keywords: cognitive search; geographic information retrieval; information search; spatial information